Linux Signal Handling

1 Linux signal handling

  • Signal reception
  • SIGSTOP flow
  • SIGKILL flow
  • SIGSEGV flow

1.1 man pages

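2 Basic signal concepts and definitions

2.1 Design purpose of signals

  • Signals were designed to provide a basic asynchronous notification mechanism.
  • A signal is also a form of IPC.

Unlike other IPC mechanisms, signals do not require a dedicated thread that blocks while waiting for messages.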

2.2 Linux signal definitions

include/uapi/asm-generic/signal.h

#define _NSIG       64
#define _NSIG_BPW   __BITS_PER_LONG
#define _NSIG_WORDS (_NSIG / _NSIG_BPW)

#define SIGHUP       1
#define SIGINT       2
#define SIGQUIT      3
#define SIGILL       4
#define SIGTRAP      5
#define SIGABRT      6
#define SIGIOT       6
#define SIGBUS       7
#define SIGFPE       8
#define SIGKILL      9
#define SIGUSR1     10
#define SIGSEGV     11
#define SIGUSR2     12
#define SIGPIPE     13
#define SIGALRM     14
#define SIGTERM     15
#define SIGSTKFLT   16
#define SIGCHLD     17
#define SIGCONT     18
#define SIGSTOP     19
#define SIGTSTP     20
#define SIGTTIN     21
#define SIGTTOU     22
#define SIGURG      23
#define SIGXCPU     24
#define SIGXFSZ     25
#define SIGVTALRM   26
#define SIGPROF     27
#define SIGWINCH    28
#define SIGIO       29
#define SIGPOLL     SIGIO
/*
#define SIGLOST     29
*/
#define SIGPWR      30
#define SIGSYS      31
#define SIGUNUSED   31

/* These should not be considered constants from userland.  */
#define SIGRTMIN    32
#ifndef SIGRTMAX
#define SIGRTMAX    _NSIG
#endif
  • Kernel data structures

    Internal signal data structures

    struct task_struct {
      /* Signal handlers: */
      struct signal_struct      *signal;    // holds the sigpending list shared by the whole thread group
      struct sighand_struct     *sighand;
      sigset_t          blocked;
      sigset_t          real_blocked;
      /* Restored if set_restore_sigmask() was used: */
      sigset_t          saved_sigmask;
      struct sigpending     pending;        // per-thread (private) sigpending list
      unsigned long         sas_ss_sp;
      size_t                sas_ss_size;
      unsigned int          sas_ss_flags;
    };
    
    struct signal_struct {
        atomic_t        sigcnt;
        atomic_t        live;
        int         nr_threads;
        struct list_head    thread_head;
    
        wait_queue_head_t   wait_chldexit;  /* for wait4() */
      /* shared signal handling: */
        struct sigpending   shared_pending;
    
        /* thread group exit support */
        int         group_exit_code;
        struct task_struct  *group_exit_task;
        struct rlimit rlim[RLIM_NLIMITS];
    };
    
    struct sigpending {
        struct list_head list;
        sigset_t signal;
    };
    
    struct sigqueue {
        struct list_head list;
        int flags;
        siginfo_t info;
        struct user_struct *user;
    };
    
    struct sighand_struct {
        atomic_t        count;
        struct k_sigaction  action[_NSIG];
        spinlock_t      siglock;
        wait_queue_head_t   signalfd_wqh;
    };
    

    http://on-img.com/chart_image/5b50a492e4b0be50eab7a65a.png?_=1532930354799
    > sigset_t: bitmap for signal state
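
To make the bitmap idea concrete, here is a small user-space sketch that mirrors the layout implied by _NSIG, _NSIG_BPW and _NSIG_WORDS above. The demo_* names are hypothetical and this is not the kernel's code; it only assumes the usual convention that signal number n occupies bit n - 1.

```c
#include <limits.h>

#define DEMO_NSIG        64
#define DEMO_NSIG_BPW    ((int)(sizeof(unsigned long) * CHAR_BIT))
#define DEMO_NSIG_WORDS  (DEMO_NSIG / DEMO_NSIG_BPW)

/* One bit per signal, packed into machine words. */
typedef struct {
    unsigned long sig[DEMO_NSIG_WORDS];
} demo_sigset_t;

/* Clears every bit in the set. */
void demo_sigemptyset(demo_sigset_t *set)
{
    for (int i = 0; i < DEMO_NSIG_WORDS; i++)
        set->sig[i] = 0;
}

/* Marks signal signo (1..DEMO_NSIG) in the set. */
void demo_sigaddset(demo_sigset_t *set, int signo)
{
    unsigned long bit = (unsigned long)(signo - 1);
    set->sig[bit / DEMO_NSIG_BPW] |= 1UL << (bit % DEMO_NSIG_BPW);
}

/* Removes signal signo from the set. */
void demo_sigdelset(demo_sigset_t *set, int signo)
{
    unsigned long bit = (unsigned long)(signo - 1);
    set->sig[bit / DEMO_NSIG_BPW] &= ~(1UL << (bit % DEMO_NSIG_BPW));
}

/* Returns 1 if signo is in the set, 0 otherwise. */
int demo_sigismember(const demo_sigset_t *set, int signo)
{
    unsigned long bit = (unsigned long)(signo - 1);
    return (int)((set->sig[bit / DEMO_NSIG_BPW] >> (bit % DEMO_NSIG_BPW)) & 1UL);
}
```

With this layout a 64-bit machine needs a single word for all 64 signals, while a 32-bit machine needs two.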

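3 Sending signals

  1. raise(3)
    Sends a signal to the calling thread.
  2. kill(2)
    Sends a signal to a specific process, a process group, or all processes.
  3. killpg(3)
    Sends a signal to a process group.
  4. pthread_kill(3)
    Sends a signal to a specific thread.
  5. tgkill(2)
    Sends a signal to a specific thread; typically used to implement pthread_kill.
  6. sigqueue(3)
    Sends a signal to a specific process and can carry an int or a pointer as payload.
    sigqueue programming example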

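3.1 Signal sending flow in the kernel

linux-send-signal-process.png

Whichever path a signal is sent through, the final entry point is __send_signal.

3.1.1 Allocating the sigqueue structure

Note: if the allocation fails, signals that the kernel itself sends to the process are still delivered.

3.1.2 Task selection

The complete_signal function:

  1. Prefers the main thread.
  2. Otherwise searches all threads for one that can take the signal.

3.1.3 Queueing and return

__send_signal returns once the signal has been added to the pending list and the corresponding bit has been set in the bitmap.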

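4 Receiving signals

  1. sigaction: installs a signal handler
  2. sigwait: waits synchronously for a signal
  3. sigsuspend: waits synchronously for a signal, one time only
  4. sigblock: blocks signals
  5. siginterrupt: changes whether interrupted system calls are restarted; the default flag is false (0), meaning they are restarted
  6. sigpause: deprecated, use sigsuspend instead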

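4.1 Ways a signal can be handled

  • Kernel handler
      • If the process has not installed a handler, the kernel's default handler deals with the signal.
      • For some signals (SIGSTOP, SIGKILL) a user process may neither install a handler nor block them.
  • Process defined handler
      • If a handler has been installed, execution jumps to the process's own handler.
  • Ignore
      • The process sets the signal to be ignored.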

4.1.1 Kernel handler

  • Ignore
  • Terminate
  • Coredump
  • Stop
“
+-------------------+----------------+
| POSIX signal      | default action |
+-------------------+----------------+
| SIGHUP            | terminate      |
| SIGINT            | terminate      |
| SIGQUIT           | coredump       |
| SIGILL            | coredump       |
| SIGTRAP           | coredump       |
| SIGABRT/SIGIOT    | coredump       |
| SIGBUS            | coredump       |
| SIGFPE            | coredump       |
| SIGKILL           | terminate      |
| SIGUSR1           | terminate      |
| SIGSEGV           | coredump       |
| SIGUSR2           | terminate      |
| SIGPIPE           | terminate      |
| SIGALRM           | terminate      |
| SIGTERM           | terminate      |
| SIGCHLD           | ignore         |
| SIGCONT           | ignore         |
| SIGSTOP           | stop           |
| SIGTSTP           | stop           |
| SIGTTIN           | stop           |
| SIGTTOU           | stop           |
| SIGURG            | ignore         |
| SIGXCPU           | coredump       |
| SIGXFSZ           | coredump       |
| SIGVTALRM         | terminate      |
| SIGPROF           | terminate      |
| SIGPOLL/SIGIO     | terminate      |
| SIGSYS/SIGUNUSED  | coredump       |
| SIGSTKFLT         | terminate      |
| SIGWINCH          | ignore         |
| SIGPWR            | terminate      |
| SIGRTMIN-SIGRTMAX | terminate      |
+-------------------+----------------+
| non-POSIX signal  | default action |
+-------------------+----------------+
| SIGEMT            | coredump       |
+-------------------+----------------+
”

Excerpt from: Raghu Bharadwaj, "Mastering Linux Kernel Development: A
kernel developer's reference manual." iBooks.
4.1.2 Linux signal process flow

linux-kill-process.png

4.1.3 Process defined handler

user-signal-handle.png

Excerpt from: Raghu Bharadwaj, "Mastering Linux Kernel Development: A kernel
developer's reference manual." iBooks.

4.2 do_signal_stop flow (SIGSTOP)

  • main: flags JOBCTL_STOP_PENDING; group_stop_count is set to the number of threads
  • thread1: woken up with JOBCTL_STOP_DEQUEUED
  • thread2: woken up with JOBCTL_STOP_DEQUEUED
  • do_notify_parent_cldstop: called by the last thread to stop, which sends this signal

linux-do_signal_stop.png
linux-singal-d-state-exit.png

4.3 kill flow (SIGKILL)

linux-SIGKILL-process.png

4.4 SEGV flow

4.4.1 Fault handling table

static const struct fault_info fault_info[] = {
    { do_bad,       SIGKILL, SI_KERNEL, "ttbr address size fault"   },
    { do_bad,       SIGKILL, SI_KERNEL, "level 1 address size fault"    },
    { do_bad,       SIGKILL, SI_KERNEL, "level 2 address size fault"    },
    { do_bad,       SIGKILL, SI_KERNEL, "level 3 address size fault"    },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 0 translation fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 1 translation fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 2 translation fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 3 translation fault" },
    { do_bad,       SIGKILL, SI_KERNEL, "unknown 8"         },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 1 access flag fault" },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 2 access flag fault" },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 3 access flag fault" },
    { do_bad,       SIGKILL, SI_KERNEL, "unknown 12"            },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 1 permission fault"  },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 2 permission fault"  },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 3 permission fault"  },
    { do_sea,       SIGBUS,  BUS_OBJERR,    "synchronous external abort"    },
    { do_bad,       SIGKILL, SI_KERNEL, "unknown 17"            },
  ...
};
static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *regs)
{
    /*
     * If we are in kernel mode at this point, we have no context to
     * handle this fault with.
     */
    if (user_mode(regs)) {
        const struct fault_info *inf = esr_to_fault_info(esr);
        struct siginfo si = {
            .si_signo   = inf->sig,
            .si_code    = inf->code,
            .si_addr    = (void __user *)addr,
        };

        __do_user_fault(&si, esr);
    } else {
        __do_kernel_fault(addr, esr, regs);
    }
}

static void __do_user_fault(struct siginfo *info, unsigned int esr)
{
  ...
  arm64_force_sig_info(info, esr_to_fault_info(esr)->name, current);
}

4.4.2 The tombstoned process

system/core/debuggerd/tombstoned/tombstoned.rc

service tombstoned /system/bin/tombstoned
    user tombstoned
    group system

    # Don't start tombstoned until after the real /data is mounted.
    class late_start

    socket tombstoned_crash seqpacket 0666 system system
    socket tombstoned_intercept seqpacket 0666 system system
    socket tombstoned_java_trace seqpacket 0666 system system
    writepid /dev/cpuset/system-background/tasks

4.4.3 Signal handler

For Android N:
Android process crash handling flow
For Android O:

/*
 * This code is called after the linker has linked itself and
 * fixed it's own GOT. It is safe to make references to externs
 * and other non-local data at this point.
 */
static ElfW(Addr) __linker_init_post_relocation(KernelArgumentBlock& args) {
  ProtectedDataGuard guard;
  ...
#ifdef __ANDROID__
  debuggerd_callbacks_t callbacks = {
    .get_abort_message = []() {
      return g_abort_message;
    },
    .post_dump = &notify_gdb_of_libraries,
  };
  debuggerd_init(&callbacks);
#endif
  g_linker_logger.ResetState();
  ...
}

// Handler that does crash dumping by forking and doing the processing in the child.
// Do this by ptracing the relevant thread, and then execing debuggerd to do the actual dump.
static void debuggerd_signal_handler(int signal_number, siginfo_t* info, void* context) {
  ...
  debugger_thread_info thread_info = {
    .crash_dump_started = false,
    .pseudothread_tid = -1,
    .crashing_tid = __gettid(),
    .signal_number = signal_number,
    .info = info
  };

  // Set PR_SET_DUMPABLE to 1, so that crash_dump can ptrace us.
  int orig_dumpable = prctl(PR_GET_DUMPABLE);
  if (prctl(PR_SET_DUMPABLE, 1) != 0) {
    fatal_errno("failed to set dumpable");
  }

  // Essentially pthread_create without CLONE_FILES (see debuggerd_dispatch_pseudothread).
  pid_t child_pid =
    clone(debuggerd_dispatch_pseudothread, pseudothread_stack,
          CLONE_THREAD | CLONE_SIGHAND | CLONE_VM | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID,
          &thread_info, nullptr, nullptr, &thread_info.pseudothread_tid);
  if (child_pid == -1) {
    fatal_errno("failed to spawn debuggerd dispatch thread");
  }
  // Wait for the child to start...
  futex_wait(&thread_info.pseudothread_tid, -1);

  // and then wait for it to finish.
  futex_wait(&thread_info.pseudothread_tid, child_pid);
}

static int debuggerd_dispatch_pseudothread(void* arg) {
  debugger_thread_info* thread_info = static_cast<debugger_thread_info*>(arg);

  for (int i = 0; i < 1024; ++i) {
    close(i);
  }

  int devnull = TEMP_FAILURE_RETRY(open("/dev/null", O_RDWR));

  // devnull will be 0.
  TEMP_FAILURE_RETRY(dup2(devnull, STDOUT_FILENO));
  TEMP_FAILURE_RETRY(dup2(devnull, STDERR_FILENO));

  int pipefds[2];
  if (pipe(pipefds) != 0) {
    fatal_errno("failed to create pipe");
  }

  // Don't use fork(2) to avoid calling pthread_atfork handlers.
  int forkpid = clone(nullptr, nullptr, 0, nullptr);
  if (forkpid == -1) {
    async_safe_format_log(ANDROID_LOG_FATAL, "libc",
                          "failed to fork in debuggerd signal handler: %s", strerror(errno));
  } else if (forkpid == 0) {
    TEMP_FAILURE_RETRY(dup2(pipefds[1], STDOUT_FILENO));
    close(pipefds[0]);
    close(pipefds[1]);

    raise_caps();

    char main_tid[10];
    char pseudothread_tid[10];
    char debuggerd_dump_type[10];
    async_safe_format_buffer(main_tid, sizeof(main_tid), "%d", thread_info->crashing_tid);
    async_safe_format_buffer(pseudothread_tid, sizeof(pseudothread_tid), "%d",
                             thread_info->pseudothread_tid);
    async_safe_format_buffer(debuggerd_dump_type, sizeof(debuggerd_dump_type), "%d",
                             get_dump_type(thread_info));

    execl(CRASH_DUMP_PATH, CRASH_DUMP_NAME, main_tid, pseudothread_tid, debuggerd_dump_type,
          nullptr);

    fatal_errno("exec failed");
  } else {
    close(pipefds[1]);
    char buf[4];
    ssize_t rc = TEMP_FAILURE_RETRY(read(pipefds[0], &buf, sizeof(buf)));
    if (rc == -1) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "read of IPC pipe failed: %s",
                            strerror(errno));
    } else if (rc == 0) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "crash_dump helper failed to exec");
    } else if (rc != 1) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc",
                            "read of IPC pipe returned unexpected value: %zd", rc);
    } else {
      if (buf[0] != '\1') {
        async_safe_format_log(ANDROID_LOG_FATAL, "libc", "crash_dump helper reported failure");
      } else {
        thread_info->crash_dump_started = true;
      }
    }
    close(pipefds[0]);

    // Don't leave a zombie child.
    int status;
    if (TEMP_FAILURE_RETRY(waitpid(forkpid, &status, 0)) == -1) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "failed to wait for crash_dump helper: %s",
                            strerror(errno));
    } else if (WIFSTOPPED(status) || WIFSIGNALED(status)) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "crash_dump helper crashed or stopped");
      thread_info->crash_dump_started = false;
    }
  }

  syscall(__NR_exit, 0);
  return 0;
}

clone arguments used for the pseudothread:

clone(debuggerd_dispatch_pseudothread, pseudothread_stack,
      CLONE_THREAD | CLONE_SIGHAND | CLONE_VM | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID,
      &thread_info, nullptr, nullptr, &thread_info.pseudothread_tid);

For comparison, the flags used by pthread_create:

// http://androidxref.com/9.0.0_r3/xref/bionic/libc/bionic/pthread_create.cpp#302
int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
            CLONE_THREAD | CLONE_SYSVSEM | CLONE_SETTLS | CLONE_PARENT_SETTID |
            CLONE_CHILD_CLEARTID;
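
The difference between the two flag sets can be computed mechanically. The sketch below uses hypothetical function names and takes the CLONE_* constants from the kernel uapi header; it returns the flags present in one set but not in the other. The notable one is CLONE_FILES: without it the pseudothread gets its own copy of the file descriptor table, so its close(i) loop does not close descriptors in the crashing process.

```c
#include <linux/sched.h>    /* CLONE_* constants from the kernel uapi headers */

/* Flags passed to clone() for the debuggerd pseudothread (from the text). */
unsigned long pseudothread_clone_flags(void)
{
    return CLONE_THREAD | CLONE_SIGHAND | CLONE_VM |
           CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID;
}

/* Flags used by bionic's pthread_create (from the text). */
unsigned long pthread_create_clone_flags(void)
{
    return CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND |
           CLONE_THREAD | CLONE_SYSVSEM | CLONE_SETTLS |
           CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID;
}

/* Returns the flags that are set in a but not in b. */
unsigned long flags_only_in(unsigned long a, unsigned long b)
{
    return a & ~b;
}
```

Comparing the two sets this way shows that the pseudothread drops CLONE_FS, CLONE_FILES, CLONE_SYSVSEM, CLONE_SETTLS and CLONE_PARENT_SETTID, and adds CLONE_CHILD_SETTID so that the futex_wait on pseudothread_tid can observe the child starting.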

5 Debugging signals

5.1 tracing event

/d/tracing/events/signal/signal_generate
/d/tracing/events/signal/signal_deliver

remote_job_disp-24083 [000] d..2  5497.143322: signal_deliver: sig=9 errno=0 code=0 sa_handler=0 sa_flags=0
ActivityManager-7834  [001] d..2  5497.171845: signal_generate: sig=9 errno=0 code=0 comm=id.printspooler pid=25155 grp=1 res=0
   FileObserver-25196 [000] d..3  5497.176538: signal_generate: sig=17 errno=0 code=262146 comm=main pid=7514 grp=1 res=0
           main-7514  [002] d..2  5497.176804: signal_deliver: sig=17 errno=0 code=262146 sa_handler=7f836cfef8 sa_flags=0
ActivityManager-7834  [001] d..2  5497.222412: signal_generate: sig=9 errno=0 code=0 comm=rsonalassistant pid=24800 grp=1 res=0
ActivityManager-7834  [001] d..2  5497.227639: signal_generate: sig=9 errno=0 code=0 comm=rsonalassistant pid=24800 grp=1 res=2
  Profile Saver-24878 [000] d..3  5497.229721: signal_generate: sig=17 errno=0 code=262146 comm=main pid=717 grp=1 res=0
           main-717   [001] d..2  5497.230300: signal_deliver: sig=17 errno=0 code=262146 sa_handler=f31702e1 sa_flags=4000000
remote_job_disp-24083 [000] d..3  5497.285461: signal_generate: sig=17 errno=0 code=262146 comm=main pid=717 grp=1 res=0
           main-717   [001] d..2  5497.285844: signal_deliver: sig=17 errno=0 code=262146 sa_handler=f31702e1 sa_flags=4000000
        SysUiBg-8259  [000] d.h6  5497.365086: signal_generate: sig=32 errno=0 code=131070 comm=POSIX timer 344 pid=15551 grp=0 res=0
      Thread-24-25413 [001] d.h3  5497.751070: signal_generate: sig=32 errno=0 code=131070 comm=POSIX timer 0 pid=8158 grp=0 res=0
      Thread-24-25413 [001] d.h3  5497.868609: signal_generate: sig=32 errno=0 code=131070 comm=POSIX timer 344 pid=15551 grp=0 res=0
  Binder:7518_3-8391  [002] d.h2  5497.958303: signal_generate: sig=14 errno=0 code=128 comm=sensors.qcom pid=614 grp=1 res=0
          perfd-2666  [007] .n.1  5498.123490: tracing_mark_write: B|459|perf_lock_acq: send output handle 10233 to client(pid 7767, tid=8320)

5.2 Inspecting a process's signal masks and dispositions

cat /proc/xxx/status

mido:/ # cat /proc/7767/status                                                                                                                             
Name:   system_server
State:  S (sleeping)
Tgid:   7767
Pid:    7767
PPid:   7514
TracerPid:  0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
Ngid:   0
FDSize: 1024
Groups: 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1018 1021 1032 3001 3002 3003 3006 3007 3009 3010 9801
VmPeak:  2826432 kB
VmSize:  2702540 kB
VmLck:    144456 kB
VmPin:         0 kB
VmHWM:    383340 kB
VmRSS:    325492 kB
VmData:   438872 kB
VmStk:      8196 kB
VmExe:        16 kB
VmLib:    140536 kB
VmPTE:      1824 kB
VmSwap:    24688 kB
Threads:    211
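SigQ:   6/10397                         // queued signals / limit
SigPnd: 0000000000000000              // pending signals waiting to be handled (private to this thread)
ShdPnd: 0000000000000000              // pending signals waiting to be handled (shared by the thread group)
SigBlk: 0000000000001204              // blocked signals; here 3) SIGQUIT, 10) SIGUSR1 and 13) SIGPIPE are blocked so that upper layers can wait for them through a system call such as sigwait
SigIgn: 0000000000000001              // ignored signals
SigCgt: 20000002000084f8              // signals caught by handlers that upper layers registered through sigaction; here fault signals such as SIGABRT, SIGBUS and SIGSEGV are all caught so that a tombstone can be written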
CapInh: 0000000000000000
CapPrm: 0000001007897c20
CapEff: 0000001007897c20
CapBnd: 0000000000000000
Seccomp:    0
Cpus_allowed:   d7
Cpus_allowed_list:  0-2,4,6-7
Mems_allowed:   1
Mems_allowed_list:  0
voluntary_ctxt_switches:    69902
nonvoluntary_ctxt_switches: 3480
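
The hexadecimal masks in this output can be decoded by hand or with a small helper. The sketch below (decode_signal_mask is a hypothetical name) assumes the convention that bit n - 1 of the mask stands for signal number n, and lists the signal numbers contained in a mask string such as the SigBlk value above.

```c
#include <stdlib.h>

/* Parses a hexadecimal mask as printed in /proc/<pid>/status and stores the
 * signal numbers it contains in out[] (at most out_len entries).
 * Returns how many signals are set in the mask. */
int decode_signal_mask(const char *hex, int out[], int out_len)
{
    unsigned long long mask = strtoull(hex, NULL, 16);
    int count = 0;

    for (int signo = 1; signo <= 64; signo++) {
        if ((mask >> (signo - 1)) & 1ULL) {
            if (count < out_len)
                out[count] = signo;
            count++;
        }
    }
    return count;
}
```

Applied to SigBlk (0000000000001204) this yields signals 3, 10 and 13, matching the annotation above. Applied to SigCgt (20000002000084f8) it yields 4, 5, 6, 7, 8, 11, 16, 34 and 62.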
Tags: security algorithm