Linux signal handling flow

Linux signal handling

Outline:

  • Basic signal concepts and definitions
  • Sending signals
  • Receiving signals
  • SIGSTOP flow
  • SIGKILL flow
  • SIGSEGV flow

man pages

Basic signal concepts and definitions

Design goals of signals

  • Signals were designed to provide a basic asynchronous notification mechanism
  • A signal is a form of IPC
    Unlike other IPC mechanisms, no dedicated thread has to block waiting for messages

1. Linux signal definitions

include/uapi/asm-generic/signal.h

#define _NSIG       64
#define _NSIG_BPW   __BITS_PER_LONG
#define _NSIG_WORDS (_NSIG / _NSIG_BPW)

#define SIGHUP       1
#define SIGINT       2
#define SIGQUIT      3
#define SIGILL       4
#define SIGTRAP      5
#define SIGABRT      6
#define SIGIOT       6
#define SIGBUS       7
#define SIGFPE       8
#define SIGKILL      9
#define SIGUSR1     10
#define SIGSEGV     11
#define SIGUSR2     12
#define SIGPIPE     13
#define SIGALRM     14
#define SIGTERM     15
#define SIGSTKFLT   16
#define SIGCHLD     17
#define SIGCONT     18
#define SIGSTOP     19
#define SIGTSTP     20
#define SIGTTIN     21
#define SIGTTOU     22
#define SIGURG      23
#define SIGXCPU     24
#define SIGXFSZ     25
#define SIGVTALRM   26
#define SIGPROF     27
#define SIGWINCH    28
#define SIGIO       29
#define SIGPOLL     SIGIO
/*
#define SIGLOST     29
*/
#define SIGPWR      30
#define SIGSYS      31
#define SIGUNUSED   31

/* These should not be considered constants from userland.  */
#define SIGRTMIN    32
#ifndef SIGRTMAX
#define SIGRTMAX    _NSIG
#endif

Kernel data structures

Internal signal data structures

struct task_struct {
  /* Signal handlers: */
  struct signal_struct      *signal;    //signal state shared by the thread group (holds the shared sigpending list)
  struct sighand_struct     *sighand;
  sigset_t          blocked;
  sigset_t          real_blocked;
  /* Restored if set_restore_sigmask() was used: */
  sigset_t          saved_sigmask;
  struct sigpending     pending;        //per-thread private sigpending list
  unsigned long         sas_ss_sp;
  size_t                sas_ss_size;
  unsigned int          sas_ss_flags;
};

struct signal_struct {
    atomic_t        sigcnt;
    atomic_t        live;
    int         nr_threads;
    struct list_head    thread_head;

    wait_queue_head_t   wait_chldexit;  /* for wait4() */
  /* shared signal handling: */
    struct sigpending   shared_pending;

    /* thread group exit support */
    int         group_exit_code;
    struct task_struct  *group_exit_task;
    struct rlimit rlim[RLIM_NLIMITS];
};

struct sigpending {
    struct list_head list;
    sigset_t signal;
};

struct sigqueue {
    struct list_head list;
    int flags;
    siginfo_t info;
    struct user_struct *user;
};

struct sighand_struct {
    atomic_t        count;
    struct k_sigaction  action[_NSIG];
    spinlock_t      siglock;
    wait_queue_head_t   signalfd_wqh;
};

sigqueue structure diagram

sigset_t: bitmap for signal state

Sending signals

  1. raise(3): send a signal to the calling thread

  2. kill(2): send a signal to a specific process, a process group, or all processes

  3. killpg(3): send a signal to a process group

  4. pthread_kill(3): send a signal to a specific thread

  5. tgkill(2): send a signal to a specific thread; usually used to implement pthread_kill

  6. sigqueue(3): send a signal to a specific process together with an int or pointer payload. A sigqueue programming example is sketched below.
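
A minimal userspace sketch of sigqueue(3) carrying an int payload. This is my own illustration, not from the original article; the choice of SIGRTMIN and the fork/sigsuspend plumbing are just demo assumptions.

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static void handler(int sig, siginfo_t *info, void *ctx)
{
        /* printf is not async-signal-safe; acceptable only for a demo */
        printf("sig=%d value=%d\n", sig, info->si_value.sival_int);
}

int main(void)
{
        struct sigaction sa;
        sigset_t block, old;
        union sigval val;
        pid_t pid;

        memset(&sa, 0, sizeof(sa));
        sa.sa_sigaction = handler;
        sa.sa_flags = SA_SIGINFO;          /* handler receives siginfo_t */
        sigemptyset(&sa.sa_mask);
        sigaction(SIGRTMIN, &sa, NULL);

        /* Block SIGRTMIN so the child cannot miss it before it waits. */
        sigemptyset(&block);
        sigaddset(&block, SIGRTMIN);
        sigprocmask(SIG_BLOCK, &block, &old);

        pid = fork();
        if (pid == 0) {
                sigsuspend(&old);          /* atomically unblock and wait; handler runs here */
                _exit(0);
        }

        val.sival_int = 42;
        sigqueue(pid, SIGRTMIN, val);      /* queue SIGRTMIN with payload 42 */
        waitpid(pid, NULL, 0);
        return 0;
}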

Signal sending flow inside the kernel

graph TD
    tkill-->do_tkill
    tgkill-->do_tkill
    do_tkill-->do_send_specific
    do_send_specific-->do_send_sig_info((do_send_sig_info))
    kill-->kill_something_info
    kill_something_info-->|pid > 0|kill_pid_info
    kill_something_info-->|pid != -1|__kill_pgrp_info
    kill_something_info-->|other|group_send_sig_info
    kill_pid_info-->group_send_sig_info
    __kill_pgrp_info-->group_send_sig_info
    group_send_sig_info-->do_send_sig_info
    do_send_sig_info-->__send_signal
    __send_signal---|add the signal to the pending list|return
    rt_sigqueueinfo-->kill_proc_info
    kill_proc_info-->kill_pid_info

Whichever path the signal is sent from, the final entry point is __send_signal

Allocating the sigqueue structure

Note: even if the allocation fails, signals sent by the kernel to the process are still delivered

Task selection

The complete_signal function

  1. Prefer the main thread
  2. Otherwise search all threads in the group for one the signal can be delivered to

    Return after adding the signal to the pending list and setting the corresponding bitmap

Receiving signals

  1. sigaction: install a signal handler
  2. sigwait: synchronously wait for a signal (see the sketch below)
  3. sigsuspend: synchronously wait for a signal, one shot only
  4. sigblock: block signals
  5. siginterrupt: change the system-call restart behaviour; the default is false (0)
  6. sigpause: obsolete, use sigsuspend instead
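
A small sketch of the synchronous style with sigwait, the same pattern ART's Signal Catcher thread uses for SIGQUIT. The thread layout here is an illustrative assumption, not code from the article.

#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void *catcher(void *arg)
{
        sigset_t *set = arg;
        int sig;

        for (;;) {
                if (sigwait(set, &sig) == 0)          /* blocks until SIGQUIT arrives */
                        printf("caught signal %d synchronously\n", sig);
        }
        return NULL;
}

int main(void)
{
        sigset_t set;
        pthread_t tid;

        sigemptyset(&set);
        sigaddset(&set, SIGQUIT);
        pthread_sigmask(SIG_BLOCK, &set, NULL);       /* block before creating threads */

        pthread_create(&tid, NULL, catcher, &set);

        for (;;)
                pause();                              /* main thread does its own work */
        return 0;
}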

Signal handling paths

  • Kernel handler
    1. If the process has not installed a handler, the kernel's default handler takes care of the signal
    2. For some signals (SIGSTOP, SIGKILL) user processes may neither install a handler nor block them
  • Process defined handler
    1. If a handler has been installed, execution jumps to the process's own handler
  • Ignore
    1. The process asks for the signal to be ignored (a minimal sketch of all three paths follows below)
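
A minimal userspace sketch of the three paths, assuming nothing beyond POSIX sigaction; it also shows that installing a handler for SIGKILL is rejected.

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static void on_usr1(int sig)
{
        (void)sig;                       /* process-defined handler body */
}

int main(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));

        sa.sa_handler = on_usr1;         /* process defined handler */
        sigaction(SIGUSR1, &sa, NULL);

        sa.sa_handler = SIG_IGN;         /* explicitly ignore */
        sigaction(SIGPIPE, &sa, NULL);

        sa.sa_handler = SIG_DFL;         /* back to the kernel default action */
        sigaction(SIGTERM, &sa, NULL);

        sa.sa_handler = on_usr1;
        if (sigaction(SIGKILL, &sa, NULL) == -1)      /* not allowed for SIGKILL/SIGSTOP */
                printf("SIGKILL: %s\n", strerror(errno));
        return 0;
}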

Kernel handler

  • Ignore
  • Terminate
  • Coredump
  • Stop
“ +--------------------+------------------+
 * | POSIX signal     | default action |
 * +------------------+------------------+
 * | SIGHUP           | terminate
 * | SIGINT           | terminate
 * | SIGQUIT          | coredump
 * | SIGILL           | coredump
 * | SIGTRAP          | coredump
 * | SIGABRT/SIGIOT   | coredump
 * | SIGBUS           | coredump
 * | SIGFPE           | coredump
 * | SIGKILL          | terminate
 * | SIGUSR1          | terminate
 * | SIGSEGV          | coredump
 * | SIGUSR2          | terminate
 * | SIGPIPE          | terminate
 * | SIGALRM          | terminate
 * | SIGTERM          | terminate
 * | SIGCHLD          | ignore
 * | SIGCONT          | ignore
 * | SIGSTOP          | stop
 * | SIGTSTP          | stop
 * | SIGTTIN          | stop
 * | SIGTTOU          | stop
 * | SIGURG           | ignore
 * | SIGXCPU          | coredump
 * | SIGXFSZ          | coredump
 * | SIGVTALRM        | terminate
 * | SIGPROF          | terminate
 * | SIGPOLL/SIGIO    | terminate
 * | SIGSYS/SIGUNUSED | coredump
 * | SIGSTKFLT        | terminate
 * | SIGWINCH         | ignore
 * | SIGPWR           | terminate
 * | SIGRTMIN-SIGRTMAX| terminate
 * +------------------+------------------+
 * | non-POSIX signal | default action |
 * +------------------+------------------+
 * | SIGEMT           | coredump |
 * +--------------------+------------------+”

Excerpt from: Raghu Bharadwaj, "Mastering Linux Kernel Development: A kernel developer's reference manual." iBooks.

graph TD
    ret_to_user-->|_TIF_WORK_MASK|work_pending
    work_pending-->do_notify_resume
    do_notify_resume-->do_signal
    do_signal-->get_signal
    do_signal-->|user registered sigaction|handle_signal
    handle_signal-->user_fastforward_single_step
    user_fastforward_single_step---|User Space|return
    get_signal-->|sig_kernel_stop|do_signal_stop
    get_signal-->|SIGKILL|do_group_exit
    get_signal-->|sig_kernel_ignore|return
    get_signal-->|"未注册sigAction的用户空间信号"|return

Process defined handler

User-defined handler flow

Excerpt from: Raghu Bharadwaj, "Mastering Linux Kernel Development: A kernel developer's reference manual." iBooks.

do_signal_stop flow (SIGSTOP)

main with flags JOBCTL_STOP_PENDING; group_stop_count is the number of threads
thread1 wakes up with JOBCTL_STOP_DEQUEUED
thread2 wakes up with JOBCTL_STOP_DEQUEUED
do_notify_parent_cldstop // the last thread to stop sends this notification

sequenceDiagram
Title: Process stop flow
note left of main: main thread
note right of thread1: thread 1
note right of thread2: thread 2
note over main: wake all threads so they can stop
main-->>thread1: we are about to stop
main-->>thread2: we are about to stop
main-->main: stop running, set the STOP flag
note over thread1: stop running, set the STOP flag
note over thread1: we have a pending signal to handle
note over thread2: we have a pending signal to handle
thread1-->thread1: do_signal_stop
thread1-->thread1: stop running, set the STOP flag
thread2-->thread2: do_signal_stop
thread2-->thread2: stop running, set the STOP flag
note over thread2: I am the last one, do_notify_parent_cldstop
sequenceDiagram
Title: A group stop must wait for threads in D state to leave D state before the process can stop
note left of main: main thread
note right of thread1: thread 1
note right of thread2: thread 2
note over main,thread2: same thread group
note over kworker: kernel thread
note over xxx_lock: a mutex
kworker->>xxx_lock: mutex_lock
thread2->>xxx_lock: mutex_lock
xxx_lock->>thread2: already owned, wait here and I will wake you when the owner is done
note over thread2: waiting here, switch to D (uninterruptible) state first
note over main: get_signal
note over main: no JOBCTL_STOP_PENDING, we are not stopping yet\n
note over main: dequeue a signal and handle it
note over main: there is a SIGSTOP here\n do_signal_stop
note over main: need to wake all threads to stop\n set JOBCTL_STOP_PENDING
main-->>thread1: we are about to stop
main-->>thread2: we are about to stop
note over main: stop running, set the STOP flag
note over thread1: we have a pending signal to handle
note over thread1: do_signal_stop
note over thread1: stop running, set the STOP flag
kworker-->>xxx_lock: mutex_unlock
xxx_lock->>thread2: hi, you can take it now
note over thread2: good, I continue running
note over thread2: oh, we have a pending signal to handle
note over thread2: do_signal_stop
note over thread2: stop running, set the STOP flag
note over thread2: I am the last one, notify the parent via do_notify_parent_cldstop

kill flow (SIGKILL)

sequenceDiagram
Title: A thread in D state must leave D state before the process can be killed
note over main: main thread
note over thread1: thread 1
note over thread2: thread 2
note over main,thread2: same thread group
note over kworker: kernel thread
note over xxx_lock: a mutex
kworker->>xxx_lock: mutex_lock
thread2->>xxx_lock: mutex_lock
xxx_lock->>thread2: already owned, wait here and I will wake you when the owner is done
note over thread2: waiting here, switch to D (uninterruptible) state first
note over main: get_signal
note over main: no JOBCTL_STOP_PENDING, we are not exiting yet\n
note over main: dequeue a signal and handle it
note over main: this signal is not ignored and has no handler to run\n do_group_exit
main->main: zap_other_threads
main-->>thread1: signal_wake_up, SIGKILL
main-->>thread2: signal_wake_up, SIGKILL
note over main: do_exit
note over thread1: we have a pending signal to handle
note over thread1: this signal is not ignored and has no handler to run\n do_group_exit
note over thread1: do_exit
kworker-->>xxx_lock: mutex_unlock
xxx_lock->>thread2: hi, you can take it now
note over thread2: good, I continue running
note over thread2: oh, we have a pending signal to handle
note over thread2: this signal is not ignored and has no handler to run\n do_group_exit
note over thread2: do_exit

SIGSEGV flow

Fault handling table

static const struct fault_info fault_info[] = {
    { do_bad,       SIGKILL, SI_KERNEL, "ttbr address size fault"   },
    { do_bad,       SIGKILL, SI_KERNEL, "level 1 address size fault"    },
    { do_bad,       SIGKILL, SI_KERNEL, "level 2 address size fault"    },
    { do_bad,       SIGKILL, SI_KERNEL, "level 3 address size fault"    },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 0 translation fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 1 translation fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 2 translation fault" },
    { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "level 3 translation fault" },
    { do_bad,       SIGKILL, SI_KERNEL, "unknown 8"         },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 1 access flag fault" },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 2 access flag fault" },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 3 access flag fault" },
    { do_bad,       SIGKILL, SI_KERNEL, "unknown 12"            },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 1 permission fault"  },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 2 permission fault"  },
    { do_page_fault,    SIGSEGV, SEGV_ACCERR,   "level 3 permission fault"  },
    { do_sea,       SIGBUS,  BUS_OBJERR,    "synchronous external abort"    },
    { do_bad,       SIGKILL, SI_KERNEL, "unknown 17"            },
  ...
};
static void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *regs)
{
    /*
     * If we are in kernel mode at this point, we have no context to
     * handle this fault with.
     */
    if (user_mode(regs)) {
        const struct fault_info *inf = esr_to_fault_info(esr);
        struct siginfo si = {
            .si_signo   = inf->sig,
            .si_code    = inf->code,
            .si_addr    = (void __user *)addr,
        };

        __do_user_fault(&si, esr);
    } else {
        __do_kernel_fault(addr, esr, regs);
    }
}

static void __do_user_fault(struct siginfo *info, unsigned int esr)
{
  ...
  arm64_force_sig_info(info, esr_to_fault_info(esr)->name, current);
}

The tombstoned process

system/core/debuggerd/tombstoned/tombstoned.rc

service tombstoned /system/bin/tombstoned
    user tombstoned
    group system

    # Don't start tombstoned until after the real /data is mounted.
    class late_start

    socket tombstoned_crash seqpacket 0666 system system
    socket tombstoned_intercept seqpacket 0666 system system
    socket tombstoned_java_trace seqpacket 0666 system system
    writepid /dev/cpuset/system-background/tasks

Signal handler

For Android N: Android process crash handling flow
For Android O:

/*
 * This code is called after the linker has linked itself and
 * fixed it's own GOT. It is safe to make references to externs
 * and other non-local data at this point.
 */
static ElfW(Addr) __linker_init_post_relocation(KernelArgumentBlock& args) {
  ProtectedDataGuard guard;
  ...
#ifdef __ANDROID__
  debuggerd_callbacks_t callbacks = {
    .get_abort_message = []() {
      return g_abort_message;
    },
    .post_dump = &notify_gdb_of_libraries,
  };
  debuggerd_init(&callbacks);
#endif
  g_linker_logger.ResetState();
  ...
}

// Handler that does crash dumping by forking and doing the processing in the child.
// Do this by ptracing the relevant thread, and then execing debuggerd to do the actual dump.
static void debuggerd_signal_handler(int signal_number, siginfo_t* info, void* context) {
  ...
  debugger_thread_info thread_info = {
    .crash_dump_started = false,
    .pseudothread_tid = -1,
    .crashing_tid = __gettid(),
    .signal_number = signal_number,
    .info = info
  };

  // Set PR_SET_DUMPABLE to 1, so that crash_dump can ptrace us.
  int orig_dumpable = prctl(PR_GET_DUMPABLE);
  if (prctl(PR_SET_DUMPABLE, 1) != 0) {
    fatal_errno("failed to set dumpable");
  }

  // Essentially pthread_create without CLONE_FILES (see debuggerd_dispatch_pseudothread).
  pid_t child_pid =
    clone(debuggerd_dispatch_pseudothread, pseudothread_stack,
          CLONE_THREAD | CLONE_SIGHAND | CLONE_VM | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID,
          &thread_info, nullptr, nullptr, &thread_info.pseudothread_tid);
  if (child_pid == -1) {
    fatal_errno("failed to spawn debuggerd dispatch thread");
  }
  // Wait for the child to start...
  futex_wait(&thread_info.pseudothread_tid, -1);

  // and then wait for it to finish.
  futex_wait(&thread_info.pseudothread_tid, child_pid);
}

static int debuggerd_dispatch_pseudothread(void* arg) {
  debugger_thread_info* thread_info = static_cast<debugger_thread_info*>(arg);

  for (int i = 0; i < 1024; ++i) {
    close(i);
  }

  int devnull = TEMP_FAILURE_RETRY(open("/dev/null", O_RDWR));

  // devnull will be 0.
  TEMP_FAILURE_RETRY(dup2(devnull, STDOUT_FILENO));
  TEMP_FAILURE_RETRY(dup2(devnull, STDERR_FILENO));

  int pipefds[2];
  if (pipe(pipefds) != 0) {
    fatal_errno("failed to create pipe");
  }

  // Don't use fork(2) to avoid calling pthread_atfork handlers.
  int forkpid = clone(nullptr, nullptr, 0, nullptr);
  if (forkpid == -1) {
    async_safe_format_log(ANDROID_LOG_FATAL, "libc",
                          "failed to fork in debuggerd signal handler: %s", strerror(errno));
  } else if (forkpid == 0) {
    TEMP_FAILURE_RETRY(dup2(pipefds[1], STDOUT_FILENO));
    close(pipefds[0]);
    close(pipefds[1]);

    raise_caps();

    char main_tid[10];
    char pseudothread_tid[10];
    char debuggerd_dump_type[10];
    async_safe_format_buffer(main_tid, sizeof(main_tid), "%d", thread_info->crashing_tid);
    async_safe_format_buffer(pseudothread_tid, sizeof(pseudothread_tid), "%d",
                             thread_info->pseudothread_tid);
    async_safe_format_buffer(debuggerd_dump_type, sizeof(debuggerd_dump_type), "%d",
                             get_dump_type(thread_info));

    execl(CRASH_DUMP_PATH, CRASH_DUMP_NAME, main_tid, pseudothread_tid, debuggerd_dump_type,
          nullptr);

    fatal_errno("exec failed");
  } else {
    close(pipefds[1]);
    char buf[4];
    ssize_t rc = TEMP_FAILURE_RETRY(read(pipefds[0], &buf, sizeof(buf)));
    if (rc == -1) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "read of IPC pipe failed: %s",
                            strerror(errno));
    } else if (rc == 0) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "crash_dump helper failed to exec");
    } else if (rc != 1) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc",
                            "read of IPC pipe returned unexpected value: %zd", rc);
    } else {
      if (buf[0] != '\1') {
        async_safe_format_log(ANDROID_LOG_FATAL, "libc", "crash_dump helper reported failure");
      } else {
        thread_info->crash_dump_started = true;
      }
    }
    close(pipefds[0]);

    // Don't leave a zombie child.
    int status;
    if (TEMP_FAILURE_RETRY(waitpid(forkpid, &status, 0)) == -1) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "failed to wait for crash_dump helper: %s",
                            strerror(errno));
    } else if (WIFSTOPPED(status) || WIFSIGNALED(status)) {
      async_safe_format_log(ANDROID_LOG_FATAL, "libc", "crash_dump helper crashed or stopped");
      thread_info->crash_dump_started = false;
    }
  }

  syscall(__NR_exit, 0);
  return 0;
}

clone arguments
clone(debuggerd_dispatch_pseudothread, pseudothread_stack,
CLONE_THREAD | CLONE_SIGHAND | CLONE_VM | CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID,
&thread_info, nullptr, nullptr, &thread_info.pseudothread_tid);
// http://androidxref.com/9.0.0_r3/xref/bionic/libc/bionic/pthread_create.cpp#302
int flags = CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD | CLONE_SYSVSEM |
CLONE_SETTLS | CLONE_PARENT_SETTID | CLONE_CHILD_CLEARTID;

Signal debugging

Tracing events

/d/tracing/events/signal/signal_generate
/d/tracing/events/signal/signal_deliver

remote_job_disp-24083 [000] d..2  5497.143322: signal_deliver: sig=9 errno=0 code=0 sa_handler=0 sa_flags=0
ActivityManager-7834  [001] d..2  5497.171845: signal_generate: sig=9 errno=0 code=0 comm=id.printspooler pid=25155 grp=1 res=0
   FileObserver-25196 [000] d..3  5497.176538: signal_generate: sig=17 errno=0 code=262146 comm=main pid=7514 grp=1 res=0
           main-7514  [002] d..2  5497.176804: signal_deliver: sig=17 errno=0 code=262146 sa_handler=7f836cfef8 sa_flags=0
ActivityManager-7834  [001] d..2  5497.222412: signal_generate: sig=9 errno=0 code=0 comm=rsonalassistant pid=24800 grp=1 res=0
ActivityManager-7834  [001] d..2  5497.227639: signal_generate: sig=9 errno=0 code=0 comm=rsonalassistant pid=24800 grp=1 res=2
  Profile Saver-24878 [000] d..3  5497.229721: signal_generate: sig=17 errno=0 code=262146 comm=main pid=717 grp=1 res=0
           main-717   [001] d..2  5497.230300: signal_deliver: sig=17 errno=0 code=262146 sa_handler=f31702e1 sa_flags=4000000
remote_job_disp-24083 [000] d..3  5497.285461: signal_generate: sig=17 errno=0 code=262146 comm=main pid=717 grp=1 res=0
           main-717   [001] d..2  5497.285844: signal_deliver: sig=17 errno=0 code=262146 sa_handler=f31702e1 sa_flags=4000000
        SysUiBg-8259  [000] d.h6  5497.365086: signal_generate: sig=32 errno=0 code=131070 comm=POSIX timer 344 pid=15551 grp=0 res=0
      Thread-24-25413 [001] d.h3  5497.751070: signal_generate: sig=32 errno=0 code=131070 comm=POSIX timer 0 pid=8158 grp=0 res=0
      Thread-24-25413 [001] d.h3  5497.868609: signal_generate: sig=32 errno=0 code=131070 comm=POSIX timer 344 pid=15551 grp=0 res=0
  Binder:7518_3-8391  [002] d.h2  5497.958303: signal_generate: sig=14 errno=0 code=128 comm=sensors.qcom pid=614 grp=1 res=0
          perfd-2666  [007] .n.1  5498.123490: tracing_mark_write: B|459|perf_lock_acq: send output handle 10233 to client(pid 7767, tid=8320)

Viewing a process's signal mask and handler information

cat /proc/xxx/status

mido:/ # cat /proc/7767/status                                                                                                                             
Name:   system_server
State:  S (sleeping)
Tgid:   7767
Pid:    7767
PPid:   7514
TracerPid:  0
Uid:    1000    1000    1000    1000
Gid:    1000    1000    1000    1000
Ngid:   0
FDSize: 1024
Groups: 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1018 1021 1032 3001 3002 3003 3006 3007 3009 3010 9801
VmPeak:  2826432 kB
VmSize:  2702540 kB
VmLck:    144456 kB
VmPin:         0 kB
VmHWM:    383340 kB
VmRSS:    325492 kB
VmData:   438872 kB
VmStk:      8196 kB
VmExe:        16 kB
VmLib:    140536 kB
VmPTE:      1824 kB
VmSwap:    24688 kB
Threads:    211
SigQ:   6/10397                         //queued signals / limit
SigPnd: 0000000000000000              //pending signals waiting to be handled (private to this thread)
ShdPnd: 0000000000000000              //pending signals waiting to be handled (shared by the thread group)
SigBlk: 0000000000001204              //blocked signals; here SIGQUIT (3), SIGUSR1 (10) and SIGPIPE (13) are blocked and waited on via sigwait by the upper layers
SigIgn: 0000000000000001              //ignored signals
SigCgt: 20000002000084f8              //signals caught via sigaction by the upper layers; SIGABRT, SIGBUS, SIGSEGV and other fault signals are caught here to produce tombstones
CapInh: 0000000000000000
CapPrm: 0000001007897c20
CapEff: 0000001007897c20
CapBnd: 0000000000000000
Seccomp:    0
Cpus_allowed:   d7
Cpus_allowed_list:  0-2,4,6-7
Mems_allowed:   1
Mems_allowed_list:  0
voluntary_ctxt_switches:    69902
nonvoluntary_ctxt_switches: 3480
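
These Sig* fields are hex bitmaps; below is a tiny hypothetical helper (not part of any standard tool) that lists the signal numbers set in such a mask. Running it with 1204 prints 3, 10 and 13, matching the SigBlk comment above (SIGQUIT, SIGUSR1, SIGPIPE).

/* Decode a /proc/<pid>/status signal mask such as "SigBlk: 0000000000001204":
 * if bit (N - 1) is set, signal N is a member of the set. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
        unsigned long long mask = strtoull(argc > 1 ? argv[1] : "1204", NULL, 16);
        int sig;

        for (sig = 1; sig <= 64; sig++)
                if (mask & (1ULL << (sig - 1)))
                        printf("signal %d is in the set\n", sig);
        return 0;
}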

Debugging the Linux kernel with qemu

qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -nographic -smp 4 -m 2048 -append "console=ttyAMA0" -kernel ~/work/src/linux.git/arch/arm64/boot/Image -initrd ~/work/buildroot-2018.05/output/images/rootfs.cpio.bz2

qemu-system-aarch64 -machine virt -cpu cortex-a57 -machine type=virt -nographic -smp 4 -m 2048 -append "console=ttyAMA0" -kernel aarch64-linux-3.16-buildroot-gdb.img -fsdev local,id=r,path=/home/schspa/work/src/signal-test,security_model=none -device virtio-9p-device,fsdev=r,mount_tag=r -s -S

compile qemu
./configure --target-list=aarch64-softmmu --enable-virtfs
make -j8

mount -t 9p -o trans=virtio r /mnt

http://www.voidcn.com/article/p-xlgkygpv-p.html
https://kennedy-han.github.io/2015/06/15/QEMU-arm64-guide.html

Tips for using the crash tool

  1. The bt command, self-explanatory
  2. Inspect the Android ION memory allocator state
crash> p num_heaps
num_heaps = $1 = 7
crash> 
crash> 
crash> p heaps
heaps = $2 = (struct ion_heap **) 0xffffffe531e1c480
crash> rd -64 0xffffffe531e1c480 7
ffffffe531e1c480:  ffffffe531e05800 ffffffe5318fb000   .X.1.......1....
ffffffe531e1c490:  ffffffe5318fb100 ffffffe5318fb200   ...1.......1....
ffffffe531e1c4a0:  ffffffe5318fb300 ffffffe5318fb400   ...1.......1....
ffffffe531e1c4b0:  ffffffe531e21c08                    ...1....
crash> struct ion_heap.name,dev,total_allocated 0xffffffe531e05800
  name = 0xffffff98e65926b0 "system"
  dev = 0xffffffe5318f8400
  total_allocated = {
    counter = 1914679296
  }
crash> struct ion_device.clients 0xffffffe5318f8400
  clients = {
    rb_node = 0xffffffe50655da00
  }
tree -t rbtree -o ion_client.node -s ion_client.pid,name,display_name,handles 0xffffffe5318f84c0 > ion_clients_system_heap.txt
crash> struct -o ion_client.handles 0xffffffe40d0da900
struct ion_client {
  [ffffffe40d0da920] struct rb_root handles;
}
crash> tree -t rbtree -o ion_handle.node -s ion_handle.buffer 0xffffffe40d0da920
ffffffe4116fea80
  buffer = 0xffffffe40d0db800
crash> ion_buffer.size 0xffffffe40d0db800
  size = 28672

In this way you can find the size of every buffer

linux-context

Contexts in Linux

Based on ARM64

ARM64 exception levels

ARM64 Exception level

  • The Linux kernel runs in the Normal world at EL1
  • On today's phones only one guest OS, Linux, runs at EL1 in the Normal world
  • The Secure world runs the TZ apps and the TrustZone OS image
  • EL2 is currently used for virtualization
  • EL3 performs secure resource-protection checks
    [//]: the figure above only applies to ARM64, i.e. AArch64

Contexts in Linux

1. Hardirq (top half)

  1. Runs in hardware interrupt context; when an interrupt fires, execution jumps directly to the exception vector table
  2. Different interrupts may preempt each other
  3. A given interrupt runs on only one CPU at any time

2. Softirq (bottom half)

  1. When interrupt handling completes and returns, the kernel checks whether any softirq work is pending and runs it if so
  2. Can interrupt every other context except hardirq
  3. The following levels exist
enum
{
    HI_SOFTIRQ=0,
    TIMER_SOFTIRQ,
    NET_TX_SOFTIRQ,
    NET_RX_SOFTIRQ,
    BLOCK_SOFTIRQ,
    IRQ_POLL_SOFTIRQ,
    TASKLET_SOFTIRQ,
    SCHED_SOFTIRQ,
    HRTIMER_SOFTIRQ, /* Unused, but kept as tools rely on the
                numbering. Sigh! */
    RCU_SOFTIRQ,    /* Preferable RCU should always be the last softirq */

    NR_SOFTIRQS
};
  1. The number of softirqs and their handler functions are fixed at compile time; they cannot be registered at runtime
  2. irq_handle->gic_handle_irq()->handle_domain_irq()->irq_exit() -> invoke_softirq()
    Execution order: processed one by one, in order
  3. The same softirq may run on several CPU cores at the same time, so its data needs locking

3. tasklet

  1. A dynamically registerable bottom-half mechanism built on top of softirqs (a kernel-module sketch follows below)
  2. Two softirq levels

+ tasklet_vec ------ TASKLET_SOFTIRQ --- tasklet_action
+ tasklet_hi_vec --- HI_SOFTIRQ -------- tasklet_hi_action
3. The same tasklet runs on only one CPU core at a time; it is never scheduled onto several cores simultaneously
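
A minimal kernel-module sketch of the tasklet API described above, assuming the older (pre-5.9) DECLARE_TASKLET interface that matches the kernel versions discussed here; the module and function names are made up.

#include <linux/interrupt.h>
#include <linux/module.h>
#include <linux/smp.h>

/* Bottom half: runs in softirq (atomic) context, so it must not sleep. */
static void demo_tasklet_fn(unsigned long data)
{
        pr_info("tasklet ran on cpu %d, data=%lu\n", smp_processor_id(), data);
}

static DECLARE_TASKLET(demo_tasklet, demo_tasklet_fn, 42);

static int __init demo_init(void)
{
        /* In a real driver this would be called from the interrupt handler. */
        tasklet_schedule(&demo_tasklet);     /* queued on TASKLET_SOFTIRQ */
        return 0;
}

static void __exit demo_exit(void)
{
        tasklet_kill(&demo_tasklet);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");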

4. Atomic context

  1. Process context becomes atomic context after calling spinlock, preemption-disabling, or interrupt-disabling functions
  2. Functions that may put the process to sleep must not be called in atomic context
  3. The hardirq, softirq and tasklet contexts above are all atomic contexts

5. Process context

  1. Ordinary kernel threads and ordinary processes

Linux synchronization techniques

Unreliable Guide To Locking

How to lock in each context

|                | IRQ Handler A | IRQ Handler B | Softirq A | Softirq B | Tasklet A | Tasklet B | Timer A | Timer B | User Context A | User Context B |
|----------------|---------------|---------------|-----------|-----------|-----------|-----------|---------|---------|----------------|----------------|
| IRQ Handler A  | None          |               |           |           |           |           |         |         |                |                |
| IRQ Handler B  | SLIS          | None          |           |           |           |           |         |         |                |                |
| Softirq A      | SLI           | SLI           | SL        |           |           |           |         |         |                |                |
| Softirq B      | SLI           | SLI           | SL        | SL        |           |           |         |         |                |                |
| Tasklet A      | SLI           | SLI           | SL        | SL        | None      |           |         |         |                |                |
| Tasklet B      | SLI           | SLI           | SL        | SL        | SL        | None      |         |         |                |                |
| Timer A        | SLI           | SLI           | SL        | SL        | SL        | SL        | None    |         |                |                |
| Timer B        | SLI           | SLI           | SL        | SL        | SL        | SL        | SL      | None    |                |                |
| User Context A | SLI           | SLI           | SLBH      | SLBH      | SLBH      | SLBH      | SLBH    | SLBH    | None           |                |
| User Context B | SLI           | SLI           | SLBH      | SLBH      | SLBH      | SLBH      | SLBH    | SLBH    | MLI            | None           |

Table: Table of Locking Requirements

| NAME | lock_type                |
|------|--------------------------|
| SLIS | spin_lock_irqsave        |
| SLI  | spin_lock_irq            |
| SL   | spin_lock                |
| SLBH | spin_lock_bh             |
| MLI  | mutex_lock_interruptible |

Locks in Linux

Atomic variables

  1. Atomic operations are the foundation of every other lock
  2. What atomic operations solve in synchronization:
     contention between multiple cores,
     contention between multiple processes (why?) — see the table and the sketch below

| Instance 1                     | Instance 2                     |
|--------------------------------|--------------------------------|
| read very_important_count (5)  |                                |
|                                | read very_important_count (5)  |
| add 1 (6)                      |                                |
|                                | add 1 (6)                      |
| write very_important_count (6) |                                |
|                                | write very_important_count (6) |
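
The lost update in the table is exactly what atomic_t prevents; a kernel-style sketch (names are illustrative):

#include <linux/atomic.h>

static atomic_t very_important_count = ATOMIC_INIT(0);

/* The whole read-modify-write is one atomic operation (an LDXR/STXR retry
 * loop on ARM64), so the lost update shown in the table cannot happen. */
void bump(void)
{
        atomic_inc(&very_important_count);
}

int snapshot(void)
{
        return atomic_read(&very_important_count);
}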

ARM64 implementation:
ARMv7-A and ARMv8-A architectures both provide support for exclusive memory accesses.
In A64, this is the Load/Store exclusive (LDXR/STXR) pair.

crash64> dis _raw_spin_lock
0xffffff8f41b5e4f8 <__cpuidle_text_end>:        mrs     x2, sp_el0
0xffffff8f41b5e4fc <_raw_spin_lock+4>:  ldr     w1, [x2,#16]
0xffffff8f41b5e500 <_raw_spin_lock+8>:  add     w1, w1, #0x1
0xffffff8f41b5e504 <_raw_spin_lock+12>: str     w1, [x2,#16]
0xffffff8f41b5e508 <_raw_spin_lock+16>: prfm    pstl1strm, [x0]
0xffffff8f41b5e50c <_raw_spin_lock+20>: ldaxr   w1, [x0] // exclusive load
0xffffff8f41b5e510 <_raw_spin_lock+24>: add     w2, w1, #0x10, lsl #12
0xffffff8f41b5e514 <_raw_spin_lock+28>: stxr    w3, w2, [x0] // exclusive store
0xffffff8f41b5e518 <_raw_spin_lock+32>: cbnz    w3, 0xffffff8f41b5e50c  // if w3 is non-zero, someone else touched [x0]; the read-modify-write must be retried
0xffffff8f41b5e51c <_raw_spin_lock+36>: eor     w2, w1, w1, ror #16
0xffffff8f41b5e520 <_raw_spin_lock+40>: cbz     w2, 0xffffff8f41b5e538
0xffffff8f41b5e524 <_raw_spin_lock+44>: sevl
0xffffff8f41b5e528 <_raw_spin_lock+48>: wfe
0xffffff8f41b5e52c <_raw_spin_lock+52>: ldaxrh  w3, [x0]
0xffffff8f41b5e530 <_raw_spin_lock+56>: eor     w2, w3, w1, lsr #16
0xffffff8f41b5e534 <_raw_spin_lock+60>: cbnz    w2, 0xffffff8f41b5e528
0xffffff8f41b5e538 <_raw_spin_lock+64>: ret

Memory barrier

  1. Data Memory Barrier (DMB). This forces all earlier-in-program-order memory accesses to become globally visible before any subsequent accesses. 会强制化使所有对内存的操作可以被下边的指令可见
  2. Data Synchronization Barrier (DSB). All pending loads and stores, cache maintenance instructions, and all TLB maintenance instructions, are completed before program execution continues. A DSB behaves like a DMB, but with additional properties. 加入了更多的tlb,cache相关flush操作,比DMB更强力
  3. Instruction Synchronization Barrier (ISB). This instruction flushes the CPU pipeline and prefetch buffers, causing instructions after the ISB to be fetched (or re-fetched) from cache or memory. flush流水线,重新装载流水线指令缓存。

Example:

LDR X0, [X3]
LDNP X2, X1, [X0] // X0 may not be loaded when the instruction executes!
To correct the above, you need an explicit load barrier:
LDR X0, [X3]
DMB nshld
LDNP X2, X1, [X0]

spinlock

crash64> whatis arch_spinlock_t
typedef struct {
    u16 owner;
    u16 next;
} arch_spinlock_t;

The spinlock is implemented with two fields (a ticket lock) so that contention between multiple CPUs cannot cause starvation
It is implemented in assembly for performance
Linux内核同步机制之(四):spin lock

C code (reference implementation only; keep in mind which of these steps must actually be atomic):

_raw_spin_lock(arch_spinlock_t *lock){
    arch_spinlock_t local_lock;
    local_lock = *lock;      /* take a ticket: this must be an atomic read-modify-write */
    lock->next++;
retry:
    if (lock->owner == local_lock.next)
        return ;
    wfe();                   /* wait for an event sent by the unlock path */
    goto retry;
}

rwlock

typedef struct {
    volatile unsigned int lock;
} arch_rwlock_t;

Synchronization semantics

  • Multiple readers can execute the read side at the same time
  • Only one writer can execute the write side at a time
  • A writer must wait until all readers have finished (a usage sketch follows below)
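
A kernel-style sketch of the reader and writer sides of an rwlock (all names invented for illustration):

#include <linux/spinlock.h>

static DEFINE_RWLOCK(cfg_lock);
static int cfg_a, cfg_b;

/* Read side: any number of readers may run here concurrently. */
int cfg_sum(void)
{
        int sum;

        read_lock(&cfg_lock);
        sum = cfg_a + cfg_b;
        read_unlock(&cfg_lock);
        return sum;
}

/* Write side: one writer at a time, and it waits for all readers to drain. */
void cfg_set(int a, int b)
{
        write_lock(&cfg_lock);
        cfg_a = a;
        cfg_b = b;
        write_unlock(&cfg_lock);
}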

Lock value layout

| bit 31               | bits 30-0           |
|----------------------|---------------------|
| Write Thread Counter | Read Thread Counter |

Linux中常见同步机制设计原理

seqlock

typedef struct {
    struct seqcount seqcount;
    spinlock_t lock;
} seqlock_t;

typedef struct seqcount {
    unsigned sequence;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
    struct lockdep_map dep_map;
#endif
} seqcount_t;

Synchronization semantics

  • Multiple readers can execute the read side at the same time
  • Only one writer can execute the write side at a time
  • Writers do not wait for readers to finish
  • If a read races with a write, the reader must start over

The seqcount is initialised to 0; an odd value means a writer is currently inside the critical section

void read(void)
{
       bool x, y;
       unsigned s;

       do {
               s = read_seqcount_begin(&seq);

               x = X; y = Y;

       } while (read_seqcount_retry(&seq, s));
}
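
The matching write side, sketched from the semantics above; the external spinlock that serialises writers is an assumption of this example, not shown in the original.

#include <linux/seqlock.h>
#include <linux/spinlock.h>

/* Writers must already be serialised with each other (here by a spinlock).
 * The sequence count is odd while the write is in progress, which makes
 * concurrent readers retry. */
static seqcount_t seq = SEQCNT_ZERO(seq);
static DEFINE_SPINLOCK(seq_write_lock);   /* illustrative writer-side lock */
static bool X, Y;

void update(bool x, bool y)
{
        spin_lock(&seq_write_lock);
        write_seqcount_begin(&seq);   /* sequence becomes odd */
        X = x;
        Y = y;
        write_seqcount_end(&seq);     /* sequence becomes even again */
        spin_unlock(&seq_write_lock);
}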

Tree RCU (not SRCU; atomic context: code inside an RCU read-side critical section may not sleep and may not be preempted)

Synchronization semantics

  • Lock-free operation: neither readers nor writers take any lock
  • Highest performance, but a narrow range of applicability and harder to use correctly
int register_cxl_calls(struct cxl_calls *calls)
{
    if (cxl_calls)
        return -EBUSY;

    rcu_assign_pointer(cxl_calls, calls);
    return 0;
}
EXPORT_SYMBOL_GPL(register_cxl_calls);

void unregister_cxl_calls(struct cxl_calls *calls)
{
    BUG_ON(cxl_calls->owner != calls->owner);
    RCU_INIT_POINTER(cxl_calls, NULL);
    synchronize_rcu();
}
EXPORT_SYMBOL_GPL(unregister_cxl_calls);

Two concepts

Grace Period (GP)
  • The period from the start of the update until every CPU core has passed through a QS at least once; during this window the old RCU-protected object must not be freed
Quiescent State (QS)
  • A CPU that has gone through the scheduler has passed a QS; RCU comes in bh, sched and other flavours, and which functions to call depends on the context in which the protected variable is used (a reader-side sketch follows below)
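
The matching read side for the cxl_calls example above, as a sketch; the caller function is hypothetical and the pointer declaration mirrors the registration code.

#include <linux/errno.h>
#include <linux/rcupdate.h>

struct cxl_calls;                          /* opaque here; defined elsewhere */
extern struct cxl_calls __rcu *cxl_calls;  /* the pointer registered above */

int call_into_cxl(void)
{
        struct cxl_calls *calls;
        int ret = -ENODEV;

        rcu_read_lock();                      /* read-side critical section: no sleeping */
        calls = rcu_dereference(cxl_calls);   /* fetch the RCU-protected pointer */
        if (calls)
                ret = 0;                      /* the real code would call through calls-> here */
        rcu_read_unlock();
        return ret;
}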

GP diagram

Mutex Lock

  1. When a mutex cannot be acquired, the caller gives up the CPU and other tasks run instead
  2. Mutexes are suited to cases where the lock is held or waited on for a long time
  3. After acquiring a mutex, the task may still be scheduled out or sleep
  4. Taking and releasing a mutex is relatively expensive (a usage sketch follows below)
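
A kernel-style usage sketch contrasting with the spinlock case: the critical section may sleep, so it must run in process context (names are illustrative):

#include <linux/errno.h>
#include <linux/mutex.h>
#include <linux/string.h>

static DEFINE_MUTEX(cfg_mutex);
static char cfg_buf[128];

/* mutex_lock() may sleep, so this is only legal in process context,
 * never from hardirq/softirq/tasklet or any other atomic context. */
int cfg_update(const char *src, size_t len)
{
        if (len >= sizeof(cfg_buf))
                return -EINVAL;

        mutex_lock(&cfg_mutex);       /* schedules away if the lock is contended */
        memcpy(cfg_buf, src, len);
        cfg_buf[len] = '\0';
        mutex_unlock(&cfg_mutex);
        return 0;
}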

android event log

cat /system/etc/event-log-tags
frameworks/base/services/core/java/com/android/server/am/EventLogTags.logtags

cpu

2721 cpu (total|1|6),(user|1|6),(system|1|6),(iowait|1|6),(irq|1|6),(softirq|1|6)
05-12 19:39:29.354 1562 1792 I cpu : [98,1,91,5,0,0]

Total CPU 98, user space 1, system 91, iowait 5, irq 0, softirq 0.

binder_sample

Monitors how long binder transactions on each process's main thread take; when the time exceeds a threshold, information about the target call is logged.
52004 binder_sample (descriptor|3),(method_num|1|5),(time|1|3),(blocking_package|3),(sample_percent|1|6)
05-12 19:39:35.682 7185 7185 I binder_sample: [android.app.IActivityManager,26,52234,android.process.media,100]
A call on the android.app.IActivityManager interface, code=26, took 52234 ms, blocked on the android.process.media process, sample percentage 100%

dvm_lock_sample

Printed at art/runtime/monitor.cc
When a thread has been blocked waiting for a lock longer than the threshold, the current lock-holding state is logged;
20003 dvm_lock_sample (process|3),(main|1|5),(thread|3),(time|1|3),(file|3),(line|1|5),(ownerfile|3),(ownerline|1|5),(sample_percent|1|6)
05-12 19:40:52.959 1562 2100 I dvm_lock_sample: [system_server,0,WifiStateMachine,12203,ActivityManagerService.java,20291,PendingIntentRecord.java,253,0]

The system_server process, not the main thread, thread WifiStateMachine, waited 12203 ms for a lock at ActivityManagerService.java line 20291; the lock is held at PendingIntentRecord.java line 253; sample percentage 0

am_lifecycle_sample

When an app's main-thread lifecycle callback takes longer than the threshold, the corresponding information is logged;
30100 am_lifecycle_sample (User|1|5),(Process Name|3),(MessageCode|1|5),(time|1|3)
05-12 19:40:52.999 21104 21104 I am_lifecycle_sample: [0,com.tencent.wework,115,11917]

Android trace field analysis

Example log:

suspend all histogram:  Sum: 10.106s 99% C.I. 41.374us-14883.596us Avg: 738.922us Max: 452703us
DALVIK THREADS (138):
"Signal Catcher" daemon prio=5 tid=2 Runnable
  | group="system" sCount=0 dsCount=0 obj=0x12c010d0 self=0x7fab654a00
  | sysTid=1994 nice=0 cgrp=default sched=0/0 handle=0x7fb300b450
  | state=R schedstat=( 60023536568 9178338589 34437 ) utm=4300 stm=1702 core=3 HZ=100
  | stack=0x7fb2f11000-0x7fb2f13000 stackSize=1005KB
  | held mutexes= "mutator lock"(shared held)
  native: #00 pc 0000000000477cec  /system/lib64/libart.so (_ZN3art15DumpNativeStackERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEEiP12BacktraceMapPKcPNS_9ArtMethodEPv+220)
  native: #01 pc 0000000000477ce8  /system/lib64/libart.so (_ZN3art15DumpNativeStackERNSt3__113basic_ostreamIcNS0_11char_traitsIcEEEEiP12BacktraceMapPKcPNS_9ArtMethodEPv+216)
  native: #02 pc 000000000044bd94  /system/lib64/libart.so (_ZNK3art6Thread9DumpStackERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEEbP12BacktraceMap+472)
  native: #03 pc 0000000000463c2c  /system/lib64/libart.so (_ZN3art14DumpCheckpoint3RunEPNS_6ThreadE+904)
  native: #04 pc 000000000045b9b8  /system/lib64/libart.so (_ZN3art10ThreadList13RunCheckpointEPNS_7ClosureEb+808)
  native: #05 pc 000000000045b418  /system/lib64/libart.so (_ZN3art10ThreadList4DumpERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEEb+308)
  native: #06 pc 000000000045b2a0  /system/lib64/libart.so (_ZN3art10ThreadList14DumpForSigQuitERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEE+804)
  native: #07 pc 00000000004373d8  /system/lib64/libart.so (_ZN3art7Runtime14DumpForSigQuitERNSt3__113basic_ostreamIcNS1_11char_traitsIcEEEE+332)
  native: #08 pc 000000000043da90  /system/lib64/libart.so (_ZN3art13SignalCatcher13HandleSigQuitEv+2240)
  native: #09 pc 000000000043c5b8  /system/lib64/libart.so (_ZN3art13SignalCatcher3RunEPv+476)
  native: #10 pc 0000000000068164  /system/lib64/libc.so (_ZL15__pthread_startPv+196)
  native: #11 pc 000000000001db40  /system/lib64/libc.so (__start_thread+16)
  (no managed stack frames)

Explanation:
Printed at http://androidxref.com/8.0.0_r4/xref/art/runtime/thread.cc#1517
"Signal Catcher" daemon prio=5 tid=2 Runnable
| group="system" sCount=0 dsCount=0 obj=0x12c010d0 self=0x7fab654a00
| sysTid=1994 nice=0 cgrp=default sched=0/0 handle=0x7fb300b450
| state=R schedstat=( 60023536568 9178338589 34437 ) utm=4300 stm=1702 core=3 HZ=100
| stack=0x7fb2f11000-0x7fb2f13000 stackSize=1005KB
| held mutexes= "mutator lock"(shared held)

  • name: "Signal Catcher"
  • daemon: a Java daemon thread, java_lang_Thread_daemon
  • prio: priority, http://androidxref.com/8.0.0_r4/xref/art/runtime/thread_android.cc#36
  • tid: thread id.
  • Runnable: thread state (R, S, D, etc.)
  • group="system", java_lang_ThreadGroup_name
  • sCount: thread->tls32_.suspend_count
  • dsCount: thread->tls32_.debug_suspend_count
  • flags=" << thread->tls32_.state_and_flags.as_struct.flags
  • obj=" << reinterpret_cast<void*>(thread->tlsPtr_.opeer)
  • self=" << reinterpret_cast(thread) << "\n";
  • sysTid=1994
  • nice=0: from the getpriority system call
  • cgrp=default: the cpu: line in /proc/self/task/<pid>/cgroup, or default if absent
  • sched: from the sched_getscheduler()/sched_getparam() system calls
  • handle=0x7fb300b450 reinterpret_cast<void*>(thread->tlsPtr_.pthread_self);
  • state=R: kernel task state, the state field (the S below) in the /proc stat output quoted underneath
    (3) state %c
    One of the following characters, indicating process
    state:

                    R  Running
    
                    S  Sleeping in an interruptible wait
    
                    D  Waiting in uninterruptible disk sleep
    
                    Z  Zombie
    
                    T  Stopped (on a signal) or (before Linux 2.6.33)
                       trace stopped
    
                    t  Tracing stop (Linux 2.6.33 onward)
    
                    W  Paging (only before Linux 2.6.0)
    
                    X  Dead (from Linux 2.6.0 onward)
    
                    x  Dead (Linux 2.6.33 to 3.13 only)
    
                    K  Wakekill (Linux 2.6.33 to 3.13 only)
    
                    W  Waking (Linux 2.6.33 to 3.13 only)
    
                    P  Parked (Linux 3.9 to 3.13 only)
    
cat /proc/5788/task/5788/stat
5788 (system_server) S 5459 5459 0 0 -1 1077936448 181146 0 78 0 2444 533 0 0 -2 -20 203 0 67425 2766426112 90450 18446744073709551615 366503874560 366503889516 548938123040 548938113152 548161414464 0 4612 1 36088 18446743798833821120 0 0 17 1 1 1 0 0 0 366503897704 366503899144 366714880000 548938124082 548938124181 548938124181 548938125278 0
  • schedstat=( 60023536568 9178338589 34437 )
/proc/self/task/%d/schedstat
/proc/<pid>/schedstat
----------------
schedstats also adds a new /proc/<pid>/schedstat file to include some of
the same information on a per-process level.  There are three fields in
this file correlating for that process to:
     1) time spent on the cpu
     2) time spent waiting on a runqueue
     3) # of timeslices run on this cpu
  • utm=4300
    (14) utime %lu
    Amount of time that this process has been scheduled
    in user mode, measured in clock ticks (divide by
    sysconf(_SC_CLK_TCK)). This includes guest time,
    guest_time (time spent running a virtual CPU, see
    below), so that applications that are not aware of
    the guest time field do not lose that time from
    their calculations.
  • stm=1702
    (15) stime %lu
    Amount of time that this process has been scheduled
    in kernel mode, measured in clock ticks (divide by
    sysconf(_SC_CLK_TCK)).
  • core=3
    (39) processor %d (since Linux 2.2.8)
    CPU number last executed on.
  • HZ=100 sysconf(_SC_CLK_TCK)
  • stack=0x7fb2f11000-0x7fb2f13000 stackSize=1005KB
  • held mutexes= "mutator lock"(shared held)
    Lock type
    if it is a reader-writer lock:
    (exclusive held) exclusive_owner_ = self
    (shared held) exclusive_owner_ = self, maybe not locked

About major and minor faults
Sometimes, when memory pressure in the system is high, a log like the one below appears.
It contains the line 32% 1582/system_server: 18% user + 13% kernel / faults: 7549 minor 35 major.
So what does "minor" mean here?
See st.rel_minfaults in http://androidxref.com/8.0.0_r4/xref/frameworks/base/core/java/com/android/internal/os/ProcessCpuTracker.java#239

Through ProcessCpuTracker, Android computes the difference in fault counts between the previous sample and the current one, so the line 4431ms to -1209ms ... faults: 7549 minor 35 major below means that system_server incurred 7549 minor and 35 major faults within 5640 ms

03-10 11:42:57.440 565 565 E lowmemorykiller: Error writing /proc/4357/oom_score_adj; errno=22
03-10 11:42:57.442 9819 9819 D AndroidRuntime: Shutting down VM
03-10 11:42:57.453 1582 1597 E ActivityManager: ANR in com.tencent.mm:appbrand0
03-10 11:42:57.453 1582 1597 E ActivityManager: PID: 9165
03-10 11:42:57.453 1582 1597 E ActivityManager: Reason: Broadcast of Intent { flg=0x10 cmp=com.tencent.mm/.plugin.appbrand.task.AppBrandTaskPreloadReceiver }
03-10 11:42:57.453 1582 1597 E ActivityManager: Load: 0.0 / 0.0 / 0.0
03-10 11:42:57.453 1582 1597 E ActivityManager: CPU usage from 4431ms to -1209ms ago (2018-03-10 11:42:51.783 to 2018-03-10 11:42:57.423):
03-10 11:42:57.453 1582 1597 E ActivityManager: 95% 9165/com.tencent.mm:appbrand0: 39% user + 55% kernel / faults: 61 minor 2 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 32% 1582/system_server: 18% user + 13% kernel / faults: 7549 minor 35 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 19% 442/logd: 8.6% user + 10% kernel / faults: 166 minor 2 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 0.9% 2544/mcd: 0.1% user + 0.7% kernel / faults: 21 minor 1 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 6.9% 2758/com.miui.daemon: 6% user + 0.8% kernel / faults: 6797 minor 795 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 4.9% 2483/com.android.phone: 4.2% user + 0.7% kernel / faults: 1202 minor 2 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 0.1% 2988/com.miui.home: 0.1% user + 0% kernel / faults: 475 minor 14 major
03-10 11:42:57.453 1582 1597 E ActivityManager: 0.1% 774/keystore: 0% user + 0.1% kernel / faults: 28 minor

dumpsys cpuinfo

------ DUMPSYS CPUINFO (/system/bin/dumpsys -t 10 cpuinfo -a) ------
Load: 4.53 / 3.7 / 3.43
CPU usage from 24658ms to 692ms ago (2018-03-16 09:19:42.228 to 2018-03-16 09:20:06.194):
  31% 11387/system_server: 18% user + 13% kernel / faults: 30112 minor 30 major
  0% 11821/com.android.settings: 0% user + 0% kernel / faults: 28606 minor 13 major
  5.4% 11645/com.android.systemui: 3.8% user + 1.5% kernel / faults: 8902 minor
  5.2% 783/surfaceflinger: 2.9% user + 2.3% kernel / faults: 119 minor
...
 +0% 5466/kworker/7:2: 0% user + 0% kernel
 +0% 5482/dumpsys: 0% user + 0% kernel
 +0% 5525/kworker/7:3: 0% user + 0% kernel
 +0% 5562/sleep: 0% user + 0% kernel
13% TOTAL: 8.1% user + 4.5% kernel + 0.2% iowait + 0.1% softirq
------ 0.039s was the duration of 'DUMPSYS CPUINFO' ------

Load: the load averages

Other fields

  31% 11387/system_server: 18% user + 13% kernel / faults: 30112 minor 30 major
  1. ##### First character: "+", "-", or " ".
    "+": the process appeared (was added) since the previous sample
    "-": the process disappeared (was removed) since the previous sample
    " ": the process was present in both samples
  2. ##### Percentage:
    user+system+iowait+irq+softIrq / totalTime
  3. ##### pid
  4. ##### name
  5. ##### %user
  6. ##### %sys
  7. ##### %iowait if iowait > 0
  8. ##### %irq if irq > 0
  9. ##### %softirq if softirq > 0
  10. ##### minFaults, majorFaults.

minFaults:

The number of minor faults: page faults that can be resolved without issuing disk I/O

majorFaults

Page faults that require disk I/O to be issued

android fs_trim

android fs_trim

TRIM optimization in Android

When fstrim is invoked:

1. At system boot:

Default minimum interval: 3 days
private void startOtherServices() {
        ......
        traceBeginAndSlog("PerformFstrimIfNeeded");
        try {
            mPackageManagerService.performFstrimIfNeeded();
        } catch (Throwable e) {
            reportWtf("performing fstrim", e);
        }
        traceEnd();
        .....
}

2. A periodic job scheduled at boot:

Daily at 3:00 AM
    private void handleSystemReady() {
        initIfReadyAndConnected();
        resetIfReadyAndConnected();

        // Start scheduling nominally-daily fstrim operations
        MountServiceIdler.scheduleIdlePass(mContext);
    }

RSA

RSA notes

Extract the N, E, and D parameters from the public key and the private key

openssl.exe rsa  -inform PEM -text -noout -in .\rsa_private_key.pem
openssl.exe rsa -pubin -inform PEM -text -noout -in rsa_public_key.pem

# modulus: N
# publicExponent: E
# privateExponent: D