Debugging the Linux kernel boot process

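Purpose

When a bug shows up while the Linux kernel is booting, you may need a debugger such as gdb, ULINK, or DS-5 to track it down. Most debugging workflows attach the debugger after the system has booted, then load the symbols and set breakpoints.
That is enough for most cases, but in some situations (early in kernel initialization, the kernel may crash before the debugger can even be attached) a more involved way of setting breakpoints is needed.
This article covers that situation.
Test environment: qemu + gdb + aarch64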

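Initialization code (before the MMU is enabled)

The bootloader can load the Linux kernel at an arbitrary address, so before the MMU is enabled the address the kernel code runs from is decided by the bootloader. To debug this stage, set a breakpoint in the bootloader to find the load address, then debug relative to that address.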

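Getting the load address

To find the address where the bootloader loads the kernel, set a breakpoint just before the bootloader jumps to the kernel. The steps below use U-Boot as the example:

Loading the U-Boot symbols at the address U-Boot actually runs from

Stop U-Boot when its autoboot countdown starts and attach the debugger. U-Boot relocates itself during startup, so first find the address it is actually running from. On aarch64 the x18 register holds the gd pointer, so relocaddr can be read through x18 and the symbol file: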

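target remote :1234
# first load, without an address: only needed so gdb knows the gd_t type
add-symbol-file ~/work/uboot/u-boot
set $uboot_relocaddr = ((gd_t *)$x18)->relocaddr
# second load: place the symbols at the relocated address
add-symbol-file ~/work/uboot/u-boot $uboot_relocaddr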
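
The commands above could also be wrapped in a small gdb Python command so they do not have to be retyped on every run. The following is only a sketch under assumptions: the command name load-uboot-reloc and the helper reloc_symbol_command are hypothetical and not part of the original setup, and the ELF must already have been loaded once so that gdb knows the gd_t type.

```python
try:
    import gdb  # only available when the script is sourced inside gdb
except ImportError:
    gdb = None


def reloc_symbol_command(elf_path, relocaddr):
    """Build the add-symbol-file command for a relocated U-Boot."""
    return f"add-symbol-file {elf_path} {relocaddr:#x}"


if gdb is not None:
    class LoadUbootReloc(gdb.Command):
        """load-uboot-reloc ELF: reload U-Boot symbols at gd->relocaddr."""

        def __init__(self):
            # "load-uboot-reloc" is a hypothetical command name
            super().__init__("load-uboot-reloc", gdb.COMMAND_FILES)

        def invoke(self, arg, from_tty):
            elf_path = arg.strip()
            # x18 holds the gd pointer on aarch64 U-Boot
            relocaddr = int(gdb.parse_and_eval("((gd_t *)$x18)->relocaddr"))
            gdb.execute(reloc_symbol_command(elf_path, relocaddr))

    LoadUbootReloc()
```

After sourcing the file in gdb, it would be used as: load-uboot-reloc ~/work/uboot/u-boot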

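Once the symbols are loaded correctly, the bt command shows where U-Boot is currently executing. In the trace below, U-Boot is waiting for a command on the serial port: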

(gdb) bt
#0  pl01x_serial_pending (dev=<optimized out>, input=true)
    at drivers/serial/serial_pl01x.c:321
#1  0x00000000bff316b8 in console_tstc (file=0) at common/console.c:328
#2  fgetc (file=0) at common/console.c:328
#3  0x00000000bff2fd9c in cread_line (timeout=0, len=<synthetic pointer>,
    buf=0xbffd9d50 "", prompt=0xbff80a19 "Hobot# ") at common/cli_readline.c:274
#4  cli_readline_into_buffer (prompt=0xbff80a19 "Hobot# ", buffer=0xbffd9d50 "",
    timeout=0) at common/cli_readline.c:549
#5  0x00000000bff2b130 in uboot_cli_readline (i=0xbbe93e60)
    at common/cli_hush.c:1029
#6  get_user_input (i=0xbbe93e60) at common/cli_hush.c:1029
#7  file_get (i=0xbbe93e60) at common/cli_hush.c:1108

Set a breakpoint before the jump into the kernel and get the kernel's run address

(gdb) tb boot_jump_linux
Temporary breakpoint 1 at 0x880024e0: boot_jump_linux. (4 locations)
(gdb) c
Continuing.

Thread 1 hit Temporary breakpoint 1, boot_jump_linux (images=0xbffd9b20, flag=1024)
    at arch/arm/lib/bootm.c:323
323 {
(gdb) p images
$3 = (bootm_headers_t *) 0xbffd9b20
(gdb) set print pretty
(gdb) p /x *images
$6 = {
  legacy_hdr_os = 0x0,
  legacy_hdr_os_copy = {
    ih_magic = 0x0,
    ih_hcrc = 0x0,
    ih_time = 0x0,
    ih_size = 0x0,
    ih_load = 0x0,
    ih_ep = 0x0,
    ih_dcrc = 0x0,
    ih_os = 0x0,
    ih_arch = 0x0,
    ih_type = 0x0,
    ih_comp = 0x0,
    ih_name = {0x0 <repeats 32 times>}
  },
  legacy_hdr_valid = 0x0,
  os = {
    start = 0x83000000,
    end = 0x8390b000,
    image_start = 0x83000800,
    image_len = 0x8e9865,
    load = 0x89080000,
    comp = 0x1,
    type = 0x2,
    os = 0x5,
    arch = 0x0
  },
  ep = 0x89080000,
  rd_start = 0x0,
  rd_end = 0x0,
  ft_addr = 0x82000000,
  ft_len = 0x6ede,
  initrd_start = 0x0,
  initrd_end = 0x0,
  cmdline_start = 0x0,
  cmdline_end = 0x0,
  kbd = 0x0,
  verify = 0xffffffff,
  state = 0x1,
  lmb = {
    memory = {
      cnt = 0x1,
      size = 0x0,
      region = {{
          base = 0x80000000,
          size = 0x40000000
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }}
    },
    reserved = {
      cnt = 0x4,
      size = 0x0,
      region = {{
          base = 0x80000000,
          size = 0x770000
        }, {
          base = 0x82000000,
          size = 0x2000
        }, {
          base = 0x89080000,
          size = 0x100c200
        }, {
          base = 0xbbe922c0,
          size = 0x416dd40
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }, {
          base = 0x0,
          size = 0x0
        }}
    }
  }
}

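A few fields of bootm_headers_t are worth noting, because they are used later to work out the run address:

images->ep
Kernel entry point
images->os.load
Address the kernel is placed at to run (here the same value as the entry point)
images->os.start
Address at which the kernel image blob was loaded into memory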
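
As a quick sanity check on these fields, one can test which of the lmb reserved regions in the dump contains each address. The sketch below uses only values copied from the dump above; the helper region_containing is hypothetical, and reading the matching region as the one reserved for the kernel is an assumption on my part.

```python
def region_containing(addr, regions):
    """Return the (base, size) region that contains addr, or None."""
    for base, size in regions:
        if base <= addr < base + size:
            return (base, size)
    return None


# Values copied from the `p /x *images` dump above
EP = 0x89080000        # images->ep
OS_LOAD = 0x89080000   # images->os.load
OS_START = 0x83000000  # images->os.start

# images->lmb.reserved.region (the 4 entries counted by cnt)
RESERVED = [
    (0x80000000, 0x770000),
    (0x82000000, 0x2000),
    (0x89080000, 0x100c200),
    (0xbbe922c0, 0x416dd40),
]

hit = region_containing(EP, RESERVED)
print("ep is inside reserved region:", [hex(v) for v in hit])
print("os.start region:", region_containing(OS_START, RESERVED))
```

With the values from this dump, ep and os.load fall at the start of the third reserved region, while os.start is not inside any reserved region.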

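Load the .head.text section and set a breakpoint on the kernel's first instruction

The symbols have to be loaded at physical addresses, so the following Python script is used to help load them: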

source  ~/src/gdb-script/load-linux-init.py
load-linux-init ~/work/kernel/vmlinux images->ep
(gdb) source  ~/src/gdb-script/load-linux-init.py
(gdb) load-linux-init ~/work/kernel/vmlinux images->ep
Loading linux init head to 0x0000000083080000
Original linux text address at 0xffffff8008080000
generate command
add-symbol-file ~/work/kernel/vmlinux 0x83081000 -s .head.text 0x83080000 -s .init.text 0x83a00000
load linux image to physical address with command add-symbol-file ~/work/kernel/vmlinux 0x83081000 -s .head.t0
add symbol table from file "/home/schspa/work/kernel/vmlinux" at
    .text_addr = 0x83081000
    .head.text_addr = 0x83080000
    .init.text_addr = 0x83a00000
(gdb) b * 0x83080000
Breakpoint 2 at 0x83080000: file arch/arm64/kernel/head.S, line 82.
(gdb) c
Continuing.

Thread 1 hit Breakpoint 2, _text () at arch/arm64/kernel/head.S:82
82      add x13, x18, #0x16
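
The source of load-linux-init.py is not shown here, but the generated command is consistent with a simple rebase: subtract the link-time base address from each section address and add the physical load address. The sketch below reproduces that arithmetic outside gdb. The function names are hypothetical, and the link-time section addresses are assumptions inferred from the session output above (in practice they would be read from vmlinux, for example with readelf -S).

```python
def rebase(link_addr, link_base, phys_base):
    """Translate a link-time virtual address to its pre-MMU physical address."""
    return link_addr - link_base + phys_base


def build_add_symbol_file(vmlinux, sections, link_base, phys_base):
    """Build an add-symbol-file command for the rebased sections.

    sections maps section name -> link-time virtual address and must
    contain ".text", whose address gdb takes as the positional argument.
    """
    text_addr = rebase(sections[".text"], link_base, phys_base)
    cmd = f"add-symbol-file {vmlinux} {text_addr:#x}"
    for name, addr in sections.items():
        if name == ".text":
            continue
        cmd += f" -s {name} {rebase(addr, link_base, phys_base):#x}"
    return cmd


# "Original linux text address" printed in the session above
LINK_BASE = 0xffffff8008080000
# Assumed link-time section addresses, inferred from the session output
SECTIONS = {
    ".text": 0xffffff8008081000,
    ".head.text": 0xffffff8008080000,
    ".init.text": 0xffffff8008a00000,
}

cmd = build_add_symbol_file("~/work/kernel/vmlinux", SECTIONS,
                            LINK_BASE, 0x83080000)
print(cmd)
```

Under these assumptions the printed command matches the one generated in the session above.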

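Initialization code (after the MMU is enabled)

Once the MMU is enabled there is no need to work out the load address; breakpoints can be set directly: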

add-symbol-file ~/work/kernel/vmlinux
tb start_kernel
# do whatever you want