lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171109051905.pdlsyrbzrwlsjbrs@wfg-t540p.sh.intel.com>
Date:   Thu, 9 Nov 2017 13:19:05 +0800
From:   Fengguang Wu <fengguang.wu@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Network Development <netdev@...r.kernel.org>,
        Linux Wireless List <linux-wireless@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [run_timer_softirq] BUG: unable to handle kernel paging request
 at 0000000000010007

On Mon, Oct 30, 2017 at 12:29:47PM -0700, Linus Torvalds wrote:
>On Sun, Oct 29, 2017 at 4:48 PM, Fengguang Wu <fengguang.wu@...el.com> wrote:
>>
>> Here are 3 dmesgs related to wireless and 1 from ethernet.
>
>Fengguang, these would be lovelier still _if_ you have DEBUG_INFO
>enabled on the kernel, and your script were to find things like
>"symbol+0xhex/0xhex", and run "./scripts/faddr2line" on them.
>
>So
>
>> [  235.425464] BUG: unable to handle kernel paging request at 0000000000010007
>> [  235.425470] IP: run_timer_softirq+0x13a/0x470
>
>would also then have
>
>   run_timer_softirq at timer.c:XYZ
>
>which would make it easier to see exactly _what_ it is that faults. As
>it is, I think there's a fair number of inlining that makes it hard to
>see the cause, but that faddrtoline would make very obvious.
>
>Finding that "symbol+xyz/abc" pattern should be fairly easy to
>automate, and should fit the 0day model fairly well. No?
>
>In this case, the trapping instruction ends up decoding to
>
>   0: 4c 8d 6c c5 90        lea    -0x70(%rbp,%rax,8),%r13
>   5: 49 8b 45 00          mov    0x0(%r13),%rax
>   9: 48 85 c0              test   %rax,%rax
>   c: 74 de                je     0xffffffffffffffec
>   e: 4d 8b 7d 00          mov    0x0(%r13),%r15
>  12: 4d 89 7c 24 08        mov    %r15,0x8(%r12)
>  17: 0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
>  1c: 49 8b 07              mov    (%r15),%rax
>  1f: 49 8b 57 08          mov    0x8(%r15),%rdx
>  23: 48 85 c0              test   %rax,%rax
>  26: 48 89 02              mov    %rax,(%rdx)
>  29: 74 04                je     0x2f
>  2b:* 48 89 50 08          mov    %rdx,0x8(%rax) <-- trapping instruction
>  2f: 41 f6 47 2a 20        testb  $0x20,0x2a(%r15)
>  34: 49 c7 47 08 00 00 00 movq   $0x0,0x8(%r15)
>
>and %rax has the value 0xffff, so yes, it will trap at 0x10007.
>
>It's not trivial to see just *wjhat* access it is.
>
>I *think* that "testb $32" is checking for TIMER_IRQSAFE in
>expire_timers(), and that the oops is due to the list operations in
>detach_timer() (inlined).
>
>Which doesn't really help: it looks like the timer lists are corrupt.
>With some luck, some register state could have the timer function
>pointer in it, and we'd get a hint of *which* timer this is, but that
>doesn't look to be the case here either.
>
>I'm not seeing anything to really help debug this here.

Yes it's accessing the list. Here is the faddr2line output.

[  190.983445] PASS fils_sk_hlp_no_rapid_commit 0.549105 2017-11-09 12:41:50.114825
[  190.983447]
[  190.986376] passed all 1 test case(s)
[  190.986377]
[  191.006705] ignored_by_lkp fils_sk_hlp_oom
[  191.006708]
[  191.007635] general protection fault: 0000 [#1] SMP
[  191.007636] Modules linked in: cmac ccm arc4 mac80211_hwsim mac80211 cfg80211 rfkill rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver sr_mod cdrom sg ata_generic pata_acpi sb_edac crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc ppdev aesni_intel crypto_simd glue_helper cryptd snd_pcm snd_timer snd soundcore pcspkr serio_raw ata_piix i2c_piix4 libata floppy parport_pc parport ip_tables
[  191.007656] CPU: 1 PID: 186 Comm: sed Not tainted 4.14.0-rc8-00013-g87df261 #1
[  191.007657] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[  191.007659] task: ffff88007fa58000 task.stack: ffffc900008b0000
[  191.007663] RIP: 0010:run_timer_softirq+0x13a/0x470:
run_timer_softirq at include/linux/list.h:651
[  191.007664] RSP: 0018:ffff88013fd03ec8 EFLAGS: 00010086
[  191.007665] RAX: dead000000000200 RBX: 0000000000000001 RCX: ffff8800710bbb80
[  191.007665] RDX: ffff88013fd03ed0 RSI: ffff88013fd122e8 RDI: ffff88013fd03ed8
[  191.007666] RBP: ffff88013fd03f40 R08: ffff88013fd127f8 R09: ffff88013fd03ed8
[  191.007667] R10: ffff88013fd12330 R11: 000000000000008f R12: ffff88013fd122c0
[  191.007668] R13: ffff88013fd03ed0 R14: dead000000000200 R15: ffff8800710bbb80
[  191.007669] FS:  00007fabab3b9800(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[  191.007670] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  191.007674] CR2: 00007fc7ea4895d0 CR3: 000000007fa57002 CR4: 00000000001606e0
[  191.007678] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  191.007679] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  191.007680] Call Trace:
[  191.007681]  <IRQ>
[  191.007685]  ? kvm_clock_get_cycles+0x25/0x30:
kvm_clock_get_cycles at arch/x86/include/asm/preempt.h:81
[  191.007687]  ? ktime_get+0x40/0xa0:
ktime_get at kernel/time/timekeeping_internal.h:19
[  191.007690]  __do_softirq+0x104/0x2ab
[  191.007693]  irq_exit+0xf1/0x100:
irq_exit at kernel/softirq.c:364
[  191.007695]  smp_apic_timer_interrupt+0x69/0x140:
smp_apic_timer_interrupt at arch/x86/include/asm/irq_regs.h:27
[  191.007696]  apic_timer_interrupt+0x9d/0xb0
[  191.007697]  </IRQ>
[  191.007701] RIP: 0010:console_unlock+0x1f0/0x4f0:
console_unlock at arch/x86/include/asm/paravirt.h:778
[  191.007702] RSP: 0018:ffffc900008b3d18 EFLAGS: 00000247 ORIG_RAX: ffffffffffffff10
[  191.007703] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000ffffffff
[  191.007703] RDX: ffffc900008b3cd0 RSI: 0000000000000004 RDI: 0000000000000247
[  191.007704] RBP: ffffc900008b3d50 R08: 0000000000000000 R09: 0000000000000000
[  191.007705] R10: ffff8800000be400 R11: 0000000082354601 R12: 000000000000002e
[  191.007705] R13: ffffffff820755a0 R14: 0000000000000000 R15: ffffffff823546a0
[  191.007708]  vprintk_emit+0x302/0x3b0:
vprintk_emit at kernel/printk/printk.c:1760
[  191.007710]  printk_emit+0x33/0x3b:
printk_emit at kernel/printk/printk.c:1782
[  191.007713]  ? __kmalloc+0x195/0x210:
__kmalloc at mm/slab.h:421
[  191.007715]  devkmsg_write+0xf6/0x150:
devkmsg_write at kernel/printk/printk.c:815
[  191.007718]  __vfs_write+0xd3/0x150:
__vfs_write at fs/read_write.c:470
[  191.007719]  vfs_write+0xc7/0x1b0:
vfs_write at fs/read_write.c:545
[  191.007721]  SyS_write+0x46/0xa0:
SyS_write at fs/read_write.c:590
[  191.007722]  entry_SYSCALL_64_fastpath+0x1a/0xa5:
entry_SYSCALL_64_fastpath at arch/x86/entry/entry_64.S:206
[  191.007723] RIP: 0033:0x7fabaaab6720
[  191.007724] RSP: 002b:00007fff07d55008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  191.007725] RAX: ffffffffffffffda RBX: 0000000000000106 RCX: 00007fabaaab6720
[  191.007725] RDX: 0000000000000001 RSI: 0000000000611598 RDI: 0000000000000001
[  191.007726] RBP: 000000000000022c R08: 00007fabaad74760 R09: 00007fabaad73600
[  191.007727] R10: 000000000000001e R11: 0000000000000246 R12: 00000000021ada10
[  191.007727] R13: 0000000000fd4940 R14: 00007fff07d54f30 R15: 000000000000022c
[  191.007729] Code: 4c 8d 6c c5 90 49 8b 45 00 48 85 c0 74 de 4d 8b 7d 00 4d 89 7c 24 08 0f 1f 44 00 00 49 8b 07 49 8b 57 08 48 85 c0 48 89 02 74 04 <48> 89 50 08 41 f6 47 2a 20 49 c7 47 08 00 00 00 00 4c 89 e7 4d
[  191.007748] RIP: run_timer_softirq+0x13a/0x470:
run_timer_softirq at include/linux/list.h:651 RSP: ffff88013fd03ec8
[  191.007751] ---[ end trace bb3719ef16688f4f ]---
[  191.007752] Kernel panic - not syncing: Fatal exception in interrupt
[  191.007773] Kernel Offset: disabled

Regards,
Fengguang

View attachment "dmesg-vm-lkp-hsw01-4G-2:20171109124159:x86_64-rhel-7.2+CONFIG_DEBUG_INFO_REDUCED:4.14.0-rc8-00013-g87df261:1" of type "text/plain" (2086423 bytes)

View attachment ".config" of type "text/plain" (163325 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ