Message-ID: <dc4ca9b5-d2a2-03af-c186-204a3aad2399@redhat.com>
Date: Wed, 5 Feb 2020 12:25:15 +0800
From: lijiang <lijiang@...hat.com>
To: John Ogness <john.ogness@...utronix.de>,
Petr Mladek <pmladek@...e.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Andrea Parri <parri.andrea@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
kexec@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] printk: replace ringbuffer
Hi John,
Thank you for your work on improving this patch series.
I may have missed something: are there any other related patches that need to be applied?
After applying this patch series, the NMI watchdog detected a hard lockup and the kernel could not boot. Please refer to
the call trace below. The complete kernel log is attached.
Test machine:
Intel Platform: Grantley-R Wildcat Pass CPU: Broadwell-EP, B0
Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
65536 MB memory, 800 GB disk space
kernel: v5.5-rc7
commit: def9d2780727 ("Linux 5.5-rc7")
......
[ OK ] Started udev Coldplug all Devices.
[ 42.110978] NMI watchdog: Watchdog detected hard LOCKUP on cpu 15
[ 42.110978] Modules linked in: ip_tables xfs libcrc32c sr_mod cdrom sd_mod sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_vram_helper drm_ttm_helper ttm ahci libahci ixgbe drm crc32c_intel libata mdio dca i2c_algo_bit wmi dm_mirror dm_region_hash dm_log dm_mod
[ 42.110986] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted 5.5.0-rc7+ #4
[ 42.110986] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.6024.071720181717 07/17/2018
[ 42.110987] RIP: 0010:native_queued_spin_lock_slowpath+0x5d/0x1c0
[ 42.110988] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 66 89 07 c3 8b 37 81 fe 00 01 00 00 75
[ 42.110988] RSP: 0018:ffffbbe207a7bc48 EFLAGS: 00000002
[ 42.110989] RAX: 0000000000f80101 RBX: ffffffffa1576e80 RCX: 0000000000000000
[ 42.110990] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa1e95660
[ 42.110990] RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000000000b
[ 42.110991] R10: ffffa075df5dcf80 R11: ffffffffa0ebfda0 R12: ffffffffa1e95660
[ 42.110991] R13: ffffffffa1e97680 R14: ffffffffa17197a0 R15: 0000000000000047
[ 42.110991] FS: 00007f7c5642a980(0000) GS:ffffa075df5c0000(0000) knlGS:0000000000000000
[ 42.110992] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42.110992] CR2: 00007ffe95f4c4c0 CR3: 000000084fbfc004 CR4: 00000000003606e0
[ 42.110993] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 42.110993] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 42.110993] Call Trace:
[ 42.110993] _raw_spin_lock+0x1a/0x20
[ 42.110994] console_unlock+0x9e/0x450
[ 42.110994] bust_spinlocks+0x16/0x30
[ 42.110994] oops_end+0x33/0xc0
[ 42.110995] general_protection+0x32/0x40
[ 42.110995] RIP: 0010:copy_data+0xf2/0x1e0
[ 42.110995] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c 89 cd 41 89 d6 44 89 44 24 04 49 39 db 0f 87 c6 00 00 00 4d 85 c9 74 43 <41> c7 01 00 00 00 00 48 85 db 74 37 4c 89 e7 48 89 da 41 bf 01 00
[ 42.110996] RSP: 0018:ffffbbe207a7bd80 EFLAGS: 00010002
[ 42.110996] RAX: ffffa075d44ca000 RBX: 00000000000000a8 RCX: fffffffffff000b0
[ 42.110997] RDX: 00000000000000a8 RSI: 00000fffffffff01 RDI: ffffffffa1456e00
[ 42.110997] RBP: 0801364600307073 R08: 0000000000002000 R09: 0801364600307073
[ 42.110997] R10: fffffffffff00000 R11: 00000000000000a8 R12: ffffffffa1e98330
[ 42.110998] R13: 00000000d7efbe00 R14: 00000000000000a8 R15: 00000000ffffc000
[ 42.110998] _prb_read_valid+0xd8/0x190
[ 42.110998] prb_read_valid+0x15/0x20
[ 42.110999] devkmsg_read+0x9d/0x2a0
[ 42.110999] vfs_read+0x91/0x140
[ 42.110999] ksys_read+0x59/0xd0
[ 42.111000] do_syscall_64+0x55/0x1b0
[ 42.111000] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 42.111000] RIP: 0033:0x7f7c55740b62
[ 42.111001] Code: 94 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 e6 d8 20 00 85 c0 75 12 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89
[ 42.111001] RSP: 002b:00007ffe95f4c4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 42.111002] RAX: ffffffffffffffda RBX: 00007ffe95f4e500 RCX: 00007f7c55740b62
[ 42.111002] RDX: 0000000000002000 RSI: 00007ffe95f4c4b0 RDI: 0000000000000008
[ 42.111002] RBP: 0000000000000000 R08: 0000000000000100 R09: 0000000000000003
[ 42.111003] R10: 0000000000000100 R11: 0000000000000246 R12: 00007ffe95f4c4b0
[ 42.111003] R13: 00007ffe95f4e910 R14: 0000000000000000 R15: 0000000000000000
[ 42.111003] Kernel panic - not syncing: Hard LOCKUP
[ 42.111004] Shutting down cpus with NMI
[ 42.111004] Kernel Offset: 0x1f000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 42.111005] general protection fault: 0000 [#1] SMP PTI
[ 42.111005] CPU: 15 PID: 1395 Comm: systemd-journal Not tainted 5.5.0-rc7+ #4
[ 42.111005] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.6024.071720181717 07/17/2018
[ 42.111006] RIP: 0010:copy_data+0xf2/0x1e0
[ 42.111006] Code: eb 08 49 83 c4 08 0f 84 8e 00 00 00 4c 89 74 24 08 4c 89 cd 41 89 d6 44 89 44 24 04 49 39 db 0f 87 c6 00 00 00 4d 85 c9 74 43 <41> c7 01 00 00 00 00 48 85 db 74 37 4c 89 e7 48 89 da 41 bf 01 00
[ 42.111007] RSP: 0018:ffffbbe207a7bd80 EFLAGS: 00010002
[ 42.111007] RAX: ffffa075d44ca000 RBX: 00000000000000a8 RCX: fffffffffff000b0
[ 42.111008] RDX: 00000000000000a8 RSI: 00000fffffffff01 RDI: ffffffffa1456e00
[ 42.111008] RBP: 0801364600307073 R08: 0000000000002000 R09: 0801364600307073
[ 42.111008] R10: fffffffffff00000 R11: 00000000000000a8 R12: ffffffffa1e98330
[ 42.111009] R13: 00000000d7efbe00 R14: 00000000000000a8 R15: 00000000ffffc000
[ 42.111009] FS: 00007f7c5642a980(0000) GS:ffffa075df5c0000(0000) knlGS:0000000000000000
[ 42.111010] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 42.111010] CR2: 00007ffe95f4c4c0 CR3: 000000084fbfc004 CR4: 00000000003606e0
[ 42.111011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 42.111011] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 42.111012] Call Trace:
[ 42.111012] _prb_read_valid+0xd8/0x190
[ 42.111012] prb_read_valid+0x15/0x20
[ 42.111013] devkmsg_read+0x9d/0x2a0
[ 42.111013] vfs_read+0x91/0x140
[ 42.111013] ksys_read+0x59/0xd0
[ 42.111014] do_syscall_64+0x55/0x1b0
[ 42.111014] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 42.111014] RIP: 0033:0x7f7c55740b62
[ 42.111015] Code: 94 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 e6 d8 20 00 85 c0 75 12 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89
[ 42.111015] RSP: 002b:00007ffe95f4c4a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 42.111016] RAX: ffffffffffffffda RBX: 00007ffe95f4e500 RCX: 00007f7c55740b62
[ 42.111016] RDX: 0000000000002000 RSI: 00007ffe95f4c4b0 RDI: 0000000000000008
[ 42.111017] RBP: 0000000000000000 R08: 0000000000000100 R09: 0000000000000003
[ 42.111017] R10: 0000000000000100 R11: 0000000000000246 R12: 00007ffe95f4c4b0
[ 42.111017] R13: 00007ffe95f4e910 R14: 0000000000000000 R15: 0000000000000000
[ 42.111017] Modules linked in: ip_tables xfs libcrc32c sr_mod cdrom sd_mod sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm_vram_helper drm_ttm_helper ttm ahci libahci ixgbe drm crc32c_intel libata mdio dca i2c_algo_bit wmi dm_mirror dm_region_hash dm_log dm_mod
---hang---
Thanks.
Lianbo
> Hello,
>
> After several RFC series [0][1][2][3][4], here is the first set of
> patches to rework the printk subsystem. This first set of patches
> only replaces the existing ringbuffer implementation. No locking is
> removed. No semantics/behavior of printk are changed.
>
> The VMCOREINFO is updated, which will require changes to the
> external crash [5] tool. I will be preparing a patch to add support
> for the new VMCOREINFO.
>
> This series is in line with the agreements [6] made at the meeting
> during LPC2019 in Lisbon, with 1 exception: support for dictionaries
> will _not_ be discontinued [7]. Dictionaries are stored in a separate
> buffer so that they cannot interfere with the human-readable buffer.
>
> John Ogness
>
> [0] https://lkml.kernel.org/r/20190212143003.48446-1-john.ogness@linutronix.de
> [1] https://lkml.kernel.org/r/20190607162349.18199-1-john.ogness@linutronix.de
> [2] https://lkml.kernel.org/r/20190727013333.11260-1-john.ogness@linutronix.de
> [3] https://lkml.kernel.org/r/20190807222634.1723-1-john.ogness@linutronix.de
> [4] https://lkml.kernel.org/r/20191128015235.12940-1-john.ogness@linutronix.de
> [5] https://github.com/crash-utility/crash
> [6] https://lkml.kernel.org/r/87k1acz5rx.fsf@linutronix.de
> [7] https://lkml.kernel.org/r/20191007120134.ciywr3wale4gxa6v@pathway.suse.cz
>
> John Ogness (2):
>   printk: add lockless buffer
>   printk: use the lockless ringbuffer
>
>  include/linux/kmsg_dump.h         |    2 -
>  kernel/printk/Makefile            |    1 +
>  kernel/printk/printk.c            |  836 +++++++++---------
>  kernel/printk/printk_ringbuffer.c | 1370 +++++++++++++++++++++++++++++
>  kernel/printk/printk_ringbuffer.h |  328 +++++++
>  5 files changed, 2114 insertions(+), 423 deletions(-)
>  create mode 100644 kernel/printk/printk_ringbuffer.c
>  create mode 100644 kernel/printk/printk_ringbuffer.h
>
View attachment "kernel-5.5.0-rc7.log" of type "text/x-log" (100220 bytes)