Message-ID: <9bc70c0d-00b3-ff5a-e8bd-d4315e367fad@intel.com>
Date:   Tue, 16 Jun 2020 10:50:05 +0800
From:   Rong Chen <rong.a.chen@...el.com>
To:     Catalin Marinas <catalin.marinas@....com>
Cc:     Nicolas Boichat <drinkcat@...omium.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Kees Cook <keescook@...omium.org>,
        Petr Mladek <pmladek@...e.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Uladzislau Rezki <urezki@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Stephen Rothwell <sfr@...b.auug.org.au>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: Re: [kmemleak] b751c52bb5: BUG:kernel_hang_in_boot_stage



On 6/10/20 6:56 PM, Catalin Marinas wrote:
> On Wed, Jun 10, 2020 at 03:51:56PM +0800, kernel test robot wrote:
>> FYI, we noticed the following commit (built with gcc-7):
>>
>> commit: b751c52bb587ae66f773b15204ef7a147467f4c7 ("kmemleak: increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE default to 16K")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: boot
>>
>> on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G
> [...]
>> BUG: kernel hang in boot stage
>>
>> To reproduce:
>>
>>          # build kernel
>> 	cd linux
>> 	cp config-5.3.0-11789-gb751c52bb587a .config
>> 	make HOSTCC=gcc-7 CC=gcc-7 ARCH=i386 olddefconfig prepare modules_prepare bzImage
> I've never tried kmemleak on i386.
>
> Anyway, I'm not sure what caused the hang (or whether it's a hang at
> all) but I suspect prior to the above commit, kmemleak probably just
> disabled itself (early log  buffer exceeded). So the bug may have been
> there already, only that kmemleak started working and tripped over it
> when the log buffer increased.

Hi,

Sorry for the late reply. I can reproduce the problem with the command
"bin/lkp qemu -k <bzImage> job-script", and the kernel hangs:

[    0.333897] -----------------------------------------------------
[    0.334561]                                  |block | try |context|
[    0.335170] -----------------------------------------------------
[    0.335760]                           context:  ok  |  ok  |  ok |
[    0.337995]                               try:  ok  |  ok  |  ok |
[    0.340089]                             block:  ok  |  ok  |  ok |
[    0.342175]                          spinlock:  ok  |  ok  |  ok |
[    0.344481] -------------------------------------------------------
[    0.345068] Good, all 261 testcases passed! |
[    0.345514] ---------------------------------
KVM internal error. Suberror: 3
extra data[0]: 80000b0e
extra data[1]: 31
extra data[2]: 182
extra data[3]: bfff0
EAX=00000000 EBX=00200297 ECX=00000000 EDX=ffffffff
ESI=d2e997c0 EDI=d2e997f0 EBP=d2bbb038 ESP=c00bfff4
EIP=f4dccd57 EFL=00210046 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =007b 00000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
CS =0060 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
SS =0068 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
DS =007b 00000000 ffffffff 00c0f300 DPL=3 DS   [-WA]
FS =00d8 23331000 ffffffff 00809300 DPL=0 DS16 [-WA]
GS =00e0 f6422900 00000018 00409100 DPL=0 DS   [--A]
LDT=0000 00000000 ffffffff 00c00000
TR =0080 ff403000 0000206b 00008b00 DPL=0 TSS32-busy
GDT=     ff401000 000000ff
IDT=     ff400000 000007ff
CR0=80050033 CR2=00000000 CR3=130fc000 CR4=00000690
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000fffe0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
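
For anyone reading the "extra data" lines, here is a small sketch of how extra data[0] could be decoded. It assumes that KVM suberror 3 is the event-delivery error and that extra data[0] is then the VMX IDT-vectoring information field, laid out as in the Intel SDM (vector in bits 7:0, type in bits 10:8, error-code-valid in bit 11, valid in bit 31). The helper name decode_vectoring_info is made up for illustration and is not part of KVM or QEMU.

```python
# Hypothetical helper: decode a VMX IDT-vectoring information value.
# Assumption: "extra data[0]" in the QEMU dump is this field.
EVENT_TYPES = {
    0: "external interrupt",
    2: "NMI",
    3: "hardware exception",
    4: "software interrupt",
    5: "privileged software exception",
    6: "software exception",
}


def decode_vectoring_info(value):
    """Split the 32-bit field into its documented sub-fields."""
    return {
        "valid": bool((value >> 31) & 1),             # bit 31
        "vector": value & 0xFF,                       # bits 7:0
        "type": EVENT_TYPES.get((value >> 8) & 0x7, "reserved"),  # bits 10:8
        "error_code_valid": bool((value >> 11) & 1),  # bit 11
    }


info = decode_vectoring_info(0x80000B0E)
print(info)
```

Under those assumptions, the value 80000b0e would mean a valid hardware exception with vector 14 (page fault) and an error code, i.e. the guest was in the middle of taking a page fault when KVM gave up. The other extra data fields are not decoded here.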

>
> Is there a chance that the kernel got much slower with kmemleak enabled
> and the test scripts timed out?
No, it is not a timeout. The boot log of the parent commit is:

[    0.313845] -----------------------------------------------------
[    0.314608]                                  |block | try |context|
[    0.315314] -----------------------------------------------------
[    0.315974]                           context:  ok  |  ok  |  ok |
[    0.318261]                               try:  ok  |  ok  |  ok |
[    0.320478]                             block:  ok  |  ok  |  ok |
[    0.322562]                          spinlock:  ok  |  ok  |  ok |
[    0.324825] -------------------------------------------------------
[    0.325403] Good, all 261 testcases passed! |
[    0.325809] ---------------------------------
[    0.326221] kmemleak: Early log buffer exceeded (401), please increase DEBUG_KMEMLEAK_EARLY_LOG_SIZE
[    0.327065] ACPI: Core revision 20190816
[    0.327585] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[    0.328545] APIC: Switch to symmetric I/O mode setup
[    0.329009] Enabling APIC mode:  Flat.  Using 1 I/O APICs
[    0.329572] masked ExtINT on CPU#0
[    0.330686] ENABLING IO-APIC IRQs
[    0.331001] init IO_APIC IRQs
[    0.331274]  apic 0 pin 0 not connected

>
> Does this problem still exist with the latest mainline?
Yes, it still exists in v5.7.

Best Regards,
Rong Chen
