Date:   Wed, 31 May 2017 18:32:00 +0200 (CEST)
From:   Sebastian Ott <sebott@...ux.vnet.ibm.com>
To:     Xin Long <lucien.xin@...il.com>,
        "David S. Miller" <davem@...emloft.net>
cc:     Haidong Li <haili@...hat.com>,
        Nikolay Aleksandrov <nikolay@...ulusnetworks.com>,
        Ivan Vecera <cera@...a.cz>,
        Stephen Hemminger <stephen@...workplumber.org>,
        netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
        Heiko Carstens <heiko.carstens@...ibm.com>,
        Martin Schwidefsky <schwidefsky@...ibm.com>
Subject: Oops with commit 6d18c73 bridge: start hello_timer when enabling
 KERNEL_STP in br_stp_start

Hi,

A system running v4.12-rc3-11-gf511c0b on s390 hangs after boot with no
messages on the console. The message buffer obtained via a system dump
looked like this:

[...]
[   17.870712] virbr0: port 1(virbr0-nic) entered disabled state
[   19.618523] Unable to handle kernel pointer dereference in virtual kernel address space
[  250.028426] INFO: task jbd2/dasda1-8:100 blocked for more than 120 seconds.
[  250.028427]       Not tainted 4.12.0-rc3-00011-gf511c0b #573
[  250.028428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  250.028429] jbd2/dasda1-8   D12808   100      2 0x00000000
[  250.028437] Stack:
[  250.028437]        00000000e8c4f9b0 0000000000000000 0000000000233afe 00000000e8c48100
[  250.028441]        00000000e8c4f978 00000000001b1c98 00000000e8c4f978 00000000e8c4f9d8
[  250.028444]        04000000efdcce00 00000000e8c48890 0000000000000000 00000000efdcce18
[  250.028447]        00000000e8c48100 00000000efdcce00 00000000e8ce8100 00000000e73c6900
[  250.028450]        00000000008da090 00000000008c4f54 00000000e8c4f9d8 00000000e8c4fa60
[  250.028453] Call Trace:
[  250.028458] ([<00000000008c4f54>] __schedule+0xb14/0xc90)
[  250.028459]  [<00000000008c5164>] schedule+0x94/0xc0 
[  250.028462]  [<00000000001802ac>] io_schedule+0x34/0x58 
[  250.028464]  [<00000000002a44c2>] wait_on_page_bit+0x16a/0x198 
[  250.028465]  [<00000000002a4576>] __filemap_fdatawait_range+0x86/0x188 
[  250.028467]  [<00000000002a46a6>] filemap_fdatawait_range+0x2e/0x58 
[  250.028471]  [<00000000004719d4>] jbd2_journal_commit_transaction+0x10e4/0x2200 
[  250.028473]  [<000000000047890a>] kjournald2+0xda/0x2c0 
[  250.028475]  [<000000000016da5e>] kthread+0x166/0x178 
[  250.028477]  [<00000000008cce7a>] kernel_thread_starter+0x6/0xc 
[  250.028479]  [<00000000008cce74>] kernel_thread_starter+0x0/0xc 
[  250.028480] INFO: lockdep is turned off.
[...]

The system should have oopsed right after this message:
[   19.618523] Unable to handle kernel pointer dereference in virtual kernel address space

I am not sure why it didn't. Anyway, I bisected the problem to:

commit 6d18c732b95c0a9d35e9f978b4438bba15412284
Author: Xin Long <lucien.xin@...il.com>
Date:   Fri May 19 22:20:29 2017 +0800

    bridge: start hello_timer when enabling KERNEL_STP in br_stp_start
    
    Since commit 76b91c32dd86 ("bridge: stp: when using userspace stp stop
    kernel hello and hold timers"), bridge would not start hello_timer if
    stp_enabled is not KERNEL_STP when br_dev_open.
    
    The problem is even if users set stp_enabled with KERNEL_STP later,
    the timer will still not be started. It causes that KERNEL_STP can
    not really work. Users have to re-ifup the bridge to avoid this.
    
    This patch is to fix it by starting br->hello_timer when enabling
    KERNEL_STP in br_stp_start.
    
    As an improvement, it's also to start hello_timer again only when
    br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is
    no reason to start the timer again when it's NO_STP.
    
    Fixes: 76b91c32dd86 ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
    Reported-by: Haidong Li <haili@...hat.com>
    Signed-off-by: Xin Long <lucien.xin@...il.com>
    Acked-by: Nikolay Aleksandrov <nikolay@...ulusnetworks.com>
    Reviewed-by: Ivan Vecera <cera@...a.cz>
    Signed-off-by: David S. Miller <davem@...emloft.net>
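
To make the commit description above easier to follow, here is a minimal user-space sketch of the timer behaviour it describes. It is not the kernel code and not the actual diff: the struct, the enum, and all model_* function names are hypothetical stand-ins, time is a plain counter instead of jiffies, and model_stp_start() assumes that kernel STP (rather than a userspace helper) ends up being selected.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bridge's STP modes. */
enum stp_mode { NO_STP, KERNEL_STP, USER_STP };

/* Hypothetical model of the few bridge fields the commit message mentions. */
struct bridge_model {
    enum stp_mode stp_enabled;
    bool hello_timer_armed;      /* models a pending br->hello_timer */
    unsigned long hello_expires; /* models the timer's expiry time */
    unsigned long hello_time;    /* interval between hello timer runs */
};

/* Arm (or re-arm) the modelled hello timer relative to "now". */
static void arm_hello_timer(struct bridge_model *br, unsigned long now)
{
    br->hello_timer_armed = true;
    br->hello_expires = now + br->hello_time;
}

/*
 * Models bringing the bridge up after 76b91c32dd86: the hello timer is
 * started only if stp_enabled is already KERNEL_STP at that point.
 */
void model_dev_open(struct bridge_model *br, unsigned long now)
{
    if (br->stp_enabled == KERNEL_STP)
        arm_hello_timer(br, now);
}

/*
 * Models enabling kernel STP later on. Per the description of 6d18c73,
 * the hello timer is now started here as well (assumption: kernel STP
 * is the mode that gets selected).
 */
void model_stp_start(struct bridge_model *br, unsigned long now)
{
    br->stp_enabled = KERNEL_STP;
    arm_hello_timer(br, now);
}

/*
 * Models the hello timer callback. Per the description of 6d18c73,
 * the timer re-arms itself only while stp_enabled is KERNEL_STP.
 */
void model_hello_timer_expired(struct bridge_model *br, unsigned long now)
{
    br->hello_timer_armed = false;
    if (br->stp_enabled == KERNEL_STP)
        arm_hello_timer(br, now);
}
```

In this model, a bridge opened with NO_STP has no pending timer; calling model_stp_start() afterwards arms it, which is the case the commit message says previously required a re-ifup. Once stp_enabled goes back to NO_STP, the next expiry does not re-arm the timer. The sketch says nothing about why the real change leads to the pointer dereference reported here.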

No clue why this broke my system. I reverted that commit on top of v4.12-rc3-11-gf511c0b
to be extra sure and it booted normally.

Full dmesg, config, and bisect log are attached.

Regards,
Sebastian
View attachment "dmesg" of type "text/plain" (30777 bytes)

View attachment "config" of type "text/plain" (56938 bytes)

View attachment "bisect" of type "text/plain" (1697 bytes)
