Message-ID: <feae6d57-e8a9-36cf-56c5-e9334d7df303@windriver.com>
Date:   Fri, 26 Jul 2019 13:36:51 -0600
From:   Chris Friesen <chris.friesen@...driver.com>
To:     rt-users <linux-rt-users@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: [RT] hit recently-fixed PREEMPT_RT CFS-bandwidth timer locking issue
 in the wild

Hi all,

I thought people might be interested to hear that we recently hit the 
bug fixed by git commit c0ad4aa4d8 on multiple lab systems running the 
RHEL 7 "kernel-rt" kernel.  (But I think other versions are at risk as 
well.)

Interestingly, when the bug hit, the system just hung completely. Nothing
was emitted on netconsole or serial console, neither the hung task timer
nor the NMI watchdog triggered, CONFIG_DEBUG_SPINLOCK didn't output
anything, and magic sysrq didn't work on the serial console.  As you can
imagine, this was a bit frustrating.  I was finally able to cause a panic
by sending an NMI from the BMC, which allowed kdump to store the core
file so I could get stack traces.

Given how annoying it was to debug, I'd recommend backporting this fix
as far back as it applies.  HRTIMER_MODE_SOFT was introduced in mainline
in 4.16, but at least in the RHEL 7 kernel-rt package (and I think in the
vanilla PREEMPT_RT patches as well) hrtimers are run by default in
softirq context, so the fix might apply to all supported PREEMPT_RT
versions.

Chris
