lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ddd14f62-b6c9-4984-84be-6c999ea92e30@kzalloc.com>
Date: Wed, 6 Aug 2025 00:27:01 +0900
From: Yunseong Kim <ysk@...lloc.com>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Dmitry Vyukov <dvyukov@...gle.com>,
 Andrey Konovalov <andreyknvl@...il.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Thomas Gleixner <tglx@...utronix.de>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
 Byungchul Park <byungchul@...com>, max.byungchul.park@...il.com,
 Yeoreum Yun <yeoreum.yun@....com>, ppbuk5246@...il.com,
 linux-usb@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
 syzkaller@...glegroups.com, linux-kernel@...r.kernel.org,
 stable@...r.kernel.org
Subject: Re: [PATCH v3 0/4] kcov, usb: Fix invalid context sleep in softirq
 path on PREEMPT_RT

Hi Steve,

You're absolutely right to ask for clarification, and I now realize that
I didn’t explain the background clearly enough in my cover letter.

On 8/5/25 1:24 오전, Steven Rostedt wrote:
> On Sun,  3 Aug 2025 07:20:41 +0000
> Yunseong Kim <ysk@...lloc.com> wrote:
> 
>> This patch series resolves a sleeping function called from invalid context
>> bug that occurs when fuzzing USB with syzkaller on a PREEMPT_RT kernel.
>>
>> The regression was introduced by the interaction of two separate patches:
>> one that made kcov's internal locks sleep on PREEMPT_RT for better latency
> 
> Just so I fully understand this change. It is basically reverting the
> "better latency" changes? That is, with KCOV anyone running with PREEMPT_RT
> can expect non deterministic latency behavior?

The regression results from the interaction of two changes — and in my original
description, I inaccurately characterized one of them as being 
"for better latency." That was misleading.

The first change d5d2c51 replaced spin_lock_irqsave() with local_lock_irqsave()
in KCOV to ensure compatibility with PREEMPT_RT. This avoided using a
potentially sleeping lock with interrupts disabled.
At the time, as Sebastian noted:

 "There is no compelling reason to change the lock type to raw_spin_lock_t...
  Changing it would require to move memory allocation and deallocation outside
  of the locked section."

However, the situation changed after another patch 8fea0c8 converted the USB
HCD tasklet to a BH workqueue. As a result, usb_giveback_urb_bh() began running
with interrupts enabled, and the KCOV remote coverage collection section in
this path became re-entrant. To prevent nested coverage sections — which KCOV
doesn’t support — kcov_remote_start_usb_softirq() was updated to explicitly
disable interrupts during coverage collection f85d39d.

This combination — using a local_lock (which can sleep on RT) alongside
local_irq_save() — inadvertently created a scenario where a sleeping lock was
acquired in atomic context, triggering a kernel BUG on PREEMPT_RT.

So while the original KCOV locking change didn't require raw spinlocks at
the time, it became effectively incompatible with the USB softirq use case once
that path began relying on interrupt disabling for correctness. In this sense,
the "no compelling reason" eventually turned into a "necessary compromise."

To clarify: this patch series doesn't revert the previous change entirely.
It keeps the local_lock behavior for task context (where it's safe and
appropriate), but ensures atomic safety in interrupt/softirq contexts by
using raw spinlocks selectively where needed.

> This should be fully documented. I assume this will not be a problem as
> kcov is more for debugging and should not be enabled in production.
> 
> -- Steve
> 

Thanks again for raising this — I’ll make sure the changelog documents this
interaction more clearly.

Best regards,
Yunseong Kim

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ