Message-ID: <CANn89iKCE2ofvPTAcUbLCKE46_y3gMs_tdh1D5214WnAs8Fjmg@mail.gmail.com>
Date: Mon, 18 Aug 2025 09:58:42 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Yunseong Kim <ysk@...lloc.com>
Cc: "David S. Miller" <davem@...emloft.net>, Florian Westphal <fw@...len.de>, 
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Clark Williams <clrkwllms@...nel.org>, 
	Steven Rostedt <rostedt@...dmis.org>, LKML <linux-kernel@...r.kernel.org>, 
	linux-rt-devel@...ts.linux.dev, netdev@...r.kernel.org
Subject: Re: [RFC] net: inet: Potential sleep in atomic context in
 inet_twsk_hashdance_schedule on PREEMPT_RT

On Mon, Aug 18, 2025 at 9:46 AM Yunseong Kim <ysk@...lloc.com> wrote:
>
> Hi everyone,
>
> I'm looking at the inet_twsk_hashdance_schedule() function in
> net/ipv4/inet_timewait_sock.c and noticed a pattern that could be
> problematic for PREEMPT_RT kernels.
>
> The code in question is:
>
>  void inet_twsk_hashdance_schedule(struct inet_timewait_sock *tw,
>                                    struct sock *sk,
>                                    struct inet_hashinfo *hashinfo,
>                                    int timeo)
>  {
>      ...
>      local_bh_disable();
>      spin_lock(&bhead->lock);

Note this pattern is quite common; you should look at other instances
like inet_put_port(), inet_csk_listen_stop(), __inet_hash(),
__tcp_close(), and tcp_abort().
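
For example, inet_put_port() has the same shape (a sketch from memory
of net/ipv4/inet_hashtables.c; check the current tree for details):

 void inet_put_port(struct sock *sk)
 {
     local_bh_disable();
     __inet_put_port(sk);    /* takes spin_lock(&head->lock) internally */
     local_bh_enable();
 }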

>      spin_lock(&bhead2->lock);
>      ...
>  }
>
> Consider the sequence local_bh_disable() followed by spin_lock(). In a
> PREEMPT_RT-enabled kernel, spin_lock() is replaced by a mutex that can
> sleep. However, local_bh_disable() creates an atomic context by
> incrementing preempt_count, and sleeping is forbidden in atomic context.
>
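
On a !PREEMPT_RT build that description of local_bh_disable() is
accurate; simplified, it boils down to this (a sketch, not the literal
code in include/linux/bottom_half.h):

 /* sketch: !PREEMPT_RT local_bh_disable() */
 preempt_count_add(SOFTIRQ_DISABLE_OFFSET);  /* in_atomic() becomes true */
 barrier();

On PREEMPT_RT, however, local_bh_disable() is implemented differently
(via a per-CPU local lock in kernel/softirq.c) and does not make the
context atomic in the same way.
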
> If the spinlock is contended, this code would attempt to sleep inside an
> atomic context, leading to a "BUG: sleeping function called from invalid
> context" kernel panic.
>
> While this pattern is correct for non-RT kernels (and is essentially what
> spin_lock_bh() expands to), it causes critical issues in an RT environment.
>
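
Indeed, on a !PREEMPT_RT build spin_lock_bh() is essentially that
composition (simplified sketch, not the literal source):

 /* sketch: what !PREEMPT_RT spin_lock_bh(lock) behaves like */
 local_bh_disable();
 spin_lock(lock);
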
> A possible fix would be to replace this sequence with calls to
> spin_lock_bh(). Given that two separate locks are acquired, the most direct
> change would look like this:
>
>  // local_bh_disable();  <- removed
>  spin_lock_bh(&bhead->lock);
>  // BH is already disabled here, so a plain spin_lock() suffices
>  spin_lock(&bhead2->lock);
>
> Or, to be more explicit and safe if the logic ever changes:
>
>  spin_lock_bh(&bhead->lock);
>  spin_lock_bh(&bhead2->lock);
>
> However, since spin_lock_bh() on the first lock already disables bottom
> halves, the second lock only needs to be a plain spin_lock().
>
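
Put together, the minimal form of the proposed change is a one-line
substitution (a sketch against the code quoted above; the unlock path
would need a matching spin_unlock_bh()):

 -	local_bh_disable();
 -	spin_lock(&bhead->lock);
 +	spin_lock_bh(&bhead->lock);
  	spin_lock(&bhead2->lock);
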
> I would like to ask for your thoughts on this. Is my understanding correct,
> and would a patch to change this locking pattern be welcome?
>
> It's possible the PREEMPT_RT implications were not a primary concern at
> the time.
>
> Thanks for your time and guidance.
>
> Best regards,
> Yunseong Kim
