linux-kernel - Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240307111228.499a5dfd@gandalf.local.home>
Date: Thu, 7 Mar 2024 11:12:28 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, linke li <lilinke99@...com>,
 joel@...lfernandes.org, boqun.feng@...il.com, dave@...olabs.net,
 frederic@...nel.org, jiangshanlai@...il.com, josh@...htriplett.org,
 linux-kernel@...r.kernel.org, mathieu.desnoyers@...icios.com,
 qiang.zhang1211@...il.com, quic_neeraju@...cinc.com, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix
 rcu_torture_pipe_update_one()/rcu_torture_writer() data race and
 concurrency bug

On Thu, 7 Mar 2024 08:20:59 -0500
Steven Rostedt <rostedt@...dmis.org> wrote:

> When a write happens, it looks to see if the smallest watermark is hit,
> if so, calls irqwork to wakeup all the waiters.
> 
> The waiters will wake up, check to see if a signal is pending or if the
> ring buffer has hit the watermark the waiter was waiting for and exit the
> wait loop.
> 
> What the wait_index does, is just a way to force all waiters out of the wait
> loop regardless of the watermark the waiter is waiting for. Before a waiter
> goes into the wait loop, it saves the current wait_index. The waker will
> increment the wait_index and then call the same irq_work to wake up all the
> waiters.
> 
> After the wakeup happens, the waiter will test if the new wait_index
> matches what it was before it entered the loop, and if it is different, it
> falls out of the loop. Then the caller of the ring_buffer_wait() can
> re-evaluate if it needs to enter the wait again.
> 
> The wait_index loop exit was needed for when the file descriptor of a file
> that uses a ring buffer closes and it needs to wake up all the readers of
> that file descriptor to notify their tasks that the file closed.
> 
> So we can switch the:
> 
> 	rbwork->wait_index++;
> 	smp_wmb();
> 
> into just a:
> 
> 	(void)atomic_inc_return_release(&rbwork->wait_index);
> 
> and the:
> 
> 	smp_rmb()
> 	if (wait_index != work->wait_index)
> 
> into:
> 
> 	if (wait_index != atomic_read_acquire(&rb->wait_index))
> 
> I'll write up a patch.
> 
> Hmm, I have the same wait_index logic at the higher level doing basically
> the same thing (at the close of the file). I'll switch that over too.

Discussing this with Maitheu on IRC, we found two bugs with the current
implementation. One was a stupid bug with an easy fix, and the other is
actually a design flaw.

The first bug was the (wait_index != work->wait_index) check was done
*after* the schedule() call and not before it.

The second more fundamental bug is that there's still a race between the
first read of wait_index and the call to prepare_to_wait().

The ring_buffer code doesn't have enough context to know enough to loop or
not. If a file is being closed when another thread is just entering this
code, it could miss the wakeup.

As the callers of ring_buffer_wait() also do a loop, it's redundant to have
ring_buffer_wait() do a loop. It should just do a single wait, and then
exit and let the callers decide if it should loop again.

This will get rid of the need for the rbwork->wait_index and simplifies the
code.

Working on that patch now.

-- Steve