[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87cyeu5zgk.fsf@prevas.dk>
Date: Thu, 06 Mar 2025 09:35:07 +0100
From: Rasmus Villemoes <ravi@...vas.dk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Oleg Nesterov <oleg@...hat.com>, Mateusz Guzik <mjguzik@...il.com>, K
Prateek Nayak <kprateek.nayak@....com>, "Sapkal, Swapnil"
<swapnil.sapkal@....com>, Manfred Spraul <manfred@...orfullife.com>,
Christian Brauner <brauner@...nel.org>, David Howells
<dhowells@...hat.com>, WangYuli <wangyuli@...ontech.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, "Shenoy,
Gautham Ranjal" <gautham.shenoy@....com>, Neeraj.Upadhyay@....com,
Ananth.narayan@....com
Subject: Re: [PATCH] pipe_read: don't wake up the writer if the pipe is
still full
On Wed, Mar 05 2025, Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Mon, 3 Mar 2025 at 10:46, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> ENTIRELY UNTESTED, but it seems to generate ok code. It might even
>> generate better code than what we have now.
>
> Bah. This patch - which is now committed - was actually completely broken.
>
> And the reason that complete breakage didn't show up in testing is
> that I suspect nobody really tested or thought about the 32-bit case.
>
> That whole "use 16-bit indexes on 32-bit" is all fine and well, but I
> woke up in the middle of the night and realized that it doesn't
> actually work.
>
> Because now "pipe_occupancy()" is getting *entirely* the wrong
> answers. It just does
>
> return head - tail;
>
> but that only worked when the arithmetic was done modulo the size of
> the indexes. And now it isn't.
>
> So I still haven't *tested* this, but at an absolute minimum, we need
> something like this:
>
> --- a/include/linux/pipe_fs_i.h
> +++ b/include/linux/pipe_fs_i.h
> @@ -192,7 +192,7 @@
> */
> static inline unsigned int pipe_occupancy(unsigned int head,
> unsigned int tail)
> {
> - return head - tail;
> + return (pipe_index_t)(head - tail);
> }
>
> /**
>
> and there might be other cases where the pipe_index_t size might matter.
Yeah, for example
unsigned int count, head, tail, mask;
case FIONREAD:
mutex_lock(&pipe->mutex);
count = 0;
head = pipe->head;
tail = pipe->tail;
mask = pipe->ring_size - 1;
while (tail != head) {
count += pipe->bufs[tail & mask].len;
tail++;
}
mutex_unlock(&pipe->mutex);
If head has already wrapped around, say it's 0 or 1, and tail is close to
65535, that loop is gonna take forever and of course produce the wrong
result.
So yes, there are probably a lot more of these lurking.
There are probably not many tests that stuff 2^28 bytes through a pipe
to try to trigger such corner cases. Perhaps we can help whatever
automated tests are being done by initializing head and tail to
something like (pipe_index_t)-2 when the pipe is created?
Rasmus
Powered by blists - more mailing lists