linux-kernel - Re: [PATCH] pipe_read: don't wake up the writer if the pipe is still full

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87cyeu5zgk.fsf@prevas.dk>
Date: Thu, 06 Mar 2025 09:35:07 +0100
From: Rasmus Villemoes <ravi@...vas.dk>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Oleg Nesterov <oleg@...hat.com>,  Mateusz Guzik <mjguzik@...il.com>,  K
 Prateek Nayak <kprateek.nayak@....com>,  "Sapkal, Swapnil"
 <swapnil.sapkal@....com>,  Manfred Spraul <manfred@...orfullife.com>,
  Christian Brauner <brauner@...nel.org>,  David Howells
 <dhowells@...hat.com>,  WangYuli <wangyuli@...ontech.com>,
  linux-fsdevel@...r.kernel.org,  linux-kernel@...r.kernel.org,  "Shenoy,
 Gautham Ranjal" <gautham.shenoy@....com>,  Neeraj.Upadhyay@....com,
  Ananth.narayan@....com
Subject: Re: [PATCH] pipe_read: don't wake up the writer if the pipe is
 still full

On Wed, Mar 05 2025, Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Mon, 3 Mar 2025 at 10:46, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>>
>> ENTIRELY UNTESTED, but it seems to generate ok code. It might even
>> generate better code than what we have now.
>
> Bah. This patch - which is now committed - was actually completely broken.
>
> And the reason that complete breakage didn't show up in testing is
> that I suspect nobody really tested or thought about the 32-bit case.
>
> That whole "use 16-bit indexes on 32-bit" is all fine and well, but I
> woke up in the middle of the night and realized that it doesn't
> actually work.
>
> Because now "pipe_occupancy()" is getting *entirely* the wrong
> answers. It just does
>
>         return head - tail;
>
> but that only worked when the arithmetic was done modulo the size of
> the indexes. And now it isn't.
>
> So I still haven't *tested* this, but at an absolute minimum, we need
> something like this:
>
>   --- a/include/linux/pipe_fs_i.h
>   +++ b/include/linux/pipe_fs_i.h
>   @@ -192,7 +192,7 @@
>     */
>    static inline unsigned int pipe_occupancy(unsigned int head,
> unsigned int tail)
>    {
>   -       return head - tail;
>   +       return (pipe_index_t)(head - tail);
>    }
>
>    /**
>
> and there might be other cases where the pipe_index_t size might matter.

Yeah, for example

      unsigned int count, head, tail, mask;

      case FIONREAD:
              mutex_lock(&pipe->mutex);
              count = 0;
              head = pipe->head;
              tail = pipe->tail;
              mask = pipe->ring_size - 1;

              while (tail != head) {
                      count += pipe->bufs[tail & mask].len;
                      tail++;
              }
              mutex_unlock(&pipe->mutex);

If head has already wrapped around, say it's 0 or 1, and tail is close to
65535, that loop is gonna take forever and of course produce the wrong
result.

So yes, there are probably a lot more of these lurking.

There are probably not many tests that stuff 2^28 bytes through a pipe
to try to trigger such corner cases. Perhaps we can help whatever
automated tests are being done by initializing head and tail to
something like (pipe_index_t)-2 when the pipe is created?

Rasmus