linux-kernel - Re: [PATCH 1/2] pipe: change pipe_write() to never add a zero-sized buffer


Message-ID: <fff8727a-29d7-45fe-a997-d9bd55e07f52@amd.com>
Date: Tue, 11 Feb 2025 09:29:02 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Oleg Nesterov <oleg@...hat.com>
CC: Christian Brauner <brauner@...nel.org>, Jeff Layton <jlayton@...nel.org>,
	David Howells <dhowells@...hat.com>, "Gautham R. Shenoy"
	<gautham.shenoy@....com>, Mateusz Guzik <mjguzik@...il.com>, Neeraj Upadhyay
	<Neeraj.Upadhyay@....com>, Oliver Sang <oliver.sang@...el.com>, "Swapnil
 Sapkal" <swapnil.sapkal@....com>, WangYuli <wangyuli@...ontech.com>,
	<linux-fsdevel@...r.kernel.org>, <linux-kernel@...r.kernel.org>, "Linus
 Torvalds" <torvalds@...ux-foundation.org>
Subject: Re: [PATCH 1/2] pipe: change pipe_write() to never add a zero-sized
 buffer

Hello Oleg,

On 2/10/2025 10:52 PM, Oleg Nesterov wrote:
> Hi Prateek,
> 
> On 02/10, K Prateek Nayak wrote:
>>
>>   1-groups     1.00 [ -0.00]( 7.19)                0.95 [  4.90](12.39)
>>   2-groups     1.00 [ -0.00]( 3.54)                1.02 [ -1.92]( 6.55)
>>   4-groups     1.00 [ -0.00]( 2.78)                1.01 [ -0.85]( 2.18)
>>   8-groups     1.00 [ -0.00]( 1.04)                0.99 [  0.63]( 0.77)
>>  16-groups     1.00 [ -0.00]( 1.02)                1.00 [ -0.26]( 0.98)
>>
>> I don't see any regression / improvements from a performance standpoint
> 
> Yes, this patch shouldn't make any difference performance-wise, at least
> in this case. Although I was thinking the same thing when I sent "pipe_read:
> don't wake up the writer if the pipe is still full" ;)
> 
>> Tested-by: K Prateek Nayak <kprateek.nayak@....com>
> 
> Thanks! Please see v2, I've included your tag.

Thank you. I can confirm it is the same as the variant I tested.

> 
> Any chance you can also test the patch below?
> 
> To me it looks like a cleanup which makes the "merge small writes" logic
> more understandable. And note that "page-align the rest of the writes"
> doesn't work anyway if "total_len & (PAGE_SIZE-1)" can't fit in the last
> buffer.
> 
> However, in this particular case with DATASIZE=100 this patch can increase
> the number of copy_page_from_iter()'s in pipe_write(). And with this change
> receiver() can certainly get the short reads, so this can increase the
> number of sys_read() calls.
> 
> So I am just curious if this change can cause any noticeable regression on
> your machine.

For the sake of science:

==================================================================
Test          : sched-messaging
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:         baseline[pct imp](CV)  merge_writes[pct imp](CV)
  1-groups     1.00 [ -0.00](12.39)     1.08 [ -7.62](11.73)
  2-groups     1.00 [ -0.00]( 6.55)     0.97 [  2.52]( 3.01)
  4-groups     1.00 [ -0.00]( 2.18)     1.00 [  0.42]( 1.97)
  8-groups     1.00 [ -0.00]( 0.77)     1.03 [ -3.35]( 5.07)
 16-groups     1.00 [ -0.00]( 0.98)     1.01 [ -1.37]( 2.20)

I see some improvements up to 4 groups (160 tasks), but beyond that it
goes into slight regression territory. The variance is too large to
draw any conclusions, though.

Science experiment concluded.

> 
> Thank you!
> 
> Oleg.
> 
> --- a/fs/pipe.c
> +++ b/fs/pipe.c
> @@ -459,16 +459,16 @@ anon_pipe_write(struct kiocb *iocb, struct iov_iter *from)
>   	was_empty = pipe_empty(head, pipe->tail);
>   	chars = total_len & (PAGE_SIZE-1);
>   	if (chars && !was_empty) {
> -		unsigned int mask = pipe->ring_size - 1;
> -		struct pipe_buffer *buf = &pipe->bufs[(head - 1) & mask];
> +		struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
>   		int offset = buf->offset + buf->len;
> +		int avail = PAGE_SIZE - offset;
>   
> -		if ((buf->flags & PIPE_BUF_FLAG_CAN_MERGE) &&
> -		    offset + chars <= PAGE_SIZE) {
> +		if (avail && (buf->flags & PIPE_BUF_FLAG_CAN_MERGE)) {
>   			ret = pipe_buf_confirm(pipe, buf);
>   			if (ret)
>   				goto out;
>   
> +			chars = min_t(ssize_t, chars, avail);
>   			ret = copy_page_from_iter(buf->page, offset, chars, from);
>   			if (unlikely(ret < chars)) {
>   				ret = -EFAULT;
> 
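
To make the change in the merge rule concrete, here is a minimal user-space
sketch of the two decisions in the hunk above. It is not the kernel code:
merge_old() and merge_new() are hypothetical helpers, "used" stands in for
buf->offset + buf->len, PAGE_SIZE is assumed to be 4096, and the was_empty
and PIPE_BUF_FLAG_CAN_MERGE checks are assumed to have already passed.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/*
 * Old rule: merge only if the whole sub-page tail of the write fits
 * in the last buffer. Returns the number of bytes merged.
 */
static size_t merge_old(size_t used, size_t total_len)
{
	size_t chars = total_len & (PAGE_SIZE - 1);

	assert(used <= PAGE_SIZE);
	if (chars && used + chars <= PAGE_SIZE)
		return chars;
	return 0;
}

/*
 * New rule: merge as much of the tail as the last buffer can take,
 * even if that is only part of it. Returns the number of bytes merged.
 */
static size_t merge_new(size_t used, size_t total_len)
{
	size_t chars = total_len & (PAGE_SIZE - 1);
	size_t avail;

	assert(used <= PAGE_SIZE);
	avail = PAGE_SIZE - used;
	if (chars && avail)
		return chars < avail ? chars : avail;
	return 0;
}
```

With DATASIZE=100 and a last buffer that already holds 4000 bytes, the old
rule merges nothing and the whole write goes into a fresh buffer, while the
new rule merges 96 bytes and leaves 4 for a second copy into the next
buffer. That second copy is the extra copy_page_from_iter() Oleg mentions.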

-- 
Thanks and Regards,
Prateek