Message-ID: <787c3b62-f5d8-694d-cd2f-24b40848e39f@kernel.dk>
Date: Sat, 11 Feb 2023 08:33:04 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Ming Lei <ming.lei@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andy Lutomirski <luto@...nel.org>,
Dave Chinner <david@...morbit.com>,
Matthew Wilcox <willy@...radead.org>,
Stefan Metzmacher <metze@...ba.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux API Mailing List <linux-api@...r.kernel.org>,
io-uring <io-uring@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
Samba Technical <samba-technical@...ts.samba.org>
Subject: Re: copy on write for splice() from file to pipe?
On 2/11/23 8:05 AM, Ming Lei wrote:
> On Sat, Feb 11, 2023 at 07:13:44AM -0700, Jens Axboe wrote:
>> On 2/10/23 8:18 PM, Ming Lei wrote:
>>> On Fri, Feb 10, 2023 at 02:08:35PM -0800, Linus Torvalds wrote:
>>>> On Fri, Feb 10, 2023 at 1:51 PM Jens Axboe <axboe@...nel.dk> wrote:
>>>>>
>>>>> Speaking of splice/io_uring, Ming posted this today:
>>>>>
>>>>> https://lore.kernel.org/io-uring/20230210153212.733006-1-ming.lei@redhat.com/
>>>>
>>>> Ugh. Some of that is really ugly. Both 'ignore_sig' and
>>>> 'ack_page_consuming' just look wrong. Pure random special cases.
>>>>
>>>> And that 'ignore_sig' is particularly ugly, since the only thing that
>>>> sets it also sets SPLICE_F_NONBLOCK.
>>>>
>>>> And the *only* thing that actually then checks that field is
>>>> 'splice_from_pipe_next()', where there are exactly two
>>>> signal_pending() checks that it adds to, and
>>>>
>>>> (a) the first one is to protect from endless loops
>>>>
>>>> (b) the second one is irrelevant when SPLICE_F_NONBLOCK is set
>>>>
>>>> So honestly, just NAK on that series.
>>>>
>>>> I think that instead of 'ignore_sig' (which shouldn't exist), that
>>>> first 'signal_pending()' check in splice_from_pipe_next() should just
>>>> be changed into a 'fatal_signal_pending()'.
>>>
>>> Good point, here the signal is often from task_work_add() called by
>>> io_uring.
>>
>> Usually you'd use task_sigpending() to distinguish the two, but
>> fatal_signal_pending() as Linus suggests would also work. The only
>> concern here is that since you'll be potentially blocking on waiting for
>> the pipe to be readable - if task does indeed have task_work pending and
>> that very task_work is the one that will ensure that the pipe is now
>> readable, then your waiting condition will never be satisfied.
>
> The 2nd signal_pending() will break the loop to get task_work handled,
> so it is safe to only change the 1st one to fatal_signal_pending().
OK, but then the ignore_sig change should not be needed at all; just
changing that first check to fatal_signal_pending() would do the trick?
--
Jens Axboe