lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20536da2-57e1-4cf3-b040-bd456b86111a@gmail.com>
Date: Sat, 30 Dec 2023 21:15:53 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Jens Axboe <axboe@...nel.dk>, David Wei <dw@...idwei.uk>,
 io-uring@...r.kernel.org, netdev@...r.kernel.org
Cc: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Jesper Dangaard Brouer <hawk@...nel.org>, David Ahern <dsahern@...nel.org>,
 Mina Almasry <almasrymina@...gle.com>
Subject: Re: [RFC PATCH v3 15/20] io_uring: add io_recvzc request

On 12/21/23 21:32, Jens Axboe wrote:
> On 12/21/23 11:59 AM, Pavel Begunkov wrote:
>> On 12/20/23 18:09, Jens Axboe wrote:
>>> On 12/20/23 10:04 AM, Pavel Begunkov wrote:
>>>> On 12/20/23 16:27, Jens Axboe wrote:
>>>>> On 12/19/23 2:03 PM, David Wei wrote:
>>>>>> diff --git a/io_uring/net.c b/io_uring/net.c
>>>>>> index 454ba301ae6b..7a2aadf6962c 100644
>>>>>> --- a/io_uring/net.c
>>>>>> +++ b/io_uring/net.c
[...]
>>>>>> +    if (issue_flags & IO_URING_F_UNLOCKED)
>>>>>> +        return -EAGAIN;
>>>>>
>>>>> This seems odd, why? If we're called with IO_URING_F_UNLOCKED set, then
>>>>
>>>> It's my addition, let me explain.
>>>>
>>>> io_recvzc() -> io_zc_rx_recv() -> ... -> zc_rx_recv_frag()
>>>>
>>>> This chain posts completions to a buffer completion queue, and
>>>> we don't want extra locking to share it with io-wq threads. In
>>>> some sense it's quite similar to the CQ locking, considering
>>>> we restrict zc to DEFER_TASKRUN. And doesn't change anything
>>>> anyway because multishot cannot post completions from io-wq
>>>> and are executed from the poll callback in task work.
>>>>
>>>>> it's from io-wq. And returning -EAGAIN there will not do anything to
>>>>
>>>> It will. It's supposed to just requeue for polling (it's not
>>>> IOPOLL to keep retrying -EAGAIN), just like multishots do.
>>>
>>> It definitely needs a good comment, as it's highly non-obvious when
>>> reading the code!
>>>
>>>> Double checking the code, it can actually terminate the request,
>>>> which doesn't make much difference for us because multishots
>>>> should normally never end up in io-wq anyway, but I guess we
>>>> can improve it a liitle bit.
>>>
>>> Right, assumptions seems to be that -EAGAIN will lead to poll arm, which
>>> seems a bit fragile.
>>
>> The main assumption is that io-wq will eventually leave the
>> request alone and push it somewhere else, either queuing for
>> polling or terminating, which is more than reasonable. I'd
> 
> But surely you don't want it terminated from here? It seems like a very
> odd choice. As it stands, if you end up doing more than one loop, then
> it won't arm poll and it'll get failed.
>> add that it's rather insane for io-wq indefinitely spinning
>> on -EAGAIN, but it has long been fixed (for !IOPOLL).
> 
> There's no other choice for polling, and it doesn't do it for

zc rx is !IOPOLL, that's what I care about.

> non-polling. The current logic makes sense - if you do a blocking
> attempt and you get -EAGAIN, that's really the final result and you
> cannot sanely retry for !IOPOLL in that case. Before we did poll arm for
> io-wq, even the first -EAGAIN would've terminated it. Relying on -EAGAIN
> from a blocking attempt to do anything but fail the request with -EAGAIN
> res is pretty fragile and odd, I think that needs sorting out.
> 
>> As said, can be made a bit better, but it won't change anything
>> for real life execution, multishots would never end up there
>> after they start listening for poll events.
> 
> Right, probably won't ever be a thing for !multishot. As mentioned in my
> original reply, it really just needs a comment explaining exactly what
> it's doing and why it's fine.
> 
>>>> And it should also use IO_URING_F_IOWQ, forgot I split out
>>>> it from *F_UNLOCK.
>>>
>>> Yep, that'd be clearer.
>>
>> Not "clearer", but more correct. Even though it's not
>> a bug because deps between the flags.
> 
> Both clearer and more correct, I would say.
> 
[...]
>>>
>>> Oracle coding?
>>
>> I.e. knowing how later patches (should) look like.
>>
>>> Each patch stands separately, there's no reason not to make this one as
>>
>> They are not standalone, you cannot sanely develop anything not
>> thinking how and where it's used, otherwise you'd get a set of
>> functions full of sleeping but later used in irq context or just
>> unfittable into a desired framework. By extent, code often is
>> written while trying to look a step ahead. For example, first
>> patches don't push everything into io_uring.c just to wholesale
>> move it into zc_rx.c because of exceeding some size threshold.
> 
> Yes, this is how most patch series are - they will compile separately,
> but obviously won't necessarily make sense or be functional until you
> get to the end. But since you very much do have future knowledge in
> these patches, there's no excuse for not making them interact with each
> other better. Each patch should not pretend it doesn't know what comes

Which is exactly the reason why it is how it is.

> next in a series, if you can make a followup patch simpler with a tweak
> to a previous patch, that is definitely a good idea.
> 
> And here, even the end result would be better imho without having
> 
> if (a) {
> 	big blob of stuff
> } else {
> 	other blob of stuff
> }
> 
> when it could just be
> 
> if (a)
> 	return big_blog_of_stuff();
> 
> return other_blog_of_stuff();
> 
> instead.

That sounds like a good general advice, but the "blobs" are
not big nor expected to grow to require splitting, I can't say
it makes it any cleaner or simpler.

>>> clean as it can be. And an error case with the main bits inline is a lot
>>
>> I agree that it should be clean among all, but it _is_ clean
>> and readable, all else is stylistic nit picking. And maybe it's
>> just my opinion, but I also personally appreciate when a patch is
>> easy to review, which includes not restructuring all written before
>> with every patch, which also helps with back porting and other
>> developing aspects.
> 
> But that's basically my point, it even makes followup patches simpler to
> read as well. Is it stylistic? Certainly, I just prefer having the above
> rather than two big indentations. But it also makes the followup patch
> simpler
> and it's basically a one-liner change at that point, and a
> bigger hunk of added code that's the new function that handles the new
> case.
> 
>>> nicer imho than two separate indented parts. For the latter addition
>>> instead of the -EOPNOTSUPP, would probably be nice to have it in a
>>> separate function. Probably ditto for the page pool case here now, would
>>> make the later patch simpler too.
>>
>> If we'd need it in the future, we'll change it then, patches
>> stand separately, at least it's IMHO not needed in the current
>> series.
> 
> It's still an RFC series, please do change it for v4.

It's not my patch, but I don't view it as moving the patches in
any positive direction.

-- 
Pavel Begunkov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ