lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f762abbd-5732-456f-97f2-df91b006016e@gmail.com>
Date: Sat, 12 Oct 2024 00:12:25 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Mina Almasry <almasrymina@...gle.com>
Cc: David Wei <dw@...idwei.uk>, Stanislav Fomichev <stfomichev@...il.com>,
 io-uring@...r.kernel.org, netdev@...r.kernel.org,
 Jens Axboe <axboe@...nel.dk>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jesper Dangaard Brouer
 <hawk@...nel.org>, David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH v1 03/15] net: generalise net_iov chunk owners

On 10/11/24 23:25, Mina Almasry wrote:
> On Fri, Oct 11, 2024 at 3:02 PM Pavel Begunkov <asml.silence@...il.com> wrote:
>>
>> On 10/11/24 19:44, David Wei wrote:
>>> On 2024-10-09 09:28, Stanislav Fomichev wrote:
>>>> On 10/08, Pavel Begunkov wrote:
>>>>> On 10/8/24 16:46, Stanislav Fomichev wrote:
>>>>>> On 10/07, David Wei wrote:
>>>>>>> From: Pavel Begunkov <asml.silence@...il.com>
>>>>>>>
>>>>>>> Currently net_iov stores a pointer to struct dmabuf_genpool_chunk_owner,
>>>>>>> which serves as a useful abstraction to share data and provide a
>>>>>>> context. However, it's too devmem specific, and we want to reuse it for
>>>>>>> other memory providers, and for that we need to decouple net_iov from
>>>>>>> devmem. Make net_iov to point to a new base structure called
>>>>>>> net_iov_area, which dmabuf_genpool_chunk_owner extends.
>>>>>>>
>>>>>>> Signed-off-by: Pavel Begunkov <asml.silence@...il.com>
>>>>>>> Signed-off-by: David Wei <dw@...idwei.uk>
>>>>>>> ---
>>>>>>>     include/net/netmem.h | 21 ++++++++++++++++++++-
>>>>>>>     net/core/devmem.c    | 25 +++++++++++++------------
>>>>>>>     net/core/devmem.h    | 25 +++++++++----------------
>>>>>>>     3 files changed, 42 insertions(+), 29 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/net/netmem.h b/include/net/netmem.h
>>>>>>> index 8a6e20be4b9d..3795ded30d2c 100644
>>>>>>> --- a/include/net/netmem.h
>>>>>>> +++ b/include/net/netmem.h
>>>>>>> @@ -24,11 +24,20 @@ struct net_iov {
>>>>>>>            unsigned long __unused_padding;
>>>>>>>            unsigned long pp_magic;
>>>>>>>            struct page_pool *pp;
>>>>>>> - struct dmabuf_genpool_chunk_owner *owner;
>>>>>>> + struct net_iov_area *owner;
>>>>>>
>>>>>> Any reason not to use dmabuf_genpool_chunk_owner as is (or rename it
>>>>>> to net_iov_area to generalize) with the fields that you don't need
>>>>>> set to 0/NULL? container_of makes everything harder to follow :-(
>>>>>
>>>>> It can be that, but then io_uring would have a (null) pointer to
>>>>> struct net_devmem_dmabuf_binding it knows nothing about and other
>>>>> fields devmem might add in the future. Also, it reduces the
>>>>> temptation for the common code to make assumptions about the origin
>>>>> of the area / pp memory provider. IOW, I think it's cleaner
>>>>> when separated like in this patch.
>>>>
>>>> Ack, let's see whether other people find any issues with this approach.
>>>> For me, it makes the devmem parts harder to read, so my preference
>>>> is on dropping this patch and keeping owner=null on your side.
>>>
>>> I don't mind at this point which approach to take right now. I would
>>> prefer keeping dmabuf_genpool_chunk_owner today even if it results in a
>>> nullptr in io_uring's case. Once there are more memory providers in the
>>> future, I think it'll be clearer what sort of abstraction we might need
>>> here.
>>
>> That's the thing about abstractions, if we say that devmem is the
>> only first class citizen for net_iov and everything else by definition
>> is 2nd class that should strictly follow devmem TCP patterns, and/or
>> that struct dmabuf_genpool_chunk_owner is an integral part of net_iov
>> and should be reused by everyone, then preserving the current state
>> of the chunk owner is likely the right long term approach. If not, and
>> net_iov is actually a generic piece of infrastructure, then IMHO there
>> is no place for devmem sticking out of every bit single bit of it, with
>> structures that are devmem specific and can even be not defined without
>> devmem TCP enabled (fwiw, which is not an actual problem for
>> compilation, juts oddness).
>>
> 
> There is no intention of devmem TCP being a first class citizen or
> anything.

Let me note to avoid being misread, that kind of prioritisation can
have place and that's fine, but that usually happens when you build
on top of older code or user base sizes are much different. And
again, theoretically dmabuf_genpool_chunk_owner could be common code,
i.e. if you want to use dmabuf you need to use the structure
regardless of the provider of choice, and it'll do all dmabuf
handling. But the current chunk owner goes beyond that, and
would need some splitting if someone tries to have that kind of
an abstraction.

> Abstractly speaking, we're going to draw a line in the sand
> and say everything past this line is devmem specific and should be
> replaced by other users. In this patch you drew the line between
> dmabuf_genpool_chunk_owner and net_iov_area, which is fine by me on
> first look. What Stan and I were thinking at first glance is
> preserving dmabuf_* (and renaming) and drawing the line somewhere
> else, which would have also been fine.

True enough, I drew the line when it was convenient, io_uring
needs an extendible abstraction that binds net_iovs, and we'll
also have several different sets of net_iovs, so it fell onto
the object holding the net_iov array as the most natural option.
In that sense, we could've had that binding pointing to an
allocated io_zcrx_area, which would then point further into
io_uring, but that's one extra indirection.

As an viable alternative I don't like that much, instead of
trying to share struct net_iov_area, we can just make struct
struct net_iov::owner completely provider dependent and make
it void *, providers will be allowed to store there whatever
they wish.

> My real issue is whether its safe to do all this container_of while
> not always checking explicitly for the type of net_iov. I'm not 100%
> sure checking in tcp.c alone is enough, yet. I need to take a deeper
> look, no changes requested from me yet.

That's done the typical way everything in the kernel and just
inheritance works. When you get into devmem.c the page pool
callbacks, you for sure know that net_iov's passed are devmem's,
nobody should ever take one net_iov's ops and call it with a
second net_iov without care, the page pool follows it.

When someone wants to operate with devmem net_iov's but doesn't
have a callback it has to validate the origin as tcp.c now does
in 5/15. The nice part is that this patch changes types, so all
such places either explicitly listed in this patch, or it has to
pass through one of the devmem.h helpers, which is yet trivially
checkable.

> FWIW I'm out for the next couple of weeks. I'll have time to take a
> look during that but not as much as now.
> 

-- 
Pavel Begunkov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ