lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240710125533.7a14bbe7@kernel.org>
Date: Wed, 10 Jul 2024 12:55:33 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Mina Almasry <almasrymina@...gle.com>
Cc: netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-doc@...r.kernel.org, linux-alpha@...r.kernel.org,
 linux-mips@...r.kernel.org, linux-parisc@...r.kernel.org,
 sparclinux@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 linux-arch@...r.kernel.org, linux-kselftest@...r.kernel.org,
 bpf@...r.kernel.org, linux-media@...r.kernel.org,
 dri-devel@...ts.freedesktop.org, Donald Hunter <donald.hunter@...il.com>,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet
 <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Jonathan Corbet
 <corbet@....net>, Richard Henderson <richard.henderson@...aro.org>, Ivan
 Kokshaysky <ink@...assic.park.msu.ru>, Matt Turner <mattst88@...il.com>,
 Thomas Bogendoerfer <tsbogend@...ha.franken.de>, "James E.J. Bottomley"
 <James.Bottomley@...senpartnership.com>, Helge Deller <deller@....de>,
 Andreas Larsson <andreas@...sler.com>, Jesper Dangaard Brouer
 <hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, Steven
 Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
 Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Arnd Bergmann
 <arnd@...db.de>, Steffen Klassert <steffen.klassert@...unet.com>, Herbert
 Xu <herbert@...dor.apana.org.au>, David Ahern <dsahern@...nel.org>, Willem
 de Bruijn <willemdebruijn.kernel@...il.com>, Shuah Khan <shuah@...nel.org>,
 Sumit Semwal <sumit.semwal@...aro.org>, Christian König
 <christian.koenig@....com>, Bagas Sanjaya <bagasdotme@...il.com>, Christoph
 Hellwig <hch@...radead.org>, Nikolay Aleksandrov <razor@...ckwall.org>,
 Taehee Yoo <ap420073@...il.com>, Pavel Begunkov <asml.silence@...il.com>,
 David Wei <dw@...idwei.uk>, Jason Gunthorpe <jgg@...pe.ca>, Yunsheng Lin
 <linyunsheng@...wei.com>, Shailend Chand <shailend@...gle.com>, Harshitha
 Ramamurthy <hramamurthy@...gle.com>, Shakeel Butt <shakeel.butt@...ux.dev>,
 Jeroen de Borst <jeroendb@...gle.com>, Praveen Kaligineedi
 <pkaligineedi@...gle.com>, Willem de Bruijn <willemb@...gle.com>, Kaiyuan
 Zhang <kaiyuanz@...gle.com>
Subject: Re: [PATCH net-next v16 04/13] netdev: netdevice devmem allocator

On Wed, 10 Jul 2024 12:29:58 -0700 Mina Almasry wrote:
> On Wed, Jul 10, 2024 at 9:37 AM Jakub Kicinski <kuba@...nel.org> wrote:
> > On Wed, 10 Jul 2024 00:17:37 +0000 Mina Almasry wrote:  
> > > +     net_devmem_dmabuf_binding_get(binding);  
> >
> > Why does every iov need to hold a ref? pp holds a ref and does its own
> > accounting, so it won't disappear unless all the pages are returned.  
> 
> I guess it doesn't really need to, but this is the design/approach I
> went with, and I actually prefer it a bit. The design is borrowed from
> how struct dev_pagemap does this, IIRC. Every page allocated from the
> pgmap holds a reference to the pgmap to ensure the pgmap doesn't go
> away while some page that originated from it is out in the wild, and
> similarly I did so in the binding here.

Oh, you napi_pp_put_page() on the other end! I can see how that could
be fine.

> We could assume that the page_pool is accounting iovs for us, but that
> is not always true, right? page_pool_return_page() disconnects a
> netmem from the page_pool and AFAIU the page_pool can go away while
> there is such a netmem still in use in the net stack. Currently this
> can't happen with iovs because I currently don't support non-pp
> refcounting for iovs (so they're always recyclable), but you have a
> comment on the other patch asking why that works; depending on how we
> converge on that conversation, the details of how the pp refcounting
> could change.

Even then - we could take the ref as the page "leaks" out of the pool,
rather than doing it on the fast path, right? Or just BUG_ON() 'cause
that reference ain't coming back ;)

> It's nice to know that the binding refcounting will work regardless of
> the details of how the pp refcounting works. IMHO having the binding
> rely on the pp refcounting to ensure all the iovs are freed introduces
> some fragility.
> 
> Additionally IMO the net_devmem_dmabuf_binding_get/put aren't so
> expensive to want to optimize out, right? The allocation is a slow
> path anyway and the fast path recycles netmem.

Yes, I should have read patch 10. I think it's avoidable :) but with
recycling it can indeed perform just fine (do you happen to have
recycling rate stats from prod runs?)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ