Message-ID: <ZEFlG9rINkutmpCT@infradead.org>
Date: Thu, 20 Apr 2023 09:15:23 -0700
From: Christoph Hellwig <hch@...radead.org>
To: Alexander Lobakin <aleksander.lobakin@...el.com>
Cc: Christoph Hellwig <hch@...radead.org>,
Jakub Kicinski <kuba@...nel.org>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, netdev@...r.kernel.org,
Björn Töpel <bjorn@...nel.org>,
Magnus Karlsson <magnus.karlsson@...el.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jonathan Lemon <jonathan.lemon@...il.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>, bpf@...r.kernel.org,
virtualization@...ts.linux-foundation.org,
"Michael S. Tsirkin" <mst@...hat.com>,
Guenter Roeck <linux@...ck-us.net>,
Gerd Hoffmann <kraxel@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jens Axboe <axboe@...nel.dk>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH net-next] xsk: introduce xsk_dma_ops
On Thu, Apr 20, 2023 at 03:59:39PM +0200, Alexander Lobakin wrote:
> Hmm, currently almost all Ethernet drivers map Rx pages once and then
> just recycle them, keeping the original DMA mapping, which means a page
> can keep its first mapping for a very long time, often even for the
> lifetime of the struct device. The same goes for XDP sockets: the
> lifetime of a DMA mapping equals the lifetime of the socket.
> Does that mean we'd better revisit that approach and try switching to
> the dma_alloc_*() family (non-coherent/cached in our case)?
Yes, exactly. dma_alloc_noncoherent() can be used by the driver exactly
like alloc_pages() + dma_map_*() (including the dma_sync_*() calls on
reuse), but it has a huge number of advantages.
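
To make that concrete, here is a minimal, untested sketch of such a
conversion; the rx_buf structure and function names are made up for the
example:

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>

struct rx_buf {
	void		*vaddr;
	dma_addr_t	dma;
};

/* Today: separate allocation and mapping, done once per buffer. */
static int rx_buf_alloc_mapped(struct device *dev, struct rx_buf *buf)
{
	struct page *page = alloc_page(GFP_KERNEL);

	if (!page)
		return -ENOMEM;

	buf->dma = dma_map_page(dev, page, 0, PAGE_SIZE, DMA_FROM_DEVICE);
	if (dma_mapping_error(dev, buf->dma)) {
		__free_page(page);
		return -ENOMEM;
	}

	buf->vaddr = page_address(page);
	return 0;
}

/* With the noncoherent API: one call allocates and maps, and the
 * page is zeroed only here, at setup time.
 */
static int rx_buf_alloc_noncoherent(struct device *dev, struct rx_buf *buf)
{
	buf->vaddr = dma_alloc_noncoherent(dev, PAGE_SIZE, &buf->dma,
					   DMA_FROM_DEVICE, GFP_KERNEL);
	return buf->vaddr ? 0 : -ENOMEM;
}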
> Also, I remember trying to do that for one of my drivers, but the fact
> that all those functions zero the whole page(s) before returning them to
> the driver ruins the performance -- we don't need to zero buffers for
> receiving packets, and zeroing costs a ton of cycles (especially when a
> whole 4k page gets zeroed each time while the main body of the traffic
> is 64-byte frames).
Hmm, does the single zeroing at initial allocation time actually show up
in these profiles?
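
In steady state the per-packet recycle path should be just the sync
pair, i.e. the one-time zeroing is amortized over the lifetime of the
page. Again an untested sketch with made-up names, continuing the
example above:

/* Per-packet hot path: no allocation, no zeroing, only the cache
 * maintenance around transferring buffer ownership.
 */
static void rx_buf_recycle(struct device *dev, struct rx_buf *buf)
{
	/* CPU takes ownership to read the received frame. */
	dma_sync_single_for_cpu(dev, buf->dma, PAGE_SIZE, DMA_FROM_DEVICE);

	/* ... process the frame: run XDP, build an skb, etc. ... */

	/* Hand the buffer back to the device for the next packet. */
	dma_sync_single_for_device(dev, buf->dma, PAGE_SIZE,
				   DMA_FROM_DEVICE);
}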