lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230731114427.0da1f73b@kernel.org>
Date: Mon, 31 Jul 2023 11:44:27 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Michael Chan <michael.chan@...adcom.com>
Cc: Jesper Dangaard Brouer <hawk@...nel.org>, davem@...emloft.net,
 netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
 gospo@...adcom.com, bpf@...r.kernel.org, somnath.kotur@...adcom.com, Ilias
 Apalodimas <ilias.apalodimas@...aro.org>
Subject: Re: [PATCH net-next 3/3] bnxt_en: Let the page pool manage the DMA
 mapping

On Mon, 31 Jul 2023 11:16:55 -0700 Michael Chan wrote:
> > > Remember pp.max_len is used for dma_sync_for_device.
> > > If driver is smart, it can set pp.max_len according to MTU, as the (DMA
> > > sync for) device knows hardware will not go beyond this.
> > > On Intel "dma_sync_for_device" is a no-op, so most drivers done
> > > optimized for this. I remember is had HUGE effects on ARM EspressoBin board.  
> >
> > Note that (AFAIU) there is no MTU here, these are pages for LRO/GRO,
> > they will be filled with TCP payload start to end. page_pool_put_page()
> > does nothing for non-last frag, so we'll only sync for the last
> > (BNXT_RX_PAGE-sized) frag released, and we need to sync the entire
> > host page.  
> 
> Correct, there is no MTU here.  Remember this matters only when
> PAGE_SIZE > BNXT_RX_PAGE_SIZE (e.g. 64K PAGE_SIZE and 32K
> BNXT_RX_PAGE_SIZE).  I think we want to dma_sync_for_device for 32K in
> this case.

Maybe I'm misunderstanding. Let me tell you how I think this works and
perhaps we should update the docs based on this discussion.

Note that the max_len is applied to the full host page when the full
host page is returned. Not to fragments, and not at allocation.

The .max_len is the max offset within the host page that the HW may
access. For page-per-packet, 1500B MTU this could matter quite a bit,
because we only have to sync ~1500B rather than 4096B.

      some wasted headroom/padding, pp.offset can be used to skip
    /        device may touch this section
   /        /                     device will not touch, sync not needed
  /        /                     /
|**| ===== MTU 1500B ====== | - skb_shinfo and unused --- |
   <------ .max_len -------->

For fragmented pages it becomes:

                         middle skb_shinfo
                        /                         remainder
                       /                               |
|**| == MTU == | - shinfo- |**| == MTU == | - shinfo- |+++|
   <------------ .max_len ---------------->

So max_len will only exclude the _last_ shinfo and the wasted space
(reminder of dividing page by buffer size). We must sync _all_ packet
sections ("== MTU ==") within the packet.

In bnxt's case - the page is fragmented (latter diagram), and there is
no start offset or wasted space. Ergo .max_len = PAGE_SIZE.

Where did I get off the track?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ