Date:   Fri, 20 Mar 2020 15:37:33 -0700
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
Cc:     Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
        "Jubran, Samih" <sameehj@...zon.com>,
        Jeff Kirsher <jeffrey.t.kirsher@...el.com>,
        Netdev <netdev@...r.kernel.org>, bpf@...r.kernel.org,
        zorik@...zon.com, akiyano@...zon.com, gtzalik@...zon.com,
        Toke Høiland-Jørgensen <toke@...e.dk>,
        Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        John Fastabend <john.fastabend@...il.com>,
        David Ahern <dsahern@...il.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Lorenzo Bianconi <lorenzo@...nel.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        kuba@...nel.org
Subject: Re: [PATCH RFC v1 05/15] ixgbe: add XDP frame size to driver

On Fri, Mar 20, 2020 at 2:44 PM Jesper Dangaard Brouer
<brouer@...hat.com> wrote:
>
> On Wed, 18 Mar 2020 14:23:09 -0700
> Alexander Duyck <alexander.duyck@...il.com> wrote:
>
> > On Wed, Mar 18, 2020 at 1:04 PM Maciej Fijalkowski
> > <maciej.fijalkowski@...el.com> wrote:
> > >
> > > On Tue, Mar 17, 2020 at 06:29:33PM +0100, Jesper Dangaard Brouer wrote:
> > > > The ixgbe driver uses different memory models depending on PAGE_SIZE at
> > > > compile time. For PAGE_SIZE 4K it uses page splitting, meaning for
> > > > normal MTU frame size is 2048 bytes (and headroom 192 bytes).
> > >
> > > To be clear the 2048 is the size of buffer given to HW and we slice it up
> > > in a following way:
> > > - 192 bytes dedicated for headroom
> > > - 1500 is max allowed MTU for this setup
> > > - 320 bytes for tailroom (skb shinfo)
> > >
> > > In case you go with a higher MTU then a 3K buffer would be used and it would
> > > come from an order-1 page and we still do the half split. Just FYI all of this
> > > is for PAGE_SIZE == 4k and L1$ size == 64.
> >
> > True, but for most people this is the most common case since these are
> > the standard for x86.
> >
> > > > For PAGE_SIZE larger than 4K, the driver advances its rx_buffer->page_offset
> > > > with the frame size "truesize".
> > >
> > > Alex, couldn't we base the truesize here somehow on ixgbe_rx_bufsz() since
> > > these are the sizes that we are passing to hw? I must admit I haven't been
> > > in touch with systems with PAGE_SIZE > 4K.
> >
> > With a page size greater than 4K we can actually get many more uses
> > out of a page by using the frame size to determine the truesize of the
> > packet. The truesize is the memory footprint currently being held by
> > the packet. So once the packet is filled we just have to add the
> > headroom and tailroom to whatever the hardware wrote instead of having
> > to use what we gave to the hardware. That gives us better efficiency;
> > if we used ixgbe_rx_bufsz() we would penalize small packets and that
> > in turn would likely hurt performance.
> >
> > > >
> > > > When the driver enables XDP it uses build_skb() which provides the necessary
> > > > tailroom for XDP-redirect.
> > >
> > > We still allow to load XDP prog when ring is not using build_skb(). I have
> > > a feeling that we should drop this case now.
> > >
> > > Alex/John/Bjorn WDYT?
> >
> > The comment Jesper had about using build_skb() when XDP is in
> > use is incorrect. The two are not correlated. The underlying buffer is
> > the same, however we drop the headroom and tailroom if we are in
> > _RX_LEGACY mode. We default to build_skb and the option of switching
> > to legacy Rx is controlled via the device private flags.
>
> Thanks for catching that.
>
> > However with that said the change itself is mostly harmless, and
> > likely helps to resolve issues that would be seen if somebody were to
> > enable XDP while having the RX_LEGACY flag set.
>
> So what is the path forward(?).  Are you/Intel okay with disallowing
> XDP when the RX_LEGACY flag is set?

Why would we need to disallow it? It won't work for the redirect use
case, but other use cases should work just fine. I thought with this
patch set you were correctly reporting the headroom or tailroom so
that we would either reallocate or just drop the frame if it cannot be
handled.
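
The buffer arithmetic discussed in this thread can be sketched as below.
This is only an illustration under stated assumptions, not the ixgbe code:
the constants are the figures quoted above for PAGE_SIZE == 4K with 64-byte
cache lines, every sketch_* name is hypothetical, and the truesize helper
ignores any alignment rounding the driver may apply.

```c
#include <stdbool.h>

/* Figures quoted in the thread (PAGE_SIZE == 4K, 64-byte cache lines).
 * These are illustrative constants, not the driver's own macros. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_HEADROOM   192u /* reserved in front of the packet data */
#define SKETCH_TAILROOM   320u /* room for skb shinfo behind the data */

/* Half-page split: each 4K page yields two 2048-byte Rx buffers. */
static unsigned int sketch_frame_sz(void)
{
	return SKETCH_PAGE_SIZE / 2;
}

/* Bytes available for packet data in one buffer. In legacy Rx mode the
 * headroom and tailroom are dropped, so the whole buffer is data. */
static unsigned int sketch_data_room(unsigned int frame_sz, bool legacy_rx)
{
	if (legacy_rx)
		return frame_sz;
	return frame_sz - SKETCH_HEADROOM - SKETCH_TAILROOM;
}

/* Memory footprint for PAGE_SIZE > 4K: headroom and tailroom are added to
 * what the hardware actually wrote, so small packets stay cheap. */
static unsigned int sketch_truesize(unsigned int pkt_len)
{
	return SKETCH_HEADROOM + pkt_len + SKETCH_TAILROOM;
}

/* Redirect needs the tailroom, so it cannot work in legacy Rx mode;
 * other XDP use cases are not covered by this check. */
static bool sketch_can_redirect(bool legacy_rx)
{
	return !legacy_rx;
}
```

With these numbers, sketch_data_room(sketch_frame_sz(), false) leaves 1536
bytes for the frame, which is why a 1500 byte MTU fits in the 2048-byte
buffer, while sketch_truesize(64) charges a 64-byte packet 576 bytes rather
than a full buffer.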
