lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ+HfNg=vUqUszLX+C6LvYRq8J_X_UiU4x63FCaXWqdRjV5DGA@mail.gmail.com>
Date:   Wed, 30 Jan 2019 15:16:15 +0100
From:   Björn Töpel <bjorn.topel@...il.com>
To:     Magnus Karlsson <magnus.karlsson@...il.com>
Cc:     Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Network Development <netdev@...r.kernel.org>,
        Björn Töpel <bjorn.topel@...el.com>,
        intel-wired-lan <intel-wired-lan@...ts.osuosl.org>,
        Magnus Karlsson <magnus.karlsson@...el.com>
Subject: Re: [Intel-wired-lan] [PATCH net] ixgbe: fix potential RX buffer
 starvation for AF_XDP

Den ons 30 jan. 2019 kl 10:35 skrev Magnus Karlsson <magnus.karlsson@...il.com>:
>
> On Tue, Jan 29, 2019 at 8:15 PM Jakub Kicinski
> <jakub.kicinski@...ronome.com> wrote:
[...]
> >
> > I may not understand the problem fully, but isn't it kind of normal
> > that if you create a ring empty you'll never receive packets?  And it
> > should be reasonably easy to catch while writing an app from scratch
> > (i.e. it behaves deterministically).
>
> Agree that this should be the normal behavior for a NIC. The question
> is how to get out of this situation.There are two options: punt this
> to the application writer or fix this in the driver. I chose to fix
> the driver since this removes complexity in the application.
>

Magnus' fix addresses a race/timing issue. At zero-copy initialization
point, if the fill ring was empty, the driver (both i40e and ixgbe)
would stop retrying to "allocate" zero-copy frames from the fill
ring. So, frames would never be received, even if the fill ring was
filled at a later point.

If the driver runs-dry in terms of Rx buffer if one or more frames has
been received, the driver will retry polling the fill-ring. However at
initialization point, if the fill-ring was empty, the driver would
just give up and never retry.

As Magnus stated, there is no "notify the kernel that the items has
appeared in the fill ring" (other than a HW mechanism where the tail
pointer is a door bell) on the Rx side, so for the Intel drivers it's
up to the driver to solve this.

> > Plus user space can already do the kick manually: create the ring empty,
> > attach prog, put packets on the ring, and kick xmit - that would work,
> > no?
>
> Unfortunately there is no way such a kick could be guaranteed to work
> since it only guarantees to wake up Tx. If the Rx is in another NAPI
> context, then this will not work. On i40e and ixgbe, the Rx and Tx
> processing is in the same NAPI so this would indeed work. But can I
> guarantee that this is true for all current and future NICs?
> Personally I would like to have Tx in a separate NAPI as I could get
> better performance that way, at least in the i40e driver. You also do
> not need to have any Tx at all in your socket, as it could be Rx only.
> These arguments are why I went with the approach to fix this in the
> driver, instead of a hard-to-explain extra step in the application
> that might or might not work depending on your driver.
>
> I believe the crux of the problem is that we need to tell the driver
> to take buffers from the fill ring then put references to them in the
> HW Rx ring so that packets can be received from the wire. In this
> patch I went for installing an XDP program and calling bind() as "the"
> two places. But there could be a scenario where I do bind then install
> the XDP program and after that populate my fill queue and not call any
> other syscall ever! This would not be covered by this patch :-(.
>
> So you might be correct that the only way to fix this is from the
> application. But how to do it in a simple and elegant way? I do not
> think sendto() is the right answer, but what about poll()? Can we wire
> that up to kick Rx through the busy poll mechanism? I would like the
> solution to be simple and not complicate things for the user, if
> possible. Another idea would be to use a timer in the driver to
> periodically check the fill queue in the case it is empty. This would
> be easy for application writers but add complexity for people
> implementing AF_XDP ZC support in their drivers. Any ideas? What do
> you prefer?
>

Hmm, actually your patch *does* cover the starve Rx scenarios. One
downside, which is related to the fact that the AF_XDP ZC ndo doens't
state if there actually is a socket that needs Rx (we'll just enable
Rx zero unconditionally), is that the driver will try to allocate
buffers from the fill ring even if the are no sockets with Rx
queues. In this case, the fill ring probably never be filled from
userland. :-) But fixing that would go in to another patch IMO!

I think we can leave the NAPI discussion a side for now. Currently,
the SPSC queue setup require a single NAPI context. Let's relax that
later (if ever). (Or did I misunderstand your points?)


Cheers,
Björn


> Thanks: Magnus
>
> > Putting the kick in ixgbe_xdp_setup() seems a tiny bit random to me,
> > but perhaps I'm not seeing the rationale clearly.
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@...osl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ