Date:   Tue, 23 Aug 2022 22:18:06 -0700
From:   John Fastabend <john.fastabend@...il.com>
To:     Maxim Mikityanskiy <maximmi@...dia.com>,
        "maciej.fijalkowski@...el.com" <maciej.fijalkowski@...el.com>
Cc:     "magnus.karlsson@...el.com" <magnus.karlsson@...el.com>,
        "alexandr.lobakin@...el.com" <alexandr.lobakin@...el.com>,
        "bjorn@...nel.org" <bjorn@...nel.org>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
        Saeed Mahameed <saeedm@...dia.com>,
        "ast@...nel.org" <ast@...nel.org>
Subject: Re: [PATCH v2 bpf-next 00/14] xsk: stop NAPI Rx processing on full
 XSK RQ

Maxim Mikityanskiy wrote:
> On Tue, 2022-08-23 at 13:24 +0200, Maciej Fijalkowski wrote:
> > On Tue, Aug 23, 2022 at 09:49:43AM +0000, Maxim Mikityanskiy wrote:
> > > Anyone from Intel? Maciej, Björn, Magnus?
> > 
> > Hey Maxim,
> > 
> > how about keeping it simple and going with option 1? This behavior was
> > even proposed in the v1 submission of the patch set we're talking about.
> 
> Yeah, I know it was the behavior in v1. It was me who suggested not
> dropping that packet, and I didn't realize back then that it had this
> undesired side effect - sorry for that!

Just want to reiterate what was said originally: you'll definitely confuse
our XDP programs if they ever see the same pkt twice. It would confuse
metrics and any "tap" and so on.

> 
> Option 1 sounds good to me as the first remedy, we can start with that.
> 
> However, it's not perfect: when NAPI and the application are pinned to
> the same core, if the fill ring is bigger than the RX ring (which makes
> sense in case of multiple sockets on the same UMEM), the driver will
> constantly get into this condition, drop one packet, yield to
> userspace, the application will of course clean up the RX ring, but
> then the process will repeat.

Maybe a dumb question, I haven't followed the entire thread or at least
don't recall it. Could you yield when you hit a high-water mark at
some point before the pkt drop?
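
To make the high-water-mark idea concrete, here is a rough sketch in plain
userspace C, not driver code. Everything in it is an assumption made up for
illustration: struct ring, ring_used(), should_yield(), rx_poll() and the 3/4
threshold are hypothetical and are not the xsk or any driver API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical single-producer/single-consumer ring, NOT the real xsk queue. */
struct ring {
	uint32_t prod;	/* free-running producer index */
	uint32_t cons;	/* free-running consumer index */
	uint32_t size;	/* number of entries */
};

/* Entries currently occupied; unsigned subtraction handles index wrap. */
static uint32_t ring_used(const struct ring *r)
{
	return r->prod - r->cons;
}

/* True once the ring is at least 3/4 full (arbitrary, assumed threshold). */
static bool should_yield(const struct ring *r)
{
	assert(r->size != 0);
	return ring_used(r) >= r->size - r->size / 4;
}

/*
 * Simulated Rx poll: produce up to 'budget' entries, but stop early at the
 * high-water mark so the consumer can run before anything has to be dropped.
 * Returns the number of entries produced.
 */
static uint32_t rx_poll(struct ring *r, uint32_t budget)
{
	uint32_t done = 0;

	while (done < budget && !should_yield(r)) {
		r->prod++;
		done++;
	}
	return done;
}
```

The point is only that the producer stops while there is still headroom in
the ring, so no packet needs to be dropped or seen twice. Where such a check
would live in a real driver, and what threshold makes sense, is left open.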

> 
> That means, we'll always have a small percentage of packets dropped,
> which may trigger the congestion control algorithms on the other side,
> slowing down the TX to unacceptable speeds (because packet drops won't
> disappear after slowing down just a little).
> 
> Given the above, we may need a more complex solution for the long term.
> What do you think?
> 
> Also, if the application uses poll(), this whole logic (either v1 or
> v2) seems not needed, because poll() returns to the application when
> something becomes available in the RX ring, but I guess the reason for
> adding it was that fantastic 78% performance improvement mentioned in
> the cover letter?
> 
