Message-ID: <20200904165837.16d8ecfd@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Fri, 4 Sep 2020 16:58:37 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Björn Töpel <bjorn.topel@...el.com>
Cc:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Björn Töpel <bjorn.topel@...il.com>, Eric Dumazet <eric.dumazet@...il.com>,
        ast@...nel.org, daniel@...earbox.net, netdev@...r.kernel.org,
        bpf@...r.kernel.org, magnus.karlsson@...el.com,
        davem@...emloft.net, john.fastabend@...il.com,
        intel-wired-lan@...ts.osuosl.org
Subject: Re: [PATCH bpf-next 0/6] xsk: exit NAPI loop when AF_XDP Rx ring is full

On Fri, 4 Sep 2020 16:32:56 +0200 Björn Töpel wrote:
> On 2020-09-04 16:27, Jesper Dangaard Brouer wrote:
> > On Fri,  4 Sep 2020 15:53:25 +0200
> > Björn Töpel <bjorn.topel@...il.com> wrote:
> >   
> >> On my machine the "one core scenario Rx drop" performance went from
> >> ~65Kpps to 21Mpps. In other words, from "not usable" to
> >> "usable". YMMV.  
> > 
> > We have observed this kind of dropping off an edge before with softirq
> > (when userspace process runs on same RX-CPU), but I thought that Eric
> > Dumazet solved it in 4cd13c21b207 ("softirq: Let ksoftirqd do its job").
> > 
> > I wonder what makes AF_XDP different or if the problem has come back?
> >   
> 
> I would say this is not the same issue. The problem is that the softirq 
> is busy dropping packets since the AF_XDP Rx is full. So, the cycles 
> *are* split 50/50, which is not what we want in this case. :-)
> 
> This issue is more a case of "Intel AF_XDP ZC drivers do stupid work" than
> of fairness. If the Rx ring is full, then there is really no use in letting
> the NAPI loop continue.
> 
> Would you agree, or am I rambling? :-P

I wonder if ksoftirqd never kicks in because we are able to discard 
the entire ring before we run out of softirq "slice".


I've been pondering the exact problem you're solving with Maciej
recently: the efficiency of AF_XDP on one core with NAPI processing.

Your solution (even though it admittedly helps, and is quite simple)
still potentially leaves the application unable to process packets
until the queue fills up. This will be bad for latency.

Why don't we move closer to application polling? Never re-arm the NAPI 
after RX, let the application ask for packets, re-arm if 0 polled. 
You'd get max batching, min latency.

Who's the rambling one now? :-D
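
To make the "never re-arm after RX, re-arm if 0 polled" idea concrete,
here is a rough user-space toy model. It is only a sketch of the policy,
not driver or kernel code: ToyQueue, packets_arrive, app_poll and
run_demo are hypothetical names invented for the illustration, and the
"interrupt" is just a counter. It assumes an arrival raises an interrupt
only while the queue is armed, and that the application drains the queue
in batches of at most `budget` packets.

```python
from dataclasses import dataclass


@dataclass
class ToyQueue:
    """Toy model of one Rx queue; not a kernel or libbpf API."""
    pending: int = 0        # packets waiting in the ring
    irq_armed: bool = True  # will the next arrival raise an interrupt?
    irqs: int = 0           # interrupts taken so far

    def packets_arrive(self, n):
        self.pending += n
        if n > 0 and self.irq_armed:
            # The interrupt fires once and stays masked until re-armed.
            self.irq_armed = False
            self.irqs += 1

    def app_poll(self, budget):
        """Application asks for up to `budget` packets."""
        got = min(self.pending, budget)
        self.pending -= got
        if got == 0:
            # Empty poll: hand wake-up duty back to the interrupt.
            self.irq_armed = True
        return got


def run_demo():
    q = ToyQueue()
    q.packets_arrive(100)   # first burst: one interrupt wakes the app
    q.packets_arrive(50)    # arrives while masked: no interrupt
    polls = []
    while True:
        got = q.app_poll(64)
        polls.append(got)
        if got == 0:        # nothing left, interrupt is armed again
            break
    q.packets_arrive(10)    # armed, so this costs a second interrupt
    return polls, q.irqs


if __name__ == "__main__":
    print(run_demo())
```

In this model the application pulls 150 packets in three batches and
pays for a single interrupt; the second interrupt is only taken after a
poll has come back empty. A real implementation would also have to
close the race between the empty poll and the re-arm, which the toy
ignores.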