Message-ID: <8f698ac5-916f-9bb0-cce2-f00fba6ba407@intel.com>
Date: Tue, 8 Sep 2020 08:58:30 +0200
From: Björn Töpel <bjorn.topel@...el.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Jesper Dangaard Brouer <brouer@...hat.com>,
Björn Töpel <bjorn.topel@...il.com>,
Eric Dumazet <eric.dumazet@...il.com>, ast@...nel.org,
daniel@...earbox.net, netdev@...r.kernel.org, bpf@...r.kernel.org,
magnus.karlsson@...el.com, davem@...emloft.net,
john.fastabend@...il.com, intel-wired-lan@...ts.osuosl.org
Subject: Re: [PATCH bpf-next 0/6] xsk: exit NAPI loop when AF_XDP Rx ring is
full
On 2020-09-07 20:40, Jakub Kicinski wrote:
> On Mon, 7 Sep 2020 15:37:40 +0200 Björn Töpel wrote:
>> > I've been pondering the exact problem you're solving with Maciej
>> > recently: the efficiency of AF_XDP on one core, with the NAPI
>> > processing on the same core.
>> >
>> > Your solution (even though it admittedly helps, and is quite simple)
>> > still potentially leaves the application unable to process packets
>> > until the queue fills up. This will be bad for latency.
>> >
>> > Why don't we move closer to application polling? Never re-arm the NAPI
>> > after RX, let the application ask for packets, re-arm if 0 polled.
>> > You'd get max batching, min latency.
>> >
>> > Who's the rambling one now? :-D
>> >
>>
>> :-D No, these are all very good ideas! We actually experimented with
>> this in the busy-poll series a while back -- NAPI busy-polling does
>> exactly this kind of "application polling".
>>
>> However, I wonder whether busy-polling would perform better than the
>> scenario above (i.e. when ksoftirqd never kicks in): executing the
>> NAPI poll *explicitly* in the syscall, versus implicitly from the
>> softirq.
>>
>> Hmm, thinking out loud here. A simple(r) patch enabling busy poll:
>> export the napi_id to the AF_XDP socket (xdp->rxq->napi_id to
>> sk->sk_napi_id), and do the sk_busy_poll_loop() in sendmsg.
>>
>> Or did you have something completely different in mind?
>
> My understanding is that busy-polling allows the application to pick
> up packets from the ring before the IRQ fires.
>
> What we're more concerned about is the IRQ firing in the first place.
>
> application:  busy     | needs packets | idle
> -----------------------+---------------+----------------------
> standard     |         | polls NAPI    | keep polling? sleep?
> busy poll    | IRQ on  | IRQ off       | IRQ off       IRQ on
> -------------+---------+---------------+----------------------
>              |         | polls once    |
> AF_XDP       | IRQ off | IRQ off       | IRQ on
>
>
> So busy polling is pretty orthogonal. It only applies to the
> "application needs packets" time. What we'd need is for the application
> to be able to suppress NAPI polls, promising the kernel that it will
> busy poll when appropriate.
>
Ah, nice write-up! Thanks! So what you are describing is a strict
busy-poll mechanism, not the opportunistic (existing) NAPI busy-poll.
This would be a new kind of mechanism, and a very welcome one in
AF_XDP-land. More below.
>> As for this patch set, I think it would make sense to pull it in since
>> it makes the single-core scenario *much* better, and it is pretty
>> simple. Then do the application polling as a separate, potential
>> improvement series.
>
> Up to you, it's extra code in the driver so mostly your code to
> maintain.
>
> I think that if we implement what I described above, everyone will
> use that on a single-core setup, so this set would be dead code
> (assuming the RQ is sized appropriately). But again, your call :)
>
Now, I agree that the busy-poll you describe above would be the best
option, but from my perspective it is a much larger patch set that
involves experimentation. I will explore that, but I still think this
series should go in sooner to make the single-core scenario usable
*today*.
Ok, back to the busy-poll ideas. I'll call your idea "strict busy-poll",
i.e. the NAPI loop is *only* driven by userland, and interrupts stay
disabled. "Syscall driven poll-mode driver". :-)
On the driver side (again, only talking Intel here, since that's what I
know the details of), the NAPI context would only cover AF_XDP queues,
so that other queues are not starved.
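As a rough structural sketch (the my_* types and helpers below are
hypothetical, not actual i40e/ice code), the AF_XDP queue pair would
get a NAPI instance of its own:

/* Hypothetical sketch; only the structure matters: the AF_XDP queue
 * pair has its own NAPI instance, so busy-polling it from userland
 * cannot starve the queues that feed the regular stack.
 */
static int my_xsk_napi_poll(struct napi_struct *napi, int budget)
{
        struct my_xsk_qp *qp = container_of(napi, struct my_xsk_qp, napi);
        int done;

        done = my_xsk_clean_rx(qp, budget);     /* AF_XDP Rx ring only   */
        my_xsk_clean_tx(qp);                    /* AF_XDP Tx completions */

        if (done < budget && napi_complete_done(napi, done))
                my_xsk_irq_enable(qp);  /* skipped in strict busy-poll mode */

        return done;
}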
Any ideas how strict busy-poll would look, API/implementation-wise? An
option only for AF_XDP sockets? Would this make sense for regular
sockets as well? If so, maybe extend the existing NAPI busy-poll with a
"strict" mode?
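One made-up way a first experiment could look: a plain setsockopt on
the xsk fd. SO_STRICT_BUSY_POLL and its value below are invented just
to make the discussion concrete:

#include <stdio.h>
#include <sys/socket.h>

#define SO_STRICT_BUSY_POLL 9999        /* hypothetical, does not exist */

/* Ask the kernel to keep the Rx interrupt off and only run the NAPI
 * loop from this socket's syscalls (sendto/recvfrom/poll). The kernel
 * would be free to re-arm the interrupt if we stop polling.
 */
static int enable_strict_busy_poll(int xsk_fd)
{
        int one = 1;

        if (setsockopt(xsk_fd, SOL_SOCKET, SO_STRICT_BUSY_POLL,
                       &one, sizeof(one)) < 0) {
                perror("setsockopt(SO_STRICT_BUSY_POLL)");
                return -1;
        }
        return 0;
}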
I'll start playing around a bit, but again, I think this simple series
should go in just to make AF_XDP on a single core usable *today*.
Thanks!
Björn