Message-ID: <20220408111756.1339cb68@kernel.org>
Date: Fri, 8 Apr 2022 11:17:56 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Maxim Mikityanskiy <maximmi@...dia.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Cc: bpf@...r.kernel.org, ast@...nel.org, daniel@...earbox.net,
magnus.karlsson@...el.com, bjorn@...nel.org,
netdev@...r.kernel.org, brouer@...hat.com,
alexandr.lobakin@...el.com, Tariq Toukan <tariqt@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [PATCH bpf-next 00/10] xsk: stop softirq processing on full XSK Rx queue
On Fri, 8 Apr 2022 15:48:44 +0300 Maxim Mikityanskiy wrote:
> >> 4. A slow or malicious AF_XDP application may easily cause an overflow of
> >> the hardware receive ring. Your feature introduces a mechanism to pause the
> >> driver while the congestion is on the application side, but no symmetric
> >> mechanism to pause the application when the driver is close to an overflow.
> >> I don't know the behavior of Intel NICs on overflow, but in our NICs it's
> >> considered a critical error, that is followed by a recovery procedure, so
> >> it's not something that should happen under normal workloads.
> >
> > I'm not sure I follow on this one. Feature is about overflowing the XSK
> > receive ring, not the HW one, right?
>
> Right. So we have this pipeline of buffers:
>
> NIC--> [HW RX ring] --NAPI--> [XSK RX ring] --app--> consumes packets
>
> Currently, when the NIC puts stuff in HW RX ring, NAPI always runs and
> drains it either to XSK RX ring or to /dev/null if XSK RX ring is full.
> The driver fulfills its responsibility to prevent overflows of HW RX
> ring. If the application doesn't consume quick enough, the frames will
> be leaked, but it's only the application's issue, the driver stays
> consistent.
>
> After the feature, it's possible to pause NAPI from the userspace
> application, effectively disrupting the driver's consistency. I don't
> think an XSK application should have this power.
+1
The cover letter refers to busy poll, but did that test enable prefer
busy poll with the timeout configured correctly? It seems a similar goal
can be achieved with just that.
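
For reference, a minimal sketch of what "prefer busy poll" setup might
look like on an already-created AF_XDP socket. The helper name
xsk_enable_prefer_busy_poll() and its parameters are hypothetical; only
the three socket options are real. The "timeout" is assumed here to mean
the per-device gro_flush_timeout (used together with
napi_defer_hard_irqs) under /sys/class/net/<dev>/, which has to be set
separately and is not covered by this snippet.

```c
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for older libc headers. Values are the asm-generic ones;
 * a few architectures number these options differently. */
#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46
#endif
#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL 69
#endif
#ifndef SO_BUSY_POLL_BUDGET
#define SO_BUSY_POLL_BUDGET 70
#endif

/*
 * Hypothetical helper: ask the kernel to prefer busy polling on socket fd.
 * busy_poll_usecs is the busy-poll duration in microseconds, budget is the
 * per-poll packet budget. Returns 0 on success or -errno of the first
 * setsockopt() that fails (enabling these typically needs CAP_NET_ADMIN).
 */
static int xsk_enable_prefer_busy_poll(int fd, int busy_poll_usecs, int budget)
{
	int one = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
		       &one, sizeof(one)) < 0)
		return -errno;

	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
		       &busy_poll_usecs, sizeof(busy_poll_usecs)) < 0)
		return -errno;

	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
		       &budget, sizeof(budget)) < 0)
		return -errno;

	return 0;
}
```

With this in place and the sysfs knobs set to non-zero values, the
application drives NAPI from its own syscalls, so softirq processing
should mostly stay out of the way while the application keeps polling.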