[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9523677f696a6376c79d32cbec7d6e7ceb1b0500.camel@gmail.com>
Date: Wed, 31 May 2023 08:37:56 -0700
From: Alexander H Duyck <alexander.duyck@...il.com>
To: Alexander Lobakin <aleksander.lobakin@...el.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
Cc: Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Magnus Karlsson <magnus.karlsson@...el.com>,
Michal Kubiak <michal.kubiak@...el.com>,
Larysa Zaremba <larysa.zaremba@...el.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Christoph Hellwig <hch@....de>,
Paul Menzel <pmenzel@...gen.mpg.de>, netdev@...r.kernel.org,
intel-wired-lan@...ts.osuosl.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v3 03/12] iavf: optimize Rx buffer allocation a
bunch
On Tue, 2023-05-30 at 17:00 +0200, Alexander Lobakin wrote:
> The Rx hotpath code of IAVF is not well-optimized TBH. Before doing any
> further buffer model changes, shake it up a bit. Notably:
>
> 1. Cache more variables on the stack.
> DMA device, Rx page size, NTC -- these are the most common things
> used all throughout the hotpath, often in loops on each iteration.
> Instead of fetching (or even calculating, as with the page size) them
> from the ring all the time, cache them on the stack at the beginning
> of the NAPI polling callback. NTC will be written back at the end,
> the rest are used read-only, so no sync needed.
> 2. Don't move the recycled buffers around the ring.
> The idea of passing the page of the right-now-recycled-buffer to a
> different buffer, in this case, the first one that needs to be
> allocated, moreover, on each new frame, is fundamentally wrong. It
> involves a few o' fetches, branches and then writes (and one Rx
> buffer struct is at least 32 bytes) where they're completely unneeded,
> but gives no good -- the result is the same as if we'd recycle it
> inplace, at the same position where it was used. So drop this and let
> the main refilling function take care of all the buffers, which were
> processed and now need to be recycled/refilled.
> 3. Don't allocate with %GPF_ATOMIC on ifup.
> This involved introducing the @gfp parameter to a couple functions.
> Doesn't change anything for Rx -> softirq.
> 4. 1 budget unit == 1 descriptor, not skb.
> There could be underflow when receiving a lot of fragmented frames.
> If each of them would consist of 2 frags, it means that we'd process
> 64 descriptors at the point where we pass the 32th skb to the stack.
> But the driver would count that only as a half, which could make NAPI
> re-enable interrupts prematurely and create unnecessary CPU load.
> 5. Shortcut !size case.
> It's super rare, but possible -- for example, if the last buffer of
> the fragmented frame contained only FCS, which was then stripped by
> the HW. Instead of checking for size several times when processing,
> quickly reuse the buffer and jump to the skb fields part.
> 6. Refill the ring after finishing the polling loop.
> Previously, the loop wasn't starting a new iteration after the 64th
> desc, meaning that we were always leaving 16 buffers non-refilled
> until the next NAPI poll. It's better to refill them while they're
> still hot, so do that right after exiting the loop as well.
> For a full cycle of 64 descs, there will be 4 refills of 16 descs
> from now on.
>
> Function: add/remove: 4/2 grow/shrink: 0/5 up/down: 473/-647 (-174)
>
> + up to 2% performance.
>
> Signed-off-by: Alexander Lobakin <aleksander.lobakin@...el.com>
What is the workload that is showing the performance improvement?
<...>
> @@ -1350,14 +1297,6 @@ static bool iavf_is_non_eop(struct iavf_ring *rx_ring,
> union iavf_rx_desc *rx_desc,
> struct sk_buff *skb)
I am pretty sure the skb pointer here is an unused variable. We needed
it for ixgbe to support RSC. I don't think you have any code that uses
it in this function and I know we removed the variable for i40e, see
commit d06e2f05b4f18 ("i40e: adjust i40e_is_non_eop").
Powered by blists - more mailing lists