Message-ID: <97aad935-d5fb-713e-fd0f-d84bbd733a8f@solarflare.com>
Date: Fri, 13 Apr 2018 13:36:28 +0100
From: Edward Cree <ecree@...arflare.com>
To: David Miller <davem@...emloft.net>
CC: <linux-net-drivers@...arflare.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net 2/2] sfc: limit ARFS workitems in flight per channel
It turns out this may all be moot anyway: I figured out why I was seeing
ARFS storms and it wasn't the configuration issue I originally blamed.
My current ndo_rx_flow_steer() implementation, efx_filter_rfs(), returns
0 for success, but the caller expects a filter ID to be returned (which
we can't give it because we don't know what the filter ID will be until
we start mucking around in the software state that's now protected by a
sleepable lock).
As a result, when we call rps_may_expire_flow(), and pass it the _actual_
filter ID, this doesn't match the one set_rps_cpu() recorded, so the
function returns true and we immediately expire the filter. Then the
next packet to come along isn't steered, so ARFS asks us to insert a
steering filter again.
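
To make the mismatch concrete, here is a toy model of the bookkeeping, not the kernel code itself. The toy_* names are made up for illustration. The expiry check is reduced to the ID comparison alone, on the assumption that the flow is otherwise still active (the real rps_may_expire_flow() looks at more than the ID).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for one entry in the core's flow table (hypothetical). */
struct toy_rflow {
    uint16_t filter; /* ID the core recorded from the driver's return value */
};

/* Models the core storing whatever the driver's steer callback returned. */
static void toy_record_steer(struct toy_rflow *rflow, int driver_rc)
{
    if (driver_rc >= 0) /* negative values are treated as errors */
        rflow->filter = (uint16_t)driver_rc;
}

/*
 * Models only the ID comparison of the expiry check. Assumption: the flow
 * is still active, so a matching ID means "do not expire".
 */
static bool toy_may_expire(const struct toy_rflow *rflow, uint16_t filter_id)
{
    return rflow->filter != filter_id;
}
```

If the driver returns 0 and the core records it, asking about the real ID (say 7) gives "may expire", while asking about 0 does not.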
As a quick fix I've simply tried making the rps_may_expire_flow() calls
also pass a filter ID of 0, which prevents the ARFS storms. This is
safe; it may cause us to delay expiring a filter when flow_ids collide,
but that can happen anyway with other drivers' implementations (e.g.
mlx4 and mlx5 can potentially reuse filter IDs) so I presume it is OK.
I'll post a v2 with that fix in place of this Patch #2 shortly, then try
to follow up with a counter-generated ID (similar to what mlx have).
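
As a rough idea of what a counter-generated ID could look like (a sketch only; the struct, the function name and the choice of 0xffff as a reserved "no filter" value are assumptions for illustration, not taken from any posted patch):

```c
#include <stdint.h>

/* Assumption: 0xffff is reserved to mean "no filter", so IDs stay below it. */
#define TOY_NO_FILTER 0xffffu

/* Hypothetical per-device state; not a real sfc structure. */
struct toy_arfs_state {
    uint32_t last_filter_id; /* monotonically increasing counter */
};

/*
 * Hand out an ID immediately, before any deferred work touches the filter
 * table. The same value can be returned from the steer callback and later
 * passed unchanged to the expiry check. IDs repeat once the counter wraps
 * past TOY_NO_FILTER, so collisions remain possible.
 */
static uint16_t toy_next_filter_id(struct toy_arfs_state *st)
{
    return (uint16_t)(st->last_filter_id++ % TOY_NO_FILTER);
}
```

The point is that the ID no longer depends on where the filter ends up in the table, so it is known at the time the callback has to return it.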
-Ed