Message-ID: <cc08834c-ccb3-263a-2967-f72a9d72535a@solarflare.com>
Date: Mon, 25 Nov 2019 10:31:12 +0000
From: Edward Cree <ecree@...arflare.com>
To: Nicholas Johnson <nicholas.johnson-opensource@...look.com.au>,
"Alexander Lobakin" <alobakin@...nk.ru>
CC: David Miller <davem@...emloft.net>,
"jiri@...lanox.com" <jiri@...lanox.com>,
"edumazet@...gle.com" <edumazet@...gle.com>,
"idosch@...lanox.com" <idosch@...lanox.com>,
"pabeni@...hat.com" <pabeni@...hat.com>,
"petrm@...lanox.com" <petrm@...lanox.com>,
"sd@...asysnail.net" <sd@...asysnail.net>,
"f.fainelli@...il.com" <f.fainelli@...il.com>,
"jaswinder.singh@...aro.org" <jaswinder.singh@...aro.org>,
"ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"johannes.berg@...el.com" <johannes.berg@...el.com>,
"emmanuel.grumbach@...el.com" <emmanuel.grumbach@...el.com>,
"luciano.coelho@...el.com" <luciano.coelho@...el.com>,
"linuxwifi@...el.com" <linuxwifi@...el.com>,
"kvalo@...eaurora.org" <kvalo@...eaurora.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"linux-wireless@...r.kernel.org" <linux-wireless@...r.kernel.org>
Subject: Re: [PATCH v2 net-next] net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()
On 25/11/2019 09:09, Nicholas Johnson wrote:
> The default value of /proc/sys/net/core/gro_normal_batch was 8.
> Setting it to 1 allowed it to connect to Wi-Fi network.
>
> Setting it back to 8 did not kill the connection.
>
> But when I disconnected and tried to reconnect, it did not re-connect.
>
> Hence, it appears that the problem only affects the initial handshake
> when associating with a network, and not normal packet flow.
That sounds like the GRO batch isn't getting flushed at the end of the
NAPI poll; maybe the driver isn't calling napi_complete_done() at the
appropriate time?
Indeed, from digging through the layers of iwlwifi I eventually get to
iwl_pcie_rx_handle() which doesn't really have a NAPI poll (the
napi->poll function is iwl_pcie_dummy_napi_poll() { WARN_ON(1);
return 0; }) and instead calls napi_gro_flush() at the end of its RX
handling. Unfortunately, napi_gro_flush() is no longer enough,
because it doesn't call gro_normal_list() so the packets on the
GRO_NORMAL list just sit there indefinitely.
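To illustrate the failure mode described above, here is a minimal user-space sketch. It is a toy model, not kernel code: every toy_* name is hypothetical, and the behaviour is simplified from the description in this mail (packets are batched until gro_normal_batch is reached, and napi_gro_flush() does not touch the GRO_NORMAL list).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-NAPI state relevant here (hypothetical). */
struct toy_napi {
	int rx_count;   /* packets waiting on the modelled GRO_NORMAL list */
	int batch;      /* models /proc/sys/net/core/gro_normal_batch */
	int delivered;  /* packets handed up to the stack so far */
};

/* Models gro_normal_list(): hand every queued packet to the stack. */
void toy_gro_normal_list(struct toy_napi *n)
{
	n->delivered += n->rx_count;
	n->rx_count = 0;
}

/* Models queueing one GRO_NORMAL packet; delivers only on a full batch. */
void toy_gro_normal_one(struct toy_napi *n)
{
	n->rx_count++;
	if (n->rx_count >= n->batch)
		toy_gro_normal_list(n);
}

/* Models napi_gro_flush() as described: the GRO_NORMAL list is untouched. */
void toy_gro_flush(struct toy_napi *n)
{
	(void)n;
}

/* Models napi_complete_done(): drains the list as well as flushing GRO. */
void toy_complete_done(struct toy_napi *n)
{
	toy_gro_normal_list(n);
	toy_gro_flush(n);
}
```

In this model, with batch = 8, three calls to toy_gro_normal_one() followed by toy_gro_flush() leave delivered at 0 and rx_count at 3, which corresponds to a short handshake exchange that never reaches the stack. With batch = 1 every packet is delivered immediately, consistent with the workaround Nicholas reported. Calling toy_complete_done() (or toy_gro_normal_list() next to the flush) drains the stragglers.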
It was seeing drivers calling napi_gro_flush() directly that had me
worried in the first place about whether listifying napi_gro_receive()
was safe and where the gro_normal_list() should go.
I wondered if other drivers that show up in [1] needed fixing with a
gro_normal_list() next to their napi_gro_flush() call. From a cursory
check:
brocade/bna: has a real poller, calls napi_complete_done() so is OK.
cortina/gemini: calls napi_complete_done() straight after
napi_gro_flush(), so is OK.
hisilicon/hns3: calls napi_complete(), so is _probably_ OK.
But it's far from clear to me why *any* of those drivers are calling
napi_gro_flush() themselves...
-Ed
[1]: https://elixir.bootlin.com/linux/latest/ident/napi_gro_flush