Message-ID: <d535d5142e42b8c550f0220200e3779d@dlink.ru>
Date: Mon, 25 Nov 2019 15:02:24 +0300
From: Alexander Lobakin <alobakin@...nk.ru>
To: Paolo Abeni <pabeni@...hat.com>
Cc: Johannes Berg <johannes@...solutions.net>,
Edward Cree <ecree@...arflare.com>,
Nicholas Johnson <nicholas.johnson-opensource@...look.com.au>,
David Miller <davem@...emloft.net>, jiri@...lanox.com,
edumazet@...gle.com, idosch@...lanox.com, petrm@...lanox.com,
sd@...asysnail.net, f.fainelli@...il.com,
jaswinder.singh@...aro.org, ilias.apalodimas@...aro.org,
linux-kernel@...r.kernel.org, emmanuel.grumbach@...el.com,
luciano.coelho@...el.com, linuxwifi@...el.com,
kvalo@...eaurora.org, netdev@...r.kernel.org,
linux-wireless@...r.kernel.org
Subject: Re: [PATCH v2 net-next] net: core: use listified Rx for GRO_NORMAL in
napi_gro_receive()
Paolo Abeni wrote 25.11.2019 14:42:
> On Mon, 2019-11-25 at 12:05 +0100, Johannes Berg wrote:
>> On Mon, 2019-11-25 at 13:58 +0300, Alexander Lobakin wrote:
>> > Edward Cree wrote 25.11.2019 13:31:
>> > > On 25/11/2019 09:09, Nicholas Johnson wrote:
>> > > > The default value of /proc/sys/net/core/gro_normal_batch was 8.
>> > > > Setting it to 1 allowed it to connect to Wi-Fi network.
>> > > >
>> > > > Setting it back to 8 did not kill the connection.
>> > > >
>> > > > But when I disconnected and tried to reconnect, it did not re-connect.
>> > > >
>> > > > Hence, it appears that the problem only affects the initial handshake
>> > > > when associating with a network, and not normal packet flow.
>> > > That sounds like the GRO batch isn't getting flushed at the end of the
>> > > NAPI — maybe the driver isn't calling napi_complete_done() at the
>> > > appropriate time?
>> >
>> > Yes, this was the first cause I thought of, but I haven't looked at
>> > iwlwifi yet. I already knew this driver has some tricky parts, but
>> > this 'fake NAPI' solution seems rather strange to me.
>>
>> Truth be told, we kinda just fudged it until we got GRO, since that's
>> what we really want on wifi (to reduce the costly TCP ACKs if
>> possible).
>>
>> Maybe we should call napi_complete_done() instead? But as Edward noted
>> (below), we don't actually really do NAPI polling, we just fake it for
>> each interrupt since we will often get a lot of frames in one interrupt
>> if there's high throughput (A-MPDUs are basically coming in all at the
>> same time). I've never really looked too much at what exactly happens
>> here, beyond seeing the difference from GRO.
>>
>>
>> > > Indeed, from digging through the layers of iwlwifi I eventually get to
>> > > iwl_pcie_rx_handle() which doesn't really have a NAPI poll (the
>> > > napi->poll function is iwl_pcie_dummy_napi_poll() { WARN_ON(1);
>> > > return 0; }) and instead calls napi_gro_flush() at the end of its RX
>> > > handling. Unfortunately, napi_gro_flush() is no longer enough,
>> > > because it doesn't call gro_normal_list() so the packets on the
>> > > GRO_NORMAL list just sit there indefinitely.
>> > >
>> > > It was seeing drivers calling napi_gro_flush() directly that had me
>> > > worried in the first place about whether listifying napi_gro_receive()
>> > > was safe and where the gro_normal_list() should go.
>> > > I wondered if other drivers that show up in [1] needed fixing with a
>> > > gro_normal_list() next to their napi_gro_flush() call. From a cursory
>> > > check:
>> > > brocade/bna: has a real poller, calls napi_complete_done() so is OK.
>> > > cortina/gemini: calls napi_complete_done() straight after
>> > > napi_gro_flush(), so is OK.
>> > > hisilicon/hns3: calls napi_complete(), so is _probably_ OK.
>> > > But it's far from clear to me why *any* of those drivers are calling
>> > > napi_gro_flush() themselves...
>> >
>> > Agree. I mean, we _can_ handle this particular problem from the
>> > networking core side, but from my point of view only rethinking the
>> > driver's logic is the correct way to solve this and other issues that
>> > may potentially appear in the future.
>>
>> Do tell what you think it should be doing :)
>>
>> One additional wrinkle is that we have firmware notifications, command
>> completions and actual RX interleaved, so I think we do want to have
>> interrupts for the notifications and command completions?
>
> I think it would be nice to move the iwlwifi driver to full/plain NAPI
> mode. The interrupt handler could keep processing extra work as it does
> now and queue real packets on some internal queue, and then schedule the
> relevant NAPI, which in turn could process that queue in the NAPI poll
> method. Likely I missed tons of details and/or oversimplified it...
Yep, full NAPI is the best option here, but I may be missing a lot of
details too.
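Just to illustrate the direction, a conventional poll method for rxq->napi
could look roughly like this. It's only a sketch: iwl_pcie_rx_one() and
iwl_pcie_rx_irq_enable() are made-up placeholders for "pop one completed
frame off the ring" and "re-enable RX interrupts for this queue", and the
real split of iwl_pcie_rx_handle() would surely be more involved:

static int iwl_pcie_napi_poll(struct napi_struct *napi, int budget)
{
        struct iwl_rxq *rxq = container_of(napi, struct iwl_rxq, napi);
        int done = 0;

        while (done < budget) {
                /* Hypothetical helper: take one completed frame, or NULL */
                struct sk_buff *skb = iwl_pcie_rx_one(rxq);

                if (!skb)
                        break;

                napi_gro_receive(napi, skb);
                done++;
        }

        /* Re-arm interrupts only once the queue is fully drained */
        if (done < budget && napi_complete_done(napi, done))
                iwl_pcie_rx_irq_enable(rxq);

        return done;
}

That way the final flush (napi_gro_flush() + gro_normal_list()) would be
done by napi_complete_done() itself, like in any other NAPI driver.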
> For -net, I *think* something as dumb and hacky as the following could
> possibly work:
> ----
> diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> index 4bba6b8a863c..df82fad96cbb 100644
> --- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> +++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
> @@ -1527,7 +1527,7 @@ static void iwl_pcie_rx_handle(struct iwl_trans *trans, int queue)
>  		iwl_pcie_rxq_alloc_rbs(trans, GFP_ATOMIC, rxq);
> 
>  	if (rxq->napi.poll)
> -		napi_gro_flush(&rxq->napi, false);
> +		napi_complete_done(&rxq->napi, 0);
> 
>  	iwl_pcie_rxq_restock(trans, rxq);
>  }
> ---
napi_complete_done(napi, 0) has an equivalent static inline,
napi_complete(napi). I'm not sure it will work without any issues,
as iwlwifi doesn't _really_ put NAPI into the scheduled state.
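(For reference, napi_complete() in include/linux/netdevice.h is just:

static inline bool napi_complete(struct napi_struct *n)
{
        return napi_complete_done(n, 0);
}

so the two calls are interchangeable here.)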
I'm not very familiar with iwlwifi, but as a workaround, next to the
manual napi_gro_flush() you can also manually flush napi->rx_list to
prevent packets from stalling there:
diff -Naur a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c	2019-11-25 14:55:03.610355230 +0300
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c	2019-11-25 14:57:29.399556868 +0300
@@ -1526,8 +1526,16 @@
 	if (unlikely(emergency && count))
 		iwl_pcie_rxq_alloc_rbs(trans, GFP_ATOMIC, rxq);
 
-	if (rxq->napi.poll)
+	if (rxq->napi.poll) {
+		if (rxq->napi.rx_count) {
+			netif_receive_skb_list(&rxq->napi.rx_list);
+
+			INIT_LIST_HEAD(&rxq->napi.rx_list);
+			rxq->napi.rx_count = 0;
+		}
+
 		napi_gro_flush(&rxq->napi, false);
+	}
 
 	iwl_pcie_rxq_restock(trans, rxq);
 }
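The open-coded block above is basically what gro_normal_list() does in the
networking core (from memory, so please double-check against your tree):

static void gro_normal_list(struct napi_struct *napi)
{
        if (!napi->rx_count)
                return;
        netif_receive_skb_list(&napi->rx_list);
        INIT_LIST_HEAD(&napi->rx_list);
        napi->rx_count = 0;
}

so it should only cover the missing flush without touching anything else
in the driver's RX path.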
> Cheers,
>
> Paolo
Regards,
ᚷ ᛖ ᚢ ᚦ ᚠ ᚱ