Message-ID: <d79ae05e-e75a-de2f-f2e3-bc73637e1501@nbd.name>
Date: Wed, 22 Jul 2020 14:27:42 +0200
From: Felix Fietkau <nbd@....name>
To: Rajkumar Manoharan <rmanohar@...eaurora.org>,
Rakesh Pillai <pillair@...eaurora.org>
Cc: ath10k@...ts.infradead.org, linux-wireless@...r.kernel.org,
linux-kernel@...r.kernel.org, kvalo@...eaurora.org,
johannes@...solutions.net, davem@...emloft.net, kuba@...nel.org,
netdev@...r.kernel.org, dianders@...omium.org, evgreen@...omium.org
Subject: Re: [RFC 2/7] ath10k: Add support to process rx packet in thread
On 2020-07-21 23:53, Rajkumar Manoharan wrote:
> On 2020-07-21 10:14, Rakesh Pillai wrote:
>> NAPI instance gets scheduled on a CPU core on which
>> the IRQ was triggered. The processing of rx packets
>> can be CPU intensive and since NAPI cannot be moved
>> to a different CPU core, to get better performance,
>> it's better to move the gist of rx packet processing
>> into a high priority thread.
>>
>> Add the init/deinit part for a thread to process the
>> receive packets.
>>
> IMHO this defeats the whole purpose of NAPI. Originally in ath10k,
> irq processing happened in tasklet (high priority) context, which in
> turn pushed more data to the net core even though the net core was
> unable to process the driver data, as both happen in different contexts
> (a fast producer - slow consumer issue). Why can't the CPU governor
> schedule the interrupts on a less loaded CPU core?
> Otherwise you can play with different RPS and affinity settings to meet
> your requirement.
>
> IMO introducing high priority tasklets/threads is not a viable solution.

I'm beginning to think that the main problem with NAPI here is that the
work done by poll functions on 802.11 drivers is significantly more CPU
intensive compared to ethernet drivers, possibly more than what NAPI was
designed for.

I'm considering testing a different approach (with mt76 initially):
- Add a mac80211 rx function that puts processed skbs into a list
  instead of handing them to the network stack directly.
- Move all rx processing to a high priority thread, keep a driver
  internal queue for fully processed packets.
- Schedule NAPI poll on completion.
- NAPI poll function pulls from the internal queue and passes to the
  network stack.

With this approach, the network stack retains some control over the
processing rate of rx packets, while the scheduler can move the CPU
intensive processing around to where it fits best.
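
To make the proposed flow concrete, here is a rough userspace model in Python. It is only a sketch of the idea, not kernel code and not how mt76 or mac80211 would implement it: the names RxPipeline, napi_poll and poll_requests are hypothetical, the uppercase conversion merely stands in for the CPU-heavy 802.11 rx work, and a plain list stands in for the network stack. The point it illustrates is that the worker thread can run wherever the scheduler puts it, while the poll function only hands over as many packets as its budget allows.

```python
import queue
import threading


class RxPipeline:
    """Toy userspace model of the proposed rx path (hypothetical names)."""

    def __init__(self):
        self.raw = queue.Queue()    # frames handed over by the "IRQ" side
        self.done = queue.Queue()   # driver-internal queue of processed packets
        self.delivered = []         # stands in for the network stack
        self.poll_requests = 0      # how often the worker asked for a poll
        self.worker = threading.Thread(target=self._rx_thread, daemon=True)
        self.worker.start()

    def _process(self, frame):
        # Placeholder for the CPU-intensive rx processing.
        return frame.upper()

    def _rx_thread(self):
        # The thread the scheduler is free to move between CPU cores.
        while True:
            frame = self.raw.get()
            if frame is None:       # sentinel: stop the worker
                break
            self.done.put(self._process(frame))
            self.poll_requests += 1  # models "schedule NAPI poll on completion"

    def napi_poll(self, budget):
        """Pass at most `budget` processed packets on; return the count."""
        n = 0
        while n < budget:
            try:
                pkt = self.done.get_nowait()
            except queue.Empty:
                break
            self.delivered.append(pkt)
            n += 1
        return n

    def stop(self):
        self.raw.put(None)
        self.worker.join()


pipeline = RxPipeline()
for frame in ["a", "b", "c", "d", "e"]:
    pipeline.raw.put(frame)
pipeline.stop()                        # wait until the worker drained the input

first = pipeline.napi_poll(budget=3)   # the stack accepts only 3 packets now
second = pipeline.napi_poll(budget=3)  # the remaining 2 follow in the next round
print(first, second, pipeline.delivered)
```

In this model the budget passed to napi_poll is what lets the consumer side set the pace: packets that do not fit into one poll simply stay in the internal queue until the next one.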
What do you think?

- Felix