Message-ID: <Y7viEa4BC3yJRXIS@hirez.programming.kicks-ass.net>
Date: Mon, 9 Jan 2023 10:44:49 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: tglx@...utronix.de, jstultz@...gle.com, edumazet@...gle.com,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] softirq: don't yield if only expedited handlers are pending

On Thu, Dec 22, 2022 at 02:12:44PM -0800, Jakub Kicinski wrote:
> In networking we try to keep Tx packet queues small, so we limit
> how many bytes a socket may packetize and queue up. Tx completions
> (from NAPI) notify the sockets when packets have left the system
> (NIC Tx completion) and the socket schedules a tasklet to queue
> the next batch of frames.
>
> This leads to a situation where we go thru the softirq loop twice.
> First round we have pending = NET (from the NIC IRQ/NAPI), and
> the second iteration has pending = TASKLET (the socket tasklet).

So to me that sounds like you want to fix the network code to not do
this then. Why can't the NAPI thing directly queue the next batch; why
do you have to do a softirq roundtrip like this?

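For illustration only, the double pass described in the quoted text can be
modeled in userspace C. This is a toy sketch, not kernel code: the names
NET_VEC, TASKLET_VEC, raise_vec() and run_softirqs() are hypothetical
stand-ins, and the only behavior assumed is that a handler which raises a
new vector while running has that vector picked up on the next pass rather
than the current one.

```c
#include <assert.h>

/* Toy model: two softirq vectors, named after the quoted description. */
enum { NET_VEC = 0, TASKLET_VEC = 1, NR_VECS = 2 };

static unsigned int pending;            /* bitmask of raised vectors */

static void raise_vec(int nr)
{
    assert(nr >= 0 && nr < NR_VECS);
    pending |= 1u << nr;
}

/* Stand-in for NAPI: a Tx completion makes the socket schedule its tasklet. */
static void net_action(void)
{
    raise_vec(TASKLET_VEC);
}

/* Stand-in for the socket tasklet: would queue the next batch of frames. */
static void tasklet_action(void)
{
}

/* Run until nothing is pending; return how many passes that took. */
static int run_softirqs(void)
{
    int passes = 0;

    while (pending) {
        unsigned int snapshot = pending;    /* work for this pass only */

        pending = 0;
        passes++;
        if (snapshot & (1u << NET_VEC))
            net_action();                   /* raises TASKLET_VEC for the next pass */
        if (snapshot & (1u << TASKLET_VEC))
            tasklet_action();
    }
    return passes;
}
```

In this model, calling raise_vec(NET_VEC) followed by run_softirqs() takes two
passes, while raising only TASKLET_VEC takes one. Queueing the next batch
directly from net_action() would remove the second pass, which is the
alternative asked about above.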
> On two web workloads I looked at this condition accounts for 10%
> and 23% of all ksoftirqd wake ups respectively. We run NAPI
> which wakes some process up, we hit need_resched() and wake up
> ksoftirqd just to run the TSQ (TCP small queues) tasklet.
>
> Tweak the need_resched() condition to be ignored if all pending
> softIRQs are "non-deferred". The tasklet would run relatively
> soon, anyway, but once ksoftirqd is woken we're risking stalls.
>
> I did not see any negative impact on the latency in an RR test
> on a loaded machine with this change applied.

Ignoring need_resched() will get you in trouble with RT people real
fast.
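
As a rough sketch of the decision under discussion, the following userspace C
models when the softirq loop would do another pass in place versus hand the
work to ksoftirqd. Everything here is an assumption made for illustration:
restart_inline(), EXPEDITED_MASK and the `relaxed` switch are hypothetical
names, and treating only the tasklet vector as "non-deferred" is a guess based
on the quoted description, not the patch's actual definition.

```c
#include <assert.h>
#include <stdbool.h>

enum { NET_VEC = 0, TASKLET_VEC = 1, NR_VECS = 2 };

/* Hypothetical: the set of vectors treated as expedited ("non-deferred"). */
#define EXPEDITED_MASK (1u << TASKLET_VEC)

/*
 * Return true to run another pass in place, false to defer to ksoftirqd.
 * `relaxed` selects the proposed behavior; false models the existing check.
 */
static bool restart_inline(unsigned int pending, bool resched_needed,
                           bool budget_left, bool relaxed)
{
    assert(pending != 0);

    if (!budget_left)
        return false;       /* out of time or restarts: always defer */
    if (!resched_needed)
        return true;        /* nobody is waiting for the CPU */

    /*
     * A reschedule is requested. The proposed tweak ignores it only when
     * every pending vector is in the expedited set.
     */
    return relaxed && (pending & ~EXPEDITED_MASK) == 0;
}
```

With a reschedule requested and only TASKLET_VEC pending, the existing check
defers to ksoftirqd and the relaxed one runs another pass. That extra pass is
time during which the reschedule request goes unserviced, which is where the
RT objection above applies.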