Message-ID: <Zc-MRN2tUmsCQLZO@slm.duckdns.org>
Date: Fri, 16 Feb 2024 06:24:36 -1000
From: Tejun Heo <tj@...nel.org>
To: Eric Dumazet <edumazet@...gle.com>
Cc: torvalds@...ux-foundation.org, mpatocka@...hat.com,
linux-kernel@...r.kernel.org, dm-devel@...ts.linux.dev,
msnitzer@...hat.com, ignat@...udflare.com, damien.lemoal@....com,
bob.liu@...cle.com, houtao1@...wei.com, peterz@...radead.org,
mingo@...nel.org, netdev@...r.kernel.org, allen.lkml@...il.com,
kernel-team@...a.com, "David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, David Wei <davidhwei@...a.com>
Subject: Re: [PATCH 6/8] net: tcp: tsq: Convert from tasklet to BH workqueue
Hello, Eric. How have you been?
On Fri, Feb 16, 2024 at 09:23:00AM +0100, Eric Dumazet wrote:
...
> TSQ matters for high BDP, and is very time sensitive.
>
> Things like slow TX completions (firing from napi poll, BH context)
> can hurt TSQ.
>
> If we add on top of these slow TX completions, an additional work
> queue overhead, I really am not sure...
Just to be sure, the workqueue here is executing in the same softirq context
as tasklets. This isn't the usual workqueue which has to go through the
scheduler. The only difference would be that the workqueue does a bit more work
(e.g. to manage the currently-executing hashtable) than a tasklet. It's
unlikely to show a noticeable latency penalty in any practical case, although
the extra overhead would likely be visible in targeted microbenchmarks where
all that happens is scheduling and running noop work items.
> I would recommend tests with pfifo_fast qdisc (not FQ which has a
> special override for TSQ limits)
David, do you think this is something we can do?
> Eventually we could add in TCP a measure of the time lost because of
> TSQ, regardless of the kick implementation (tasklet or workqueue).
> Measuring the delay between when a tcp socket got tcp_wfree approval
> to deliver more packets, and time it finally delivered these packets
> could be implemented with a bpftrace program.
I don't have enough context here but it sounds like you are worried about
adding latency in that path. This conversion is unlikely to make a
noticeable difference there. The interface and semantics are workqueue but
the work items are being executed exactly the same way from the same
softirqs as tasklets. Would testing with pfifo_fast be sufficient to dispel
your concern?
Thanks.
--
tejun