Message-ID: <f0c51f71-dae8-447f-bbdb-73d2ce607893@davidwei.uk>
Date: Fri, 16 Feb 2024 08:52:25 -0700
From: David Wei <dw@...idwei.uk>
To: Eric Dumazet <edumazet@...gle.com>, Tejun Heo <tj@...nel.org>
Cc: torvalds@...ux-foundation.org, mpatocka@...hat.com,
linux-kernel@...r.kernel.org, dm-devel@...ts.linux.dev, msnitzer@...hat.com,
ignat@...udflare.com, damien.lemoal@....com, bob.liu@...cle.com,
houtao1@...wei.com, peterz@...radead.org, mingo@...nel.org,
netdev@...r.kernel.org, allen.lkml@...il.com, kernel-team@...a.com,
"David S. Miller" <davem@...emloft.net>, David Ahern <dsahern@...nel.org>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
David Wei <davidhwei@...a.com>
Subject: Re: [PATCH 6/8] net: tcp: tsq: Convert from tasklet to BH workqueue
On 2024-02-16 01:23, Eric Dumazet wrote:
>
> On Fri, Feb 16, 2024 at 6:31 AM Tejun Heo <tj@...nel.org> wrote:
>>
>> Hello,
>>
>> On Mon, Jan 29, 2024 at 11:11:53PM -1000, Tejun Heo wrote:
>>> The only generic interface to execute asynchronously in the BH context is
>>> tasklet; however, it's marked deprecated and has some design flaws. To
>>> replace tasklets, BH workqueue support was recently added. A BH workqueue
>>> behaves similarly to regular workqueues except that the queued work items
>>> are executed in the BH context.
>>>
>>> This patch converts TCP Small Queues implementation from tasklet to BH
>>> workqueue.
>>>
>>> Semantically, this is an equivalent conversion and there shouldn't be any
>>> user-visible behavior changes. While workqueue's queueing and execution
>>> paths are a bit heavier than tasklet's, unless the work item is being queued
>>> every packet, the difference hopefully shouldn't matter.
>>>
>>> My experience with the networking stack is very limited and this patch
>>> definitely needs attention from someone who actually understands networking.
>>
>> On Jakub's recommendation, I asked David Wei to perform production memcache
>> benchmark on the backported conversion patch. There was no discernible
>> difference before and after. Given that this is likely as hot as it gets for
>> the path on a real workload, the conversions shouldn't hopefully be
>> noticeable in terms of performance impact.
>>
>> Jakub, I'd really appreciate if you could ack. David, would it be okay if I
>> add your Tested-by?
Yes, that's fine.
>
> I presume memcache benchmark is using small RPC ?
It is not a benchmark but a prod shadow, but yes, the requests are small.
>
> TSQ matters for high BDP, and is very time sensitive.
>
> Things like slow TX completions (firing from napi poll, BH context)
> can hurt TSQ.
>
> If we add on top of these slow TX completions, an additional work
> queue overhead, I really am not sure...
>
> I would recommend tests with pfifo_fast qdisc (not FQ which has a
> special override for TSQ limits)
>
> Eventually we could add in TCP a measure of the time lost because of
> TSQ, regardless of the kick implementation (tasklet or workqueue).
> Measuring the delay between when a tcp socket got tcp_wfree approval
> to deliver more packets, and time it finally delivered these packets
> could be implemented with a bpftrace program.
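The measurement described above boils down to pairing two timestamps per socket: the moment the socket is allowed to send again, and the moment it actually sends. Below is a minimal sketch of that pairing logic in Python, run offline over a trace. The event names ("approve", "xmit"), the tuple layout, and the function name are all hypothetical; a real trace would come from whatever probes are attached around tcp_wfree and the transmit path, and this says nothing about how those probes would be written.

```python
from collections import defaultdict


def tsq_delays(events):
    """Pair each socket's earliest pending approval with its next transmit.

    events: iterable of (timestamp_ns, sock_id, kind) sorted by timestamp,
    where kind is "approve" or "xmit" (hypothetical event names).
    Returns {sock_id: [delay_ns, ...]}.
    """
    pending = {}                # sock_id -> timestamp of earliest open approval
    delays = defaultdict(list)  # sock_id -> measured delays
    for ts, sock, kind in events:
        if kind == "approve":
            # Keep the earliest approval; later ones before the transmit
            # do not restart the clock.
            pending.setdefault(sock, ts)
        elif kind == "xmit" and sock in pending:
            delays[sock].append(ts - pending.pop(sock))
        # A transmit with no pending approval is ignored.
    return dict(delays)


# Hypothetical trace: two sockets, timestamps in nanoseconds.
trace = [
    (100, "a", "approve"),
    (150, "b", "approve"),
    (180, "a", "approve"),
    (400, "a", "xmit"),
    (420, "b", "xmit"),
    (500, "a", "xmit"),
]
result = tsq_delays(trace)
```

Comparing the distribution of these delays between the tasklet and BH workqueue kernels, under the same qdisc, would show whether the kick implementation adds measurable latency.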