Message-ID: <29655a73-5d4c-4773-a425-e16628b8ba7a@grimberg.me>
Date: Fri, 3 May 2024 10:31:50 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Aurelien Aptel <aaptel@...dia.com>, linux-nvme@...ts.infradead.org,
netdev@...r.kernel.org, hch@....de, kbusch@...nel.org, axboe@...com,
chaitanyak@...dia.com, davem@...emloft.net, kuba@...nel.org
Cc: Boris Pismenny <borisp@...dia.com>, aurelien.aptel@...il.com,
smalin@...dia.com, malin1024@...il.com, ogerlitz@...dia.com,
yorayz@...dia.com, galshalom@...dia.com, mgurtovoy@...dia.com,
edumazet@...gle.com, pabeni@...hat.com, dsahern@...nel.org, ast@...nel.org,
jacob.e.keller@...el.com
Subject: Re: [PATCH v24 01/20] net: Introduce direct data placement tcp
offload
On 5/2/24 10:04, Aurelien Aptel wrote:
> Sagi Grimberg <sagi@...mberg.me> writes:
>> Well, you cannot rely on the fact that the application will be pinned to a
>> specific cpu core. That may be the case by accident, but you must not and
>> cannot assume it.
> Just to be clear, any CPU can read from the socket and benefit from the
> offload but there will be an extra cost if the queue CPU is different
> from the offload CPU. We use cfg->io_cpu as a hint.
Understood. That is usually the case, as io threads are not aligned to
the RSS steering rules (unless aRFS is used).
>
>> Even today, nvme-tcp has an option to run from an unbound wq context,
>> where queue->io_cpu is set to WORK_CPU_UNBOUND. What are you going to
>> do there?
> When the CPU is not bound to a specific core, we will most likely always
> have CPU misalignment and the extra cost that goes with it.
Yes, as done today.
>
> But when it is bound, which is still the default common case, we will
> benefit from the alignment. To not lose that benefit for the default
> most common case, we would like to keep cfg->io_cpu.
Well, this explanation is much more reasonable. An .affinity_hint
argument seems like a proper addition to the interface, and nvme-tcp
can set it to queue->io_cpu.
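Something along these lines is what I have in mind (a rough sketch,
the struct and field names here are illustrative, not the actual v24
interface):

	struct ulp_ddp_config {
		/* existing fields elided */
		int	affinity_hint;	/* preferred cpu for the offload
					 * processing; purely a hint, any
					 * cpu may still read the socket */
	};

	/* nvme-tcp side, when installing the offload on a queue */
	config.affinity_hint = queue->io_cpu;

That keeps the common bound-queue case aligned without the interface
implying that rx is pinned to that cpu.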
>
> Could you clarify what are the advantages of running unbounded queues,
> or to handle RX on a different cpu than the current io_cpu?
See the discussion related to the patch from Li Feng:
https://lore.kernel.org/lkml/20230413062339.2454616-1-fengli@smartx.com/
>
>> nvme-tcp may handle rx side directly from .data_ready() in the future, what
>> will the offload do in that case?
> It is not clear to us what the benefit of handling rx in .data_ready()
> will achieve. From our experiment, ->sk_data_ready() is called either
> from queue->io_cpu, or sk->sk_incoming_cpu. Unless you enable aRFS,
> sk_incoming_cpu will be constant for the whole connection. Can you
> clarify what handling RX from data_ready() would provide?
It saves the context switch from softirq to a kthread, which can
reduce latency substantially for some workloads.
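For reference, roughly what nvme-tcp does today vs. what handling rx
inline would look like (simplified sketch from memory, not the exact
upstream code):

	static void nvme_tcp_data_ready(struct sock *sk)
	{
		struct nvme_tcp_queue *queue = sk->sk_user_data;

		/* today: wake io_work on queue->io_cpu, i.e. a softirq to
		 * kthread context switch before any byte is consumed */
		queue_work_on(queue->io_cpu, nvme_tcp_wq, &queue->io_work);

		/* possible future: consume what already sits in the socket
		 * receive queue right here, in softirq context, and only
		 * defer to io_work when more work remains */
	}

In that model rx runs on whichever cpu the softirq landed on, which is
why I'm asking what the offload would do when that differs from the
hint.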