Message-ID: <03a47920-9165-1d49-1380-fb4c5061df67@grimberg.me>
Date: Wed, 19 Apr 2023 12:32:22 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Li Feng <fengli@...rtx.com>
Cc: Keith Busch <kbusch@...nel.org>, Jens Axboe <axboe@...com>,
Christoph Hellwig <hch@....de>,
"open list:NVM EXPRESS DRIVER" <linux-nvme@...ts.infradead.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] nvme/tcp: Add support to set the tcp worker cpu affinity
>> Hey Li,
>>
>>> The default worker affinity policy uses all online CPUs, i.e. 0 to N-1.
>>> However, when some of those CPUs are busy with other jobs, nvme-tcp
>>> performs poorly.
>>> This patch adds a module parameter to set the cpu affinity for the nvme-tcp
>>> socket worker threads. The parameter is a comma separated list of CPU
>>> numbers. The list is parsed and the resulting cpumask is used to set the
>>> affinity of the socket worker threads. If the list is empty or the
>>> parsing fails, the default affinity is used.
>>
>> I can see how this may benefit a specific set of workloads, but I have a
>> few issues with this.
>>
>> - This is exposing a user interface for something that is really
>> internal to the driver.
>>
>> - This is something that can be misleading and could be tricky to get
>> right. My concern is that this would only benefit a very niche case.
> Our storage products need this feature.
> If the user doesn’t know what this is, they can keep the default, so I think this is
> not unacceptable.
It doesn't work like that. A user interface is not something exposed to
a specific consumer.
>> - If the setting should exist, it should not be global.
> V2 has fixed it.
>>
>> - I prefer not to introduce new modparams.
>>
>> - I'd prefer to find a way to support your use-case without introducing
>> a config knob for it.
>>
> I’m looking forward to it.
If you change queue_work_on to queue_work, ignoring the io_cpu, does it
address your problem?
Not saying that this should be a solution though.
How many queues does your controller support that you happen to use
queue 0?
Also, what happens if you don't pin your process to a specific cpu, does
that change anything?
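
For reference, the parameter semantics in the quoted patch description
(a comma separated list of CPU numbers, falling back to the default
affinity when the list is empty or fails to parse) can be sketched in
userspace C as below. This is only an illustration and not the code
from the patch: parse_cpu_list() and worker_mask() are hypothetical
names, the mask is limited to 64 CPUs for simplicity, and in-kernel
code would more likely build a cpumask with a helper such as
cpulist_parse().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a list such as "0,2,5" into a bitmask.
 * Only CPUs below ncpus (at most 64) are accepted.
 * Returns 0 on success, -1 on empty or malformed input.
 */
static int parse_cpu_list(const char *list, unsigned ncpus, uint64_t *mask)
{
	uint64_t m = 0;
	const char *p = list;

	if (!list || !*list || ncpus == 0 || ncpus > 64)
		return -1;

	while (*p) {
		char *end;
		unsigned long cpu;

		/* Every field must start with a digit: rejects signs,
		 * whitespace and empty fields such as "1,,2". */
		if (*p < '0' || *p > '9')
			return -1;

		errno = 0;
		cpu = strtoul(p, &end, 10);
		if (errno || cpu >= ncpus)
			return -1;

		m |= UINT64_C(1) << cpu;

		if (*end == ',') {
			p = end + 1;
			if (!*p)	/* trailing comma */
				return -1;
		} else if (*end == '\0') {
			p = end;
		} else {
			return -1;	/* unexpected character */
		}
	}

	*mask = m;
	return 0;
}

/*
 * Hypothetical wrapper: return the parsed mask, or the default
 * "all CPUs 0..ncpus-1" mask when the list is empty or invalid.
 */
static uint64_t worker_mask(const char *list, unsigned ncpus)
{
	uint64_t all = ncpus >= 64 ? UINT64_MAX : (UINT64_C(1) << ncpus) - 1;
	uint64_t m;

	if (parse_cpu_list(list, ncpus, &m) != 0)
		return all;
	return m;
}
```

With 8 CPUs, worker_mask("0,2", 8) selects CPUs 0 and 2, while an empty
or malformed list such as "" or "1,x" yields the default mask covering
all 8 CPUs.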