Message-ID: <8aa4d7e4-4685-4fb2-b6a6-c88bcc7a7740@grimberg.me>
Date: Sun, 21 Jul 2024 14:09:53 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Christoph Hellwig <hch@....de>, Ping Gan <jacky_gam_2001@....com>
Cc: hare@...e.de, kch@...dia.com, linux-nvme@...ts.infradead.org,
 linux-kernel@...r.kernel.org, ping.gan@...l.com
Subject: Re: [PATCH v2 0/2] nvmet: support unbound_wq for RDMA and TCP

On 19/07/2024 8:31, Christoph Hellwig wrote:
> On Wed, Jul 17, 2024 at 05:14:49PM +0800, Ping Gan wrote:
>> When running nvmf on an SMP platform, the current nvme target's RDMA
>> and TCP transports use a bound workqueue to handle IO, but when there
>> is other heavy workload on the system (e.g. kubernetes), the
>> competition between the bound kworkers and that workload is fierce.
>> To reduce this resource contention, this patchset enables an unbound
>> workqueue for nvmet-rdma and nvmet-tcp; besides that, it can also
>> yield some performance improvement. This patchset builds on the
>> previous discussion in the session below.
> So why aren't we using unbound workqueues by default?  Who makes the
> policy decision, and how does anyone know which one to choose?
>

The use-case presented is one where the cpu resources are shared
between nvmet and other workloads running on the system. The ask is to
prevent nvmet from running io threads on specific cpu cores, and
vice-versa, to minimize interference.
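
For concreteness, here is a minimal sketch of how a transport like
nvmet-tcp could opt into an unbound workqueue at init time. This is not
the actual patch; the module-parameter name and the wiring are
assumptions for illustration only:

	#include <linux/module.h>
	#include <linux/workqueue.h>

	/* Hypothetical knob; the real patchset may name/wire this differently. */
	static bool use_unbound_wq;
	module_param(use_unbound_wq, bool, 0444);

	static struct workqueue_struct *nvmet_tcp_wq;

	static int nvmet_tcp_alloc_wq(void)
	{
		unsigned int flags = WQ_HIGHPRI;

		if (use_unbound_wq)
			/*
			 * WQ_UNBOUND: workers are not pinned to the cpu that
			 * queued the work. WQ_SYSFS: expose cpumask/nice
			 * attributes under sysfs for the administrator.
			 */
			flags |= WQ_UNBOUND | WQ_SYSFS;

		nvmet_tcp_wq = alloc_workqueue("nvmet_tcp_wq", flags, 0);
		return nvmet_tcp_wq ? 0 : -ENOMEM;
	}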

The policy decision is made by the administrator, who decides which
resources are dedicated to nvmet vs. other workloads (which are
containers in this case).
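
With WQ_SYSFS set as in the sketch above, the administrator can then
steer the unbound workers away from the cpus reserved for other
workloads by writing a mask to
/sys/devices/virtual/workqueue/nvmet_tcp_wq/cpumask, which matches the
kind of partitioning described here.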

Changing to unbound workqueues universally would need proof that it is
better in the general case, outside of this specific use-case. In
particular, that latency is not hurt by having unbound kthreads (which
lose cpu locality with the queueing context) accessing the nvme device,
the rdma qp and/or the tcp socket.
