Message-ID: <f5d57e3b-8168-41af-8e36-c7a21ef3a475@grimberg.me>
Date: Sun, 7 Apr 2024 23:08:23 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Kamaljit Singh <Kamaljit.Singh1@....com>,
Chaitanya Kulkarni <chaitanyak@...dia.com>
Cc: "kbusch@...nel.org" <kbusch@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>
Subject: Re: WQ_UNBOUND workqueue warnings from multiple drivers
On 03/04/2024 2:50, Kamaljit Singh wrote:
> Sagi, Chaitanya,
>
> Sorry for the delay, found your replies in the junk folder :(
>
>> Was the test you were running read-heavy?
> No, most of the failing fio tests were doing heavy writes. All were with 8 Controllers and 32 NS each. io-specs are below.
>
> [1] bs=16k, iodepth=16, rwmixread=0, numjobs=16
> Failed in ~1 min
>
> Some others were:
> [2] bs=8k, iodepth=16, rwmixread=5, numjobs=16
> [3] bs=8k, iodepth=16, rwmixread=50, numjobs=16
Interesting, that is the opposite of what I would suspect (I thought that
the workload would be read-only or read-mostly).
Does this happen with a 90%-100% read workload?
If we look at nvme_tcp_io_work(), it is essentially looping,
doing send() and recv(), and every iteration checks whether a 1ms
deadline has elapsed. The fact that this happens on a 100% write
workload leads me to conclude that the only way it can
happen is if sending a single 16K request to a controller on its
own takes more than 10ms, which is unexpected...
Question: are you working with a Linux controller? What is
the ctrl ioccsz?