Message-Id: <20240428102527.37462-1-wangbing.kuang@shopee.com>
Date: Sun, 28 Apr 2024 18:25:27 +0800
From: kwb <wangbing.kuang@...pee.com>
To: sagi@...mberg.me
Cc: axboe@...com,
	chunguang.xu@...pee.com,
	hch@....de,
	james.smart@...adcom.com,
	kbusch@...nel.org,
	linux-kernel@...r.kernel.org,
	linux-nvme@...ts.infradead.org,
	wangbing.kuang@...pee.com
Subject: Re: [Bug Report] nvme connect deadlock in allocating tag

>On 28/04/2024 12:16, Wangbing Kuang wrote:
>> "The error_recovery work should unquiesce the admin_q, which should fail
>> fast all pending admin commands,
>> so it is unclear to me how the connect process gets stuck."
>> I think the reason is: the command can be unquiesced, but its tag cannot
>> be returned until the command completes.
>
>The error recovery also cancels all pending requests. See 
>nvme_cancel_admin_tagset

nvme_cancel_admin_tagset can cancel the requests that are pending when the
admin queue is stopped, but it cannot cancel requests submitted after that,
before the next reconnect attempt.
The timeline is:
recovery fails (we can reproduce this by keeping the I/O hung for longer)
-> reconnect delay
-> multiple "nvme list" invocations are issued (using up the admin tag set)
-> reconnect starts (and waits for a tag when calling nvme_enable_ctrl and nvme_wait_ready)
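
To make the tag-starvation step concrete, here is a minimal kernel-style C
sketch; demo_admin_alloc() is a hypothetical helper and REQ_OP_DRV_OUT is
just an example opcode, this is not the actual driver code path:

#include <linux/blk-mq.h>

/*
 * Hypothetical illustration only.  The fabrics register accesses done
 * from nvme_enable_ctrl()/nvme_wait_ready() are ordinary admin commands,
 * so they need an admin tag.  If every admin tag is already held by
 * commands queued while the controller was down (e.g. several concurrent
 * "nvme list" invocations), a normal allocation sleeps in the tag
 * allocator, and those tags can only be freed after the reconnect
 * succeeds -- hence the deadlock.
 */
static struct request *demo_admin_alloc(struct request_queue *admin_q,
					bool use_reserved_tag)
{
	blk_mq_req_flags_t flags = 0;

	if (use_reserved_tag)
		flags = BLK_MQ_REQ_RESERVED;

	/*
	 * Without BLK_MQ_REQ_NOWAIT and without a reserved tag, this call
	 * sleeps until another admin request frees its tag -- which never
	 * happens in the scenario described above.
	 */
	return blk_mq_alloc_request(admin_q, REQ_OP_DRV_OUT, flags);
}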


>>
>> "What is step (2) - make nvme io timeout to recover the connection?"
>> I use spdk-nvmf-target as the backend.  It is easy to make the
>> nvmf-target hang and unhang read/write I/O.  So I just keep the I/O hung
>> for over 30 seconds, which makes the linux-nvmf-host hit an I/O timeout,
>> and the timeout then triggers connection recovery.
>> By the way, I use multipath=0.
>
>Interesting, does this happen with multipath=Y ?
>I didn't expect people to be using multipath=0 for fabrics in the past few
>years.

Not certain; I did not test with multipath=Y. We chose multipath=0 because it
involves less code and we only need one path.

>>
>> "Is this reproducing with upstream nvme? or is this some distro kernel
>> where this happens?"
>> it is reproduced in a kernel based from v5.15, but I think this is common
>> error.
>
>It would be beneficial to verify this.

OK, that test will take more time; for now we can only verify it on v5.15.

>Do you have the below patch applied?
>de105068fead ("nvme: fix reconnection fail due to reserved tag allocation")

Yes, my modification is inspired by that commit. Chunguang Xu is my colleague.
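
For readers following the thread, here is a rough sketch of the general idea
behind reserved tag allocation; demo_setup_admin_tagset() is hypothetical,
the values are examples, and the actual wiring in the nvme driver and in
commit de105068fead may differ:

#include <linux/blk-mq.h>
#include <linux/string.h>

/*
 * Hypothetical example of reserving tags in a blk-mq tag set: a few tags
 * are kept back so that only allocations made with BLK_MQ_REQ_RESERVED
 * can use them, meaning connect/initialization commands never have to
 * wait behind ordinary admin commands for a tag.
 */
static int demo_setup_admin_tagset(struct blk_mq_tag_set *set,
				   const struct blk_mq_ops *ops)
{
	memset(set, 0, sizeof(*set));
	set->ops = ops;
	set->nr_hw_queues = 1;
	set->queue_depth = 32;		/* total admin tags (example value) */
	set->reserved_tags = 2;		/* kept back for connect/init commands */
	set->numa_node = NUMA_NO_NODE;

	return blk_mq_alloc_tag_set(set);
}

/*
 * Commands on the initialization path would then allocate with
 * BLK_MQ_REQ_RESERVED, e.g.:
 *	blk_mq_alloc_request(admin_q, REQ_OP_DRV_OUT, BLK_MQ_REQ_RESERVED);
 * so they cannot be starved by regular admin commands holding all the
 * non-reserved tags.
 */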
