Message-ID: <3dad09ce-151d-41fc-8137-56a931c4c224@flourine.local>
Date: Wed, 16 Apr 2025 10:30:11 +0200
From: Daniel Wagner <dwagner@...e.de>
To: Mohamed Khalfella <mkhalfella@...estorage.com>
Cc: Daniel Wagner <wagi@...nel.org>, Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>, Keith Busch <kbusch@...nel.org>, Hannes Reinecke <hare@...e.de>,
John Meneghini <jmeneghi@...hat.com>, randyj@...estorage.com, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 3/3] nvme: delay failover by command quiesce timeout
On Tue, Apr 15, 2025 at 05:40:16PM -0700, Mohamed Khalfella wrote:
> On 2025-04-15 14:17:48 +0200, Daniel Wagner wrote:
> > Passthrough commands should fail immediately. Userland is in charge here,
> > not the kernel. At least this is what should happen here.
>
> I see your point. Unless I am missing something, these requests should be
> held equally to bio requests from the multipath layer. Let us say an app
> submitted a write request that got canceled immediately; how does the app
> know when it is safe to retry the write request?
Good question, but nothing new as far as I can tell. If the kernel
doesn't start to retry passthru IO commands, we have to figure out how
to pass additional information to userland.
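For illustration, a minimal userspace sketch (an assumption on my part,
not from this series; the opcode/CDW values and the retry count/delay
are made up) of what an application issuing an IO passthru command has
to do today, since the retry decision lives entirely in the application:

  #include <errno.h>
  #include <stdint.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/nvme_ioctl.h>

  /* Read one block at LBA 0 via NVME_IOCTL_IO_CMD and retry in userland
   * when the command fails immediately (EINTR/EAGAIN), e.g. while the
   * transport is down. */
  static int read_lba0(int fd, __u32 nsid, void *buf, __u32 len)
  {
          struct nvme_passthru_cmd cmd = {
                  .opcode   = 0x02,              /* NVMe Read */
                  .nsid     = nsid,
                  .addr     = (__u64)(uintptr_t)buf,
                  .data_len = len,
                  .cdw10    = 0,                 /* SLBA low */
                  .cdw11    = 0,                 /* SLBA high */
                  .cdw12    = 0,                 /* NLB 0 -> one block */
          };
          int retries = 5;                       /* arbitrary policy */

          for (;;) {
                  int ret = ioctl(fd, NVME_IOCTL_IO_CMD, &cmd);

                  if (ret == 0)
                          return 0;              /* success */
                  if (ret > 0)
                          return ret;            /* NVMe status code */
                  if ((errno == EINTR || errno == EAGAIN) && retries-- > 0) {
                          /* No way to know when a retry is actually safe. */
                          sleep(1);
                          continue;
                  }
                  return -errno;
          }
  }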
> Holding requests like writes until it is safe to retry them is the whole
> point of this work, right?
My first goal was to address the IO commands submitted via the block
layer; I didn't have the IO passthru interface on my radar. I agree,
getting the IO passthru path correct is also a good idea.
Anyway, I don't have a lot of experience with the IO passthru interface.
A quick test to figure out what the status quo is: 'nvme read' results
in EINTR or 'Resource temporarily unavailable' (EAGAIN or EWOULDBLOCK)
when the TCP connection between host and target is blocked (*), though
it comes from an admin command. It looks like I have to dive into this
a bit more.
(*) I think this might be the same problem as the systemd folks have
reported.
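For reference, roughly how the return values of the passthru ioctl map
(again only a sketch based on my assumptions; the Identify command and
device path are just examples): a positive return value is the NVMe
status code of a completed command, while -1 with errno set (EINTR,
EAGAIN, ...) is a kernel/transport level error:

  #include <errno.h>
  #include <fcntl.h>
  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/ioctl.h>
  #include <linux/nvme_ioctl.h>

  int main(int argc, char **argv)
  {
          static __u8 id[4096];
          struct nvme_admin_cmd cmd = {
                  .opcode   = 0x06,              /* Identify */
                  .addr     = (__u64)(uintptr_t)id,
                  .data_len = sizeof(id),
                  .cdw10    = 1,                 /* CNS 1: Identify Controller */
          };
          int fd, ret;

          fd = open(argc > 1 ? argv[1] : "/dev/nvme0", O_RDONLY);
          if (fd < 0) {
                  perror("open");
                  return 1;
          }

          ret = ioctl(fd, NVME_IOCTL_ADMIN_CMD, &cmd);
          if (ret < 0)
                  printf("kernel error: %s\n", strerror(errno));
          else if (ret > 0)
                  printf("NVMe status: 0x%x\n", ret);
          else
                  printf("success\n");

          close(fd);
          return 0;
  }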