Message-ID: <594f73f2-59b0-bbcb-d7a0-6d89e2446830@gmail.com>
Date:   Sat, 1 Jul 2023 05:11:11 -0700
From:   James Smart <jsmart2021@...il.com>
To:     Daniel Wagner <dwagner@...e.de>, linux-nvme@...ts.infradead.org
Cc:     linux-kernel@...r.kernel.org, linux-block@...r.kernel.org,
        Chaitanya Kulkarni <kch@...dia.com>,
        Shin'ichiro Kawasaki <shinichiro@...tmail.com>,
        Sagi Grimberg <sagi@...mberg.me>,
        Hannes Reinecke <hare@...e.de>,
        James Smart <jsmart2021@...il.com>
Subject: Re: [PATCH v2 4/5] nvme-fc: Make initial connect attempt synchronous


On 6/20/2023 6:37 AM, Daniel Wagner wrote:
> Commit 4c984154efa1 ("nvme-fc: change controllers first connect to use
> reconnect path") made the connection attempt asynchronous in order to
> make the connection attempt from autoconnect/boot via udev/systemd up
> case a bit more reliable.
> 
> Unfortunately, one side effect of this is that any wrong parameters
> provided from userspace will not be directly reported as invalid, e.g.
> auth keys.
> 
> So instead of having the policy code inside the kernel, it's better to
> address this in userspace, for example in nvme-cli or nvme-stas.
> 
> This aligns the fc transport with tcp and rdma.

As much as you want to make this change to make transports "similar", I 
am dead set against it unless you are completing a long qualification of 
the change on real FC hardware and FC-NVME devices. There is probably 
1.5 yrs of testing of different race conditions that drove this change. 
You cannot declare success from a simplistic toy tool such as fcloop for 
validation.

The original issues still exist, have probably even morphed given the time 
since the original change, and this will seriously disrupt the transport 
and any downstream releases.  So I have a very strong NACK on this change.

Yes - things such as the connect failure results are difficult to return 
back to nvme-cli. I have had many gripes about nvme-cli's behavior 
over the years, especially in negative cases where race conditions 
required retries. It still handles these miserably.  The async 
reconnect path solved many of these issues for fc.

For the auth failure, how do we deal with things if auth fails over time 
as reconnects fail due to a credential change?  I would think 
commonality of this behavior drives part of the choice.

-- james
