Message-ID: <7kcc75btp5bo5oqjnpqlwwo37l2f4atwfemknbmvqagrqicl2i@njn4tai7e4m7>
Date:   Thu, 6 Jul 2023 14:07:08 +0200
From:   Daniel Wagner <dwagner@...e.de>
To:     James Smart <jsmart2021@...il.com>
Cc:     linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-block@...r.kernel.org, Chaitanya Kulkarni <kch@...dia.com>,
        Shin'ichiro Kawasaki <shinichiro@...tmail.com>,
        Sagi Grimberg <sagi@...mberg.me>,
        Hannes Reinecke <hare@...e.de>, Ewan Milne <emilne@...hat.com>
Subject: Re: [PATCH v2 4/5] nvme-fc: Make initial connect attempt synchronous

Hi James,

On Sat, Jul 01, 2023 at 05:11:11AM -0700, James Smart wrote:
> As much as you want to make this change to make transports "similar", I am
> dead set against it unless you are completing a long qualification of the
> change on real FC hardware and FC-NVME devices. There is probably 1.5 yrs of
> testing of different race conditions that drove this change. You cannot
> declare success from a simplistic toy tool such as fcloop for validation.
> 
> The original issues exist, probably have even morphed given the time from
> the original change, and this will seriously disrupt the transport and any
> downstream releases.  So I have a very strong NACK on this change.
> 
> Yes - things such as the connect failure results are difficult to return
> back to nvme-cli. I have had many gripes about the nvme-cli's behavior over
> the years, especially on negative cases due to race conditions which
> required retries. It still fails this miserably.  The async reconnect path
> solved many of these issues for fc.
> 
> For the auth failure, how do we deal with things if auth fails over time as
> reconnects fail due to a credential changes ?  I would think commonality of
> this behavior drives part of the choice.

Alright, what do you think about introducing a new '--sync' option in nvme-cli
that tells the kernel we want to wait for the initial connect to succeed or
fail? Obviously, this needs to handle signals too.

From what I understood this is also what Ewan would like to have.

Hannes thought it would make sense to use the same initial connect logic in
tcp/rdma, because there could also be transient errors (e.g. spanning tree
protocol). In short, make tcp/rdma do the same thing as fc?

So let's drop the final patch from this series for the time being. Could you
give some feedback on the rest of the patches?

Thanks,
Daniel
