Message-ID: <b4302ee2-584d-48b1-b0fe-77035879c15e@grimberg.me>
Date: Wed, 22 May 2024 13:52:03 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Nilay Shroff <nilay@...ux.ibm.com>, John Meneghini <jmeneghi@...hat.com>,
 kbusch@...nel.org, hch@....de, emilne@...hat.com
Cc: linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
 jrani@...estorage.com, randyj@...estorage.com, hare@...nel.org
Subject: Re: [PATCH v3 1/1] nvme: multipath: Implemented new iopolicy
 "queue-depth"



On 22/05/2024 13:48, Nilay Shroff wrote:
>
> On 5/21/24 20:14, John Meneghini wrote:
>> On 5/21/24 06:16, Sagi Grimberg wrote:
>>>>> Exactly, nvme_mpath_init_ctrl resets the counter.
>>>> Except you're right, the counter reset needs to move to nvme_mpath_init_identify()
>>>> or some place that is called on every controller reset.
>>> This however raises the question of how much failover/reset testing this patch has seen...
>> It has received quite a bit of testing with failover and controller resets.  I shared some of the testing that was done at LSFMM last week.
>>
>> It has received enough testing to make me confident that this code is safe.  That is: it won't panic, corrupt data, or otherwise do any harm.  We believe the error paths will not be affected by this change... but I agree that running the error paths could negatively impact the accuracy of the nr_active counters... which could lead to an inaccurate outcome with the queue-depth policy.
>>
>> I agree the nr_active counter initialization should move to nvme_mpath_init_identify(), or maybe be done there in addition to in nvme_mpath_init_ctrl(). I'm willing to make that change now... if that's what people want.  I don't think it would require any extensive retesting.
>>
>> /John
>>
>>
> I think with Keith's recent proposed patch for fixing io accounting on failover, the
> nvme_mpath_end_request() would be called even for cancelled IO and so the nr_active
> counter shall be adjusted correctly for cancelled IO requests. Having said that, IMO
> you should consider moving initialization of the nr_active counter to nvme_mpath_init_identify()
> as that's a common function invoked from the regular controller initialization code path as well
> as the reset code path.

Yes, and preferably with a comment explaining why it's there (despite 
having nothing to do with identify...)
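
Roughly along these lines, as a userspace model rather than the actual
patch. Every model_* name is made up for illustration, and it assumes
(as discussed above) that the identify hook runs on initial setup and
again after every controller reset:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the controller; not the kernel's struct nvme_ctrl. */
struct model_ctrl {
	atomic_int nr_active;	/* in-flight I/Os seen by the queue-depth policy */
};

/* Models the one-time init path: runs once per controller object. */
static void model_mpath_init_ctrl(struct model_ctrl *ctrl)
{
	atomic_init(&ctrl->nr_active, 0);
}

/* Models the identify hook: assumed to run on setup and after every reset. */
static void model_mpath_init_identify(struct model_ctrl *ctrl)
{
	/*
	 * Unrelated to identify itself. The counter is cleared here only
	 * because this hook runs on every controller reset, so a count left
	 * stale by I/O cancelled during the reset cannot skew path selection.
	 */
	atomic_store(&ctrl->nr_active, 0);
}

/* Models request start: one more I/O in flight on this path. */
static void model_start_request(struct model_ctrl *ctrl)
{
	atomic_fetch_add(&ctrl->nr_active, 1);
}

/* Models request completion: one fewer I/O in flight on this path. */
static void model_end_request(struct model_ctrl *ctrl)
{
	atomic_fetch_sub(&ctrl->nr_active, 1);
}
```

The point of the comment is just that the next person reading the
identify path does not move the reset back out as unrelated cleanup.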
