Date: Wed, 22 May 2024 12:23:51 -0400
From: John Meneghini <jmeneghi@...hat.com>
To: Keith Busch <kbusch@...nel.org>
Cc: hch@....de, sagi@...mberg.me, emilne@...hat.com,
 linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org,
 jrani@...estorage.com, randyj@...estorage.com, hare@...nel.org
Subject: Re: [PATCH v4 1/1] nvme: multipath: Implemented new iopolicy
 "queue-depth"

On 5/22/24 11:56, Keith Busch wrote:
> On Wed, May 22, 2024 at 11:42:12AM -0400, John Meneghini wrote:
>> +static void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)
>> +{
>> +	struct nvme_ctrl *ctrl;
>> +	int old_iopolicy = READ_ONCE(subsys->iopolicy);
>> +
>> +	WRITE_ONCE(subsys->iopolicy, iopolicy);
>> +
>> +	/* iopolicy changes reset the counters and clear the mpath by design */
>> +	mutex_lock(&nvme_subsystems_lock);
>> +	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
>> +		atomic_set(&ctrl->nr_active, 0);
> 
> Can you help me understand why this is a desirable feature? Unless you
> quiesce everything at some point, you'll always have more unaccounted
> requests on whichever path has higher latency. That sounds like it
> defeats the goals of this io policy.

This is true. And as a matter of practice I never change the IO policy when IOs are in flight.  I always stop the IO first.
But we can't stop any user from changing the IO policy again and again.  So I'm not sure what to do.

If you'd like, I can add the 'if (old_iopolicy == iopolicy) return;' check here, but that's not going to solve the problem of
inaccurate counters when users start flipping io policies around with IO in flight. There is no synchronization between io
submission across controllers and changing the policy, so I expect that switching between round-robin and queue-depth with IO
in flight suffers from the same problem... though not as badly.

I'd rather take this patch now and figure out how to fix the problem with another patch in the future.  Maybe we can check the
io stats and refuse to change the policy if they are not zero.
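
Roughly what I have in mind is below -- just an untested sketch, not the v5 patch. It changes the return type from void to
int so the sysfs store routine could propagate -EBUSY, and it assumes ctrl->nr_active (from the hunk above) is a fair proxy
for IO in flight:

static int nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)
{
	struct nvme_ctrl *ctrl;
	int old_iopolicy = READ_ONCE(subsys->iopolicy);

	/* Nothing to do if the policy is unchanged. */
	if (old_iopolicy == iopolicy)
		return 0;

	mutex_lock(&nvme_subsystems_lock);

	/* Refuse the change while any controller still has IO accounted. */
	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
		if (atomic_read(&ctrl->nr_active)) {
			mutex_unlock(&nvme_subsystems_lock);
			return -EBUSY;
		}
	}

	WRITE_ONCE(subsys->iopolicy, iopolicy);

	/*
	 * Counters should already be zero here, but clear them anyway since
	 * submission is not synchronized against the check above.
	 */
	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry)
		atomic_set(&ctrl->nr_active, 0);

	mutex_unlock(&nvme_subsystems_lock);
	return 0;
}

The window between the check and the reset means a request could still slip in on another controller, so this narrows the
problem rather than eliminating it.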

>> @@ -1061,6 +1066,9 @@ static inline bool nvme_disk_is_ns_head(struct gendisk *disk)
>>   {
>>   	return false;
>>   }
>> +static inline void nvme_subsys_iopolicy_update(struct nvme_subsystem *subsys, int iopolicy)
>> +{
>> +}
>>   #endif /* CONFIG_NVME_MULTIPATH */
> 
> You can remove this stub function since the only caller resides in a
> CONFIG_NVME_MULTIPATH file.
> 

I can do that.  Since I have to spin a version 5 patch, do you want me to make the change above?

Is there anything else?

/John

