Message-ID: <20210211022738.GB4933@qmqm.qmqm.pl>
Date: Thu, 11 Feb 2021 03:27:38 +0100
From: Michał Mirosław <mirq-linux@...e.qmqm.pl>
To: Michal Rostecki <mrostecki@...e.de>
Cc: Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
David Sterba <dsterba@...e.com>,
"open list:BTRFS FILE SYSTEM" <linux-btrfs@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
Michal Rostecki <mrostecki@...e.com>
Subject: Re: [PATCH RFC 6/6] btrfs: Add roundrobin raid1 read policy
On Wed, Feb 10, 2021 at 07:23:04PM +0000, Michal Rostecki wrote:
> On Wed, Feb 10, 2021 at 01:58:15PM +0100, Michał Mirosław wrote:
> > On Wed, Feb 10, 2021 at 12:29:25PM +0000, Michal Rostecki wrote:
> > > On Wed, Feb 10, 2021 at 05:24:28AM +0100, Michał Mirosław wrote:
> > > > This looks like it effectively decreases queue depth for non-last
> > > > device. After all devices are filled to queue_depth-penalty, only
> > > > a single mirror will be selected for next reads (until a read on
> > > > some other one completes).
> > > >
> > >
> > > Good point. And if all devices are going to be filled for a longer time,
> > > this function will keep selecting the last one. Maybe I should select
> > > last+1 in that case. Would that address your concern, or did you have any
> > > other solution in mind?
> >
> > The best would be to postpone the selection until one device becomes free
> > again. But if that's not doable, then yes, you could make sure it stays
> > round-robin after filling the queues (the scheduling will lose the
> > "penalty"-driven adjustment, though).
>
> Or another idea - when all the queues are filled, return the mirror
> which has the lowest load (inflight + penalty), even though it's greater
> than the queue depth. In that case the scheduling will not lose the penalty
> adjustment and the load is going to be spread more fairly.
>
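The lowest-load fallback could be sketched roughly as below (a minimal userspace sketch, not btrfs code; `struct mirror_load`, `select_mirror` and the field names are made-up illustrations of the idea):

```c
#include <assert.h>

/* Hypothetical per-mirror bookkeeping; names are illustrative only. */
struct mirror_load {
	unsigned int inflight;    /* requests currently in flight on the device */
	unsigned int penalty;     /* scheduling bias added to this mirror */
	unsigned int queue_depth; /* device queue depth */
};

/*
 * Prefer the first mirror whose load (inflight + penalty) still fits
 * under its queue depth.  If every queue is filled, fall back to the
 * mirror with the lowest load overall, so the penalty adjustment keeps
 * spreading reads instead of sticking to the last mirror.
 */
static int select_mirror(const struct mirror_load *m, int num)
{
	unsigned int load, best_load = m[0].inflight + m[0].penalty;
	int i, best = 0;

	for (i = 0; i < num; i++) {
		load = m[i].inflight + m[i].penalty;
		if (load < m[i].queue_depth)
			return i;          /* room left: take it */
		if (load < best_load) {
			best_load = load;
			best = i;          /* remember lowest load */
		}
	}
	return best;                       /* all filled: lowest load wins */
}
```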
> I'm not sure if postponing the selection is such a good idea. I think it's
> better if the request is added to the iosched queue anyway, even if the
> disks' queues are filled, and to let the I/O scheduler handle that. The
> length of the iosched queue (nr_requests, an attribute of the iosched) is
> usually greater than the queue depth (an attribute of the device), which
> means that it's fine to schedule more requests for the iosched to handle.
>
> IMO btrfs should use the information given by iosched only for heuristic
> mirror selection, rather than implement its own throttling logic.
>
> Does it make sense to you?
>
> Another idea could be an additional iteration with regard to
> nr_requests, if all load values are greater than the queue depths, though
> that might be overkill. I would prefer to stick to my first idea if
> everyone agrees.
What if the iosched could provide an estimate of a request's latency? Then
btrfs could always select the lowest. For reads from NVMe/SSD I would
normally expect something simple: speed_factor * (pending_bytes + req_bytes).
For HDDs this could do more computation, like looking into what is already
in the queue.

This would deviate from the simple round-robin scheme, though.
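The NVMe/SSD estimate above is simple enough to sketch (illustrative names only; `est_latency` and `select_lowest_latency` are made up for this example, not an existing iosched interface):

```c
#include <assert.h>

/* Hypothetical per-mirror state for the latency estimate. */
struct mirror_est {
	unsigned long speed_factor;  /* cost per byte; lower means faster */
	unsigned long pending_bytes; /* bytes already queued on the device */
};

/* speed_factor * (pending_bytes + req_bytes), as suggested above. */
static unsigned long est_latency(const struct mirror_est *m,
				 unsigned long req_bytes)
{
	return m->speed_factor * (m->pending_bytes + req_bytes);
}

/* Always pick the mirror with the lowest estimated latency. */
static int select_lowest_latency(const struct mirror_est *m, int num,
				 unsigned long req_bytes)
{
	unsigned long e, best_e = est_latency(&m[0], req_bytes);
	int i, best = 0;

	for (i = 1; i < num; i++) {
		e = est_latency(&m[i], req_bytes);
		if (e < best_e) {
			best_e = e;
			best = i;
		}
	}
	return best;
}
```

Note that this naturally favors a faster device even when it has more bytes pending, which is exactly where it diverges from round-robin.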
Best Regards
Michał Mirosław