linux-kernel - Re: [PATCH RFC 6/6] btrfs: Add roundrobin raid1 read policy


Message-ID: <20210210125815.GA20903@qmqm.qmqm.pl>
Date:   Wed, 10 Feb 2021 13:58:15 +0100
From:   Michał Mirosław <mirq-linux@...e.qmqm.pl>
To:     Michal Rostecki <mrostecki@...e.de>
Cc:     Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>,
        "open list:BTRFS FILE SYSTEM" <linux-btrfs@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        Michal Rostecki <mrostecki@...e.com>
Subject: Re: [PATCH RFC 6/6] btrfs: Add roundrobin raid1 read policy

On Wed, Feb 10, 2021 at 12:29:25PM +0000, Michal Rostecki wrote:
> On Wed, Feb 10, 2021 at 05:24:28AM +0100, Michał Mirosław wrote:
> > On Tue, Feb 09, 2021 at 09:30:40PM +0100, Michal Rostecki wrote:
> > [...]
> > > For the array with 3 HDDs, not adding any penalty resulted in 409MiB/s
> > > (429MB/s) performance. Adding the penalty value 1 resulted in a
> > > performance drop to 404MiB/s (424MB/s). Increasing the value towards 10
> > > was making the performance even worse.
> > > 
> > > For the array with 2 HDDs and 1 SSD, adding penalty value 1 to
> > > rotational disks resulted in the best performance - 541MiB/s (567MB/s).
> > > Not adding any value and increasing the value was making the performance
> > > worse.
> > > 
> > > Adding penalty value to non-rotational disks was always decreasing the
> > > performance, which motivated setting it as 0 by default. For the purpose
> > > of testing, it's still configurable.
> > [...]
> > > +	bdev = map->stripes[mirror_index].dev->bdev;
> > > +	inflight = mirror_load(fs_info, map, mirror_index, stripe_offset,
> > > +			       stripe_nr);
> > > +	queue_depth = blk_queue_depth(bdev->bd_disk->queue);
> > > +
> > > +	return inflight < queue_depth;
> > [...]
> > > +	last_mirror = this_cpu_read(*fs_info->last_mirror);
> > [...]
> > > +	for (i = last_mirror; i < first + num_stripes; i++) {
> > > +		if (mirror_queue_not_filled(fs_info, map, i, stripe_offset,
> > > +					    stripe_nr)) {
> > > +			preferred_mirror = i;
> > > +			goto out;
> > > +		}
> > > +	}
> > > +
> > > +	for (i = first; i < last_mirror; i++) {
> > > +		if (mirror_queue_not_filled(fs_info, map, i, stripe_offset,
> > > +					    stripe_nr)) {
> > > +			preferred_mirror = i;
> > > +			goto out;
> > > +		}
> > > +	}
> > > +
> > > +	preferred_mirror = last_mirror;
> > > +
> > > +out:
> > > +	this_cpu_write(*fs_info->last_mirror, preferred_mirror);
> > 
> > This looks like it effectively decreases queue depth for non-last
> > device. After all devices are filled to queue_depth-penalty, only
> > a single mirror will be selected for next reads (until a read on
> > some other one completes).
> > 
> 
> Good point. And if all devices are going to be filled for longer time,
> this function will keep selecting the last one. Maybe I should select
> last+1 in that case. Would that address your concern or did you have any
> other solution in mind?

The best option would be to postpone the selection until one device becomes
free again. But if that's not doable, then yes, you could make sure it stays
round-robin after filling the queues (the scheduling will lose the
"penalty"-driven adjustment though).
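
To illustrate the second option, here is a minimal userspace sketch (not
the patch's code). It assumes plain arrays in place of the per-device
state; queue_not_filled() and pick_mirror() are hypothetical names, and the
penalty and per-CPU last_mirror handling are left out. The scan order
follows the quoted hunk; only the fallback differs, moving on to the mirror
after the last one instead of staying on it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for mirror_queue_not_filled(): true if mirror i
 * can still take another read without exceeding its queue depth.
 */
static bool queue_not_filled(const int *inflight, const int *queue_depth,
			     int i)
{
	return inflight[i] < queue_depth[i];
}

/*
 * Pick a mirror in [first, first + num). The scan order follows the quoted
 * hunk: last..end, then first..last-1. If every queue is full, advance to
 * the mirror after 'last' (wrapping around) so that selection keeps
 * rotating instead of sticking to 'last'.
 */
static int pick_mirror(const int *inflight, const int *queue_depth,
		       int first, int num, int last)
{
	int i;

	assert(num > 0 && last >= first && last < first + num);

	for (i = last; i < first + num; i++)
		if (queue_not_filled(inflight, queue_depth, i))
			return i;

	for (i = first; i < last; i++)
		if (queue_not_filled(inflight, queue_depth, i))
			return i;

	/* All queues are full: fall back to plain round-robin. */
	return first + (last - first + 1) % num;
}
```

With all queues full, repeated calls that feed the result back in as
'last' cycle through every mirror, so no single device absorbs all the
excess reads.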

Best Regards,
Michał Mirosław