lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 5 Apr 2011 11:43:01 +0300
From:	Zeev Tarantov <zeev.tarantov@...il.com>
To:	Andreas Dilger <adilger@...ger.ca>
Cc:	Eric Sandeen <sandeen@...hat.com>,
	ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH] e2fsprogs: don't set stripe/stride to 1 block in mkfs

On Tue, Apr 5, 2011 at 11:10, Andreas Dilger <adilger@...ger.ca> wrote:
> On 2011-04-04, at 9:11 AM, Eric Sandeen wrote:
>> Block devices may set minimum or optimal IO hints equal to
>> blocksize; in this case there is really nothing for ext4
>> to do with this information (i.e. search for a block-aligned
>> allocation?) so don't set fs geometry with single-block
>> values.
>>
>> Zeev also reported that with a block-sized stripe, the
>> ext4 allocator spends time spinning in ext4_mb_scan_aligned(),
>> oddly enough.
>>
>> Reported-by: Zeev Tarantov <zeev.tarantov@...il.com>
>> Signed-off-by: Eric Sandeen <sandeen@...hat.com>
>> ---
>>
>> diff --git a/misc/mke2fs.c b/misc/mke2fs.c
>> index 9798b88..74b838c 100644
>> --- a/misc/mke2fs.c
>> +++ b/misc/mke2fs.c
>> @@ -1135,8 +1135,11 @@ static int get_device_geometry(const char *file,
>>       if ((opt_io == 0) && (psector_size > blocksize))
>>               opt_io = psector_size;
>>
>> -     fs_param->s_raid_stride = min_io / blocksize;
>> -     fs_param->s_raid_stripe_width = opt_io / blocksize;
>> +     /* setting stripe/stride to blocksize is pointless */
>> +     if (min_io > blocksize)
>> +             fs_param->s_raid_stride = min_io / blocksize;
>> +     if (opt_io > blocksize)
>> +             fs_param->s_raid_stripe_width = opt_io / blocksize;
>
> I don't think it is harmful to specify an mballoc alignment that is an even multiple of the underlying device IO size (e.g. at least 256kB or 512kB).
>
> If the underlying device (e.g. zram) is reporting 16kB or 64kB opt_io size because that is PAGE_SIZE, but blocksize is 4kB, then we will have the same performance problem again.

I think I'm not understanding you. Are you objecting to the patch? If so, why?
If io block is an even multiple of fs blocks, then mballoc really does
need to use the fancier aligning code (though for a ratio that is a
power of two, it shouldn't need to be slow).
In your example, If PAGE_SIZE is 16k and zram reports min and opt io
size as 16k but fs block is 4k, then Eric Sandeen's new code will set
stride and stripe to 4, as it should. Then mballoc will really need to
align blocks in groups of 4 and spending time doing that is not a
performance problem.

This patch is only meant to _not_ set stripe and stride to 1, which
cannot help block allocation and does in fact cause the kernel to
spend more cpu time because it does not detect the degenerate case.
Making the allocator faster in the case of a ratio that is a power of
two is the work of a different patch someone may want to write.
Using larger fs blocks in case the underlying block device prefers
larger io blocks is the work of the admin, I think.

> Cheers, Andreas

-Z.T.
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ