Date:	Tue, 2 Feb 2010 20:38:26 +0100
From:	Jens Axboe <jens.axboe@...cle.com>
To:	Wu Fengguang <fengguang.wu@...el.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Linux Memory Management List <linux-mm@...ck.org>,
	linux-fsdevel@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 01/11] readahead: limit readahead size for small devices

On Tue, Feb 02 2010, Wu Fengguang wrote:
> Linus reports a _really_ small & slow (505kB, 15kB/s) USB device,
> on which blkid runs unpleasantly slow. He manages to optimize the blkid
> reads down to 1kB+16kB, but still kernel read-ahead turns it into 48kB.
> 
>      lseek 0,    read 1024   => readahead 4 pages (start of file)
>      lseek 1536, read 16384  => readahead 8 pages (page contiguous)
> 
> The readahead heuristics involved here are reasonable ones in general.
> So it's good to fix blkid with fadvise(RANDOM), as Linus already did.
> 
> For the kernel part, Linus suggests:
>   So maybe we could be less aggressive about read-ahead when the size of
>   the device is small? Turning a 16kB read into a 64kB one is a big deal,
>   when it's about 15% of the whole device!
> 
> This looks reasonable: smaller devices tend to be slower (USB sticks as
> well as micro/mobile/old hard disks).
> 
> Given that the non-rotational attribute is not always reported, we can
> take disk size as a max readahead size hint. We use a formula that
> generates the following concrete limits:
> 
>         disk size    readahead size
>      (scale by 4)      (scale by 2)
>                2M                4k
>                8M                8k
>               32M               16k
>              128M               32k
>              512M               64k
>                2G              128k
>                8G              256k
>               32G              512k
>              128G             1024k
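
The limits quoted above follow one rule: readahead doubles each time the
disk size quadruples, starting from 4k at 2M. Below is a minimal sketch that
reproduces the table. It is not the code from the patch. The helper name
readahead_limit, rounding down between table rows, the 4k floor for disks
under 2M, and the absence of a cap above 128G are all assumptions made for
illustration.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def readahead_limit(disk_bytes, base_disk=2 * MIB, base_ra=4 * KIB):
    """Hypothetical helper: readahead doubles when disk size quadruples.

    Assumptions (not stated in the quoted table): sizes between two rows
    round down to the lower row, disks under 2M get the 4k floor, and
    there is no upper cap.
    """
    if disk_bytes <= base_disk:
        return base_ra
    ratio = disk_bytes // base_disk
    # floor(log4(ratio)) computed as floor(log2(ratio)) // 2
    steps = (ratio.bit_length() - 1) // 2
    return base_ra << steps


if __name__ == "__main__":
    for size in (2 * MIB, 8 * MIB, 32 * MIB, 128 * MIB, 512 * MIB,
                 2 * GIB, 8 * GIB, 32 * GIB, 128 * GIB):
        print(f"{size // MIB:>8}M -> {readahead_limit(size) // KIB}k")
```

Under these assumptions the 505kB device from the report would get the 4k
floor, and a 22GB device would land on the 8G row, i.e. 256k.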

I'm not sure the size part makes a ton of sense. You can have really
fast small devices, and large slow devices. One real-world example is
the Sun FMod SSD devices, which are only 22GB in size but are faster
than the Intel X25-E SLC disks.

What makes it even worse for these devices is that they are often
attached to fatter controllers than ahci, where command overhead is
larger.

Running your script on such a device yields (I enlarged the read-count
by 2, which makes it more reproducible):

MARVELL SD88SA02 MP1F

rasize	1st             2nd
------------------------------------------------------------------
  4k	 41 MB/s	 41 MB/s
 16k	 85 MB/s	 81 MB/s
 32k	102 MB/s	109 MB/s
 64k	125 MB/s	144 MB/s
128k	183 MB/s	185 MB/s
256k	216 MB/s	216 MB/s
512k	216 MB/s	236 MB/s
1024k	251 MB/s	252 MB/s
  2M	258 MB/s	258 MB/s
  4M	266 MB/s	266 MB/s
  8M	266 MB/s	266 MB/s

So for that device, 1M-2M looks like the sweet spot, though it needs
4-8M to reach full throughput.

I don't think this is atypical of bigger systems. Only very recently
have controllers started to slim down the command overhead for real,
because of the SSD devices. What probably is atypical is a device that
is this small yet pretty fast.

-- 
Jens Axboe
