Message-ID: <20100309100153.GD18077@nb.net.home>
Date:	Tue, 9 Mar 2010 11:01:53 +0100
From:	Karel Zak <kzak@...hat.com>
To:	Michael Tokarev <mjt@....msk.ru>
Cc:	Mike Snitzer <snitzer@...hat.com>,
	"Martin K. Petersen" <martin.petersen@...cle.com>,
	Tejun Heo <tj@...nel.org>,
	"linux-ide@...r.kernel.org" <linux-ide@...r.kernel.org>,
	lkml <linux-kernel@...r.kernel.org>,
	Daniel Taylor <Daniel.Taylor@....com>,
	Jeff Garzik <jeff@...zik.org>, Mark Lord <kernel@...savvy.com>,
	tytso@....edu, "H. Peter Anvin" <hpa@...or.com>,
	hirofumi@...l.parknet.co.jp,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>, irtiger@...il.com,
	Matthew Wilcox <matthew@....cx>, aschnell@...e.de,
	knikanth@...e.de, jdelvare@...e.de,
	Jim Meyering <jim@...ering.net>, Neil Brown <neilb@...e.de>
Subject: Re: ATA 4 KiB sector issues.

On Tue, Mar 09, 2010 at 09:53:37AM +0300, Michael Tokarev wrote:
> Mike Snitzer wrote:
> []
> > I've been keeping track of all the pieces in play, have coordinated
> > with kzak and jim, and have a summary that offers some amount of macro
> > detail (at the end I touch on parted and fdisk):
> > 
> > http://people.redhat.com/msnitzer/docs/io-limits.txt
> 
> What I don't see in this thread and in this document is - any mention
> of linux md layer.  I think it is the first candidate to test the whole
> thing, the easiest and most important one.  I mean the alignment and
> "recommended I/O size" and all this similar stuff.
> 
> Think of a raid5 array - with all the mentioned good stuff in place
> fdisk should figure out to align partitions on the array stripe
> boundary, and should do that automatically.  And this should be

Yes. For userspace there is no difference between a RAID and a non-RAID
device -- the topology support in the kernel provides a unified API for all
devices. This means we don't need any extra RAID support in
fdisk/parted. The userspace tools follow the topology data from the kernel.
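
As a rough illustration of what "follow topology data from kernel" means,
the same sysfs attributes can be read for a plain disk and for an MD array.
This is only a sketch, not what fdisk/parted actually do internally; the
function name show_topology and the SYSFS_ROOT override are hypothetical and
exist only for this example.

```shell
# Print the I/O topology attributes the kernel exports for one block device.
# SYSFS_ROOT is a hypothetical override (defaults to /sys) so the sketch can
# be tried against a fake directory tree.
show_topology() (
    dev=$1
    root=${SYSFS_ROOT:-/sys}
    for attr in queue/logical_block_size queue/physical_block_size \
                queue/minimum_io_size queue/optimal_io_size \
                alignment_offset; do
        # ${attr##*/} strips the "queue/" prefix for display.
        printf '%s: %s\n' "${attr##*/}" "$(cat "$root/block/$dev/$attr")"
    done
)
```

Calling "show_topology sdb" and "show_topology md8" goes through exactly the
same code path, which is the point: no RAID-specific handling is needed.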

The good thing about the 1MiB default alignment is that it works for the
usual stripe sizes (for sizes greater than 1MiB we use the optimal I/O
size).
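
The rule above can be sketched as simple arithmetic. This is a simplified
model, not the real fdisk code; the helper names pick_alignment and
mib_fits_stripe are made up for the example, and the stripe sizes are assumed
to be given in bytes.

```shell
MIB=1048576

# Echo the alignment in bytes: 1MiB by default, or the optimal I/O size
# when that is greater than 1MiB.
pick_alignment() {
    if [ "$1" -gt "$MIB" ]; then
        echo "$1"
    else
        echo "$MIB"
    fi
}

# Succeed if 1MiB is a whole number of stripes of the given size.
mib_fits_stripe() {
    [ $(( $MIB % $1 )) -eq 0 ]
}

# Power-of-two stripe sizes up to 1MiB all divide 1MiB evenly.
for stripe in 32768 65536 262144 524288; do
    mib_fits_stripe "$stripe" && echo "1MiB is a multiple of $stripe"
done
```

With 512-byte logical sectors, 1MiB is 2048 sectors, which is where the first
partition starts in the listings at the end of this mail.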

> most easy to debug/test, since the whole thing is controllable
> by kernel.

I did almost all my tests with scsi_debug or MD RAID0 on scsi_debug.
It works as expected. (Note that kernel 2.6.31 has a problem with
alignment_offset calculation on stacked devices, so use the latest
kernel where the bug is already fixed.)

But I haven't tried using unpartitioned (whole) 4K disks for RAID,
because scsi_debug does not allow creating more devices (and I don't
have real HW).

Some tests are available in util-linux-ng sources:
http://git.kernel.org/?p=utils/util-linux-ng/util-linux-ng.git;a=tree;f=tests/ts/fdisk

    Karel


 # modprobe scsi_debug dev_size_mb=2500 sector_size=512 physblk_exp=3

    [..create partitions...]

 # fdisk -lcu /dev/sdb 

 Disk /dev/sdb: 2621 MB, 2621440000 bytes
 255 heads, 63 sectors/track, 318 cylinders, total 5120000 sectors
 Units = sectors of 1 * 512 = 512 bytes
 Sector size (logical/physical): 512 bytes / 4096 bytes
 I/O size (minimum/optimal): 4096 bytes / 32768 bytes
 Disk identifier: 0xb585b0be

 Device Boot         Start         End      Blocks   Id  System
 /dev/sdb1            2048     1026047      512000   83  Linux
 /dev/sdb2         1026048     2050047      512000   83  Linux
 /dev/sdb3         2050048     3074047      512000   83  Linux
 /dev/sdb4         3074048     4098047      512000   83  Linux


 # mdadm --create /dev/md8 --level=5 --raid-devices=4 /dev/sdb{1,2,3,4}

     [...create partitions on the raid...]

 # fdisk -lcu /dev/md8

 Disk /dev/md8: 1572 MB, 1572667392 bytes
 2 heads, 4 sectors/track, 383952 cylinders, total 3071616 sectors
 Units = sectors of 1 * 512 = 512 bytes
 Sector size (logical/physical): 512 bytes / 4096 bytes
 I/O size (minimum/optimal): 65536 bytes / 65536 bytes
 Disk identifier: 0x1bb6fd8d

 Device Boot          Start         End      Blocks   Id  System
 /dev/md8p1            2048     1435647      716800   83  Linux
 /dev/md8p2         1435648     2869247      716800   83  Linux


 Check offsets (alignment):

 # cat /sys/block/sdb/sdb{1,2,3,4}/alignment_offset
 0
 0
 0
 0

 # cat /sys/block/md8/md8p{1,2}/alignment_offset
 0
 0
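
 The start sectors in the listings above can also be cross-checked by hand.
 The sketch below is plain arithmetic on the numbers printed by fdisk, not
 the way the kernel computes alignment_offset; start_remainder is a
 hypothetical helper name.

```shell
# Echo how many bytes a partition start lies past the previous multiple of
# the given granularity (0 means the start is aligned to it).
start_remainder() {
    start_sector=$1
    sector_size=$2
    granularity=$3
    echo $(( ($start_sector * $sector_size) % $granularity ))
}

# Start sectors copied from the fdisk listings above, checked against the
# optimal I/O size reported for each device.
for s in 2048 1026048 2050048 3074048; do
    echo "sdb start $s: $(start_remainder "$s" 512 32768)"
done
for s in 2048 1435648; do
    echo "md8 start $s: $(start_remainder "$s" 512 65536)"
done
```

 Every start sector here is a multiple of 2048 (1MiB in 512-byte sectors), so
 each remainder comes out as 0.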

-- 
 Karel Zak  <kzak@...hat.com>