Message-ID: <yq1y6ah4kuu.fsf@sermon.lab.mkp.net>
Date: Fri, 01 Oct 2010 18:19:21 -0400
From: "Martin K. Petersen" <martin.petersen@...cle.com>
To: "Ted Ts'o" <tytso@....edu>
Cc: Mike Snitzer <snitzer@...hat.com>,
Eric Sandeen <sandeen@...hat.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Jens Axboe <jaxboe@...ionio.com>,
"James.Bottomley\@hansenpartnership.com"
<James.Bottomley@...senpartnership.com>,
"linux-scsi\@vger.kernel.org" <linux-scsi@...r.kernel.org>,
"linux-ext4\@vger.kernel.org" <linux-ext4@...r.kernel.org>
Subject: Re: I/O topology fixes for big physical block size
>>>>> "Ted" == Ted Ts'o <tytso@....edu> writes:
Ted> If we scale minimum_io_size up to the physical block size, then
Ted> even though these devices will have 512 or 4k logical block sizes,
Ted> minimum_io_size will be 16k? That sounds wrong, incorrect, and
Ted> given that the Linux VM can't handle file system block sizes
Ted> greater than page size. And if we scale the minimum_io_size to the
Ted> physical block size, mke2fs will refuse to create a 4k blocksize
Ted> filesystem --- since presumably "minimum io size" means we can't do
Ted> I/O's smaller than that.
logical <= physical <= minimum
  logical   is the smallest unit we can address. Usually 512 bytes.
  physical  is the allocation unit the device claims to use
            internally. Typically 512 or 4096 bytes; 8 and 16 KiB are
            coming.
  minimum   is the device's preferred minimum random I/O unit. This
            is usually identical to the physical block size. Arrays
            might report a multiple of the physical block size here
            (stripe chunk size).
  optimal   (if provided) is the preferred sequential I/O unit and a
            multiple of minimum (stripe width).
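As a rough illustration of the four values above, here is a minimal
Python sketch that reads them from the queue directory in sysfs and
checks the ordering. The helper names (read_topology, ordering_holds)
and the sysfs_root parameter are made up for this example; only the
four attribute file names come from the kernel.

```python
from pathlib import Path

# Attribute files exported under /sys/block/<dev>/queue/
ATTRS = ("logical_block_size", "physical_block_size",
         "minimum_io_size", "optimal_io_size")


def read_topology(dev, sysfs_root="/sys/block"):
    """Return the four topology values in bytes for a device such as 'sda'.

    sysfs_root is a hypothetical knob so the function can be pointed at
    a fake directory tree for testing.
    """
    queue = Path(sysfs_root) / dev / "queue"
    return {name: int((queue / name).read_text().strip()) for name in ATTRS}


def ordering_holds(topo):
    """Check logical <= physical <= minimum for one device."""
    return (topo["logical_block_size"]
            <= topo["physical_block_size"]
            <= topo["minimum_io_size"])
```

optimal_io_size is left out of the check on purpose: it reads as 0
when the device does not provide a value.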
The logical and physical parameters are device protocol-centric values.
The minimum and optimal I/O sizes are the two "soft" values that
filesystems should be looking at for layout hints.
A filesystem should use minimum as a cue for block size and optimal as a
cue for stripe width. minimum may indeed be bigger than the page size,
and this discussion was started to figure out whether there were things
we could do to accommodate these devices without actually changing the
filesystem block size in the traditional sense.
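One way a mkfs-style tool might turn the two soft values into layout
hints is sketched below. This is an assumption-laden illustration, not
what any particular tool does: layout_hints and its return keys are
hypothetical, the inputs are assumed to be powers of two, and the block
size is capped at the page size because of the VM limitation Ted
mentions.

```python
def layout_hints(minimum_io_size, optimal_io_size,
                 logical_block_size=512, page_size=4096):
    """Derive filesystem layout hints from the soft topology values.

    All sizes are in bytes; optimal_io_size may be 0 (not provided).
    """
    # Use minimum as the cue for block size, but never go below the
    # logical block size or above the page size.
    block_size = max(minimum_io_size, logical_block_size)
    block_size = min(block_size, page_size)

    # Whatever part of minimum does not fit in one block becomes a
    # stride hint, expressed in filesystem blocks.
    stride = max(minimum_io_size // block_size, 1)

    # Use optimal as the cue for stripe width, also in blocks.
    stripe_width = optimal_io_size // block_size if optimal_io_size else 0

    return {"block_size": block_size,
            "stride_blocks": stride,
            "stripe_width_blocks": stripe_width}
```

For a drive reporting a 16 KiB minimum and no optimal value, this keeps
a 4 KiB block size and yields a stride of 4 blocks. For an array with a
64 KiB chunk and a 256 KiB stripe it yields a stride of 16 and a stripe
width of 64 blocks.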
Since not all drives guarantee that a read-modify-write cycle on a 4 KiB
physical block won't clobber adjacent 512-byte logical blocks, it may be
a good idea to look at the physical block size if there are atomicity
concerns. I.e., filesystems that depend on atomic journal writes may
want to look at the reported physical block size.
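To make the atomicity concern concrete, the sketch below tests whether
a write covers only whole physical blocks (so the drive has no reason
to read-modify-write) and pads a journal record up to that unit. Both
helper names are hypothetical, and the sketch assumes the start of the
device is aligned to a physical block boundary.

```python
def avoids_rmw(offset, length, physical_block_size):
    """True if the byte range [offset, offset + length) covers only
    whole physical blocks."""
    return (length > 0
            and offset % physical_block_size == 0
            and length % physical_block_size == 0)


def pad_to_physical(length, physical_block_size):
    """Round a record length up to a multiple of the physical block size."""
    return -(-length // physical_block_size) * physical_block_size
```

A single 512-byte logical block written at offset 512 on a drive with
4 KiB physical blocks fails the check, while the same record padded
with pad_to_physical and written at a 4 KiB boundary passes it.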
--
Martin K. Petersen Oracle Linux Engineering