linux-kernel - Request for clarification on disk i/o limitations

Message-ID: <f4e013870704261701j3833369dv828fd9aaccb10c23@mail.gmail.com>
Date:	Thu, 26 Apr 2007 17:01:19 -0700
From:	"Mark Hull-Richter" <mhullrich@...il.com>
To:	linux-kernel@...r.kernel.org
Subject: Request for clarification on disk i/o limitations

I'm studying the behavior of the page cache vs. direct i/o in the
kernel (so far up to 2.6.9-42.0.10 for CentOS 4.4, moving to 2.6.18
for CentOS 5 soon).

I have found that large I/O requests get broken up differently between
direct i/o and cached i/o (not surprising, but it's the way they're
broken up that puzzles me).

While both direct and cached i/o break the i/o requests down to page
sized chunks, these are subsequently merged back together until a
limit is reached.  For direct i/o on my SCSI disks, that limit is 88
segments, or 704 sectors (352k of a 512k block), so when direct i/os
of 512k chunks are performed via dd, each 512k i/o usually gets broken
into one 704 sector chunk and one 320 sector chunk.
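
The arithmetic behind that split can be checked with a short sketch.
It is a toy model, not kernel code: it assumes 4 KiB pages, 512-byte
sectors, and exactly one page per segment (which is what 88 segments =
704 sectors implies). The function name split_direct_io is made up for
illustration.

```python
SECTOR_BYTES = 512    # assumed sector size
PAGE_BYTES = 4096     # assumed page size (4 KiB)
MAX_SEGMENTS = 88     # segment limit reported in the message


def split_direct_io(request_bytes, max_segments=MAX_SEGMENTS,
                    page_bytes=PAGE_BYTES, sector_bytes=SECTOR_BYTES):
    """Split one request into chunk sizes (in sectors), assuming each
    segment holds exactly one page and a chunk holds at most
    max_segments segments."""
    sectors_per_page = page_bytes // sector_bytes      # 8
    limit_sectors = max_segments * sectors_per_page    # 88 * 8 = 704
    remaining = request_bytes // sector_bytes
    chunks = []
    while remaining > 0:
        chunk = min(remaining, limit_sectors)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


print(split_direct_io(512 * 1024))  # [704, 320]
```

A 512k request is 1024 sectors, so under these assumptions the first
chunk fills all 88 segments (704 sectors, 352k) and the remaining 320
sectors (160k) go out as a second chunk.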

However, when this same kind of i/o is done through the page cache,
the pages also get chunked together, and each 704 sector chunk limits
the size of an i/o sent to the scheduler, but this time, the 704
sector chunks typically get merged into a 1408 sector chunk because
the limiting factor here is a maximum sectors per i/o size of 2048.
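
One way to see why 1408 falls out of those two numbers is the sketch
below. It is again only a model of the behavior described above, not
the kernel's merge logic: it assumes chunks arrive back to back and
are merged greedily as long as the total stays within the 2048 sector
limit. The name merge_chunks is hypothetical.

```python
def merge_chunks(chunks, max_sectors=2048):
    """Greedily merge adjacent chunk sizes (in sectors) while the
    merged total stays at or below max_sectors."""
    merged = []
    current = 0
    for size in chunks:
        # Start a new request if adding this chunk would exceed the cap.
        if current and current + size > max_sectors:
            merged.append(current)
            current = 0
        current += size
    if current:
        merged.append(current)
    return merged


print(merge_chunks([704, 704, 704, 704]))  # [1408, 1408]
```

Two 704 sector chunks fit under the cap (1408), but a third would
bring the total to 2112, which exceeds 2048, so in this model the
merge stops at 1408.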

My question is: what is the relationship between the hardware segment
limit (enforced in ll_new_segment and __bio_add_page) of 88 segments,
one page each, per request and the hardware maximum sectors per i/o of
2048 (enforced by ll_back_merge, and also ll_front_merge, though I
haven't seen that one happen yet)?  Where do these limits come from
and why don't they match?

(The other subject of my study is why EMC NAS devices take at least 2x
as long as SCSI "DAS" disks, and why paged i/o appears to be no more
than 1/2 the speed of direct i/o on both types of disk, specifically
for large i/os, i.e. chunks larger than 256k.)

Please reply with CC to me directly as I am still not on the list (but
I will be soon.)

Thanks.

-- 
Mark Hull-Richter, Linux Kernel Engineer
DATAllegro (www.datallegro.com)
85 Enterprise, Second Floor, Aliso Viejo, CA  92656
949-680-3082 - Office     949-330-7691 - fax