Date:	Thu, 9 Jun 2011 09:09:08 -0400
From:	Christoph Hellwig <hch@...radead.org>
To:	Vivek Goyal <vgoyal@...hat.com>
Cc:	Christoph Hellwig <hch@...radead.org>, Ted Ts'o <tytso@....edu>,
	Dave Chinner <david@...morbit.com>, linux-ext4@...r.kernel.org
Subject: Re: Query about DIO/AIO WRITE throttling and ext4 serialization

On Thu, Jun 02, 2011 at 09:33:45PM -0400, Vivek Goyal wrote:
> > Yes this patch helps. I have already laid out the file and doing
> > overwrites.
> > 
> > I throttled aio-stress in one cgroup to 1 byte/sec and edited another
> > file from other cgroup and did a "sync" and it completed.
> 
> Even other test where I am running aio-stress in one window and edited
> a file in another window and typed "sync" worked. "sync" does not hang
> waiting for aio-stress to finish.

I've been thinking about the patch a bit more, and I think it's simply
incorrect.  i_iocount is the only thing that actually tracks in-flight
DIO/AIO requests, so we can't actually skip incrementing it as that
means we can't wait for pending AIO in fsync/sync/inode reclaim or
remount r/o.
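
To make the problem concrete, here is a minimal user-space sketch, not the
actual kernel code. The names toy_inode, toy_submit, toy_complete and
toy_must_wait are hypothetical, and iocount only models i_iocount. The
"tracked" flag stands in for whether the submission path bumps the counter;
passing false models a change that skips the increment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-inode state; not the real kernel struct. */
struct toy_inode {
	int iocount;	/* models i_iocount: in-flight DIO/AIO requests */
};

/* Submission path. tracked == false models skipping the increment. */
static void toy_submit(struct toy_inode *ip, bool tracked)
{
	if (tracked)
		ip->iocount++;
}

/* Completion path. Only tracked requests drop the counter. */
static void toy_complete(struct toy_inode *ip, bool tracked)
{
	if (tracked) {
		assert(ip->iocount > 0);
		ip->iocount--;
	}
}

/* The question fsync/sync/inode reclaim/remount r/o has to ask. */
static bool toy_must_wait(const struct toy_inode *ip)
{
	return ip->iocount > 0;
}
```

With a tracked request, toy_must_wait() is true until toy_complete() runs.
With an untracked one it is false even though the request is still pending,
so nothing that relies on the counter can wait for it.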

We could simply declare AIO off limits for sync and skip it there,
which is easily doable, but we'd still need a special-case version of
sync for remount r/o, as that absolutely needs to stop all pending I/O.

Among the other filesystems, ext4 also has the counter, but only waits
for it during inode teardown. It uses a slightly different, but also
effective, scheme for fsync, and completely ignores sync and remount.

I couldn't find a similar scheme in the other filesystems supporting AIO,
but it might just be hidden a bit better.

I suspect we could optimize things by using the dual count and list
approach ext4 does - there is a counter for in-flight direct I/O, which
we only check for inode teardown and remount, as those need to stop
pending I/O, but sync and fsync can skip them as they only need to
flush pending I/O.  There is a list for the pending unwritten extent
conversions that only gets appended to when the actual I/O is done,
and the unwritten extent conversion is queued up. 
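
A rough user-space sketch of the dual count and list idea described above,
under stated assumptions: every name (sketch_inode, sketch_conv,
sketch_dio_submit, sketch_dio_complete, sketch_flush_conversions,
sketch_dio_idle) is hypothetical, there is no locking, and "converting" an
extent is reduced to setting a flag. It is not ext4's implementation, only
an illustration of which path looks at the counter and which at the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One pending unwritten extent conversion (hypothetical). */
struct sketch_conv {
	struct sketch_conv *next;
	bool done;
};

struct sketch_inode {
	int dio_inflight;		/* direct I/O submitted, not yet completed */
	struct sketch_conv *conv_head;	/* conversions queued after I/O is done */
};

static void sketch_dio_submit(struct sketch_inode *ip)
{
	ip->dio_inflight++;
}

/*
 * I/O completion: drop the counter and, if the write needs an unwritten
 * extent conversion, append it to the list only now.
 */
static void sketch_dio_complete(struct sketch_inode *ip,
				struct sketch_conv *conv)
{
	assert(ip->dio_inflight > 0);
	ip->dio_inflight--;
	if (conv) {
		conv->done = false;
		conv->next = ip->conv_head;
		ip->conv_head = conv;
	}
}

/* fsync/sync path: process the list only, never look at the counter. */
static int sketch_flush_conversions(struct sketch_inode *ip)
{
	int n = 0;

	while (ip->conv_head) {
		struct sketch_conv *c = ip->conv_head;

		ip->conv_head = c->next;
		c->next = NULL;
		c->done = true;		/* stands in for the real conversion */
		n++;
	}
	return n;
}

/* Inode teardown / remount r/o path: all direct I/O must have stopped. */
static bool sketch_dio_idle(const struct sketch_inode *ip)
{
	return ip->dio_inflight == 0;
}
```

In this model a sync can return after sketch_flush_conversions() even while
dio_inflight is nonzero, whereas teardown or remount r/o would have to wait
until sketch_dio_idle() holds.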

I'll see if I can come up with a good scheme for that, preferably
sitting directly in the direct I/O code, so that everyone gets it
without additional work.