Message-Id: <20100301152149.7ce78e14.akpm@linux-foundation.org>
Date:	Mon, 1 Mar 2010 15:21:49 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Dmitry Monakhov <dmonakhov@...nvz.org>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [patch] RFC directio: partial writes support

On Thu, 25 Feb 2010 15:45:58 +0300
Dmitry Monakhov <dmonakhov@...nvz.org> wrote:

> Can someone please explain to me why directio denies partial writes?
> For example, if someone tries to write 100MB but the file system has less
> free space, it hits ENOSPC in the middle of block allocation.
> All allocated blocks will be truncated (possibly 100MB - 4k) and
> ENOSPC will be returned. As far as I remember, direct_io has always acted
> like this, but I never asked why.
> Why do we have to give up all the progress we made?
> In fact, partial writes are possible in the case of holes, when we
> fall back to buffered write. XFS implemented partial writes.

The problem with direct-io writes is that the writes don't necessarily
complete in file-offset-ascending order.  So if we've issued 50 write
BIOs and then hit an EIO on a BIO then we could have a hunk of
unwritten data with newly-written data either side of it.  If we get a
bunch of discontiguous EIO BIOs coming in then the problem gets even
messier - we have a span of disk which has a random mix of
correctly-written and not-correctly-written runs of sectors.  What do
we do with that?

The code _could_ perhaps go back and crawl through the request and
identify the number of successfully-written bytes between
start-of-request and first-EIO and then return that.  But we didn't
bother.
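
A rough userspace sketch of what such a crawl might look like, purely
for illustration.  It is not kernel code: struct dio_seg and
dio_written_prefix() are hypothetical names, and it assumes each
completed BIO has been recorded as an (offset, length, error) triple.
Because completions can arrive out of order, it sorts by file offset
first and then stops at the first failed or missing segment.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record of one completed write BIO; not a kernel structure. */
struct dio_seg {
	long long offset;	/* file offset where the segment starts */
	size_t len;		/* number of bytes the segment covers */
	int error;		/* 0 on success, negative errno (e.g. -EIO) on failure */
};

/* Order segments by ascending file offset. */
static int dio_seg_cmp(const void *a, const void *b)
{
	const struct dio_seg *x = a;
	const struct dio_seg *y = b;

	return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * Return the number of bytes that were written successfully and
 * contiguously, counting from the start of the request.  Anything
 * past the first failed (or missing) segment is ignored, even if it
 * completed without error.  Sorts segs[] in place.
 */
size_t dio_written_prefix(long long start, struct dio_seg *segs, size_t n)
{
	long long pos = start;
	size_t i;

	assert(segs != NULL || n == 0);

	if (n > 1)
		qsort(segs, n, sizeof(*segs), dio_seg_cmp);

	for (i = 0; i < n; i++) {
		/* Stop at the first error or at a gap in coverage. */
		if (segs[i].error != 0 || segs[i].offset != pos)
			break;
		pos += (long long)segs[i].len;
	}

	return (size_t)(pos - start);
}
```

With four 4k segments at offsets 0, 4096, 8192 and 12288, where only
the one at 4096 failed with -EIO, this reports 4096 bytes written:
the two later segments succeeded but sit beyond the failure, so they
cannot be counted in a short-write return value.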


ENOSPC errors are handled via the same code path and hence got
deoptimised due to this EIO handling.  We could perhaps improve the
ENOSPC handling along the lines you propose, as long as we
appropriately take care of EIO considerations.  Which, afaict, your
patch didn't do.

The presence of the opt-in DIO_PARTIAL_WRITE thing is rather unfortunate -
it would be better to make this change for all filesystems in one hit. 
But I guess DIO_PARTIAL_WRITE permits us to migrate filesystems
one-at-a-time as testing permits.  But the aim should be to remove
DIO_PARTIAL_WRITE altogether once all the conversion and testing is
completed.
