lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z43bDYQ6PwklKgJ3@dread.disaster.area>
Date: Mon, 20 Jan 2025 16:11:41 +1100
From: Dave Chinner <david@...morbit.com>
To: Amir Goldstein <amir73il@...il.com>
Cc: Christoph Hellwig <hch@...radead.org>,
	Brian Foster <bfoster@...hat.com>,
	"Darrick J. Wong" <djwong@...nel.org>,
	Chi Zhiling <chizhiling@....com>, cem@...nel.org,
	linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
	Chi Zhiling <chizhiling@...inos.cn>,
	John Garry <john.g.garry@...cle.com>
Subject: Re: [PATCH] xfs: Remove i_rwsem lock in buffered read

On Sat, Jan 18, 2025 at 02:03:41PM +0100, Amir Goldstein wrote:
> On Fri, Jan 17, 2025 at 11:19 PM Dave Chinner
> <david@...morbit.com> wrote:
> > On Fri, Jan 17, 2025 at 02:27:46PM +0100, Amir Goldstein wrote:
> > > On Wed, Jan 15, 2025 at 10:41 PM Dave Chinner
> > > <david@...morbit.com> wrote: For all practical purposes, we
> > > could maintain a counter in inode not for submitted DIO, but
> > > for files opened O_DIRECT.
> > >
> > > The first open O_DIRECT could serialize with in-flight
> > > buffered writes holding a shared iolock and then buffered
> > > writes could take SH vs. EX iolock depending on folio state
> > > and on i_dio_open_count.
> >
> > I don't see how this can be made to work w.r.t. sane data
> > coherency behaviour. e.g. how does this model serialise a new
> > DIO write that is submitted after a share-locked buffered write
> > has just started and passed all the "i_dio_count == 0" checks
> > that enable it to use shared locking? i.e. we now have
> > potentially overlapping buffered writes and DIO writes being
> > done concurrently because the buffered write may not have
> > instantiated folios in the page cache yet....
> >
> 
> A shared-locked buffered write can only start when there is no
> O_DIRECT file open on the inode.  The first open for O_DIRECT to
> increment i_dio_open_count needs to take exclusive iolock to wait
> for all in-flight buffered writes then release the iolock.
> 
> All DIO submitted via O_DIRECT fds will be safe against in-flight
> share-locked buffered write.

Hooking ->open() and ->release() isn't sufficient. O_DIRECT can be
changed at any time via fcntl(F_SETFL) which isn't synchronised
against IO in progress at all. i.e. we can't track that, nor can we
even use f_ops->check_flags to reject changes to O_DIRECT state
because it doesn't pass either the filp or the inode....

IOWs, as it stands logic like "is the file opened for O_DIRECT" in
the IO path is inherently racy and the racing mechanisms are
directly under the control of unprivileged userspace applications.
Without mods from the VFS down and new hooks being implemented in
XFS along with all the whacky "which serialisiation method do we
use" logic and dealing with the complexities and switching
between them, tracking struct files that are open for O_DIRECT isn't
really a viable mechanism for enabling shared buffered writes...

-Dave.
-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ