lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 22 Oct 2019 09:38:19 +1100
From:   Dave Chinner <david@...morbit.com>
To:     Jan Kara <jack@...e.cz>
Cc:     "Theodore Y. Ts'o" <tytso@....edu>,
        Matthew Bobrowski <mbobrowski@...browski.org>,
        adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, hch@...radead.org,
        darrick.wong@...cle.com
Subject: Re: [PATCH v5 00/12] ext4: port direct I/O to iomap infrastructure

On Mon, Oct 21, 2019 at 09:43:30PM +0200, Jan Kara wrote:
> On Mon 21-10-19 09:31:12, Theodore Y. Ts'o wrote:
> > Hi Matthew, thanks for your work on this patch series!
> > 
> > I applied it against 4c3, and ran a quick test run on it, and found
> > the following locking problem.  To reproduce:
> > 
> > kvm-xfstests -c nojournal generic/113
> > 
> > generic/113		[09:27:19][    5.841937] run fstests generic/113 at 2019-10-21 09:27:19
> > [    7.959477] 
> > [    7.959798] ============================================
> > [    7.960518] WARNING: possible recursive locking detected
> > [    7.961225] 5.4.0-rc3-xfstests-00012-g7fe6ea084e48 #1238 Not tainted
> > [    7.961991] --------------------------------------------
> > [    7.962569] aio-stress/1516 is trying to acquire lock:
> > [    7.963129] ffff9fd4791148c8 (&sb->s_type->i_mutex_key#12){++++}, at: __generic_file_fsync+0x3e/0xb0
> > [    7.964109] 
> > [    7.964109] but task is already holding lock:
> > [    7.964740] ffff9fd4791148c8 (&sb->s_type->i_mutex_key#12){++++}, at: ext4_dio_write_iter+0x15b/0x430
> 
> This is going to be a tricky one. With iomap, the inode locking is handled
> by the filesystem while calling generic_write_sync() is done by
> iomap_dio_rw(). I would really prefer to avoid tweaking iomap_dio_rw() not
> to call generic_write_sync().

You can't remove it from there, because that will break O_DSYNC
AIO+DIO. i.e. generic_write_sync() needs to be called before
iocb->ki_complete() is called in the AIO completion path, and that
means filesystems using iomap_dio_rw need to be be able to run
generic_write_sync() without taking the inode_lock().

> So we need to remove inode_lock from
> __generic_file_fsync() (used from ext4_sync_file()). This locking is mostly
> for legacy purposes and we don't need this in ext4 AFAICT - but removing
> the lock from __generic_file_fsync() would mean auditing all legacy
> filesystems that use this to make sure flushing inode & its metadata buffer
> list while it is possibly changing cannot result in something unexpected. I
> don't want to clutter this series with it so we are left with
> reimplementing __generic_file_fsync() inside ext4 without inode_lock. Not
> too bad but not great either. Thoughts?

XFS has implemented it's own ->fsync operation pretty much forever
without issues. It's basically:

	1. flush dirty cached data w/ WB_SYNC_ALL
	2. flush dirty cached metadata (i.e. journal force)
	3. flush device caches if journal force didn't, keeping in
	mind the requirements of data and journal being placed on
	different devices.

The ext4 variant shouldn't need to be any more complex than that...

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ