lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <02d18236-26ef-09b0-90ad-030c4fe3ee20@linux.alibaba.com>
Date:   Tue, 19 Sep 2023 14:00:04 +0800
From:   Gao Xiang <hsiangkao@...ux.alibaba.com>
To:     linux-ext4@...r.kernel.org
Cc:     Theodore Ts'o <tytso@....edu>, Jan Kara <jack@...e.cz>,
        Matthew Bobrowski <mbobrowski@...browski.org>,
        Christoph Hellwig <hch@....de>,
        Joseph Qi <joseph.qi@...ux.alibaba.com>
Subject: [bug report] ext4 misses final i_size meta sync under O_DIRECT |
 O_SYNC semantics after iomap DIO conversion

Hi folks,

Our consumer reports a behavior change between pre-iomap and iomap
direct io conversion:

If the system crashes after an appending write to a file open with
O_DIRECT | O_SYNC flag set, file i_size won't be updated even if
O_SYNC was marked before.

It can be reproduced by a test program in the attachment with
gcc -o repro repro.c && ./repro testfile && echo c > /proc/sysrq-trigger

After some analysis, we found that before iomap direct I/O conversion,
the timing was roughly (taking Linux 3.10 codebase as an example):

	..
	- ext4_file_dio_write
	  - __generic_file_aio_write
	      ..
	    - ext4_direct_IO  # generic_file_direct_write
	      - ext4_ext_direct_IO
	        - ext4_ind_direct_IO  # final_size > inode->i_size
	          - ..
	          - ret = blockdev_direct_IO()
	          - i_size_write(inode, end) # orphan && ret > 0 &&
	                                   # end > inode->i_size
	          - ext4_mark_inode_dirty()
	          - ...
	  - generic_write_sync  # handling O_SYNC

So the dirty inode meta will be committed into journal immediately
if O_SYNC is set.  However, After commit 569342dc2485 ("ext4: move
inode extension/truncate code out from ->iomap_end() callback"),
the new behavior seems as below:

	..
	- ext4_dio_write_iter
	  - ext4_dio_write_checks  # extend = 1
	  - iomap_dio_rw
	      - __iomap_dio_rw
	      - iomap_dio_complete
	        - generic_write_sync
	  - ext4_handle_inode_extension  # extend = 1

So that i_size will be recorded only after generic_write_sync() is
called.  So O_SYNC won't flush the update i_size to the disk.

On the other side, after a quick look of XFS side, it will record
i_size changes in xfs_dio_write_end_io() so it seems that it doesn't
have this problem.

Thanks,
Gao Xiang
View attachment "repro.c" of type "text/plain" (1282 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ