linux-kernel - Re: [PATCH v2 2/7] iomap: Add zero unwritten mappings dio support

Message-ID: <20250108174216.GJ1306365@frogsfrogsfrogs>
Date: Wed, 8 Jan 2025 09:42:16 -0800
From: "Darrick J. Wong" <djwong@...nel.org>
To: John Garry <john.g.garry@...cle.com>
Cc: Christoph Hellwig <hch@....de>, brauner@...nel.org, cem@...nel.org,
	dchinner@...hat.com, ritesh.list@...il.com,
	linux-xfs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, martin.petersen@...cle.com
Subject: Re: [PATCH v2 2/7] iomap: Add zero unwritten mappings dio support

On Wed, Jan 08, 2025 at 11:39:35AM +0000, John Garry wrote:
> On 08/01/2025 01:26, Darrick J. Wong wrote:
> > > > I (vaguely) agree with that.
> > > > 
> > > > > And only if the file mapping is in the correct state, and the
> > > > > program is willing to *maintain* them in the correct state to get the
> > > > > better performance.
> > > > I kinda agree with that, but the maintain part is a bit hard as a
> > > > general rule of thumb, as file mappings can change behind the
> > > > application's back.  So building interfaces around the concept that there are
> > > > entirely stable mappings seems like a bad idea.
> > > I tend to agree.
> > As long as it's a general rule that file mappings can change even after
> > whatever prep work an application tries to do, we're never going to have
> > an easy time enabling any of these fancy direct-to-storage tricks like
> > cpu loads and stores to pmem, or this block-untorn writes stuff.
> > 
> > > > > I don't want xfs to grow code to write zeroes to
> > > > > mapped blocks just so it can then write-untorn to the same blocks.
> > > > Agreed.
> 
> Any other ideas on how to achieve this then?
> 
> There was the proposal to create a single bio covering mixed mappings, but
> then we had the issue that all the mappings cannot be atomically converted.
> I am not sure if this is really such an issue. I know that RWF_ATOMIC means
> all or nothing, but partially converted extents (from an atomic write) are a
> sort of grey area, as the original unmapped extents had nothing in the first
> place.

The long way -- introducing a file remap log intent item to guarantee
that the ioend processing completes no matter how mixed the mapping
might be.
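
To make "mixed mapping" concrete from the application side, here is a
minimal C sketch. It assumes the program uses the existing FIEMAP ioctl to
ask whether a byte range is backed entirely by written extents. The names
classify_extents() and query_range() are hypothetical, and the answer is
only a snapshot: as the quoted discussion points out, the mapping can
change right after the call returns.

```c
#include <assert.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdlib.h>
#include <sys/ioctl.h>

enum range_state { RANGE_ERROR = -1, RANGE_MIXED = 0, RANGE_WRITTEN = 1 };

/*
 * Pure helper: given extents sorted by logical offset (as FIEMAP returns
 * them), decide whether [off, off + len) is covered only by written
 * extents. Holes, unwritten, delalloc and unknown extents count as mixed.
 */
enum range_state classify_extents(const struct fiemap_extent *ext,
                                  unsigned int n, __u64 off, __u64 len)
{
    __u64 pos = off, end = off + len;

    assert(ext != NULL || n == 0);
    for (unsigned int i = 0; i < n && pos < end; i++) {
        if (ext[i].fe_logical + ext[i].fe_length <= pos)
            continue;                       /* extent lies before the range */
        if (ext[i].fe_logical > pos)
            return RANGE_MIXED;             /* hole in front of this extent */
        if (ext[i].fe_flags & (FIEMAP_EXTENT_UNWRITTEN |
                               FIEMAP_EXTENT_DELALLOC |
                               FIEMAP_EXTENT_UNKNOWN))
            return RANGE_MIXED;
        pos = ext[i].fe_logical + ext[i].fe_length;
    }
    return pos >= end ? RANGE_WRITTEN : RANGE_MIXED;
}

/* Hypothetical wrapper: fetch the extents for the range and classify them. */
enum range_state query_range(int fd, __u64 off, __u64 len)
{
    enum { MAX_EXT = 64 };
    enum range_state st = RANGE_ERROR;
    struct fiemap *fm = calloc(1, sizeof(*fm) +
                                  MAX_EXT * sizeof(struct fiemap_extent));

    if (!fm)
        return RANGE_ERROR;
    fm->fm_start = off;
    fm->fm_length = len;
    fm->fm_flags = FIEMAP_FLAG_SYNC;        /* flush dirty data first */
    fm->fm_extent_count = MAX_EXT;

    /* A full array may mean a truncated answer; this sketch gives up. */
    if (ioctl(fd, FS_IOC_FIEMAP, fm) == 0 && fm->fm_mapped_extents < MAX_EXT)
        st = classify_extents(fm->fm_extents, fm->fm_mapped_extents,
                              off, len);
    free(fm);
    return st;
}
```

A RANGE_WRITTEN result is advisory at best, which is the point made in the
quoted text: nothing stops the filesystem from remapping the range between
this check and the write.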

> > > > 
> > > So if we want to allow large writes over mixed extents, how to handle?
> > > 
> > > Note that some time ago we also discussed that we don't want to have a
> > > single bio covering mixed extents as we cannot atomically convert all
> > > unwritten extents to mapped.
> > From https://lore.kernel.org/linux-xfs/Z3wbqlfoZjisbe1x@infradead.org/ :
> > 
> > "I think we should wire it up as a new FALLOC_FL_WRITE_ZEROES mode,
> > document very vigorously that it exists to facilitate pure overwrites
> > (specifically that it returns EOPNOTSUPP for always-cow files), and not
> > add more ioctls."
> > 
> > If we added this new fallocate mode to set up written mappings, would it
> > be enough to write in the programming manuals that applications should
> > use it to prepare a file for block-untorn writes?
> 
> Sure, that API extension could be useful in the case that we conclude that
> we don't permit atomic writes over mixed mappings.
> 
> > Perhaps we should
> > change the errno code to EMEDIUMTYPE for the mixed mappings case.
> > 
> > Alternately, maybe we /should/ let programs open a lease-fd on a file
> > range, do their untorn writes through the lease fd, and if another
> > thread does something to break the lease, then the lease fd returns EIO
> > until you close it.
> 
> So does that mean applications own specific ranges in files for exclusive atomic
> writes? Wouldn't that break what we already support today?

The application would own a lease on a specific range, but it could pass
that fd around.  Also you wouldn't need a lease for a single-fsblock
untorn write.

--D
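
The "prepare the file, then do untorn writes" flow discussed in the quoted
text could look roughly like the C sketch below. Assumptions, stated
loudly: prepare_written_range() and untorn_pwrite() are hypothetical
helper names; the fallocate mode is passed in by the caller because
FALLOC_FL_WRITE_ZEROES is only a proposal in this thread; and the fallback
of writing zeroes from userspace is one possible way to end up with
written mappings, not something the thread prescribes.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* RWF_ATOMIC is in the Linux 6.11+ UAPI headers; older libcs may lack it. */
#ifndef RWF_ATOMIC
#define RWF_ATOMIC 0x00000040
#endif

/*
 * Zero [off, off + len) and try to leave it backed by written mappings.
 * zero_mode is the fallocate() mode to try first (an assumption supplied
 * by the caller); pass 0 to skip fallocate() and go straight to the
 * userspace fallback. WARNING: destroys any data already in the range.
 * Returns 0 on success, -1 with errno set on failure.
 */
int prepare_written_range(int fd, off_t off, off_t len, int zero_mode)
{
    enum { CHUNK = 1 << 16 };
    char *zeroes;
    off_t done = 0;

    assert(off >= 0 && len > 0);

    if (zero_mode != 0) {
        if (fallocate(fd, zero_mode, off, len) == 0)
            return 0;
        if (errno != EOPNOTSUPP && errno != EINVAL)
            return -1;          /* real failure, not "mode unsupported" */
    }

    /* Fallback: write real zeroes from userspace, then flush them. */
    zeroes = calloc(1, CHUNK);
    if (!zeroes)
        return -1;
    while (done < len) {
        size_t want = (len - done) < CHUNK ? (size_t)(len - done) : CHUNK;
        ssize_t n = pwrite(fd, zeroes, want, off + done);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            int saved = (n == 0) ? EIO : errno;
            free(zeroes);
            errno = saved;
            return -1;
        }
        done += n;
    }
    free(zeroes);
    return fsync(fd);
}

/*
 * Issue one untorn write. The fd is expected to be opened with O_DIRECT
 * and buf/len/off must meet the atomic write limits reported by statx();
 * otherwise the kernel rejects the request.
 */
ssize_t untorn_pwrite(int fd, const void *buf, size_t len, off_t off)
{
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };

    return pwritev2(fd, &iov, 1, off, RWF_ATOMIC);
}
```

Even after a successful preparation step, the write can still fail if the
mapping changed in the meantime, so a caller would need to handle errors
from untorn_pwrite() rather than assume the range stayed written.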

> Cheers,
> John