Message-Id: <7963aad6-203c-4da4-ba9f-cf716d350121@www.fastmail.com>
Date: Mon, 23 May 2022 06:40:30 -0400
From: "Colin Walters" <walters@...bum.org>
To: "Javier Martinez Canillas" <javierm@...hat.com>,
linux-kernel@...r.kernel.org
Cc: "Peter Jones" <pjones@...hat.com>,
"Alexander Larsson" <alexl@...hat.com>,
"Alberto Ruiz" <aruiz@...hat.com>,
"Christian Kellner" <ckellner@...hat.com>,
"Lennart Poettering" <lennart@...ttering.net>,
"Chung-Chiang Cheng" <cccheng@...ology.com>,
"OGAWA Hirofumi" <hirofumi@...l.parknet.co.jp>
Subject: Re: [RFC PATCH 2/3] fat: add renameat2 RENAME_EXCHANGE flag support
On Thu, May 19, 2022, at 5:23 AM, Javier Martinez Canillas wrote:
> The renameat2 RENAME_EXCHANGE flag allows to atomically exchange two paths
> but is currently not supported by the Linux vfat filesystem driver.
>
> Add a vfat_rename_exchange() helper function that implements this support.
>
> The super block lock is acquired during the operation to ensure atomicity,
> and in the error path actions made are reversed also with the mutex held,
> making the whole operation transactional.
Transactional with respect to the running kernel, but AIUI, because vfat has no journaling, the semantics on hard failure (e.g. power loss) are... unspecified? Is it possible, for example, that we could see no file at all at the destination path?
This relates to https://github.com/ostreedev/ostree/issues/1951
TL;DR: I'd been thinking that to make things maximally robust we need to:
1. Write new desired bootloader config
2. fsync it
3. fsync containing directory (I guess for vfat really, syncfs())
4. remove old config, syncfs()
And here the bootloader would know to prefer the "new" file if it exists, and to delete the old one if it's still present on the next boot.
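A rough sketch of steps 1-4 plus the bootloader-side preference rule, in Python for brevity. All function and file names here are made up for illustration. As far as I know Python's os module has no syncfs() wrapper, so this uses fsync() on the directory fd instead, which is an assumption and may not be enough on vfat:

```python
import os


def sync_dir(directory):
    """fsync the directory itself so the entry changes are written out."""
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def install_config(directory, new_name, old_name, data):
    """Hypothetical helper following steps 1-4 above."""
    new_path = os.path.join(directory, new_name)
    old_path = os.path.join(directory, old_name)

    # 1. Write the new desired bootloader config under its own name.
    with open(new_path, "wb") as f:
        f.write(data)
        f.flush()
        # 2. fsync it.
        os.fsync(f.fileno())

    # 3. fsync the containing directory (stand-in for syncfs()).
    sync_dir(directory)

    # 4. Remove the old config, then sync again.
    try:
        os.unlink(old_path)
    except FileNotFoundError:
        pass
    sync_dir(directory)


def choose_config(directory, new_name, old_name):
    """Boot-time rule: prefer "new" if present; report a leftover "old".

    Returns (name_to_boot, stale_name_to_delete); either may be None.
    """
    names = set(os.listdir(directory))
    if new_name in names:
        return new_name, (old_name if old_name in names else None)
    if old_name in names:
        return old_name, None
    return None, None
```

If we crash between steps 3 and 4, both files exist; choose_config() then picks the new one and flags the old one for deletion on the next boot.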
(Now obviously this is a small patch which will surely be generally useful. E.g. for tools that operate on things like mounted USB sticks, being able to do an atomic exchange, at least from the running kernel's PoV, is just as useful as it is on other "regular" (and journaled) mounted filesystems.)
So assuming we have this, I guess the flow could be:
1. rename_exchange(old, new)
2. syncfs()
But that's assuming that the implementation of this doesn't, e.g., have any "holes" where in theory we could flush an intermediate state to disk.
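For concreteness, the two-step flow from userspace might look roughly like the sketch below. It calls glibc's renameat2() and syncfs() through ctypes; the helper names are hypothetical, and it assumes a Linux system whose libc exports both symbols (glibc 2.28 or later for renameat2()):

```python
import ctypes
import os

AT_FDCWD = -100            # "relative to cwd" sentinel from <fcntl.h>
RENAME_EXCHANGE = 1 << 1   # flag value from <linux/fs.h>

_libc = ctypes.CDLL(None, use_errno=True)


def rename_exchange(old, new):
    """Atomically swap two paths (from the running kernel's point of view)."""
    fn = _libc.renameat2
    fn.argtypes = [ctypes.c_int, ctypes.c_char_p,
                   ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    fn.restype = ctypes.c_int
    if fn(AT_FDCWD, os.fsencode(old),
          AT_FDCWD, os.fsencode(new), RENAME_EXCHANGE) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), old, None, new)


def syncfs_for(path):
    """Flush the whole filesystem that contains `path`."""
    fn = _libc.syncfs
    fn.argtypes = [ctypes.c_int]
    fn.restype = ctypes.c_int
    fd = os.open(path, os.O_RDONLY)
    try:
        if fn(fd) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err), path)
    finally:
        os.close(fd)


def exchange_and_sync(old, new):
    # 1. rename_exchange(old, new)
    rename_exchange(old, new)
    # 2. syncfs()
    syncfs_for(os.path.dirname(os.path.abspath(new)))
```

On a filesystem without RENAME_EXCHANGE support (vfat today), step 1 should fail with EINVAL. Note this sketch says nothing about the open question: whether the driver can flush an intermediate state between the two directory entry updates.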