[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1484d32b-ab0f-48ff-998a-62feada58300@app.fastmail.com>
Date: Thu, 12 Sep 2024 10:01:59 +0000
From: "Arnd Bergmann" <arnd@...nel.org>
To: "Jeff Layton" <jlayton@...nel.org>, "John Stultz" <jstultz@...gle.com>
Cc: "Alexander Viro" <viro@...iv.linux.org.uk>,
"Christian Brauner" <brauner@...nel.org>, "Jan Kara" <jack@...e.cz>,
"Thomas Gleixner" <tglx@...utronix.de>, "Stephen Boyd" <sboyd@...nel.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
"kernel test robot" <oliver.sang@...el.com>
Subject: Re: [PATCH] timekeeping: move multigrain ctime floor handling into timekeeper
On Wed, Sep 11, 2024, at 20:43, Jeff Layton wrote:
>
> I think we'd have to track this delta as an atomic value and cmpxchg
> new values into place. The zeroing seems quite tricky to make race-
> free.
>
> Currently, we fetch the floor value early in the process and if it
> changes before we can swap a new one into place, we just take whatever
> the new value is (since it's just as good). Since these are monotonic
> values, any new value is still newer than the original one, so its
> fine. I'm not sure that still works if we're dealing with a delta that
> is siding upward and downward.
>
> Maybe it does though. I'll take a stab at this tomorrow and see how it
> looks.
Right, the only idea I had for this would be to atomically
update a 64-bit tuple of the 32-bit sequence count and the
32-bit delta value in the timerkeeper. That way I think the
"coarse" reader would still get a correct value when running
concurrently with both a fine-grained reader updating the count
and the timer tick setting a new count.
There are still a couple of problems:
- this extends the timekeeper logic beyond what the seqlock
semantics normally allow, and I can't prove that this actually
works in all corner cases.
- if the delta doesn't fit in a 32-bit value, there has to
be another fallback mechanism.
- This still requires an atomic64_cmpxchg() in the
fine-grained ktime_get_real_ts64() replacement, which
I think is what inode_set_ctime_current() needs today
as well to ensure that the next coarse value is the
highest one that has been read so far.
There is another idea that would completely replace
your design with something /much/ simpler:
- add a variant of ktime_get_real_ts64() that just
sets a flag in the timekeeper to signify that a
fine-grained time has been read since the last
timer tick
- add a variant of ktime_get_coarse_real_ts64()
that returns either tk_xtime() if the flag is
clear or calls ktime_get_real_ts64() if it's set
- reset the flag in timekeeping_advance() and any other
place that updates tk_xtime
That way you avoid the atomic64_try_cmpxchg()
inode_set_ctime_current(), making that case faster,
and avoid all overhead in coarse_ctime() unless you
use both types during the same tick.
Arnd
Powered by blists - more mailing lists