[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240819-bestbezahlt-galaabend-36a83208e172@brauner>
Date: Mon, 19 Aug 2024 10:16:34 +0200
From: Christian Brauner <brauner@...nel.org>
To: NeilBrown <neilb@...e.de>, Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>, linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 0/9 RFC] Make wake_up_{bit,var} less fragile
On Mon, Aug 19, 2024 at 03:20:34PM GMT, NeilBrown wrote:
> I wasn't really sure who to send this too, and get_maintainer.pl
> suggested 132 addresses which seemed excessive. So I've focussed on
> 'sched' maintainers. I'll probably submit individual patches to
> relevant maintainers/lists if I get positive feedback at this level.
>
> This series was motivated by
>
> Commit ed0172af5d6f ("SUNRPC: Fix a race to wake a sync task")
>
> which adds smp_mb__after_atomic(). I thought "any API that requires that
> sort of thing needs to be fixed".
>
> The main patches here are 7 and 8 which revise wake_up_bit and
> wake_up_var respectively. They result in 3 interfaces:
> wake_up_{bit,var} includes smp_mb__after_atomic()
> wake_up_{bit,var}_relaxed() doesn't have a barrier
> wake_up_{bit,var}_mb() includes smb_mb().
It's great that this api is brought up because it gives me a reason to
ask a stupid question I've had for a while.
I want to change the i_state member in struct inode from an unsigned
long to a u32 because really we're wasting 4 bytes on 64 bit that we're
never going to use given how little I_* flags we actually have and I
dislike that we use that vacuous type in a bunch of our structures for
that reason.
(Together with another 4 byte shrinkage we would get a whopping 8 bytes
back.)
The problem is that we currently use wait_on_bit() and wake_up_bit() in
various places on i_state and all of these functions require an unsigned
long (probably because some architectures only handle atomic ops on
unsigned long).
I have an experimental patch converting all of that from wait_on_bit()
to wait_var_event() but I was hesitant about it because iiuc the
semantics don't nicely translate into each other. Specifically, if some
code uses wait_on_bit(SOME_BIT) and someone calls
wake_up_bit(SOME_OTHER_BIT) then iiuc only SOME_OTHER_BIT waiters will
be woken. IOW, SOME_BIT waiters are unaffected.
But if this is switched to wait_var_event() and wake_up_var() and there
are two waiters e.g.,
W1: wait_var_event(inode->i_state, !(inode->i_state & SOME_BIT))
W2: wait_var_event(inode->i_state, !(inode->i_state & SOME_OTHER_BIT))
then a waker like
inode->i_state &= ~SOME_OTHER_BIT;
// insert appropriate barrier
wake_up_var(inode->i_state)
will cause both W1 and W2 to be woken but W1 is put back to sleep? Is
there a nicer way to do this? The only thing I want is a (pony) 32bit
bit-wait-like mechanism.
Powered by blists - more mailing lists