[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11d39248-31fc-c625-7c06-341f0146bd67@redhat.com>
Date: Wed, 2 Aug 2023 17:09:13 -0400
From: Waiman Long <longman@...hat.com>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: linux-bcachefs@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Boqun Feng <boqun.feng@...il.com>
Subject: Re: [PATCH 11/20] locking/osq: Export osq_(lock|unlock)
On 8/2/23 16:44, Kent Overstreet wrote:
> On Wed, Aug 02, 2023 at 04:16:12PM -0400, Waiman Long wrote:
>> On 7/12/23 17:11, Kent Overstreet wrote:
>>> These are used by bcachefs's six locks.
>>>
>>> Signed-off-by: Kent Overstreet <kent.overstreet@...ux.dev>
>>> Cc: Peter Zijlstra <peterz@...radead.org>
>>> Cc: Ingo Molnar <mingo@...hat.com>
>>> Cc: Waiman Long <longman@...hat.com>
>>> Cc: Boqun Feng <boqun.feng@...il.com>
>>> ---
>>> kernel/locking/osq_lock.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/kernel/locking/osq_lock.c b/kernel/locking/osq_lock.c
>>> index d5610ad52b..b752ec5cc6 100644
>>> --- a/kernel/locking/osq_lock.c
>>> +++ b/kernel/locking/osq_lock.c
>>> @@ -203,6 +203,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
>>> return false;
>>> }
>>> +EXPORT_SYMBOL_GPL(osq_lock);
>>> void osq_unlock(struct optimistic_spin_queue *lock)
>>> {
>>> @@ -230,3 +231,4 @@ void osq_unlock(struct optimistic_spin_queue *lock)
>>> if (next)
>>> WRITE_ONCE(next->locked, 1);
>>> }
>>> +EXPORT_SYMBOL_GPL(osq_unlock);
>> Have you considered extending the current rw_semaphore to support a SIX lock
>> semantics? There are a number of instances in the kernel that a up_read() is
>> followed by a down_write(). Basically, the code try to upgrade the lock from
>> read to write. I have been thinking about adding a upgrade_read() API to do
>> that. However, the concern that I had was that another writer may come in
>> and make modification before the reader can be upgraded to have exclusive
>> write access and will make the task to repeat what has been done in the read
>> lock part. By adding a read with intent to upgrade to write, we can have
>> that guarantee.
> It's been discussed, Linus had the same thought.
>
> But it'd be a massive change to the rw semaphore code; this "read with
> intent" really is a third lock state which needs all the same
> lock/trylock/unlock paths, and with the way rw semaphore has separate
> entry points for read and write it'd be a _ton_ of new code. It really
> touches everything - waitlist handling included.
Yes, it is a major change, but I had done that before and it is
certainly doable. There are spare bits in the low byte of rwsem->count
that can be used as an intent bit. We also need to add a new
rwsem_wake_type for that for waitlist handling.
>
> And six locks have several other features that bcachefs needs, and other
> users may also end up wanting, that rw semaphores don't have; the two
> main features being a percpu read lock mode and support for an external
> cycle detector (which requires exposing lock waitlists, with some
> guarantees about how those waitlists are used).
Can you provide more information about those features?
>
>> With that said, I would prefer to keep osq_{lock/unlock} for internal use by
>> some higher level locking primitives - mutex, rwsem and rt_mutex.
> Yeah, I'm aware, but it seems like exposing osq_(lock|unlock) is the
> most palatable solution for now. Long term, I'd like to get six locks
> promoted to kernel/locking.
Your SIX overlaps with rwsem in term of features. So we will have to
somehow merge them instead of having 2 APIs with somewhat similar
functionality.
Cheers,
Longman
>
Powered by blists - more mailing lists