Message-ID: <20231101202302.GB32034@redhat.com>
Date: Wed, 1 Nov 2023 21:23:03 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: David Howells <dhowells@...hat.com>
Cc: Marc Dionne <marc.dionne@...istor.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Chuck Lever <chuck.lever@...cle.com>,
linux-afs@...ts.infradead.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rxrpc_find_service_conn_rcu: use read_seqbegin() rather
than read_seqbegin_or_lock()
On 11/01, David Howells wrote:
>
> Oleg Nesterov <oleg@...hat.com> wrote:
>
> > read_seqbegin_or_lock() makes no sense unless you make "seq" odd
> > after the lockless access failed.
>
> I think you're wrong.
I think you missed the point ;)
> write_seqlock() turns it odd.
It changes seqcount_t->sequence but not "seq" so this doesn't matter.
> For instance, if the read lock is taken first:
>
> sequence  seq  CPU 1                                             CPU 2
> ========  ===  ================================================  ===============
> 0
> 0         0    seq = 0 MUST BE EVEN
This is correct,
> ACCORDING TO DOC
The documentation is wrong, please see
[PATCH 1/2] seqlock: fix the wrong read_seqbegin_or_lock/need_seqretry documentation
https://lore.kernel.org/all/20231024120808.GA15382@redhat.com/
> 0         0    read_seqbegin_or_lock() [lockless]
>                ...
> 1         0                                                      write_seqlock()
> 1         0    need_seqretry() [seq=even; sequence!=seq: retry]
Yes, if CPU_1 races with write_seqlock() need_seqretry() returns true,
> 1         1    read_seqbegin_or_lock() [exclusive]
No. "seq" is still even, so read_seqbegin_or_lock() won't do read_seqlock_excl(),
it will do
	seq = read_seqbegin(lock);
again.
> Note that it spins in __read_seqcount_begin() until we get an even seq,
> indicating that no write is currently in progress - at which point we can
> perform a lockless pass.
Exactly. And this means that "seq" is always even.
> > See thread_group_cputime() as an example, note that it does nextseq = 1 for
> > the 2nd round.
>
> That's not especially convincing.
See also the usage of read_seqbegin_or_lock() in fs/dcache.c and fs/d_path.c.
All other users are wrong.
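To illustrate what the nextseq = 1 trick buys, here is a minimal single-threaded userspace model. It is only a sketch: every model_* name is made up for this mail, the "writer" is simulated by bumping the counter from inside the read loop, and the even/odd dispatch is my reading of the helpers discussed here, not a copy of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded userspace model; hypothetical names, NOT kernel code. */
struct model_seqlock {
	unsigned int sequence;	/* bumped by 2 per completed write */
	bool locked;		/* a reader holds the lock exclusively */
	int excl_acquisitions;	/* times a reader fell back to locking */
};

/* A complete write; it cannot proceed while a reader holds the lock. */
static bool model_write(struct model_seqlock *sl)
{
	if (sl->locked)
		return false;
	sl->sequence += 2;
	return true;
}

/* Even *seq: lockless snapshot. Odd *seq: take the lock exclusively. */
static void model_read_seqbegin_or_lock(struct model_seqlock *sl, int *seq)
{
	if (!(*seq & 1)) {
		*seq = (int)sl->sequence;
	} else {
		sl->locked = true;
		sl->excl_acquisitions++;
	}
}

/* Only a lockless (even) pass can ask for a retry. */
static bool model_need_seqretry(const struct model_seqlock *sl, int seq)
{
	return !(seq & 1) && sl->sequence != (unsigned int)seq;
}

/* Drop the lock if the last pass was the exclusive (odd) one. */
static void model_done_seqretry(struct model_seqlock *sl, int seq)
{
	if (seq & 1)
		sl->locked = false;
}

/*
 * Reader in the thread_group_cputime() style: "seq" is reloaded from
 * "nextseq" on every pass, and nextseq is set to 1 after the first one.
 * A writer tries to interfere during the first `races` passes.
 * Returns the number of passes made.
 */
static int model_reader_nextseq(struct model_seqlock *sl, int races)
{
	int seq, nextseq = 0, passes = 0;

	do {
		seq = nextseq;
		model_read_seqbegin_or_lock(sl, &seq);
		passes++;
		if (races-- > 0)
			model_write(sl);	/* blocked once the lock is held */
		nextseq = 1;		/* force the locked path on a retry */
	} while (model_need_seqretry(sl, seq));
	model_done_seqretry(sl, seq);

	return passes;
}
```

In this model a reader that loses the first lockless pass takes the lock on the second pass and finishes there, however many writes are pending.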
Let's start from the very beginning. This code does
	int seq = 0;
	do {
		read_seqbegin_or_lock(service_conn_lock, &seq);
		do_something();
	} while (need_seqretry(service_conn_lock, seq));
	done_seqretry(service_conn_lock, seq);
Initially seq is even (it is zero), so read_seqbegin_or_lock(&seq) does
	*seq = read_seqbegin(lock);
and returns. Note that "seq" is still even.
Now. If need_seqretry(seq) detects the race with write_seqlock() it returns
true but it does NOT change this "seq", it is still even. So on the next
iteration read_seqbegin_or_lock() will do
	*seq = read_seqbegin(lock);
again, it won't take the lock exclusively. And again, seq will be even.
And so on.
And this means that the code above is equivalent to
	do {
		seq = read_seqbegin(service_conn_lock);
		do_something();
	} while (read_seqretry(service_conn_lock, seq));
and this is what this patch does.
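The claim can be checked with a small single-threaded userspace model. This is a sketch under stated assumptions: all model_* names are invented for illustration, a "racing writer" is simulated by bumping the counter from inside the loop body, and the model never observes an odd sequence, so the spinning in read_seqbegin() is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded userspace model; hypothetical names, NOT kernel code. */
struct model_seqlock {
	unsigned int sequence;	/* even whenever the reader looks at it */
	int excl_acquisitions;	/* times the reader would have taken the lock */
};

/* Stands in for a complete write_seqlock()/write_sequnlock() pair. */
static void model_write(struct model_seqlock *sl)
{
	sl->sequence += 2;
}

/* Even *seq: lockless snapshot. Odd *seq: would take the lock. */
static void model_read_seqbegin_or_lock(struct model_seqlock *sl, int *seq)
{
	if (!(*seq & 1))
		*seq = (int)sl->sequence;
	else
		sl->excl_acquisitions++;
}

/* Retry only after a lockless (even) pass that raced with a writer. */
static bool model_need_seqretry(const struct model_seqlock *sl, int seq)
{
	return !(seq & 1) && sl->sequence != (unsigned int)seq;
}

/*
 * The loop from this mail: "seq" starts at 0 and nothing ever makes it
 * odd. A writer interferes during the first `races` passes.
 * Returns the number of passes made.
 */
static int model_reader(struct model_seqlock *sl, int races)
{
	int seq = 0, passes = 0;

	do {
		model_read_seqbegin_or_lock(sl, &seq);
		passes++;
		if (races-- > 0)
			model_write(sl);	/* do_something() raced with a writer */
	} while (model_need_seqretry(sl, seq));

	return passes;
}
```

With three simulated races the model makes four lockless passes and never reaches the locking branch, which is the same behaviour a plain read_seqbegin()/read_seqretry() loop would show.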
Yes, this is confusing. Again, even the documentation is wrong! That is why
I am first trying to remove the misuses of read_seqbegin_or_lock(); after that
I am going to change the semantics of need_seqretry() to enforce the locking
on the 2nd pass.
Oleg.