Message-ID: <CAP01T77B9OH6vPqYNyLwmdo4Q6EE5iAi4dTKduPqpTOgdkO_Bw@mail.gmail.com>
Date: Thu, 13 Feb 2025 07:15:27 +0100
From: Kumar Kartikeya Dwivedi <memxor@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
Ankur Arora <ankur.a.arora@...cle.com>, Linus Torvalds <torvalds@...ux-foundation.org>,
Will Deacon <will@...nel.org>, Waiman Long <llong@...hat.com>, Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Martin KaFai Lau <martin.lau@...nel.org>, Eduard Zingerman <eddyz87@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>, Tejun Heo <tj@...nel.org>, Barret Rhoden <brho@...gle.com>,
Josh Don <joshdon@...gle.com>, Dohyun Kim <dohyunkim@...gle.com>,
linux-arm-kernel@...ts.infradead.org, kernel-team@...a.com
Subject: Re: [PATCH bpf-next v2 17/26] rqspinlock: Hardcode cond_acquire loops to asm-generic implementation
On Mon, 10 Feb 2025 at 11:03, Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Feb 10, 2025 at 10:53:25AM +0100, Peter Zijlstra wrote:
> > On Thu, Feb 06, 2025 at 02:54:25AM -0800, Kumar Kartikeya Dwivedi wrote:
> > > Currently, for rqspinlock usage, the implementation of
> > > smp_cond_load_acquire (and thus, atomic_cond_read_acquire) are
> > > susceptible to stalls on arm64, because they do not guarantee that the
> > > conditional expression will be repeatedly invoked if the address being
> > > loaded from is not written to by other CPUs. When support for
> > > event-streams is absent (which unblocks stuck WFE-based loops every
> > > ~100us), we may end up being stuck forever.
> > >
> > > This causes a problem for us, as we need to repeatedly invoke the
> > > RES_CHECK_TIMEOUT in the spin loop to break out when the timeout
> > > expires.
> > >
> > > Hardcode the implementation to the asm-generic version in rqspinlock.c
> > > until support for smp_cond_load_acquire_timewait [0] lands upstream.
> > >
> >
> > *sigh*.. this patch should go *before* patch 8. As is that's still
> > horribly broken and I was WTF-ing because your 0/n changelog said you
> > fixed it.
>
Sorry about that, I will move it before the patch using this.

> And since you're doing local copies of things, why not take a local copy
> of the smp_cond_load_acquire_timewait() thing?

Ack, I'll address this in v3.
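For context, the stall described in the quoted changelog comes down to whether the wait loop re-evaluates its exit condition on every iteration. Below is a minimal userspace sketch of a polling loop in that spirit, assuming C11 atomics. It is not the kernel code: wait_unlocked_or_timeout() and now_ns() are hypothetical names, the wall-clock deadline only stands in for the role RES_CHECK_TIMEOUT plays in the spin loop, and a plain acquire load replaces the kernel's own primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current time in nanoseconds (C11 timespec_get).
 * TIME_UTC is wall-clock time; a real implementation would want a
 * monotonic clock. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical illustration: poll *lock until it reads as 0 or until
 * timeout_ns has elapsed. The deadline is checked on every iteration,
 * so the loop terminates even if no other thread ever writes to *lock.
 * A loop that instead sleeps until the location is written (as a
 * WFE-based wait may do) would never get to run the deadline check.
 *
 * Returns true if the lock word was observed as 0, false on timeout.
 */
static bool wait_unlocked_or_timeout(_Atomic uint32_t *lock,
				     uint64_t timeout_ns)
{
	uint64_t deadline = now_ns() + timeout_ns;

	for (;;) {
		/* Acquire load: later accesses are ordered after it. */
		if (atomic_load_explicit(lock, memory_order_acquire) == 0)
			return true;

		/* Stand-in for the timeout check in the spin loop. */
		if (now_ns() >= deadline)
			return false;
	}
}
```

With a lock word that is never released, the call returns false once the deadline passes rather than spinning forever, which is the property the quoted changelog needs from the wait loop.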