[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Yvqddx6nTRSZ0JYP@boqun-archlinux>
Date: Mon, 15 Aug 2022 12:24:39 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Hector Martin <marcan@...can.st>
Cc: Will Deacon <will@...nel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Greg KH <gregkh@...uxfoundation.org>, jirislaby@...nel.org,
Marc Zyngier <maz@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Peter Zijlstra <peterz@...radead.org>,
Catalin Marinas <catalin.marinas@....com>,
Asahi Linux <asahi@...ts.linux.dev>,
Oliver Neukum <oneukum@...e.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: Debugging a TTY race condition on M1 (memory ordering dragons)
On Tue, Aug 16, 2022 at 04:15:00AM +0900, Hector Martin wrote:
> On 16/08/2022 03.58, Boqun Feng wrote:
> > I agree this is handy, but an unconditional full barrier may be costy to
> > some users, and probably unnecessary if the users periodically queue
> > the work. In that case, some successful enqueue will eventually make all
> > memory accesses observable. Also if workqueue users use their own
> > locking in work function, then the barrier is also unnecessary.
> >
> > The document part of course needs some help to clear things up. But I'm
> > not sure "strengthen"ing the ordering guarantee of queue_work() is a
> > good idea. Maybe a dedicated API, like:
> >
> > // More work is needed for the @work, it has the same semantics as
> > // queue_work() if the @work is not pending. If the @work is pending,
> > // this ensures the work function observes all memory access before
> > // this.
> > void queue_more_work(struct work_struct *work)
> > {
> > smp_mb();
> > queue_work(work);
> > }
> >
> > Regards,
> > Boqun
>
> FWIW, I didn't actually use a full barrier in my patch. I just replaced
> the test_and_set_bit() with the underlying atomic op, sans early exit path.
>
> Personally though, I think it makes more sense to have the default
> function provide the guarantees, and if someone *really* needs the
> performance gain from eliding the implicit barrier, they could use an
> alternate API for that (after they show useful gains). This stuff is too
> subtle to expect every caller to wrap their head around memory ordering,
> and having queue_work() always provide order with prior stores *feels*
> intuitive.
>
Fair enough. It's just that when you know something about memory
ordering and atomics, you kinda want to play the game to save as many
barrier as you can ;-) It's a curse of knowledge.
> But let's see what the workqueue folks say :)
>
Looks like they agree with you ;-) Nice work!
Regards,
Boqun
> - Hector
Powered by blists - more mailing lists