Date:   Mon, 15 Aug 2022 12:24:39 -0700
From:   Boqun Feng <boqun.feng@...il.com>
To:     Hector Martin <marcan@...can.st>
Cc:     Will Deacon <will@...nel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Greg KH <gregkh@...uxfoundation.org>, jirislaby@...nel.org,
        Marc Zyngier <maz@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Asahi Linux <asahi@...ts.linux.dev>,
        Oliver Neukum <oneukum@...e.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Debugging a TTY race condition on M1 (memory ordering dragons)

On Tue, Aug 16, 2022 at 04:15:00AM +0900, Hector Martin wrote:
> On 16/08/2022 03.58, Boqun Feng wrote:
> > I agree this is handy, but an unconditional full barrier may be costly
> > for some users, and probably unnecessary if the users periodically queue
> > the work. In that case, some successful enqueue will eventually make all
> > memory accesses observable. Also, if workqueue users do their own
> > locking in the work function, the barrier is unnecessary there too.
> > 
> > The documentation of course needs some work to clear things up. But I'm
> > not sure strengthening the ordering guarantee of queue_work() is a good
> > idea. Maybe a dedicated API would be better, like:
> > 
> > // More work is needed for the @work; this has the same semantics as
> > // queue_work() if the @work is not pending. If the @work is pending,
> > // it ensures the work function observes all memory accesses made
> > // before this call.
> > bool queue_more_work(struct workqueue_struct *wq, struct work_struct *work)
> > {
> > 	smp_mb();
> > 	return queue_work(wq, work);
> > }
> > 
> > Regards,
> > Boqun
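
(For context, usage of the sketch above would look something like this --
a hypothetical caller, not code from this thread:)

	static int foo_data;			/* read by the work function */
	static struct work_struct foo_work;

	static void foo_update(int v)
	{
		foo_data = v;			/* publish */
		/*
		 * The smp_mb() inside orders the store above even if
		 * foo_work was already pending.
		 */
		queue_more_work(system_wq, &foo_work);
	}
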
> 
> FWIW, I didn't actually use a full barrier in my patch. I just replaced
> the test_and_set_bit() with the underlying atomic op, sans early exit path.
> 
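
Right -- for anyone following along: the generic test_and_set_bit() peeks
at the bit with a plain load and returns early when it is already set, so
the fully-ordered RMW is skipped on exactly the path in question. Doing
the underlying atomic op unconditionally looks roughly like this (a
sketch of the idea, not the literal patch):

	static __always_inline bool
	test_and_set_bit(unsigned int nr, volatile unsigned long *p)
	{
		unsigned long mask = BIT_MASK(nr);

		p += BIT_WORD(nr);
		/*
		 * No "if (READ_ONCE(*p) & mask) return 1;" short-circuit:
		 * always perform the fully-ordered fetch_or, so callers get
		 * a full barrier even when the bit was already set.
		 */
		return !!(atomic_long_fetch_or(mask, (atomic_long_t *)p) & mask);
	}
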
> Personally though, I think it makes more sense to have the default
> function provide the guarantees, and if someone *really* needs the
> performance gain from eliding the implicit barrier, they could use an
> alternate API for that (after showing useful gains). This stuff is too
> subtle to expect every caller to wrap their head around memory ordering,
> and having queue_work() always provide ordering against prior stores
> *feels* intuitive.
> 

Fair enough. It's just that when you know something about memory
ordering and atomics, you kinda want to play the game and save as many
barriers as you can ;-) It's the curse of knowledge.
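
To make the trade-off concrete, the pattern that needs the ordering is
the usual "publish then kick" one. A hypothetical caller (made-up names,
not the actual TTY code):

	#include <linux/workqueue.h>

	struct foo {
		struct work_struct work;
		int data;		/* consumed by the work function */
	};

	static void producer(struct foo *f, int v)
	{
		f->data = v;		/* plain store, no ordering on its own */
		/*
		 * If the work is already pending, queue_work() returns false
		 * without queueing. Unless a fully-ordered RMW still runs on
		 * that path, the already-running work function may miss the
		 * store to f->data.
		 */
		queue_work(system_wq, &f->work);
	}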

> But let's see what the workqueue folks say :)
> 

Looks like they agree with you ;-) Nice work!

Regards,
Boqun

> - Hector
