Message-ID: <f3f84fc1-50be-0a9b-0538-6ea26dd93b16@marcan.st>
Date:   Tue, 16 Aug 2022 04:15:00 +0900
From:   Hector Martin <marcan@...can.st>
To:     Boqun Feng <boqun.feng@...il.com>
Cc:     Will Deacon <will@...nel.org>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Greg KH <gregkh@...uxfoundation.org>, jirislaby@...nel.org,
        Marc Zyngier <maz@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Asahi Linux <asahi@...ts.linux.dev>,
        Oliver Neukum <oneukum@...e.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: Debugging a TTY race condition on M1 (memory ordering dragons)

On 16/08/2022 03.58, Boqun Feng wrote:
> I agree this is handy, but an unconditional full barrier may be costly
> for some users, and probably unnecessary if the users periodically queue
> the work. In that case, some successful enqueue will eventually make all
> prior memory accesses observable to the work function. Also, if workqueue
> users do their own locking in the work function, then the barrier is
> likewise unnecessary.
> 
> The documentation of course needs some help to clear things up. But I'm
> not sure "strengthening" the ordering guarantee of queue_work() is a
> good idea. Maybe a dedicated API, like:
> 
> // Queue more work on @work. This has the same semantics as
> // queue_work() if @work is not pending. If @work is pending, it
> // additionally ensures the work function observes all memory
> // accesses made before this call.
> void queue_more_work(struct workqueue_struct *wq, struct work_struct *work)
> {
> 	smp_mb();
> 	queue_work(wq, work);
> }
> 
> Regards,
> Boqun
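
For concreteness, this is the kind of pattern that gets bitten (a
minimal sketch; my_dev, producer() and the field layout are made up
for illustration): if the work is already pending, nothing in this
path guarantees the running work function observes buf/len.

#include <linux/string.h>
#include <linux/workqueue.h>

struct my_dev {
	struct work_struct work;
	char buf[64];		/* filled by the producer */
	size_t len;
};

static void producer(struct my_dev *dev, const char *src, size_t n)
{
	memcpy(dev->buf, src, n);	/* plain stores */
	dev->len = n;
	/*
	 * If dev->work is already PENDING, the test_and_set_bit()
	 * inside the queueing path fails, and a failed
	 * test_and_set_bit() need not act as a full barrier on arm64,
	 * so the work function running on another CPU can miss the
	 * stores above.
	 */
	schedule_work(&dev->work);
}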

FWIW, I didn't actually use a full barrier in my patch. I just replaced
the test_and_set_bit() with the underlying atomic op, sans early exit path.
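
Roughly, the shape of that change (a sketch, not the literal patch;
claim_pending_old/new are names I made up here, though
WORK_STRUCT_PENDING and work_data_bits() are the real workqueue
internals):

#include <linux/workqueue.h>

/* Today (simplified): test_and_set_bit() takes an early exit when the
 * bit is already set, so a failed attempt can skip the fully ordered
 * atomic op entirely. */
static bool claim_pending_old(struct work_struct *work)
{
	return !test_and_set_bit(WORK_STRUCT_PENDING_BIT,
				 work_data_bits(work));
}

/* The patch, roughly: perform the underlying atomic RMW
 * unconditionally, so the full barrier it implies is taken even when
 * the work was already pending. */
static bool claim_pending_new(struct work_struct *work)
{
	return !(atomic_long_fetch_or(WORK_STRUCT_PENDING, &work->data) &
		 WORK_STRUCT_PENDING);
}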

Personally though, I think it makes more sense to have the default
function provide the guarantee, and if someone *really* needs the
performance gain from eliding the implicit barrier, they could use an
alternate API for that (after showing useful gains). This stuff is too
subtle to expect every caller to wrap their head around memory ordering,
and having queue_work() always provide ordering against prior stores
*feels* intuitive.
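
Something like this, shape-wise (queue_work_relaxed() is hypothetical,
just to illustrate where the opt-out would live):

/* Default: always orders prior stores against the work function,
 * even when the work was already pending. */
bool queue_work(struct workqueue_struct *wq, struct work_struct *work);

/* Hypothetical opt-in fast path: no ordering guarantee when the work
 * is already pending; only for callers who have measured a win and
 * provide their own ordering (e.g. locking in the work function). */
bool queue_work_relaxed(struct workqueue_struct *wq,
			struct work_struct *work);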

But let's see what the workqueue folks say :)

- Hector
