lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Mon, 15 Aug 2022 12:24:39 -0700 From: Boqun Feng <boqun.feng@...il.com> To: Hector Martin <marcan@...can.st> Cc: Will Deacon <will@...nel.org>, Linux ARM <linux-arm-kernel@...ts.infradead.org>, Greg KH <gregkh@...uxfoundation.org>, jirislaby@...nel.org, Marc Zyngier <maz@...nel.org>, Mark Rutland <mark.rutland@....com>, Peter Zijlstra <peterz@...radead.org>, Catalin Marinas <catalin.marinas@....com>, Asahi Linux <asahi@...ts.linux.dev>, Oliver Neukum <oneukum@...e.com>, LKML <linux-kernel@...r.kernel.org> Subject: Re: Debugging a TTY race condition on M1 (memory ordering dragons) On Tue, Aug 16, 2022 at 04:15:00AM +0900, Hector Martin wrote: > On 16/08/2022 03.58, Boqun Feng wrote: > > I agree this is handy, but an unconditional full barrier may be costy to > > some users, and probably unnecessary if the users periodically queue > > the work. In that case, some successful enqueue will eventually make all > > memory accesses observable. Also if workqueue users use their own > > locking in work function, then the barrier is also unnecessary. > > > > The document part of course needs some help to clear things up. But I'm > > not sure "strengthen"ing the ordering guarantee of queue_work() is a > > good idea. Maybe a dedicated API, like: > > > > // More work is needed for the @work, it has the same semantics as > > // queue_work() if the @work is not pending. If the @work is pending, > > // this ensures the work function observes all memory access before > > // this. > > void queue_more_work(struct work_struct *work) > > { > > smp_mb(); > > queue_work(work); > > } > > > > Regards, > > Boqun > > FWIW, I didn't actually use a full barrier in my patch. I just replaced > the test_and_set_bit() with the underlying atomic op, sans early exit path. > > Personally though, I think it makes more sense to have the default > function provide the guarantees, and if someone *really* needs the > performance gain from eliding the implicit barrier, they could use an > alternate API for that (after they show useful gains). This stuff is too > subtle to expect every caller to wrap their head around memory ordering, > and having queue_work() always provide order with prior stores *feels* > intuitive. > Fair enough. It's just that when you know something about memory ordering and atomics, you kinda want to play the game to save as many barrier as you can ;-) It's a curse of knowledge. > But let's see what the workqueue folks say :) > Looks like they agree with you ;-) Nice work! Regards, Boqun > - Hector
Powered by blists - more mailing lists