Message-ID: <fec60bba-e414-43d1-bc3e-870f5ffe4626@paulmck-laptop>
Date: Mon, 8 Apr 2024 09:55:23 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Philipp Stanner <pstanner@...hat.com>,
Kent Overstreet <kent.overstreet@...ux.dev>,
Boqun Feng <boqun.feng@...il.com>, rust-for-linux@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
llvm@...ts.linux.dev, Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>,
Wedson Almeida Filho <wedsonaf@...il.com>,
Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...sung.com>,
Alice Ryhl <aliceryhl@...gle.com>,
Alan Stern <stern@...land.harvard.edu>,
Andrea Parri <parri.andrea@...il.com>,
Will Deacon <will@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Nicholas Piggin <npiggin@...il.com>,
David Howells <dhowells@...hat.com>,
Jade Alglave <j.alglave@....ac.uk>,
Luc Maranget <luc.maranget@...ia.fr>,
Akira Yokosawa <akiyks@...il.com>,
Daniel Lustig <dlustig@...dia.com>,
Joel Fernandes <joel@...lfernandes.org>,
Nathan Chancellor <nathan@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
kent.overstreet@...il.com,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, elver@...gle.com,
Mark Rutland <mark.rutland@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>,
Catalin Marinas <catalin.marinas@....com>,
linux-arm-kernel@...ts.infradead.org, linux-fsdevel@...r.kernel.org
Subject: Re: [WIP 0/3] Memory model and atomic API in Rust
On Mon, Apr 08, 2024 at 05:02:37PM +0100, Matthew Wilcox wrote:
> On Mon, Mar 25, 2024 at 10:44:43AM -0700, Linus Torvalds wrote:
> > So I actually think most compiler people are perfectly fine with the
> > kernel model of mostly doing 'volatile' not on the data structures
> > themselves, but as accesses through casts.
> >
> > It's very traditional C, and there's actually nothing particularly odd
> > about it. Not even from a compiler standpoint.
> >
> > In fact, I personally will argue that it is fundamentally wrong to
> > think that the underlying data has to be volatile. A variable may be
> > entirely stable in some cases (ie locks held), but not in others.
> >
> > So it's not the *variable* (aka "object") that is 'volatile', it's the
> > *context* that makes a particular access volatile.
> >
> > That explains why the kernel has basically zero actual volatile
> > objects, and 99% of all volatile accesses are done through accessor
> > functions that use a cast to mark a particular access volatile.
>
> What annoys me is that 'volatile' accesses have (at least) two distinct
> meanings:
>  - Make this access untorn
>  - Prevent various optimisations (code motion,
>    common-subexpression-elimination, ...)
>
> As an example, folio_migrate_flags() (in mm/migrate.c):
>
> 	if (folio_test_error(folio))
> 		folio_set_error(newfolio);
> 	if (folio_test_referenced(folio))
> 		folio_set_referenced(newfolio);
> 	if (folio_test_uptodate(folio))
> 		folio_mark_uptodate(newfolio);
>
> ... which becomes...
>
>   1f:   f6 c4 04          test   $0x4,%ah
>   22:   74 05             je     29 <folio_migrate_flags+0x19>
>   24:   f0 80 4f 01 04    lock orb $0x4,0x1(%rdi)
>   29:   48 8b 03          mov    (%rbx),%rax
>   2c:   a8 04             test   $0x4,%al
>   2e:   74 05             je     35 <folio_migrate_flags+0x25>
>   30:   f0 80 4d 00 04    lock orb $0x4,0x0(%rbp)
>   35:   48 8b 03          mov    (%rbx),%rax
>   38:   a8 08             test   $0x8,%al
>   3a:   74 05             je     41 <folio_migrate_flags+0x31>
>   3c:   f0 80 4d 00 08    lock orb $0x8,0x0(%rbp)
>
> In my ideal world, the compiler would turn this into:
>
> 	newfolio->flags |= folio->flags & MIGRATE_MASK;
>
> but because folio_test_foo() and folio_set_foo() contain all manner of
> volatile casts, the compiler is forced to do individual tests and sets.
>
> Part of that is us being dumb; folio_set_foo() should be __folio_set_foo()
> because this folio is newly allocated and nobody else can be messing
> with its flags word yet. I failed to spot that at the time I was doing
> the conversion from SetPageFoo to folio_set_foo.
>
> But if the compiler people could give us something a little more
> granular than "scary volatile access disable everything", that would
> be nice. Also hard, because now you have to figure out what this new
> thing interacts with and when is it safe to do what.
OK, I will bite...

Why not accumulate the changes in a mask, and then apply the mask just
once?  (In situations where __folio_set_foo() need not apply.)

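For concreteness, the mask-accumulation idea might look something like
the user-space sketch below.  Everything in it is an assumption made for
illustration: struct toy_folio, the TOY_* bit values, and the helper
names are hypothetical and do not match the kernel's definitions, and
relaxed C11 atomics stand in for the kernel's marked loads and atomic
bit operations.  Note also that a single load takes one snapshot of the
source flags, which is not quite the same thing as three separate tests.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flag bits; the kernel's PG_* values differ. */
#define TOY_ERROR        (1UL << 0)
#define TOY_REFERENCED   (1UL << 1)
#define TOY_UPTODATE     (1UL << 2)
#define TOY_MIGRATE_MASK (TOY_ERROR | TOY_REFERENCED | TOY_UPTODATE)

/* Toy stand-in for a folio: just a flags word. */
struct toy_folio {
	_Atomic unsigned long flags;
};

/*
 * Accumulate the bits to be copied in a local mask using one marked
 * load, then apply them to the destination with one atomic OR instead
 * of one atomic operation per flag.
 */
void toy_migrate_flags(struct toy_folio *newf, struct toy_folio *oldf)
{
	unsigned long mask;

	mask = atomic_load_explicit(&oldf->flags, memory_order_relaxed);
	mask &= TOY_MIGRATE_MASK;
	if (mask)
		atomic_fetch_or_explicit(&newf->flags, mask,
					 memory_order_relaxed);
}

/* Small demonstration: returns the destination flags after migration. */
unsigned long toy_demo(void)
{
	struct toy_folio oldf, newf;
	unsigned long result;

	/* Bit 7 is outside TOY_MIGRATE_MASK and must not be copied. */
	atomic_init(&oldf.flags, TOY_ERROR | TOY_UPTODATE | (1UL << 7));
	atomic_init(&newf.flags, 0UL);

	toy_migrate_flags(&newf, &oldf);

	result = atomic_load_explicit(&newf.flags, memory_order_relaxed);
	assert(!(result & (1UL << 7)));
	return result;
}
```

In this sketch the compiler sees one load and at most one atomic
read-modify-write, so nothing needs to be merged across separate
volatile accesses in the first place.
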
If it turns out that we really do need a not-quite-volatile, what exactly
does it do?  You clearly want it to be able to be optimized so as to merge
similar accesses.  Is there a limit to the number of accesses that can
be merged or to the region of code over which such merging is permitted?
Either way, how is the compiler informed of these limits?

(I admit that I am not crazy about this sort of proposal, but that might
have something to do with the difficulty of repeatedly convincing
people that volatile is necessary and must be retained...)

							Thanx, Paul