Message-ID: <c961758ed08a61cef6a30427cfafc898fb119f47.camel@surriel.com>
Date: Mon, 06 Jan 2025 09:26:14 -0500
From: Rik van Riel <riel@...riel.com>
To: Jann Horn <jannh@...gle.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
akpm@...ux-foundation.org, nadav.amit@...il.com,
zhengqi.arch@...edance.com, linux-mm@...ck.org
Subject: Re: [PATCH 09/12] x86/mm: enable broadcast TLB invalidation for
multi-threaded processes
On Mon, 2025-01-06 at 14:04 +0100, Jann Horn wrote:
> On Sat, Jan 4, 2025 at 3:55 AM Rik van Riel <riel@...riel.com> wrote:
>
> >
> > Then the only change needed to switch_mm_irqs_off
> > would be to move the LOADED_MM_SWITCHING line to
> > before choose_new_asid, to fully close the window.
> >
> > Am I overlooking anything here?
>
> I think that might require having a full memory barrier in
> switch_mm_irqs_off to ensure that the write of LOADED_MM_SWITCHING
> can't be reordered after reads in choose_new_asid(). Which wouldn't be
> very nice; we probably should avoid adding heavy barriers to the task
> switch path...
>
> Hmm, but I think luckily the cpumask_set_cpu() already implies a
> relaxed RMW atomic, which I think on X86 is actually the same as a
> sequentially consistent atomic, so as long as you put the
> LOADED_MM_SWITCHING line before that, it might do the job? Maybe with
> an smp_mb__after_atomic() and/or an explainer comment.
> (smp_mb__after_atomic() is a no-op on x86, so maybe just a comment is
> the right way. Documentation/memory-barriers.txt says
> smp_mb__after_atomic() can be used together with atomic RMW bitop
> functions.)
>
That no-op smp_mb__after_atomic() might be the way to go,
since we do not actually use the mm_cpumask with INVLPGB,
and we could conceivably skip updates to the bitmask for
tasks using broadcast TLB flushing.
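
To make the ordering being discussed concrete, here is a rough
user-space analogy in C11 atomics. It is not kernel code: every
sketch_* name is hypothetical, the relaxed atomic_fetch_or stands in
for cpumask_set_cpu(), and the seq_cst fence stands in for
smp_mb__after_atomic() (a real fence here, unlike the no-op on x86).
Whether this ordering is sufficient in the real switch_mm_irqs_off
is the open question in this thread.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins; none of these are the kernel's definitions. */
#define SKETCH_MM_SWITCHING ((void *)1)

static _Atomic(void *) sketch_loaded_mm;	/* analogue of loaded_mm */
static atomic_ulong sketch_cpumask;		/* analogue of mm_cpumask */
static atomic_uint sketch_asid_state;		/* state read when picking an ASID */

/* Sketch of the proposed order of operations on the switching CPU. */
static unsigned int sketch_switch_in(void *next_mm, unsigned int cpu)
{
	/* 1. Publish the "switching" marker before anything else. */
	atomic_store_explicit(&sketch_loaded_mm, SKETCH_MM_SWITCHING,
			      memory_order_relaxed);

	/* 2. Relaxed atomic RMW, standing in for cpumask_set_cpu(). */
	atomic_fetch_or_explicit(&sketch_cpumask, 1UL << cpu,
				 memory_order_relaxed);

	/* 3. Full fence, standing in for smp_mb__after_atomic(): the
	 *    store in step 1 cannot be reordered after the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	/* 4. Read standing in for the reads done by choose_new_asid(). */
	unsigned int asid = atomic_load_explicit(&sketch_asid_state,
						 memory_order_relaxed);

	/* 5. Publish the new mm once the ASID has been chosen. */
	atomic_store_explicit(&sketch_loaded_mm, next_mm,
			      memory_order_release);
	return asid;
}
```

If the bitmask update were skipped for broadcast-flush tasks, step 2
would disappear and the ordering would have to come from somewhere
else.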
> > >
> >
> > I'll add the READ_ONCE.
> >
> > Will the race still exist if we wait on
> > LOADED_MM_SWITCHING as proposed above?
>
> I think so, since between reading the loaded_mm and reading the
> loaded_mm_asid, the remote CPU might go through an entire task
> switch.
> Like:
>
> 1. We read the loaded_mm, and see that the remote CPU is currently
> running in our mm_struct.
> 2. The remote CPU does a task switch to another process with a
> different mm_struct.
> 3. We read the loaded_mm_asid, and see an ASID that does not match
> our broadcast ASID (because the loaded ASID is not for our mm_struct).
>
A false positive, where we do not clear the
asid_transition field and will check again
in the future, should be harmless, though.
The worry is false negatives, where we fail
to detect an out-of-sync CPU, yet still clear
the asid_transition field.
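
The interleaving in steps 1-3 above can be replayed deterministically
in a small single-threaded C sketch. All names here are hypothetical,
and the hook argument only exists to force the remote task switch to
land between the two reads; the real check reads per-CPU state on a
remote CPU.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the two per-CPU fields read by the checker. */
struct sketch_cpu_state {
	const void *loaded_mm;
	unsigned int loaded_mm_asid;
};

typedef void (*sketch_hook)(struct sketch_cpu_state *cpu);

/* Returns true when the CPU appears to run our mm with a stale ASID,
 * in which case the caller would leave asid_transition set. */
static bool sketch_cpu_looks_out_of_sync(struct sketch_cpu_state *cpu,
					 const void *our_mm,
					 unsigned int broadcast_asid,
					 sketch_hook between_reads)
{
	const void *mm = cpu->loaded_mm;		/* step 1 */

	if (between_reads)
		between_reads(cpu);			/* step 2 */

	unsigned int asid = cpu->loaded_mm_asid;	/* step 3 */

	return mm == our_mm && asid != broadcast_asid;
}

static int sketch_other_mm;	/* placeholder for a different mm_struct */

/* Simulated remote task switch to another process. */
static void sketch_switch_away(struct sketch_cpu_state *cpu)
{
	cpu->loaded_mm = &sketch_other_mm;
	cpu->loaded_mm_asid = 3;
}
```

Calling the checker with sketch_switch_away as the hook, on a CPU
that started out in our mm with the broadcast ASID, reports the CPU
as out of sync even though it has already left our mm. That is the
harmless false-positive case; the sketch does not model the
false-negative one.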
--
All Rights Reversed.