linux-kernel - Re: [PATCHv3 4/8] x86/mm: Handle LAM on context switch

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <c12b8785-1fc2-434a-83ae-f28532e6823a@www.fastmail.com>
Date:   Wed, 29 Jun 2022 18:51:44 -0700
From:   "Andy Lutomirski" <luto@...nel.org>
To:     "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:     "Dave Hansen" <dave.hansen@...ux.intel.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        "kcc@...gle.com" <kcc@...gle.com>,
        "ryabinin.a.a@...il.com" <ryabinin.a.a@...il.com>,
        "andreyknvl@...il.com" <andreyknvl@...il.com>,
        "glider@...gle.com" <glider@...gle.com>,
        "dvyukov@...gle.com" <dvyukov@...gle.com>,
        "H.J. Lu" <hjl.tools@...il.com>, "Andi Kleen" <ak@...ux.intel.com>,
        "Rick P Edgecombe" <rick.p.edgecombe@...el.com>,
        linux-mm@...ck.org,
        "Linux Kernel Mailing List" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCHv3 4/8] x86/mm: Handle LAM on context switch



On Tue, Jun 28, 2022, at 5:34 PM, Kirill A. Shutemov wrote:
> On Tue, Jun 28, 2022 at 04:33:21PM -0700, Andy Lutomirski wrote:
>> On 6/10/22 07:35, Kirill A. Shutemov wrote:
>> > Linear Address Masking mode for userspace pointers encoded in CR3 bits.
>> > The mode is selected per-thread. Add new thread features indicate that the
>> > thread has Linear Address Masking enabled.
>> > 
>> > switch_mm_irqs_off() now respects these flags and constructs CR3
>> > accordingly.
>> > 
>> > The active LAM mode gets recorded in the tlb_state.
>> > 
>> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
>> > ---
>> >   arch/x86/include/asm/mmu.h         |  1 +
>> >   arch/x86/include/asm/mmu_context.h | 24 ++++++++++++
>> >   arch/x86/include/asm/tlbflush.h    |  3 ++
>> >   arch/x86/mm/tlb.c                  | 62 ++++++++++++++++++++++--------
>> >   4 files changed, 75 insertions(+), 15 deletions(-)
>> > 
>> > diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
>> > index 5d7494631ea9..d150e92163b6 100644
>> > --- a/arch/x86/include/asm/mmu.h
>> > +++ b/arch/x86/include/asm/mmu.h
>> > @@ -40,6 +40,7 @@ typedef struct {
>> >   #ifdef CONFIG_X86_64
>> >   	unsigned short flags;
>> > +	u64 lam_cr3_mask;
>> >   #endif
>> >   	struct mutex lock;
>> > diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
>> > index b8d40ddeab00..e6eac047c728 100644
>> > --- a/arch/x86/include/asm/mmu_context.h
>> > +++ b/arch/x86/include/asm/mmu_context.h
>> > @@ -91,6 +91,29 @@ static inline void switch_ldt(struct mm_struct *prev, struct mm_struct *next)
>> >   }
>> >   #endif
>> > +#ifdef CONFIG_X86_64
>> > +static inline u64 mm_cr3_lam_mask(struct mm_struct *mm)
>> > +{
>> > +	return mm->context.lam_cr3_mask;
>> > +}
>> > +
>> > +static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm)
>> > +{
>> > +	mm->context.lam_cr3_mask = oldmm->context.lam_cr3_mask;
>> > +}
>> > +
>> > +#else
>> > +
>> > +static inline u64 mm_cr3_lam_mask(struct mm_struct *mm)
>> > +{
>> > +	return 0;
>> > +}
>> > +
>> > +static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm)
>> > +{
>> > +}
>> > +#endif
>> 
>> Do we really need the ifdeffery here?  I see no real harm in having the
>> field exist on 32-bit -- we don't care much about performance for 32-bit
>> kernels.
>
> The waste doesn't feel right to me. I would rather keep it.
>
> But sure I can do this if needed.

I could go either way here.

>
>> > -	if (real_prev == next) {
>> > +	if (real_prev == next && prev_lam == new_lam) {
>> >   		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
>> >   			   next->context.ctx_id);
>> 
>> This looks wrong to me.  If we change threads within the same mm but lam
>> changes (which is certainly possible by a race if nothing else) then this
>> will go down the "we really are changing mms" path, not the "we're not
>> changing but we might need to flush something" path.
>
> If LAM gets enabled we must write CR3 with the new LAM mode. Without the
> change real_prev == next case will not do this for !was_lazy case.

You could fix that.  Or you could determine that this isn’t actually needed, just like updating the LDT in that path isn’t needed, if you change the way LAM is updated.

>
> Note that currently enabling LAM is done by setting LAM mode in the mmu
> context and doing switch_mm(current->mm, current->mm, current), so it is
> very important case.
>

Well, I did separately ask why this is the case.

> -- 
>  Kirill A. Shutemov