[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <75e95acc-6730-ddcf-d722-66e575076256@kernel.org>
Date: Wed, 29 Sep 2021 09:59:42 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>
Cc: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghua.yu@...el.com>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...el.com>,
Lu Baolu <baolu.lu@...ux.intel.com>,
Joerg Roedel <joro@...tes.org>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Dave Jiang <dave.jiang@...el.com>,
Jacob Jun Pan <jacob.jun.pan@...el.com>,
Raj Ashok <ashok.raj@...el.com>,
"Shankar, Ravi V" <ravi.v.shankar@...el.com>,
iommu@...ts.linux-foundation.org,
the arch/x86 maintainers <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 5/8] x86/mmu: Add mm-based PASID refcounting
On 9/29/21 05:28, Thomas Gleixner wrote:
> On Wed, Sep 29 2021 at 11:54, Peter Zijlstra wrote:
>> On Fri, Sep 24, 2021 at 04:03:53PM -0700, Andy Lutomirski wrote:
>>> I think the perfect and the good are a bit confused here. If we go for
>>> "good", then we have an mm owning a PASID for its entire lifetime. If
>>> we want "perfect", then we should actually do it right: teach the
>>> kernel to update an entire mm's PASID setting all at once. This isn't
>>> *that* hard -- it involves two things:
>>>
>>> 1. The context switch code needs to resync PASID. Unfortunately, this
>>> adds some overhead to every context switch, although a static_branch
>>> could minimize it for non-PASID users.
>>
>>> 2. A change to an mm's PASID needs to sent an IPI, but that IPI can't
>>> touch FPU state. So instead the IPI should use task_work_add() to
>>> make sure PASID gets resynced.
>>
>> What do we need 1 for? Any PASID change can be achieved using 2 no?
>>
>> Basically, call task_work_add() on all relevant tasks [1], then IPI
>> spray the current running of those and presto.
>>
>> [1] it is nigh on impossible to find all tasks sharing an mm in any sane
>> way due to CLONE_MM && !CLONE_THREAD.
>
> Why would we want any of that at all?
>
> Process starts, no PASID assigned.
>
> bind to device -> PASID is allocated and assigned to the mm
>
> some task of the process issues ENQCMD -> #GP -> write PASID MSR
>
> After that the PASID is saved and restored as part of the XSTATE and
> there is no extra overhead in context switch or return to user space.
>
> All tasks of the process which did never use ENQCMD don't care and their
> PASID xstate is in init state.
>
> There is absolutely no point in enforcing that all tasks of the process
> have the PASID activated immediately when it is assigned. If they need
> it they get it via the #GP fixup and everything just works.
>
> Looking at that patch again, none of this muck in fpu__pasid_write() is
> required at all. The whole exception fixup is:
>
> if (!user_mode(regs))
> return false;
>
> if (!current->mm->pasid)
> return false;
>
> if (current->pasid_activated)
> return false;
<-- preemption or BH here: kaboom.
>
> wrmsrl(MSR_IA32_PASID, current->mm->pasid);
This needs the actual sane fpstate writing helper -- see other email.
Powered by blists - more mailing lists