linux-kernel - Re: xen_exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d3be0f11-5699-bc90-c1f0-b770bc7da596@oracle.com>
Date:   Wed, 26 Apr 2017 20:55:55 -0400
From:   Boris Ostrovsky <boris.ostrovsky@...cle.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        Juergen Gross <jgross@...e.com>, X86 ML <x86@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: xen_exit_mmap() questions



On 04/26/2017 06:49 PM, Andy Lutomirski wrote:
> On Wed, Apr 26, 2017 at 3:45 PM, Boris Ostrovsky
> <boris.ostrovsky@...cle.com> wrote:
>> On 04/26/2017 04:52 PM, Andy Lutomirski wrote:
>>> I was trying to understand xen_drop_mm_ref() to update it for some
>>> changes I'm working on, and I'm wondering whether we need
>>> xen_exit_mmap() at all.
>>>
>>> AFAICS the intent is to force all CPUs to drop their lazy uses of the
>>> mm being destroyed so it can be unpinned before tearing down the page
>>> tables, thus making it faster to tear down the page tables.  This
>>> seems like it'll speed up xen_set_pud() and xen_set_pmd(), but this
>>> seems like it may be of rather limited value.
>>
>> Why do you think it's of limited value? Without it we will end up with a
>> hypercall for each update.
>>
>> Or is your point that the number of those update is relatively small
>> when we are tearing down?
>
> The latter.  Also, unless I'm missing something, xen_set_pte() doesn't
> have the optimization.  I haven't looked at exactly how page table
> teardown works, but if it clears each PTE individually, then that's
> the bulk of the work.
>
>>
>>
>>>  Could we get away with
>>> deleting it?
>>>
>>> Also, this code in drop_other_mm_ref() looks dubious to me:
>>>
>>>     /* If this cpu still has a stale cr3 reference, then make sure
>>>        it has been flushed. */
>>>     if (this_cpu_read(xen_current_cr3) == __pa(mm->pgd))
>>>         load_cr3(swapper_pg_dir);
>>>
>>> If cr3 hasn't been flushed to the hypervisor because we're in a lazy
>>> mode, why would load_cr3() help?  Shouldn't this be xen_mc_flush()
>>> instead?
>>
>> load_cr3() actually ends with xen_mc_flush() by way of xen_write_cr3()
>> -> xen_mc_issue().
>
> xen_mc_issue() does:
>
>         if ((paravirt_get_lazy_mode() & mode) == 0)
>                 xen_mc_flush();
>
> I assume the load_cr3() is intended to deal with the case where we're
> in lazy mode, but we'll still be in lazy mode, right?  Or does it
> serve some other purpose?

Of course. I can't read (I ignored the "== 0" part).

Apparently the early version had an explicit flush but then it 
disappeared (commit 9f79991d4186089e228274196413572cc000143b).

The point of CR3 loading here, I believe, is to make sure the hypervisor 
knows that the (v)CPU is no longer using the the mm's cr3 (we are 
loading swapper_pgdir here).

-boris