linux-kernel - Re: [Xen-devel] [PATCH v4 0/3] x86: modify_ldt improvement, test, and config option

Message-ID: <CALCETrWkMRb+Y3FsJ7+kNYmPxtupM3ZPOeOPwagXytgBqM6tJQ@mail.gmail.com>
Date:	Tue, 28 Jul 2015 22:28:27 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	Boris Ostrovsky <boris.ostrovsky@...cle.com>
Cc:	Andrew Cooper <andrew.cooper3@...rix.com>,
	"security@...nel.org" <security@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>, X86 ML <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	xen-devel <xen-devel@...ts.xen.org>,
	Borislav Petkov <bp@...en8.de>,
	Jan Beulich <jbeulich@...e.com>,
	Sasha Levin <sasha.levin@...cle.com>
Subject: Re: [Xen-devel] [PATCH v4 0/3] x86: modify_ldt improvement, test, and
 config option

On Tue, Jul 28, 2015 at 8:01 PM, Boris Ostrovsky
<boris.ostrovsky@...cle.com> wrote:
> On 07/28/2015 08:47 PM, Andrew Cooper wrote:
>>
>> On 29/07/2015 01:21, Andy Lutomirski wrote:
>>>
>>> On Tue, Jul 28, 2015 at 10:10 AM, Boris Ostrovsky
>>> <boris.ostrovsky@...cle.com> wrote:
>>>>
>>>> On 07/28/2015 01:07 PM, Andy Lutomirski wrote:
>>>>>
>>>>> On Tue, Jul 28, 2015 at 9:30 AM, Andrew Cooper
>>>>> <andrew.cooper3@...rix.com> wrote:
>>>>>>
>>>>>> I suspect that the set_ldt(NULL, 0) call hasn't reached Xen before
>>>>>> xen_free_ldt() is attempting to nab back the pages which Xen still has
>>>>>> mapped as an LDT.
>>>>>>
>>>>> I just instrumented it with yet more LSL instructions.  I'm pretty
>>>>> sure that set_ldt really is clearing at least LDT entry zero.
>>>>> Nonetheless the free_ldt call still oopses.
>>>>>
>>>> Yes, I added some instrumentation to the hypervisor and we definitely set
>>>> LDT to NULL before failing.
>>>>
>>>> -boris
>>>
>>> Looking at map_ldt_shadow_page: what keeps shadow_ldt_mapcnt from
>>> getting incremented once on each CPU at the same time if both CPUs
>>> fault in the same shadow LDT page at the same time?
>>
>> Nothing, but that is fine.  If a page is in use in two vcpus' LDTs, it is
>> expected to have a type refcount of 2.
>>
>>> Similarly, what
>>> keeps both CPUs from calling get_page_type at the same time and
>>> therefore losing track of the page type reference count?
>>
>> a cmpxchg() loop in the depths of __get_page_type().
>>
>>> I don't see why vmalloc or vm_unmap_aliases would have anything to do
>>> with this, though.
>
>
> So just for kicks I made lazy_max_pages() return 0 to free vmaps immediately
> and the problem went away.

As far as I can tell, this affects TLB flushes but not unmaps.  That
means that my patch is totally bogus -- vm_unmap_aliases() *flushes*
aliases but isn't involved in removing them from the page tables.
That must be why xen_alloc_ldt and xen_set_ldt work today.
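
To make the distinction between unmapping and flushing concrete, here is a
toy model in C. It is only a sketch, not kernel code: every name in it
(toy_vmap, toy_unmap, toy_purge, lazy_max) is made up for illustration, and
it assumes the behaviour described above, namely that the page-table entry
is cleared right away while the TLB flush is deferred until the number of
lazily freed pages exceeds a threshold standing in for lazy_max_pages().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names are real kernel or Xen APIs. */
#define TOY_NPAGES 8

struct toy_vmap {
    bool pte_present[TOY_NPAGES]; /* page-table entry still maps the page */
    bool tlb_stale[TOY_NPAGES];   /* a stale translation may still be cached */
    size_t lazy_pages;            /* pages unmapped but not yet flushed */
    size_t lazy_max;              /* hypothetical stand-in for lazy_max_pages() */
};

/* Start with every page mapped and nothing pending. */
static void toy_init(struct toy_vmap *v, size_t lazy_max)
{
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        v->pte_present[i] = true;
        v->tlb_stale[i] = false;
    }
    v->lazy_pages = 0;
    v->lazy_max = lazy_max;
}

/* Deferred work: drop all stale translations in one go. */
static void toy_purge(struct toy_vmap *v)
{
    for (size_t i = 0; i < TOY_NPAGES; i++)
        v->tlb_stale[i] = false;
    v->lazy_pages = 0;
}

/* The unmap itself is immediate; only the flush is lazy. */
static void toy_unmap(struct toy_vmap *v, size_t page)
{
    assert(page < TOY_NPAGES);
    v->pte_present[page] = false;
    v->tlb_stale[page] = true;
    v->lazy_pages++;
    if (v->lazy_pages > v->lazy_max)
        toy_purge(v);
}
```

In this model, a threshold of 0 makes every toy_unmap() purge immediately,
which corresponds to the lazy_max_pages() experiment quoted above. The
pte_present state is the same either way; only the timing of the flush
changes.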

So what does flushing the TLB have to do with anything?  The only
thing I can think of is that it might force some deferred hypercalls
out.  I can reproduce this easily on UP, so IPIs aren't involved.

The other odd thing is that it seems like this happens when clearing
the LDT and freeing the old one but not when setting the LDT and
freeing the old one.  This is plausibly related to the lazy mode in
effect at the time, but I have no evidence for that.

Two more data points:  Putting xen_flush_mc before and after the
SET_LDT multicall has no effect.  Putting flush_tlb_all() in
xen_free_ldt doesn't help either, while vm_unmap_aliases() in the
exact same place does help.

--Andy