linux-kernel - Re: [PATCH v6 3/7] KVM: MMU: fast invalidate all pages

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <519C92B6.1020405@linux.vnet.ibm.com>
Date:	Wed, 22 May 2013 17:41:10 +0800
From:	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
To:	Gleb Natapov <gleb@...hat.com>
CC:	Marcelo Tosatti <mtosatti@...hat.com>, avi.kivity@...il.com,
	pbonzini@...hat.com, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, Anthony Liguori <anthony@...emonkey.ws>
Subject: Re: [PATCH v6 3/7] KVM: MMU: fast invalidate all pages

On 05/22/2013 04:54 PM, Gleb Natapov wrote:
> On Wed, May 22, 2013 at 04:46:04PM +0800, Xiao Guangrong wrote:
>> On 05/22/2013 02:34 PM, Gleb Natapov wrote:
>>> On Tue, May 21, 2013 at 10:33:30PM -0300, Marcelo Tosatti wrote:
>>>> On Tue, May 21, 2013 at 11:39:03AM +0300, Gleb Natapov wrote:
>>>>>> Any pages with stale information will be zapped by kvm_mmu_zap_all().
>>>>>> When that happens, page faults will take place which will automatically 
>>>>>> use the new generation number.
>>>>>>
>>>>>> So still not clear why is this necessary.
>>>>>>
>>>>> This is not, strictly speaking, necessary, but it is the sane thing to do.
>>>>> You cannot update page's generation number to prevent it from been
>>>>> destroyed since after kvm_mmu_zap_all() completes stale ptes in the
>>>>> shadow page may point to now deleted memslot. So why build shadow page
>>>>> table with a page that is in a process of been destroyed?
>>>>
>>>> OK, can this be introduced separately, in a later patch, with separate
>>>> justification, then?
>>>>
>>>> Xiao please have the first patches of the patchset focus on the problem
>>>> at hand: fix long mmu_lock hold times.
>>>>
>>>>> Not sure what you mean again. We flush TLB once before entering this function.
>>>>> kvm_reload_remote_mmus() does this for us, no?
>>>>
>>>> kvm_reload_remote_mmus() is used as an optimization, its separate from the
>>>> problem solution.
>>>>
>>>>>>
>>>>>> What was suggested was... go to phrase which starts with "The only purpose
>>>>>> of the generation number should be to".
>>>>>>
>>>>>> The comment quoted here does not match that description.
>>>>>>
>>>>> The comment describes what code does and in this it is correct.
>>>>>
>>>>> You propose to not reload roots right away and do it only when root sp
>>>>> is encountered, right? So my question is what's the point? There are,
>>>>> obviously, root sps with invalid generation number at this point, so
>>>>> reload will happen regardless in kvm_mmu_prepare_zap_page(). So why not
>>>>> do it here right away and avoid it in kvm_mmu_prepare_zap_page() for
>>>>> invalid and obsolete sps as I proposed in one of my email?
>>>>
>>>> Sure. But Xiao please introduce that TLB collapsing optimization as a
>>>> later patch, so we can reason about it in a more organized fashion.
>>>
>>> So, if I understand correctly, you are asking to move is_obsolete_sp()
>>> check from kvm_mmu_get_page() and kvm_reload_remote_mmus() from
>>> kvm_mmu_invalidate_all_pages() to a separate patch. Fine by me, but if
>>> we drop kvm_reload_remote_mmus() from kvm_mmu_invalidate_all_pages() the
>>> call to kvm_mmu_invalidate_all_pages() in emulator_fix_hypercall() will
>>> become nop. But I question the need to zap all shadow pages tables there
>>> in the first place, why kvm_flush_remote_tlbs() is not enough?
>>
>> I do not know too... I even do no know why kvm_flush_remote_tlbs
>> is needed. :(
> We changed the content of an executable page, we need to flush instruction
> cache of all vcpus to not use stale data, so my suggestion to call

I thought the reason is about icache too but icache is automatically
flushed on x86, we only need to invalidate the prefetched instructions by
executing a serializing operation.

See the SDM in the chapter of
"8.1.3 Handling Self- and Cross-Modifying Code"

> kvm_flush_remote_tlbs() is obviously incorrect since this flushes tlb,
> not instruction cache, but why kvm_reload_remote_mmus() would flush
> instruction cache?

kvm_reload_remote_mmus do not have any help i think.

I find that this change is introduced by commit: 7aa81cc0
and I have added Anthony in the CC.

I also find some discussions related to calling
kvm_reload_remote_mmus():

>
> But if the instruction is architecture dependent, and you run on the
> wrong architecture, now you have to patch many locations at fault time,
> introducing some nasty runtime code / data cache overlap performance
> problems.  Granted, they go away eventually.
>

We're addressing that by blowing away the shadow cache and holding the
big kvm lock to ensure SMP safety.  Not a great thing to do from a
performance perspective but the whole point of patching is that the cost
is amortized.

(http://kerneltrap.org/mailarchive/linux-kernel/2007/9/14/260288)

But i can not understand...


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/