linux-kernel - Re: [PATCH RFCv2 9/9] arm64: Support async page fault

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4337cca152df47c93d96e092189a0e36@kernel.org>
Date:   Sun, 31 May 2020 13:44:40 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     Gavin Shan <gshan@...hat.com>, kvmarm@...ts.cs.columbia.edu,
        linux-kernel@...r.kernel.org, shan.gavin@...il.com,
        catalin.marinas@....com, will@...nel.org,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH RFCv2 9/9] arm64: Support async page fault

On 2020-05-29 12:11, Paolo Bonzini wrote:
> On 29/05/20 11:41, Marc Zyngier wrote:
>>>> 
>>>> 
>>>> For x86 the advantage is that the processor can take care of raising 
>>>> the
>>>> stage2 page fault in the guest, so it's faster.
>>>> 
>>> I think there might be too much overhead if the page can be populated
>>> quickly by host. For example, it's fast to populate the pages if 
>>> swapin
>>> isn't involved.
> 
> Those would still be handled by the host.  Only those that are not
> present in the host (which you can see through the MMU notifier) would
> be routed to the guest.  You can do things differently between "not
> present fault because the page table does not exist" and "not present
> fault because the page is missing in the host".
> 
>>> If I'm correct enough, it seems arm64 doesn't have similar mechanism,
>>> routing stage2 page fault to guest.
>> 
>> Indeed, this isn't a thing on arm64. Exception caused by a S2 fault 
>> are
>> always routed to EL2.
> 
> Is there an ARM-approved way to reuse the S2 fault syndromes to detect
> async page faults?

It would mean being able to set an ESR_EL2 register value into ESR_EL1,
and there is nothing in the architecture that would allow that, with
the exception of nested virt: a VHE guest hypervisor running at EL1
must be able to observe S2 faults for its own S2, as synthesized by
the host hypervisor.

The trouble is that:
- there is so far no commercially available CPU supporting NV
- even if you could get hold of such a machine, there is no
   guarantee that such "EL2 syndrome at EL1" is valid outside of
   the nested context
- this doesn't solve the issue for non-NV CPUs anyway

> (By the way, another "modern" use for async page faults is for postcopy
> live migration).

Right. That's definitely a more interesting version of "swap-in".

         M.
-- 
Jazz is not dead. It just smells funny...