linux-kernel - Re: [PATCH v3 7/8] arm64: exception: handle asynchronous SError interrupt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2bb41d98-553f-1361-e497-4095ddd501d0@huawei.com>
Date:   Fri, 21 Apr 2017 19:33:48 +0800
From:   Xiongfeng Wang <wangxiongfeng2@...wei.com>
To:     James Morse <james.morse@....com>,
        Mark Rutland <mark.rutland@....com>,
        Xie XiuQi <xiexiuqi@...wei.com>
CC:     <christoffer.dall@...aro.org>, <marc.zyngier@....com>,
        <catalin.marinas@....com>, <will.deacon@....com>,
        <fu.wei@...aro.org>, <rostedt@...dmis.org>,
        <hanjun.guo@...aro.org>, <shiju.jose@...wei.com>,
        <wuquanming@...wei.com>, <kvm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <gengdongjiu@...wei.com>,
        <linux-acpi@...r.kernel.org>, <zhengqiang10@...wei.com>,
        <kvmarm@...ts.cs.columbia.edu>,
        <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [PATCH v3 7/8] arm64: exception: handle asynchronous SError
 interrupt

Hi James,

Thanks for your reply.

On 2017/4/20 16:52, James Morse wrote:
> Hi Wang Xiongfeng,
> 
> On 19/04/17 03:37, Xiongfeng Wang wrote:
>> On 2017/4/18 18:51, James Morse wrote:
>>> The host expects to receive physical SError Interrupts. The ARM-ARM doesn't
>>> describe a way to inject these as they are generated by the CPU.
>>>
>>> Am I right in thinking you want this to use SError Interrupts as an APEI
>>> notification? (This isn't a CPU thing so the RAS spec doesn't cover this use)
>>
>> Yes, using sei as an APEI notification is one part of my consideration. Another use is for ESB.
>> RAS spec 6.5.3 'Example software sequences: Variant: asynchronous External Abort with ESB'
>> describes the SEI recovery process when ESB is implemented.
>>
>> In this situation, SEI is routed to EL3 (SCR_EL3.EA = 1). When an SEI occurs in EL0 and not been taken immediately,
>> and then an ESB instruction at SVC entry is executed, SEI is taken to EL3. The ESB at SVC entry is
>> used for preventing the error propagating from user space to kernel space. The EL3 SEI handler collects
> 
>> the errors and fills in the APEI table, and then jump to EL2 SEI handler. EL2 SEI handler inject
>> an vSEI into EL1 by setting HCR_EL2.VSE = 1, so that when returned to OS, an SEI is pending.
> 
> This step has confused me. How would this work with VHE where the host runs at
> EL2 and there is nothing at Host:EL1?

RAS spec 6.5.3 'Example software sequences: Variant: asynchronous External Abort with ESB'
I don't know whether the step described in the section is designed for guest os or host os or both.
Yes, it doesn't work with VHE where the host runs at EL2.

>>>From your description I assume you have some firmware resident at EL2.

Our actual SEI step is planned as follows:
Host OS:  EL0/EL1 -> EL3 -> EL0/EL1
Guest OS:  EL0/EL1 -> EL3 -> EL2 -> EL0/EL1

Our problem is that, when returning to EL0/EL1, whether we should jump to EL1 SEI vector or just return to where the SEI is taken from?
In guest os situation, we can inject an vSEI and return to where the SEI is taken from.
But in host os situation, we can't inject an vSEI (if we don't set HCR_EL2.AMO), so we have to jump to EL1 SEI vector.
Then the process following ESB won't be executed becuase SEI is taken to EL3 from the ESB instruction in EL1, and when control
is returned to OS, we are in EL1 SEI vector rather than the ESB instruction.

It is ok to just ignore the process following the ESB instruction in el0_sync, because the process will be sent SIGBUS signal.
But for el0_irq, the irq process following the ESB may should be executed because the irq is not related to the user process.

If we set HCR_EL2.AMO when SCR_EL3.EA is set. We can still inject an vSEI into host OS in EL3.
Physical SError won't be taken to EL2 because SCR_EL3.EA is set. But this may be too racy is
not consistent with ARM rules.
> 
> 
>> Then ESB is executed again, and DISR_EL1.A is set by hardware (2.4.4 ESB and virtual errors), so that
>> the following process can be executed.
> 
> 
>> So we want to inject a vSEI into OS, when control is returned from EL3/2 to OS, no matter whether
>> it is on host OS or guest OS.
> 
> I disagree. With Linux/KVM you can't use Virtual SError to notify the host OS.
> The host OS expects to receive Physical SError, these shouldn't be taken to EL2
> unless a guest is running. Notifications from firmware that use SEA or SEI
> should follow the routing rules in the ARM-ARM, which means they should never
> reach a guest OS.

Yes, I agree. When running the host os, exception should not be taken to EL2, because
EL2 SEI vector always suppose that exception is taken from guest os and save
guest context in the first place.
> 
> For VHE the host runs at EL2 and sets HCR_EL2.AMO. Any firmware notification
> should come from EL3 to the host at EL2. The host may choose to notify the
> guest, but this should always go via Qemu.
> 
> For non-VHE systems the host runs at EL1 and KVM lives at EL2 to do the world
> switch. When KVM is running a guest it sets HCR_EL2.AMO, when it has switched
> back to the host it clears it.
> 
> Notifications from EL3 that pretend to be SError should route SError to EL2 or
> EL1 depending on HCR_EL2.AMO.
> When KVM gets a Synchronous External Abort or an SError while a guest is running
> it will switch back to the host and tell the handle_exit() code.  Again the host
> may choose to notify the guest, but this should always go via Qemu.
> 
> The ARM-ARM pseudo code for the routing rules is in
> AArch64.TakePhysicalSErrorException(). Firmware injecting fake SError should
> behave exactly like this.
> 
> 
> Newly appeared in the ARM-ARM is HCR_EL2.TEA, which takes Synchronous External
> Aborts to EL2 (if SCR_EL3 hasn't claimed them). We should set/clear this bit
> like we do HCR_EL2.AMO if the CPU has the RAS extensions.
> 
> 
>>> You cant use SError to cover all the possible RAS exceptions. We already have
>>> this problem using SEI if PSTATE.A was set and the exception was an imprecise
>>> abort from EL2. We can't return to the interrupted context and we can't deliver
>>> an SError to EL2 either.
>>
>> SEI came from EL2 and PSTATE.A is set. Is it the situation where VHE is enabled and CPU is running
>> in kernel space. If SEI occurs in kernel space, can we just panic or shutdown.
> 
> firmware-panic? We can do a little better than that:
> If the IESB bit is set in the ESR we can behave as if this were an ESB and have
> firmware write an appropriate ESR to DISR_EL1 if PSTATE.A is set and the
> exception came from EL2.

This method may solve the problem I said above.
Is IESB a new bit added int ESR in the newest RAS spec?
> 
> Linux should have an ESB in its __switch_to(), I've re-worked the daif masking
> in entry.S so that PSTATE.A will always be unmasked here.
> 
> Other than these two cases, yes, this CPU really is wrecked. Firmware can power
> it off via PSCI, and could notify another CPU about what happened. UEFI's table
> 250 'Processor Generic Error Section' has a 'Processor ID' field, with a note
> that on ARM this is the MPIDR_EL1. Table 260 'ARM Processor Error Section' has
> something similar.
> 
> 
> Thanks,
> 
> James
> 
> .
> 
Thanks,

Wang Xiongfeng