Message-ID: <CAMj-D2DLW3a1XxWtW0xouBoLEVFqHmX5t-ds=2bCyb8ZbRS9Tg@mail.gmail.com>
Date: Sun, 21 May 2017 16:24:26 +0800
From: gengdongjiu <gengdj.1984@...il.com>
To: James Morse <james.morse@....com>
Cc: Tyler Baicar <tbaicar@...eaurora.org>,
Christoffer Dall <christoffer.dall@...aro.org>,
Marc Zyngier <marc.zyngier@....com>, pbonzini@...hat.com,
rkrcmar@...hat.com, linux@...linux.org.uk, catalin.marinas@....com,
will.deacon@....com, rjw@...ysocki.net,
Len Brown <lenb@...nel.org>, matt@...eblueprint.co.uk,
robert.moore@...el.com, lv.zheng@...el.com, nkaje@...eaurora.org,
zjzhang@...eaurora.org, mark.rutland@....com,
akpm@...ux-foundation.org, eun.taik.lee@...sung.com,
Sandeepa Prabhu <sandeepa.s.prabhu@...il.com>,
labbott@...hat.com, shijie.huang@....com, rruigrok@...eaurora.org,
paul.gortmaker@...driver.com, tn@...ihalf.com,
Fu Wei <fu.wei@...aro.org>, rostedt@...dmis.org,
bristot@...hat.com, linux-arm-kernel@...ts.infradead.org,
kvmarm@...ts.cs.columbia.edu, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org,
linux-efi@...r.kernel.org, devel@...ica.org,
Suzuki.Poulose@....com, Punit Agrawal <punit.agrawal@....com>,
astone@...hat.com, harba@...eaurora.org,
Hanjun Guo <hanjun.guo@...aro.org>,
John Garry <john.garry@...wei.com>,
Shiju Jose <shiju.jose@...wei.com>, joe@...ches.com,
Xiongfeng Wang <wangxiongfeng2@...wei.com>
Subject: Re: [PATCH v3 3/3] arm/arm64: signal SIGBUS and inject SEA Error
Hi James,
Sorry for the late response. I have been verifying and debugging the RAS
solution recently.
2017-05-13 1:24 GMT+08:00, James Morse <james.morse@....com>:
> Hi gengdongjiu,
>
> On 05/05/17 13:31, gengdongjiu wrote:
>> when guest OS happen an SEA, My current solution is shown below:
>>
>> (1) host EL3 firmware firstly handle the SEA error and generate the CPER
>> record.
>> (2) EL3 firmware separately copy the esr_el3, elr_el3, SPSR_el3,
>> far_el3 to the esr_el2, elr_el2, SPSR_el2, far_el2.
>
> Copying {ELR,SPSR,FAR}_EL3 to the EL2 registers rings some alarm bells: I'm sure
> you exclude values from EL3 or the secure-world, we should never hand those to
> the normal world.
Yes, errors from EL3 and from the secure world definitely need to be excluded.
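As a rough sketch of the kind of check I mean (this is not the real firmware
code; the function name and the choice to refuse AArch32 states are only
assumptions for illustration), EL3 would copy the syndrome registers to EL2
only when the abort was taken from a non-secure exception level below EL3:

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bit positions (ARMv8-A). */
#define SCR_EL3_NS       (UINT64_C(1) << 0)  /* lower ELs are non-secure */
#define SPSR_M_AARCH32   (UINT64_C(1) << 4)  /* M[4]: taken from AArch32 */
#define SPSR_M_EL_SHIFT  2                   /* M[3:2]: source EL (AArch64) */
#define SPSR_M_EL_MASK   UINT64_C(0x3)

/*
 * Hypothetical helper: may {ESR,ELR,SPSR,FAR}_EL3 be copied to the
 * corresponding EL2 registers?
 *
 * Assumption: scr_el3 is the value that was live when the abort was
 * taken, so its NS bit describes the interrupted context.
 */
bool sea_may_forward_to_el2(uint64_t spsr_el3, uint64_t scr_el3)
{
	unsigned int src_el;

	/* Abort taken from the secure world: never hand it to EL2. */
	if (!(scr_el3 & SCR_EL3_NS))
		return false;

	/* This sketch only handles AArch64 states; refuse the rest. */
	if (spsr_el3 & SPSR_M_AARCH32)
		return false;

	/* Abort taken from EL3 itself: keep it inside firmware. */
	src_el = (unsigned int)((spsr_el3 >> SPSR_M_EL_SHIFT) & SPSR_M_EL_MASK);
	return src_el < 3;
}
```

With this, an abort taken from non-secure EL1h (SPSR_EL3.M = 0b0101) would be
forwarded, while one taken from EL3h (M = 0b1101) or with SCR_EL3.NS clear
would stay in firmware.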
>
>
>> (3) then jump the EL2 hypervisor
>
>> so the EL2 hypervisor uses the ESR that come from esr_el3, here the
>> ESR(esr_el3) value may be different with the exist KVM API's ESR.
>
> The ESR may be different between EL3 and EL2. The ESR contains the severity of
> the event, the CPU will choose this when it takes the SError to EL3. If it had
> taken the SError to EL2, the CPU may have classified the error differently.
>
> Firmware may need to generate a more severe ESR if it receives an error that
> would be propagated by delivering SEI to a lower exception level, for example if
> an EL2 system register is 'infected'.
>
> This is the same for Qemu/kvmtool. A contained error at EL2 may be an
> uncontained error if we hand it to guest EL1. Linux's RAS code will decide this
> with its choice of signal to send, (and possibly which code to set).
> Qemu/kvmtool need to choose an appropriate APEI notification, which may involve
> generating a relevant ESR.
>
> Also relevant is the problem we discussed earlier with trying to deliver fake
> Physical-SError from software at EL3: If the SError is routed to EL2, and EL2
> has PSTATE.A masked, EL3 has to wait and try again later. This is another case
> where firmware may have to upgrade the classification of an error to
> uncontainable.
That makes sense. Thanks, James.
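To check my understanding, here is a minimal sketch of such an upgrade. It
assumes an SError syndrome whose AET field (ESR_ELx[12:10], RAS extension) is
valid; the function name and the two policy flags are hypothetical and only
stand in for the two cases you describe (an 'infected' EL2 register, and an
SError that cannot be delivered because PSTATE.A is masked):

```c
#include <stdbool.h>
#include <stdint.h>

/* SError syndrome AET field, ESR_ELx[12:10] (ARMv8.2 RAS extension). */
#define ESR_AET_SHIFT  10
#define ESR_AET_MASK   (UINT64_C(0x7) << ESR_AET_SHIFT)
#define ESR_AET_UC     (UINT64_C(0x0) << ESR_AET_SHIFT)  /* uncontainable */
#define ESR_AET_UEU    (UINT64_C(0x1) << ESR_AET_SHIFT)  /* unrecoverable */
#define ESR_AET_UEO    (UINT64_C(0x2) << ESR_AET_SHIFT)  /* restartable */
#define ESR_AET_UER    (UINT64_C(0x3) << ESR_AET_SHIFT)  /* recoverable */
#define ESR_AET_CE     (UINT64_C(0x6) << ESR_AET_SHIFT)  /* corrected */

/*
 * Hypothetical helper: build the ESR that is handed to the lower
 * exception level. If either condition holds, the error can no longer
 * be treated as contained, so the AET field is rewritten to UC and all
 * other syndrome bits are preserved.
 */
uint64_t esr_for_lower_el(uint64_t esr, bool el2_state_infected,
			  bool cannot_deliver_now)
{
	if (el2_state_infected || cannot_deliver_now)
		return (esr & ~ESR_AET_MASK) | ESR_AET_UC;

	return esr;
}
```

The same idea would apply to Qemu/kvmtool when they generate an ESR for the
guest, only with their own conditions in place of the two flags.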
>
>
> Thanks,
>
> James
>