[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5880FDFC.7060304@arm.com>
Date: Thu, 19 Jan 2017 17:57:16 +0000
From: James Morse <james.morse@....com>
To: "Baicar, Tyler" <tbaicar@...eaurora.org>
CC: christoffer.dall@...aro.org, marc.zyngier@....com,
pbonzini@...hat.com, rkrcmar@...hat.com, linux@...linux.org.uk,
catalin.marinas@....com, will.deacon@....com, rjw@...ysocki.net,
lenb@...nel.org, matt@...eblueprint.co.uk, robert.moore@...el.com,
lv.zheng@...el.com, nkaje@...eaurora.org, zjzhang@...eaurora.org,
mark.rutland@....com, akpm@...ux-foundation.org,
eun.taik.lee@...sung.com, sandeepa.s.prabhu@...il.com,
labbott@...hat.com, shijie.huang@....com, rruigrok@...eaurora.org,
paul.gortmaker@...driver.com, tn@...ihalf.com, fu.wei@...aro.org,
rostedt@...dmis.org, bristot@...hat.com,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-acpi@...r.kernel.org, linux-efi@...r.kernel.org,
devel@...ica.org, Suzuki.Poulose@....com, punit.agrawal@....com,
astone@...hat.com, harba@...eaurora.org, hanjun.guo@...aro.org,
john.garry@...wei.com, shiju.jose@...wei.com
Subject: Re: [PATCH V7 05/10] acpi: apei: handle SEA notification type for
ARMv8
Hi Tyler,
On 18/01/17 23:51, Baicar, Tyler wrote:
> On 1/18/2017 7:50 AM, James Morse wrote:
>> On 12/01/17 18:15, Tyler Baicar wrote:
>>> ARM APEI extension proposal added SEA (Synchrounous External
>>> Abort) notification type for ARMv8.
>>> Add a new GHES error source handling function for SEA. If an error
>>> source's notification type is SEA, then this function can be registered
>>> into the SEA exception handler. That way GHES will parse and report
>>> SEA exceptions when they occur.
>>> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
>>> index 2acbc60..87efe26 100644
>>> --- a/drivers/acpi/apei/ghes.c
>>> +++ b/drivers/acpi/apei/ghes.c
>>> @@ -767,6 +772,62 @@ static struct notifier_block ghes_notifier_sci = {
>>> .notifier_call = ghes_notify_sci,
>>> };
>>> +#ifdef CONFIG_HAVE_ACPI_APEI_SEA
>>> +static LIST_HEAD(ghes_sea);
>>> +
>>> +static int ghes_notify_sea(struct notifier_block *this,
>>> + unsigned long event, void *data)
>>> +{
>>> + struct ghes *ghes;
>>> + int ret = NOTIFY_DONE;
>>> +
>>> + nmi_enter();
>> Can we move this into the arch code? Its because we got here from a
>> synchronous-exception that makes this nmi-like, I think it only makes sense for
>> it be called from under /arch/.
> So move the nmi_enter/exit calls into do_sea of the previous patch? I can do
> that in the next patchset.
>> Where did the rcu_read_lock() go? I can see its missing from ghes_notify_nmi()
>> too, but I don't know enough about RCU to know if that's safe!
>>
>> The second paragraph in the comment above rcu_read_lock() describes it as
>> preventing call_rcu() during a read-side critical section that was running
>> concurrently. Doesn't this mean we can race with ghes_sea_remove() on another
>> CPU because we wait for the wrong grace period?
>>
>> The same comment talks about how these read-side critical sections can nest, so
>> I think its quite safe to make these 'lock' calls here.
> Sorry, I thought we wanted nmi_enter/exit instead of the rcu_read_lock/unlock. I
> guess the rcu locks
> will not cause the deadlock scenario you described in the previous patchset if
> we have the
> nmi_enter/exit wrapped around the rcu critical section.
Ah, not instead of, (well, not initially!).
The nmi_enter()/nmi_exit() thing was to fix the APEI interrupting APEI problem.
This is only a problem for notification types which can interrupt
interrupts-masked code, of which SEA is one. (and x86's NMI is the other).
I think I've found the answer to why the rcu_read_lock() isn't needed.
synchronize_sched() has:
> * This means that all preempt_disable code sequences, including NMI and
> * non-threaded hardware-interrupt handlers, in progress on entry will
> * have completed before this primitive returns.
synchronize_rcu() has the same innards, so I'm convinced this its safe not to
have those calls in here. Could we have a comment along the lines of:
> synchronize_rcu() will wait for nmi_exit(), so no need to rcu_read_lock().
(The more I learn about RCU the scarier it becomes!)
There are two other things that need changing to make the in_nmi() code path
work on arm64.
Always reserve the virtual-address-space forcing GHES_IOREMAP_PAGES to be 2
regardless of CONFIG_HAVE_ACPI_APEI_NMI. This is almost revert of
594c7255dce7a13cac50cf2470cc56e2c3b0494e (but that did a few other things too).
We also need to fix ghes_ioremap_pfn_nmi() to use arch_apei_get_mem_attribute()
and not assume PAGE_KERNEL.
Thanks,
James
Powered by blists - more mailing lists