[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <438c1743-5c8a-287d-3f97-e4a451ae8027@arm.com>
Date: Mon, 18 May 2020 19:08:17 +0100
From: James Morse <james.morse@....com>
To: Dan Williams <dan.j.williams@...el.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
Cc: kernel test robot <rong.a.chen@...el.com>,
stable <stable@...r.kernel.org>, Len Brown <lenb@...nel.org>,
Borislav Petkov <bp@...en8.de>,
Ira Weiny <ira.weiny@...el.com>,
Erik Kaneda <erik.kaneda@...el.com>,
Myron Stowe <myron.stowe@...hat.com>,
"Rafael J. Wysocki" <rjw@...ysocki.net>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>, lkp@...ts.01.org,
Linux ACPI <linux-acpi@...r.kernel.org>,
"Huang, Ying" <ying.huang@...el.com>
Subject: Re: [ACPI] b13663bdf9:
BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c
Hi guys,
On 12/05/2020 19:05, Dan Williams wrote:
> On Tue, May 12, 2020 at 9:28 AM Rafael J. Wysocki
> <rafael.j.wysocki@...el.com> wrote:
>> Dan,
>>
>> Has this been addressed in the v2?
>
> No, this looks like a case I was concerned about, i.e. the GHES code
> is not being completely careful to avoid calling potentially sleeping
> functions with interrupts disabled. There is the nice comment that
> indicates that the fixmap should be used when ghes_notify_lock_irq()
> is held, but there seems to be no infrastructure to use / divert to
> the fixmap in the ghes_proc() path.
ghes_map()/ghes_unmap() use the fixmap for reading the firmware provided records,
but this came through apei_read(), which claims to be IRQ and NMI safe...
> That needs to be reworked first.
> It seems the implementation was getting lucky before to hit the cached
> acpi_ioremap in this path under rcu_read_lock(), but it appears it
> should have always been using the fixmap. Ying, James, is my read
> correct?
The path through this thing is pretty tortuous: The static HEST contains the address of
the pointer that firmware updates to point to CPER records when they are generated. This
pointer might be static (records are always in the same place), it might not.
The address in the tables is static. ghes.c maps it in ghes_new():
| rc = apei_map_generic_address(&generic->error_status_address);
which happens before the ghes_add_timer()/request_irq()/ghes_nmi_add() stuff, so we should
always use the existing mapping.
__ghes_peek_estatus() reads the pointer with apei_read(), which should use the mapping
from ghes_new(), then uses ghes_copy_tofrom_phys() which uses the fixmap to read the CPER
records.
Does apei_map_generic_address() no longer keep the GAR/address mapped?
(also possible I've totally mis-understood how ACPIs caching of mappings works!)
Thanks,
James
Powered by blists - more mailing lists