Message-ID: <20130508222845.GL30955@pd.tnic>
Date: Thu, 9 May 2013 00:28:45 +0200
From: Borislav Petkov <bp@...en8.de>
To: "Rafael J. Wysocki" <rjw@...k.pl>
Cc: Lance Ortiz <lance.ortiz@...com>, bhelgaas@...gle.com,
lance_ortiz@...mail.com, jiang.liu@...wei.com, tony.luck@...el.com,
rostedt@...dmis.org, mchehab@...hat.com,
linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] aerdrv: Move cper_print_pcie() out of interrupt context

On Thu, May 09, 2013 at 12:01:21AM +0200, Rafael J. Wysocki wrote:
> On Wednesday, May 08, 2013 11:15:19 AM Lance Ortiz wrote:
> > The following warning was seen on 3.9 when a corrected PCIe error was being
> > handled by the AER subsystem.
> >
> > WARNING: at .../drivers/pci/search.c:214 pci_get_dev_by_id+0x8a/0x90()
> >
> > This occurred because code was added to the function cper_print_pcie() that
> > calls the pci_get_domain_bus_and_slot() function.
>
> Do you know which commit added that code?

1d5210008bd3a26daf4b06aed9d6c330dd4c83e2

> > cper_print_pcie() is called
> > in an interrupt context and pci_get* functions are not supposed to be called
> > in that context hence the warning.
> >
> > The solution is to move the call to cper_print_aer() out of the interrupt
> > context and into aer_recover_queue() to avoid any warnings when calling
> > pci_get* functions.
>
> The way the changes are described here isn't particularly clear to me. I'd say
> something like
>
> "If cper_print_aer() is called by aer_recover_work_func(), there won't be any
> reason to call it from cper_print_pcie() any more, in which case all of the
> problematic code needed only to prepare for the cper_print_aer() call,
> including the invocation of pci_get_domain_bus_and_slot() causing the warning
> to be printed, may be removed from there. Make that happen."
>
> Also, since aer_recover_work_func() is going to be the only existing caller of
> cper_print_aer() after this change, as far as I can say, and it doesn't use the
> function's first argument, that argument should be dropped entirely.

Hmm, that needs more diddling: AFAICT __ghes_print_estatus() figures out
what the prefix is depending on the ->error_severity coming from the
acpi_hest_generic_status thing so it probably needs to be handed down or
similar...

Fun.

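
For illustration only, here is a userspace C sketch of the pattern under
discussion: the interrupt-side path only copies the error data (including
the severity) into a queue, and the deferred work function does the
lookup-and-print step, deriving the prefix from the severity that was handed
down. This is not the kernel code. Every sketch_* name, the enum, and the
prefix strings are hypothetical, and a fixed-size ring stands in for the real
queue and workqueue machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical severity levels; stand-ins for the real HEST/CPER values. */
enum sketch_sev { SKETCH_CORRECTED, SKETCH_RECOVERABLE, SKETCH_FATAL };

/* Everything the deferred side needs is copied into the entry up front. */
struct sketch_entry {
	enum sketch_sev severity;
	unsigned int domain;
	unsigned int bus;
	unsigned int devfn;
};

#define SKETCH_RING_SIZE 8u	/* power of two, so index wraparound is safe */

static struct sketch_entry sketch_ring[SKETCH_RING_SIZE];
static unsigned int sketch_head;
static unsigned int sketch_tail;

/* "Interrupt context" side: copy the data and return. No lookups, no printing. */
static int sketch_recover_queue(enum sketch_sev severity, unsigned int domain,
				unsigned int bus, unsigned int devfn)
{
	if (sketch_head - sketch_tail == SKETCH_RING_SIZE)
		return -1;	/* ring full, entry dropped */

	sketch_ring[sketch_head % SKETCH_RING_SIZE] =
		(struct sketch_entry){ severity, domain, bus, devfn };
	sketch_head++;
	return 0;
}

/* The prefix depends on the severity, so the severity must travel with the entry. */
static const char *sketch_prefix(enum sketch_sev severity)
{
	switch (severity) {
	case SKETCH_CORRECTED:   return "corrected";
	case SKETCH_RECOVERABLE: return "recoverable";
	case SKETCH_FATAL:       return "fatal";
	}
	return "unknown";
}

/*
 * "Process context" side: drain the ring and format one line per entry
 * into out. Returns the number of entries handled.
 */
static int sketch_recover_work_func(char *out, size_t len)
{
	size_t used = 0;
	int handled = 0;

	assert(out != NULL && len > 0);
	out[0] = '\0';

	while (sketch_tail != sketch_head) {
		const struct sketch_entry *e =
			&sketch_ring[sketch_tail % SKETCH_RING_SIZE];
		int n = snprintf(out + used, len - used,
				 "[%s] %04x:%02x:%02x.%x\n",
				 sketch_prefix(e->severity), e->domain, e->bus,
				 e->devfn >> 3, e->devfn & 7u);

		if (n < 0 || (size_t)n >= len - used) {
			out[used] = '\0';	/* drop the partial line */
			break;			/* entry stays queued */
		}
		used += (size_t)n;
		sketch_tail++;
		handled++;
	}
	return handled;
}
```

The point of the sketch is only that the severity has to be stored in the
queued entry, since the deferred function no longer has the original status
block in hand when it picks the prefix.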
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.