[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e090d7f5-208a-4a4d-8162-7202ad6b0183@roeck-us.net>
Date: Tue, 9 Jan 2024 14:32:41 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: minyard@....org
Cc: Kai-Heng Feng <kai.heng.feng@...onical.com>, jdelvare@...e.com,
linux-hwmon@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/2] hwmon: (acpi_power_meter) Ensure IPMI space
handler is ready on Dell systems
On 1/9/24 13:28, Corey Minyard wrote:
> On Tue, Jan 09, 2024 at 07:23:40AM -0800, Guenter Roeck wrote:
>> On 1/8/24 20:12, Kai-Heng Feng wrote:
>>> The following error can be observed at boot:
>>> [ 3.717920] ACPI Error: No handler for Region [SYSI] (00000000ab9e62c5) [IPMI] (20230628/evregion-130)
>>> [ 3.717928] ACPI Error: Region IPMI (ID=7) has no handler (20230628/exfldio-261)
>>>
>>> [ 3.717936] No Local Variables are initialized for Method [_GHL]
>>>
>>> [ 3.717938] No Arguments are initialized for method [_GHL]
>>>
>>> [ 3.717940] ACPI Error: Aborting method \_SB.PMI0._GHL due to previous error (AE_NOT_EXIST) (20230628/psparse-529)
>>> [ 3.717949] ACPI Error: Aborting method \_SB.PMI0._PMC due to previous error (AE_NOT_EXIST) (20230628/psparse-529)
>>> [ 3.717957] ACPI: \_SB_.PMI0: _PMC evaluation failed: AE_NOT_EXIST
>>>
>>> On Dell systems several methods of acpi_power_meter access variables in
>>> IPMI region [0], so wait until IPMI space handler is installed by
>>> acpi_ipmi and also wait until SMI is selected to make the space handler
>>> fully functional.
>>>
>>> [0] https://www.dell.com/support/manuals/en-us/redhat-enterprise-linux-v8.0/rhel8_rn_pub/advanced-configuration-and-power-interface-acpi-error-messages-displayed-in-dmesg?guid=guid-0d5ae482-1977-42cf-b417-3ed5c3f5ee62
>>>
>>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@...onical.com>
>>> ---
>>> v4:
>>> - No change.
>>>
>>> v3:
>>> - Use helper.
>>> - Use return value to print warning message.
>>>
>>> v2:
>>> - Use completion instead of request_module().
>>>
>>> drivers/hwmon/acpi_power_meter.c | 6 ++++++
>>> 1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/hwmon/acpi_power_meter.c b/drivers/hwmon/acpi_power_meter.c
>>> index 703666b95bf4..33fb9626633d 100644
>>> --- a/drivers/hwmon/acpi_power_meter.c
>>> +++ b/drivers/hwmon/acpi_power_meter.c
>>> @@ -883,6 +883,12 @@ static int acpi_power_meter_add(struct acpi_device *device)
>>> strcpy(acpi_device_class(device), ACPI_POWER_METER_CLASS);
>>> device->driver_data = resource;
>>> + if (dmi_match(DMI_SYS_VENDOR, "Dell Inc.") &&
>>> + acpi_dev_get_first_match_dev("IPI0001", NULL, -1)) {
>>> + if (acpi_wait_for_acpi_ipmi())
>>> + dev_warn(&device->dev, "Waiting for ACPI IPMI timeout");
>>> + }
>>> +
>>
>> What a hack :-(.
>>
>> This needs a comment in the driver explaining the rationale for this change, and
>> also a comment explaining why, for example, using late_initcall() does not help.
>>
>> If CONFIG_IPMI_SI=n, acpi_wait_for_acpi_ipmi() will return 0, indicating success.
>> I can only imagine that this will result in a failure since the whole point
>> of this code is to wait until that driver is loaded. Please explain how and why
>> the code works with CONFIG_IPMI_SI=n. Similar, if the function returns an error,
>> I can not imagine how it would make sense to instantiate the driver. If it does
>> make sense to continue in this situation, a comment is needed in the code
>> describing the rationale.
>
> I'm trying to figure out where CONFIG_IPMI_SI comes in here. It's
> nowhere in these patches or in drivers/acpi. ACPI_IPMI depends on
> IPMI_HANDLER, but that's all I found. However, ACPI_IPMI can be "m" as
> you mention and SENSOR_ACPI_POWER is only under the ACPI config, which
> is a problem.
>
The patch above is looking for IPI0001, which is instantiated in
drivers/char/ipmi/ipmi_si_platform.c: { "IPI0001", 0 },
drivers/char/ipmi/ipmi_ssif.c: { "IPI0001", 0 },
Are you saying that the above code doesn't depend on it ? In that case,
why does it need to check for the IPI0001 device in the first place ?
That will need another comment/explanation in the code because people
(or maybe dummies) like me won't understand the non-dependency (i.e.,
the need to look for IPI0001 but not requiring the associated code).
More specifically, unless I really don't understand the acpi code,
acpi_dev_get_first_match_dev() will return NULL if there is no matching
device. In that case, the above code won't call acpi_wait_for_acpi_ipmi().
Fine, but why would this driver have to wait for ipmi if and only if there
is a device (and thus a driver) for IPI0001 ?
Thanks,
Guenter
> I do think there are other issues with this patch, though. The IPMI
> handler code decouples the user from the driver from a dependency point
> of view. It seems to be fairly common to see IPMI_HANDLER and
> ACPI_IPMI as "y" and IPMI_SI (and IPMI_SSIF, and others) as "m". That
> means this code will run but will wait for the IPMI device to appear,
> which may not be until the module gets loaded, which may be far more
> than 2 seconds later.
>
> I'm not quite sure how to fix this. Really, the add call for this
> driver shouldn't be called until the IPMI device is present. Doesn't
> ACPI have mechanisms to handle this sort of thing? If so, the hack may
> need to be in the handling of that ACPI data (this field is not there
> but should be), not here, which as Guenter says, is a big hack.
>
> -corey
>
>>
>> Third, the new symbol is declared with CONFIG_ACPI, but defined with
>> CONFIG_IPMI_SI. I can not imagine how this would compile with CONFIG_ACPI=y
>> and CONFIG_IPMI_SI={m,n} and/or CONFIG_ACPI_IPMI={m,n}.
>>
>> On top of that, IPMI_SI and ACPI_IPMI are is tristate, as is SENSORS_ACPI_POWER.
>> This means that SENSORS_ACPI_POWER=y combined with CONFIG_IPMI_SI={m,n} or
>> CONFIG_ACPI_IPMI={m,n} will result in a compile failure.
>>
>> Please make sure that this code compiles with all possible symbol combinations.
>>
>> Thanks,
>> Guenter
>>
>>> res = read_capabilities(resource);
>>> if (res)
>>> goto exit_free;
>>
>>
>
Powered by blists - more mailing lists