Message-ID: <47ee8f0e-1a50-4284-b33f-115f898fedcf@amd.com>
Date:   Fri, 20 Oct 2023 12:13:18 -0500
From:   Mario Limonciello <mario.limonciello@....com>
To:     Hans de Goede <hdegoede@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>, kys@...rosoft.com,
        hpa@...ux.intel.com
Cc:     x86@...nel.org, LKML <linux-kernel@...r.kernel.org>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: PIC probing code from e179f6914152 failing

On 10/20/2023 10:16, Hans de Goede wrote:
> Hi Mario,
> 
> On 10/19/23 23:20, Mario Limonciello wrote:
>> On 10/18/2023 17:50, Thomas Gleixner wrote:
> 
> <snip>
> 
>>> But that brings up an interesting question. How are those affected
>>> machines even reaching a state where the user notices that just the
>>> keyboard and the GPIO are not working? Why?
>>
>> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to try to discover the IRQ to use.
>>
>> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes -EINVAL).
>>
>> I can "work around it" by:
>>
>> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
>> index 76bfcba25003..2b4b436c65d8 100644
>> --- a/drivers/base/platform.c
>> +++ b/drivers/base/platform.c
>> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
>>          }
>>
>>          r = platform_get_resource(dev, IORESOURCE_IRQ, num);
>> -       if (has_acpi_companion(&dev->dev)) {
>> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
>> +            has_acpi_companion(&dev->dev)) {
>>                  if (r && r->flags & IORESOURCE_DISABLED) {
>>                          ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
>>                          if (ret)
>>
>> but the resource that is returned from the next hunk has the resource flags set wrong in the NULL pic case:
>>
>> NULL case:
>> r: AMDI0030:00 flags: 0x30000418
>> PIC case:
>> r: AMDI0030:00 flags: 0x418
>>
>> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
>>
>> As a result, the GPIO controller interrupts later do not actually work.
>> For example the attn pin for my I2C touchpad doesn't work.
> 
> Right, the issue is that with the legacy-PIC path disabled /
> with nr_legacy_irqs() returning 0, there is no mapping
> added for the Legacy ISA IRQs, which causes this problem.
> 
> My hack to set nr_legacy_irqs to 16 also for the NULL PIC from:
> https://bugzilla.kernel.org/show_bug.cgi?id=218003
> 
> Does cause the Legacy ISA IRQ mappings to get added and makes
> the GPIO controller actually work, as can be seen from:
> 
> https://bugzilla.kernel.org/attachment.cgi?id=305241&action=edit
> 
> Which is a dmesg with that hack and it does NOT have this error:
> 
> [    0.276113] amd_gpio AMDI0030:00: error -EINVAL: IRQ index 0 not found
> [    0.278464] amd_gpio: probe of AMDI0030:00 failed with error -22
> 
> and the reporter also reports the touchpad works with this patch.
> 
> As Thomas already said, the legacy PIC really is not necessary,
> but what is still necessary on these laptops with the legacy PIC
> not initialized is to have the Legacy ISA IRQ mappings added
> by the kernel itself since these are missing from the MADT
> (if I have my ACPI/IOAPIC terminology correct).

They're not missing; the problem is that the IOAPIC code doesn't
let them get updated because of what I see as an extra nr_legacy_irqs()
check.

I believe the series I posted fixes this issue.

> 
> This quick hack (which is the one from the working dmesg)
> does this:
> 
> --- a/arch/x86/kernel/i8259.c	
> +++ a/arch/x86/kernel/i8259.c	
> @@ -394,7 +394,7 @@ static int legacy_pic_probe(void)
>   }
>   
>   struct legacy_pic null_legacy_pic = {
> -	.nr_legacy_irqs = 0,
> +	.nr_legacy_irqs = NR_IRQS_LEGACY,
>   	.chip = &dummy_irq_chip,
>   	.mask = legacy_pic_uint_noop,
>   	.unmask = legacy_pic_uint_noop,
> 
> But I believe this will break things when there are actually
> non legacy ISA IRQs / GSI-s using GSI numbers < NR_IRQS_LEGACY
> 
> Thomas, I'm not at all familiar with this area of the kernel,
> but would checking if the MADT defines any non ISA GSIs under
> 16 and if NOT use nr_legacy_irqs = NR_IRQS_LEGACY for the
> NULL PIC be an option?
> 
> Or maybe some sort of DMI (sys_vendor == Lenovo) quirk to
> set nr_legacy_irqs = NR_IRQS_LEGACY for the NULL PIC ?
> 

I'd prefer we don't do this.
As tglx pointed out there is an underlying bug and we shouldn't paper 
over it with quirks.

My guess at why he doesn't see this issue on his system is that the
default preconfigured IOAPIC mappings (polarity and triggering) happen
to match the values that would have been programmed from _CRS.

That's not the case here.
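
For reference, the two resource flag values quoted earlier in the thread (0x30000418 in the NULL PIC case, 0x418 in the PIC case) can be decoded with a small userspace sketch. The constants below are copied by hand from include/linux/ioport.h and should be treated as assumptions to check against your tree; irq_resource_usable() and print_irq_flags() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed values, mirrored by hand from include/linux/ioport.h. */
#define IORESOURCE_IRQ            0x00000400UL
#define IORESOURCE_IRQ_LOWLEVEL   (1UL << 3)
#define IORESOURCE_IRQ_SHAREABLE  (1UL << 4)
#define IORESOURCE_DISABLED       0x10000000UL
#define IORESOURCE_UNSET          0x20000000UL

/*
 * Hypothetical helper: true when the flags describe an IRQ resource
 * that is neither disabled nor unset.
 */
bool irq_resource_usable(unsigned long flags)
{
	return (flags & IORESOURCE_IRQ) &&
	       !(flags & (IORESOURCE_DISABLED | IORESOURCE_UNSET));
}

/* Hypothetical helper: print the bits relevant to this thread. */
void print_irq_flags(const char *label, unsigned long flags)
{
	assert(label != NULL);
	printf("%s: flags 0x%lx%s%s%s%s%s\n", label, flags,
	       (flags & IORESOURCE_IRQ) ? " IRQ" : "",
	       (flags & IORESOURCE_IRQ_LOWLEVEL) ? " LOWLEVEL" : "",
	       (flags & IORESOURCE_IRQ_SHAREABLE) ? " SHAREABLE" : "",
	       (flags & IORESOURCE_DISABLED) ? " DISABLED" : "",
	       (flags & IORESOURCE_UNSET) ? " UNSET" : "");
}
```

Under these assumed values, calling print_irq_flags() on both numbers shows the same trigger bits (low level, shareable) in each case; the NULL PIC value differs only by the extra DISABLED and UNSET bits, which is why irq_resource_usable() would reject it.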
