Message-ID: <5318D475.9040701@amd.com>
Date:	Thu, 6 Mar 2014 14:03:01 -0600
From:	Suravee Suthikulpanit <suravee.suthikulpanit@....com>
To:	Bjorn Helgaas <bhelgaas@...gle.com>
CC:	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Aravind Gopalakrishnan <aravind.gopalakrishnan@....com>,
	<kim.naru@....com>,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	Yinghai Lu <yinghai@...nel.org>
Subject: Re: [PATCH 0/3] amd/pci: Add AMD hostbridge supports for newer AMD
 systems

On 3/6/2014 11:40 AM, Bjorn Helgaas wrote:
> [+cc Yinghai, sorry I didn't think of it before]
>
> On Wed, Mar 5, 2014 at 11:30 PM, Suravee Suthikulpanit
> <suravee.suthikulpanit@....com> wrote:
>> On 3/5/2014 8:13 PM, Suravee Suthikulanit wrote:
>>>
>>> On 3/5/2014 3:24 PM, Bjorn Helgaas wrote:
>>>>
>>>> [+cc linux-acpi]
>>>>
>>>> On Wed, Mar 5, 2014 at 2:06 PM,  <suravee.suthikulpanit@....com> wrote:
>>>>>
>>>>> From: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
>>>>>
>>>>> The current code only supports AMD hostbridges up to family11h.
>>>>> This causes PCI numa_node information to be reported incorrectly
>>>>> for newer families with multiple sockets.
>>>>
>>>>
>>>> Where is the incorrect reporting?  In ACPI tables?  Is this patch a
>>>> way to cover up firmware defects in the ACPI description?  Or is this
>>>> for machines without ACPI (it seems unlikely that machines with new
>>>> AMD processors would not have ACPI)?
>>>
>>>
>>> This is incorrectly reported in the sysfs for each PCI device (e.g.
>>> /devices/pci0000:50/0000:50:00.2/numa_node). Without the patch, they
>>> return -1.
>>>
>>> In file arch/x86/pci/acpi.c, the function pci_acpi_scan_root()
>>> queries the node information as follows:
>>>
>>> #ifdef CONFIG_ACPI_NUMA
>>>       pxm = acpi_get_pxm(device->handle);
>>>       if (pxm >= 0)
>>>           node = pxm_to_node(pxm);
>>>       if (node != -1)
>>>           set_mp_bus_to_node(busnum, node);
>>>       else
>>> #endif
>>>           node = get_mp_bus_to_node(busnum);
>>>
>>> In this case, I see that acpi_get_pxm() returns -1.  Therefore, it
>>> falls back to using the node information in mp_bus_to_node[].  So,
>>> without this patch, it would also return -1.
>>>
>>> Also, the spec mentions that _PXM is optional, so I am not sure if
>>> this is a firmware bug.
>>
>> I am not very familiar with the ACPI side of this.  However, after taking
>> a look at the code (in drivers/acpi/pci_root.c: acpi_pci_root_add()), I
>> believe it's trying to locate the _PXM method in the DSDT table, in which
>> I don't see any _PXM methods.
>
> This sure looks like a firmware bug.  True, _PXM is optional, but if
> the firmware doesn't provide it, nobody should be surprised that the
> OS thinks everything is in the same proximity domain.
>
> I would not endorse extending amd_bus.c for new CPUs.  That just
> covers up firmware problems like this, and if you ever run a different
> OS on the box, you'll trip over them again.  And I don't think a patch
> like this will even be a possibility for Windows.
>
> Bjorn
>

I understand, and I am trying to verify this with the BIOS engineers.
However, this is currently affecting family15h servers out in the field.
We can try to fix ACPI for newer generations of machines, but it won't
be practical to push this BIOS fix to all the BIOS vendors and system
vendors for older platforms, as they tend to.

What if I limit the extension to family15h only, which is mostly used
in our main server products, so that the change just reads the node
information from the hostbridge for that family? Would that be acceptable?
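
To make the fallback quoted above concrete, here is a minimal standalone
sketch of the lookup order. It is not kernel code: bus_to_node[],
init_bus_table(), set_bus_node(), get_bus_node() and resolve_node() are
hypothetical stand-ins for mp_bus_to_node[], set_mp_bus_to_node(),
get_mp_bus_to_node() and the relevant part of pci_acpi_scan_root(), and the
PXM-to-node mapping is assumed to be the identity for simplicity.

```c
#include <assert.h>

#define NO_NODE (-1)
#define MAX_BUS 256

/* Hypothetical stand-in for the kernel's mp_bus_to_node[] table. */
static int bus_to_node[MAX_BUS];

/* Mark every bus as having no known node, as on a system where
 * nothing has populated the table yet. */
static void init_bus_table(void)
{
    for (int i = 0; i < MAX_BUS; i++)
        bus_to_node[i] = NO_NODE;
}

/* Record a node for a bus (what host bridge probing code would do). */
static void set_bus_node(int busnum, int node)
{
    assert(busnum >= 0 && busnum < MAX_BUS);
    bus_to_node[busnum] = node;
}

static int get_bus_node(int busnum)
{
    assert(busnum >= 0 && busnum < MAX_BUS);
    return bus_to_node[busnum];
}

/*
 * pxm is the value firmware reported through _PXM, or -1 when the
 * method is absent. Returns the node chosen for the bus.
 */
static int resolve_node(int busnum, int pxm)
{
    int node = NO_NODE;

    if (pxm >= 0)
        node = pxm;                    /* assumption: identity mapping */
    if (node != NO_NODE)
        set_bus_node(busnum, node);    /* firmware answer wins and is cached */
    else
        node = get_bus_node(busnum);   /* fall back to the table */
    return node;
}
```

With an empty table and no _PXM, resolve_node(0x50, -1) yields -1, which
matches the numa_node value seen in sysfs. Once something fills in the table
entry for bus 0x50, the same call returns that node instead.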

Suravee
