lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 17 Mar 2014 11:18:37 -0600
From:	Bjorn Helgaas <bhelgaas@...gle.com>
To:	Suravee Suthikulpanit <suravee.suthikulpanit@....com>
Cc:	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Aravind Gopalakrishnan <aravind.gopalakrishnan@....com>,
	kim.naru@....com,
	"linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
	Yinghai Lu <yinghai@...nel.org>,
	"Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: [PATCH 0/3] amd/pci: Add AMD hostbridge supports for newer AMD systems

[+cc Rafael]

On Thu, Mar 13, 2014 at 8:06 PM, Suravee Suthikulpanit
<suravee.suthikulpanit@....com> wrote:
> On 3/12/2014 4:13 PM, Bjorn Helgaas wrote:
>>>
>>> I assume the system is fully functional even without these patches,
>>> >right?  The only effect of these changes should be a performance
>>> >improvement.
>
> [Suravee] Yes, the system is fully functional except the numa information
> for PCI ethernet adapters is returning -1.

This would normally be determined by the _PXM of the PCI host bridge
leading to the NIC.  The DSDT you attached to the bug report contains
no _PXM methods at all, so we have no idea what the NUMA information
should be.

>>> >So the choices are:
>>> >
>>> >   1) Change the BIOS to provide _PXM
>>> >   2) Change Linux with your patches
>>> >
>>> >Either way, the customer has to upgrade something.  Choice 1) gets you
>>> >the performance improvement on all Linux and Windows releases, even
>>> >the ones that are already in the field.
>
> [Suravee] I agree with you on both cases and I am working with the BIOS team
> to see if they can look into this and provide fixes for the future.  My only
> concern is the device _PXM information might be specific to each platform
> vendors since they are the one deciding the number of northbridge to be put
> on the systems. Also, it's quite difficult to influence HW/BIOS vendor to
> provide fixes for older platforms unless some important customers are
> complaining to them.

Yes, you're right that _PXM describes platform topology and can't be
implemented by the CPU vendor because it doesn't know the topology of
the system where the CPUs will end up.

>>> >Choice 2) fixes future Linux
>>> >distros and whatever old releases you can convince the distro to
>>> >backport to, and doesn't help Windows at all.  It makes work for all
>>> >the distros and we (i.e., I) have to worry about maintaining this in
>>> >the future.
>
> [Suravee] Actually, this was originally reported by Redhat (please see
> https://bugzilla.redhat.com/show_bug.cgi?id=1040440), and they are waiting
> for the fixes if we decide to fix it in the upstream Linux code base.

This RH bugzilla is private, so I can't see it.

> [Suravee] I am not sure which table is the _PXM method supposed to be apart
> of.

It would be in the DSDT (or an SSDT).  In this case, the PCI host
bridges (the PNP0A08 devices) are described in the DSDT you posted, so
_PXM would normally be there, too.

>> Can you collect an acpidump and complete dmesg log from a system with
>> the problem?
>
> [Suravee] Please see https://bugzilla.kernel.org/show_bug.cgi?id=72051
> I have included SRAT, SLIT and DSDT table as part of the bug. Please let me
> know if you are looking for more information.

One of my objections to your patches is that if we include them, it
will reinforce the perception that the BIOS is doing things correctly,
so there will be less incentive to actually fix the problem by
following the spec.  This has happened several times in the past,
e.g., quirk_amd_nb_node() and amd_bus.c itself.

Another is more of a question: as coded, your patch assumes *all* AMD
host bridge devices will have these registers at those locations
forever.  Are those registers documented as being architected?

Finally, your patch extracts a 3-bit number from a CPU register and
uses it as the node number.  The node numbers used by Linux are
logical and there's no reason they need to be identical to settings in
the CPU registers.  So if we got some node information in the normal
way (from _PXM, SLIT, SRAT, etc.) and some from your patch, there's no
reason to believe they would be compatible.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists