Message-ID: <dcc7c537-0370-d190-9ca7-ce60fa29d68c@amd.com>
Date: Mon, 11 Oct 2021 19:56:34 +0530
From: "Chatradhi, Naveen Krishna" <nchatrad@....com>
To: Borislav Petkov <bp@...en8.de>
Cc: linux-edac@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, mchehab@...nel.org,
Yazen.Ghannam@....com, Muralidhara M K <muralimk@....com>
Subject: Re: [PATCH v3 1/3] x86/amd_nb: Add support for northbridges on
Aldebaran
Hi Boris,
Apologies for the late reply. We were reworking the code based on the
conversation between you and Yazen. I would like to answer some of your
questions before submitting the next version of the patchset.
On 8/25/2021 4:12 PM, Borislav Petkov wrote:
>
> On Tue, Aug 24, 2021 at 12:24:35AM +0530, Naveen Krishna Chatradhi wrote:
>> From: Muralidhara M K <muralimk@....com>
>>
>> On newer systems the CPUs manage MCA errors reported from the GPUs.
>> Enumerate the GPU nodes with the AMD NB framework to support EDAC.
>>
>> This patch adds necessary code to manage the Aldebaran nodes along with
> Avoid having "This patch" or "This commit" in the commit message. It is
> tautologically useless.
>
> Also, do
>
> $ git grep 'This patch' Documentation/process
>
> for more details.
Sure, thanks.
>
> Also, what are the "Aldebaran nodes"?
>
> Something on a star which is 65 light years away?
Aldebaran is an AMD GPU codename; support for it was submitted to the
DRM framework in "[PATCH 000/159] Aldebaran support":
<https://lists.freedesktop.org/archives/amd-gfx/2021-February/059694.html>
>
>> the CPU nodes.
>>
>> The GPU nodes are enumerated in sequential order based on the
>> PCI hierarchy, and the first GPU node is assumed to have an "AMD Node
>> ID" value of 8 (the second GPU node has 9, etc.).
> What does that mean? The GPU nodes are simply numerically after the CPU
> nodes or how am I to understand this nomenclature?
>
>> Each Aldebaran GPU
>> package has 2 Data Fabrics, which are enumerated as 2 nodes.
>> With this implementation detail, the Data Fabric on the GPU nodes can be
>> accessed the same way as the Data Fabric on CPU nodes.
>>
>> Signed-off-by: Muralidhara M K <muralimk@....com>
>> Signed-off-by: Naveen Krishna Chatradhi <nchatrad@....com>
>> Reviewed-by: Yazen Ghannam <yazen.ghannam@....com>
>> ---
>> Changes since v2: Added Reviewed-by Yazen Ghannam
>>
>> arch/x86/include/asm/amd_nb.h | 10 ++++++
>> arch/x86/kernel/amd_nb.c | 63 ++++++++++++++++++++++++++++++++---
>> include/linux/pci_ids.h | 1 +
>> 3 files changed, 69 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
>> index 455066a06f60..09905f6c7218 100644
>> --- a/arch/x86/include/asm/amd_nb.h
>> +++ b/arch/x86/include/asm/amd_nb.h
>> @@ -80,6 +80,16 @@ struct amd_northbridge_info {
>>
>> #ifdef CONFIG_AMD_NB
>>
>> +/*
>> + * On newer heterogeneous systems the data fabrics of the CPUs and GPUs
>> + * are connected directly via a custom links, like is done with
> s/ a //
>
>> + * 2 socket CPU systems and also within a socket for Multi-chip Module
>> + * (MCM) CPUs like Naples.
>> + * The first GPU node(non cpu) is assumed to have an "AMD Node ID" value
> In all your text:
>
> s/cpu/CPU/g
>
>> + * of 8 (the second GPU node has 9, etc.).
>> + */
>> +#define NONCPU_NODE_INDEX 8
> Why is this assumed? Can it instead be read from the hardware somewhere?
> Or there simply won't be more than 8 CPU nodes anyway? Not at least in
> the near future?
>
> I'd prefer stuff to be read out directly from the hardware so that when
> the hardware changes, the code just works instead of doing assumptions
> which get invalidated later.
>
>> +
>> u16 amd_nb_num(void);
>> bool amd_nb_has_feature(unsigned int feature);
>> struct amd_northbridge *node_to_amd_nb(int node);
>> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
>> index 23dda362dc0f..6ad5664a18aa 100644
>> --- a/arch/x86/kernel/amd_nb.c
>> +++ b/arch/x86/kernel/amd_nb.c
>> @@ -26,6 +26,8 @@
>> #define PCI_DEVICE_ID_AMD_17H_M70H_DF_F4 0x1444
>> #define PCI_DEVICE_ID_AMD_19H_DF_F4 0x1654
>> #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
>> +#define PCI_DEVICE_ID_AMD_ALDEBARAN_ROOT 0x14bb
>> +#define PCI_DEVICE_ID_AMD_ALDEBARAN_DF_F4 0x14d4
> You see how those defines are aligned vertically, right?
>
> If your new defines don't fit, you realign them all vertically too - you
> don't just slap them there.
>
> And if it wasn't clear above, that Aldebaran GPU chip name means
> something only to AMD folks. If this is correct
>
> https://en.wikipedia.org/wiki/Video_Core_Next
>
> then that Aldebaran thing is VCN 2.6 although this is only the video
> encoding stuff and GPU I guess is more than that.
>
> IOW, what I'm trying to say is, just like we name the CPUs using their
> families, you should name the GPUs nomenclature with GPU families (I
> guess there's stuff like that) and not use the marketing crap.
The Aldebaran GPU might be a later variant of gfx9 and is connected to
the CPU sockets via custom xGMI links. I could not find any family
number associated with these GPUs. The DRM driver code defines

#define CHIP_ALDEBARAN 25

and does not expose the value to other frameworks in Linux; see
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm
> If you need an example, here's how we did it for the Intel marketing
> pile of bullsh*t:
>
> arch/x86/include/asm/intel-family.h
>
> Please provide a similar way of referring to the GPU chips.
>
>> /* Protect the PCI config register pairs used for SMN and DF indirect access. */
>> static DEFINE_MUTEX(smn_mutex);
>> @@ -94,6 +96,21 @@ static const struct pci_device_id hygon_nb_link_ids[] = {
>> {}
>> };
>>
>> +static const struct pci_device_id amd_noncpu_root_ids[] = {
> Why is that "noncpu" thing everywhere? Is this thing going to be
> anything else besides a GPU?
>
> If not, you can simply call it
>
> amd_gpu_root_ids
>
> to mean *exactly* what they are. PCI IDs on the GPU.
>
>> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_ALDEBARAN_ROOT) },
>> + {}
>> +};
>> +
>> +static const struct pci_device_id amd_noncpu_nb_misc_ids[] = {
>> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_ALDEBARAN_DF_F3) },
>> + {}
>> +};
>> +
>> +static const struct pci_device_id amd_noncpu_nb_link_ids[] = {
>> + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_ALDEBARAN_DF_F4) },
>> + {}
>> +};
>> +
>> const struct amd_nb_bus_dev_range amd_nb_bus_dev_ranges[] __initconst = {
>> { 0x00, 0x18, 0x20 },
>> { 0xff, 0x00, 0x20 },
>> @@ -230,11 +247,16 @@ int amd_cache_northbridges(void)
>> const struct pci_device_id *misc_ids = amd_nb_misc_ids;
>> const struct pci_device_id *link_ids = amd_nb_link_ids;
>> const struct pci_device_id *root_ids = amd_root_ids;
>> +
>> + const struct pci_device_id *noncpu_misc_ids = amd_noncpu_nb_misc_ids;
>> + const struct pci_device_id *noncpu_link_ids = amd_noncpu_nb_link_ids;
>> + const struct pci_device_id *noncpu_root_ids = amd_noncpu_root_ids;
>> +
>> struct pci_dev *root, *misc, *link;
>> struct amd_northbridge *nb;
>> u16 roots_per_misc = 0;
>> - u16 misc_count = 0;
>> - u16 root_count = 0;
>> + u16 misc_count = 0, misc_count_noncpu = 0;
>> + u16 root_count = 0, root_count_noncpu = 0;
>> u16 i, j;
>>
>> if (amd_northbridges.num)
>> @@ -253,10 +275,16 @@ int amd_cache_northbridges(void)
>> if (!misc_count)
>> return -ENODEV;
>>
>> + while ((misc = next_northbridge(misc, noncpu_misc_ids)) != NULL)
>> + misc_count_noncpu++;
>> +
>> root = NULL;
>> while ((root = next_northbridge(root, root_ids)) != NULL)
>> root_count++;
>>
>> + while ((root = next_northbridge(root, noncpu_root_ids)) != NULL)
>> + root_count_noncpu++;
>> +
>> if (root_count) {
>> roots_per_misc = root_count / misc_count;
>>
>> @@ -270,15 +298,28 @@ int amd_cache_northbridges(void)
>> }
>> }
>>
>> - nb = kcalloc(misc_count, sizeof(struct amd_northbridge), GFP_KERNEL);
>> + if (misc_count_noncpu) {
>> + /*
>> + * The first non-CPU Node ID starts at 8 even if there are fewer
>> + * than 8 CPU nodes. To maintain the AMD Node ID to Linux amd_nb
>> + * indexing scheme, allocate the number of GPU nodes plus 8.
>> + * Some allocated amd_northbridge structures will go unused when
>> + * the number of CPU nodes is less than 8, but this tradeoff is to
>> + * keep things relatively simple.
> Why simple?
>
> What's wrong with having
>
> [node IDs][GPU node IDs]
>
> i.e., the usual nodes come first and the GPU ones after it.
>
> You enumerate everything properly here so you can control what goes
> where. Which means, you don't need this NONCPU_NODE_INDEX non-sense at
> all.
>
> Hmmm?
>
>> + */
>> + amd_northbridges.num = NONCPU_NODE_INDEX + misc_count_noncpu;
>> + } else {
>> + amd_northbridges.num = misc_count;
>> + }
>> +
>> + nb = kcalloc(amd_northbridges.num, sizeof(struct amd_northbridge), GFP_KERNEL);
>> if (!nb)
>> return -ENOMEM;
>>
>> amd_northbridges.nb = nb;
>> - amd_northbridges.num = misc_count;
>>
>> link = misc = root = NULL;
>> - for (i = 0; i < amd_northbridges.num; i++) {
>> + for (i = 0; i < misc_count; i++) {
>> node_to_amd_nb(i)->root = root =
>> next_northbridge(root, root_ids);
>> node_to_amd_nb(i)->misc = misc =
>> @@ -299,6 +340,18 @@ int amd_cache_northbridges(void)
>> root = next_northbridge(root, root_ids);
>> }
>>
>> + if (misc_count_noncpu) {
>> + link = misc = root = NULL;
>> + for (i = NONCPU_NODE_INDEX; i < NONCPU_NODE_INDEX + misc_count_noncpu; i++) {
> So this is not keeping things relatively simple - this is making you
> jump to the GPU nodes to prepare them too which is making them special.
>
>> + node_to_amd_nb(i)->root = root =
>> + next_northbridge(root, noncpu_root_ids);
>> + node_to_amd_nb(i)->misc = misc =
>> + next_northbridge(misc, noncpu_misc_ids);
>> + node_to_amd_nb(i)->link = link =
>> + next_northbridge(link, noncpu_link_ids);
> And seeing how you put those pointers in ->root, ->misc and ->link,
> you can just as well drop those noncpu_*_ids and put the aldebaran
> PCI IDs simply in amd_root_ids, amd_nb_misc_ids and amd_nb_link_ids
> respectively.
>
> Because to this code, the RAS functionality is no different than any
> other CPU because, well, the interface is those PCI devices. So the
> thing doesn't care if it is GPU or not.
>
> So you don't need any of that separation between GPU and CPU nodes when
> it comes to the RAS code.
The roots_per_misc count is different for the CPU nodes and the GPU
nodes. We tried to address your comment without introducing
pci_device_id arrays for the GPU root, misc and link devices, but
introducing separate GPU ID arrays turned out cleaner. Let me submit
the revised code and we can revisit this point.
>
> Makes sense?
>
> --
> Regards/Gruss,
> Boris.
Regards,
Naveen
>
> https://people.kernel.org/tglx/notes-about-netiquette