lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200923162510.GB1684790@yaz-nikka.amd.com>
Date:   Wed, 23 Sep 2020 11:25:10 -0500
From:   Yazen Ghannam <yazen.ghannam@....com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     linux-edac@...r.kernel.org, linux-kernel@...r.kernel.org,
        tony.luck@...el.com, x86@...nel.org,
        Smita.KoralahalliChannabasappa@....com
Subject: Re: [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes
 during address translation

On Wed, Sep 23, 2020 at 10:20:39AM +0200, Borislav Petkov wrote:
> On Thu, Sep 03, 2020 at 08:01:44PM +0000, Yazen Ghannam wrote:
> > From: Muralidhara M K <muralidhara.mk@....com>
> > 
> > Add support for new memory interleaving modes used in current AMD systems.
> >
> > Check if the system is using a current Data Fabric version or a legacy
> > version as some bit and register definitions have changed.
> > 
> > Tested on AMD reference platforms with the following memory interleaving
> > options.
> > 
> > Naples
> > - None
> > - Channel
> > - Die
> > - Socket
> > 
> > Rome (NPS = Nodes per Socket)
> > - None
> > - NPS0
> > - NPS1
> > - NPS2
> > - NPS4
> > 
> > The fixes tag refers to the commit that allows amd64_edac_mod to load on
> > Rome systems.
> 
> Err, why? This is adding new stuff to an address translation function.
> How does that fix amd64_edac loading on Rome?
> 
> > The module may report an incorrect system addresses on
> > Rome systems depending on the interleaving option used.
> 
> That doesn't stop it from loading, sorry.
>

Okay, no problem.

> Now, before you guys do any new features, I'd like you to split this
> humongous function umc_normaddr_to_sysaddr() logically into separate
> helpers and each helper does exactly one thing and one thing only.
> 
> Then use a verb in its name: umc_translate_normaddr_to_sysaddr() or so.
>

Okay, will do.

> Also, Yazen, remind me again pls why isn't this function in
> drivers/edac/amd64_edac.c, where it is needed?
> 
> If the reason is not valid anymore, let's move it there before splitting
> so that it doesn't bloat the core code.
>

I don't remember the original reason, and I was recently asked about
this code living in a module. I did some looking after this ask, and I
found that we should be using this translation to get a proper value for
the memory error notifiers to use. So I think we still need to use this
function some way with the core code even if the EDAC interface isn't
used.

I think this set can be split up.

1) Set with patches 1-3 fixed up to use cpu_die_id.
2) Set with the address translation updates.
   a) Move umc_normaddr_to_sysaddr() into a new module under EDAC.
   b) Hook the new module into amd64_edac.c where it's used today.
   c) Refactor the code as you suggested above.
   d) Add the new features.
3) New set that sets up a proper notifier for the address translation.
   a) Unhook the new module from amd64_edac.c.
   b) Register a notifer that runs before any notifiers that operate on
      memory errors.
   c) Find a way to pass the translated address through the chain
      without losing the original value.

What do you think?

Thanks,
Yazen

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ