Message-ID: <CAPcyv4gNrFOQJhKUV7crZqNfg8LQFZRVO04Z+Fo50kzswVQ=TA@mail.gmail.com>
Date:   Mon, 25 Mar 2019 12:29:48 -0700
From:   Dan Williams <dan.j.williams@...el.com>
To:     Brice Goglin <Brice.Goglin@...ia.fr>
Cc:     Yang Shi <yang.shi@...ux.alibaba.com>,
        Michal Hocko <mhocko@...e.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Rik van Riel <riel@...riel.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Keith Busch <keith.busch@...el.com>,
        Fengguang Wu <fengguang.wu@...el.com>,
        "Du, Fan" <fan.du@...el.com>, "Huang, Ying" <ying.huang@...el.com>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 0/10] Another Approach to Use PMEM as NUMA Node

On Mon, Mar 25, 2019 at 10:45 AM Brice Goglin <Brice.Goglin@...ia.fr> wrote:
>
> Le 25/03/2019 à 17:56, Dan Williams a écrit :
> >
> > I'm generally against the concept that a "pmem" or "type" flag should
> > indicate anything about the expected performance of the address range.
> > The kernel should explicitly look to the HMAT for performance data and
> > not otherwise make type-based performance assumptions.
>
>
> Oh sorry, I didn't mean to have the kernel use such a flag to decide of
> placement, but rather to expose more information to userspace to clarify
> what all these nodes are about when userspace will decide where to
> allocate things.

I understand, but I'm concerned about the risk of userspace developing
vendor-specific, or generation-specific policies around a coarse type
identifier. I think the lack of type specificity is a feature rather
than a gap, because it requires userspace to consider deeper
information.

Perhaps "path" might be a suitable replacement identifier rather than
type. I.e. memory that originates from an ACPI.NFIT root device is
likely "pmem".

> I understand that current NVDIMM-F are not slower than DDR and HMAT
> would better describe this than a flag. But I have seen so many buggy or
> dummy SLIT tables in the past that I wonder if we can expect HMAT to be
> widely available (and correct).

There's always a fear that the platform BIOS will try to game OS
behavior. However, that was the reason that HMAT was defined to
indicate actual performance values rather than relative ones. It is
hopefully harder to game than the relative SLIT values, but I'll grant
you it's not impossible.

> Is there a safe fallback in case of missing or buggy HMAT? For instance,
> is DDR supposed to be listed before NVDIMM (or HBM) in SRAT?

One fallback might be to make some of these sysfs attributes writable
so userspace can correct the situation, but I'm otherwise unclear on
what you mean by "safe". If a platform has hard dependencies on
correctly enumerating memory performance capabilities then there's not
much the kernel can do if the HMAT is botched. I would expect the
general case is that the performance capabilities are a soft
dependency, so things still work if the data is wrong.
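
As a rough illustration of the "soft dependency" case from the userspace
side, here is a minimal sketch in Python. It assumes the per-node layout
nodeN/access0/initiators/{read,write}_{latency,bandwidth} under
/sys/devices/system/node, which is an assumption about how the HMAT data
ends up exposed and not something settled in this thread. The helper
names read_node_perf() and rank_nodes() are hypothetical. A missing or
unparsable attribute is treated as "unknown" rather than as an error.

```python
import os

# Assumed attribute names; adjust if the sysfs interface differs.
ATTRS = ("read_latency", "write_latency", "read_bandwidth", "write_bandwidth")


def read_node_perf(node, root="/sys/devices/system/node", access="access0"):
    """Return {attr: int or None} for one node; None means unknown."""
    base = os.path.join(root, f"node{node}", access, "initiators")
    perf = {}
    for name in ATTRS:
        try:
            with open(os.path.join(base, name)) as f:
                perf[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Missing or malformed data: keep going without it.
            perf[name] = None
    return perf


def rank_nodes(nodes, root="/sys/devices/system/node"):
    """Order nodes by read latency, with nodes lacking data placed last."""
    def key(node):
        latency = read_node_perf(node, root)["read_latency"]
        return (latency is None, latency if latency is not None else 0)
    return sorted(nodes, key=key)
```

With this shape, a policy built on rank_nodes() degrades to "no
preference" for nodes whose data is absent, instead of guessing from a
type flag.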
