Message-ID: <20190327193918.GP11927@dhcp22.suse.cz>
Date:   Wed, 27 Mar 2019 21:09:54 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Yang Shi <yang.shi@...ux.alibaba.com>
Cc:     Dan Williams <dan.j.williams@...el.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Rik van Riel <riel@...riel.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dave Hansen <dave.hansen@...el.com>,
        Keith Busch <keith.busch@...el.com>,
        Fengguang Wu <fengguang.wu@...el.com>,
        "Du, Fan" <fan.du@...el.com>, "Huang, Ying" <ying.huang@...el.com>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 0/10] Another Approach to Use PMEM as NUMA Node

On Wed 27-03-19 11:59:28, Yang Shi wrote:
> 
> 
> On 3/27/19 10:34 AM, Dan Williams wrote:
> > On Wed, Mar 27, 2019 at 2:01 AM Michal Hocko <mhocko@...nel.org> wrote:
> > > On Tue 26-03-19 19:58:56, Yang Shi wrote:
[...]
> > > > It is still NUMA, users still can see all the NUMA nodes.
> > > No, the Linux NUMA implementation makes all NUMA nodes available by
> > > default and provides an API to opt in for finer tuning. What you are
> > > suggesting goes against those semantics and I am asking why. How is a
> > > pmem NUMA node any different from any other distant node in principle?
> > Agree. It's just another NUMA node and shouldn't be special cased.
> > Userspace policy can choose to avoid it, but typical node distance
> > preference should otherwise let the kernel fall back to it as
> > additional memory pressure relief for "near" memory.
> 
> In the ideal case, yes, I agree. However, in the real world performance is
> a concern. It is well known that PMEM (not considering NVDIMM-F or HBM) has
> higher latency and lower bandwidth. We observed much higher latency on PMEM
> than on DRAM with multiple threads.

One rule of thumb is: Do not design user visible interfaces based on
contemporary technology and its upsides and downsides. This will almost
always backfire.

Btw. you keep arguing about performance without any numbers. Can you
present something specific?

> In a real production environment we don't know what kind of applications
> would end up on PMEM (DRAM may be full, so allocations fall back to PMEM)
> and then suffer unexpected performance degradation. I understand that a
> mempolicy can be used to avoid it. But there might be hundreds or thousands
> of applications running on the machine, so it does not sound feasible to me
> to have every single application set a mempolicy to avoid it.

We have the cpuset cgroup controller to help here.
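
To illustrate the cpuset suggestion, here is a minimal sketch and not
anything taken from the patch set. It assumes a cpuset cgroup directory
already exists and that the DRAM nodes are numbered 0-1; the directory
path, the node numbering and the helper names are all hypothetical. It
relies only on the cpuset.mems and cgroup.procs control files.

```python
import os


def format_node_list(nodes):
    """Render node IDs in the kernel list format, e.g. [0, 1, 3] -> "0-1,3"."""
    ranges = []
    for node in sorted(set(nodes)):
        if ranges and node == ranges[-1][1] + 1:
            ranges[-1][1] = node          # extend the current contiguous range
        else:
            ranges.append([node, node])   # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def restrict_to_nodes(cgroup_dir, nodes, pids=()):
    """Limit a cpuset cgroup to the given memory nodes, then move PIDs into it.

    cgroup_dir is a hypothetical, pre-created cpuset cgroup directory.
    """
    with open(os.path.join(cgroup_dir, "cpuset.mems"), "w") as f:
        f.write(format_node_list(nodes))
    for pid in pids:
        # One PID per write, as the cgroup interface expects.
        with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
            f.write(str(pid))
```

Something like restrict_to_nodes("/sys/fs/cgroup/dram-only", [0, 1], pids)
would then keep a whole group of tasks off the pmem nodes without touching
any of the applications. On cgroup v1, cpuset.cpus has to be populated as
well before tasks can be attached.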

> So, I think we still need a default allocation node mask. The default value
> may include all nodes or just DRAM nodes. But it should be possible for the
> user to override it globally, not only on a per-process basis.
> 
> Due to the performance disparity, our use cases currently treat PMEM as
> second-tier memory for demoting cold pages or for binding applications that
> are not memory access sensitive (this is the reason for inventing a new
> mempolicy), although it is a NUMA node.

If the performance sucks that badly then do not use pmem as a NUMA node,
really. There are certainly other ways to export the pmem storage. Use
it as fast swap storage. Or try to work on a swap caching mechanism
that still allows much faster access than slow swap storage. But do
not abuse the NUMA interface while breaking some of its long-established
semantics.
-- 
Michal Hocko
SUSE Labs
