Message-ID: <7177f59b-dc05-daff-7dc6-5815b539a790@intel.com>
Date:   Thu, 27 May 2021 12:36:21 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Hillf Danton <hdanton@...a.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Michal Hocko <mhocko@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linux-MM <linux-mm@...ck.org>, "Tang, Feng" <feng.tang@...el.com>
Subject: Re: [PATCH 0/6 v2] Calculate pcp->high based on zone sizes and active
 CPUs

Hi Mel,

Feng Tang ran these on a "Cascade Lake" system with 96 threads, ~512G
of persistent memory, and 128G of DRAM.  The PMEM is in "volatile use"
mode and is being managed by the buddy allocator just like normal RAM.

The PMEM zones are big ones:

        present  65011712 = 248 G
        high       134595 = 525 M
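
Those page counts convert like so (assuming 4KB base pages):

	65011712 pages * 4KB = 260046848KB ~= 248G
	  134595 pages * 4KB =    538380KB ~= 525M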

The PMEM nodes, of course, don't have any CPUs in them.

With your series, pcp->high works out to 69584 pages, or about 270MB
per CPU.  Scaled up across the 96 CPU threads, that's ~26GB of
worst-case memory sitting in the pcps for the zone, or roughly 10% of
the zone's size.
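
Spelling out that arithmetic (4KB pages again):

	69584 pages * 4KB   ~= 270MB per CPU
	270MB * 96 threads  ~= 26GB across all pcps
	26GB / 248GB        ~= 10% of the zone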

I did see quite a few pcp->counts above 60,000, so the pcps really do
fill up in practice.  That didn't cause any actual problems that I
observed, but it's still a bit worrisome.

Maybe instead of pretending that CPU-less nodes have one CPU:

	nr_local_cpus = max(1U,
			    cpumask_weight(cpumask_of_node(zone_to_nid(zone))));

we should just consider them to have *all* the CPUs in the system.  Perhaps:

	nr_local_cpus = cpumask_weight(cpumask_of_node(zone_to_nid(zone)));
	if (!nr_local_cpus)
		nr_local_cpus = num_online_cpus();

Even if a system has a silly number of CPUs, the 'high' sizes will still
hit a floor at 4*batch:

	high = max(high, batch << 2);

That doesn't seem too bad, especially considering that CPU-less nodes
naturally see less contention.
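
Putting the two pieces together, the calculation would look roughly
like this (just a sketch; zone_pcp_total_pages() is a made-up
placeholder for however the series actually derives the zone's pcp
budget):

	/*
	 * Sketch: split the zone's pcp budget across the CPUs that will
	 * actually allocate from it.  CPU-less (e.g. PMEM) nodes fall
	 * back to every online CPU instead of pretending they have one
	 * local CPU, and the existing 4*batch floor still applies.
	 */
	nr_local_cpus = cpumask_weight(cpumask_of_node(zone_to_nid(zone)));
	if (!nr_local_cpus)
		nr_local_cpus = num_online_cpus();

	high = zone_pcp_total_pages(zone) / nr_local_cpus;
	high = max(high, batch << 2);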
