Date:   Mon, 12 Nov 2018 22:15:46 +0000
From:   "Elliott, Robert (Persistent Memory)" <elliott@....com>
To:     Daniel Jordan <daniel.m.jordan@...cle.com>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "aarcange@...hat.com" <aarcange@...hat.com>,
        "aaron.lu@...el.com" <aaron.lu@...el.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "alex.williamson@...hat.com" <alex.williamson@...hat.com>,
        "bsd@...hat.com" <bsd@...hat.com>,
        "darrick.wong@...cle.com" <darrick.wong@...cle.com>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "jgg@...lanox.com" <jgg@...lanox.com>,
        "jwadams@...gle.com" <jwadams@...gle.com>,
        "jiangshanlai@...il.com" <jiangshanlai@...il.com>,
        "mhocko@...nel.org" <mhocko@...nel.org>,
        "mike.kravetz@...cle.com" <mike.kravetz@...cle.com>,
        "Pavel.Tatashin@...rosoft.com" <Pavel.Tatashin@...rosoft.com>,
        "prasad.singamsetty@...cle.com" <prasad.singamsetty@...cle.com>,
        "rdunlap@...radead.org" <rdunlap@...radead.org>,
        "steven.sistare@...cle.com" <steven.sistare@...cle.com>,
        "tim.c.chen@...el.com" <tim.c.chen@...el.com>,
        "tj@...nel.org" <tj@...nel.org>, "vbabka@...e.cz" <vbabka@...e.cz>
Subject: RE: [RFC PATCH v4 11/13] mm: parallelize deferred struct page
 initialization within each node



> -----Original Message-----
> From: Daniel Jordan <daniel.m.jordan@...cle.com>
> Sent: Monday, November 12, 2018 11:54 AM
> To: Elliott, Robert (Persistent Memory) <elliott@....com>
> Cc: Daniel Jordan <daniel.m.jordan@...cle.com>; linux-mm@...ck.org;
> kvm@...r.kernel.org; linux-kernel@...r.kernel.org; aarcange@...hat.com;
> aaron.lu@...el.com; akpm@...ux-foundation.org; alex.williamson@...hat.com;
> bsd@...hat.com; darrick.wong@...cle.com; dave.hansen@...ux.intel.com;
> jgg@...lanox.com; jwadams@...gle.com; jiangshanlai@...il.com;
> mhocko@...nel.org; mike.kravetz@...cle.com; Pavel.Tatashin@...rosoft.com;
> prasad.singamsetty@...cle.com; rdunlap@...radead.org;
> steven.sistare@...cle.com; tim.c.chen@...el.com; tj@...nel.org;
> vbabka@...e.cz
> Subject: Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page
> initialization within each node
> 
> On Sat, Nov 10, 2018 at 03:48:14AM +0000, Elliott, Robert (Persistent
> Memory) wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@...r.kernel.org <linux-kernel-
> > > owner@...r.kernel.org> On Behalf Of Daniel Jordan
> > > Sent: Monday, November 05, 2018 10:56 AM
> > > Subject: [RFC PATCH v4 11/13] mm: parallelize deferred struct page
> > > initialization within each node
> > >
...
> > > In testing, a reasonable value turned out to be about a quarter of the
> > > CPUs on the node.
> > ...
> > > +	/*
> > > +	 * We'd like to know the memory bandwidth of the chip to calculate the
> > > +	 * most efficient number of threads to start, but we can't.
> > > +	 * In testing, a good value for a variety of systems was a quarter of the CPUs on the node.
> > > +	 */
> > > +	nr_node_cpus = DIV_ROUND_UP(cpumask_weight(cpumask), 4);
> >
> >
> > You might want to base that calculation on and limit the threads to
> > physical cores, not hyperthreaded cores.
> 
> Why?  Hyperthreads can be beneficial when waiting on memory.  That said, I
> don't have data that shows that in this case.

I think that's only if there are some register-based calculations to do while
waiting. If both threads are just doing memory accesses, they'll both stall, and
there doesn't seem to be any benefit in having two contexts generate the IOs
rather than one (at least on the systems I've used). I think it takes longer
to switch contexts than to just turn around the next IO.
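
To make the difference between the two sizing approaches concrete, here is a
minimal userspace sketch, not kernel code and not part of the patch. It
assumes a uniform number of hardware threads per core across the node; the
function names and the threads_per_core parameter are hypothetical and exist
only for illustration. DIV_ROUND_UP is redefined locally to mirror the kernel
macro used in the quoted hunk.

```c
/* Round-up integer division, mirroring the kernel's DIV_ROUND_UP macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Heuristic from the quoted hunk: a quarter of the logical CPUs on the
 * node, rounded up.
 */
static unsigned int threads_from_logical(unsigned int logical_cpus)
{
	return DIV_ROUND_UP(logical_cpus, 4);
}

/*
 * Hypothetical alternative: count each physical core once, then take a
 * quarter of that. threads_per_core is an assumed, uniform SMT width
 * (for example 2 with hyperthreading enabled); 0 is treated as 1.
 */
static unsigned int threads_from_cores(unsigned int logical_cpus,
				       unsigned int threads_per_core)
{
	unsigned int cores;

	if (threads_per_core == 0)
		threads_per_core = 1;

	cores = DIV_ROUND_UP(logical_cpus, threads_per_core);
	return DIV_ROUND_UP(cores, 4);
}
```

On a node with 48 logical CPUs and two hardware threads per core, the first
function yields 12 threads and the second yields 6. Which of the two is
faster for page initialization would still need to be measured.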


---
Robert Elliott, HPE Persistent Memory


