Message-ID: <525a20be-dea9-ed54-ca8e-8c4bc5e8a04f@intel.com>
Date:   Wed, 24 Jan 2018 11:23:49 -0800
From:   Dave Hansen <dave.hansen@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Aaron Lu <aaron.lu@...el.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Huang Ying <ying.huang@...el.com>,
        Kemi Wang <kemi.wang@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH 2/2] free_pcppages_bulk: prefetch buddy while not holding
 lock

On 01/24/2018 10:19 AM, Mel Gorman wrote:
>> IOW, I don't think this has the same downsides normally associated with
>> prefetch() since the data is always used.
> That doesn't side-step the fact that the calculations are done twice in
> the free_pcppages_bulk path, and there is no guarantee that one prefetch
> in the list of pages being freed will not evict a previous prefetch due
> to collisions.

Fair enough.  The description here could probably use some touchups to
explicitly spell out the downsides.
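
Just to make sure we are arguing about the same shape of change, here is
a minimal userspace sketch of the idea (not the actual mm/page_alloc.c
code; 'struct page', 'vmemmap' and the lock below are stand-ins, and only
order-0 is shown): the buddy PFN is computed and prefetched once while
the lock is not held, and the same XOR is redone later under the lock,
which is the duplicated calculation you are pointing at:

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

struct page { uint64_t flags; void *buddy; };   /* stand-in, not the kernel's */

static struct page *vmemmap;                    /* fake page array */
static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;

static inline uint64_t buddy_pfn(uint64_t pfn, unsigned int order)
{
        return pfn ^ (1UL << order);            /* the usual buddy XOR */
}

static void free_pcp_list(const uint64_t *pfns, int count)
{
        int i;

        /* Pass 1: lock not held -- prefetch each buddy's struct page. */
        for (i = 0; i < count; i++)
                __builtin_prefetch(&vmemmap[buddy_pfn(pfns[i], 0)]);

        /* Pass 2: under the lock -- the same buddy PFN is computed again. */
        pthread_mutex_lock(&zone_lock);
        for (i = 0; i < count; i++) {
                struct page *buddy = &vmemmap[buddy_pfn(pfns[i], 0)];
                buddy->flags++;                 /* placeholder for the merge check */
        }
        pthread_mutex_unlock(&zone_lock);
}

int main(void)
{
        enum { NPAGES = 1 << 16, BATCH = 31 };  /* 31 is a common pcp->batch */
        uint64_t pfns[BATCH];
        int i;

        vmemmap = calloc(NPAGES, sizeof(*vmemmap));
        if (!vmemmap)
                return 1;
        for (i = 0; i < BATCH; i++)
                pfns[i] = (uint64_t)(rand() % (NPAGES / 2));
        free_pcp_list(pfns, BATCH);
        free(vmemmap);
        return 0;
}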

I do agree with you that there is no guarantee that this will be
resident in the cache before use.  In fact, it might be entertaining to
see if we can show the extra L1 conflicts from this change given a large
enough PCP batch size.
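
A userspace analogue of that experiment (a rough sketch, not kernel
instrumentation; the slot size, pool size and batch count are arbitrary)
would be to prefetch a scattered batch of cache-line-sized slots, read
them back, and count L1D read misses with perf_event_open() -- if later
prefetches evict earlier ones, the miss count should climb as the batch
grows:

#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

#define SLOT    64                      /* one cache line per fake struct page */
#define NSLOTS  (1 << 16)               /* 4 MB pool to scatter the batch over */
#define BATCH   128                     /* "large enough" batch, arbitrary */

int main(void)
{
        struct perf_event_attr attr;
        size_t idx[BATCH];
        uint64_t misses = 0;
        volatile char sink = 0;
        char *pages;
        int fd, i;

        /* Count this thread's userspace L1D read misses. */
        memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HW_CACHE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_CACHE_L1D |
                      (PERF_COUNT_HW_CACHE_OP_READ << 8) |
                      (PERF_COUNT_HW_CACHE_RESULT_MISS << 16);
        attr.disabled = 1;
        attr.exclude_kernel = 1;
        fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
        if (fd < 0)
                return 1;

        pages = aligned_alloc(4096, (size_t)NSLOTS * SLOT);
        if (!pages)
                return 1;
        memset(pages, 0, (size_t)NSLOTS * SLOT);        /* fault everything in */

        for (i = 0; i < BATCH; i++)
                idx[i] = ((size_t)rand() % NSLOTS) * SLOT;

        /* Prefetch the scattered batch, as the patch does for the buddies. */
        for (i = 0; i < BATCH; i++)
                __builtin_prefetch(pages + idx[i]);

        /* Read it back and see how many of the prefetches were evicted. */
        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
        for (i = 0; i < BATCH; i++)
                sink += pages[idx[i]];
        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        read(fd, &misses, sizeof(misses));
        printf("batch %d: %llu L1D read misses\n", BATCH,
               (unsigned long long)misses);
        free(pages);
        close(fd);
        return 0;
}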

But, this isn't just about the L1.  If the results of the prefetch()
stay in *ANY* cache, then the memory bandwidth impact of this change is
still zero.  You'll have a lot harder time arguing that we're likely to
see L2/L3 evictions in this path for our typical PCP batch sizes.

Do you want to see some analysis for less-frequent PCP frees?  We could
pretty easily instrument the latency while doing normal-ish things to see
if we can measure a benefit from this, rather than with a tight-loop
microbenchmark.
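
For instance (a rough userspace sketch, not a proper harness; the mapping
size and iteration count are arbitrary), timing the teardown of a
faulted-in anonymous mapping exercises the page free path under somewhat
normal conditions rather than in a tight loop around one function:

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>

#define MAP_SIZE        (256UL << 20)   /* 256 MB, arbitrary */
#define ITERS           20

static double now_ns(void)
{
        struct timespec ts;

        clock_gettime(CLOCK_MONOTONIC, &ts);
        return ts.tv_sec * 1e9 + ts.tv_nsec;
}

int main(void)
{
        int i;

        for (i = 0; i < ITERS; i++) {
                char *p = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                double t0;

                if (p == MAP_FAILED)
                        return 1;
                memset(p, 0x5a, MAP_SIZE);      /* fault the pages in */

                t0 = now_ns();
                munmap(p, MAP_SIZE);            /* frees pages back through the pcp lists */
                printf("unmap %d: %.0f ns\n", i, now_ns() - t0);
        }
        return 0;
}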
