Message-Id: <20200307151542.b14131037dc44a8edcb22cad@linux-foundation.org>
Date:   Sat, 7 Mar 2020 15:15:42 -0800
From:   Andrew Morton <akpm@...ux-foundation.org>
To:     mateusznosek0@...il.com
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mm/page_alloc.c: Micro-optimisation Remove
 unnecessary branch

On Sat,  7 Mar 2020 23:53:35 +0100 mateusznosek0@...il.com wrote:

> From: Mateusz Nosek <mateusznosek0@...il.com>
> 
> Previously if branch condition was false, the assignment was not executed.
> The assignment can be safely executed even when the condition is false and
> it is not incorrect as it assigns the value of 'nodemask' to 'ac.nodemask'
> which already has the same value.
> 
> So as the assignment can be executed unconditionally, the branch can be
> removed.
> 
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4819,8 +4819,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
>  	 * Restore the original nodemask if it was potentially replaced with
>  	 * &cpuset_current_mems_allowed to optimize the fast-path attempt.
>  	 */
> -	if (unlikely(ac.nodemask != nodemask))
> -		ac.nodemask = nodemask;
> +	ac.nodemask = nodemask;
>  

This will now unconditionally dirty the cacheline holding ac.nodemask,
which means that cacheline will need to be written back.  If the write
is truly unlikely to be needed, then the thinking goes that the
test-and-branch is worthwhile because it saves on memory traffic.

At least, I assume that's why the code is the way it is.

I don't know whether this optimisation is valid on a majority of modern
platforms.  But that's the thinking!
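
To make the trade-off concrete, here is a minimal userspace sketch of the
two forms. It is not the kernel code: struct alloc_ctx and both function
names are hypothetical stand-ins, and it only shows that the two forms
leave the same value behind while differing in whether a store is issued.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the allocation context; not the kernel type. */
struct alloc_ctx {
	const void *nodemask;
};

/*
 * Original form: load and compare first, store only when the value
 * differs.  When the values already match, the cacheline is read but
 * never dirtied.  Returns true if a store was performed.
 */
bool restore_if_changed(struct alloc_ctx *ac, const void *nodemask)
{
	if (ac->nodemask != nodemask) {
		ac->nodemask = nodemask;
		return true;
	}
	return false;
}

/*
 * Patched form: store unconditionally.  No branch, but the cacheline
 * is written (and so dirtied) even when the value does not change.
 */
void restore_always(struct alloc_ctx *ac, const void *nodemask)
{
	ac->nodemask = nodemask;
}
```

Either function leaves ac->nodemask equal to nodemask; which one is cheaper
depends on how often the values differ and on how the platform handles a
store of an unchanged value, so it would have to be measured.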
