lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Wed, 4 Mar 2020 15:15:13 +0000
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     mateusznosek0@...il.com
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org
Subject: Re: [RFC PATCH] mm: Micro-optimisation: Save two branches on hot -
 page allocation path

On Wed, Mar 04, 2020 at 03:22:30PM +0100, mateusznosek0@...il.com wrote:
> From: Mateusz Nosek <mateusznosek0@...il.com>
> 
> This patch makes ALLOC_KSWAPD
> equal to __GFP_KSWAPD_RACLAIM (cast to 'int').
> 

s/RACLAIM/RECLAIM/

Very strange word wrapping

> Thanks to that code like:
> ```if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> 		alloc_flags |= ALLOC_KSWAPD;```
> can be changed to:
> ```alloc_flags |= (__force int) (gfp_mask &__GFP_KSWAPD_RECLAIM);```

Not sure what the multiple ``` are about. It does not appear to be an
encoding error but I think you can drop them. Maybe some mail readers
render this differently but it looks weird in plain text.

> Thanks to this one branch less is generated in the assembly.
> 
> In case of ALLOC_KSWAPD flag two branches are saved,
> first one in code that always executes in the beggining of page allocation

s/beggining/beginning/

> and the second one in loop in page allocator slowpath.

Also strange word wrapping.

> 
> Signed-off-by: Mateusz Nosek <mateusznosek0@...il.com>
> ---
>  mm/internal.h   |  2 +-
>  mm/page_alloc.c | 23 +++++++++++++++--------
>  2 files changed, 16 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index 86372d164476..7fb724977743 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -535,7 +535,7 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>  #else
>  #define ALLOC_NOFRAGMENT	  0x0
>  #endif
> -#define ALLOC_KSWAPD		0x200 /* allow waking of kswapd */
> +#define ALLOC_KSWAPD		0x800 /* allow waking of kswapd */
>  
>  enum ttu_flags;
>  struct tlbflush_unmap_batch;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 79e950d76ffc..73afd883eab5 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3609,10 +3609,14 @@ static bool zone_allows_reclaim(struct zone *local_zone, struct zone *zone)
>  static inline unsigned int
>  alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
>  {
> -	unsigned int alloc_flags = 0;
> +	unsigned int alloc_flags;
>  
> -	if (gfp_mask & __GFP_KSWAPD_RECLAIM)
> -		alloc_flags |= ALLOC_KSWAPD;
> +	/*
> +	 * __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> +	 * to save a branch.
> +	 */
> +	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
> +	alloc_flags = (__force int) (gfp_mask & __GFP_KSWAPD_RECLAIM);
>  
>  #ifdef CONFIG_ZONE_DMA32
>  	if (!zone)

Ok, that looks reasonable.

> @@ -4248,8 +4252,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  {
>  	unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
>  
> -	/* __GFP_HIGH is assumed to be the same as ALLOC_HIGH to save a branch. */
> +	/*
> +	 * __GFP_HIGH is assumed to be the same as ALLOC_HIGH
> +	 * and __GFP_KSWAPD_RECLAIM is assumed to be the same as ALLOC_KSWAPD
> +	 * to save two branches.
> +	 */
>  	BUILD_BUG_ON(__GFP_HIGH != (__force gfp_t) ALLOC_HIGH);
> +	BUILD_BUG_ON(__GFP_KSWAPD_RECLAIM != (__force gfp_t) ALLOC_KSWAPD);
>  
>  	/*
>  	 * The caller may dip into page reserves a bit more if the caller

As Vlastimil pointed out, you do not need to make two compiler-based
checks. This seems like a reasonable location given that gfp_to_alloc_flags
is the most obvious place to document tricks with the flags.

> @@ -4257,7 +4266,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	 * policy or is asking for __GFP_HIGH memory.  GFP_ATOMIC requests will
>  	 * set both ALLOC_HARDER (__GFP_ATOMIC) and ALLOC_HIGH (__GFP_HIGH).
>  	 */
> -	alloc_flags |= (__force int) (gfp_mask & __GFP_HIGH);
> +	alloc_flags |= (__force int)
> +		(gfp_mask & (__GFP_HIGH | __GFP_KSWAPD_RECLAIM));
>  
>  	if (gfp_mask & __GFP_ATOMIC) {
>  		/*

Slightly harder to read but it does potentially generate better code.

If you correct the typos in the changelog, the slightly strange formatting
of the changelog and drop one of the build checks then you can add

Acked-by: Mel Gorman <mgorman@...hsingularity.net>

-- 
Mel Gorman
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ