Date: Mon, 9 Sep 2019 12:07:01 +0300
From: "Kirill A. Shutemov" <kirill@...temov.name>
To: Alexander Duyck <alexander.duyck@...il.com>
Cc: virtio-dev@...ts.oasis-open.org, kvm@...r.kernel.org,
mst@...hat.com, catalin.marinas@....com, david@...hat.com,
dave.hansen@...el.com, linux-kernel@...r.kernel.org,
willy@...radead.org, mhocko@...nel.org, linux-mm@...ck.org,
akpm@...ux-foundation.org, will@...nel.org,
linux-arm-kernel@...ts.infradead.org, osalvador@...e.de,
yang.zhang.wz@...il.com, pagupta@...hat.com,
konrad.wilk@...cle.com, nitesh@...hat.com, riel@...riel.com,
lcapitulino@...hat.com, wei.w.wang@...el.com, aarcange@...hat.com,
ying.huang@...el.com, pbonzini@...hat.com,
dan.j.williams@...el.com, fengguang.wu@...el.com,
alexander.h.duyck@...ux.intel.com, kirill.shutemov@...ux.intel.com
Subject: Re: [PATCH v9 1/8] mm: Add per-cpu logic to page shuffling
On Sat, Sep 07, 2019 at 10:25:12AM -0700, Alexander Duyck wrote:
> From: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
>
> Change the logic used to generate randomness in the suffle path so that we
Typo: s/suffle/shuffle/
> can avoid cache line bouncing. The previous logic was sharing the offset
> and entropy word between all CPUs. As such this can result in cache line
> bouncing and will ultimately hurt performance when enabled.
>
> To resolve this I have moved to a per-cpu logic for maintaining a unsigned
> long containing some amount of bits, and an offset value for which bit we
> can use for entropy with each call.
>
> Reviewed-by: Dan Williams <dan.j.williams@...el.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
> ---
> mm/shuffle.c | 33 +++++++++++++++++++++++----------
> 1 file changed, 23 insertions(+), 10 deletions(-)
>
> diff --git a/mm/shuffle.c b/mm/shuffle.c
> index 3ce12481b1dc..9ba542ecf335 100644
> --- a/mm/shuffle.c
> +++ b/mm/shuffle.c
> @@ -183,25 +183,38 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat)
> shuffle_zone(z);
> }
>
> +struct batched_bit_entropy {
> + unsigned long entropy_bool;
> + int position;
> +};
> +
> +static DEFINE_PER_CPU(struct batched_bit_entropy, batched_entropy_bool);
> +
> void add_to_free_area_random(struct page *page, struct free_area *area,
> int migratetype)
> {
> - static u64 rand;
> - static u8 rand_bits;
> + struct batched_bit_entropy *batch;
> + unsigned long entropy;
> + int position;
>
> /*
> - * The lack of locking is deliberate. If 2 threads race to
> - * update the rand state it just adds to the entropy.
> + * We shouldn't need to disable IRQs as the only caller is
> + * __free_one_page and it should only be called with the zone lock
> + * held and either from IRQ context or with local IRQs disabled.
> */
> - if (rand_bits == 0) {
> - rand_bits = 64;
> - rand = get_random_u64();
> + batch = raw_cpu_ptr(&batched_entropy_bool);
> + position = batch->position;
> +
> + if (--position < 0) {
> + batch->entropy_bool = get_random_long();
> + position = BITS_PER_LONG - 1;
> }
>
> - if (rand & 1)
> + batch->position = position;
> + entropy = batch->entropy_bool;
> +
> + if (1ul & (entropy >> position))
Maybe something like this would be more readable:
if (entropy & BIT(position))
> add_to_free_area(page, area, migratetype);
> else
> add_to_free_area_tail(page, area, migratetype);
> - rand_bits--;
> - rand >>= 1;
> }
>
>
--
Kirill A. Shutemov