linux-kernel - Re: [RFC] memory_hotplug: Free pages as pageblock

Message-ID: <20180912103853.GC10951@dhcp22.suse.cz>
Date:   Wed, 12 Sep 2018 12:38:53 +0200
From:   Michal Hocko <mhocko@...nel.org>
To:     Arun KS <arunks@...eaurora.org>
Cc:     akpm@...ux-foundation.org, dan.j.williams@...el.com,
        vbabka@...e.cz, pasha.tatashin@...cle.com, iamjoonsoo.kim@....com,
        osalvador@...e.de, malat@...ian.org, gregkh@...uxfoundation.org,
        yasu.isimatu@...il.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, arunks.linux@...il.com,
        vinmenon@...eaurora.org
Subject: Re: [RFC] memory_hotplug: Free pages as pageblock_order

On Wed 12-09-18 14:56:45, Arun KS wrote:
> When free pages are done with pageblock_order, time spent on
> coalescing pages by buddy allocator can be reduced. With
> section size of 256MB, hot add latency of a single section
> shows improvement from 50-60 ms to less than 1 ms, hence
> improving the hot add latency by 60%.

Where does the improvement come from? You are still doing the same
amount of work except that the number of callbacks is lower. Is this the
real source of 60% improvement?
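
To put a number on "the number of callbacks is lower", here is a rough
userspace sketch, not kernel code. callbacks_needed() is a hypothetical
helper, and the 4 KiB page size and pageblock_order of 9 mentioned below
are assumptions that depend on architecture and config; neither is
stated in the patch.

```c
/*
 * Hypothetical helper: how many callback invocations it takes to online
 * nr_pages when each invocation covers 2^order pages. A partial trailing
 * chunk is rounded up to one extra call.
 */
static unsigned long callbacks_needed(unsigned long nr_pages, unsigned int order)
{
	unsigned long chunk = 1UL << order;

	return (nr_pages + chunk - 1) / chunk;
}
```

Assuming 4 KiB pages, a 256MB section holds 65536 pages, so
callbacks_needed(65536, 0) gives 65536 calls and callbacks_needed(65536, 9)
gives 128. The per-page work inside generic_online_pages() is unchanged;
only the number of calls and of __free_pages() invocations drops.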

> 
> If this looks okay, I'll modify users of set_online_page_callback
> and resend clean patch.

[...]

> +static int generic_online_pages(struct page *page, unsigned int order);
> +static online_pages_callback_t online_pages_callback = generic_online_pages;
> +
> +static int generic_online_pages(struct page *page, unsigned int order)
> +{
> +	unsigned long nr_pages = 1 << order;
> +	struct page *p = page;
> +	unsigned int loop;
> +
> +	for (loop = 0 ; loop < nr_pages ; loop++, p++) {
> +		__ClearPageReserved(p);
> +		set_page_count(p, 0);
> +	}
> +	adjust_managed_page_count(page, nr_pages);
> +	init_page_count(page);
> +	__free_pages(page, order);
> +
> +	return 0;
> +}
> +
> +static int online_pages_blocks(unsigned long start_pfn, unsigned long nr_pages)
> +{
> +	unsigned long pages_per_block = (1 << pageblock_order);
> +	unsigned long nr_pageblocks = nr_pages / pages_per_block;
> +//	unsigned long rem_pages = nr_pages % pages_per_block;
> +	int i, ret, onlined_pages = 0;
> +	struct page *page;
> +
> +	for (i = 0 ; i < nr_pageblocks ; i++) {
> +		page = pfn_to_page(start_pfn + (i * pages_per_block));
> +		ret = (*online_pages_callback)(page, pageblock_order);
> +		if (!ret)
> +			onlined_pages += pages_per_block;
> +		else if (ret > 0)
> +			onlined_pages += ret;
> +	}

Could you explain why the pages_per_block step makes any sense? Why
don't you simply handle the full nr_pages worth of memory range
instead?
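
One possible reading of "handle the full nr_pages worth of memory range"
is to walk the range in the largest naturally aligned power-of-two chunks
rather than a fixed pageblock step. The following is a userspace sketch
of that idea only. chunk_order(), count_chunks() and MAX_CHUNK_ORDER are
hypothetical names, and the cap of 10 is an assumption about the largest
order the buddy allocator would accept.

```c
/* Assumed cap on chunk size; not taken from the patch. */
#define MAX_CHUNK_ORDER 10

/*
 * Largest order such that pfn is aligned to 2^order pages and 2^order
 * does not exceed nr_pages, capped at MAX_CHUNK_ORDER.
 */
static unsigned int chunk_order(unsigned long pfn, unsigned long nr_pages)
{
	unsigned int order = 0;

	while (order < MAX_CHUNK_ORDER &&
	       (pfn & ((1UL << (order + 1)) - 1)) == 0 &&
	       (1UL << (order + 1)) <= nr_pages)
		order++;

	return order;
}

/*
 * Walk [start_pfn, start_pfn + nr_pages) and count the chunks, i.e. the
 * number of times a per-chunk callback would be invoked. This also
 * covers any remainder that a fixed-step loop would have to special-case.
 */
static unsigned long count_chunks(unsigned long start_pfn, unsigned long nr_pages)
{
	unsigned long chunks = 0;

	while (nr_pages) {
		unsigned long step = 1UL << chunk_order(start_pfn, nr_pages);

		start_pfn += step;
		nr_pages -= step;
		chunks++;
	}

	return chunks;
}
```

Under these assumptions an aligned range of 65536 pages is covered in 64
chunks, and a range that is not a multiple of the chunk size is still
covered completely, with no separate rem_pages path.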

> +/*
> +	if (rem_pages)
> +		onlined_pages += online_page_single(start_pfn + i, rem_pages);
> +*/
> +
> +	return onlined_pages;
> +}
> +
>  static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
>  			void *arg)
>  {
> -	unsigned long i;
>  	unsigned long onlined_pages = *(unsigned long *)arg;
> -	struct page *page;
>  
>  	if (PageReserved(pfn_to_page(start_pfn)))
> -		for (i = 0; i < nr_pages; i++) {
> -			page = pfn_to_page(start_pfn + i);
> -			(*online_page_callback)(page);
> -			onlined_pages++;
> -		}
> +		onlined_pages = online_pages_blocks(start_pfn, nr_pages);
>  
>  	online_mem_sections(start_pfn, start_pfn + nr_pages);
>  
> -- 
> 1.9.1
> 

-- 
Michal Hocko
SUSE Labs