linux-kernel - RE: [PATCH 3.2.0-rc1 2/3] MM hook for page allocation and release

Message-ID: <84FF21A720B0874AA94B46D76DB9826904554270@008-AM1MPN1-003.mgdnok.nokia.com>
Date:	Thu, 5 Jan 2012 11:26:34 +0000
From:	<leonid.moiseichuk@...ia.com>
To:	<kamezawa.hiroyu@...fujitsu.com>
CC:	<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
	<cesarb@...arb.net>, <emunson@...bm.net>, <penberg@...nel.org>,
	<aarcange@...hat.com>, <riel@...hat.com>, <mel@....ul.ie>,
	<rientjes@...gle.com>, <dima@...roid.com>, <gregkh@...e.de>,
	<rebecca@...roid.com>, <san@...gle.com>,
	<akpm@...ux-foundation.org>, <vesa.jaaskelainen@...ia.com>
Subject: RE: [PATCH 3.2.0-rc1 2/3] MM hook for page allocation and release

Hi,

I agree that hooking alloc_pages is an ugly way to do it. The alternatives I see:

- shrinkers (as used by e.g. Android OOM), but shrink_slab is called only from try_to_free_pages, i.e. only when we are on the slow reclaim path of a memory allocation, so it cannot be used for e.g. tracking a 75% memory level, or for notifying user space that we are OK again once pages are released. In terms of ease of use, though, it would be the best approach.

- memcg-style changes like mem_cgroup_newpage_charge/uncharge_page, but without the blocking decision-making logic. That looks like more changes to me. The threshold in memcg is currently set to 128 pages per CPU, which is quite frequent for level-tracking needs.

- tracking the situation using a timer? Probably not, because it would impact battery life.
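
The level tracking discussed above (reacting when usage crosses e.g. 75%, and again when it drops back) can be sketched in user space as follows. This is only an illustrative model, not code from the patch: struct level_tracker, level_tracker_init() and level_tracker_update() are hypothetical names, and only the sign convention for nr_pages is borrowed from the proposed hook.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of level tracking; not code from the patch. */
struct level_tracker {
	long total_pages;	/* size of the tracked memory, in pages */
	long used_pages;	/* running count fed by alloc/free deltas */
	int  threshold_pct;	/* notification level, e.g. 75 */
	bool above;		/* last reported state */
};

enum level_event { LEVEL_NONE, LEVEL_CROSSED_UP, LEVEL_CROSSED_DOWN };

static void level_tracker_init(struct level_tracker *t, long total_pages,
			       int threshold_pct)
{
	assert(total_pages > 0 && threshold_pct > 0 && threshold_pct <= 100);
	t->total_pages = total_pages;
	t->used_pages = 0;
	t->threshold_pct = threshold_pct;
	t->above = false;
}

/* nr_pages > 0 after an allocation, < 0 after a free. */
static enum level_event level_tracker_update(struct level_tracker *t,
					     int nr_pages)
{
	bool now_above;

	t->used_pages += nr_pages;
	if (t->used_pages < 0)
		t->used_pages = 0;	/* the count is allowed to be approximate */

	now_above = t->used_pages * 100 >= t->total_pages * t->threshold_pct;
	if (now_above == t->above)
		return LEVEL_NONE;	/* no level change, nothing to report */

	t->above = now_above;
	return now_above ? LEVEL_CROSSED_UP : LEVEL_CROSSED_DOWN;
}
```

A caller would feed every alloc/free delta into level_tracker_update() and notify user space only when the result is not LEVEL_NONE, so the common case stays a few arithmetic operations.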

With Best Wishes,
Leonid


-----Original Message-----
From: ext KAMEZAWA Hiroyuki [mailto:kamezawa.hiroyu@...fujitsu.com] 
Sent: 05 January, 2012 09:00
To: Moiseichuk Leonid (Nokia-MP/Helsinki)
Cc: linux-mm@...ck.org; linux-kernel@...r.kernel.org; cesarb@...arb.net; emunson@...bm.net; penberg@...nel.org; aarcange@...hat.com; riel@...hat.com; mel@....ul.ie; rientjes@...gle.com; dima@...roid.com; gregkh@...e.de; rebecca@...roid.com; san@...gle.com; akpm@...ux-foundation.org; Jaaskelainen Vesa (Nokia-MP/Helsinki)
Subject: Re: [PATCH 3.2.0-rc1 2/3] MM hook for page allocation and release

On Wed,  4 Jan 2012 19:21:55 +0200
Leonid Moiseichuk <leonid.moiseichuk@...ia.com> wrote:

> This is required by the Used Memory Meter (UMM) pseudo-device to track
> memory utilization in the system. The hook MUST be very light to
> avoid a performance impact on the hot allocation path. The count of
> managed pages is not expected to be absolutely accurate, but every
> allocation or deallocation must be registered.
> 
> Signed-off-by: Leonid Moiseichuk <leonid.moiseichuk@...ia.com>

I never like arbitrary hooks in alloc_pages().
Could you find another way?

Hmm. memcg uses per-cpu counters to count alloc/free events and triggers a threshold check every 128 events on a CPU.
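
The batching idea described here can be modelled in user space roughly as below. This is a sketch under stated assumptions, not memcg's actual code: struct batched_counter, batched_counter_event(), NR_CPUS_MODEL and EVENT_BATCH are hypothetical names, and the "threshold check" is reduced to a counter increment.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL	4	/* hypothetical CPU count for the model */
#define EVENT_BATCH	128	/* events per CPU between threshold checks */

struct batched_counter {
	long global_pages;			/* pages folded in so far */
	long local_pages[NR_CPUS_MODEL];	/* per-CPU pending deltas */
	int  local_events[NR_CPUS_MODEL];	/* per-CPU events since last fold */
	long checks;				/* number of threshold checks run */
};

/*
 * Record one alloc/free event on 'cpu'. Returns true when this event
 * completed a batch and the (expensive) threshold check was run.
 */
static bool batched_counter_event(struct batched_counter *c, int cpu,
				  int nr_pages)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);

	c->local_pages[cpu] += nr_pages;
	if (++c->local_events[cpu] < EVENT_BATCH)
		return false;		/* hot path: only per-CPU data touched */

	/* Every EVENT_BATCH events: fold the local delta and check. */
	c->global_pages += c->local_pages[cpu];
	c->local_pages[cpu] = 0;
	c->local_events[cpu] = 0;
	c->checks++;			/* stands in for the threshold check */
	return true;
}
```

The trade-off is visible in the model: the global count can lag by up to EVENT_BATCH - 1 events per CPU, in exchange for touching shared state only once per batch.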


Thanks,
-Kame


> ---
>  include/linux/mm.h |   15 +++++++++++++++
>  mm/Kconfig         |    8 ++++++++
>  mm/page_alloc.c    |   31 +++++++++++++++++++++++++++++++
>  3 files changed, 54 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 3dc3a8c..d133f73 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1618,6 +1618,21 @@ extern int soft_offline_page(struct page *page, int flags);
>  
>  extern void dump_page(struct page *page);
>  
> +#ifdef CONFIG_MM_ALLOC_FREE_HOOK
> +/*
> + * Hook function type which called when some pages allocated or released.
> + * Value of nr_pages is positive for post-allocation calls and negative
> + * after free.
> + */
> +typedef void (*mm_alloc_free_hook_t)(int nr_pages);
> +
> +/*
> + * Setups specified hook function for tracking pages allocation.
> + * Returns value of old hook to organize chains of calls if necessary.
> + */
> +mm_alloc_free_hook_t set_mm_alloc_free_hook(mm_alloc_free_hook_t hook);
> +#endif
> +
>  #if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLBFS)
>  extern void clear_huge_page(struct page *page,
>  			    unsigned long addr,
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 011b110..2aaa1e9 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -373,3 +373,11 @@ config CLEANCACHE
>  	  in a negligible performance hit.
>  
>  	  If unsure, say Y to enable cleancache
> +
> +config MM_ALLOC_FREE_HOOK
> +	bool "Enable callback support for pages allocation and releasing"
> +	default n
> +	help
> +	  Required for some features like used memory meter.
> +	  If unsure, say N to disable alloc/free hook.
> +
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9dd443d..9307800 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -236,6 +236,30 @@ static void set_pageblock_migratetype(struct page *page, int migratetype)
>  
>  bool oom_killer_disabled __read_mostly;
>  
> +#ifdef CONFIG_MM_ALLOC_FREE_HOOK
> +static atomic_long_t alloc_free_hook __read_mostly = ATOMIC_LONG_INIT(0);
> +
> +mm_alloc_free_hook_t set_mm_alloc_free_hook(mm_alloc_free_hook_t hook)
> +{
> +	const mm_alloc_free_hook_t old_hook =
> +		(mm_alloc_free_hook_t)atomic_long_read(&alloc_free_hook);
> +
> +	atomic_long_set(&alloc_free_hook, (long)hook);
> +	pr_info("MM alloc/free hook set to 0x%p (was 0x%p)\n", hook, old_hook);
> +
> +	return old_hook;
> +}
> +EXPORT_SYMBOL(set_mm_alloc_free_hook);
> +
> +static inline void call_alloc_free_hook(int pages)
> +{
> +	const mm_alloc_free_hook_t hook =
> +		(mm_alloc_free_hook_t)atomic_long_read(&alloc_free_hook);
> +	if (hook)
> +		hook(pages);
> +}
> +#endif
> +
>  #ifdef CONFIG_DEBUG_VM
>  static int page_outside_zone_boundaries(struct zone *zone, struct page *page)
>  {
> @@ -2298,6 +2322,10 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order,
>  	put_mems_allowed();
>  
>  	trace_mm_page_alloc(page, order, gfp_mask, migratetype);
> +#ifdef CONFIG_MM_ALLOC_FREE_HOOK
> +	call_alloc_free_hook(1 << order);
> +#endif
> +
>  	return page;
>  }
>  EXPORT_SYMBOL(__alloc_pages_nodemask);
> @@ -2345,6 +2373,9 @@ void __free_pages(struct page *page, unsigned int order)
>  			free_hot_cold_page(page, 0);
>  		else
>  			__free_pages_ok(page, order);
> +#ifdef CONFIG_MM_ALLOC_FREE_HOOK
> +		call_alloc_free_hook(-(1 << order));
> +#endif
>  	}
>  }
>  
> --
> 1.7.7.3
> 
> 

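The quoted patch says set_mm_alloc_free_hook() returns the old hook "to organize chains of calls if necessary". A user-space sketch of such chaining might look like the following. It is a model only: the umm_* names are hypothetical, C11 atomics stand in for the kernel's atomic_long_t, and atomic_exchange() is used here instead of the patch's separate read and set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*mm_alloc_free_hook_t)(int nr_pages);

/* User-space stand-in for the hook slot kept in mm/page_alloc.c. */
static _Atomic(mm_alloc_free_hook_t) alloc_free_hook;

/* Install 'hook' and return the previous one (swapped in one step). */
static mm_alloc_free_hook_t set_mm_alloc_free_hook(mm_alloc_free_hook_t hook)
{
	return atomic_exchange(&alloc_free_hook, hook);
}

static void call_alloc_free_hook(int pages)
{
	mm_alloc_free_hook_t hook = atomic_load(&alloc_free_hook);

	if (hook)
		hook(pages);
}

/* Hypothetical client: counts pages, then forwards to the previous hook. */
static long umm_used_pages;
static mm_alloc_free_hook_t umm_prev_hook;
static bool umm_attached;

static void umm_hook(int nr_pages)
{
	umm_used_pages += nr_pages;
	if (umm_prev_hook)
		umm_prev_hook(nr_pages);	/* keep the chain intact */
}

static void umm_attach(void)
{
	/* Attaching twice would chain umm_hook to itself and recurse forever. */
	assert(!umm_attached);
	umm_prev_hook = set_mm_alloc_free_hook(umm_hook);
	umm_attached = true;
}

static void umm_detach(void)
{
	set_mm_alloc_free_hook(umm_prev_hook);
	umm_prev_hook = NULL;
	umm_attached = false;
}
```

Restoring the previous hook on detach is only safe if no other client installed a hook in the meantime, which is one of the fragilities of a single global hook slot.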