Date:	Mon, 9 May 2016 17:07:07 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>
Cc:	Minchan Kim <minchan@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
	"[4.3+]" <stable@...r.kernel.org>
Subject: Re: [PATCH] zsmalloc: fix zs_can_compact() integer overflow

Hello Sergey,

On Mon, May 09, 2016 at 04:35:33PM +0900, Sergey Senozhatsky wrote:
> zs_can_compact() has two race conditions in its core calculation:
> 
> unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> 				zs_stat_get(class, OBJ_USED);
> 
> 1) classes are not locked, so the numbers of allocated and used
>    objects can change by the concurrent ops happening on other CPUs
> 2) shrinker invokes it from preemptible context
> 
> Depending on the circumstances, OBJ_ALLOCATED can become less
> than OBJ_USED, which can result in either very high or negative
> `total_scan' value calculated in do_shrink_slab().
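
A minimal userspace sketch of that wrap-around, with hypothetical
counter values standing in for the racy zs_stat_get() reads; this is
an illustration, not kernel code:

#include <stdio.h>

int main(void)
{
	/*
	 * Hypothetical snapshot: a concurrent op on another CPU bumps
	 * OBJ_USED between the two reads, so used exceeds allocated.
	 */
	unsigned long obj_allocated = 100;
	unsigned long obj_used = 102;

	/* The unguarded subtraction wraps to a value near ULONG_MAX. */
	unsigned long obj_wasted = obj_allocated - obj_used;
	printf("obj_wasted = %lu\n", obj_wasted);

	/* With the patch's guard, zs_can_compact() returns 0 instead. */
	if (obj_allocated <= obj_used)
		printf("guarded: zs_can_compact() would return 0\n");

	return 0;
}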

So, do you see the pr_err("shrink_slab: %pF negative objects xxxx")
message in vmscan.c, and does it then skip the shrinking?
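
For reference, a rough sketch of how such a wrapped count could show
up as that "negative objects" condition; the signed/unsigned handling
is paraphrased from the shrinker scan-target math, not the exact
vmscan.c code:

#include <stdio.h>

int main(void)
{
	/*
	 * Hypothetical: the value a count_objects() callback would
	 * return after the unguarded subtraction wrapped around.
	 */
	unsigned long freeable_unsigned = 18446744073709551614UL;

	/*
	 * The shrinker core does its scan-target arithmetic with
	 * signed longs, so the wrapped count can come out negative
	 * (implementation-defined conversion, but it wraps on the
	 * usual two's-complement targets).
	 */
	long freeable = (long)freeable_unsigned;

	if (freeable < 0)
		printf("negative objects to delete nr=%ld\n", freeable);

	return 0;
}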

It would be better to explain what happens without this patch and what
the end-user effect is, to justify sending it to -stable.

At least, I seem to have seen the above pr_err, but at that time, if I
remember correctly, I thought it was a bug I had introduced during
development. Since then, I have not been able to reproduce the
symptom. :)

Good catch!
My comment is below.

> 
> Take a pool stat snapshot and use it instead of racy zs_stat_get()
> calls.
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
> Cc: Minchan Kim <minchan@...nel.org>
> Cc: <stable@...r.kernel.org>        [4.3+]
> ---
>  mm/zsmalloc.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> index 107ec06..1bc2a98 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -2262,10 +2262,13 @@ static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage)

It seems this patch is based on my old page migration work? That work
has not gone to mainline yet, but your patch, which fixes the bug,
should go to -stable. So, I hope this fix goes in first.

Thanks.

>  static unsigned long zs_can_compact(struct size_class *class)
>  {
>  	unsigned long obj_wasted;
> +	unsigned long obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
> +	unsigned long obj_used = zs_stat_get(class, OBJ_USED);
>  
> -	obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) -
> -		zs_stat_get(class, OBJ_USED);
> +	if (obj_allocated <= obj_used)
> +		return 0;
>  
> +	obj_wasted = obj_allocated - obj_used;
>  	obj_wasted /= get_maxobj_per_zspage(class->size,
>  			class->pages_per_zspage);
>  
> -- 
> 2.8.2
> 
