linux-kernel - Re: [PATCH] mm/page_counter: fix various data races

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200129120302.GJ24244@dhcp22.suse.cz>
Date:   Wed, 29 Jan 2020 13:03:02 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Qian Cai <cai@....pw>
Cc:     akpm@...ux-foundation.org, hannes@...xchg.org, elver@...gle.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/page_counter: fix various data races

On Wed 29-01-20 05:52:24, Qian Cai wrote:
> The commit 3e32cb2e0a12 ("mm: memcontrol: lockless page counters") could
> had memcg->memsw->watermark been accessed concurrently as reported by
> KCSAN,
> 
>  Reported by Kernel Concurrency Sanitizer on:
>  BUG: KCSAN: data-race in page_counter_try_charge / page_counter_try_charge
> 
>  read to 0xffff8fb18c4cd190 of 8 bytes by task 1081 on cpu 59:
>   page_counter_try_charge+0x4d/0x150 mm/page_counter.c:138
>   try_charge+0x131/0xd50 mm/memcontrol.c:2405
>   __memcg_kmem_charge_memcg+0x58/0x140
>   __memcg_kmem_charge+0xcc/0x280
>   __alloc_pages_nodemask+0x1e1/0x450
>   alloc_pages_current+0xa6/0x120
>   pte_alloc_one+0x17/0xd0
>   __pte_alloc+0x3a/0x1f0
>   copy_p4d_range+0xc36/0x1990
>   copy_page_range+0x21d/0x360
>   dup_mmap+0x5f5/0x7a0
>   dup_mm+0xa2/0x240
>   copy_process+0x1b3f/0x3460
>   _do_fork+0xaa/0xa20
>   __x64_sys_clone+0x13b/0x170
>   do_syscall_64+0x91/0xb47
>   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
>  write to 0xffff8fb18c4cd190 of 8 bytes by task 1153 on cpu 120:
>   page_counter_try_charge+0x5b/0x150 mm/page_counter.c:139
>   try_charge+0x131/0xd50 mm/memcontrol.c:2405
>   mem_cgroup_try_charge+0x159/0x460
>   mem_cgroup_try_charge_delay+0x3d/0xa0
>   wp_page_copy+0x14d/0x930
>   do_wp_page+0x107/0x7b0
>   __handle_mm_fault+0xce6/0xd40
>   handle_mm_fault+0xfc/0x2f0
>   do_page_fault+0x263/0x6f9
>   page_fault+0x34/0x40
> 
> Since watermark could be compared or set to garbage due to load or
> store tearing which would change the code logic, fix it by adding a pair
> of READ_ONCE() and WRITE_ONCE() in those places.

There is no actual problem and the report is false positive, right?
While compilers are free to do all sorts of stuff do we have any actual
proof that this particular path would ever be problematic.

I do not oppose to the patch. {READ,WRITE}_ONCE actually makes some
sense to me, but the changelog should be more clear on how serious this
is.
 
> Fixes: 3e32cb2e0a12 ("mm: memcontrol: lockless page counters")
> Signed-off-by: Qian Cai <cai@....pw>

Acked-by: Michal Hocko <mhocko@...e.com>

> ---
>  mm/page_counter.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/page_counter.c b/mm/page_counter.c
> index de31470655f6..a17841150906 100644
> --- a/mm/page_counter.c
> +++ b/mm/page_counter.c
> @@ -82,8 +82,8 @@ void page_counter_charge(struct page_counter *counter, unsigned long nr_pages)
>  		 * This is indeed racy, but we can live with some
>  		 * inaccuracy in the watermark.
>  		 */
> -		if (new > c->watermark)
> -			c->watermark = new;
> +		if (new > READ_ONCE(c->watermark))
> +			WRITE_ONCE(c->watermark, new);
>  	}
>  }
>  
> @@ -135,8 +135,8 @@ bool page_counter_try_charge(struct page_counter *counter,
>  		 * Just like with failcnt, we can live with some
>  		 * inaccuracy in the watermark.
>  		 */
> -		if (new > c->watermark)
> -			c->watermark = new;
> +		if (new > READ_ONCE(c->watermark))
> +			WRITE_ONCE(c->watermark, new);
>  	}
>  	return true;
>  
> -- 
> 2.21.0 (Apple Git-122.2)

-- 
Michal Hocko
SUSE Labs