Message-Id: <3760F60F-4133-4FE1-9A4C-F335A8230285@lca.pw>
Date: Fri, 17 Jan 2020 11:59:39 -0500
From: Qian Cai <cai@....pw>
To: paulmck@...nel.org
Cc: Marco Elver <elver@...gle.com>,
Alexander Potapenko <glider@...gle.com>,
Andrey Konovalov <andreyknvl@...gle.com>,
Dmitriy Vyukov <dvyukov@...gle.com>,
kasan-dev <kasan-dev@...glegroups.com>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH -rcu] kcsan: Make KCSAN compatible with lockdep
> On Jan 17, 2020, at 11:40 AM, Paul E. McKenney <paulmck@...nel.org> wrote:
>
> True enough, but even if we reach the nirvana state where there is general
> agreement on what constitutes a data race in need of fixing and KCSAN
> faithfully checks based on that data-race definition, we need to handle
> the case where someone introduces a bug that results in a destructive
> off-CPU access to a per-CPU variable, which is exactly the sort of thing
> that KCSAN is supposed to detect. But suppose that this variable is
> frequently referenced from functions that are inlined all over the place.
>
> Then that one bug might result in huge numbers of data-race reports in
> a very short period of time, especially on a large system.
That sounds like the debug_pagealloc case, where it prints a flood of these reports and then the system is just dead.
[ 28.992752][ T394] Reported by Kernel Concurrency Sanitizer on:
[ 28.992752][ T394] CPU: 0 PID: 394 Comm: pgdatinit0 Not tainted 5.5.0-rc6-next-20200115+ #3
[ 28.992752][ T394] Hardware name: HP ProLiant XL230a Gen9/ProLiant XL230a Gen9, BIOS U13 01/22/2018
[ 28.992752][ T394] ===============================================================
[ 28.992752][ T394] ==================================================================
[ 28.992752][ T394] BUG: KCSAN: data-race in __change_page_attr / __change_page_attr
[ 28.992752][ T394]
[ 28.992752][ T394] read to 0xffffffffa01a6de0 of 8 bytes by task 395 on cpu 16:
[ 28.992752][ T394] __change_page_attr+0xe81/0x1620
[ 28.992752][ T394] __change_page_attr_set_clr+0xde/0x4c0
[ 28.992752][ T394] __set_pages_np+0xcc/0x100
[ 28.992752][ T394] __kernel_map_pages+0xd6/0xdb
[ 28.992752][ T394] __free_pages_ok+0x1a8/0x730
[ 28.992752][ T394] __free_pages+0x51/0x90
[ 28.992752][ T394] __free_pages_core+0x1c7/0x2c0
[ 28.992752][ T394] deferred_free_range+0x59/0x8f
[ 28.992752][ T394] deferred_init_maxorder+0x1d6/0x21d
[ 28.992752][ T394] deferred_init_memmap+0x14a/0x1c1
[ 28.992752][ T394] kthread+0x1e0/0x200
[ 28.992752][ T394] ret_from_fork+0x3a/0x50
[ 28.992752][ T394]
[ 28.992752][ T394] write to 0xffffffffa01a6de0 of 8 bytes by task 394 on cpu 0:
[ 28.992752][ T394] __change_page_attr+0xe9c/0x1620
[ 28.992752][ T394] __change_page_attr_set_clr+0xde/0x4c0
[ 28.992752][ T394] __set_pages_np+0xcc/0x100
[ 28.992752][ T394] __kernel_map_pages+0xd6/0xdb
[ 28.992752][ T394] __free_pages_ok+0x1a8/0x730
[ 28.992752][ T394] __free_pages+0x51/0x90
[ 28.992752][ T394] __free_pages_core+0x1c7/0x2c0
[ 28.992752][ T394] deferred_free_range+0x59/0x8f
[ 28.992752][ T394] deferred_init_maxorder+0x1d6/0x21d
[ 28.992752][ T394] deferred_init_memmap+0x14a/0x1c1
[ 28.992752][ T394] kthread+0x1e0/0x200
[ 28.992752][ T394] ret_from_fork+0x3a/0x50
It points to this:
	pgprot_val(new_prot) &= ~pgprot_val(cpa->mask_clr);
	pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);
	cpa_inc_4k_install();
	/* Hand in lpsize = 0 to enforce the protection mechanism */
	new_prot = static_protections(new_prot, address, pfn, 1, 0,
				      CPA_PROTECT);
In static_protections(),
	/*
	 * There is no point in checking RW/NX conflicts when the requested
	 * mapping is setting the page !PRESENT.
	 */
	if (!(pgprot_val(prot) & _PAGE_PRESENT))
		return prot;
Is there a data race there?
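For reference, here is a userspace sketch of the pattern I suspect, not kernel code. The report shows a read and a write to the same 8-byte address, both inside __change_page_attr and only a few bytes apart (+0xe81 and +0xe9c), which would fit an inlined read-modify-write on a global. One candidate is cpa_inc_4k_install(), if it bumps a global counter with a plain increment; that is an assumption I have not checked against the source here. All names below (racy_counter, exact_counter, worker, run_counters) are hypothetical. The racy counter uses separate relaxed atomic loads and stores rather than a plain "++" so that the sketch stays well defined in C11 while still losing updates.

```c
/* Userspace sketch only: every name here is hypothetical, nothing is kernel code. */
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NTHREADS 2

/* Counter bumped with a separate load and store, so updates can be lost. */
static atomic_ulong racy_counter;
/* Counter bumped with a single atomic read-modify-write, so none are lost. */
static atomic_ulong exact_counter;

struct worker_arg {
	unsigned long iterations;
};

static void *worker(void *p)
{
	const struct worker_arg *arg = p;

	for (unsigned long i = 0; i < arg->iterations; i++) {
		/* Another thread may update the counter between these two steps. */
		unsigned long v = atomic_load_explicit(&racy_counter,
						       memory_order_relaxed);
		atomic_store_explicit(&racy_counter, v + 1,
				      memory_order_relaxed);

		/* One indivisible increment. */
		atomic_fetch_add_explicit(&exact_counter, 1,
					  memory_order_relaxed);
	}
	return NULL;
}

/* Runs NTHREADS workers; returns 0 on success and reports both final counts. */
static int run_counters(unsigned long iterations,
			unsigned long *racy, unsigned long *exact)
{
	pthread_t tid[NTHREADS];
	struct worker_arg arg = { .iterations = iterations };
	int started = 0;

	atomic_store(&racy_counter, 0);
	atomic_store(&exact_counter, 0);

	for (; started < NTHREADS; started++)
		if (pthread_create(&tid[started], NULL, worker, &arg) != 0)
			break;
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	*racy = atomic_load(&racy_counter);
	*exact = atomic_load(&exact_counter);
	return started == NTHREADS ? 0 : -1;
}
```

Calling run_counters(100000, &racy, &exact) always leaves exact at 200000, while racy may come out lower because of lost updates. If the variable in the report really is just a statistics counter, a lost update would be harmless and the access could be marked as an intentional race; if it is not, the question above still stands.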