linux-kernel - Re: [PATCH -rcu] kcsan: Make KCSAN compatible with lockdep


Message-Id: <3760F60F-4133-4FE1-9A4C-F335A8230285@lca.pw>
Date:   Fri, 17 Jan 2020 11:59:39 -0500
From:   Qian Cai <cai@....pw>
To:     paulmck@...nel.org
Cc:     Marco Elver <elver@...gle.com>,
        Alexander Potapenko <glider@...gle.com>,
        Andrey Konovalov <andreyknvl@...gle.com>,
        Dmitriy Vyukov <dvyukov@...gle.com>,
        kasan-dev <kasan-dev@...glegroups.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>
Subject: Re: [PATCH -rcu] kcsan: Make KCSAN compatible with lockdep



> On Jan 17, 2020, at 11:40 AM, Paul E. McKenney <paulmck@...nel.org> wrote:
> 
> True enough, but even if we reach the nirvana state where there is general
> agreement on what constitutes a data race in need of fixing and KCSAN
> faithfully checks based on that data-race definition, we need to handle
> the case where someone introduces a bug that results in a destructive
> off-CPU access to a per-CPU variable, which is exactly the sort of thing
> that KCSAN is supposed to detect.  But suppose that this variable is
> frequently referenced from functions that are inlined all over the place.
> 
> Then that one bug might result in huge numbers of data-race reports in
> a very short period of time, especially on a large system.

It sounds like the case with debug_pagealloc, where it prints a flood of these reports and then the system is just dead.

[   28.992752][  T394] Reported by Kernel Concurrency Sanitizer on: 
[   28.992752][  T394] CPU: 0 PID: 394 Comm: pgdatinit0 Not tainted 5.5.0-rc6-next-20200115+ #3 
[   28.992752][  T394] Hardware name: HP ProLiant XL230a Gen9/ProLiant XL230a Gen9, BIOS U13 01/22/2018 
[   28.992752][  T394] =============================================================== 
[   28.992752][  T394] ================================================================== 
[   28.992752][  T394] BUG: KCSAN: data-race in __change_page_attr / __change_page_attr 
[   28.992752][  T394]  
[   28.992752][  T394] read to 0xffffffffa01a6de0 of 8 bytes by task 395 on cpu 16: 
[   28.992752][  T394]  __change_page_attr+0xe81/0x1620 
[   28.992752][  T394]  __change_page_attr_set_clr+0xde/0x4c0 
[   28.992752][  T394]  __set_pages_np+0xcc/0x100 
[   28.992752][  T394]  __kernel_map_pages+0xd6/0xdb 
[   28.992752][  T394]  __free_pages_ok+0x1a8/0x730 
[   28.992752][  T394]  __free_pages+0x51/0x90 
[   28.992752][  T394]  __free_pages_core+0x1c7/0x2c0 
[   28.992752][  T394]  deferred_free_range+0x59/0x8f 
[   28.992752][  T394]  deferred_init_max21d 
[   28.992752][  T394]  deferred_init_memmap+0x14a/0x1c1 
[   28.992752][  T394]  kthread+0x1e0/0x200 
[   28.992752][  T394]  ret_from_fork+0x3a/0x50 
[   28.992752][  T394]  
[   28.992752][  T394] write to 0xffffffffa01a6de0 of 8 bytes by task 394 on cpu 0: 
[   28.992752][  T394]  __change_page_attr+0xe9c/0x1620 
[   28.992752][  T394]  __change_page_attr_set_clr+0xde/0x4c0 
[   28.992752][  T394]  __set_pages_np+0xcc/0x100 
[   28.992752][  T394]  __kernel_map_pages+0xd6/0xdb 
[   28.992752][  T394]  __free_pages_ok+0x1a8/0x730 
[   28.992752][  T394]  __free_pages+0x51/0x90 
[   28.992752][  T394]  __free_pages_core+0x1c7/0x2c0 
[   28.992752][  T394]  deferred_free_range+0x59/0x8f 
[   28.992752][  T394]  deferred_init_maxorder+0x1d6/0x21d 
[   28.992752][  T394]  deferred_init_memmap+0x14a/0x1c1 
[   28.992752][  T394]  kthread+0x1e0/0x200 
[   28.992752][  T394]  ret_from_fork+0x3a/0x50 

It points to this code:

		pgprot_val(new_prot) &= ~pgprot_val(cpa->mask_clr);
		pgprot_val(new_prot) |= pgprot_val(cpa->mask_set);

		cpa_inc_4k_install();
		/* Hand in lpsize = 0 to enforce the protection mechanism */
		new_prot = static_protections(new_prot, address, pfn, 1, 0,
					      CPA_PROTECT);

In static_protections(),

	/*
	 * There is no point in checking RW/NX conflicts when the requested
	 * mapping is setting the page !PRESENT.
	 */
	if (!(pgprot_val(prot) & _PAGE_PRESENT))
		return prot;

Is there a data race there?
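
To make the question concrete, here is a minimal userspace sketch of how an 8-byte read and an 8-byte write to the same address, as in the report above, can come from a plain increment. It assumes, without having checked, that something on this path bumps a shared counter with no lock or atomic (cpa_inc_4k_install() would be one candidate, but its body is not shown here). The counter and the helpers below are hypothetical and are not kernel code; the interleaving is replayed by hand on a single thread so the result is deterministic.

```c
/* Hypothetical stand-in for a shared statistics counter (not a kernel symbol). */
static unsigned long counter;

/* A plain `counter++` is a load followed by a store; split them apart here. */
static unsigned long load_counter(void)
{
	return counter;
}

static void store_counter(unsigned long v)
{
	counter = v;
}

/*
 * Replay one unlucky interleaving of two increments, labelled after the
 * two CPUs in the report. Both sides load before either stores, so one
 * increment is lost. Returns the final counter value.
 */
unsigned long lost_update_demo(void)
{
	unsigned long cpu0, cpu16;

	counter = 0;
	cpu0 = load_counter();    /* "cpu 0" reads 0 */
	cpu16 = load_counter();   /* "cpu 16" reads 0 as well */
	store_counter(cpu0 + 1);  /* "cpu 0" writes 1 */
	store_counter(cpu16 + 1); /* "cpu 16" overwrites it with 1 */

	return counter;           /* 1, although two increments ran */
}
```

If that assumption holds and the counter is only a debugging statistic, a lost update would be harmless, and the open question becomes whether the access should be annotated rather than fixed.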