Message-ID: <20240618064022.1990814-1-mawupeng1@huawei.com>
Date: Tue, 18 Jun 2024 14:40:22 +0800
From: Wupeng Ma <mawupeng1@...wei.com>
To: <akpm@...ux-foundation.org>, <ryabinin.a.a@...il.com>,
	<glider@...gle.com>, <andreyknvl@...il.com>, <dvyukov@...gle.com>,
	<vincenzo.frascino@....com>
CC: <mawupeng1@...wei.com>, <kasan-dev@...glegroups.com>,
	<linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>
Subject: [Question] race during kasan_populate_vmalloc_pte

Hi maintainers,

During our testing, we discovered that kasan vmalloc may trigger a false
vmalloc-out-of-bounds warning due to a race between kasan_populate_vmalloc_pte
and kasan_depopulate_vmalloc_pte.

cpu0				cpu1				cpu2
  kasan_populate_vmalloc_pte	kasan_populate_vmalloc_pte	kasan_depopulate_vmalloc_pte
								spin_lock(&init_mm.page_table_lock);
  pte_none(ptep_get(ptep))
  // pte is valid here, return here
								pte_clear(&init_mm, addr, ptep);
				pte_none(ptep_get(ptep))
				// pte is none here, try to alloc new pages
								spin_unlock(&init_mm.page_table_lock);
kasan_poison
// memset kasan shadow region to 0
				page = __get_free_page(GFP_KERNEL);
				__memset((void *)page, KASAN_VMALLOC_INVALID, PAGE_SIZE);
				pte = pfn_pte(PFN_DOWN(__pa(page)), PAGE_KERNEL);
				spin_lock(&init_mm.page_table_lock);
				set_pte_at(&init_mm, addr, ptep, pte);
				spin_unlock(&init_mm.page_table_lock);


After the race, the kasan shadow memory that cpu0 has just set up is replaced
by the page cpu1 installs, so it reads as 0xf0, which means it is not
initialized. Consequently, a false vmalloc-out-of-bounds warning is triggered
when a user attempts to access this memory region.
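
For illustration only, the interleaving above can be replayed in a small
single-threaded userspace model. This is a sketch, not kernel code: struct
model_pte, the model_* helpers and MODEL_SHADOW_INVALID are hypothetical names
made up for this example, the 0xF0 value is simply the one quoted in this
message, and the "cpus" are replayed as ordinary sequential steps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the "not initialized" shadow value quoted in the message. */
#define MODEL_SHADOW_INVALID 0xF0

/* Hypothetical stand-in for one shadow pte and the shadow byte behind it. */
struct model_pte {
	bool present;
	uint8_t shadow;
};

/* Models the unlocked pte_none() check at the start of populate. */
static bool model_pte_none(const struct model_pte *pte)
{
	return !pte->present;
}

/* Models cpu2: pte_clear() in the depopulate path. */
static void model_depopulate(struct model_pte *pte)
{
	pte->present = false;
}

/* Models the tail of populate: install a page filled with the invalid value. */
static void model_install(struct model_pte *pte)
{
	pte->present = true;
	pte->shadow = MODEL_SHADOW_INVALID;
}

/* Models cpu0 later setting the shadow region to 0. */
static void model_write_shadow(struct model_pte *pte)
{
	pte->shadow = 0;
}

/* Replays the racy order from the diagram; returns the final shadow byte. */
uint8_t model_replay_race(void)
{
	struct model_pte pte = { .present = true, .shadow = MODEL_SHADOW_INVALID };

	bool cpu0_none = model_pte_none(&pte);	/* false: cpu0 returns early */
	model_depopulate(&pte);			/* cpu2 clears the pte */
	bool cpu1_none = model_pte_none(&pte);	/* true: cpu1 will allocate */

	if (!cpu0_none)
		model_write_shadow(&pte);	/* cpu0 sets the shadow to 0 */
	if (cpu1_none)
		model_install(&pte);		/* cpu1 overwrites it */

	return pte.shadow;
}

/* Replays an order in which each check sees the depopulate result. */
uint8_t model_replay_serialized(void)
{
	struct model_pte pte = { .present = true, .shadow = MODEL_SHADOW_INVALID };

	model_depopulate(&pte);			/* cpu2 finishes first */
	if (model_pte_none(&pte))
		model_install(&pte);		/* cpu0 allocates and installs */
	if (model_pte_none(&pte))
		model_install(&pte);		/* cpu1 sees a valid pte, skipped */
	model_write_shadow(&pte);		/* cpu0 sets the shadow to 0 */

	return pte.shadow;
}
```

In this model, model_replay_race() ends with the invalid value in the shadow
byte even though cpu0 wrote 0, while model_replay_serialized() ends with 0.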

The root cause of this problem is that the pte valid check at the start of
kasan_populate_vmalloc_pte is not protected by page_table_lock. Removing this
check would avoid the race. However, this may result in severe performance
degradation since pages will be frequently allocated and freed.

Do you have any thoughts on how to solve this issue?

Thank you.
