Message-ID: <a24105be-b07f-41e5-4c3e-0b8a911ce231@intel.com>
Date: Wed, 25 Aug 2021 07:59:49 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Andy Lutomirski <luto@...nel.org>, Mike Rapoport <rppt@...nel.org>,
linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Ira Weiny <ira.weiny@...el.com>,
Kees Cook <keescook@...omium.org>,
Mike Rapoport <rppt@...ux.ibm.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Rick P Edgecombe <rick.p.edgecombe@...el.com>,
Vlastimil Babka <vbabka@...e.cz>,
the arch/x86 maintainers <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 4/4] x86/mm: write protect (most) page tables
On 8/23/21 8:34 PM, Andy Lutomirski wrote:
>> I would have expected this to have leveraged the pte_offset_map/unmap() code
>> to enable/disable write access. Granted, it would enable write access
>> even when only a read is needed, but that could be trivially fixed by
>> having a variant like:
>>
>> pte_offset_map_write()
>> pte_offset_unmap_write()
> I would also like to see a discussion of how races are handled when
> multiple threads or CPUs access ptes in the same page at the same time.
Yeah, the 32-bit highmem code has a per-cpu mapping area for
kmap_atomic() that lies underneath the pte_offset_map(). Although it's
in the shared kernel mapping, only one CPU uses it.
I didn't look at it until you mentioned it, but the code in this set is
just plain buggy if two CPUs do concurrent
enable_pgtable_write()/disable_pgtable_write() pairs. They'll clobber each other:
> +static void pgtable_write_set(void *pg_table, bool set)
> +{
> + int level = 0;
> + pte_t *pte;
> +
> + /*
> + * Skip the page tables allocated from pgt_buf break area and from
> + * memblock
> + */
> + if (!after_bootmem)
> + return;
> + if (!PageTable(virt_to_page(pg_table)))
> + return;
> +
> + pte = lookup_address((unsigned long)pg_table, &level);
> + if (!pte || level != PG_LEVEL_4K)
> + return;
> +
> + if (set) {
> + if (pte_write(*pte))
> + return;
> +
> + WRITE_ONCE(*pte, pte_mkwrite(*pte));
> + } else {
> + if (!pte_write(*pte))
> + return;
> +
> + WRITE_ONCE(*pte, pte_wrprotect(*pte));
> + }
> +}