Message-ID: <20200821095129.GF3354@suse.de>
Date: Fri, 21 Aug 2020 11:51:29 +0200
From: Joerg Roedel <jroedel@...e.de>
To: Chris Wilson <chris@...is-wilson.co.uk>
Cc: linux-kernel@...r.kernel.org, intel-gfx@...ts.freedesktop.org,
linux-mm@...ck.org, Pavel Machek <pavel@....cz>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Airlie <airlied@...hat.com>,
Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
Rodrigo Vivi <rodrigo.vivi@...el.com>,
David Vrabel <david.vrabel@...rix.com>, stable@...r.kernel.org
Subject: Re: [PATCH 1/4] mm: Export flush_vm_area() to sync the PTEs upon construction

On Fri, Aug 21, 2020 at 09:50:08AM +0100, Chris Wilson wrote:
> The alloc_vm_area() is another method for drivers to
> vmap/map_kernel_range that uses apply_to_page_range() rather than the
> direct vmalloc walkers. This is missing the page table modification
> tracking, and the ability to synchronize the PTE updates afterwards.
> Provide flush_vm_area() for the users of alloc_vm_area() that assumes
> the worst and ensures that the page directories are correctly flushed
> upon construction.
>
> The impact is most pronounced on x86_32 due to the delayed set_pmd().
>
> Reported-by: Pavel Machek <pavel@....cz>
> References: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
> References: 86cf69f1d893 ("x86/mm/32: implement arch_sync_kernel_mappings()")
> Signed-off-by: Chris Wilson <chris@...is-wilson.co.uk>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Joerg Roedel <jroedel@...e.de>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Dave Airlie <airlied@...hat.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>
> Cc: Rodrigo Vivi <rodrigo.vivi@...el.com>
> Cc: Pavel Machek <pavel@....cz>
> Cc: David Vrabel <david.vrabel@...rix.com>
> Cc: <stable@...r.kernel.org> # v5.8+
> ---
> include/linux/vmalloc.h | 1 +
> mm/vmalloc.c | 16 ++++++++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
> index 0221f852a7e1..a253b27df0ac 100644
> --- a/include/linux/vmalloc.h
> +++ b/include/linux/vmalloc.h
> @@ -204,6 +204,7 @@ static inline void set_vm_flush_reset_perms(void *addr)
>
> /* Allocate/destroy a 'vmalloc' VM area. */
> extern struct vm_struct *alloc_vm_area(size_t size, pte_t **ptes);
> +extern void flush_vm_area(struct vm_struct *area);
> extern void free_vm_area(struct vm_struct *area);
>
> /* for /dev/kmem */
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index b482d240f9a2..c41934486031 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3078,6 +3078,22 @@ struct vm_struct *alloc_vm_area(size_t size, pte_t **ptes)
> }
> EXPORT_SYMBOL_GPL(alloc_vm_area);
>
> +void flush_vm_area(struct vm_struct *area)
> +{
> +	unsigned long addr = (unsigned long)area->addr;
> +
> +	/* apply_to_page_range() doesn't track the damage, assume the worst */
> +	if (ARCH_PAGE_TABLE_SYNC_MASK & (PGTBL_PTE_MODIFIED |
> +					 PGTBL_PMD_MODIFIED |
> +					 PGTBL_PUD_MODIFIED |
> +					 PGTBL_P4D_MODIFIED |
> +					 PGTBL_PGD_MODIFIED))
> +		arch_sync_kernel_mappings(addr, addr + area->size);

This should happen in __apply_to_page_range() directly and look like
this:

	if (ARCH_PAGE_TABLE_SYNC_MASK && create)
		arch_sync_kernel_mappings(addr, addr + size);

Or even better, track whether something had to be allocated in the
__apply_to_page_range() path and check for:

	if (ARCH_PAGE_TABLE_SYNC_MASK & mask)
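
To illustrate the mask-tracking idea, here is a minimal userspace sketch,
not kernel code. Every sk_* name is hypothetical and only stands in for
the kernel's PGTBL_*_MODIFIED bits, ARCH_PAGE_TABLE_SYNC_MASK and
arch_sync_kernel_mappings(). It assumes a toy two-level table in which
only a newly populated upper-level entry requires a sync. The walker
records which levels it touched and calls the sync hook only when that
mask intersects the architecture's mask:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PGTBL_*_MODIFIED bits. */
enum {
	SK_PTE_MODIFIED = 1 << 0,
	SK_PMD_MODIFIED = 1 << 1,
};

/* Assumption: this toy "architecture" only needs a sync for PMD changes. */
#define SK_ARCH_SYNC_MASK	SK_PMD_MODIFIED

#define SK_PTRS_PER_PMD		4
#define SK_PTRS_PER_PTE		4

/* Toy upper-level entry: becomes present once its PTE page is "allocated". */
struct sk_pmd {
	bool present;
	int pte[SK_PTRS_PER_PTE];
};

static struct sk_pmd sk_table[SK_PTRS_PER_PMD];
static unsigned int sk_sync_calls;

/* Stand-in for arch_sync_kernel_mappings(); it only counts invocations. */
static void sk_arch_sync(size_t start, size_t end)
{
	(void)start;
	(void)end;
	sk_sync_calls++;
}

/*
 * Write val into the entries for page indices [start, end) and return the
 * mask of levels that were modified. The sync hook runs only if a level
 * named in SK_ARCH_SYNC_MASK was touched.
 */
static unsigned int sk_apply_range(size_t start, size_t end, int val)
{
	unsigned int mask = 0;

	assert(start <= end && end <= SK_PTRS_PER_PMD * SK_PTRS_PER_PTE);

	for (size_t i = start; i < end; i++) {
		struct sk_pmd *pmd = &sk_table[i / SK_PTRS_PER_PTE];

		if (!pmd->present) {
			/* Upper level had to be populated: record it. */
			pmd->present = true;
			mask |= SK_PMD_MODIFIED;
		}
		pmd->pte[i % SK_PTRS_PER_PTE] = val;
		mask |= SK_PTE_MODIFIED;
	}

	if (SK_ARCH_SYNC_MASK & mask)
		sk_arch_sync(start, end);

	return mask;
}
```

In this sketch the first call that populates an upper-level entry
triggers the sync, while later calls that only rewrite entries under an
already-present upper level skip it.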