[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALCETrXJYuHN872F74kVTuw4dYOc5saKqoUFbgJ5X0EuGEhXcA@mail.gmail.com>
Date: Thu, 18 Jul 2019 12:04:49 -0700
From: Andy Lutomirski <luto@...nel.org>
To: Joerg Roedel <jroedel@...e.de>
Cc: Andy Lutomirski <luto@...nel.org>, Joerg Roedel <joro@...tes.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH 3/3] mm/vmalloc: Sync unmappings in vunmap_page_range()
On Thu, Jul 18, 2019 at 2:17 AM Joerg Roedel <jroedel@...e.de> wrote:
>
> Hi Andy,
>
> On Wed, Jul 17, 2019 at 02:24:09PM -0700, Andy Lutomirski wrote:
> > On Wed, Jul 17, 2019 at 12:14 AM Joerg Roedel <joro@...tes.org> wrote:
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 4fa8d84599b0..322b11a374fd 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -132,6 +132,8 @@ static void vunmap_page_range(unsigned long addr, unsigned long end)
> > > continue;
> > > vunmap_p4d_range(pgd, addr, next);
> > > } while (pgd++, addr = next, addr != end);
> > > +
> > > + vmalloc_sync_all();
> > > }
> >
> > I'm confused. Shouldn't the code in _vm_unmap_aliases handle this?
> > As it stands, won't your patch hurt performance on x86_64? If x86_32
> > is a special snowflake here, maybe flush_tlb_kernel_range() should
> > handle this?
>
> Imo this is the logical place to handle this. The code first unmaps the
> area from the init_mm page-table and then syncs that page-table to all
> other page-tables in the system, so one place to update the page-tables.
I find it problematic that there is no meaningful documentation as to
what vmalloc_sync_all() is supposed to do. The closest I can find is
this comment by following the x86_64 code, which calls
sync_global_pgds(), which says:
/*
* When memory was added make sure all the processes MM have
* suitable PGD entries in the local PGD level page.
*/
void sync_global_pgds(unsigned long start, unsigned long end)
{
Which is obviously entirely inapplicable. If I'm understanding
correctly, the underlying issue here is that the vmalloc fault
mechanism can propagate PGD entry *addition*, but nothing (not even
flush_tlb_kernel_range()) propagates PGD entry *removal*.
I find it suspicious that only x86 has this. How do other
architectures handle this?
At the very least, I think this series needs a comment in
vmalloc_sync_all() explaining exactly what the function promises to
do. But maybe a better fix is to add code to flush_tlb_kernel_range()
to sync the vmalloc area if the flushed range overlaps the vmalloc
area. Or, even better, improve x86_32 the way we did x86_64: adjust
the memory mapping code such that top-level paging entries are never
deleted in the first place.
Powered by blists - more mailing lists