Message-ID: <20150611154958.GA16799@gmail.com>
Date: Thu, 11 Jun 2015 17:49:58 +0200
From: Ingo Molnar <mingo@...nel.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
linux-mml@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Denys Vlasenko <dvlasenk@...hat.com>,
Brian Gerst <brgerst@...il.com>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>,
"H. Peter Anvin" <hpa@...or.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Waiman Long <Waiman.Long@...com>
Subject: Re: [PATCH 08/12] x86/mm: Remove pgd_list use from vmalloc_sync_all()
* Andy Lutomirski <luto@...capital.net> wrote:
> On Thu, Jun 11, 2015 at 7:07 AM, Ingo Molnar <mingo@...nel.org> wrote:
> > The vmalloc() code uses vmalloc_sync_all() to synchronize changes to
> > the global reference kernel PGD to task PGDs.
>
> Does it? AFAICS the only caller is register_die_notifier, and it's
> not really clear to me why that exists.
Doh, indeed, I got confused in that changelog: we are filling in the task PGDs
opportunistically, via vmalloc_fault().
> At some point I'd love to remove lazy kernel PGD sync from the kernel entirely
> (or at least from x86) and just do it when we switch mms. Now that you're
> removing all code that deletes kernel PGD entries, I think all we'd need to do
> is to add a per-PGD or per-mm count of the number of kernel entries populated
> and to fix it up when we switch to an mm with fewer entries populated than
> init_mm.
That would add a (cheap but nonzero) runtime check to every context switch. The
context switch is a relatively slow path, but in comparison vmalloc() is an even
slower slow path, so why not do the check there, with synchronous updates, and
remove the vmalloc faults altogether?
Also, on 64-bit it should not matter much: there the only change is the once in a
blue moon case where we populate a new pgd entry, each of which covers a 512 GB
block of address space.
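For reference, the 512 GB figure falls out of 4-level paging with 4 KB pages and
512 entries per table. A minimal sketch of the arithmetic, with made-up toy_*
names (these are not the kernel's macros), and an address that is only an
illustrative kernel-space value:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: x86-64 4-level paging, 4 KB pages, 512 entries per table. */
#define TOY_PAGE_SHIFT     12
#define TOY_BITS_PER_LEVEL 9
#define TOY_PGDIR_SHIFT    (TOY_PAGE_SHIFT + 3 * TOY_BITS_PER_LEVEL)

static_assert(TOY_PGDIR_SHIFT == 39, "one pgd entry maps 2^39 bytes");

/* Bytes of address space covered by a single top-level (pgd) entry. */
static uint64_t toy_pgd_entry_span(void)
{
	return (uint64_t)1 << TOY_PGDIR_SHIFT;
}

/* Which of the 512 pgd slots a virtual address falls into. */
static unsigned int toy_pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> TOY_PGDIR_SHIFT) & 511);
}
```

So a new pgd entry is only needed when an allocation crosses into a 512 GB
region that nothing has touched before.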
I'd hate to add a check to every context switch, no matter how cheap, just for a
case that essentially never triggers...
So how about this solution instead:
- we add a generation counter to sync_global_pgds() so that it can detect when
the number of pgds populated in init_mm changes.
- we change vmalloc() to call sync_global_pgds(): this will be very cheap in the
overwhelming majority of cases.
- we eliminate vmalloc_fault(), on 64-bit at least. Yay! :-)
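To make the three steps above concrete, here is a rough userspace sketch, not
kernel code. Everything prefixed toy_ is hypothetical, the table sizes are
arbitrary, and locking is ignored entirely (a real version would need to
serialize against pgd allocation and freeing):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PGD   8	/* toy number of kernel pgd slots */
#define TOY_NR_TASKS 3	/* toy number of task pgds */

struct toy_pgd { unsigned long entry[TOY_NR_PGD]; };

static struct toy_pgd toy_init_pgd;			/* stands in for init_mm's pgd */
static struct toy_pgd toy_task_pgd[TOY_NR_TASKS];	/* stands in for task pgds */
static unsigned long toy_pgd_gen;	/* bumped when init pgd gains an entry */
static unsigned long toy_synced_gen;	/* generation last pushed to the tasks */

/* Populate a reference entry; only a newly populated slot bumps the counter. */
static void toy_populate_init_pgd(size_t idx, unsigned long val)
{
	assert(idx < TOY_NR_PGD && val != 0);
	if (toy_init_pgd.entry[idx] == 0) {
		toy_init_pgd.entry[idx] = val;
		toy_pgd_gen++;
	}
}

/* Returns true if a full walk was needed, false on the cheap fast path. */
static bool toy_sync_global_pgds(void)
{
	if (toy_synced_gen == toy_pgd_gen)
		return false;	/* nothing new in the reference pgd */

	for (size_t t = 0; t < TOY_NR_TASKS; t++)
		for (size_t i = 0; i < TOY_NR_PGD; i++)
			if (toy_init_pgd.entry[i] && !toy_task_pgd[t].entry[i])
				toy_task_pgd[t].entry[i] = toy_init_pgd.entry[i];

	toy_synced_gen = toy_pgd_gen;
	return true;
}

/* The allocation path syncs synchronously, so no fault-time fixup is needed. */
static bool toy_vmalloc(size_t idx, unsigned long val)
{
	toy_populate_init_pgd(idx, val);
	return toy_sync_global_pgds();
}
```

In this model the common case costs one counter comparison per allocation, and
the walk over all task pgds only happens when a new top-level entry shows up.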
Thanks,
Ingo