[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080507223646.GA10757@linux-os.sc.intel.com>
Date: Wed, 7 May 2008 15:36:46 -0700
From: Venki Pallipadi <venkatesh.pallipadi@...el.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Venki Pallipadi <venkatesh.pallipadi@...el.com>,
Frans Pop <elendil@...net.nl>,
Jesse Barnes <jesse.barnes@...el.com>,
linux-kernel@...r.kernel.org,
"Packard, Keith" <keith.packard@...el.com>,
Yinghai Lu <yhlu.kernel@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Hugh Dickins <hugh@...itas.com>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>, suresh.b.siddha@...el.com
Subject: Re: [git head] X86_PAT & mprotect
On Wed, May 07, 2008 at 09:02:17AM +0200, Ingo Molnar wrote:
>
> * Venki Pallipadi <venkatesh.pallipadi@...el.com> wrote:
>
> > There is a hole in mprotect, which lets the user to change the page
> > cache type bits by-passing the kernel reserve_memtype and free_memtype
> > wrappers. Fix the hole by not letting mprotect change the PAT bits.
> >
> > Some versions of X used the mprotect hole to change caching type from
> > UC to WB, so that it can then use mtrr to program WC for that region
> > [1]. Change the mmap of pci space through /sys or /proc interfaces
> > from UC to UC_MINUS. With this change, X will not need to use mprotect
> > hole to get WC type.
> >
> > [1] lkml.org/lkml/2008/4/16/369
> >
> > Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@...el.com>
> > Signed-off-by: Suresh Siddha <suresh.b.siddha@...el.com>
> >
> > ---
> > arch/x86/pci/i386.c | 4 +---
> > include/asm-x86/pgtable.h | 5 ++++-
> > include/linux/mm.h | 1 +
> > mm/mmap.c | 13 +++++++++++++
> > mm/mprotect.c | 4 +++-
> > 5 files changed, 22 insertions(+), 5 deletions(-)
>
> hm, that's one dangerous looking patch. (Cc:-ed more MM folks. I've
> attached the patch below for reference.)
>
> the purpose of the fix itself seems to make some sense - we dont want
> mprotect() change the PAT bits in the pte from what they got populated
> with at fault or mmap time.
>
> the pte_modify() change looks correct at first sight.
>
> The _PAGE_PROT_PRESERVE_BITS solution looks a bit ugly (although we do
> have a couple of other similar #ifndefs in the MM already).
>
> at minimum we should add vm_get_page_prot_preserve() as an inline
> function to mm.h if _PAGE_PROT_PRESERVE_BITS is not defined, and make it
> call vm_get_page_prot(). Also, vm_get_page_prot() in mm/mmap.c should
> probably be marked inline so that we'll have only a single function call
> [vm_get_page_prot() is trivial].
I did check that with _PAGE_PROT_PRESERVE_BITS defined to zero,
compiler optimizes vm_get_page_prot_preserve to generate same code as
vm_get_page_prot with no function call (on x86 64). So, things should be OK
from overhead perspective. But, from the code cleanliness aspect,
PRESERVE_BITS looks unclean and needs some cleanup. The reason I posted the
patch as is, was to get the confirmation on the original thread, whether this
indeed fixes those PAT related error messages.
> but i'm wondering why similar issues never came up on other
> architectures - i thought it would be rather common to have immutable
> pte details. So maybe i'm missing something here ...
Probably other architectures does not depend on preserving things in
vma->vm_page_prot once ptes are set correctly. With PAT, we use
vm_page_prot to keep track of PAT attributes for vmas from parent to
child across fork.
pte_modify() part will be required in all archs that wants to preserve
some bits in pte in a mprotect call.
Thanks,
Venki
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists