[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20151112084616.EABFE19B@black.fi.intel.com>
Date: Thu, 12 Nov 2015 10:46:16 +0200 (EET)
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
To: Ingo Molnar <mingo@...nel.org>
Cc: "Kirill A. Shutemov" <kirill@...temov.name>,
Borislav Petkov <bp@...en8.de>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
hpa@...or.com, tglx@...utronix.de, mingo@...hat.com,
akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, x86@...nel.org, jgross@...e.com,
konrad.wilk@...cle.com, elliott@....com,
boris.ostrovsky@...cle.com, Toshi Kani <toshi.kani@....com>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [PATCH] x86/mm: fix regression with huge pages on PAE
Ingo Molnar wrote:
>
> * Kirill A. Shutemov <kirill@...temov.name> wrote:
>
> > On Thu, Nov 12, 2015 at 08:48:54AM +0100, Ingo Molnar wrote:
> > >
> > > * Borislav Petkov <bp@...en8.de> wrote:
> > >
> > > > --- a/arch/x86/include/asm/pgtable_types.h
> > > > +++ b/arch/x86/include/asm/pgtable_types.h
> > > > @@ -279,17 +279,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
> > > > static inline pudval_t pud_pfn_mask(pud_t pud)
> > > > {
> > > > if (native_pud_val(pud) & _PAGE_PSE)
> > > > - return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > > > + return ~((1ULL << PUD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > > > else
> > > > return PTE_PFN_MASK;
> > > > }
> > >
> > > > static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
> > > > {
> > > > if (native_pmd_val(pmd) & _PAGE_PSE)
> > > > - return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
> > > > + return ~((1ULL << PMD_SHIFT) - 1) & PHYSICAL_PAGE_MASK;
> > > > else
> > > > return PTE_PFN_MASK;
> > > > }
> > >
> > > So instead of uglifying the code, why not fix the real bug: change the
> > > PMD_PAGE_MASK/PUD_PAGE_MASK definitions to be 64-bit everywhere?
> >
> > *PAGE_MASK are usually applied to virtual addresses. I don't think it
> > should anything but 'unsigned long'. This is odd use case really.
>
> So we already have PHYSICAL_PAGE_MASK, why not introduce PHYSICAL_PMD_MASK et al,
> instead of uglifying the code?
Okay, makes sense. Check out the patch below.
> But, what problems do you expect with having a wider mask than its primary usage?
> If it's used for 32-bit values it will be truncated down safely. (But I have not
> tested it, so I might be missing some complication.)
Yeah, I basically worry about non realized side effect.
And these masks are defined via {PMD,PUD}_PAGE_SIZE. Should we change them
to 'unsigned long long' too or leave alone? What about PAGE_SIZE and
PAGE_MASK? Need to be converted too?
>From 96362f38d47d6ef9a0ebb9a92c6b4db9337f1565 Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Date: Thu, 12 Nov 2015 10:39:00 +0200
Subject: [PATCH] x86/mm: fix regression with huge pages on PAE
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Recent PAT patchset has caused issue on 32-bit PAE machines:
page:eea45000 count:0 mapcount:-128 mapping: (null) index:0x0
flags: 0x40000000()
page dumped because: VM_BUG_ON_PAGE(page_mapcount(page) < 0)
------------[ cut here ]------------
kernel BUG at /home/build/linux-boris/mm/huge_memory.c:1485!
invalid opcode: 0000 [#1] SMP
Modules linked in: ahci libahci ata_generic skge r8169 firewire_ohci mii libata qla2xxx(+) scsi_transport_fc scsi_mod radeon tpm_infineon ttm backlight wmi acpi_cpufreq tpm_tis
CPU: 2 PID: 1758 Comm: modprobe Not tainted 4.3.0upstream-09269-gce5c2d2 #1
Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS 080014 07/18/2008
task: ed84e600 ti: f6458000 task.ti: f6458000
EIP: 0060:[<c11bde80>] EFLAGS: 00010246 CPU: 2
EIP is at zap_huge_pmd+0x240/0x260
EAX: 00000000 EBX: f6459eb0 ECX: 00000292 EDX: 00000292
ESI: f6634d98 EDI: eea45000 EBP: f6459dc8 ESP: f6459d98
ata1: SATA link down (SStatus 0 SControl 300)
ata2: SATA link down (SStatus 0 SControl 300)
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: b75b21a0 CR3: 3655b880 CR4: 000006f0
Stack:
...
Call Trace:
unmap_single_vma
? __wake_up
unmap_vmas
unmap_region
do_munmap
vm_munmap
SyS_munmap
do_fast_syscall_32
? __do_page_fault
sysenter_past_esp
Code: 00 e9 05 fe ff ff 90 8d 74 26 00 0f 0b eb fe ba 4c e1 7a c1 89 f8 e8 f0 91 fd ff 0f 0b eb fe ba 6c e1 7a c1 89 f8 e8 e0 91 fd ff <0f> 0b eb fe ba c4 e1 7a c1 89 f8 e8 d0 91 fd ff 0f 0b eb fe 8d
EIP: [<c11bde80>] zap_huge_pmd+0x240/0x260 SS:ESP 0068:f6459d98
---[ end trace cba8fb1fc2e2e78a ]---
The problem is in pmd_pfn_mask() and pmd_flags_mask(). These helpers use
PMD_PAGE_MASK to calculate resulting mask. PMD_PAGE_MASK is 'unsigned
long', not 'unsigned long long' as phys_addr_t. As result upper bits of
resulting mask is truncated.
pud_pfn_mask() and pud_flags_mask() aren't problematic since we don't
have PUD page table level on 32-bit systems, but it's reasonable to keep
them consitent with PMD counterpart.
The patch introduces PHYSICAL_PMD_PAGE_MASK and PHYSICAL_PUD_PAGE_MASK
in addition to existing PHYSICAL_PAGE_MASK and rework helpers to use
them.
Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@...cle.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Reviewed-by: Toshi Kani <toshi.kani@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: elliott@....com
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Jürgen Gross <jgross@...e.com>
Cc: konrad.wilk@...cle.com
Cc: linux-mm <linux-mm@...ck.org>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Thomas Gleixner <tglx@...utronix.de>
Fixes: f70abb0fc3da ("x86/asm: Fix pud/pmd interfaces to handle large PAT bit")
Link: http://lkml.kernel.org/r/1447111090-8526-1-git-send-email-kirill.shutemov@linux.intel.com
[ Fix -Woverflow warnings from the realmode code. ]
Signed-off-by: Borislav Petkov <bp@...e.de>
---
arch/x86/boot/boot.h | 1 -
arch/x86/boot/video-mode.c | 2 ++
arch/x86/boot/video.c | 2 ++
arch/x86/include/asm/page_types.h | 16 +++++++++-------
arch/x86/include/asm/pgtable_types.h | 14 ++++----------
arch/x86/include/asm/x86_init.h | 1 -
6 files changed, 17 insertions(+), 19 deletions(-)
diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 0033e96c3f09..9011a88353de 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -23,7 +23,6 @@
#include <stdarg.h>
#include <linux/types.h>
#include <linux/edd.h>
-#include <asm/boot.h>
#include <asm/setup.h>
#include "bitops.h"
#include "ctype.h"
diff --git a/arch/x86/boot/video-mode.c b/arch/x86/boot/video-mode.c
index aa8a96b052e3..95c7a818c0ed 100644
--- a/arch/x86/boot/video-mode.c
+++ b/arch/x86/boot/video-mode.c
@@ -19,6 +19,8 @@
#include "video.h"
#include "vesa.h"
+#include <uapi/asm/boot.h>
+
/*
* Common variables
*/
diff --git a/arch/x86/boot/video.c b/arch/x86/boot/video.c
index 05111bb8d018..77780e386e9b 100644
--- a/arch/x86/boot/video.c
+++ b/arch/x86/boot/video.c
@@ -13,6 +13,8 @@
* Select video mode
*/
+#include <uapi/asm/boot.h>
+
#include "boot.h"
#include "video.h"
#include "vesa.h"
diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index c5b7fb2774d0..cc071c6f7d4d 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -9,19 +9,21 @@
#define PAGE_SIZE (_AC(1,UL) << PAGE_SHIFT)
#define PAGE_MASK (~(PAGE_SIZE-1))
+#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT)
+#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1))
+
+#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT)
+#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1))
+
#define __PHYSICAL_MASK ((phys_addr_t)((1ULL << __PHYSICAL_MASK_SHIFT) - 1))
#define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
-/* Cast PAGE_MASK to a signed type so that it is sign-extended if
+/* Cast *PAGE_MASK to a signed type so that it is sign-extended if
virtual addresses are 32-bits but physical addresses are larger
(ie, 32-bit PAE). */
#define PHYSICAL_PAGE_MASK (((signed long)PAGE_MASK) & __PHYSICAL_MASK)
-
-#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT)
-#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1))
-
-#define PUD_PAGE_SIZE (_AC(1, UL) << PUD_SHIFT)
-#define PUD_PAGE_MASK (~(PUD_PAGE_SIZE-1))
+#define PHYSICAL_PMD_PAGE_MASK (((signed long)PMD_PAGE_MASK) & __PHYSICAL_MASK)
+#define PHYSICAL_PUD_PAGE_MASK (((signed long)PUD_PAGE_MASK) & __PHYSICAL_MASK)
#define HPAGE_SHIFT PMD_SHIFT
#define HPAGE_SIZE (_AC(1,UL) << HPAGE_SHIFT)
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 116fc4ee586f..d1b76f88ccd1 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -277,17 +277,14 @@ static inline pmdval_t native_pmd_val(pmd_t pmd)
static inline pudval_t pud_pfn_mask(pud_t pud)
{
if (native_pud_val(pud) & _PAGE_PSE)
- return PUD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+ return PHYSICAL_PUD_PAGE_MASK;
else
return PTE_PFN_MASK;
}
static inline pudval_t pud_flags_mask(pud_t pud)
{
- if (native_pud_val(pud) & _PAGE_PSE)
- return ~(PUD_PAGE_MASK & (pudval_t)PHYSICAL_PAGE_MASK);
- else
- return ~PTE_PFN_MASK;
+ return ~pud_pfn_mask(pud);
}
static inline pudval_t pud_flags(pud_t pud)
@@ -298,17 +295,14 @@ static inline pudval_t pud_flags(pud_t pud)
static inline pmdval_t pmd_pfn_mask(pmd_t pmd)
{
if (native_pmd_val(pmd) & _PAGE_PSE)
- return PMD_PAGE_MASK & PHYSICAL_PAGE_MASK;
+ return PHYSICAL_PMD_PAGE_MASK;
else
return PTE_PFN_MASK;
}
static inline pmdval_t pmd_flags_mask(pmd_t pmd)
{
- if (native_pmd_val(pmd) & _PAGE_PSE)
- return ~(PMD_PAGE_MASK & (pmdval_t)PHYSICAL_PAGE_MASK);
- else
- return ~PTE_PFN_MASK;
+ return ~pmd_pfn_mask(pmd);
}
static inline pmdval_t pmd_flags(pmd_t pmd)
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 48d34d28f5a6..cd0fc0cc78bc 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -1,7 +1,6 @@
#ifndef _ASM_X86_PLATFORM_H
#define _ASM_X86_PLATFORM_H
-#include <asm/pgtable_types.h>
#include <asm/bootparam.h>
struct mpc_bus;
--
2.6.2
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists