Message-ID: <20081230102831.GB3189@wotan.suse.de>
Date:	Tue, 30 Dec 2008 11:28:31 +0100
From:	Nick Piggin <npiggin@...e.de>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, ebiederm@...ssion.com,
	linux-kernel@...r.kernel.org, tglx@...utronix.de,
	ijc@...lion.org.uk
Subject: Re: early fixmap causes kmap breakage

On Tue, Dec 30, 2008 at 07:13:44AM +0100, Ingo Molnar wrote:
> 
> * Nick Piggin <npiggin@...e.de> wrote:
> 
> > On Mon, Dec 29, 2008 at 03:17:31PM -0800, Andrew Morton wrote:
> > > On Thu, 18 Dec 2008 22:15:43 +0100
> > > Nick Piggin <npiggin@...e.de> wrote:
> > > 
> > > > Hi,
> > > > 
> > > > I've debugged a problem where i386+pae systems with more than a few CPUs
> > > > blow up at boot in the kmap_atomic code.
> > > 
> > > ping?
> > 
> > No further progress here, I'm waiting on input for how to fix this 
> > "nicely". Meantime, clearing the early fixmap pte I guess works, but you 
> > lose a page... is it possible to put it into .initdata or is there some 
> > issue with that? (I guess on a PAE kernel, 4K isn't a big deal).
> 
> yeah, 4K shouldnt be a big deal. Mind sending a patch for this?

How's this?
--

The early fixmap pmd entry inserted at the very top of the KVA is causing the
subsequent fixmap mapping code to not provide physically linear pte pages over
the kmap atomic portion of the fixmap (which relies on said property to
calculate pte addresses).

This has caused weird boot failures in kmap_atomic much later in the boot
process (initial userspace faults) on a 32-bit PAE system with a larger number
of CPUs (smaller CPU counts tend not to run over into the next page so don't
show up the problem).
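
To make the failure mode concrete, here is a minimal userspace sketch (not
kernel code). It models pte pages as offsets into one flat array and compares
a normal page table walk against the "first kmap pte minus slot index"
shortcut. The names walk_offset, shortcut_offset and shortcut_is_valid are
hypothetical, and PTRS_PER_PTE is assumed to be 512 as on PAE.

```c
#include <assert.h>
#include <stddef.h>

#define PTRS_PER_PTE 512	/* assumed: PAE, 512 entries per 4K pte page */

/*
 * Offset (in pte units) of the pte that a page table walk finds for
 * virtual slot vslot. pmd[k] is the offset of the k-th pte page.
 */
static size_t walk_offset(const size_t *pmd, size_t vslot)
{
	return pmd[vslot / PTRS_PER_PTE] + vslot % PTRS_PER_PTE;
}

/*
 * Offset obtained by the shortcut: start at the pte of the first kmap
 * slot and step back idx entries, assuming the pte pages are linear.
 */
static size_t shortcut_offset(const size_t *pmd, size_t first_vslot, size_t idx)
{
	assert(idx <= first_vslot);
	return walk_offset(pmd, first_vslot) - idx;
}

/* Nonzero when the shortcut and the walk agree on the pte. */
static int shortcut_is_valid(const size_t *pmd, size_t first_vslot, size_t idx)
{
	return shortcut_offset(pmd, first_vslot, idx) ==
	       walk_offset(pmd, first_vslot - idx);
}
```

With pte pages at offsets {0, 512} (linear), the shortcut agrees with the walk
for every slot. With {0, 4096} (nonlinear), it still agrees while the slot
stays inside the top pte page and only goes wrong once the index crosses into
the page below, which matches the problem showing up only at larger CPU
counts.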

Solve this by attempting to clear out the offending page table and copying its
entries to the new one. Also, add a BUG_ON if a nonlinear condition is
encountered and can't be resolved, which might save some hours of debugging if
this fragile scheme ever breaks again...
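
The check-and-copy step can be sketched in userspace roughly as follows. This
is a simplified model under stated assumptions, not the kernel code: struct
pte_pool, pool_alloc_page and relocate_if_nonlinear are hypothetical names, a
bump allocator stands in for early pte page allocation, and a NULL return
stands in for the point where the kernel would BUG.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PTRS_PER_PTE 512	/* assumed: PAE, 512 entries per 4K pte page */

typedef uint64_t pte_t;

/* Hypothetical bump allocator handing out consecutive pte pages. */
struct pte_pool {
	pte_t *base;
	size_t used_pages;
	size_t total_pages;
};

static pte_t *pool_alloc_page(struct pte_pool *p)
{
	pte_t *page;

	if (p->used_pages == p->total_pages)
		return NULL;
	page = p->base + p->used_pages * PTRS_PER_PTE;
	p->used_pages++;
	memset(page, 0, PTRS_PER_PTE * sizeof(pte_t));
	return page;
}

/*
 * Return the pte page to use after lastpte. If pte already follows
 * lastpte directly it is kept; otherwise its entries are copied into a
 * freshly allocated page. NULL means the layout is still nonlinear.
 */
static pte_t *relocate_if_nonlinear(struct pte_pool *p, pte_t *lastpte,
				    pte_t *pte)
{
	pte_t *newpte;

	if (!lastpte || lastpte + PTRS_PER_PTE == pte)
		return pte;

	newpte = pool_alloc_page(p);
	if (!newpte || lastpte + PTRS_PER_PTE != newpte)
		return NULL;

	memcpy(newpte, pte, PTRS_PER_PTE * sizeof(pte_t));
	return newpte;
}
```

The caller is expected to remember the returned page as the new lastpte, so
that the next iteration checks linearity against the relocated page rather
than the stray one.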

Putting swapper_pg_fixmap into initdata is an exercise left for the reviewer...

Signed-off-by: Nick Piggin <npiggin@...e.de>

---
 arch/x86/mm/init_32.c |   26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/mm/init_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_32.c
+++ linux-2.6/arch/x86/mm/init_32.c
@@ -155,6 +155,7 @@ page_table_range_init(unsigned long star
 	unsigned long vaddr;
 	pgd_t *pgd;
 	pmd_t *pmd;
+	pte_t *lastpte = NULL;
 
 	vaddr = start;
 	pgd_idx = pgd_index(vaddr);
@@ -166,7 +167,30 @@ page_table_range_init(unsigned long star
 		pmd = pmd + pmd_index(vaddr);
 		for (; (pmd_idx < PTRS_PER_PMD) && (vaddr != end);
 							pmd++, pmd_idx++) {
-			one_page_table_init(pmd);
+			pte_t *pte;
+
+			pte = one_page_table_init(pmd);
+			/*
+			 * Something (early fixmap) has already put a pte page
+			 * here, which causes the page table allocation to
+			 * become nonlinear. Attempt to fix it, and if it is
+			 * still nonlinear then we have to bug.
+			 */
+			if (lastpte && lastpte + PTRS_PER_PTE != pte) {
+				pte_t *newpte;
+				int i;
+
+				pmd_clear(pmd);
+				__flush_tlb_all();
+
+				newpte = one_page_table_init(pmd);
+				BUG_ON(lastpte + PTRS_PER_PTE != newpte);
+				for (i = 0; i < PTRS_PER_PTE; i++) {
+					set_pte(newpte + i, *(pte + i));
+				}
+				pte = newpte;
+			}
+			lastpte = pte;
 
 			vaddr += PMD_SIZE;
 		}