lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20081231015452.GC32239@wotan.suse.de>
Date:	Wed, 31 Dec 2008 02:54:52 +0100
From:	Nick Piggin <npiggin@...e.de>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, tglx@...utronix.de,
	ijc@...lion.org.uk
Subject: Re: early fixmap causes kmap breakage

On Tue, Dec 30, 2008 at 02:41:53PM -0800, Eric W. Biederman wrote:
> Nick Piggin <npiggin@...e.de> writes:
> 
> > On Tue, Dec 30, 2008 at 07:13:44AM +0100, Ingo Molnar wrote:
> >> 
> >> * Nick Piggin <npiggin@...e.de> wrote:
> >> 
> >> > On Mon, Dec 29, 2008 at 03:17:31PM -0800, Andrew Morton wrote:
> >> > > On Thu, 18 Dec 2008 22:15:43 +0100
> >> > > Nick Piggin <npiggin@...e.de> wrote:
> >> > > 
> >> > > > Hi,
> >> > > > 
> >> > > > I've debugged a problem where i386+pae systems with more than a few CPUs
> >> > > > blow up at boot in the kmap_atomic code.
> >> > > 
> >> > > ping?
> >> > 
> >> > No further progress here, I'm waiting on input for how to fix this 
> >> > "nicely". Meantime, clearing the early fixmap pte I guess works, but you 
> >> > lose a page... is it possible to put it into .initdata or is there some 
> >> > issue with that? (I guess on a PAE kernel, 4K isn't a big deal).
> >> 
> >> yeah, 4K shouldnt be a big deal. Mind sending a patch for this?
> >
> > How's this?
> > --
> >
> > The early fixmap pmd entry inserted at the very top of the KVA is casing the
> > subsequent fixmap mapping code to not provide physically linear pte pages over
> > the kmap atomic portion of the fixmap (which relies on said property to
> > calculate
> > pte address).
> >
> > This has caused weird boot failures in kmap_atomic much later in the boot
> > process (initial userspace faults) on a 32-bit PAE system with a larger number
> > of CPUs (smaller CPU counts tend not to run over into the next page so don't
> > show up the problem).
> atomic>
> > Solve this by attempting to clear out the page table, and copy any of its
> > entries to the new one. Also, add a bug if a nonlinear condition is encounted
> > and can't be resolved, which might save some hours of debugging if this fragile
> > scheme ever breaks again...
> >
> > Putting swapper_pg_fixmap into initdata is an exercise left for the reviewer...
> 
> Ok.  I see what is going on now.  We have exceeded 512 fixmap entries, causing
> the fixmap entries to consume more than 2MB of the address space.   Which broke
> the assumption that the fixmap entries are all contiguous.

Yes. That wasn't obvious from my problem description?

 
> Ditching the swapper_pg_fixmap has some problems.
> 
> This appears to break early_printk to a usb debug port, which calls
> set_fixmap_nocache and expects the mapping to last.
> 
> This looks like it will have problems with Xen and other environments
> where we come in with a pre-populated page table, possibly unmapping
> something important.

My patch copies the early fixmap mappings to the new page table. Isn't
this enough?
 
> one_page_table_init relies on alloc_bootmem_low_pages for it's memory allocation
> so we do not have a guarantee that we will have contiguous memory even without
> this.

It's OK, if fragile, in early boot where it uses alloc_low_page.

 
> I see three ways we can address this.
> - Grow swapper_pg_fixmap to cover the entire fixmap range.
>   This trivially and without problems gives an atomic guarantee,
>   and should allow removal of code that sets up the fixmaps later
>   in C, except in weird cases like Xen.

Would be fine by me, although I want to get a minimal patch working
in the meantime if that is going to be complex.

 
> - Decide it is worth optimizing kmap_atomic_prot some more.
>   Have a kmap_pte per cpu.
>   Cache line align the kmap pte entries so we don't get conflicts
>   per cpu, at which point we should be guaranteed the all 13 of
>   them will be physically contiguous.

Go wild if you'd like to spend time optimising x86 PAE ;)

 
> - Not support more than 32 cpus on x86_32.
 
This is not an option really. Anyway, it's not more than 32 CPUs, but
can be a problem with as few as about 8 depending on config.


> I suspect it might even be worth writing a version of one_page_table_init
> that would guarantee discontiguous pages.  So we can flush out these
> kinds of fragile assumptions.  

So did you actually see anything wrong with my last patch which catches
discontiguous page tables and copies over the early fixmap translations?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ