[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <m1tz8m41yj.fsf@frodo.ebiederm.org>
Date: Mon, 29 Dec 2008 22:22:12 -0800
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Nick Piggin <npiggin@...e.de>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...e.hu,
ijc@...lion.org.uk
Subject: Re: early fixmap causes kmap breakage
Nick Piggin <npiggin@...e.de> writes:
> On Mon, Dec 29, 2008 at 03:17:31PM -0800, Andrew Morton wrote:
>> On Thu, 18 Dec 2008 22:15:43 +0100
>> Nick Piggin <npiggin@...e.de> wrote:
>>
>> > Hi,
>> >
>> > I've debugged a problem where i386+pae systems with more than a few CPUs
>> > blow up at boot in the kmap_atomic code.
>>
>> ping?
>
> No further progress here, I'm waiting on input for how to fix this
> "nicely". Meantime, clearing the early fixmap pte I guess works, but
> you lose a page... is it possible to put it into .initdata or is
> there some issue with that? (I guess on a PAE kernel, 4K isn't a
> big deal).
>
>
>> > The problem is that the kmap_atomic pte pages all need to be contiguous
>> > memory because the pte is calculated via the first kmap pte page + an
>> > offset (so as not to have to walk the page tables every time).
>> >
>> > The fixmap setup code crudely allocates contiguous pte pages, which is fine,
>> > but if it finds an already populated pmd entry, then it will not switch it
>> > to a new, contiguous pte page. So the early fixmap introduces a discontig
>> > page table right in the middle of the kmap atomic fixmaps.
Where is this?
>> > Commenting out the eaarly fixmap setup in head_32.S gets everything working
>> > properly. What would be the best way to fix this? Could we put the early
>> > fixmap page table in initdata, and then have the fixmap setup proper first
>> > clear its corresponding pmd entry?
Why would we want or need to?
>> How come users/testers aren't reporting this?
>
> Because apparently nobody tests 32-bit PAE systems with more than a couple
> of CPUs anymore. This bug comes from HW vendor doing testing of SLES11.
Hmm.
I have taken a quick skim and I am not seeming the part of the code you are
talking about. Is the problem code in mainline?
I'm guessing it has something to do with reserve_top_address() being called
with a bad value in the normal case. But I don't see it being called
at all in the normal case.
Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists