Message-ID: <4DC9CD60.6010200@zytor.com>
Date: Tue, 10 May 2011 16:42:24 -0700
From: "H. Peter Anvin" <hpa@...or.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
CC: Linus Torvalds <torvalds@...ux-foundation.org>, mingo@...e.hu,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
hpa@...ux.intel.com,
Stefano Stabellini <stefano.stabellini@...citrix.com>,
Jeremy Fitzhardinge <jeremy@...p.org>
Subject: Re: Linux 2.6.39-rc7
On 05/10/2011 04:36 PM, Konrad Rzeszutek Wilk wrote:
>>
>> I was hoping that the rc6 could stretch out so that by the time hpa came back from
>> his travels he would have had a chance to look at: https://lkml.org/lkml/2011/5/5/226
>
> I had a chance to briefly talk on IRC with hpa and he mentioned I should
> send a note to Ingo about this since hpa won't be able to do anything until Friday.
>
> Ingo,
> Not sure how familiar you are with this issue, but let me briefly explain it.
> Yinghai provided a patch, which calls memblock_find_in_range(), then calls
> kernel_physical_mapping_init(), which populates the pagetable between pgt_buf_start
> and pgt_buf_top and, once it is done, calls memblock_x86_reserve_range() with pgt_buf_start
> and pgt_buf_end (where pgt_buf_end <= pgt_buf_top). The memory between pgt_buf_end
> and pgt_buf_top can be reused later on, and it is, by other subsystems: NUMA, for
> example, uses it.
>
> Under Xen, the pagetables end up being marked RO, so what ends up happening is that
> some pages from pgt_buf_end through pgt_buf_top end up RO and the system crashes during
> bootup as NUMA subsystem tries to write to that area. The fix is to essentially mark the
> area from pgt_buf_end through pgt_buf_top to RW.
>
> Stefano posted a patch, which was Acked by Yinghai, but not so by hpa. The concerns
> were that the patch inserts a hook just for this single case and there should be a better
> way of doing this - where we either don't need a hook or provide a semantic explanation
> of the pagetable building and build the patch from there.
>
> Sadly there was/is not enough time in the 2.6.39 train to actually do it properly.
> So I provided another patch (which Linus merged) which crudely tries to mark the area from
> pgt_buf_end through pgt_buf_top to RW and all is done within the Xen MMU code. Sadly it
> does not work on all machines.
>
> Without a resolution to this, the Linux x86_64 kernel cannot boot under Xen. There are two
> options left right now:
> a). Revert 4b239f458c229de044d6905c2b0f9fe16ed9e01e (x86-64, mm: Put early page table high)
> b). or revert the workaround that Linus merged and pick the one that Stefano came up with.
> The patches are available in
> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/bug-fixes-for-rc6
>
> They touch the generic x86 MMU code.
>
At this point this does indeed seem to be the only reasonable solution.
I'm not happy about either the fix or the fact that Xen is so fragile
yet wants to piggyback on generic x86 code, but for .39 there really
isn't much opportunity to fix it any other way. Konrad has promised me
that he will personally drive the work to get a better fix in.
Unfortunately as mentioned I am travelling at the moment and have
limited ability to fix this; if I get a chance I'll look at it and pull
it into tip, but under the circumstances I can't promise anything.
-hpa