Message-ID: <4D6D6980.7060304@kernel.org>
Date: Tue, 01 Mar 2011 13:47:44 -0800
From: Yinghai Lu <yinghai@...nel.org>
To: Stefano Stabellini <stefano.stabellini@...citrix.com>
CC: linux-kernel@...r.kernel.org,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Jeremy Fitzhardinge <Jeremy.Fitzhardinge@...rix.com>
Subject: Re: "x86-64, mm: Put early page table high" causes crash on Xen
On 03/01/2011 09:21 AM, Stefano Stabellini wrote:
> Yinghai Lu,
> while testing tip/x86/mm on Xen I found out that the commit "x86-64, mm:
> Put early page table high" reliably crashes Linux at boot.
> The reason is well explained by the commit message of
> fef5ba797991f9335bcfc295942b684f9bf613a1:
>
> "Xen requires that all pages containing pagetable entries to be mapped
> read-only. If pages used for the initial pagetable are already mapped
> then we can change the mapping to RO. However, if they are initially
> unmapped, we need to make sure that when they are later mapped, they
> are also mapped RO.
>
> We do this by knowing that the kernel pagetable memory is pre-allocated
> in the range e820_table_start - e820_table_end, so any pfn within this
> range should be mapped read-only. However, the pagetable setup code
> early_ioremaps the pages to write their entries, so we must make sure
> that mappings created in the early_ioremap fixmap area are mapped RW.
> (Those mappings are removed before the pages are presented to Xen
> as pagetable pages.)"
>
> In other words mask_rw_pte (called by xen_set_pte_init) should mark RO
> the already existing pagetable pages (like the ones belonging to the
> initial mappings), while it should mark RW the new pages not yet hooked
> into the pagetable. This is what the following lines used to achieve,
> but don't anymore:
>
> 	/*
> 	 * If the new pfn is within the range of the newly allocated
> 	 * kernel pagetable, and it isn't being mapped into an
> 	 * early_ioremap fixmap slot, make sure it is RO.
> 	 */
> 	if (!is_early_ioremap_ptep(ptep) &&
> 	    pfn >= pgt_buf_start && pfn < pgt_buf_end)
> 		pte = pte_wrprotect(pte);
>
> Unfortunately now we map the already existing initial pagetable pages a
> second time and the new zeroed pages using map_low_page, so we are
> unable to distinguish between the two.
>
> Can we go back to the previous way of accessing pagetable pages from
> kernel_physical_mapping_init, while keeping the new pagetable allocation
> strategy? It seems to me that the introduction of map_low_page is not
> actually required, is it? In that case we could just revert that bit...
> (appended partial revert example).
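For readers following the thread, the read-only/read-write decision quoted above can be modelled in plain userspace C. This is only a sketch under stated assumptions: `sketch_mask_rw`, `struct pgt_range` and `SKETCH_PTE_RW` are hypothetical names made up for illustration, the `via_early_ioremap` flag stands in for the kernel's `is_early_ioremap_ptep(ptep)` check, and the range stands in for `pgt_buf_start`/`pgt_buf_end`. It is not the kernel's `mask_rw_pte`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the x86 writable bit (bit 1 of a PTE). */
#define SKETCH_PTE_RW (UINT64_C(1) << 1)

/* Hypothetical stand-in for pgt_buf_start/pgt_buf_end: [start_pfn, end_pfn). */
struct pgt_range {
	uint64_t start_pfn;
	uint64_t end_pfn;
};

/*
 * Clear the writable bit when the pfn lies inside the pagetable buffer
 * and the mapping is NOT being created through an early_ioremap slot.
 * Everything else is returned unchanged.
 */
static uint64_t sketch_mask_rw(uint64_t pte_flags, uint64_t pfn,
			       bool via_early_ioremap, struct pgt_range r)
{
	assert(r.start_pfn <= r.end_pfn);

	if (!via_early_ioremap && pfn >= r.start_pfn && pfn < r.end_pfn)
		pte_flags &= ~SKETCH_PTE_RW;

	return pte_flags;
}
```

In this model, a pfn inside the range loses the writable bit unless it is mapped through an early_ioremap slot. The check depends on being able to tell which path created the mapping, and the report above says that distinction is now lost.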
We do need map_low_page (BTW, that name is totally misleading...).
The reason is that we put the page table high, and at that time it is not under max_pfn_mapped (i.e. not mapped yet).
So we have to use
	adr = early_memremap(phys & PAGE_MASK, PAGE_SIZE);
to map it early and read/write to it.
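A rough userspace illustration of the two pieces of arithmetic involved is below: rounding a physical address down to its page base (the `phys & PAGE_MASK` argument), and checking whether its pfn falls under max_pfn_mapped. The helper names and `SKETCH_` constants are hypothetical, 4 KiB pages are assumed, and treating "pfn below max_pfn_mapped" as "already mapped" is a simplification of what the kernel tracks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86-64 with the default page size. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (UINT64_C(1) << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Page-aligned base of a physical address, i.e. phys & PAGE_MASK. */
static uint64_t sketch_page_base(uint64_t phys)
{
	uint64_t base = phys & SKETCH_PAGE_MASK;

	assert((base & (SKETCH_PAGE_SIZE - 1)) == 0);
	return base;
}

/*
 * Simplified test: a page whose pfn is not below max_pfn_mapped is
 * treated as unmapped, so it would need a temporary early mapping
 * before it can be read or written.
 */
static bool sketch_needs_early_map(uint64_t phys, uint64_t max_pfn_mapped)
{
	return (phys >> SKETCH_PAGE_SHIFT) >= max_pfn_mapped;
}
```

For example, with a hypothetical max_pfn_mapped of 0x20000 (512 MiB), a page table page at physical address 0x80000000 has pfn 0x80000 and so counts as unmapped in this model, while one at 0x1000000 does not.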
Thanks
Yinghai