[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101222150615.GF1760@dumpdata.com>
Date: Wed, 22 Dec 2010 10:06:15 -0500
From: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To: Ian Campbell <Ian.Campbell@...rix.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"jeremy@...p.org" <jeremy@...p.org>,
"hpa@...or.com" <hpa@...or.com>,
Konrad Rzeszutek Wilk <konrad@...nel.org>,
"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
Jan Beulich <JBeulich@...ell.com>
Subject: Re: [Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as
1-1 mapping.
On Wed, Dec 22, 2010 at 08:36:55AM +0000, Ian Campbell wrote:
> On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> > In the past we used to think of those regions as "missing" and under
> > the ownership of the balloon code. But the balloon code only operates
> > on a specific region. This region is in lastE820 RAM page (basically
> > any region past nr_pages is considered balloon type page).
>
> That is true at start of day but once the system is up and running the
> balloon driver can make a hole for anything which can be returned by
> alloc_page.
<nods>
>
> The following descriptions seem to consider this correctly but I just
> wanted to clarify.
Yes. Thank you for thinking this one through.
>
> I don't think it's necessarily the last E820 RAM page either, that's
> just what the tools today happen to build. In principal the tools could
> push down a holey e820 (e.g. with PCI holes prepunched etc) and boot the
> domain ballooned down such that the N-2, N-3 e820 RAM regions are above
> nr_pages too.
OK, but they would be marked as E820 RAM regions, right?
>
> > This patchset considers the void entries as "identity" and for balloon
> > pages you have to set the PFNs to be "missing". This means that the
> > void entries are now considered 1-1, so for PFNs which exist in large
> > gaps of the P2M space will return the same PFN.
>
> I would naively have expected that a missing entry indicated an
> invalid/missing entry rather than an identity region, it just seems like
It has. For regions that are small, or already allocated it would
stuff the INVALID_P2M_ENTRY in it. For larger areas (so more than 1MB or so)
if there has not been a top entry allocated for it, it will attach
the p2m_mid_missing to it which has pointes to p2m_missing, which in
turn is filled iwht INVALID_P2M_ENTRY.
> the safer default since we are (maybe) more likely to catch an
> INVALID_P2M_ENTRY before handing it to the hypervisor and getting
> ourselves shot.
When I think entry, I think the lowel-level of the tree, not the
top or middle which are the ones that are by default now considered
"identity". FYI, the p2m_identity is stuffed with INVALID_P2M_ENTRY
so if somebody does get a hold of the value there somehow without
first trying to set it, we would catch it and do this:
(xen/mmu.c, pte_pfn_to_mfn function):
/*
* If there's no mfn for the pfn, then just create an
* empty non-present pte. Unfortunately this loses
* information about the original pfn, so
* pte_mfn_to_pfn is asymmetric.
*/
if (unlikely(mfn == INVALID_P2M_ENTRY)) {
mfn = 0;
flags = 0;
}
>
> In that case the identity regions would need to be explicitly
> registered, is that harder to do?
It might not be.. but it would end up in the same logic path (in
the pte_pfn_to_mfn function).
>
> I guess we could register any hole or explicit non-RAM region in the
> e820 as identity but do we sometimes see I/O memory above the top of the
> e820 or is there some other problem I'm not thinking of?
Hot plug memory is one. There are also some PCI BARs that are above
that region (but I can't remember the details). Jeremy mentioned
something about Fujitsu machines.
>
> > The xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but
> > to guard against regressions or bugs lets take it one patchset at a
> > time.
>
> Could we have a WARN_ON(_PAGE_IOMAP && !PAGE_IDENTITY) (or whatever the
> predicates really are) in some relevant places in mmu.c?
The PAGE_IDENTITY or (IDENTITY_P2M_ENTRY) is never set anywhere. We could
do this:
WARN_ON(pfn_to_mfn(pfn)==pfn && (flag & _PAGE_IOMAP))
but that would be printed all the time.
Unless I saved some extra flag (as you were alluding to earlier) and did that
along with the MFN and for identity mappings just returned that flag unconditionaly.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists