Message-ID: <20101222150615.GF1760@dumpdata.com>
Date:	Wed, 22 Dec 2010 10:06:15 -0500
From:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:	Ian Campbell <Ian.Campbell@...rix.com>
Cc:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"jeremy@...p.org" <jeremy@...p.org>,
	"hpa@...or.com" <hpa@...or.com>,
	Konrad Rzeszutek Wilk <konrad@...nel.org>,
	"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
	Jan Beulich <JBeulich@...ell.com>
Subject: Re: [Xen-devel] [RFC PATCH v1] Consider void entries in the P2M as
 1-1 mapping.

On Wed, Dec 22, 2010 at 08:36:55AM +0000, Ian Campbell wrote:
> On Tue, 2010-12-21 at 21:37 +0000, Konrad Rzeszutek Wilk wrote:
> > In the past we used to think of those regions as "missing" and under
> > the ownership of the balloon code. But the balloon code only operates
> > on a specific region. This region is in the last E820 RAM page (basically
> > any region past nr_pages is considered a balloon-type page).
> 
> That is true at start of day but once the system is up and running the
> balloon driver can make a hole for anything which can be returned by
> alloc_page.

<nods>
> 
> The following descriptions seem to consider this correctly but I just
> wanted to clarify.

Yes. Thank you for thinking this one through.
> 
> I don't think it's necessarily the last E820 RAM page either, that's
> just what the tools today happen to build. In principle the tools could
> push down a holey e820 (e.g. with PCI holes prepunched etc) and boot the
> domain ballooned down such that the N-2, N-3 e820 RAM regions are above
> nr_pages too.

OK, but they would be marked as E820 RAM regions, right?
> 
> > This patchset considers the void entries as "identity" and for balloon
> > pages you have to set the PFNs to be "missing". This means that the
> > void entries are now considered 1-1, so PFNs which fall in large
> > gaps of the P2M space will return the same PFN.
> 
> I would naively have expected that a missing entry indicated an
> invalid/missing entry rather than an identity region, it just seems like

It has. For regions that are small or already allocated, it would
stuff INVALID_P2M_ENTRY in them. For larger areas (more than 1MB or so),
if no top entry has been allocated yet, it attaches p2m_mid_missing,
which has pointers to p2m_missing, which in turn is filled with
INVALID_P2M_ENTRY.
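
To make that layout concrete, here is a minimal user-space sketch of the
three-level lookup. It is a toy model, not the kernel code: PER_LEVEL and
the p2m_init helper are made up for illustration, the levels are tiny
instead of page-sized, and only the names p2m_top, p2m_mid_missing,
p2m_missing and INVALID_P2M_ENTRY come from the discussion above.

```c
#include <stddef.h>

/* Toy size (assumption); the real levels are page-sized. */
#define PER_LEVEL 4
#define INVALID_P2M_ENTRY (~0UL)

/* Shared leaf: every slot holds INVALID_P2M_ENTRY. */
static unsigned long p2m_missing[PER_LEVEL];
/* Shared middle level: every slot points at p2m_missing. */
static unsigned long *p2m_mid_missing[PER_LEVEL];
/* Top level: unallocated slots point at p2m_mid_missing. */
static unsigned long **p2m_top[PER_LEVEL];

/* Hypothetical init helper: wire every level to the shared sentinels. */
static void p2m_init(void)
{
	size_t i;

	for (i = 0; i < PER_LEVEL; i++) {
		p2m_missing[i] = INVALID_P2M_ENTRY;
		p2m_mid_missing[i] = p2m_missing;
		p2m_top[i] = p2m_mid_missing;
	}
}

/* Walk top -> mid -> leaf; PFNs outside the toy tree are invalid. */
static unsigned long get_phys_to_machine(unsigned long pfn)
{
	unsigned long topidx = pfn / (PER_LEVEL * PER_LEVEL);
	unsigned long mididx = (pfn / PER_LEVEL) % PER_LEVEL;
	unsigned long idx = pfn % PER_LEVEL;

	if (topidx >= PER_LEVEL)
		return INVALID_P2M_ENTRY;
	return p2m_top[topidx][mididx][idx];
}
```

With nothing allocated, every lookup ends up in p2m_missing and comes
back as INVALID_P2M_ENTRY, which is the "missing" default described above.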

> the safer default since we are (maybe) more likely to catch an
> INVALID_P2M_ENTRY before handing it to the hypervisor and getting
> ourselves shot.

When I think entry, I think the lowest level of the tree, not the
top or middle which are the ones that are by default now considered
"identity". FYI, the p2m_identity is stuffed with INVALID_P2M_ENTRY
so if somebody does get a hold of the value there somehow without
first trying to set it, we would catch it and do this:

(xen/mmu.c, pte_pfn_to_mfn function):

	/*
	 * If there's no mfn for the pfn, then just create an
	 * empty non-present pte.  Unfortunately this loses
	 * information about the original pfn, so
	 * pte_mfn_to_pfn is asymmetric.
	 */
	if (unlikely(mfn == INVALID_P2M_ENTRY)) {
		mfn = 0;
		flags = 0;
	}
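
Roughly how the two pieces fit together, as a user-space sketch under
stated assumptions: the model_* names, the two-level table and its sizes
are hypothetical, and detecting identity by comparing the leaf pointer
against p2m_identity is my reading of the description above rather than
a quote from the patch.

```c
#include <stddef.h>

#define INVALID_P2M_ENTRY (~0UL)
#define LEAF_ENTRIES 4	/* toy size (assumption) */
#define NR_LEAVES 4	/* toy size (assumption) */

static unsigned long p2m_missing[LEAF_ENTRIES];	/* balloon: no MFN */
static unsigned long p2m_identity[LEAF_ENTRIES];	/* void: 1-1 */
static unsigned long *model_leaves[NR_LEAVES];

/* Both sentinel leaves are filled with INVALID_P2M_ENTRY. */
static void model_init(void)
{
	size_t i;

	for (i = 0; i < LEAF_ENTRIES; i++) {
		p2m_missing[i] = INVALID_P2M_ENTRY;
		p2m_identity[i] = INVALID_P2M_ENTRY;
	}
	/* Void (never set) regions default to identity. */
	for (i = 0; i < NR_LEAVES; i++)
		model_leaves[i] = p2m_identity;
}

static unsigned long model_pfn_to_mfn(unsigned long pfn)
{
	unsigned long *leaf;

	if (pfn >= NR_LEAVES * LEAF_ENTRIES)
		return INVALID_P2M_ENTRY;	/* model choice only */
	leaf = model_leaves[pfn / LEAF_ENTRIES];
	if (leaf == p2m_identity)
		return pfn;	/* decided by pointer, contents not read */
	return leaf[pfn % LEAF_ENTRIES];
}

struct model_pte {
	unsigned long frame;
	unsigned long flags;
};

/* Same fallback as the snippet above: no mfn means an empty pte. */
static struct model_pte model_make_pte(unsigned long pfn, unsigned long flags)
{
	struct model_pte pte;
	unsigned long mfn = model_pfn_to_mfn(pfn);

	if (mfn == INVALID_P2M_ENTRY) {
		mfn = 0;
		flags = 0;
	}
	pte.frame = mfn;
	pte.flags = flags;
	return pte;
}
```

In this model a void region hands back the PFN itself, while a leaf
explicitly switched to p2m_missing (the balloon case) still ends up as an
empty non-present pte.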


> 
> In that case the identity regions would need to be explicitly
> registered, is that harder to do?

It might not be, but it would end up in the same logic path (in
the pte_pfn_to_mfn function).

> 
> I guess we could register any hole or explicit non-RAM region in the
> e820 as identity but do we sometimes see I/O memory above the top of the
> e820 or is there some other problem I'm not thinking of?

Hot plug memory is one. There are also some PCI BARs that are above
that region (but I can't remember the details). Jeremy mentioned
something about Fujitsu machines.

> 
> > The xen/mmu.c code where it deals with _PAGE_IOMAP can be removed, but
> > to guard against regressions or bugs let's take it one patchset at a
> > time.
> 
> Could we have a WARN_ON(_PAGE_IOMAP && !PAGE_IDENTITY) (or whatever the
> predicates really are) in some relevant places in mmu.c?

PAGE_IDENTITY (or IDENTITY_P2M_ENTRY) is never set anywhere. We could
do this:

  WARN_ON(pfn_to_mfn(pfn) == pfn && (flag & _PAGE_IOMAP))

but that would be printed all the time.

Unless I saved some extra flag along with the MFN (as you were alluding to
earlier) and, for identity mappings, just returned that flag unconditionally.
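
A rough sketch of what I mean, as a user-space model only. Nothing here
exists in the patchset: MODEL_IDENTITY_BIT, MODEL_PAGE_IOMAP and the
model_* helpers are hypothetical names, and the bit positions are picked
arbitrarily for illustration.

```c
#include <limits.h>
#include <stdbool.h>

#define INVALID_P2M_ENTRY (~0UL)
/* Hypothetical: a high bit of the stored entry marks "identity". */
#define MODEL_IDENTITY_BIT (1UL << (sizeof(unsigned long) * CHAR_BIT - 2))
/* Stand-in for _PAGE_IOMAP; the value is arbitrary in this model. */
#define MODEL_PAGE_IOMAP (1UL << 10)

/* Store the flag together with the frame number. */
static unsigned long model_identity_frame(unsigned long pfn)
{
	return pfn | MODEL_IDENTITY_BIT;
}

/* INVALID_P2M_ENTRY has every bit set, so exclude it explicitly. */
static bool model_is_identity(unsigned long entry)
{
	return entry != INVALID_P2M_ENTRY && (entry & MODEL_IDENTITY_BIT) != 0;
}

/* Strip the flag to recover the frame number. */
static unsigned long model_frame(unsigned long entry)
{
	return entry & ~MODEL_IDENTITY_BIT;
}

/* The condition a WARN_ON could test: IOMAP pte without identity entry. */
static bool model_should_warn(unsigned long entry, unsigned long flags)
{
	return (flags & MODEL_PAGE_IOMAP) != 0 && !model_is_identity(entry);
}
```

With the flag stored in the entry, the check fires only for an IOMAP pte
whose P2M entry was not marked identity, instead of for every PFN that
happens to equal its MFN.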