Date:	Fri, 25 Oct 2013 12:38:26 -0400
From:	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To:	linux-kernel@...r.kernel.org, xen-devel@...ts.xensource.com,
	Santosh.Jodh@...rix.com, JBeulich@...e.com,
	boris.ostrovsky@...cle.com, david.vrabel@...rix.com,
	mukesh.rathor@...cle.com, xhejtman@....muni.cz,
	yuval.shaia@...cle.com
Subject: Re: [v1 1/2] xen/p2m: Create identity mappings for PFNs beyound E820
 and PCI BARs

On Fri, Oct 25, 2013 at 11:03:20AM -0400, Konrad Rzeszutek Wilk wrote:
> On bootup the E820 "gaps" or E820_RESV regions are marked as
> identity regions. Meaning that any lookup done in the P2M
> will return the same value: pfn_to_mfn(pfn) == pfn.
> 
> This is needed for PCI devices so that drivers can reference
> the correct bus address. Unfortunately there are also PCIe
> devices which set up their MMIO region above the E820. By default
> we assume in the P2M that any region above the last E820 entry
> is to be used for ballooning. That is not true - and we don't
> mark such regions as IDENTITY_FRAME but INVALID_P2M_ENTRY.
> The result is that any lookup in the P2M (pfn_to_mfn(pfn) == 0)
> gets us the 0 bus address which is hardly correct.
> 
> A solution that this patch implements is to walk the PCI device
> BAR regions and mark them as IDENTITY_FRAMEs in the P2M.
> Naturally some checks are needed such as making sure that the
> BAR regions are not part of the balloon pages, nor regular RAM.
> 
> We only set them to IDENTITY if:
>  pfn_to_mfn(pfn) == INVALID_P2M_ENTRY.
> 
> Another solution would be to mark all P2M entries beyond the
> last E820 entry _and_ not in the balloon regions as IDENTITY.
> 
> If done, that means in the worst case we have to reserve MAX_DOMAIN_PAGES
> pages (so 2MB) of virtual space in case we have to create
> new P2M leafs. We could be fancy and make the P2M code understand
> p2m_mid_missing and p2m_mid_identity and do the right things.
> But that is quite complex while this particular patch only
> gets invoked if there are PCI devices. Another solution (David
> Vrabel's idea) was to consider INVALID_P2M_ENTRY as 1-1 regions.
> The author of this patch is not sure of the ramifications
> as it would require surgery in various P2M codebits.
> 
> Reported-by: Lukas Hejtmanek <xhejtman@....muni.cz>
> Reported-by: Lance Larsh <lance.larsh@...cle.com>
> CC: Boris Ostrovsky <boris.ostrovsky@...cle.com>
> CC: David Vrabel <david.vrabel@...rix.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
> ---
>  drivers/xen/balloon.c | 19 +++++++++++++
>  drivers/xen/pci.c     | 79 +++++++++++++++++++++++++++++++++++++++++++++++++--
>  include/xen/balloon.h |  1 +
>  3 files changed, 96 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index b232908..258e3f9 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -133,6 +133,25 @@ static void balloon_append(struct page *page)
>  	adjust_managed_page_count(page, -1);
>  }
>  
> +/*
> + * Check if any of the balloon pages overlap with the supplied
> + * pfn and its range.
> + */
> +bool balloon_pfn(unsigned long pfn, unsigned long nr)
> +{
> +	struct page *page;
> +
> +	if (list_empty(&ballooned_pages))
> +		return false;
> +
> +	list_for_each_entry(page, &ballooned_pages, lru) {
> +		unsigned long b_pfn = page_to_pfn(page);
> +
> +		if (b_pfn >= pfn && b_pfn < pfn + nr)
> +			return true;
> +	}
> +	return false;
> +}
>  /* balloon_retrieve: rescue a page from the balloon, if it is not empty. */
>  static struct page *balloon_retrieve(bool prefer_highmem)
>  {
> diff --git a/drivers/xen/pci.c b/drivers/xen/pci.c
> index 18fff88..6b86eda 100644
> --- a/drivers/xen/pci.c
> +++ b/drivers/xen/pci.c
> @@ -22,6 +22,9 @@
>  #include <xen/xen.h>
>  #include <xen/interface/physdev.h>
>  #include <xen/interface/xen.h>
> +#include <xen/interface/memory.h>
> +#include <xen/page.h>
> +#include <xen/balloon.h>
>  
>  #include <asm/xen/hypervisor.h>
>  #include <asm/xen/hypercall.h>
> @@ -127,6 +130,72 @@ static int xen_add_device(struct device *dev)
>  	return r;
>  }
>  
> +static void xen_p2m_add_device(struct device *dev)
> +{
> +	int i;
> +	struct pci_dev *pci_dev = to_pci_dev(dev);
> +
> +	/* Verify whether the MMIO BARs are 1-1 in the P2M. */
> +	for (i = 0; i < PCI_NUM_RESOURCES; i++) {
> +		unsigned long pfn, start, end, ok_pfns;
> +		char bus_addr[64];
> +		char *fmt;
> +
> +		if (!pci_resource_len(pci_dev, i))
> +			continue;
> +
> +		if (pci_resource_flags(pci_dev, i) == IORESOURCE_IO)
> +			fmt = " (bus address [%#06llx-%#06llx])";
> +		else
> +			fmt = " (bus address [%#010llx-%#010llx])";
> +
> +		snprintf(bus_addr, sizeof(bus_addr), fmt,
> +			 (unsigned long long) (pci_resource_start(pci_dev, i)),
> +			 (unsigned long long) (pci_resource_end(pci_dev, i)));
> +
> +		start = pci_resource_start(pci_dev, i) >> PAGE_SHIFT;
> +		end = pci_resource_end(pci_dev, i) >> PAGE_SHIFT;
> +
> +		/*
> +		 * We don't worry about the balloon scratch page as it has a
> +		 * valid PFN - which means we will catch in the loop below.
> +		 */
> +		if (balloon_pfn(start, end - start)) {
> +			dev_warn(dev, "%s is within balloon pages!\n", bus_addr);
> +			continue;
> +		}
> +
> +		for (ok_pfns = 0, pfn = start; pfn < end; pfn++) {
> +			unsigned long mfn = pfn_to_mfn(pfn);
> +
> +			if (mfn == pfn) {
> +				ok_pfns++;
> +				continue;
> +			}
> +			if (mfn != INVALID_P2M_ENTRY) { /* RAM */
> +				dev_warn(dev, "%s is within RAM [%lx] region!\n", bus_addr, pfn);
> +				break;
> +			}
> +		}
> +		if (ok_pfns == end - start) /* All good. */
> +			continue;
> +
> +		if (pfn != end - 1) /* We broke out of the loop above. */
> +			continue;

There are some bugs in this code so please wait until the next posting
which hopefully will also include me testing it on affected hardware.
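
For readers following the description above, here is a minimal user-space
sketch of the approach the commit message outlines: walk a BAR's PFN range,
leave it alone if any entry already maps to RAM, and otherwise turn the
INVALID entries into identity entries. It is not the patch and not kernel
code. The names toy_p2m, toy_p2m_reset, toy_mark_bar_identity and
TOY_INVALID_ENTRY are hypothetical, and a flat array stands in for the real
P2M tree. Since pci_resource_end() returns the last address of a resource,
the sketch treats the end PFN as inclusive.

```c
#include <assert.h>

#define TOY_P2M_SIZE      16
#define TOY_INVALID_ENTRY (~0UL)  /* hypothetical stand-in for INVALID_P2M_ENTRY */

/* Toy P2M: the index is the PFN, the value is the MFN it maps to. */
static unsigned long toy_p2m[TOY_P2M_SIZE];

enum toy_result {
	TOY_ALREADY_IDENTITY,  /* every PFN in the range was already 1-1 */
	TOY_MARKED_IDENTITY,   /* INVALID entries were converted to 1-1 */
	TOY_OVERLAPS_RAM       /* some PFN maps to a different MFN; untouched */
};

/* Start from a P2M in which nothing is mapped. */
static void toy_p2m_reset(void)
{
	unsigned long pfn;

	for (pfn = 0; pfn < TOY_P2M_SIZE; pfn++)
		toy_p2m[pfn] = TOY_INVALID_ENTRY;
}

/* Mark [start, end] (end inclusive) as identity unless it overlaps RAM. */
static enum toy_result toy_mark_bar_identity(unsigned long start,
					     unsigned long end)
{
	unsigned long pfn, ok_pfns = 0;

	assert(start <= end && end < TOY_P2M_SIZE);

	/* First pass: classify every entry without modifying anything. */
	for (pfn = start; pfn <= end; pfn++) {
		unsigned long mfn = toy_p2m[pfn];

		if (mfn == pfn) {
			ok_pfns++;
			continue;
		}
		if (mfn != TOY_INVALID_ENTRY)  /* treated as RAM */
			return TOY_OVERLAPS_RAM;
	}
	if (ok_pfns == end - start + 1)
		return TOY_ALREADY_IDENTITY;

	/* Second pass: only INVALID entries are converted to identity. */
	for (pfn = start; pfn <= end; pfn++)
		if (toy_p2m[pfn] == TOY_INVALID_ENTRY)
			toy_p2m[pfn] = pfn;

	return TOY_MARKED_IDENTITY;
}
```

Classifying the whole range before writing anything means a range that
partly overlaps RAM is left completely unmodified, which keeps the toy
model easy to reason about. Whether the real code should behave that way
is a separate question.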