lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ee5a226b-0e12-8823-0961-bf2fe0bd494e@gmail.com>
Date:   Fri, 19 Nov 2021 22:23:16 +0200
From:   Oleksandr <olekstysh@...il.com>
To:     Stefano Stabellini <sstabellini@...nel.org>
Cc:     xen-devel@...ts.xenproject.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>,
        Russell King <linux@...linux.org.uk>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Juergen Gross <jgross@...e.com>, Julien Grall <julien@....org>
Subject: Re: [PATCH V2 4/4] arm/xen: Read extended regions from DT and init
 Xen resource


On 19.11.21 03:19, Stefano Stabellini wrote:

Hi Stefano

> On Wed, 10 Nov 2021, Oleksandr wrote:
>> On 28.10.21 04:40, Stefano Stabellini wrote:
>>
>> Hi Stefano
>>
>> I am sorry for the late response.
>>
>>> On Tue, 26 Oct 2021, Oleksandr Tyshchenko wrote:
>>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>
>>>>
>>>> This patch implements arch_xen_unpopulated_init() on Arm where
>>>> the extended regions (if any) are gathered from DT and inserted
>>>> into passed Xen resource to be used as unused address space
>>>> for Xen scratch pages by unpopulated-alloc code.
>>>>
>>>> The extended region (safe range) is a region of guest physical
>>>> address space which is unused and could be safely used to create
>>>> grant/foreign mappings instead of wasting real RAM pages from
>>>> the domain memory for establishing these mappings.
>>>>
>>>> The extended regions are chosen by the hypervisor at the domain
>>>> creation time and advertised to it via "reg" property under
>>>> hypervisor node in the guest device-tree. As region 0 is reserved
>>>> for grant table space (always present), the indexes for extended
>>>> regions are 1...N.
>>>>
>>>> If arch_xen_unpopulated_init() fails for some reason the default
>>>> behaviour will be restored (allocate xenballooned pages).
>>>>
>>>> This patch also removes XEN_UNPOPULATED_ALLOC dependency on x86.
>>>>
>>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@...m.com>
>>>> ---
>>>> Changes RFC -> V2:
>>>>      - new patch, instead of
>>>>       "[RFC PATCH 2/2] xen/unpopulated-alloc: Query hypervisor to provide
>>>> unallocated space"
>>>> ---
>>>>    arch/arm/xen/enlighten.c | 112
>>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>>    drivers/xen/Kconfig      |   2 +-
>>>>    2 files changed, 113 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
>>>> index dea46ec..1a1e0d3 100644
>>>> --- a/arch/arm/xen/enlighten.c
>>>> +++ b/arch/arm/xen/enlighten.c
>>>> @@ -62,6 +62,7 @@ static __read_mostly unsigned int xen_events_irq;
>>>>    static phys_addr_t xen_grant_frames;
>>>>      #define GRANT_TABLE_INDEX   0
>>>> +#define EXT_REGION_INDEX    1
>>>>      uint32_t xen_start_flags;
>>>>    EXPORT_SYMBOL(xen_start_flags);
>>>> @@ -303,6 +304,117 @@ static void __init xen_acpi_guest_init(void)
>>>>    #endif
>>>>    }
>>>>    +#ifdef CONFIG_XEN_UNPOPULATED_ALLOC
>>>> +int arch_xen_unpopulated_init(struct resource *res)
>>>> +{
>>>> +	struct device_node *np;
>>>> +	struct resource *regs, *tmp_res;
>>>> +	uint64_t min_gpaddr = -1, max_gpaddr = 0;
>>>> +	unsigned int i, nr_reg = 0;
>>>> +	struct range mhp_range;
>>>> +	int rc;
>>>> +
>>>> +	if (!xen_domain())
>>>> +		return -ENODEV;
>>>> +
>>>> +	np = of_find_compatible_node(NULL, NULL, "xen,xen");
>>>> +	if (WARN_ON(!np))
>>>> +		return -ENODEV;
>>>> +
>>>> +	/* Skip region 0 which is reserved for grant table space */
>>>> +	while (of_get_address(np, nr_reg + EXT_REGION_INDEX, NULL, NULL))
>>>> +		nr_reg++;
>>>> +	if (!nr_reg) {
>>>> +		pr_err("No extended regions are found\n");
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	regs = kcalloc(nr_reg, sizeof(*regs), GFP_KERNEL);
>>>> +	if (!regs)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	/*
>>>> +	 * Create resource from extended regions provided by the hypervisor to
>>>> be
>>>> +	 * used as unused address space for Xen scratch pages.
>>>> +	 */
>>>> +	for (i = 0; i < nr_reg; i++) {
>>>> +		rc = of_address_to_resource(np, i + EXT_REGION_INDEX,
>>>> &regs[i]);
>>>> +		if (rc)
>>>> +			goto err;
>>>> +
>>>> +		if (max_gpaddr < regs[i].end)
>>>> +			max_gpaddr = regs[i].end;
>>>> +		if (min_gpaddr > regs[i].start)
>>>> +			min_gpaddr = regs[i].start;
>>>> +	}
>>>> +
>>>> +	/* Check whether the resource range is within the hotpluggable range
>>>> */
>>>> +	mhp_range = mhp_get_pluggable_range(true);
>>>> +	if (min_gpaddr < mhp_range.start)
>>>> +		min_gpaddr = mhp_range.start;
>>>> +	if (max_gpaddr > mhp_range.end)
>>>> +		max_gpaddr = mhp_range.end;
>>>> +
>>>> +	res->start = min_gpaddr;
>>>> +	res->end = max_gpaddr;
>>>> +
>>>> +	/*
>>>> +	 * Mark holes between extended regions as unavailable. The rest of
>>>> that
>>>> +	 * address space will be available for the allocation.
>>>> +	 */
>>>> +	for (i = 1; i < nr_reg; i++) {
>>>> +		resource_size_t start, end;
>>>> +
>>>> +		start = regs[i - 1].end + 1;
>>>> +		end = regs[i].start - 1;
>>>> +
>>>> +		if (start > (end + 1)) {
>>> Should this be:
>>>
>>> if (start >= end)
>>>
>>> ?
>> Yes, we can do this here (since the checks are equivalent) but ...
>>
>>>> +			rc = -EINVAL;
>>>> +			goto err;
>>>> +		}
>>>> +
>>>> +		/* There is no hole between regions */
>>>> +		if (start == (end + 1))
>>> Also here, shouldn't it be:
>>>
>>> if (start == end)
>>>
>>> ?
>>     ... not here.
>>
>> As
>>
>> "(start == (end + 1))" is equal to "(regs[i - 1].end + 1 == regs[i].start)"
>>
>> but
>>
>> "(start == end)" is equal to "(regs[i - 1].end + 1 == regs[i].start - 1)"
>   
> OK. But the check:
>
>    if (start >= end)
>
> Actually covers both cases so that's the only check we need?

Sorry, I don't entirely understand the question.
Is the question to use only a single check in that loop?

Paste the updated code which I have locally for the convenience.

  [snip]

     /*
      * Mark holes between extended regions as unavailable. The rest of that
      * address space will be available for the allocation.
      */
     for (i = 1; i < nr_reg; i++) {
         resource_size_t start, end;

         start = regs[i - 1].end + 1;
         end = regs[i].start - 1;

         if (start > (end + 1)) {
             rc = -EINVAL;
             goto err;
         }

         /* There is no hole between regions */
         if (start == (end + 1))
             continue;

         tmp_res = kzalloc(sizeof(*tmp_res), GFP_KERNEL);
         if (!tmp_res) {
             rc = -ENOMEM;
             goto err;
         }

         tmp_res->name = "Unavailable space";
         tmp_res->start = start;
         tmp_res->end = end;

         rc = insert_resource(&xen_resource, tmp_res);
         if (rc) {
             pr_err("Cannot insert resource %pR (%d)\n", tmp_res, rc);
             kfree(tmp_res);
             goto err;
         }
     }

[snip]


1. The first check is to detect an overlap (which is a wrong 
configuration, correct?) and bail out if true (for example, regX: 
0x81000000...0x82FFFFFF and regY: 0x82000000...0x83FFFFFF).
2. The second check is just to skip current iteration as there is no 
space/hole between regions (for example, regX: 0x81000000...0x82FFFFFF 
and regY: 0x83000000...0x83FFFFFF).
Therefore I think they should be distinguished.

Yes, both check could be transformed to a single one, but this way the 
overlaps will be ignored:
if (start >= (end + 1))
     continue;

Or I really missed something?


>
>
>>> I think I am missing again something in termination accounting :-)
>> If I understand correctly, we need to follow "end = start + size - 1" rule, so
>> the "end" is the last address inside a range, but not the "first" address
>> outside of a range))
> yeah
>   
>
>>>> +			continue;
>>>> +
>>>> +		/* Check whether the hole range is within the resource range
>>>> */
>>>> +		if (start < res->start || end > res->end) {
>>> By definition I don't think this check is necessary as either condition
>>> is impossible?
>>
>> This is a good question, let me please explain.
>> Not all extended regions provided by the hypervisor can be used here. This is
>> because the addressable physical memory range for which the linear mapping
>> could be created has limits on Arm, and maximum addressable range depends on
>> the VA space size (CONFIG_ARM64_VA_BITS_XXX). So we decided to not filter them
>> in hypervisor as this logic could be quite complex as different OS may have
>> different requirement, etc. This means that we need to make sure that regions
>> are within the hotpluggable range to avoid a failure later on when a region is
>> pre-validated by the memory hotplug path.
>>
>> The following code limits the resource range based on that:
>>
>> +    /* Check whether the resource range is within the hotpluggable range */
>> +    mhp_range = mhp_get_pluggable_range(true);
>> +    if (min_gpaddr < mhp_range.start)
>> +        min_gpaddr = mhp_range.start;
>> +    if (max_gpaddr > mhp_range.end)
>> +        max_gpaddr = mhp_range.end;
>> +
>> +    res->start = min_gpaddr;
>> +    res->end = max_gpaddr;
>>
>> In current loop (when calculating and inserting holes) we also need to make
>> sure that resulting hole range is within the resource range (and adjust/skip
>> it if not true) as regs[] used for the calculations contains raw regions as
>> they described in DT so not updated. Otherwise insert_resource() down the
>> function will return an error for the conflicting operations. Yes, I could
>> took a different route and update regs[] in advance to adjust/skip
>> non-suitable regions in front, but I decided to do it on the fly in the loop
>> here, I thought doing it in advance would add some overhead/complexity. What
>> do you think?
> I understand now.
>
>
>> So I am afraid this check is necessary here.
>>
>> For example in my environment the extended regions are:
>>
>> (XEN) Extended region 0: 0->0x8000000
>> (XEN) Extended region 1: 0xc000000->0x30000000
>> (XEN) Extended region 2: 0x40000000->0x47e00000
>> (XEN) Extended region 3: 0xd0000000->0xe6000000
>> (XEN) Extended region 4: 0xe7800000->0xec000000
>> (XEN) Extended region 5: 0xf1200000->0xfd000000
>> (XEN) Extended region 6: 0x100000000->0x500000000
>> (XEN) Extended region 7: 0x580000000->0x600000000
>> (XEN) Extended region 8: 0x680000000->0x700000000
>> (XEN) Extended region 9: 0x780000000->0x10000000000
>>
>> *With* the check the holes are:
>>
>> holes [47e00000 - cfffffff]
>> holes [e6000000 - e77fffff]
>> holes [ec000000 - f11fffff]
>> holes [fd000000 - ffffffff]
>> holes [500000000 - 57fffffff]
>> holes [600000000 - 67fffffff]
>> holes [700000000 - 77fffffff]
>>
>> And they seem to look correct, you can see that two possible holes between
>> extended regions 0-1 (8000000-bffffff) and 1-2 (30000000-3fffffff) were
>> skipped as they entirely located below res->start
>> which is 0x40000000 in my case (48-bit VA: 0x40000000 - 0x80003fffffff).
>>
>> *Without* the check these two holes won't be skipped and as the result
>> insert_resource() will fail.
>>
>>
>> **********
>>
>>
>> I have one idea how we can simplify filter logic, we can drop all checks here
>> (including confusing one) in Arm code and update common code a bit:
>>
>> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
>> index 1a1e0d3..ed5b855 100644
>> --- a/arch/arm/xen/enlighten.c
>> +++ b/arch/arm/xen/enlighten.c
>> @@ -311,7 +311,6 @@ int arch_xen_unpopulated_init(struct resource *res)
>>          struct resource *regs, *tmp_res;
>>          uint64_t min_gpaddr = -1, max_gpaddr = 0;
>>          unsigned int i, nr_reg = 0;
>> -       struct range mhp_range;
>>          int rc;
>>
>>          if (!xen_domain())
>> @@ -349,13 +348,6 @@ int arch_xen_unpopulated_init(struct resource *res)
>>                          min_gpaddr = regs[i].start;
>>          }
>>
>> -       /* Check whether the resource range is within the hotpluggable range
>> */
>> -       mhp_range = mhp_get_pluggable_range(true);
>> -       if (min_gpaddr < mhp_range.start)
>> -               min_gpaddr = mhp_range.start;
>> -       if (max_gpaddr > mhp_range.end)
>> -               max_gpaddr = mhp_range.end;
>> -
>>          res->start = min_gpaddr;
>>          res->end = max_gpaddr;
>>
>> @@ -378,17 +370,6 @@ int arch_xen_unpopulated_init(struct resource *res)
>>                  if (start == (end + 1))
>>                          continue;
>>
>> -               /* Check whether the hole range is within the resource range
>> */
>> -               if (start < res->start || end > res->end) {
>> -                       if (start < res->start)
>> -                               start = res->start;
>> -                       if (end > res->end)
>> -                               end = res->end;
>> -
>> -                       if (start >= (end + 1))
>> -                               continue;
>> -               }
>> -
>>                  tmp_res = kzalloc(sizeof(*tmp_res), GFP_KERNEL);
>>                  if (!tmp_res) {
>>                          rc = -ENOMEM;
>> diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
>> index 1f1d8d8..a5d3ebb 100644
>> --- a/drivers/xen/unpopulated-alloc.c
>> +++ b/drivers/xen/unpopulated-alloc.c
>> @@ -39,6 +39,7 @@ static int fill_list(unsigned int nr_pages)
>>          void *vaddr;
>>          unsigned int i, alloc_pages = round_up(nr_pages, PAGES_PER_SECTION);
>>          int ret;
>> +       struct range mhp_range;
>>
>>          res = kzalloc(sizeof(*res), GFP_KERNEL);
>>          if (!res)
>> @@ -47,8 +48,10 @@ static int fill_list(unsigned int nr_pages)
>>          res->name = "Xen scratch";
>>          res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>>
>> +       mhp_range = mhp_get_pluggable_range(true);
>> +
>>          ret = allocate_resource(target_resource, res,
>> -                               alloc_pages * PAGE_SIZE, 0, -1,
>> +                               alloc_pages * PAGE_SIZE, mhp_range.start,
>> mhp_range.end,
>>                                  PAGES_PER_SECTION * PAGE_SIZE, NULL, NULL);
>>          if (ret < 0) {
>>                  pr_err("Cannot allocate new IOMEM resource\n");
>> (END)
>>
>> I believe, this will work on x86 as arch_get_mappable_range() is not
>> implemented there,
>> and the default option contains exactly what being used currently (0, -1).
>>
>> struct range __weak arch_get_mappable_range(void)
>> {
>>      struct range mhp_range = {
>>          .start = 0UL,
>>          .end = -1ULL,
>>      };
>>      return mhp_range;
>> }
>>
>> And this is going to be more generic and clear, what do you think?
> Yeah this is much better, good thinking!

Great, thank you. Will do in next version.


-- 
Regards,

Oleksandr Tyshchenko

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ