linux-kernel - RE: [PATCH v3] x86/Hyper-V: Support for free page reporting

Message-ID: <SN4PR2101MB0880DB0606A5A0B72AD244B4C06A9@SN4PR2101MB0880.namprd21.prod.outlook.com>
Date:   Wed, 17 Mar 2021 20:30:30 +0000
From:   Sunil Muthuswamy <sunilmut@...rosoft.com>
To:     Michael Kelley <mikelley@...rosoft.com>,
        Matheus Castello <matheus@...tello.eng.br>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Wei Liu <liuwe@...rosoft.com>,
        Tianyu Lan <Tianyu.Lan@...rosoft.com>,
        Wei Liu <wei.liu@...nel.org>, vkuznets <vkuznets@...hat.com>
CC:     KY Srinivasan <kys@...rosoft.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3] x86/Hyper-V: Support for free page reporting

> > +	if (!(ms_hyperv.priv_high & HV_ENABLE_EXTENDED_HYPERCALLS))
> > +		return 0;
> 
> Return 'false' since the function is declared as bool?
Will fix this in the next iteration.

> > +	if (hv_do_hypercall(HV_EXT_CALL_QUERY_CAPABILITIES, NULL, cap) ==
> > +	    HV_STATUS_SUCCESS)
> 
> Need to mask before checking for HV_STATUS_SUCCESS.  With regard to the
> reserved fields in the returned 64 bit status, the TLFS says "Callers should ignore the
> value in these bits".  There's no promise that they are zero.
Coming in next version.
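
For illustration only, the masked status check discussed above could look
roughly like the sketch below. It is a userspace approximation, not the
patch code: the two constants are assumed to mirror the kernel's TLFS
definitions (result code in bits 15:0, success == 0), and hv_status_ok()
is a hypothetical helper name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: result code occupies bits 15:0 of the 64-bit status. */
#define HV_HYPERCALL_RESULT_MASK	0xFFFFULL
/* Assumption: a result code of 0 means success. */
#define HV_STATUS_SUCCESS		0

/*
 * Hypothetical helper: true if the hypercall succeeded.  The upper bits
 * are ignored, since nothing promises that reserved fields are zero.
 */
static bool hv_status_ok(uint64_t status)
{
	return (status & HV_HYPERCALL_RESULT_MASK) == HV_STATUS_SUCCESS;
}
```

With this shape, a status whose reserved bits happen to be non-zero is still
treated as success as long as the low 16 bits are zero, whereas a direct
comparison against HV_STATUS_SUCCESS would reject it.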

> 
> > +		ext_cap = *cap;
> > +
> > +	local_irq_restore(flags);
> > +	return ext_cap & cap_query;
> > +}
> 
> As I noted in a review comment back in May, the output arg here is
> only 64 bits in size and could just live on the stack with assurance that
> it won't cross a page boundary.  So the code could be:
> 
> bool hv_query_ext_cap(u64 cap_query)
> {
> 	u64	cap;
> 	u64	status;
> 
> 	if (!(ms_hyperv.priv_high & HV_ENABLE_EXTENDED_HYPERCALLS))
> 		return false;
> 
> 	status = hv_do_hypercall(HV_EXT_CALL_QUERY_CAPABILITIES, NULL, &cap);
> 	if ((status & HV_HYPERCALL_RESULT_MASK) != HV_STATUS_SUCCESS)
> 		cap = 0;
> 
> 	return cap_query & cap;
> }
> 
> But if you think there's value in using the designated page for hypercall args,
> I'm OK with just fixing the testing of the status.

Hypercall input/output addresses must be 'virt_to_phys' compatible, because 'hv_do_hypercall'
calls that on the address to get the physical address to pass to the hypervisor. Stack
variables can be virtually allocated (for example, with CONFIG_VMAP_STACK) and are then not
compatible with 'virt_to_phys', but we should be able to use a 'static' variable for this.
Will address this in the next version.

> 
> > -	pr_info("Hyper-V: features 0x%x, hints 0x%x, misc 0x%x\n",
> > -		ms_hyperv.features, ms_hyperv.hints, ms_hyperv.misc_features);
> > +	pr_info("Hyper-V: privilege flags low:0x%x, high:0x%x, hints:0x%x, misc:0x%x\n",
> 
> Nit.  Could we just use a space instead of a colon before each of the printed hex values?
Sure, coming in next version.

> > @@ -23,6 +23,7 @@ config HYPERV_UTILS
> >  config HYPERV_BALLOON
> >  	tristate "Microsoft Hyper-V Balloon driver"
> >  	depends on HYPERV
> > +	select PAGE_REPORTING
> 
> With this selection made, are the #ifdef CONFIG_PAGE_REPORTING occurrences
> below really needed?  I looked at the virtio balloon driver, which also does
> "select PAGE_REPORTING", and it does not have any #ifdef's.

Good point. I don't think we need the extra '#ifdef's for page reporting now that it is
selected implicitly by the Hyper-V balloon driver. Coming in next version.

> >  static struct hv_dynmem_device dm_device;
> > @@ -1568,6 +1573,84 @@ static void balloon_onchannelcallback(void *context)
> >
> >  }
> >
> > +#ifdef CONFIG_PAGE_REPORTING
> > +/* Hyper-V only supports reporting 2MB pages or higher */
> 
> I'm guessing the above is the same on ARM64 where the guest is using 16K
> or 64K page size, because Hyper-V always uses 4K pages and expects all guest
> communication to be in units of 4K pages.

Yes.
 
> > +		range->page.additional_pages =
> > +			(sg->length / HV_MIN_PAGE_REPORTING_LEN) - 1;
> 
> Perhaps verify that sg->length is at least 2 Meg? (similar to verifying that nents
> isn't too big).  If it isn't at least 2 Meg, then additional_pages will get set to -1,
> and I suspect weird things will happen.
I will add an assert.

> 
> I was also thinking about whether sg->length could be big enough to overflow
> the additional_pages field.  sg->length is an unsigned int, so I don't think so.
Yes, the additional_pages field is designed to accommodate 32 bits.
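
To make the underflow concern concrete, here is a minimal userspace sketch of
the length check. The constants are assumptions (4K hypervisor pages, a 2MB
minimum reporting unit, i.e. order 9), and hv_additional_pages() is a
hypothetical helper, not a function from the patch.

```c
#include <stdint.h>

/* Assumptions: Hyper-V page size is 4K and the minimum unit is 2MB. */
#define HV_HYP_PAGE_SIZE		4096u
#define HV_MIN_PAGE_REPORTING_ORDER	9
#define HV_MIN_PAGE_REPORTING_LEN \
	(HV_HYP_PAGE_SIZE << HV_MIN_PAGE_REPORTING_ORDER)

/*
 * Hypothetical helper: compute the additional_pages value for one
 * scatterlist entry.  Returns 0 on success, or -1 if the entry is
 * shorter than 2MB; without this check the unsigned subtraction would
 * wrap around to 0xFFFFFFFF.
 */
static int hv_additional_pages(uint32_t sg_length, uint32_t *additional_pages)
{
	if (sg_length < HV_MIN_PAGE_REPORTING_LEN)
		return -1;

	*additional_pages = sg_length / HV_MIN_PAGE_REPORTING_LEN - 1;
	return 0;
}
```

A 2MB entry yields 0 additional pages and a 4MB entry yields 1; a 1MB entry
is rejected instead of producing a wrapped value.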

> 
> > +		range->base_large_pfn =
> > +			page_to_pfn(sg_page(sg)) >> HV_MIN_PAGE_REPORTING_ORDER;
> 
> page_to_pfn() will do the wrong thing on ARM64 with 16K or 64K pages.
> Use page_to_hvpfn() instead.
Good point.
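
As a rough illustration of why the guest PFN has to be converted first, the
arithmetic can be sketched in userspace as below. The helper names are
hypothetical, and the constants (4K hypervisor pages, reporting order 9) are
assumptions rather than values quoted from the patch.

```c
#include <stdint.h>

/* Assumptions: Hyper-V uses 4K pages; the reporting unit is 2MB. */
#define HV_HYP_PAGE_SHIFT		12
#define HV_MIN_PAGE_REPORTING_ORDER	9

/*
 * Hypothetical stand-in for page_to_hvpfn(): scale a guest PFN to a
 * PFN in units of 4K hypervisor pages.  guest_page_shift is 12, 14 or
 * 16 for 4K, 16K or 64K guest pages.
 */
static uint64_t pfn_to_hvpfn(uint64_t guest_pfn, unsigned int guest_page_shift)
{
	return guest_pfn << (guest_page_shift - HV_HYP_PAGE_SHIFT);
}

/* Index of the 2MB-aligned large page containing the guest page. */
static uint64_t base_large_pfn(uint64_t guest_pfn, unsigned int guest_page_shift)
{
	return pfn_to_hvpfn(guest_pfn, guest_page_shift) >>
		HV_MIN_PAGE_REPORTING_ORDER;
}
```

With 64K guest pages, the page at physical address 2MB has guest PFN 32.
Shifting 32 right by 9 directly would give large page 0, while converting to
a 4K-based PFN first (512) gives the expected large page 1.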

> > +static void enable_page_reporting(void)
> > +{
> > +	int ret;
> > +
> > +	BUILD_BUG_ON(pageblock_order < HV_MIN_PAGE_REPORTING_ORDER);
> 
> The BUILD_BUG_ON won't work in the case where pageblock_order is
> actually a variable rather than a constant, though that's currently only ia64 and
> powerpc, which we don't directly care about.  Nonetheless, this would break if
> pageblock_order were to become a variable.
> 
I have moved this to a conditional statement. The compiler can optimize the code
away when it is a constant.

> > +	if (ret < 0) {
> > +		dm_device.pr_dev_info.report = NULL;
> > +		pr_err("Failed to enable cold memory discard: %d\n", ret);
> > +	} else {
> > +		pr_info("Cold memory discard hint enabled\n");
> > +	}
> 
> Should the above two messages be prefixed with "Hyper-V: "?
Not needed, as you also replied later.

> Nit:  Typically the error path undoes things in the reverse order. So
> the disable_page_reporting() would occur before the call to
> vmbus_close().
Sure.

> 
> >  	return ret;
> >  }
> > @@ -1753,6 +1843,9 @@ static int balloon_remove(struct hv_device *dev)
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> >  	unregister_memory_notifier(&hv_memory_nb);
> >  	restore_online_page_callback(&hv_online_page);
> > +#endif
> > +#ifdef CONFIG_PAGE_REPORTING
> > +	disable_page_reporting();
> >  #endif
> 
> Same here regarding the ordering.
Noted.

> > + * The whole argument should fit in a page to be able to pass to the hypervisor
> > + * in one hypercall.
> > + */
> > +#define HV_MEMORY_HINT_MAX_GPA_PAGE_RANGES  \
> > +	((PAGE_SIZE - sizeof(struct hv_memory_hint)) / \
> 
> Use HV_HYP_PAGE_SIZE instead of PAGE_SIZE.
Done.

Thanks for the review.