lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <SN6PR02MB415774DC14AA213C47C443ABD4BBA@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Fri, 2 Jan 2026 16:27:20 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Nuno Das Neves <nunodasneves@...ux.microsoft.com>,
	"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"skinsburskii@...ux.microsoft.com" <skinsburskii@...ux.microsoft.com>
CC: "kys@...rosoft.com" <kys@...rosoft.com>, "haiyangz@...rosoft.com"
	<haiyangz@...rosoft.com>, "wei.liu@...nel.org" <wei.liu@...nel.org>,
	"decui@...rosoft.com" <decui@...rosoft.com>, "longli@...rosoft.com"
	<longli@...rosoft.com>, "prapal@...ux.microsoft.com"
	<prapal@...ux.microsoft.com>, "mrathor@...ux.microsoft.com"
	<mrathor@...ux.microsoft.com>, "paekkaladevi@...ux.microsoft.com"
	<paekkaladevi@...ux.microsoft.com>
Subject: RE: [PATCH v2 1/3] mshv: Ignore second stats page map result failure

From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Monday, December 29, 2025 4:28 PM
> 
> On 12/8/2025 7:12 AM, Michael Kelley wrote:
> > From: Nuno Das Neves <nunodasneves@...ux.microsoft.com> Sent: Friday, December 5, 2025 10:59 AM
> >>
> >> From: Purna Pavan Chandra Aekkaladevi <paekkaladevi@...ux.microsoft.com>
> >>
> >> Older versions of the hypervisor do not support HV_STATS_AREA_PARENT
> >> and return HV_STATUS_INVALID_PARAMETER for the second stats page
> >> mapping request.
> >>
> >> This results a failure in module init. Instead of failing, gracefully
> >> fall back to populating stats_pages[HV_STATS_AREA_PARENT] with the
> >> already-mapped stats_pages[HV_STATS_AREA_SELF].
> >
> > This explains "what" this patch does. But could you add an explanation of "why"
> > substituting SELF for the unavailable PARENT is the right thing to do? As a somewhat
> > outside reviewer, I don't know enough about SELF vs. PARENT to immediately know
> > why this substitution makes sense.
> >
> I'll attempt to explain. I'm a little hindered by the fact that like many of the
> root interfaces this is not well-documented, but this is my understanding:
> 
> The stats areas HV_STATS_AREA_SELF and HV_STATS_AREA_PARENT indicate the
> privilege level of the data in the mapped stats page.

OK. But evidently that difference in "privilege level" (whatever that means) doesn't
affect what the root partition can do to read and display the data in debugfs, right?

> 
> Both SELF and PARENT contain the same fields, but some fields that are 0 in the
> SELF page may be nonzero in PARENT page, and vice-versa. So, to read all the fields
> we need to map both pages if possible, and prioritize reading non-zero data from
> each field, by checking both the SELF and PARENT pages.

Overall, this mostly makes sense. Each VP and each partition has associated SELF and
PARENT stats pages. For the SELF page, the stats are presumably for the single
associated VP or partition. But "PARENT" terminology usually implies some kind of
hierarchy, as in a parent has one or more children. Parent-level stats would typically
be an aggregate of all its children's stats. But if that's the case here, choosing at runtime
on a per-field stat basis between SELF and PARENT would produce weird results. So
maybe that typical model of "parent" isn't correct here. If SELF and PARENT are only
indicating some kind of privilege level, maybe the PARENT page for each VP and each
partition is like the SELF page -- it contains stats only for the associated VP or partition.

> 
> I don't know if it's possible for a given field to have a different (nonzero) value
> in both SELF and PARENT pages. I imagine in that case we'd want to prioritize the
> PARENT value, but it may simply not be possible.

It would be nice to confirm that this can't happen. If it can happen, that messes
up trying to construct a sensible model of how this all works. :-)

And a somewhat related question: Assuming that a particular stat appears in
either the SELF or the PARENT page, under what circumstances might that stat
move from one to the other, if ever? I would guess that for a given version of the
hypervisor, the split is always the same, across all VPs and all partitions running
on hypervisors of that version. But a different hypervisor version might split the
stats differently between SELF and PARENT. Of course, this stuff is overall a bit
unusual, so my guess might not be right. 

I ask because making a runtime decision between SELF and PARENT for every
individual stat, every time it is read, is conceptually a lot of wasted motion if the
split is static and know-able ahead of time. But I say "conceptually" because I
can't immediately come up with a way to make things faster or more compact if
the split were to be static and know-able ahead of time. So it may be moot point
from an implementation standpoint, but I'm still interested in the answer from
the standpoint of being able to document the overall model of how this works.

> 
> The API is designed in this way to be backward-compatible with older hypervisors
> that didn't have a concept of SELF and PARENT. Hence on older hypervisors (detectable
> via the error code), all we can do is map SELF and use it for everything.

In cases where PARENT can't be mapped by the root partition, does that mean
some of the stats just aren't available? Or does the hypervisor provide all the
stats in the SELF page?

> 
> > Also, does this patch affect the logic in mshv_vp_dispatch_thread_blocked() where
> > a zero value for the SELF version of VpRootDispatchThreadBlocked is replaced by
> > the PARENT value? But that logic seems to be in the reverse direction -- replacing
> > a missing SELF value with the PARENT value -- whereas this patch is about replacing
> > missing PARENT values with SELF values. So are there two separate PARENT vs. SELF
> > issues overall? And after this patch is in place and PARENT values are replaced with
> > SELF on older hypervisor versions, the logic in mshv_vp_dispatch_thread_blocked()
> > then effectively becomes a no-op if the SELF value is zero, and the return value will
> > be zero. Is that problem?
> >
> This is the same issue, because we only care about any nonzero value in
> mshv_vp_dispatch_thread_blocked(). It doesn't matter which page we check first in that
> code, just that any nonzero value is returned as a boolean to indicate a blocked state.
> 
> The code in question could be rewritten:
> 
> return self_vp_cntrs[VpRootDispatchThreadBlocked] ||
> parent_vp_cntrs[VpRootDispatchThreadBlocked];

OK. It would be more consistent to apply the same logic (check SELF then PARENT,
or vice versa) in both mshv_vp_dispatch_thread_blocked() and in this new debugfs
code. As you know, for me inconsistencies always beg the question of "why"? :-)
But that's a minor point.

> 
> >>
> >> Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@...ux.microsoft.com>
> >> Signed-off-by: Nuno Das Neves <nunodasneves@...ux.microsoft.com>
> >> Reviewed-by: Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>
> >> ---
> >>  drivers/hv/mshv_root_hv_call.c | 41 ++++++++++++++++++++++++++++++----
> >>  drivers/hv/mshv_root_main.c    |  3 +++
> >>  2 files changed, 40 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/hv/mshv_root_hv_call.c b/drivers/hv/mshv_root_hv_call.c
> >> index 598eaff4ff29..b1770c7b500c 100644
> >> --- a/drivers/hv/mshv_root_hv_call.c
> >> +++ b/drivers/hv/mshv_root_hv_call.c
> >> @@ -855,6 +855,24 @@ static int hv_call_map_stats_page2(enum hv_stats_object_type type,
> >>  	return ret;
> >>  }
> >>
> >> +static int
> >> +hv_stats_get_area_type(enum hv_stats_object_type type,
> >> +		       const union hv_stats_object_identity *identity)
> >> +{
> >> +	switch (type) {
> >> +	case HV_STATS_OBJECT_HYPERVISOR:
> >> +		return identity->hv.stats_area_type;
> >> +	case HV_STATS_OBJECT_LOGICAL_PROCESSOR:
> >> +		return identity->lp.stats_area_type;
> >> +	case HV_STATS_OBJECT_PARTITION:
> >> +		return identity->partition.stats_area_type;
> >> +	case HV_STATS_OBJECT_VP:
> >> +		return identity->vp.stats_area_type;
> >> +	}
> >> +
> >> +	return -EINVAL;
> >> +}
> >> +
> >>  static int hv_call_map_stats_page(enum hv_stats_object_type type,
> >>  				  const union hv_stats_object_identity *identity,
> >>  				  void **addr)
> >> @@ -863,7 +881,7 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
> >>  	struct hv_input_map_stats_page *input;
> >>  	struct hv_output_map_stats_page *output;
> >>  	u64 status, pfn;
> >> -	int ret = 0;
> >> +	int hv_status, ret = 0;
> >>
> >>  	do {
> >>  		local_irq_save(flags);
> >> @@ -878,11 +896,26 @@ static int hv_call_map_stats_page(enum hv_stats_object_type type,
> >>  		pfn = output->map_location;
> >>
> >>  		local_irq_restore(flags);
> >> -		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
> >> -			ret = hv_result_to_errno(status);
> >> +
> >> +		hv_status = hv_result(status);
> >> +		if (hv_status != HV_STATUS_INSUFFICIENT_MEMORY) {
> >>  			if (hv_result_success(status))
> >>  				break;
> >> -			return ret;
> >> +
> >> +			/*
> >> +			 * Older versions of the hypervisor do not support the
> >> +			 * PARENT stats area. In this case return "success" but
> >> +			 * set the page to NULL. The caller should check for
> >> +			 * this case and instead just use the SELF area.
> >> +			 */
> >> +			if (hv_stats_get_area_type(type, identity) == HV_STATS_AREA_PARENT &&
> >> +			    hv_status == HV_STATUS_INVALID_PARAMETER) {
> >> +				*addr = NULL;
> >> +				return 0;
> >> +			}
> >> +
> >> +			hv_status_debug(status, "\n");
> >> +			return hv_result_to_errno(status);
> >
> > Does the hv_call_map_stats_page2() function need a similar fix? Or is there a linkage
> > in hypervisor functionality where any hypervisor version that supports an overlay GPFN
> > also supports the PARENT stats? If such a linkage is why hv_call_map_stats_page2()
> > doesn't need a similar fix, please add a code comment to that effect in
> > hv_call_map_stats_page2().
> >
> Exactly; hv_call_map_stats_page2() is only available on hypervisors where the PARENT
> page is also available. I'll add a comment.

Thanks.

> 
> >>  		}
> >>
> >>  		ret = hv_call_deposit_pages(NUMA_NO_NODE,
> >> diff --git a/drivers/hv/mshv_root_main.c b/drivers/hv/mshv_root_main.c
> >> index bc15d6f6922f..f59a4ab47685 100644
> >> --- a/drivers/hv/mshv_root_main.c
> >> +++ b/drivers/hv/mshv_root_main.c
> >> @@ -905,6 +905,9 @@ static int mshv_vp_stats_map(u64 partition_id, u32 vp_index,
> >>  	if (err)
> >>  		goto unmap_self;
> >>
> >> +	if (!stats_pages[HV_STATS_AREA_PARENT])
> >> +		stats_pages[HV_STATS_AREA_PARENT] = stats_pages[HV_STATS_AREA_SELF];
> >> +
> >>  	return 0;
> >>
> >>  unmap_self:
> >> --
> >> 2.34.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ