lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
 <SN6PR02MB4157592293F7501A4E5C73E4D45FA@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Wed, 23 Jul 2025 14:13:10 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Naman Jain <namjain@...ux.microsoft.com>, "K . Y . Srinivasan"
	<kys@...rosoft.com>, Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu
	<wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>
CC: Roman Kisel <romank@...ux.microsoft.com>, Anirudh Rayabharam
	<anrayabh@...ux.microsoft.com>, Saurabh Sengar <ssengar@...ux.microsoft.com>,
	Stanislav Kinsburskii <skinsburskii@...ux.microsoft.com>, Nuno Das Neves
	<nunodasneves@...ux.microsoft.com>, ALOK TIWARI <alok.a.tiwari@...cle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>
Subject: RE: [PATCH v5 2/2] Drivers: hv: Introduce mshv_vtl driver

From: Naman Jain <namjain@...ux.microsoft.com> Sent: Wednesday, July 23, 2025 2:59 AM
> 
> On 7/22/2025 10:39 PM, Michael Kelley wrote:
> > From: Naman Jain <namjain@...ux.microsoft.com> Sent: Tuesday, July 22, 2025 4:09 AM
> >>
> >> On 7/18/2025 8:37 PM, Michael Kelley wrote:
> >>> From: Naman Jain <namjain@...ux.microsoft.com> Sent: Thursday, July 17, 2025 9:36 PM
> >>>>
> >>>> On 7/9/2025 10:49 PM, Michael Kelley wrote:
> >>>>> From: Naman Jain <namjain@...ux.microsoft.com> Sent: Wednesday, June 11, 2025 12:27 AM
> >>>
> >>> [snip]
> >>>
> >>>>
> >>>>> Separately, "allow_bitmap" size is 64K bytes, or 512K bits. Is that the
> >>>>> correct size?  From looking at mshv_vtl_hvcall_is_allowed(), I think this
> >>>>> bitmap is indexed by the HV call code, which is a 16 bit value. So you
> >>>>> only need 64K bits, and the size is too big by a factor of 8. In any case,
> >>>>> it seems like the size should not be expressed in terms of PAGE_SIZE.
> >>>>
> >>>> There are HVcall codes which are of type u16. So max(HVcall code) =
> >>>> 0xffff.
> >>>>
> >>>> For every HVcall that needs to be allowed, we are saving HVcall code
> >>>> info in a bitmap in below fashion:
> >>>> if x = HVCall code and bitmap is an array of u64, of size
> >>>> ((0xffff/64=1023) + 1)
> >>>>
> >>>> bitmap[x / 64] = (u64)1 << (x%64);
> >>>>
> >>>> Later on in mshv_vtl_hvcall_is_allowed(), we calculate the array index
> >>>> by dividing it by 64, and then see if call_code/64 bit is set.
> >>>
> >>> I didn't add comments in mshv_vtl_hvcall_is_allowed(), but that code
> >>> can be simplified by recognizing that the Linux kernel bitmap utilities
> >>> can operate on bitmaps that are much larger than just 64 bits. Let's
> >>> assume that the allow_bitmap field in struct mshv_vtl_hvcall_fds has
> >>> 64K bits, regardless of whether it is declared as an array of u64,
> >>> an array of u16, or an array of u8. Then mshv_vtl_hvcall_is_allowed()
> >>> can be implemented as a single line:
> >>>
> >>> 	return test_bit(call_code, fd->allow_bitmap);
> >>>
> >>> There's no need to figure out which array element contains the bit,
> >>> or to construct a mask to select that particular bit in the array element.
> >>> And since call_code is a u16, test_bit won't access outside the allocated
> >>> 64K bits.
> >>>
> >>
> >> I understood it now. This works and is much better. Will incorporate it
> >> in next patch.
> >>
> >>>>
> >>>> Coming to size of allow_bitmap[], it is independent of PAGE_SIZE, and
> >>>> can be safely initialized to 1024 (reducing by a factor of 8).
> >>>> bitmap_size's maximum value is going to be 1024 in current
> >>>> implementation, picking u64 was not mandatory, u16 will also work. Also,
> >>>> item_index is also u16, so I should make bitmap_size as u16.
> >>>
> >>> The key question for me is whether bitmap_size describes the number
> >>> of bits in allow_bitmap, or whether it describes the number of array
> >>> elements in the declared allow_bitmap array. It's more typical to
> >>> describe a bitmap size as the number of bits. Then the value is
> >>> independent of the array element size, as the array element size
> >>> usually doesn't really matter anyway if using the Linux kernel's
> >>> bitmap utilities. The array element size only matters in allocating
> >>> the correct amount of space is for whatever number of bits are
> >>> needed in the bitmap.
> >>>
> >>
> >> I tried to put your suggestions in code. Please let me know if below
> >> works. I tested this and it works. Just that, I am a little hesitant in
> >> changing things on Userspace side of it, which passes these parameters
> >> in IOCTL. This way, userspace remains the same, the confusion of names
> >> may go away, and the code becomes simpler.
> >
> > Yes, I suspected there might be constraints due to not wanting
> > to disturb existing user space code.
> >
> >>
> >> --- a/include/uapi/linux/mshv.h
> >> +++ b/include/uapi/linux/mshv.h
> >> @@ -332,7 +332,7 @@ struct mshv_vtl_set_poll_file {
> >>    };
> >>
> >>    struct mshv_vtl_hvcall_setup {
> >> -       __u64 bitmap_size;
> >> +       __u64 bitmap_array_size;
> >>           __u64 allow_bitmap_ptr; /* pointer to __u64 */
> >>    };
> >>
> >>
> >> --- a/drivers/hv/mshv_vtl_main.c
> >> +++ b/drivers/hv/mshv_vtl_main.c
> >> @@ -52,10 +52,12 @@ static bool has_message;
> >>    static struct eventfd_ctx *flag_eventfds[HV_EVENT_FLAGS_COUNT];
> >>    static DEFINE_MUTEX(flag_lock);
> >>    static bool __read_mostly mshv_has_reg_page;
> >> -#define MAX_BITMAP_SIZE 1024
> >> +
> >> +/* hvcall code is of type u16, allocate a bitmap of size (1 << 16) to accomodate it */
> >
> > s/accomodate/accommodate/
> 
> Acked.
> 
> >
> >> +#define MAX_BITMAP_SIZE (1 << 16)
> >>
> >>    struct mshv_vtl_hvcall_fd {
> >> -       u64 allow_bitmap[MAX_BITMAP_SIZE];
> >> +       u64 allow_bitmap[MAX_BITMAP_SIZE / 64];
> >
> > OK, this seems like a reasonable approach to get the correct
> > number of bits.
> >
> >>           bool allow_map_initialized;
> >>           /*
> >>            * Used to protect hvcall setup in IOCTLs
> >>
> >> @@ -1204,12 +1207,12 @@ static int mshv_vtl_hvcall_do_setup(struct mshv_vtl_hvcall_fd *fd,
> >>                              sizeof(struct mshv_vtl_hvcall_setup))) {
> >>                   return -EFAULT;
> >>           }
> >> -       if (hvcall_setup.bitmap_size > ARRAY_SIZE(fd->allow_bitmap)) {
> >> +       if (hvcall_setup.bitmap_array_size > ARRAY_SIZE(fd->allow_bitmap)) {
> >>                   return -EINVAL;
> >>           }
> >>           if (copy_from_user(&fd->allow_bitmap,
> >>                              (void __user *)hvcall_setup.allow_bitmap_ptr,
> >> -                          hvcall_setup.bitmap_size)) {
> >> +                          hvcall_setup.bitmap_array_size)) {
> >
> > This still doesn't work. copy_from_user() expects a byte count as its
> > 3rd argument. If we have 64K bits in the bitmap, that means 8K bytes,
> > and 1K for bitmap_array_size. So the value of the 3rd argument
> > here is 1K, and you'll be copying 1K bytes when you want to be
> > copying 8K bytes. The 3rd argument needs to be:
> >
> > 		hv_call_setup.bitmap_array_size * sizeof(u64)
> >
> > It's a bit messy to have to add this multiply, and in the bigger
> > picture it might be cleaner to have the bitmap_array be
> > declared in units of u8 instead of u64, but that would affect
> > user space in a way that you probably want to avoid.
> >
> > Have you checked that user space is passing in the correct values
> > to the ioctl? If the kernel side code is wrong, user space might be
> > wrong as well. And if user space is wrong, then you might have
> > the flexibility to change everything to use u8 instead of u64.
> >
> 
> 
> This is correct in Userspace:
> https://github.com/microsoft/openvmm/blob/main/openhcl/hcl/src/ioctl.rs#L905
> 

I'm not really Rust literate, but it does look like userspace is
setting the bitmap_size to the number of *bytes*. User space has
used u64 as its underlying type, which is OK as long as the ioctl
API is understood to take a byte count in the bitmap_size field,
and user space populates that field with a byte count, which it does.

> This was wrong in kernel code internally from the beginning. This did
> not get caught because we took a large array size (2 * PAGE_SIZE) which
> never failed the check
> (hvcall_setup.bitmap_array_size > ARRAY_SIZE(fd->allow_bitmap))
> and hvcall_setup.bitmap_size always had number of bytes to copy.
> 
> We can make allow_bitmap as an array of u8 here, irrespective of how
> userspace is setting the bits in the bitmap because it is finally doing
> the same thing. I'll make the change.

Yep.  So on the kernel side, bitmap_array_size is a *byte* count,
and user space and the kernel side are in agreement. ;-)

> 
> 
> >>                   return -EFAULT;
> >>           }
> >>
> >> @@ -1221,11 +1224,7 @@ static int mshv_vtl_hvcall_do_setup(struct mshv_vtl_hvcall_fd *fd,
> >>
> >>    static bool mshv_vtl_hvcall_is_allowed(struct mshv_vtl_hvcall_fd *fd, u16 call_code)
> >>    {
> >> -       u8 bits_per_item = 8 * sizeof(fd->allow_bitmap[0]);
> >> -       u16 item_index = call_code / bits_per_item;
> >> -       u64 mask = 1ULL << (call_code % bits_per_item);
> >> -
> >> -       return fd->allow_bitmap[item_index] & mask;
> >> +       return test_bit(call_code, (unsigned long *)fd->allow_bitmap);
> >>    }
> >
> > Yes, the rest of this looks good.
> >
> > Michael
> 
> For CPU hotplug, I checked internally and we do not support hotplug in
> VTL2 for OpenHCL. Hotplugging vCPUs in VTL0 is transparent to VTL2 and
> these CPUs remain online in VTL2. VTL2 also won't be hotplugging CPUs
> when VTL0 is running.

Good to know.

> 
> I am planning to put a comment mentioning this in the two places where
> we check for cpu_online() in the driver. Preventing hotplug from
> happening through hotplug notifiers is a bit overkill for me, as of now,
> but if you feel that should be done, I will work on it.

I'm marginally OK with just a comment because VTL2 code is not
running general-purpose workloads.

> 
> Since good enough changes are done in this version, I will send the next
> version soon, unless there are follow-up comments here which needs
> something to change. We can then have a fresh look at it and work
> together to enhance it.

Sounds good.

Michael

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ