lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 2 Mar 2022 14:59:13 -0800
From:   Iouri Tarassov <iourit@...ux.microsoft.com>
To:     Greg KH <gregkh@...uxfoundation.org>
Cc:     kys@...rosoft.com, haiyangz@...rosoft.com, sthemmin@...rosoft.com,
        wei.liu@...nel.org, linux-hyperv@...r.kernel.org,
        linux-kernel@...r.kernel.org, spronovo@...rosoft.com,
        spronovo@...ux.microsoft.com
Subject: Re: [PATCH v3 02/30] drivers: hv: dxgkrnl: Driver initialization and
 loading

Hi Greg,

Thanks a lot for reviewing the patches.

I appreciate the very detailed comments. I my reply I omitted the
comments, which I agree with and will address in subsequent patches.

On 3/1/2022 12:45 PM, Greg KH wrote:

>
> > +long dxgk_unlocked_ioctl(struct file *f, unsigned int p1, unsigned long p2);
> > +
> > +static inline void guid_to_luid(guid_t *guid, struct winluid *luid)
> > +{
> > +	*luid = *(struct winluid *)&guid->b[0];
>
> Why is the cast needed?  Shouldn't you use real types in your
> structures?


The VMBus channel ID, which is GUID, is present in the PCI config space
of the compute device. The convention is that the lower part of the GUID
is LUID (local unique identifier). This function just converts GUID to
LUID. I am not sure what is the ask here.


> > +/*
> > + * The interface version is used to ensure that the host and the guest use the
> > + * same VM bus protocol. It needs to be incremented every time the VM bus
> > + * interface changes. DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION is
> > + * incremented each time the earlier versions of the interface are no longer
> > + * compatible with the current version.
> > + */
> > +#define DXGK_VMBUS_INTERFACE_VERSION_OLD		27
> > +#define DXGK_VMBUS_INTERFACE_VERSION			40
> > +#define DXGK_VMBUS_LAST_COMPATIBLE_INTERFACE_VERSION	16
>
> Where do these numbers come from, the hypervisor specification?


The definitions are for the VMBus interface between the virtual compute
device in the guest and the Windows host. The interface version is
updated when the VMBus protocol is changed. These are specific to the
virtual compute device, not to the hypervisor.


> > +/*
> > + * Pointer to the global device data. By design
> > + * there is a single vGPU device on the VM bus and a single /dev/dxg device
> > + * is created.
> > + */
> > +struct dxgglobal *dxgglobal;
>
> No, make this per-device, NEVER have a single device for your driver.
> The Linux driver model makes it harder to do it this way than to do it
> correctly.  Do it correctly please and have no global structures like
> this.


This will be addressed in the next version of patches.


> > +#define DXGKRNL_VERSION			0x2216
>
> What does this mean?


This is the driver implementation version. There are many Microsoft
shipped Linux kernels, which include the driver. The version allows to
quickly determine, which level of functionality or bug fixes is
implemented in the current driver.


> > +
> > +/*
> > + * Interface with the PCI driver
> > + */
> > +
> > +/*
> > + * Part of the PCI config space of the vGPU device is used for vGPU
> > + * configuration data. Reading/writing of the PCI config space is forwarded
> > + * to the host.
> > + */
>
> > +/* DXGK_VMBUS_INTERFACE_VERSION (u32) */
>
> What is this?


This comment explains that the value of DXGK_VMBUS_INTERFACE_VERSION is
located at the offset DXGK_VMBUS_CHANNEL_ID_OFFSET in the PCI config space.


> > +#define DXGK_VMBUS_VERSION_OFFSET	(DXGK_VMBUS_CHANNEL_ID_OFFSET + \
> > +					sizeof(guid_t))
>
> offsetof() is your friend, please use it.


There is no structure definition for the PCI config space of the device.
Therefore, offsets are computed by using data size.


> > +/* Luid of the virtual GPU on the host (struct winluid) */
>
> What is a "luid"?


LUID is "Locally Unique Identifier". It is used in Windows and its value
is guaranteed to be unique until the computer reboots. This is similar
to GUID, but is it not globally unique. Dxgkrnl on the host uses LUIDs
to identify compute adapter objects.
I added a comment to the definition in the header.


> > +	ret = vmbus_driver_register(&dxg_drv);
> > +	if (ret) {
> > +		pr_err("vmbus_driver_register failed: %d", ret);
> > +		return ret;
> > +	}
> > +	dxgglobal->vmbus_registered = true;
> > +
> > +	pr_info("%s  Version: %x", __func__, DXGKRNL_VERSION);
>
> When drivers work, they are quiet.
>
> Also, in-kernel modules do not have a version, they are part of the
> kernel tree, and that's the "version" they have.  You can drop the
> individual version here, it makes no sense anymore.


Microsoft develops many Linux kernels when the dxgkrnl driver is
included (EFLOW, Android, WSL). The kernels are built at different times
and might include a different version of the driver. Printing the
version allows to quickly determine what functionality and bug fixes are
included in the driver.

I see that other in-tree drivers print some information during
init_module (like nfblock.c).

This is done for convenience. So instead of tracking multiple kernel
versions, we can track the dxgkrnl driver version. I am ok to remove it
if it really hurts something. It is only a single line per driver load.


Thanks

Iouri

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ