Message-ID: <fd66f465-2f6e-8e4d-e010-7584ba43cdf7@huawei.com>
Date: Sat, 30 Aug 2025 17:08:05 +0800
From: liulongfang <liulongfang@...wei.com>
To: Shameer Kolothum <shameerkolothum@...il.com>, Alex Williamson
<alex.williamson@...hat.com>
CC: <jgg@...dia.com>, <jonathan.cameron@...wei.com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linuxarm@...neuler.org>
Subject: Re: [PATCH v8 3/3] hisi_acc_vfio_pci: adapt to new migration
configuration
On 2025/8/22 15:03, Shameer Kolothum wrote:
> On 22/08/2025 03:44, liulongfang wrote:
>> On 2025/8/22 2:01, Alex Williamson wrote:
>>> On Wed, 20 Aug 2025 15:24:35 +0800
>>> Longfang Liu <liulongfang@...wei.com> wrote:
>>>
>>>> On new platforms greater than QM_HW_V3, the migration region has been
>>>> relocated from the VF to the PF. The driver must also be modified
>>>> accordingly to adapt to the new hardware device.
>>>>
>>>> On the older hardware platform QM_HW_V3, the live migration configuration
>>>> region is placed in the latter 32K portion of the VF's BAR2 configuration
>>>> space. On the new hardware platform QM_HW_V4, the live migration
>>>> configuration region also exists in the same 32K area immediately following
>>>> the VF's BAR2, just like on QM_HW_V3.
>>>>
>>>> However, access to this region is now controlled by hardware. Additionally,
>>>> a copy of the live migration configuration region is present in the PF's
>>>> BAR2 configuration space. On the new hardware platform QM_HW_V4, when an
>>>> older version of the driver is loaded, it behaves like QM_HW_V3 and uses
>>>> the configuration region in the VF, ensuring that the live migration
>>>> function continues to work normally. When the new version of the driver is
>>>> loaded, it directly uses the configuration region in the PF. Meanwhile,
>>>> hardware configuration disables the live migration configuration region
>>>> in the VF's BAR2: reads return all 0xF values, and writes are silently
>>>> ignored.
>>>>
>>>> Signed-off-by: Longfang Liu <liulongfang@...wei.com>
>>>> Reviewed-by: Shameer Kolothum <shameerkolothum@...il.com>
>>>> ---
>>>> .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 169 ++++++++++++------
>>>> .../vfio/pci/hisilicon/hisi_acc_vfio_pci.h | 13 ++
>>>> 2 files changed, 130 insertions(+), 52 deletions(-)
>>>>
>>>> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
>>>> index ddb3fd4df5aa..09893d143a68 100644
>>>> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
>>>> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
>>>> @@ -125,6 +125,72 @@ static int qm_get_cqc(struct hisi_qm *qm, u64 *addr)
>>>> return 0;
>>>> }
>>>>
>>>> +static int qm_get_xqc_regs(struct hisi_acc_vf_core_device *hisi_acc_vdev,
>>>> + struct acc_vf_data *vf_data)
>>>> +{
>>>> + struct hisi_qm *qm = &hisi_acc_vdev->vf_qm;
>>>> + struct device *dev = &qm->pdev->dev;
>>>> + u32 eqc_addr, aeqc_addr;
>>>> + int ret;
>>>> +
>>>> + if (hisi_acc_vdev->drv_mode == HW_V3_COMPAT) {
>>>> + eqc_addr = QM_EQC_DW0;
>>>> + aeqc_addr = QM_AEQC_DW0;
>>>> + } else {
>>>> + eqc_addr = QM_EQC_PF_DW0;
>>>> + aeqc_addr = QM_AEQC_PF_DW0;
>>>> + }
>>>> +
>>>> + /* QM_EQC_DW has 7 regs */
>>>> + ret = qm_read_regs(qm, eqc_addr, vf_data->qm_eqc_dw, 7);
>>>> + if (ret) {
>>>> + dev_err(dev, "failed to read QM_EQC_DW\n");
>>>> + return ret;
>>>> + }
>>>> +
>>>> + /* QM_AEQC_DW has 7 regs */
>>>> + ret = qm_read_regs(qm, aeqc_addr, vf_data->qm_aeqc_dw, 7);
>>>> + if (ret) {
>>>> + dev_err(dev, "failed to read QM_AEQC_DW\n");
>>>> + return ret;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> +static int qm_set_xqc_regs(struct hisi_acc_vf_core_device *hisi_acc_vdev,
>>>> + struct acc_vf_data *vf_data)
>>>> +{
>>>> + struct hisi_qm *qm = &hisi_acc_vdev->vf_qm;
>>>> + struct device *dev = &qm->pdev->dev;
>>>> + u32 eqc_addr, aeqc_addr;
>>>> + int ret;
>>>> +
>>>> + if (hisi_acc_vdev->drv_mode == HW_V3_COMPAT) {
>>>> + eqc_addr = QM_EQC_DW0;
>>>> + aeqc_addr = QM_AEQC_DW0;
>>>> + } else {
>>>> + eqc_addr = QM_EQC_PF_DW0;
>>>> + aeqc_addr = QM_AEQC_PF_DW0;
>>>> + }
>>>> +
>>>> + /* QM_EQC_DW has 7 regs */
>>>> + ret = qm_write_regs(qm, eqc_addr, vf_data->qm_eqc_dw, 7);
>>>> + if (ret) {
>>>> + dev_err(dev, "failed to write QM_EQC_DW\n");
>>>> + return ret;
>>>> + }
>>>> +
>>>> + /* QM_AEQC_DW has 7 regs */
>>>> + ret = qm_write_regs(qm, aeqc_addr, vf_data->qm_aeqc_dw, 7);
>>>> + if (ret) {
>>>> + dev_err(dev, "failed to write QM_AEQC_DW\n");
>>>> + return ret;
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> static int qm_get_regs(struct hisi_qm *qm, struct acc_vf_data *vf_data)
>>>> {
>>>> struct device *dev = &qm->pdev->dev;
>>>> @@ -167,20 +233,6 @@ static int qm_get_regs(struct hisi_qm *qm, struct acc_vf_data *vf_data)
>>>> return ret;
>>>> }
>>>>
>>>> - /* QM_EQC_DW has 7 regs */
>>>> - ret = qm_read_regs(qm, QM_EQC_DW0, vf_data->qm_eqc_dw, 7);
>>>> - if (ret) {
>>>> - dev_err(dev, "failed to read QM_EQC_DW\n");
>>>> - return ret;
>>>> - }
>>>> -
>>>> - /* QM_AEQC_DW has 7 regs */
>>>> - ret = qm_read_regs(qm, QM_AEQC_DW0, vf_data->qm_aeqc_dw, 7);
>>>> - if (ret) {
>>>> - dev_err(dev, "failed to read QM_AEQC_DW\n");
>>>> - return ret;
>>>> - }
>>>> -
>>>> return 0;
>>>> }
>>>>
>>>> @@ -239,20 +291,6 @@ static int qm_set_regs(struct hisi_qm *qm, struct acc_vf_data *vf_data)
>>>> return ret;
>>>> }
>>>>
>>>> - /* QM_EQC_DW has 7 regs */
>>>> - ret = qm_write_regs(qm, QM_EQC_DW0, vf_data->qm_eqc_dw, 7);
>>>> - if (ret) {
>>>> - dev_err(dev, "failed to write QM_EQC_DW\n");
>>>> - return ret;
>>>> - }
>>>> -
>>>> - /* QM_AEQC_DW has 7 regs */
>>>> - ret = qm_write_regs(qm, QM_AEQC_DW0, vf_data->qm_aeqc_dw, 7);
>>>> - if (ret) {
>>>> - dev_err(dev, "failed to write QM_AEQC_DW\n");
>>>> - return ret;
>>>> - }
>>>> -
>>>> return 0;
>>>> }
>>>>
>>>> @@ -522,6 +560,10 @@ static int vf_qm_load_data(struct hisi_acc_vf_core_device *hisi_acc_vdev,
>>>> return ret;
>>>> }
>>>>
>>>> + ret = qm_set_xqc_regs(hisi_acc_vdev, vf_data);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> ret = hisi_qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
>>>> if (ret) {
>>>> dev_err(dev, "set sqc failed\n");
>>>> @@ -589,6 +631,10 @@ static int vf_qm_state_save(struct hisi_acc_vf_core_device *hisi_acc_vdev,
>>>> vf_data->vf_qm_state = QM_READY;
>>>> hisi_acc_vdev->vf_qm_state = vf_data->vf_qm_state;
>>>>
>>>> + ret = qm_get_xqc_regs(hisi_acc_vdev, vf_data);
>>>> + if (ret)
>>>> + return ret;
>>>> +
>>>> ret = vf_qm_read_data(vf_qm, vf_data);
>>>> if (ret)
>>>> return ret;
>>>> @@ -1186,34 +1232,52 @@ static int hisi_acc_vf_qm_init(struct hisi_acc_vf_core_device *hisi_acc_vdev)
>>>> {
>>>> struct vfio_pci_core_device *vdev = &hisi_acc_vdev->core_device;
>>>> struct hisi_qm *vf_qm = &hisi_acc_vdev->vf_qm;
>>>> + struct hisi_qm *pf_qm = hisi_acc_vdev->pf_qm;
>>>> struct pci_dev *vf_dev = vdev->pdev;
>>>> + u32 val;
>>>>
>>>> - /*
>>>> - * ACC VF dev BAR2 region consists of both functional register space
>>>> - * and migration control register space. For migration to work, we
>>>> - * need access to both. Hence, we map the entire BAR2 region here.
>>>> - * But unnecessarily exposing the migration BAR region to the Guest
>>>> - * has the potential to prevent/corrupt the Guest migration. Hence,
>>>> - * we restrict access to the migration control space from
>>>> - * Guest(Please see mmap/ioctl/read/write override functions).
>>>> - *
>>>> - * Please note that it is OK to expose the entire VF BAR if migration
>>>> - * is not supported or required as this cannot affect the ACC PF
>>>> - * configurations.
>>>> - *
>>>> - * Also the HiSilicon ACC VF devices supported by this driver on
>>>> - * HiSilicon hardware platforms are integrated end point devices
>>>> - * and the platform lacks the capability to perform any PCIe P2P
>>>> - * between these devices.
>>>> - */
>>>> + val = readl(pf_qm->io_base + QM_MIG_REGION_SEL);
>>>> + if (pf_qm->ver > QM_HW_V3 && (val & QM_MIG_REGION_EN))
>>>> + hisi_acc_vdev->drv_mode = HW_V4_NEW;
>>>> + else
>>>> + hisi_acc_vdev->drv_mode = HW_V3_COMPAT;
>>>>
>>>> - vf_qm->io_base =
>>>> - ioremap(pci_resource_start(vf_dev, VFIO_PCI_BAR2_REGION_INDEX),
>>>> - pci_resource_len(vf_dev, VFIO_PCI_BAR2_REGION_INDEX));
>>>> - if (!vf_qm->io_base)
>>>> - return -EIO;
>>>> + if (hisi_acc_vdev->drv_mode == HW_V4_NEW) {
>>>> + /*
>>>> + * On hardware platforms greater than QM_HW_V3, the migration function
>>>> + * register is placed in the BAR2 configuration region of the PF,
>>>> + * and each VF device occupies 8KB of configuration space.
>>>> + */
>>>> + vf_qm->io_base = pf_qm->io_base + QM_MIG_REGION_OFFSET +
>>>> + hisi_acc_vdev->vf_id * QM_MIG_REGION_SIZE;
>>>> + } else {
>>>> + /*
>>>> + * ACC VF dev BAR2 region consists of both functional register space
>>>> + * and migration control register space. For migration to work, we
>>>> + * need access to both. Hence, we map the entire BAR2 region here.
>>>> + * But unnecessarily exposing the migration BAR region to the Guest
>>>> + * has the potential to prevent/corrupt the Guest migration. Hence,
>>>> + * we restrict access to the migration control space from
>>>> + * Guest(Please see mmap/ioctl/read/write override functions).
>>>> + *
>>>> + * Please note that it is OK to expose the entire VF BAR if migration
>>>> + * is not supported or required as this cannot affect the ACC PF
>>>> + * configurations.
>>>> + *
>>>> + * Also the HiSilicon ACC VF devices supported by this driver on
>>>> + * HiSilicon hardware platforms are integrated end point devices
>>>> + * and the platform lacks the capability to perform any PCIe P2P
>>>> + * between these devices.
>>>> + */
>>>>
>>>> + vf_qm->io_base =
>>>> + ioremap(pci_resource_start(vf_dev, VFIO_PCI_BAR2_REGION_INDEX),
>>>> + pci_resource_len(vf_dev, VFIO_PCI_BAR2_REGION_INDEX));
>>>> + if (!vf_qm->io_base)
>>>> + return -EIO;
>>>> + }
>>>> vf_qm->fun_type = QM_HW_VF;
>>>> + vf_qm->ver = pf_qm->ver;
>>>> vf_qm->pdev = vf_dev;
>>>> mutex_init(&vf_qm->mailbox_lock);
>>>>
>>>> @@ -1539,7 +1603,8 @@ static void hisi_acc_vfio_pci_close_device(struct vfio_device *core_vdev)
>>>> hisi_acc_vf_disable_fds(hisi_acc_vdev);
>>>> mutex_lock(&hisi_acc_vdev->open_mutex);
>>>> hisi_acc_vdev->dev_opened = false;
>>>> - iounmap(vf_qm->io_base);
>>>> + if (hisi_acc_vdev->drv_mode == HW_V3_COMPAT)
>>>> + iounmap(vf_qm->io_base);
>>>> mutex_unlock(&hisi_acc_vdev->open_mutex);
>>>> vfio_pci_core_close_device(core_vdev);
>>>> }
>>>> diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
>>>> index 91002ceeebc1..e7650f5ff0f7 100644
>>>> --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
>>>> +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
>>>> @@ -59,6 +59,18 @@
>>>> #define ACC_DEV_MAGIC_V1 0XCDCDCDCDFEEDAACC
>>>> #define ACC_DEV_MAGIC_V2 0xAACCFEEDDECADEDE
>>>>
>>>> +#define QM_MIG_REGION_OFFSET 0x180000
>>>> +#define QM_MIG_REGION_SIZE 0x2000
>>>> +
>>>> +#define QM_SUB_VERSION_ID 0x100210
>>>> +#define QM_EQC_PF_DW0 0x1c00
>>>> +#define QM_AEQC_PF_DW0 0x1c20
>>>> +
>>>> +enum hw_drv_mode {
>>>> + HW_V3_COMPAT = 0,
>>>> + HW_V4_NEW,
>>>> +};
>>>
>>> You might consider whether these names are going to make sense in the
>>> future if there is a V5 and beyond, and why V3 hardware is going to use a
>>> "compat" name when that's its native operating mode.
>>>
>>
>> If future versions such as V5 or later emerge, we can still handle them
>> by simply adding a new version value.
>> The "compat" naming is intended to convey that newer hardware versions
>> remain compatible with older drivers.
>> For simplicity, we could alternatively rename them directly to HW_ACC_V3,
>> HW_ACC_V4, HW_ACC_V5, etc.
>>
>>> But also, patch 1/ is deciding whether to expose the full BAR based on
>>> the hardware version and here we choose whether to use the VF or PF
>>> control registers based on the hardware version and whether the new
>>> hardware feature is enabled. Doesn't that leave V4 hardware exposing
>>> the full BAR regardless of whether the PF driver has disabled the
>>> migration registers within the BAR? Thanks,
>>>
>>
>> Regarding V4 hardware: the migration registers within the PF's BAR are
>> accessible only by the host driver, just like the other registers in that
>> BAR. When the VF's live migration configuration registers are enabled, the
>> driver can see the full BAR configuration space of the PF. However, at
>> that point, the PF's live migration configuration registers become
>> ineffective for both reads and writes. In other words, on V4 hardware,
>> the VF's configuration domain and the PF's configuration domain are
>> mutually exclusive: only one of them is ever read/write valid at any
>> given time.
>
> Sorry, it is still not clear to me. My understanding was that on V4
> hardware, the VF's live migration config registers will be inactive only
> when you set QM_MIG_REGION_EN in the PF's QM_MIG_REGION_SEL register.
>
> So, I think the question is whether patch 1 needs to check that the PF's
> QM_MIG_REGION_SEL has QM_MIG_REGION_EN set before exposing the full VF
> BAR region. If yes, you need to reorganise patch 1, which currently
> checks only the hardware version to decide that.
>
You're absolutely right. Patch 1 should determine its mode by checking
drv_mode, just as done here, instead of relying on pf_qm->ver.
Thanks.
Longfang.
>
> Thanks,
> Shameer
>
>> Thanks.
>> Longfang.
>>
>>> Alex
>>>
>>>> +
>>>> struct acc_vf_data {
>>>> #define QM_MATCH_SIZE offsetofend(struct acc_vf_data, qm_rsv_state)
>>>> /* QM match information */
>>>> @@ -125,6 +137,7 @@ struct hisi_acc_vf_core_device {
>>>> struct pci_dev *vf_dev;
>>>> struct hisi_qm *pf_qm;
>>>> struct hisi_qm vf_qm;
>>>> + int drv_mode;
>>>> /*
>>>> * vf_qm_state represents the QM_VF_STATE register value.
>>>> * It is set by Guest driver for the ACC VF dev indicating
>>>
>>>
>>>
>
>