lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZvJ/6wHoU9VXJKh8@yilunxu-OptiPlex-7050>
Date: Tue, 24 Sep 2024 17:01:31 +0800
From: Xu Yilun <yilun.xu@...ux.intel.com>
To: Peter Colberg <peter.colberg@...el.com>
Cc: Wu Hao <hao.wu@...el.com>, Tom Rix <trix@...hat.com>,
	Moritz Fischer <mdf@...nel.org>, Xu Yilun <yilun.xu@...el.com>,
	linux-fpga@...r.kernel.org, linux-kernel@...r.kernel.org,
	Russ Weight <russ.weight@...ux.dev>,
	Marco Pagani <marpagan@...hat.com>,
	Matthew Gerlach <matthew.gerlach@...ux.intel.com>,
	Russ Weight <russell.h.weight@...el.com>
Subject: Re: [PATCH v3 9/9] fpga: dfl: fix kernel warning on port
 release/assign for SRIOV

On Thu, Sep 19, 2024 at 04:34:30PM -0400, Peter Colberg wrote:
> From: Xu Yilun <yilun.xu@...el.com>
> 

Please describe what this patch does at the beginning. And below
background descriptions follow.

> With the Intel FPGA PAC D5005, DFL ports are registered as platform
> devices in PF mode. The port device must be removed from the host when
> the user wants to configure the port as a VF for use by a user-space
> driver, e.g., for pass-through to a virtual machine. The FME device
> ioctls DFL_FPGA_FME_PORT_RELEASE/ASSIGN are assigned for this purpose.
> 
> In the previous implementation, the port platform device is not
> completely destroyed on port release: it is removed from the system by
> platform_device_del(), but the platform device instance is retained.
> When DFL_FPGA_FME_PORT_ASSIGN is called, the platform device is added
> back with platform_device_add(), which conflicts with this comment of
> device_add(): "Do not call this routine more than once for any device
> structure", and would previously cause a kernel warning at runtime.
> 
> This patch completely unregisters the port platform device on release
> and registers a new device on assign. But the main work is to remove

The main work of this series, not this patch.

> the dependency on struct dfl_feature_platform_data for many internal DFL
> APIs. This structure holds many DFL enumeration infos for feature
> devices. Many DFL APIs are expected to work with these infos even when
> the port platform device is unregistered. But after this change, the
> platform_data will be freed on port release. Hence this patch introduces
> a new structure dfl_feature_dev_data, which acts similarly to the
> previous dfl_feature_platform_data. dfl_feature_platform_data then only
> needs a pointer to dfl_feature_dev_data to query DFL enumeration infos.
> 
> Signed-off-by: Xu Yilun <yilun.xu@...el.com>
> Signed-off-by: Russ Weight <russell.h.weight@...el.com>
> Signed-off-by: Peter Colberg <peter.colberg@...el.com>
> Reviewed-by: Matthew Gerlach <matthew.gerlach@...ux.intel.com>
> ---
>  drivers/fpga/dfl-fme-br.c |   2 -
>  drivers/fpga/dfl.c        | 207 ++++++++++++++++++--------------------
>  drivers/fpga/dfl.h        |  57 +++++++----
>  3 files changed, 133 insertions(+), 133 deletions(-)
> 
> diff --git a/drivers/fpga/dfl-fme-br.c b/drivers/fpga/dfl-fme-br.c
> index 5c60a38ec76c..a298a041877b 100644
> --- a/drivers/fpga/dfl-fme-br.c
> +++ b/drivers/fpga/dfl-fme-br.c
> @@ -85,8 +85,6 @@ static void fme_br_remove(struct platform_device *pdev)
>  
>  	fpga_bridge_unregister(br);
>  
> -	if (priv->port_fdata)
> -		put_device(&priv->port_fdata->dev->dev);

I can't remember why all the get_device/put_device() are not needed anymore.

>  	if (priv->port_ops)
>  		dfl_fpga_port_ops_put(priv->port_ops);
>  }
> diff --git a/drivers/fpga/dfl.c b/drivers/fpga/dfl.c
> index 52f58d029ca4..a77d7692b170 100644
> --- a/drivers/fpga/dfl.c
> +++ b/drivers/fpga/dfl.c
> @@ -160,7 +160,7 @@ struct dfl_fpga_port_ops *dfl_fpga_port_ops_get(struct dfl_feature_dev_data *fda
>  
>  	list_for_each_entry(ops, &dfl_port_ops_list, node) {
>  		/* match port_ops using the name of platform device */
> -		if (!strcmp(fdata->dev->name, ops->name)) {
> +		if (!strcmp(fdata->pdev_name, ops->name)) {
>  			if (!try_module_get(ops->owner))
>  				ops = NULL;
>  			goto done;
> @@ -681,7 +681,6 @@ EXPORT_SYMBOL_GPL(dfl_fpga_dev_ops_unregister);
>   * @nr_irqs: number of irqs for all feature devices.
>   * @irq_table: Linux IRQ numbers for all irqs, indexed by local irq index of
>   *	       this device.
> - * @feature_dev: current feature device.
>   * @type: the current FIU type.
>   * @ioaddr: header register region address of current FIU in enumeration.
>   * @start: register resource start of current FIU.
> @@ -695,7 +694,6 @@ struct build_feature_devs_info {
>  	unsigned int nr_irqs;
>  	int *irq_table;
>  
> -	struct platform_device *feature_dev;
>  	enum dfl_id_type type;
>  	void __iomem *ioaddr;
>  	resource_size_t start;
> @@ -736,7 +734,6 @@ static void dfl_fpga_cdev_add_port_data(struct dfl_fpga_cdev *cdev,
>  {
>  	mutex_lock(&cdev->lock);
>  	list_add(&fdata->node, &cdev->port_dev_list);
> -	get_device(&fdata->dev->dev);
>  	mutex_unlock(&cdev->lock);
>  }
>  
> @@ -744,7 +741,6 @@ static struct dfl_feature_dev_data *
>  binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  {
>  	enum dfl_id_type type = binfo->type;
> -	struct platform_device *fdev = binfo->feature_dev;
>  	struct dfl_feature_info *finfo, *p;
>  	struct dfl_feature_dev_data *fdata;
>  	int ret, index = 0, res_idx = 0;
> @@ -752,18 +748,27 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  	if (WARN_ON_ONCE(type >= DFL_ID_MAX))
>  		return ERR_PTR(-EINVAL);
>  
> -	/*
> -	 * we do not need to care for the memory which is associated with
> -	 * the platform device. After calling platform_device_unregister(),
> -	 * it will be automatically freed by device's release() callback,
> -	 * platform_device_release().
> -	 */
> -	fdata = kzalloc(struct_size(fdata, features, binfo->feature_num), GFP_KERNEL);
> +	fdata = devm_kzalloc(binfo->dev, sizeof(*fdata), GFP_KERNEL);
>  	if (!fdata)
>  		return ERR_PTR(-ENOMEM);
>  
> -	fdata->dev = fdev;
> +	fdata->features = devm_kcalloc(binfo->dev, binfo->feature_num,
> +				       sizeof(*fdata->features), GFP_KERNEL);
> +	if (!fdata->features)
> +		return ERR_PTR(-ENOMEM);
> +
> +	fdata->resources = devm_kcalloc(binfo->dev, binfo->feature_num,
> +					sizeof(*fdata->resources), GFP_KERNEL);
> +	if (!fdata->resources)
> +		return ERR_PTR(-ENOMEM);
> +
>  	fdata->type = type;
> +
> +	fdata->pdev_id = dfl_id_alloc(type, binfo->dev);
> +	if (fdata->pdev_id < 0)
> +		return ERR_PTR(fdata->pdev_id);
> +
> +	fdata->pdev_name = dfl_devs[type].name;
>  	fdata->num = binfo->feature_num;
>  	fdata->dfl_cdev = binfo->cdev;
>  	fdata->id = FEATURE_DEV_ID_UNUSED;
> @@ -779,15 +784,6 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  	 */
>  	WARN_ON(fdata->disable_count);
>  
> -	fdev->dev.platform_data = fdata;
> -
> -	/* each sub feature has one MMIO resource */
> -	fdev->num_resources = binfo->feature_num;
> -	fdev->resource = kcalloc(binfo->feature_num, sizeof(*fdev->resource),
> -				 GFP_KERNEL);
> -	if (!fdev->resource)
> -		return ERR_PTR(-ENOMEM);
> -
>  	/* fill features and resource information for feature dev */
>  	list_for_each_entry_safe(finfo, p, &binfo->sub_features, node) {
>  		struct dfl_feature *feature = &fdata->features[index++];
> @@ -795,7 +791,6 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  		unsigned int i;
>  
>  		/* save resource information for each feature */
> -		feature->dev = fdev;
>  		feature->id = finfo->fid;
>  		feature->revision = finfo->revision;
>  		feature->dfh_version = finfo->dfh_version;
> @@ -804,8 +799,10 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  			feature->params = devm_kmemdup(binfo->dev,
>  						       finfo->params, finfo->param_size,
>  						       GFP_KERNEL);
> -			if (!feature->params)
> -				return ERR_PTR(-ENOMEM);
> +			if (!feature->params) {
> +				ret = -ENOMEM;
> +				goto err_free_id;
> +			}
>  
>  			feature->param_size = finfo->param_size;
>  		}
> @@ -823,19 +820,20 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  						      &finfo->mmio_res);
>  			if (IS_ERR(feature->ioaddr)) {
>  				ret = PTR_ERR(feature->ioaddr);
> -				return ERR_PTR(ret);
> +				goto err_free_id;
>  			}
>  		} else {
>  			feature->resource_index = res_idx;
> -			fdev->resource[res_idx++] = finfo->mmio_res;
> +			fdata->resources[res_idx++] = finfo->mmio_res;
>  		}
>  
>  		if (finfo->nr_irqs) {
>  			ctx = devm_kcalloc(binfo->dev, finfo->nr_irqs,
>  					   sizeof(*ctx), GFP_KERNEL);
> -			if (!ctx)
> -				return ERR_PTR(-ENOMEM);
> -
> +			if (!ctx) {
> +				ret = -ENOMEM;
> +				goto err_free_id;
> +			}
>  			for (i = 0; i < finfo->nr_irqs; i++)
>  				ctx[i].irq =
>  					binfo->irq_table[finfo->irq_base + i];
> @@ -848,36 +846,67 @@ binfo_create_feature_dev_data(struct build_feature_devs_info *binfo)
>  		kfree(finfo);
>  	}
>  
> +	fdata->resource_num = res_idx;
> +
>  	return fdata;
> +
> +err_free_id:
> +	dfl_id_free(type, fdata->pdev_id);
> +
> +	return ERR_PTR(ret);
>  }
>  
> -static int
> -build_info_create_dev(struct build_feature_devs_info *binfo)
> +/*
> + * register current feature device, it is called when we need to switch to
> + * another feature parsing or we have parsed all features on given device
> + * feature list.
> + */
> +static int feature_dev_register(struct dfl_feature_dev_data *fdata)
>  {
> -	enum dfl_id_type type = binfo->type;
> +	struct dfl_feature_platform_data pdata = {};
>  	struct platform_device *fdev;
> +	struct dfl_feature *feature;
> +	int ret;
>  
> -	/*
> -	 * we use -ENODEV as the initialization indicator which indicates
> -	 * whether the id need to be reclaimed
> -	 */
> -	fdev = platform_device_alloc(dfl_devs[type].name, -ENODEV);
> +	fdev = platform_device_alloc(fdata->pdev_name, fdata->pdev_id);

I see a part of the change is to delay the platform_device_alloc() so
that the feature platform device registration work could be gathered in
one function. Not sure if it is necessary for the port platform release
issue, or could be an independent enhancement patch.

I raise this cause I still feel hard to understand all these changes
in this patch and look for any chance to further split it.


BTW: Overall this series is much improved than the last version.

Thanks,
Yilun

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ