lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zd7cff43ffbJOGNY@yilunxu-OptiPlex-7050>
Date: Wed, 28 Feb 2024 15:10:53 +0800
From: Xu Yilun <yilun.xu@...ux.intel.com>
To: Marco Pagani <marpagan@...hat.com>
Cc: Moritz Fischer <mdf@...nel.org>, Wu Hao <hao.wu@...el.com>,
	Xu Yilun <yilun.xu@...el.com>, Tom Rix <trix@...hat.com>,
	Jonathan Corbet <corbet@....net>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Alan Tull <atull@...nsource.altera.com>,
	linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
	linux-fpga@...r.kernel.org
Subject: Re: [RFC PATCH v5 1/1] fpga: add an owner and use it to take the
 low-level module's refcount

On Tue, Feb 27, 2024 at 12:49:06PM +0100, Marco Pagani wrote:
> 
> 
> On 2024-02-21 15:37, Xu Yilun wrote:
> > On Tue, Feb 20, 2024 at 12:11:26PM +0100, Marco Pagani wrote:
> >>
> >>
> >> On 2024-02-18 11:05, Xu Yilun wrote:
> >>> On Mon, Feb 05, 2024 at 06:47:34PM +0100, Marco Pagani wrote:
> >>>>
> >>>>
> >>>> On 2024-02-04 06:15, Xu Yilun wrote:
> >>>>> On Fri, Feb 02, 2024 at 06:44:01PM +0100, Marco Pagani wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 2024-01-30 05:31, Xu Yilun wrote:
> >>>>>>>> +#define fpga_mgr_register_full(parent, info) \
> >>>>>>>> +	__fpga_mgr_register_full(parent, info, THIS_MODULE)
> >>>>>>>>  struct fpga_manager *
> >>>>>>>> -fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info);
> >>>>>>>> +__fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info,
> >>>>>>>> +			 struct module *owner);
> >>>>>>>>  
> >>>>>>>> +#define fpga_mgr_register(parent, name, mops, priv) \
> >>>>>>>> +	__fpga_mgr_register(parent, name, mops, priv, THIS_MODULE)
> >>>>>>>>  struct fpga_manager *
> >>>>>>>> -fpga_mgr_register(struct device *parent, const char *name,
> >>>>>>>> -		  const struct fpga_manager_ops *mops, void *priv);
> >>>>>>>> +__fpga_mgr_register(struct device *parent, const char *name,
> >>>>>>>> +		    const struct fpga_manager_ops *mops, void *priv, struct module *owner);
> >>>>>>>> +
> >>>>>>>>  void fpga_mgr_unregister(struct fpga_manager *mgr);
> >>>>>>>>  
> >>>>>>>> +#define devm_fpga_mgr_register_full(parent, info) \
> >>>>>>>> +	__devm_fpga_mgr_register_full(parent, info, THIS_MODULE)
> >>>>>>>>  struct fpga_manager *
> >>>>>>>> -devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info);
> >>>>>>>> +__devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info,
> >>>>>>>> +			      struct module *owner);
> >>>>>>>
> >>>>>>> Add a line here. I can do it myself if you agree.
> >>>>>>
> >>>>>> Sure, that is fine by me. I also spotted a typo in the commit log body
> >>>>>> (in taken -> is taken). Do you want me to send a v6, or do you prefer
> >>>>>> to fix that in place?
> >>>>>
> >>>>> No need, I can fix it.
> >>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> There is still a RFC prefix for this patch. Are you ready to get it merged?
> >>>>>>> If yes, Acked-by: Xu Yilun <yilun.xu@...el.com>
> >>>>>>
> >>>>>> I'm ready for the patch to be merged. However, I recently sent an RFC
> >>>>>> to propose a safer implementation of try_module_get() that would
> >>>>>> simplify the code and may also benefit other subsystems. What do you
> >>>>>> think?
> >>>>>>
> >>>>>> https://lore.kernel.org/linux-modules/20240130193614.49772-1-marpagan@redhat.com/
> >>>>>
> >>>>> I suggest take your fix to linux-fpga/for-next now. If your try_module_get()
> >>>>> proposal is applied before the end of this cycle, we could re-evaluate
> >>>>> this patch.
> >>>>
> >>>> That's fine by me.
> >>>
> >>> Sorry, I still found issues about this solution.
> >>>
> >>> void fpga_mgr_unregister(struct fpga_manager *mgr)
> >>> {
> >>>         dev_info(&mgr->dev, "%s %s\n", __func__, mgr->name);
> >>>
> >>>         /*
> >>>          * If the low level driver provides a method for putting fpga into
> >>>          * a desired state upon unregister, do it.
> >>>          */
> >>>         fpga_mgr_fpga_remove(mgr);
> >>>
> >>>         mutex_lock(&mgr->mops_mutex);
> >>>
> >>>         mgr->mops = NULL;
> >>>
> >>>         mutex_unlock(&mgr->mops_mutex);
> >>>
> >>>         device_unregister(&mgr->dev);
> >>> }
> >>>
> >>> Note that fpga_mgr_unregister() doesn't have to be called in module_exit().
> >>> So if we do fpga_mgr_get() then fpga_mgr_unregister(), We finally had a
> >>> fpga_manager dev without mops, this is not what the user want and cause
> >>> problem when using this fpga_manager dev for other FPGA APIs.
> >>
> >> How about moving mgr->mops = NULL from fpga_mgr_unregister() to
> >> class->dev_release()? In that way, mops will be set to NULL only when the
> >> manager dev refcount reaches 0.
> > 
> > I'm afraid it doesn't help.  The lifecycle of the module and the fpga
> > mgr dev is different.
> > 
> > We use mops = NULL to indicate module has been freed or will be freed in no
> > time.  On the other hand mops != NULL means module is still there, so
> > that try_module_get() could be safely called.  It is possible someone
> > has got fpga mgr dev but not the module yet, at that time the module is
> > unloaded, then try_module_get() triggers crash.
> > 
> >>
> >> If fpga_mgr_unregister() is called from module_exit(), we are sure that nobody
> >> got the manager dev earlier using fpga_mgr_get(), or it would have bumped up
> > 
> > No, someone may get the manager dev but not the module yet, and been
> > scheduled out.
> >
> 
> You are right. Overall, it's a bad idea. How about then using an additional 
> bool flag instead of "overloading" the mops pointer? Something like:
> 
> get:
> 	if (!mgr->owner_valid || !try_module_get(mgr->mops_owner))
> 
> remove:
> 	mgr->owner_valid = false;

I'm not quite sure which function is actually mentioned by "remove".  I
assume it should be fpga_mgr_unregister().  IIUC this flag means no
more reference to fpga mgr, but existing references are still valid.

It works for me. But the name of this flag could be reconsidered to
avoid misunderstanding.  The owner is still valid (we still need to put
the owner) but allows no more reference.  Maybe "owner_inactive"?

I still wanna this owner reference change been splitted, so that
we could simply revert it when the try_module_get_safe() got accepted.

Thanks,
Yilun

> 
> Another possibility that comes to my mind would be to "overload" the owner
> pointer itself by using the ERR_PTR/IS_ERR macros. However, it looks ugly
> to me.
> 
> Thanks,
> Marco
> 
> 
> [...]
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ