Message-ID: <08ba8bce-0ebf-4c8f-952d-a6665dc7fdf9@redhat.com>
Date: Mon, 4 Mar 2024 12:51:20 +0100
From: Marco Pagani <marpagan@...hat.com>
To: Xu Yilun <yilun.xu@...ux.intel.com>
Cc: Moritz Fischer <mdf@...nel.org>, Wu Hao <hao.wu@...el.com>,
 Xu Yilun <yilun.xu@...el.com>, Tom Rix <trix@...hat.com>,
 Jonathan Corbet <corbet@....net>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Alan Tull <atull@...nsource.altera.com>, linux-kernel@...r.kernel.org,
 linux-doc@...r.kernel.org, linux-fpga@...r.kernel.org
Subject: Re: [RFC PATCH v5 1/1] fpga: add an owner and use it to take the
 low-level module's refcount



On 2024-02-28 08:10, Xu Yilun wrote:
> On Tue, Feb 27, 2024 at 12:49:06PM +0100, Marco Pagani wrote:
>>
>>
>> On 2024-02-21 15:37, Xu Yilun wrote:
>>> On Tue, Feb 20, 2024 at 12:11:26PM +0100, Marco Pagani wrote:
>>>>
>>>>
>>>> On 2024-02-18 11:05, Xu Yilun wrote:
>>>>> On Mon, Feb 05, 2024 at 06:47:34PM +0100, Marco Pagani wrote:
>>>>>>
>>>>>>
>>>>>> On 2024-02-04 06:15, Xu Yilun wrote:
>>>>>>> On Fri, Feb 02, 2024 at 06:44:01PM +0100, Marco Pagani wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2024-01-30 05:31, Xu Yilun wrote:
>>>>>>>>>> +#define fpga_mgr_register_full(parent, info) \
>>>>>>>>>> +	__fpga_mgr_register_full(parent, info, THIS_MODULE)
>>>>>>>>>>  struct fpga_manager *
>>>>>>>>>> -fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info);
>>>>>>>>>> +__fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info,
>>>>>>>>>> +			 struct module *owner);
>>>>>>>>>>  
>>>>>>>>>> +#define fpga_mgr_register(parent, name, mops, priv) \
>>>>>>>>>> +	__fpga_mgr_register(parent, name, mops, priv, THIS_MODULE)
>>>>>>>>>>  struct fpga_manager *
>>>>>>>>>> -fpga_mgr_register(struct device *parent, const char *name,
>>>>>>>>>> -		  const struct fpga_manager_ops *mops, void *priv);
>>>>>>>>>> +__fpga_mgr_register(struct device *parent, const char *name,
>>>>>>>>>> +		    const struct fpga_manager_ops *mops, void *priv, struct module *owner);
>>>>>>>>>> +
>>>>>>>>>>  void fpga_mgr_unregister(struct fpga_manager *mgr);
>>>>>>>>>>  
>>>>>>>>>> +#define devm_fpga_mgr_register_full(parent, info) \
>>>>>>>>>> +	__devm_fpga_mgr_register_full(parent, info, THIS_MODULE)
>>>>>>>>>>  struct fpga_manager *
>>>>>>>>>> -devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info);
>>>>>>>>>> +__devm_fpga_mgr_register_full(struct device *parent, const struct fpga_manager_info *info,
>>>>>>>>>> +			      struct module *owner);
>>>>>>>>>
>>>>>>>>> Add a line here. I can do it myself if you agree.
>>>>>>>>
>>>>>>>> Sure, that is fine by me. I also spotted a typo in the commit log body
>>>>>>>> (in taken -> is taken). Do you want me to send a v6, or do you prefer
>>>>>>>> to fix that in place?
>>>>>>>
>>>>>>> No need, I can fix it.
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> There is still a RFC prefix for this patch. Are you ready to get it merged?
>>>>>>>>> If yes, Acked-by: Xu Yilun <yilun.xu@...el.com>
>>>>>>>>
>>>>>>>> I'm ready for the patch to be merged. However, I recently sent an RFC
>>>>>>>> to propose a safer implementation of try_module_get() that would
>>>>>>>> simplify the code and may also benefit other subsystems. What do you
>>>>>>>> think?
>>>>>>>>
>>>>>>>> https://lore.kernel.org/linux-modules/20240130193614.49772-1-marpagan@redhat.com/
>>>>>>>
>>>>>>> I suggest taking your fix to linux-fpga/for-next now. If your
>>>>>>> try_module_get() proposal is applied before the end of this cycle, we
>>>>>>> could re-evaluate this patch.
>>>>>>
>>>>>> That's fine by me.
>>>>>
>>>>> Sorry, I still found issues with this solution.
>>>>>
>>>>> void fpga_mgr_unregister(struct fpga_manager *mgr)
>>>>> {
>>>>>         dev_info(&mgr->dev, "%s %s\n", __func__, mgr->name);
>>>>>
>>>>>         /*
>>>>>          * If the low level driver provides a method for putting fpga into
>>>>>          * a desired state upon unregister, do it.
>>>>>          */
>>>>>         fpga_mgr_fpga_remove(mgr);
>>>>>
>>>>>         mutex_lock(&mgr->mops_mutex);
>>>>>
>>>>>         mgr->mops = NULL;
>>>>>
>>>>>         mutex_unlock(&mgr->mops_mutex);
>>>>>
>>>>>         device_unregister(&mgr->dev);
>>>>> }
>>>>>
>>>>> Note that fpga_mgr_unregister() doesn't have to be called in module_exit().
>>>>> So if we do fpga_mgr_get() then fpga_mgr_unregister(), we end up with an
>>>>> fpga_manager dev without mops. This is not what the user wants, and it
>>>>> causes problems when using this fpga_manager dev for other FPGA APIs.
>>>>
>>>> How about moving mgr->mops = NULL from fpga_mgr_unregister() to
>>>> class->dev_release()? In that way, mops will be set to NULL only when the
>>>> manager dev refcount reaches 0.
>>>
>>> I'm afraid it doesn't help.  The lifecycles of the module and the fpga
>>> mgr dev are different.
>>>
>>> We use mops = NULL to indicate that the module has been freed or will be
>>> freed very soon.  On the other hand, mops != NULL means the module is still
>>> there, so that try_module_get() can be safely called.  It is possible that
>>> someone has got the fpga mgr dev but not the module yet; if the module is
>>> unloaded at that point, try_module_get() triggers a crash.
>>>
>>>>
>>>> If fpga_mgr_unregister() is called from module_exit(), we are sure that nobody
>>>> got the manager dev earlier using fpga_mgr_get(), or it would have bumped up
>>>
>>> No, someone may get the manager dev but not the module yet, and then be
>>> scheduled out.
>>>
>>
>> You are right. Overall, it's a bad idea. How about then using an additional 
>> bool flag instead of "overloading" the mops pointer? Something like:
>>
>> get:
>> 	if (!mgr->owner_valid || !try_module_get(mgr->mops_owner))
>>
>> remove:
>> 	mgr->owner_valid = false;
> 
> I'm not quite sure which function is actually meant by "remove".  I
> assume it should be fpga_mgr_unregister().  IIUC this flag means no
> more references to the fpga mgr can be taken, but existing references
> are still valid.
> 
> It works for me. But the name of this flag could be reconsidered to
> avoid misunderstanding.  The owner is still valid (we still need to put
> the owner), but no more references are allowed.  Maybe "owner_inactive"?
> 
> I would still like this owner reference change to be split out, so that
> we can simply revert it when try_module_get_safe() gets accepted.
> 

Just to be sure that I understood correctly, you want to split the
changes into two patches, like:

a) add module owner to the manager struct and take it in
__fpga_mgr_get(); move put_device() from __fpga_mgr_get() to
fpga_mgr_get() and of_fpga_mgr_get().

b) add the mutex and the unregistered flag for protection against races.

So that (b) can be reverted if try_module_get_safe() is accepted?
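
To make the idea in (b) concrete, here is a rough userspace sketch of the
flag-plus-mutex scheme, not the actual patch. Everything in it is a
hypothetical stand-in: fake_module models struct module with a bare
counter, fake_try_module_get() models try_module_get(), and fake_mgr
models only the parts of struct fpga_manager that matter for the race.
It ignores device refcounting entirely and assumes the manager struct
outlives all callers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: a liveness bit and a counter. */
struct fake_module {
	int refcnt;
	bool live;
};

/* Models try_module_get(): fails if the module is absent or going away. */
static bool fake_try_module_get(struct fake_module *mod)
{
	if (!mod || !mod->live)
		return false;
	mod->refcnt++;
	return true;
}

/* Models module_put(): drops one reference taken earlier. */
static void fake_module_put(struct fake_module *mod)
{
	if (!mod)
		return;
	assert(mod->refcnt > 0);
	mod->refcnt--;
}

/* Hypothetical stand-in for the relevant fields of struct fpga_manager. */
struct fake_mgr {
	pthread_mutex_t lock;      /* serializes get against unregister */
	struct fake_module *owner; /* low-level module that registered us */
	bool unregistered;         /* set once; blocks new references only */
};

static void fake_mgr_init(struct fake_mgr *mgr, struct fake_module *owner)
{
	pthread_mutex_init(&mgr->lock, NULL);
	mgr->owner = owner;
	mgr->unregistered = false;
}

/*
 * Take a reference on the owner module unless the manager has already been
 * unregistered. The flag is checked under the lock, so the owner pointer is
 * never handed to fake_try_module_get() after unregister has completed.
 */
static bool fake_mgr_get(struct fake_mgr *mgr)
{
	bool ok;

	pthread_mutex_lock(&mgr->lock);
	ok = !mgr->unregistered && fake_try_module_get(mgr->owner);
	pthread_mutex_unlock(&mgr->lock);
	return ok;
}

/* Existing references stay valid after unregister and must still be put. */
static void fake_mgr_put(struct fake_mgr *mgr)
{
	fake_module_put(mgr->owner);
}

/* Forbid new references; the owner pointer and mops are left untouched. */
static void fake_mgr_unregister(struct fake_mgr *mgr)
{
	pthread_mutex_lock(&mgr->lock);
	mgr->unregistered = true;
	pthread_mutex_unlock(&mgr->lock);
}
```

In this model, a get that succeeds before unregister keeps its module
reference until the matching put, while any get issued after unregister
fails without touching the owner pointer.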

> [...]

Thanks,
Marco

