lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2023122118-captive-suburb-6717@gregkh>
Date: Thu, 21 Dec 2023 09:22:00 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Marco Pagani <marpagan@...hat.com>
Cc: Moritz Fischer <mdf@...nel.org>, Wu Hao <hao.wu@...el.com>,
	Xu Yilun <yilun.xu@...el.com>, Tom Rix <trix@...hat.com>,
	linux-kernel@...r.kernel.org, linux-fpga@...r.kernel.org
Subject: Re: [RFC PATCH v3 2/2] fpga: set owner of fpga_manager_ops for
 existing low-level modules

On Wed, Dec 20, 2023 at 11:24:20PM +0100, Marco Pagani wrote:
> 
> 
> On 19/12/23 19:11, Greg Kroah-Hartman wrote:
> > On Tue, Dec 19, 2023 at 06:17:20PM +0100, Marco Pagani wrote:
> >>
> >> On 2023-12-19 16:10, Greg Kroah-Hartman wrote:
> >>> On Tue, Dec 19, 2023 at 03:54:25PM +0100, Marco Pagani wrote:
> >>>>
> >>>>
> >>>> On 2023-12-18 21:33, Greg Kroah-Hartman wrote:
> >>>>> On Mon, Dec 18, 2023 at 09:28:09PM +0100, Marco Pagani wrote:
> >>>>>> This patch tentatively set the owner field of fpga_manager_ops to
> >>>>>> THIS_MODULE for existing fpga manager low-level control modules.
> >>>>>>
> >>>>>> Signed-off-by: Marco Pagani <marpagan@...hat.com>
> >>>>>> ---
> >>>>>>  drivers/fpga/altera-cvp.c             | 1 +
> >>>>>>  drivers/fpga/altera-pr-ip-core.c      | 1 +
> >>>>>>  drivers/fpga/altera-ps-spi.c          | 1 +
> >>>>>>  drivers/fpga/dfl-fme-mgr.c            | 1 +
> >>>>>>  drivers/fpga/ice40-spi.c              | 1 +
> >>>>>>  drivers/fpga/lattice-sysconfig.c      | 1 +
> >>>>>>  drivers/fpga/machxo2-spi.c            | 1 +
> >>>>>>  drivers/fpga/microchip-spi.c          | 1 +
> >>>>>>  drivers/fpga/socfpga-a10.c            | 1 +
> >>>>>>  drivers/fpga/socfpga.c                | 1 +
> >>>>>>  drivers/fpga/stratix10-soc.c          | 1 +
> >>>>>>  drivers/fpga/tests/fpga-mgr-test.c    | 1 +
> >>>>>>  drivers/fpga/tests/fpga-region-test.c | 1 +
> >>>>>>  drivers/fpga/ts73xx-fpga.c            | 1 +
> >>>>>>  drivers/fpga/versal-fpga.c            | 1 +
> >>>>>>  drivers/fpga/xilinx-spi.c             | 1 +
> >>>>>>  drivers/fpga/zynq-fpga.c              | 1 +
> >>>>>>  drivers/fpga/zynqmp-fpga.c            | 1 +
> >>>>>>  18 files changed, 18 insertions(+)
> >>>>>>
> >>>>>> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
> >>>>>> index 4ffb9da537d8..aeb913547dd8 100644
> >>>>>> --- a/drivers/fpga/altera-cvp.c
> >>>>>> +++ b/drivers/fpga/altera-cvp.c
> >>>>>> @@ -520,6 +520,7 @@ static const struct fpga_manager_ops altera_cvp_ops = {
> >>>>>>  	.write_init	= altera_cvp_write_init,
> >>>>>>  	.write		= altera_cvp_write,
> >>>>>>  	.write_complete	= altera_cvp_write_complete,
> >>>>>> +	.owner		= THIS_MODULE,
> >>>>>
> >>>>> Note, this is not how to do this, force the compiler to set this for you
> >>>>> automatically, otherwise everyone will always forget to do it.  Look at
> >>>>> how functions like usb_register_driver() works.
> >>>>>
> >>>>> Also, are you _sure_ that you need a module owner in this structure?  I
> >>>>> still don't know why...
> >>>>>
> >>>>
> >>>> Do you mean moving the module owner field to the manager context and setting
> >>>> it during registration with a helper macro?
> >>>
> >>> I mean set it during registration with a helper macro.
> >>>
> >>>> Something like:
> >>>>
> >>>> struct fpga_manager {
> >>>> 	...
> >>>> 	struct module *owner;
> >>>> };
> >>>>
> >>>> #define fpga_mgr_register(parent, ...) \
> >>>> 	__fpga_mgr_register(parent,..., THIS_MODULE)
> >>>>
> >>>> struct fpga_manager *
> >>>> __fpga_mgr_register(struct device *parent, ..., struct module *owner)
> >>>> {
> >>>> 	...
> >>>> 	mgr->owner = owner;
> >>>> }
> >>>
> >>> Yes.
> >>>
> >>> But again, is a module owner even needed?  I don't think you all have
> >>> proven that yet...
> >>
> >> Programming an FPGA involves a potentially lengthy sequence of interactions
> >> with the reconfiguration engine. The manager conceptually organizes these
> >> interactions as a sequence of ops. Low-level modules implement these ops/steps
> >> for a specific device. If we don't protect the low-level module, someone might
> >> unload it right when we are in the middle of a low-level op programming the
> >> FPGA. As far as I know, the kernel would crash in that case.
> > 
> > The only way an unload of a module can happen is if a user explicitly
> > asks for it to be unloaded.  So they get what they ask for, right?
> >
> 
> Right, the user should get what he asked for, including hanging the
> hardware. My only concern is that the kernel should not crash.
> 
> > How do you "know" it is active?  And why doesn't the normal
> > "driver/device" bindings prevent unloading from being a problem?  When
> > you unload a module, you stop all ops on the driver, and then unregister
> > it, which causes any future ones to fail.
> > 
> > Or am I missing something here?
> >
>  
> I think the problem is that the ops are not directly tied to the driver
> of the manager's parent device.

Then that needs to be fixed right there, as that is obviously not using
the driver model properly.

Why aren't the "ops" a driver that is bound to this device?  If it is
the one responsible for controlling it, then it should be a driver and
as such, the driver model logic will handle things if/when a module is
unloaded to tear things down better.

> It is not even required to have a driver
> to register a manager. The only way to know if the fpga manager is
> active (i.e., someone is running one op) is by poking manager->state.

That too seems wrong, why is this?

> One possibility that comes into my mind, excluding a major reworking,
> is waiting in fpga_mgr_unregister() until the manager reaches a steady
> state (no ops are running) before unregistering the device. However, it
> feels questionable because if one of the ops hangs, the module removal
> will also hang.

You never know when a new operand will come in, so there's no way to
know "all is quiet", sorry.

Try fixing this properly, buy using the driver model correctly, that
should help resolve these issues automatically instead of hacked up
module reference count attempts.

Remember, this is the whole reason why the driver model was created all
those 20+ years ago, to move away from these module reference count
issues, let's not forget history please.

thanks,

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ