lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <37f7a0dc-6983-437e-a338-65e1abd751c2@redhat.com>
Date: Fri, 22 Dec 2023 21:52:27 +0100
From: Marco Pagani <marpagan@...hat.com>
To: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc: Moritz Fischer <mdf@...nel.org>, Wu Hao <hao.wu@...el.com>,
 Xu Yilun <yilun.xu@...el.com>, Tom Rix <trix@...hat.com>,
 linux-kernel@...r.kernel.org, linux-fpga@...r.kernel.org
Subject: Re: [RFC PATCH v3 2/2] fpga: set owner of fpga_manager_ops for
 existing low-level modules



On 2023-12-21 09:22, Greg Kroah-Hartman wrote:
> On Wed, Dec 20, 2023 at 11:24:20PM +0100, Marco Pagani wrote:
>>
>>
>> On 19/12/23 19:11, Greg Kroah-Hartman wrote:
>>> On Tue, Dec 19, 2023 at 06:17:20PM +0100, Marco Pagani wrote:
>>>>
>>>> On 2023-12-19 16:10, Greg Kroah-Hartman wrote:
>>>>> On Tue, Dec 19, 2023 at 03:54:25PM +0100, Marco Pagani wrote:
>>>>>>
>>>>>>
>>>>>> On 2023-12-18 21:33, Greg Kroah-Hartman wrote:
>>>>>>> On Mon, Dec 18, 2023 at 09:28:09PM +0100, Marco Pagani wrote:
>>>>>>>> This patch tentatively set the owner field of fpga_manager_ops to
>>>>>>>> THIS_MODULE for existing fpga manager low-level control modules.
>>>>>>>>
>>>>>>>> Signed-off-by: Marco Pagani <marpagan@...hat.com>
>>>>>>>> ---
>>>>>>>>  drivers/fpga/altera-cvp.c             | 1 +
>>>>>>>>  drivers/fpga/altera-pr-ip-core.c      | 1 +
>>>>>>>>  drivers/fpga/altera-ps-spi.c          | 1 +
>>>>>>>>  drivers/fpga/dfl-fme-mgr.c            | 1 +
>>>>>>>>  drivers/fpga/ice40-spi.c              | 1 +
>>>>>>>>  drivers/fpga/lattice-sysconfig.c      | 1 +
>>>>>>>>  drivers/fpga/machxo2-spi.c            | 1 +
>>>>>>>>  drivers/fpga/microchip-spi.c          | 1 +
>>>>>>>>  drivers/fpga/socfpga-a10.c            | 1 +
>>>>>>>>  drivers/fpga/socfpga.c                | 1 +
>>>>>>>>  drivers/fpga/stratix10-soc.c          | 1 +
>>>>>>>>  drivers/fpga/tests/fpga-mgr-test.c    | 1 +
>>>>>>>>  drivers/fpga/tests/fpga-region-test.c | 1 +
>>>>>>>>  drivers/fpga/ts73xx-fpga.c            | 1 +
>>>>>>>>  drivers/fpga/versal-fpga.c            | 1 +
>>>>>>>>  drivers/fpga/xilinx-spi.c             | 1 +
>>>>>>>>  drivers/fpga/zynq-fpga.c              | 1 +
>>>>>>>>  drivers/fpga/zynqmp-fpga.c            | 1 +
>>>>>>>>  18 files changed, 18 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/fpga/altera-cvp.c b/drivers/fpga/altera-cvp.c
>>>>>>>> index 4ffb9da537d8..aeb913547dd8 100644
>>>>>>>> --- a/drivers/fpga/altera-cvp.c
>>>>>>>> +++ b/drivers/fpga/altera-cvp.c
>>>>>>>> @@ -520,6 +520,7 @@ static const struct fpga_manager_ops altera_cvp_ops = {
>>>>>>>>  	.write_init	= altera_cvp_write_init,
>>>>>>>>  	.write		= altera_cvp_write,
>>>>>>>>  	.write_complete	= altera_cvp_write_complete,
>>>>>>>> +	.owner		= THIS_MODULE,
>>>>>>>
>>>>>>> Note, this is not how to do this, force the compiler to set this for you
>>>>>>> automatically, otherwise everyone will always forget to do it.  Look at
>>>>>>> how functions like usb_register_driver() works.
>>>>>>>
>>>>>>> Also, are you _sure_ that you need a module owner in this structure?  I
>>>>>>> still don't know why...
>>>>>>>
>>>>>>
>>>>>> Do you mean moving the module owner field to the manager context and setting
>>>>>> it during registration with a helper macro?
>>>>>
>>>>> I mean set it during registration with a helper macro.
>>>>>
>>>>>> Something like:
>>>>>>
>>>>>> struct fpga_manager {
>>>>>> 	...
>>>>>> 	struct module *owner;
>>>>>> };
>>>>>>
>>>>>> #define fpga_mgr_register(parent, ...) \
>>>>>> 	__fpga_mgr_register(parent,..., THIS_MODULE)
>>>>>>
>>>>>> struct fpga_manager *
>>>>>> __fpga_mgr_register(struct device *parent, ..., struct module *owner)
>>>>>> {
>>>>>> 	...
>>>>>> 	mgr->owner = owner;
>>>>>> }
>>>>>
>>>>> Yes.
>>>>>
>>>>> But again, is a module owner even needed?  I don't think you all have
>>>>> proven that yet...
>>>>
>>>> Programming an FPGA involves a potentially lengthy sequence of interactions
>>>> with the reconfiguration engine. The manager conceptually organizes these
>>>> interactions as a sequence of ops. Low-level modules implement these ops/steps
>>>> for a specific device. If we don't protect the low-level module, someone might
>>>> unload it right when we are in the middle of a low-level op programming the
>>>> FPGA. As far as I know, the kernel would crash in that case.
>>>
>>> The only way an unload of a module can happen is if a user explicitly
>>> asks for it to be unloaded.  So they get what they ask for, right?
>>>
>>
>> Right, the user should get what he asked for, including hanging the
>> hardware. My only concern is that the kernel should not crash.
>>
>>> How do you "know" it is active?  And why doesn't the normal
>>> "driver/device" bindings prevent unloading from being a problem?  When
>>> you unload a module, you stop all ops on the driver, and then unregister
>>> it, which causes any future ones to fail.
>>>
>>> Or am I missing something here?
>>>
>>  
>> I think the problem is that the ops are not directly tied to the driver
>> of the manager's parent device.
> 
> Then that needs to be fixed right there, as that is obviously not using
> the driver model properly.
> 
> Why aren't the "ops" a driver that is bound to this device?  If it is
> the one responsible for controlling it, then it should be a driver and
> as such, the driver model logic will handle things if/when a module is
> unloaded to tear things down better.
> 
>> It is not even required to have a driver
>> to register a manager. The only way to know if the fpga manager is
>> active (i.e., someone is running one op) is by poking manager->state.
> 
> That too seems wrong, why is this?

I don't know. I was not around when the fpga subsystem was laid down.

> 
>> One possibility that comes into my mind, excluding a major reworking,
>> is waiting in fpga_mgr_unregister() until the manager reaches a steady
>> state (no ops are running) before unregistering the device. However, it
>> feels questionable because if one of the ops hangs, the module removal
>> will also hang.
> 
> You never know when a new operand will come in, so there's no way to
> know "all is quiet", sorry.
> 
> Try fixing this properly, buy using the driver model correctly, that
> should help resolve these issues automatically instead of hacked up
> module reference count attempts.
> 
> Remember, this is the whole reason why the driver model was created all
> those 20+ years ago, to move away from these module reference count
> issues, let's not forget history please.
> 

I do not entirely understand this part. The subsystem only provides an
in-kernel API for programming the fpga that in-kernel consumers can use.
The ops that the low-level module implements are used only internally by
the manager in a predefined order.

There is no standard interface for programming the fpga exposed to
userspace using file_operations or attributes exported via sysfs.
The manager only exports read-only attributes for status. On top
of that, there is only the support for device tree overlays.

Would it be correct to assume that the responsibility of keeping
the low-level module in while programming the fpga is on the kernel
component that consumes the subsystem's in-kernel API and (eventually)
exports a programming interface to userspace?

If we consider the case where the programming is done through a
userspace interface exported by the same module that implements the ops,
then we should be good even without taking the low-level module in the
manager.

However, I guess the decision to take the low-level module in the
manager was meant to address the case where the module implementing the
ops and the consumer of the in-kernel API (that may optionally export a
userspace interface for programming) are two separate entities.

Thanks,
Marco


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ