lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZUxpHk/8pCusjXOb@yilunxu-OptiPlex-7050>
Date:   Thu, 9 Nov 2023 13:07:42 +0800
From:   Xu Yilun <yilun.xu@...ux.intel.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     Marco Pagani <marpagan@...hat.com>,
        Moritz Fischer <mdf@...nel.org>, Wu Hao <hao.wu@...el.com>,
        Xu Yilun <yilun.xu@...el.com>, Tom Rix <trix@...hat.com>,
        Alan Tull <atull@...nsource.altera.com>,
        linux-fpga@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] fpga: remove module reference counting from core
 components

On Wed, Nov 08, 2023 at 05:20:53PM +0100, Greg Kroah-Hartman wrote:
> On Wed, Nov 08, 2023 at 11:52:52PM +0800, Xu Yilun wrote:
> > > >>
> > > >> In fpga_region_get() / fpga_region_put(): call get_device() before
> > > >> acquiring the mutex and put_device() after having released the mutex
> > > >> to avoid races.
> 
> Why do you need another reference count with a lock?  You already have
> that with the calls to get/put_device().

The low-level driver module could still be possibly unloaded at the same
time, if so, when FPGA core run some callbacks provided by low-level driver
module, its referenced page of code is unmapped...

I actually got this idea from this mail thread:

https://lore.kernel.org/all/20231010003746.GN800259@ZenIV/

And I see get/put of "struct file_operations.owner" in many framework
code for this purpose, to ensure no fops->read/write/ioctl() is ongoing
when module unloading.

> 
> > > > Could you help elaborate more about the race?
> > > > 
> > > 
> > > I accidentally misused the word race. My concern was that memory might
> > > be released after the last put_device(), causing mutex_unlock() to be
> > > called on a mutex that does not exist anymore. It should not happen
> > > for the moment since the region does not use devres, but I think it
> > > still makes the code more brittle.
> > 
> > It makes sense.
> > 
> > But I dislike the mutex itself. The purpose is to exclusively grab the
> > device, but a mutex is too much heavy for this.
> 
> Why "heavy"?  Is this a fast-path?  Have you measured it?

It's not a fast-path. But I didn't make it clear, see below...

> 
> > The lock/unlock of mutex
> > scattered in different functions is also an uncommon style. Maybe some
> > atomic count should be enough.
> 
> Don't make a fake lock with an atomic variable, use real locks if you
> need it.

I mean, here it doesn't not looks like a "locking" senario, although it
works.

The purpose here is to declare a device state, which says the device is
exclusively used by a user, no other user is allowed. But usually we use
mutex to protect against critical code blocks, not to represent a long
live device state.

I'm still OK for the existing mutex usage as it doesn't break anything
and if we don't want change much.

> 
> Or don't even care about module unloading at all!  Why does it matter?
> It can never happen without explicit intervention and it's something
> that a lot of the time, just will cause problems.  Don't do a lot of
> extra work for something that doesn't matter.

mm.. as mentioned above some fundamental subsystems do care about
module unloading, and I tend to keep it same way. But mm... I'm OK to
make things easier if you insist.

Thanks,
Yilun

> 
> thanks,
> 
> greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ