[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190819053937.GR16384@42.do-not-panic.com>
Date: Mon, 19 Aug 2019 05:39:37 +0000
From: Luis Chamberlain <mcgrof@...nel.org>
To: Scott Branden <scott.branden@...adcom.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Andy Gross <andy.gross@...aro.org>,
David Brown <david.brown@...aro.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Shuah Khan <shuah@...nel.org>, bjorn.andersson@...aro.org,
"Rafael J . Wysocki" <rafael@...nel.org>,
linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
linux-fsdevel@...r.kernel.org,
BCM Kernel Feedback <bcm-kernel-feedback-list@...adcom.com>,
Olof Johansson <olof@...om.net>,
Andrew Morton <akpm@...ux-foundation.org>,
Dan Carpenter <dan.carpenter@...cle.com>,
Colin Ian King <colin.king@...onical.com>,
Kees Cook <keescook@...omium.org>,
Takashi Iwai <tiwai@...e.de>, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH 3/3] firmware: add mutex fw_lock_fallback for race
condition
On Thu, Aug 15, 2019 at 05:09:45PM -0700, Scott Branden wrote:
> A race condition exists between _request_firmware_prepare checking
> if firmware is assigned and firmware_fallback_sysfs creating a sysfs
> entry (kernel trace below). To avoid such condition add a mutex
> fw_lock_fallback to protect against such condition.
I am not buying this fix, and it seems sloppy. More below.
> misc test_firmware: Falling back to sysfs fallback for: nope-test-firmware.bin
So the fallback kicks in with the file that is not there.
> sysfs: cannot create duplicate filename '/devices/virtual/misc/test_firmware/nope-test-firmware.bin'
And we have a duplicate entry, for the *device* created to allow us to
create a file entry to allow us to copy the file. Your tests had a loop,
so there is actually a race between two entries being created while
one one failed.
> CPU: 4 PID: 2059 Comm: test_firmware-3 Not tainted 5.3.0-rc4 #1
> Hardware name: Dell Inc. OptiPlex 7010/0KRC95, BIOS A13 03/25/2013
> Call Trace:
> dump_stack+0x67/0x90
> sysfs_warn_dup.cold+0x17/0x24
> sysfs_create_dir_ns+0xb3/0xd0
> kobject_add_internal+0xa6/0x2a0
> kobject_add+0x7e/0xb0
Note: kobject_add().
> ? _cond_resched+0x15/0x30
> device_add+0x121/0x670
> firmware_fallback_sysfs+0x15c/0x3c9
> _request_firmware+0x432/0x5a0
> ? devres_find+0x63/0xc0
> request_firmware_into_buf+0x63/0x80
> test_fw_run_batch_request+0x96/0xe0
> kthread+0xfb/0x130
> ? reset_store+0x30/0x30
> ? kthread_park+0x80/0x80
> ret_from_fork+0x3a/0x50
> kobject_add_internal failed for nope-test-firmware.bin with -EEXIST, don't try to register things with the same name in the same directory.
So above it makes it even clearer, two kobjets with the same name.
> Signed-off-by: Scott Branden <scott.branden@...adcom.com>
> ---
> drivers/base/firmware_loader/main.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c
> index bf44c79beae9..ce9896e3b782 100644
> --- a/drivers/base/firmware_loader/main.c
> +++ b/drivers/base/firmware_loader/main.c
> @@ -88,6 +88,7 @@ static inline struct fw_priv *to_fw_priv(struct kref *ref)
> /* fw_lock could be moved to 'struct fw_sysfs' but since it is just
> * guarding for corner cases a global lock should be OK */
> DEFINE_MUTEX(fw_lock);
> +DEFINE_MUTEX(fw_lock_fallback);
The reason I don't like this fix is that this mutex is named after ther
fallback interface... but...
>
> static struct firmware_cache fw_cache;
>
> @@ -758,6 +759,17 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
> if (!firmware_p)
> return -EINVAL;
>
> + /*
> + * There is a race condition between _request_firmware_prepare checking
> + * if firmware is assigned and firmware_fallback_sysfs creating sysfs
> + * entries with duplicate names.
> + * Yet, with this lock the firmware_test locks up with cache enabled
> + * and no event used during firmware test.
> + * This points to some very racy code I don't know how to entirely fix.
> + */
> + if (opt_flags & FW_OPT_NOCACHE)
> + mutex_lock(&fw_lock_fallback);
Whoa.. What does no-cache have anything to do with the fallback interface
other than the fact we enable this feature for the fallback interface?
We don't need to penalize non-fallback users who *also* may want to
enable the no-cache feature.
So, the fix should be within the boundaries of the creation / deletion
of the kobject, not this nocache feature. Can you please re-evaluate
this code and look for a more compartamentalized solution to the
fallback code only?
Luis
Powered by blists - more mailing lists