lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b118af4c-e4cc-c50b-59aa-d768f1ec69ff@linaro.org>
Date:   Tue, 3 Jan 2023 14:22:48 +0000
From:   Srinivas Kandagatla <srinivas.kandagatla@...aro.org>
To:     Hector Martin <marcan@...can.st>
Cc:     Sven Peter <sven@...npeter.dev>,
        Alyssa Rosenzweig <alyssa@...enzweig.io>,
        asahi@...ts.linux.dev, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org, Eric Curtin <ecurtin@...hat.com>,
        "Russell King (Oracle)" <linux@...linux.org.uk>
Subject: Re: [PATCH v2] nvmem: core: Fix race in nvmem_register()



On 03/01/2023 13:48, Hector Martin wrote:
> On 03/01/2023 21.41, Srinivas Kandagatla wrote:
>>
>>
>> On 03/01/2023 11:44, Hector Martin wrote:
>>> nvmem_register() currently registers the device before adding the nvmem
>>> cells, which creates a race window where consumers may find the nvmem
>>> device (and not get PROBE_DEFERred), but then fail to find the cells and
>>> error out.
>>>
>>> Move device registration to the end of nvmem_register(), to close the
>>> race.
>>>
>>> Observed when the stars line up on Apple Silicon machines with the (not
>>> yet upstream, but trivial) spmi nvmem driver and the macsmc-rtc client:
>>>
>>> [    0.487375] macsmc-rtc macsmc-rtc: error -ENOENT: Failed to get rtc_offset NVMEM cell
>>>
>>> Fixes: eace75cfdcf7 ("nvmem: Add a simple NVMEM framework for nvmem providers")
>>> Cc: stable@...r.kernel.org
>>> Reviewed-by: Eric Curtin <ecurtin@...hat.com>
>>> Signed-off-by: Hector Martin <marcan@...can.st>
>>> ---
>>
>> What has changed since v1?
> 
> What you told me to. I'm trying to get a silly bug fixed after you
> ignored my original patch until Russell independently discovered and
> submitted a fix for the same thing. I think we've wasted enough
> developer time here already.

You should remember that maintainers have other regular job and holidays 
apart from maintaining. When you last sent the patch we are already in 
near/middle of merge window. If I had applied your original patch 
without proper review, It would have introduced more regressions. Be 
patient and we/I understand your concern.

Its always possible that multiple developers hit same bug and endup in 
multiple patches, there is no way to avoid this unless every developer 
checks for similar patches on the list.

> 
>>
>>>    drivers/nvmem/core.c | 32 +++++++++++++++++---------------
>>>    1 file changed, 17 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
>>> index 321d7d63e068..606f428d6292 100644
>>> --- a/drivers/nvmem/core.c
>>> +++ b/drivers/nvmem/core.c
>>> @@ -822,11 +822,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>>>    		break;
>>>    	}
>>>
>>> -	if (rval) {
>>> -		ida_free(&nvmem_ida, nvmem->id);
>>> -		kfree(nvmem);
>>> -		return ERR_PTR(rval);
>>> -	}
>>> +	if (rval)
>>> +		goto err_gpiod_put;
>>
>> Why was gpiod changes added to this patch, that should be a separate
>> patch/discussion, as this is not relevant to the issue that you are
>> reporting.
> 
> Because freeing the device also does a gpiod_put in the destructor, so
This are clearly untested, And I dont want this to be in the middle to 
fix to the issue you are hitting.
We should always be careful about untested changes, in this case gpiod 
has some conditions to check before doing a put. So the patch is 
incorrect as it is.

> doing this is correct in every other instance below and maintains
> existing behavior, and it just so happens that this instance converges
> into the same codepath so it is correct to merge it, and it just so
> happens that the gpiod put was missing in this path to begin with so
> this becomes a drive-by bugfix.
> 
> If you don't like it I can remove it (i.e. reintroduce the bug for no
> good reason) and you can submit this fix yourself, because I have no
> incentive to waste time submitting a separate patch to fix a GPIO leak
> in an error path corner case in a subsystem I don't own and I have much
> bigger things to spend my (increasingly lower and lower) willingness to
> fight for upstream submissions than this.
> 
> Seriously, what is wrong with y'all kernel people. No other open source
> project wastes contributors' time with stupid nitpicks like this. I

These are not stupid nit picks your v1/v2 patches introduced memory 
leaks and regressions so i will not be picking up any patches that fall 
in that area.

> found a bug, I fixed it, I then fixed the issues you pointed out, and I
> don't have the time nor energy to fight over this kind of nonsense next.

I think its worth reading ./Documentation/process/submitting-patches.rst


thanks,
Srini
> Do you want bugs fixed or not?
> 
>>>
>>>    	nvmem->read_only = device_property_present(config->dev, "read-only") ||
>>>    			   config->read_only || !nvmem->reg_write;
>>> @@ -837,20 +834,16 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>>>
>>>    	dev_dbg(&nvmem->dev, "Registering nvmem device %s\n", config->name);
>>>
>>> -	rval = device_register(&nvmem->dev);
>>> -	if (rval)
>>> -		goto err_put_device;
>>> -
>>>    	if (nvmem->nkeepout) {
>>>    		rval = nvmem_validate_keepouts(nvmem);
>>>    		if (rval)
>>> -			goto err_device_del;
>>> +			goto err_gpiod_put;
>>>    	}
>>>
>>>    	if (config->compat) {
>>>    		rval = nvmem_sysfs_setup_compat(nvmem, config);
>>>    		if (rval)
>>> -			goto err_device_del;
>>> +			goto err_gpiod_put;
>>>    	}
>>>
>>>    	if (config->cells) {
>>> @@ -867,6 +860,15 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>>>    	if (rval)
>>>    		goto err_remove_cells;
>>>
>>> +	rval = device_register(&nvmem->dev);
>>> +	if (rval) {
>>> +		nvmem_device_remove_all_cells(nvmem);
>>> +		if (config->compat)
>>> +			nvmem_sysfs_remove_compat(nvmem, config);
>>> +		put_device(&nvmem->dev);
>>> +		return ERR_PTR(rval);
>>> +	}
>>> +
>>>    	blocking_notifier_call_chain(&nvmem_notifier, NVMEM_ADD, nvmem);
>>>
>>>    	return nvmem;
>>> @@ -876,10 +878,10 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
>>>    err_teardown_compat:
>>>    	if (config->compat)
>>>    		nvmem_sysfs_remove_compat(nvmem, config);
>>> -err_device_del:
>>> -	device_del(&nvmem->dev);
>>> -err_put_device:
>>> -	put_device(&nvmem->dev);
>>> +err_gpiod_put:
>>> +	gpiod_put(nvmem->wp_gpio);
>>> +	ida_free(&nvmem_ida, nvmem->id);
>>> +	kfree(nvmem);
>>>
>>>    	return ERR_PTR(rval);
>>>    }
>>> --
>>> 2.35.1
>>>
>>
> 
> 
> - Hector

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ