linux-kernel - Re: [PATCH v16 01/14] power: reset: reboot-mode: Synchronize list traversal

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251103081601.ybwv7jzpxd6mndki@hu-mojha-hyd.qualcomm.com>
Date: Mon, 3 Nov 2025 13:46:01 +0530
From: Mukesh Ojha <mukesh.ojha@....qualcomm.com>
To: Shivendra Pratap <shivendra.pratap@....qualcomm.com>
Cc: Bjorn Andersson <andersson@...nel.org>,
        Bartosz Golaszewski <bartosz.golaszewski@...aro.org>,
        Sebastian Reichel <sre@...nel.org>, Rob Herring <robh@...nel.org>,
        Sudeep Holla <sudeep.holla@....com>,
        Souvik Chakravarty <Souvik.Chakravarty@....com>,
        Krzysztof Kozlowski <krzk+dt@...nel.org>,
        Conor Dooley <conor+dt@...nel.org>, Andy Yan <andy.yan@...k-chips.com>,
        Mark Rutland <mark.rutland@....com>,
        Lorenzo Pieralisi <lpieralisi@...nel.org>,
        Arnd Bergmann <arnd@...db.de>, Konrad Dybcio <konradybcio@...nel.org>,
        cros-qcom-dts-watchers@...omium.org, Vinod Koul <vkoul@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>,
        Florian Fainelli <florian.fainelli@...adcom.com>,
        Moritz Fischer <moritz.fischer@...us.com>,
        John Stultz <john.stultz@...aro.org>,
        Matthias Brugger <matthias.bgg@...il.com>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>,
        Stephen Boyd <swboyd@...omium.org>,
        Andre Draszik <andre.draszik@...aro.org>,
        Kathiravan Thirumoorthy <kathiravan.thirumoorthy@....qualcomm.com>,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org,
        devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-arm-msm@...r.kernel.org,
        Elliot Berman <quic_eberman@...cinc.com>,
        Srinivas Kandagatla <srini@...nel.org>
Subject: Re: [PATCH v16 01/14] power: reset: reboot-mode: Synchronize list
 traversal

On Wed, Oct 29, 2025 at 11:18:52PM +0530, Shivendra Pratap wrote:
> 
> 
> On 10/28/2025 9:04 AM, Bjorn Andersson wrote:
> > On Wed, Oct 15, 2025 at 10:08:16AM +0530, Shivendra Pratap wrote:
> >> List traversals must be synchronized to prevent race conditions
> >> and data corruption. The reboot-mode list is not protected by a
> >> lock currently, which can lead to concurrent access and race.
> > 
> > Is it a theoretical future race or something that we can hit in the
> > current implementation?
> > 
> >>
> >> Introduce a mutex lock to guard all operations on the reboot-mode
> >> list and ensure thread-safe access. The change prevents unsafe
> >> concurrent access on reboot-mode list.
> > 
> > I was under the impression that these lists where created during boot
> > and then used at some later point, which at best would bring a
> > theoretical window for a race... Reviewing the code supports my
> > understanding, but perhaps I'm missing something?
> > 
> >>
> >> Fixes: 4fcd504edbf7 ("power: reset: add reboot mode driver")
> >> Fixes: ca3d2ea52314 ("power: reset: reboot-mode: better compatibility with DT (replace ' ,/')")
> >>
> > 
> > Skip this empty line, please.
> > 
> > 
> > And given that you have fixes here, I guess this is a problem today. In
> > which case, this shouldn't have been carried for 16 versions - but have
> > sent and been merged on its own already.
> > 
> > So please, if this is a real issue, start your commit message with a
> > descriptive problem description, to make it clear that this needs to be
> > merged yesterday - or drop the fixes.
> > 
> >> Signed-off-by: Shivendra Pratap <shivendra.pratap@....qualcomm.com>
> >> ---
> >>  drivers/power/reset/reboot-mode.c | 96 +++++++++++++++++++++------------------
> >>  include/linux/reboot-mode.h       |  4 ++
> >>  2 files changed, 57 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/drivers/power/reset/reboot-mode.c b/drivers/power/reset/reboot-mode.c
> >> index fba53f638da04655e756b5f8b7d2d666d1379535..8fc3e14638ea757c8dc3808c240ff569cbd74786 100644
> >> --- a/drivers/power/reset/reboot-mode.c
> >> +++ b/drivers/power/reset/reboot-mode.c
> >> @@ -29,9 +29,11 @@ static unsigned int get_reboot_mode_magic(struct reboot_mode_driver *reboot,
> >>  	if (!cmd)
> >>  		cmd = normal;
> >>  
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		if (!strcmp(info->mode, cmd))
> >> -			return info->magic;
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> >> +		list_for_each_entry(info, &reboot->head, list)
> >> +			if (!strcmp(info->mode, cmd))
> >> +				return info->magic;
> >> +	}
> >>  
> >>  	/* try to match again, replacing characters impossible in DT */
> >>  	if (strscpy(cmd_, cmd, sizeof(cmd_)) == -E2BIG)
> >> @@ -41,9 +43,11 @@ static unsigned int get_reboot_mode_magic(struct reboot_mode_driver *reboot,
> >>  	strreplace(cmd_, ',', '-');
> >>  	strreplace(cmd_, '/', '-');
> >>  
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		if (!strcmp(info->mode, cmd_))
> >> -			return info->magic;
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> >> +		list_for_each_entry(info, &reboot->head, list)
> >> +			if (!strcmp(info->mode, cmd_))
> >> +				return info->magic;
> >> +	}
> >>  
> >>  	return 0;
> >>  }
> >> @@ -78,46 +82,50 @@ int reboot_mode_register(struct reboot_mode_driver *reboot)
> >>  
> >>  	INIT_LIST_HEAD(&reboot->head);
> >>  
> >> -	for_each_property_of_node(np, prop) {
> >> -		if (strncmp(prop->name, PREFIX, len))
> >> -			continue;
> >> -
> >> -		info = devm_kzalloc(reboot->dev, sizeof(*info), GFP_KERNEL);
> >> -		if (!info) {
> >> -			ret = -ENOMEM;
> >> -			goto error;
> >> -		}
> >> -
> >> -		if (of_property_read_u32(np, prop->name, &info->magic)) {
> >> -			dev_err(reboot->dev, "reboot mode %s without magic number\n",
> >> -				info->mode);
> >> -			devm_kfree(reboot->dev, info);
> >> -			continue;
> >> -		}
> >> -
> >> -		info->mode = kstrdup_const(prop->name + len, GFP_KERNEL);
> >> -		if (!info->mode) {
> >> -			ret =  -ENOMEM;
> >> -			goto error;
> >> -		} else if (info->mode[0] == '\0') {
> >> -			kfree_const(info->mode);
> >> -			ret = -EINVAL;
> >> -			dev_err(reboot->dev, "invalid mode name(%s): too short!\n",
> >> -				prop->name);
> >> -			goto error;
> >> +	mutex_init(&reboot->rb_lock);
> >> +
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> > 
> > I don't see how this can race with anything, reboot_mode_register() is
> > supposed to be called from some probe function, with reboot_mode_driver
> > being a "local" object.
> > 
> > The guard here "protects" &reboot->head, but that is not a shared
> > resources at this point.
> > 
> >> +		for_each_property_of_node(np, prop) {
> >> +			if (strncmp(prop->name, PREFIX, len))
> >> +				continue;
> >> +
> >> +			info = devm_kzalloc(reboot->dev, sizeof(*info), GFP_KERNEL);
> >> +			if (!info) {
> >> +				ret = -ENOMEM;
> >> +				goto error;
> >> +			}
> >> +
> >> +			if (of_property_read_u32(np, prop->name, &info->magic)) {
> >> +				dev_err(reboot->dev, "reboot mode %s without magic number\n",
> >> +					info->mode);
> >> +				devm_kfree(reboot->dev, info);
> >> +				continue;
> >> +			}
> >> +
> >> +			info->mode = kstrdup_const(prop->name + len, GFP_KERNEL);
> >> +			if (!info->mode) {
> >> +				ret =  -ENOMEM;
> >> +				goto error;
> >> +			} else if (info->mode[0] == '\0') {
> >> +				kfree_const(info->mode);
> >> +				ret = -EINVAL;
> >> +				dev_err(reboot->dev, "invalid mode name(%s): too short!\n",
> >> +					prop->name);
> >> +				goto error;
> >> +			}
> >> +
> >> +			list_add_tail(&info->list, &reboot->head);
> >>  		}
> >>  
> >> -		list_add_tail(&info->list, &reboot->head);
> >> -	}
> >> -
> >> -	reboot->reboot_notifier.notifier_call = reboot_mode_notify;
> >> -	register_reboot_notifier(&reboot->reboot_notifier);
> >> +		reboot->reboot_notifier.notifier_call = reboot_mode_notify;
> >> +		register_reboot_notifier(&reboot->reboot_notifier);
> > 
> > Once register_reboot_notifier() has been called, &reboot->head is
> > visible outside the specific driver instance.
> > 
> > So, there's no reason to lock in reboot_mode_register().
> > 
> >>  
> >> -	return 0;
> >> +		return 0;
> >>  
> >>  error:
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		kfree_const(info->mode);
> >> +		list_for_each_entry(info, &reboot->head, list)
> >> +			kfree_const(info->mode);
> >> +	}
> >>  
> >>  	return ret;
> >>  }
> >> @@ -133,8 +141,10 @@ int reboot_mode_unregister(struct reboot_mode_driver *reboot)
> >>  
> >>  	unregister_reboot_notifier(&reboot->reboot_notifier);
> >>  
> >> -	list_for_each_entry(info, &reboot->head, list)
> >> -		kfree_const(info->mode);
> >> +	scoped_guard(mutex, &reboot->rb_lock) {
> > 
> > get_reboot_mode_magic() is only called from reboot_mode_notify(), which
> > is only invoked by blocking_notifier_call_chain().
> > 
> > blocking_notifier_call_chain() takes a read semaphore.
> > unregister_reboot_notifier() take a write semaphore.
> > 
> > So, if we're racing with a shutdown or reboot, I see two possible
> > things:
> > 
> > 1) blocking_notifier_call_chain() happens first and calls
> >    reboot_mode_notify(), blocking unregister_reboot_notifier(). Once it
> >    returns, the unregister proceeds and we enter case #2
> > 
> > 2) unregister_reboot_notifier() happens first (or after the
> >    blocking_notifier_call_chain() returns). Our reboot object is removed
> >    from the list and blocking_notifier_call_chain() will not invoke
> >    reboot_mode_notify().
> > 
> > In either case, the list has a single owner here.
> > 
> > 
> > As far as I can see, the only race left is if multiple concurrent calls
> > happens to blocking_notifier_call_chain(), the behavior of
> > reboot->write() might be undefined. But I think that is reasonable.
> > 
> > 
> > Please let me know if I'm missing something.
> 
> Thank you for the detailed review. Tried to summarize below:
> 
> The mutex lock was introduced in v13 following earlier discussions about
> whether the issue was a theoretical future race or something that could
> occur in the current implementation.
> 
> At the time (prior to v13), we concluded that while no race condition was
> observable in the current code, there could be a potential in future
> changes or usage patterns — making it a theoretical concern.
> 
> During review, there was further discussion around whether it's acceptable
> to leave the list unprotected simply because no race is currently suspected.
> The general consensus was that it's better practice to protect shared data
> structures like lists with a mutex to ensure correctness and future-proofing.
> 
> Based on that feedback, the mutex lock introduced in v13.

I agree with Bjorn here, we do not need a lock here..

> 
> Later in v15, the reboot-mode maintainer suggested that the patch should
> include a Fixes tag, which was incorporated accordingly.
> 
> So both the mutex addition in v13 and the Fixes tag in v15 were made in
> response to upstream review comments.
> 
> Need some guidance on how to take this forward or is it ok to protect all
> list operation, as done in this patch and keep the fixes tag as suggested
> in earlier reviews?

We should drop this patch., 

> 
> thanks,
> Shivendra

-- 
-Mukesh Ojha