lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <YegsfuLPJDqYPr0n@ripper>
Date:   Wed, 19 Jan 2022 07:21:34 -0800
From:   Bjorn Andersson <bjorn.andersson@...aro.org>
To:     Sudeep Holla <sudeep.holla@....com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Lukasz Luba <lukasz.luba@....com>,
        Vladimir Zapolskiy <vladimir.zapolskiy@...aro.org>,
        Thara Gopinath <thara.gopinath@...il.com>,
        linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
        linux-pm@...r.kernel.org
Subject: Re: [PATCH 2/2] arch_topology: Sanity check cpumask in thermal
 pressure update

On Wed 19 Jan 06:43 PST 2022, Sudeep Holla wrote:

> On Tue, Jan 18, 2022 at 10:56:12AM -0800, Bjorn Andersson wrote:
> > Occasionally during boot the Qualcomm cpufreq driver was able to cause
> > an invalid memory access in topology_update_thermal_pressure() on the
> > line:
> > 
> > 	if (max_freq <= capped_freq)
> > 
> > It turns out that this was caused by a race, which resulted in the
> > cpumask passed to the function being empty, in which case
> > cpumask_first() will return a cpu beyond the number of valid cpus, which
> > when used to access the per_cpu max_freq would return invalid pointer.
> > 
> > The bug in the Qualcomm cpufreq driver is being fixed, but having a
> > sanity check of the arguments would have saved quite a bit of time and
> > it's not unlikely that others will run into the same issue.
> > 
> > Signed-off-by: Bjorn Andersson <bjorn.andersson@...aro.org>
> > ---
> >  drivers/base/arch_topology.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> > index 976154140f0b..6560a0c3b969 100644
> > --- a/drivers/base/arch_topology.c
> > +++ b/drivers/base/arch_topology.c
> > @@ -177,6 +177,9 @@ void topology_update_thermal_pressure(const struct cpumask *cpus,
> >  	u32 max_freq;
> >  	int cpu;
> >  
> > +	if (WARN_ON(cpumask_empty(cpus)))
> > +		return;
> > +
> 
> Why can't the caller check and call this only when cpus is not empty ?
> IIUC there are many such APIs that use cpumask and could result in similar
> issues if called with empty cpus. Probably we could add a note that cpus
> must not be empty if that helps the callers ?
> 

As indicated in the commit message, it took me a while to conclude that
the cause for a memory fault on what seemed to be a comparison between
two variables on the stack was actually caused by this race - which
isn't trivially reproducible, unless you know what the bug is.

Now _I_ know better and will hopefully recognize the oops signature
right away, but my hope was to put the sanity check on this side to save
the next caller of this API some time. Updating the comment probably
would have saved me a minute or two at the end, probably as confirmation
of my findings after the fact...

If you prefer to keep topology_update_thermal_pressure() clean(er) and
exciting I can hack around the issue in the Qualcomm driver.

PS. I'm onboard with Greg's objection to the WARN_ON()...

Regards,
Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ