lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 21 Feb 2024 08:03:07 +0100
From: Nuno Sá <noname.nuno@...il.com>
To: Saravana Kannan <saravanak@...gle.com>, Herve Codina
	 <herve.codina@...tlin.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J. Wysocki"
 <rafael@...nel.org>, Rob Herring <robh+dt@...nel.org>, Frank Rowand
 <frowand.list@...il.com>, Lizhi Hou <lizhi.hou@....com>, Max Zhen
 <max.zhen@....com>, Sonal Santan <sonal.santan@....com>, Stefano Stabellini
 <stefano.stabellini@...inx.com>, Jonathan Cameron
 <Jonathan.Cameron@...wei.com>,  linux-kernel@...r.kernel.org,
 devicetree@...r.kernel.org, Allan Nielsen <allan.nielsen@...rochip.com>,
 Horatiu Vultur <horatiu.vultur@...rochip.com>,  Steen Hegelund
 <steen.hegelund@...rochip.com>, Thomas Petazzoni
 <thomas.petazzoni@...tlin.com>, Luca Ceresoli <luca.ceresoli@...tlin.com>,
 Nuno Sa <nuno.sa@...log.com>, Android Kernel Team <kernel-team@...roid.com>
Subject: Re: [PATCH 2/2] of: overlay: Synchronize of_overlay_remove() with
 the devlink removals

On Tue, 2024-02-20 at 16:37 -0800, Saravana Kannan wrote:
> On Thu, Nov 30, 2023 at 9:41 AM Herve Codina <herve.codina@...tlin.com> wrote:
> > 
> > In the following sequence:
> >   1) of_platform_depopulate()
> >   2) of_overlay_remove()
> > 
> > During the step 1, devices are destroyed and devlinks are removed.
> > During the step 2, OF nodes are destroyed but
> > __of_changeset_entry_destroy() can raise warnings related to missing
> > of_node_put():
> >   ERROR: memory leak, expected refcount 1 instead of 2 ...
> > 
> > Indeed, during the devlink removals performed at step 1, the removal
> > itself releasing the device (and the attached of_node) is done by a job
> > queued in a workqueue and so, it is done asynchronously with respect to
> > function calls.
> > When the warning is present, of_node_put() will be called but wrongly
> > too late from the workqueue job.
> > 
> > In order to be sure that any ongoing devlink removals are done before
> > the of_node destruction, synchronize the of_overlay_remove() with the
> > devlink removals.
> > 
> 
> Add Fixes tag for this one too to point to the change that added the workqueue.
> 
> Please CC Nuno and Luca in your v2 series.
> 
> > Signed-off-by: Herve Codina <herve.codina@...tlin.com>
> > ---
> >  drivers/of/overlay.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
> > index a9a292d6d59b..5c5f808b163e 100644
> > --- a/drivers/of/overlay.c
> > +++ b/drivers/of/overlay.c
> > @@ -1202,6 +1202,12 @@ int of_overlay_remove(int *ovcs_id)
> >                 goto out;
> >         }
> > 
> > +       /*
> > +        * Wait for any ongoing device link removals before removing some of
> > +        * nodes
> > +        */
> > +       device_link_wait_removal();
> > +
> 
> Nuno in his patch[1] had this "wait" happen inside
> __of_changeset_entry_destroy(). Which seems to be necessary to not hit
> the issue that Luca reported[2] in this patch series. Is there any
> problem with doing that?
> 

In my tests, I did not saw any issue. Logically it also makes sense as you wait for
all possible refcount drops just before checking your assumptions. But I might be
missing something though...

> Luca for some reason did a unlock/lock(of_mutex) in his test patch and
> I don't think that's necessary.

Yes, I agree. queue_work() and flush_worqueue() should already have all the
synchronization semantics internally.

- Nuno Sá



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ