lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAGETcx9W81JNeqKFJvG_ydGMdBML9SvNMsmoxnCuyKgf2vN89w@mail.gmail.com>
Date: Tue, 5 Mar 2024 18:59:47 -0800
From: Saravana Kannan <saravanak@...gle.com>
To: Nuno Sá <noname.nuno@...il.com>
Cc: Herve Codina <herve.codina@...tlin.com>, Rob Herring <robh@...nel.org>, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J. Wysocki" <rafael@...nel.org>, 
	Frank Rowand <frowand.list@...il.com>, Lizhi Hou <lizhi.hou@....com>, Max Zhen <max.zhen@....com>, 
	Sonal Santan <sonal.santan@....com>, Stefano Stabellini <stefano.stabellini@...inx.com>, 
	Jonathan Cameron <Jonathan.Cameron@...wei.com>, linux-kernel@...r.kernel.org, 
	devicetree@...r.kernel.org, Allan Nielsen <allan.nielsen@...rochip.com>, 
	Horatiu Vultur <horatiu.vultur@...rochip.com>, 
	Steen Hegelund <steen.hegelund@...rochip.com>, Luca Ceresoli <luca.ceresoli@...tlin.com>, 
	Nuno Sa <nuno.sa@...log.com>, Thomas Petazzoni <thomas.petazzoni@...tlin.com>, 
	stable@...r.kernel.org
Subject: Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with
 the devlink removals

On Tue, Mar 5, 2024 at 2:43 AM Nuno Sá <noname.nuno@...il.com> wrote:
>
> On Tue, 2024-03-05 at 11:27 +0100, Herve Codina wrote:
> > Hi Nuno, Saravana, Rob,
> >
> > On Tue, 05 Mar 2024 08:36:45 +0100
> > Nuno Sá <noname.nuno@...il.com> wrote:
> >
> > > On Mon, 2024-03-04 at 22:47 -0800, Saravana Kannan wrote:
> > > > On Mon, Mar 4, 2024 at 8:49 AM Herve Codina <herve.codina@...tlin.com>
> > > > wrote:
> > > > >
> > > > > Hi Rob,
> > > > >
> > > > > On Mon, 4 Mar 2024 09:22:02 -0600
> > > > > Rob Herring <robh@...nel.org> wrote:
> > > > >
> > > > > ...
> > > > >
> > > > > > > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > > > > > > overlay_changeset *ovcs)
> > > > > > > >  {
> > > > > > > >   int i;
> > > > > > > >
> > > > > > > > + /*
> > > > > > > > +  * Wait for any ongoing device link removals before removing
> > > > > > > > some of
> > > > > > > > +  * nodes. Drop the global lock while waiting
> > > > > > > > +  */
> > > > > > > > + mutex_unlock(&of_mutex);
> > > > > > > > + device_link_wait_removal();
> > > > > > > > + mutex_lock(&of_mutex);
> > > > > > >
> > > > > > > I'm still not convinced we need to drop the lock. What happens if
> > > > > > > someone else
> > > > > > > grabs the lock while we are in device_link_wait_removal()? Can we
> > > > > > > guarantee that
> > > > > > > we can't screw things badly?
> > > > > >
> > > > > > It is also just ugly because it's the callers of
> > > > > > free_overlay_changeset() that hold the lock and now we're releasing it
> > > > > > behind their back.
> > > > > >
> > > > > > As device_link_wait_removal() is called before we touch anything,
> > > > > > can't
> > > > > > it be called before we take the lock? And do we need to call it if
> > > > > > applying the overlay fails?
> > > >
> > > > Rob,
> > > >
> > > > This[1] scenario Luca reported seems like a reason for the
> > > > device_link_wait_removal() to be where Herve put it. That example
> > > > seems reasonable.
> > > >
> > > > [1] - https://lore.kernel.org/all/20231220181627.341e8789@booty/
> > > >
> > >
> > > I'm still not totally convinced about that. Why not putting the check right
> > > before checking the kref in __of_changeset_entry_destroy(). I'll contradict
> > > myself a bit because this is just theory but if we look at pci_stop_dev(),
> > > which
> > > AFAIU, could be reached from a sysfs write(), we have:
> > >
> > > device_release_driver(&dev->dev);
> > > ...
> > > of_pci_remove_node(dev);
> > >     of_changeset_revert(np->data);
> > >     of_changeset_destroy(np->data);
> > >
> > > So looking at the above we would hit the same issue if we flush the queue in
> > > free_overlay_changeset() - as the queue won't be flushed at all and we could
> > > have devlink removal due to device_release_driver(). Right?
> > >
> > > Again, completely theoretical but seems like a reasonable one plus I'm not
> > > understanding the push against having the flush in
> > > __of_changeset_entry_destroy(). Conceptually, it looks the best place to me
> > > but
> > > I may be missing some issue in doing it there?
> >
> > Instead of having the wait called in __of_changeset_entry_destroy() and so
> > called in a loop. I could move this call in the __of_changeset_entry_destroy()
> > caller (without any of_mutex lock drop).
> >
>
> Oh, good catch! At this point all the devlinks removals (related to the
> changeset) should have been queued so yes, we should only need to flush once.
>
> > So this will look like this:
> > --- 8< ---
> > void of_changeset_destroy(struct of_changeset *ocs)
> > {
> >       struct of_changeset_entry *ce, *cen;
> >
> >       device_link_wait_removal();
> >
> >       list_for_each_entry_safe_reverse(ce, cen, &ocs->entries, node)
> >               __of_changeset_entry_destroy(ce);
> > }
> > --- 8< ---
> >
> > I already tested on my system and it works correctly with
> > device_link_wait_removal() only called from of_changeset_destroy()
> > as proposed.
> >
> > Saravana, Nuno, Rob does it seems ok for you ?

Looks good to me.

-Saravana

> >
>
> It looks good to me...
>
> - Nuno Sá
> >
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ