linux-kernel - Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.21.2110251144270.7294@pobox.suse.cz>
Date:   Mon, 25 Oct 2021 11:58:51 +0200 (CEST)
From:   Miroslav Benes <mbenes@...e.cz>
To:     Greg KH <gregkh@...uxfoundation.org>
cc:     Ming Lei <ming.lei@...hat.com>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>, tj@...nel.org,
        akpm@...ux-foundation.org, minchan@...nel.org, jeyu@...nel.org,
        shuah@...nel.org, bvanassche@....org, dan.j.williams@...el.com,
        joe@...ches.com, tglx@...utronix.de, keescook@...omium.org,
        rostedt@...dmis.org, linux-spdx@...r.kernel.org,
        linux-doc@...r.kernel.org, linux-block@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, linux-kselftest@...r.kernel.org,
        linux-kernel@...r.kernel.org, live-patching@...r.kernel.org,
        pmladek@...e.com
Subject: Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate

On Wed, 20 Oct 2021, Greg KH wrote:

> On Wed, Oct 20, 2021 at 10:19:27AM +0200, Miroslav Benes wrote:
> > On Wed, 20 Oct 2021, Ming Lei wrote:
> > 
> > > On Wed, Oct 20, 2021 at 08:43:37AM +0200, Miroslav Benes wrote:
> > > > On Tue, 19 Oct 2021, Ming Lei wrote:
> > > > 
> > > > > On Tue, Oct 19, 2021 at 08:23:51AM +0200, Miroslav Benes wrote:
> > > > > > > > By you only addressing the deadlock as a requirement on approach a) you are
> > > > > > > > forgetting that there *may* already be present drivers which *do* implement
> > > > > > > > such patterns in the kernel. I worked on addressing the deadlock because
> > > > > > > > I was informed livepatching *did* have that issue as well and so very
> > > > > > > > likely a generic solution to the deadlock could be beneficial to other
> > > > > > > > random drivers.
> > > > > > > 
> > > > > > > In-tree zram doesn't have such deadlock, if livepatching has such AA deadlock,
> > > > > > > just fixed it, and seems it has been fixed by 3ec24776bfd0.
> > > > > > 
> > > > > > I would not call it a fix. It is a kind of ugly workaround because the 
> > > > > > generic infrastructure lacked (lacks) the proper support in my opinion. 
> > > > > > Luis is trying to fix that.
> > > > > 
> > > > > What is the proper support of the generic infrastructure? I am not
> > > > > familiar with livepatching's model(especially with module unload), you mean
> > > > > livepatching have to do the following way from sysfs:
> > > > > 
> > > > > 1) during module exit:
> > > > > 	
> > > > > 	mutex_lock(lp_lock);
> > > > > 	kobject_put(lp_kobj);
> > > > > 	mutex_unlock(lp_lock);
> > > > > 	
> > > > > 2) show()/store() method of attributes of lp_kobj
> > > > > 	
> > > > > 	mutex_lock(lp_lock)
> > > > > 	...
> > > > > 	mutex_unlock(lp_lock)
> > > > 
> > > > Yes, this was exactly the case. We then reworked it a lot (see 
> > > > 958ef1e39d24 ("livepatch: Simplify API by removing registration step"), so 
> > > > now the call sequence is different. kobject_put() is basically offloaded 
> > > > to a workqueue scheduled right from the store() method. Meaning that 
> > > > Luis's work would probably not help us currently, but on the other hand 
> > > > the issues with AA deadlock were one of the main drivers of the redesign 
> > > > (if I remember correctly). There were other reasons too as the changelog 
> > > > of the commit describes.
> > > > 
> > > > So, from my perspective, if there was a way to easily synchronize between 
> > > > a data cleanup from module_exit callback and sysfs/kernfs operations, it 
> > > > could spare people many headaches.
> > > 
> > > kobject_del() is supposed to do so, but you can't hold a shared lock
> > > which is required in show()/store() method. Once kobject_del() returns,
> > > no pending show()/store() any more.
> > > 
> > > The question is that why one shared lock is required for livepatching to
> > > delete the kobject. What are you protecting when you delete one kobject?
> > 
> > I think it boils down to the fact that we embed kobject statically to 
> > structures which livepatch uses to maintain data. That is discouraged 
> > generally, but all the attempts to implement it correctly were utter 
> > failures.
> 
> Sounds like this is the real problem that needs to be fixed.  kobjects
> should always control the lifespan of the structure they are embedded
> in.  If not, then that is a design flaw of the user of the kobject :(

Right, and you've already told us. A couple of times.

For example 
here https://lore.kernel.org/all/20190502074230.GA27847@kroah.com/

:)
 
> Where in the kernel is this happening?  And where have been the attempts
> to fix this up?

include/linux/livepatch.h and kernel/livepatch/core.c. See 
klp_{patch,object,func}.

It took some archeology, but I think 
https://lore.kernel.org/all/1464018848-4303-1-git-send-email-pmladek@suse.com/ 
is it. Petr might correct me.

It was long before we added some important features to the code, so it 
might be even more difficult today.

It resurfaced later when Tobin tried to fix some of kobject call sites in 
the kernel...

https://lore.kernel.org/all/20190430001534.26246-1-tobin@kernel.org/
https://lore.kernel.org/all/20190430233803.GB10777@eros.localdomain/
https://lore.kernel.org/all/20190502023142.20139-6-tobin@kernel.org/

There are probably more references.

Anyway, the current code works fine (well, one could argue about that). If 
someone wants to take a (another) stab at this, then why not, but it 
seemed like a rabbit hole without a substantial gain in the past. On the 
other hand, we currently misuse the API to some extent.

/me scratches head

Miroslav