linux-kernel - Re: [RFC PATCH] livepatch: allow removal of a disabled patch

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LNX.2.00.1605041623240.22292@pobox.suse.cz>
Date:	Wed, 4 May 2016 16:35:28 +0200 (CEST)
From:	Miroslav Benes <mbenes@...e.cz>
To:	Josh Poimboeuf <jpoimboe@...hat.com>
cc:	Jiri Kosina <jikos@...nel.org>, jeyu@...hat.com, pmladek@...e.com,
	jslaby@...e.cz, live-patching@...r.kernel.org,
	linux-kernel@...r.kernel.org, huawei.libin@...wei.com,
	minfei.huang@...oo.com
Subject: Re: [RFC PATCH] livepatch: allow removal of a disabled patch

On Wed, 4 May 2016, Josh Poimboeuf wrote:

> On Wed, May 04, 2016 at 01:58:47PM +0200, Miroslav Benes wrote:
> > On Tue, 3 May 2016, Josh Poimboeuf wrote:
> > 
> > > On Tue, May 03, 2016 at 09:39:48PM -0500, Josh Poimboeuf wrote:
> > > > On Wed, May 04, 2016 at 12:31:12AM +0200, Jiri Kosina wrote:
> > > > > On Tue, 3 May 2016, Josh Poimboeuf wrote:
> > > > > 
> > > > > > > 1. Do we really need a completion? If I am not missing something
> > > > > > > kobject_del() always waits for sysfs callers to leave thanks to kernfs
> > > > > > > active protection.
> > > > > > 
> > > > > > What do you mean by "kernfs active protection"?  I see that
> > > > > > kernfs_remove() gets the kernfs_mutex lock, but I can't find anywhere
> > > > > > that a write to a sysfs file uses that lock.
> > > > > > 
> > > > > > I'm probably missing something...
> > > > > 
> > > > > I don't want to speak on Miroslav's behalf, but I'm pretty sure that what 
> > > > > he has on mind is per-kernfs_node active refcounting kernfs does (see 
> > > > > kernfs_node->active, and especially it's usage in __kernfs_remove()).
> > > > > 
> > > > > More specifically, execution of store() and show() sysfs callbacks is 
> > > > > guaranteed (by kernfs) to happen with that particular attribute's active 
> > > > > reference held for reading (and that makes it impossible for that 
> > > > > attribute to vanish prematurely).
> > > > 
> > > > Thanks, that makes sense.
> > > > 
> > > > So what exactly is the problem the completion is trying to solve?  Is it
> > > > to ensure that the kobject has been cleaned up before it returns to the
> > > > caller, in case the user wants to call klp_register() again after
> > > > unregistering?
> > > > 
> > > > If so, that's quite an unusual use case which I think we should just
> > > > consider unsupported.  In fact, if you try to do it, kobject_init()
> > > > complains loudly because kobj->state_initialized is still 1 because
> > > > kobjects aren't meant to be reused like that.
> > > 
> > > ... and now I realize the point is actually to prevent the caller from
> > > freeing klp_patch before kobject_cleanup() runs.
> > 
> > Exactly. Sorry I was so brief.
> > 
> > > So yeah, it looks like we need the completion in case
> > > CONFIG_DEBUG_KOBJECT_RELEASE is enabled.
> > > 
> > > Or alternatively we could convert patch->kobj to be dynamically
> > > allocated instead of embedded in klp_patch.
> > 
> > But that wouldn't help, would it? Dynamic kobjects registers generic 
> > release function dynamic_kobj_release() and that's it. We're in the same 
> > situation. I have got a feeling that dynamic kobjects are only for trivial 
> > cases.
> 
> But the patch release doesn't need to do anything, right?

That is correct. I wanted to point out that dynamic_kobj_release() did not 
really solve our "completion" issue. If there is a problem in our code and 
we need completion, dynamic kobjects would not help. If we don't need a 
completion at all we could move to dynamic kobjects.

There is still container_of() though.

> > Moreover we use container_of() several times in the code and that does not
> > work with dynamically allocated kobjects.
> > 
> > Anyway I am really confused now. When I read changelog of c817a67ecba7 
> > ("kobject: delayed kobject release: help find buggy drivers") all makes 
> > perfect sense. But isn't our situation somewhat special, because we have 
> > refcounts completely under control? So we know that once we call 
> > kobject_put() we can let a patch go... I must be missing something.
> > 
> > It does not make sense to introduce completion just to satisfy a feature 
> > which was introduced to debug general cases.
> 
> I think our situation is "special" because klp_patch and its embedded
> kobject have separate lifetimes.  We have a kobject, which we own, which
> is embedded in a klp_patch struct, which we don't own.
> 
> If I understand correctly, normally when you release a kobject, the
> containing struct gets freed.  But we can't do that here because the
> caller allocated the klp_patch.

Normally only struct kobject is freed, no? 

>From Documentation/kobject.txt:

"One important point cannot be overstated: every kobject must have a
release() method, and the kobject must persist (in a consistent state)
until that method is called. If these constraints are not met, the code is
flawed."

So we need to only make sure that klp_patch does not disappear before 
calling release() method. In our case that cannot happen even without 
completion, because we call kobject_put() in klp_unregister_patch() when 
the patch module is going and kobject_put() calls our release (potentially 
empty) method in a "sync" way. If I read that code correctly.

This does not hold only if CONFIG_DEBUG_KOBJECT_RELEASE=y.

> So I really get the feeling that a dynamic kobject would be more
> appropriate here.

See above.

> That said, the sysfs and kobject stuff always throws me for a loop.  So
> take what I'm saying with several grains of salt.

Tell me about it.

Miroslav