[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <87fuljjua3.fsf@rustcorp.com.au>
Date:   Tue, 20 Dec 2016 11:23:08 +1030
From:   Rusty Russell <rusty@...tcorp.com.au>
To:     "Luis R. Rodriguez" <mcgrof@...nel.org>
Cc:     Filipe Manana <fdmanana@...e.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        linux-doc@...r.kernel.org, rgoldwyn@...e.com, hare <hare@...e.com>,
        Jonathan Corbet <corbet@....net>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-kselftest@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Dan Williams <dan.j.williams@...el.com>,
        Aaron Tomlin <atomlin@...hat.com>, rwright@....com,
        Heinrich Schuchardt <xypron.glpk@....de>,
        Michal Marek <mmarek@...e.com>, martin.wilck@...e.com,
        Jeff Mahoney <jeffm@...e.com>, Ingo Molnar <mingo@...hat.com>,
        Petr Mladek <pmladek@...e.com>,
        Dmitry Torokhov <dmitry.torokhov@...il.com>,
        Guenter Roeck <linux@...ck-us.net>,
        "Eric W. Biederman" <ebiederm@...ssion.com>, shuah@...nel.org,
        DSterba@...e.com, Kees Cook <keescook@...omium.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Miroslav Benes <mbenes@...e.cz>, NeilBrown <neilb@...e.com>,
        linux-kernel@...r.kernel.org, David Miller <davem@...emloft.net>,
        Jessica Yu <jeyu@...hat.com>,
        Subash Abhinov Kasiviswanathan <subashab@...eaurora.org>,
        Julia Lawall <julia.lawall@...6.fr>
Subject: Re: [RFC 10/10] kmod: add a sanity check on module loading
"Luis R. Rodriguez" <mcgrof@...nel.org> writes:
> On Dec 16, 2016 9:54 PM, "Rusty Russell" <rusty@...tcorp.com.au> wrote:
> > AFAICT the mistake here is that kmod is returning "done, OK" when the
> > module it is trying to load is already loading (but not finished
> > loading).  That's the root problem; it's an attempt at optimization by
> > kmod which goes awry.
>
> This is true! To be precise though the truth of the matter is that kmod'd
> respective usermode helper: modprobe can be buggy and may lie to us. It may
> allow request_module() to return 0 but since we don't validate it, any
> assumption we make can be deadly. In the case of get_fs_type() its a null
> dereference.
Wait, what??  I can't see that in get_fs_type, which hasn't changed
since 2013.  If a caller is assuming get_fs_type() doesn't return NULL,
they're broken and need fixing of course:
        struct file_system_type *get_fs_type(const char *name)
        {
        	struct file_system_type *fs;
        	const char *dot = strchr(name, '.');
        	int len = dot ? dot - name : strlen(name);
        	fs = __get_fs_type(name, len);
        	if (!fs && (request_module("fs-%.*s", len, name) == 0))
        		fs = __get_fs_type(name, len);
        	if (dot && fs && !(fs->fs_flags & FS_HAS_SUBTYPE)) {
        		put_filesystem(fs);
        		fs = NULL;
        	}
        	return fs;
        }
Where does this NULL-deref is the module isn't correctly loaded?
> *Iff* we want a sanity check to verify kmod's umh is not lying to us we
> need to verify after 0 was returned that it was not lying to us. Since kmod
> accepts aliases but find_modules_all() only works on the real module name a
> validation check cannot happen when all you have are aliases.
request_module() should block until resolution, but that's fundamentally
a userspace problem.  Let's not paper over it in kernelspace.
> *Iff* we are sure we don't want a validation (or another earlier
> optimization to avoid calling out to modrobe if the alias requested is
> already present, which does the time shaving I mentioned on the tests) then
> naturally no request_module() calls returning 0 can assert information
> about the requested module. I think we might need to change more code if we
> accept we cannot trust request_module() calls, or we accept userspace
> telling the kernel something may mean we sometimes crash. This later
> predicament seems rather odd so hence the patch.
>
> Perhaps in some cases validation of work from a umh is not critical in
> kernel but for request_module() I can tell you that today get_fs_type code
> currently asserts the module found can never be NULL.
OK, what am I missing in the code above?  
> > Looking at the code in the kernel, we *already* get this right: block if
> > a module is still loading anyway.  Once it succeeds we return -EBUSY if
> >
> > it fails we'll proceed to try to load it again.
> >
> > I don't understand what you're trying to fix with adding aliases
> > in-kernel?
>
> Two fold now:
>
> a) validation on request_module() work when an alias is used
But why?
> b) since kmod accepts aliaes, if we get aliases support, it means we could
> *also* preemptively avoid calling out to userspace for modules already
> present.
No, because once we have a module we don't request it: requesting is the
fallback case.
> >> FWIW a few things did occur to me:
> >>
> >> a) list_add_rcu() is used so new modules get added first
> >
> > Only after we're sure that there are no duplicates.
> >
> >
> OK! This is a very critical assertion. I should be able to add a debug
> WARN_ON() should two modules be on the modules list for the same module
> then ?
Yes, names must be unique.
>> b) find_module_all() returns the last module which was added as it
> traverses
>>    the module list
>
>> BTW should find_module_all() use rcu to traverse?
>
> Yes; the kallsyms code does this on Oops.  Not really a big issue in
> practice, but a nice fix.
>
> Ok, will bundle into my queue.
Please submit to Jessica for her module queue, as it's orthogonal
AFAICT.
> I will note though that I still think there's a bug in this code --
> upon a failure other "spinning" requests can fail, I believe this may
> be due to not having another state or informing pending modules too
> early of a failure but I haven't been able to prove this conjecture
> yet.
That's possible, but I can't see it from quickly re-checking the code.
The module should be fully usable at this point; the module's init has
been called successfully, so in the case of __get_fs_type() it should
now succeed.  The module cleans up its init section, but that should be
independent.
If there is a race, it's likely to be when some other caller wakes the
queue.  Moving the wakeup as soon as possible should make it easier to
trigger:
diff --git a/kernel/module.c b/kernel/module.c
index f57dd63186e6..78bd89d41a22 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -3397,6 +3397,7 @@ static noinline int do_init_module(struct module *mod)
 
 	/* Now it's a first class citizen! */
 	mod->state = MODULE_STATE_LIVE;
+	wake_up_all(&module_wq);
 	blocking_notifier_call_chain(&module_notify_list,
 				     MODULE_STATE_LIVE, mod);
 
@@ -3445,7 +3446,6 @@ static noinline int do_init_module(struct module *mod)
 	 */
 	call_rcu_sched(&freeinit->rcu, do_free_init);
 	mutex_unlock(&module_mutex);
-	wake_up_all(&module_wq);
 
 	return 0;
 
Thanks,
Rusty.
Powered by blists - more mailing lists
 
