[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130315163916.GA31995@redhat.com>
Date: Fri, 15 Mar 2013 17:39:16 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Andi Kleen <andi@...stfloor.org>,
Lucas De Marchi <lucas.de.marchi@...il.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Paul Mackerras <paulus@...ba.org>, david@...son.dropbear.id.au,
Kees Cook <keescook@...omium.org>,
Serge Hallyn <serge.hallyn@...onical.com>,
"Rafael J. Wysocki" <rjw@...k.pl>,
Feng Hong <hongfeng@...vell.com>,
Lucas De Marchi <lucas.demarchi@...fusion.mobi>
Subject: Re: [PATCH 1/1] poweroff: change orderly_poweroff() to use
schedule_work()
On 03/14, Andrew Morton wrote:
>
> On Wed, 13 Mar 2013 18:47:05 +0100 Oleg Nesterov <oleg@...hat.com> wrote:
>
> > This means that orderly_poweroff() becomes async even if we do not
> > run the command and always succeeds, schedule_work() can only fail
> > if the work is already pending. We can export __orderly_poweroff()
> > and change the non-atomic callers which want the old semantics.
> >
> > ...
> >
> > @@ -2218,21 +2237,9 @@ static int __orderly_poweroff(void)
> > */
> > int orderly_poweroff(bool force)
> > {
> > + if (force) /* do not override the pending "true" */
> > + poweroff_force = true;
> > + schedule_work(&poweroff_work);
> > + return 0;
> > }
>
> afaict the current version of orderly_poweroff() will never return -
> either __orderly_poweroff() will block until the machine shuts down or
> kernel_power_off() will do so.
Note that __orderly_poweroff() uses UMH_WAIT_EXEC, not UMH_WAIT_PROC,
so it returns right after /sbin/poweroff starts to execute.
So it is already asynchronous unless execve() fails.
> However with this patch there is a path via which orderly_poweroff()
> can return to its caller, I think?
See above, but please also read the changelog.
With this patch orderly_poweroff() is always async, even if exec fails,
but
> If so, the caller might be rather
> surprised and we're exercising never-before-used code paths. In fact
> if the surprised caller goes oops, the poweroff might not occur at all.
This should not happen.
Anyway. Please also note that now we can export __orderly_poweroff() and
probably change it, it can have another argument "bool use_UMH_WAIT_PROC".
int __orderly_poweroff(bool force, bool sync)
{
int wait = sync ? UMH_WAIT_EXEC : UMH_WAIT_EXEC;
ret = call_usermodehelper(argv[0], argv, envp, wait);
if (force) {
// EXEC failed or /sbin/poweroff didn't do its work
if (ret || sync)
kernel_power_off();
}
}
The non-atomic callers can use __orderly_poweroff(sync => true).
---------------------------------------------------------------------------
And, Andrew, et all... Could you help with another mentioned problem? It is
really trivial, but exactly because it is trivial I do not know what should
I do.
To remind, say, argv_split(poweroff_cmd) can race with sysctl changing this
string, in this case it can write to the memory after argv[] array. We can
fix this, or we can rewrite argv_split/free:
void argv_free(char **argv)
{
kfree(argv[-1]);
kfree(argv);
}
char **argv_split(gfp_t gfp, const char *str, int *argcp)
{
char *argv_str;
bool was_space;
char **argv, **argv_ret;
int argc;
argv_str = kstrndup(str, KMALLOC_MAX_SIZE, gfp);
if (!argv_str)
return NULL;
argc = count_argc(argv_str);
argv = kmalloc(sizeof(*argv) * (argc + 2), gfp);
if (!argv) {
kfree(argv_str);
return NULL;
}
*argv = argv_str;
argv_ret = ++argv;
for (was_space = true; *argv_str; argv_str++) {
if (isspace(*argv_str)) {
was_space = true;
*argv_str = 0;
} else if (was_space) {
was_space = false;
*argv++ = argv_str;
}
}
*argv = NULL;
if (argcp)
*argcp = argc;
return argv_ret;
}
This way it uses a single kstrndup() to keep the arguments and it is
always safe.
But, whatever we do with argv_split(), it can hit the string "in between".
Personally I think we do not really care, but...
Perhaps we should add proc_dostring_lock() which takes some lock and
modify the callers of argv_split() (or add argv_split_lock) ?
Or perhaps we should introduce the rwsem which should protect every
sysctl-string and proc_dostring() should take this lock?
Help! I'd prefer to rewrite argv_split(), but I agree with any suggestion
in advance.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists