linux-kernel - Re: [PATCH 3/4] kthreads: rework kthread

Date:	Mon, 02 Feb 2009 09:57:36 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	Rusty Russell <rusty@...tcorp.com.au>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Christoph Hellwig <hch@....de>, Ingo Molnar <mingo@...e.hu>,
	Pavel Emelyanov <xemul@...nvz.org>,
	Vitaliy Gusev <vgusev@...nvz.org>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] kthreads: rework kthread_stop()

Oleg Nesterov <oleg@...hat.com> writes:

> On 01/31, Rusty Russell wrote:
>>
>> On Friday 30 January 2009 23:20:58 Oleg Nesterov wrote:
>> > On 01/30, Oleg Nesterov wrote:
>> > >
>> > > With this patch kthread() allocates all necessary data (struct kthread)
>> > > on its own stack, globals kthread_stop_xxx are deleted. ->vfork_done
>> > > is used as a pointer into "struct kthread", this means kthread_stop()
>> > > can easily wait for kthread's exit.
>> 
>> > struct kthread {
>> > 	int should_stop;
>> > 	struct completion exited;
>> > };
>> 
>> Mildly prefer bool in new code.
>
> OK, and
>
>> > #define to_kthread(tsk)	\
>> > 	container_of((tsk)->vfork_done, struct kthread, exited)
>> 
>> This needs a comment.  Especially since to_xxx(yyy) is usually simply a
>> container_of(yyy, xxx, member).  This one is special.
>
> OK, I'll send the cleanup patch.
>
>> > int kthread_stop(struct task_struct *k)
>> > {
>> > 	struct kthread *kthread;
>> > 	int ret;
>> >
>> > 	trace_sched_kthread_stop(k);
>> > 	get_task_struct(k);
>> >
>> > 	kthread = to_kthread(k);
>> > 	barrier(); /* it might have exited */
>> > 	if (k->vfork_done != NULL) {
>> > 		kthread->should_stop = 1;
>> > 		wake_up_process(k);
>> > 		wait_for_completion(&kthread->exited);
>> > 	}
>> > 	ret = k->exit_code;
>>
>> I don't think this works.  How does do_exit() preserve a stack var, other
>> than for a few cycles longer?  Sure, the vfork_done will be OK, but this code
>> here will not be.  I think you'd need a get_task_struct(current) before the
>> do_exit(ret)
>
> I think this works ;)
>
> This stack frame can't disappear until
> __put_task_struct()->...->free_thread_info().
> So, if you have a reference to task_struct, then it is safe to dereference
> to_kthread(task).

Correct.
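
To make the lifetime argument concrete, here is a rough user-space sketch,
not kernel code. Every model_* name is made up for illustration: a reference
count stands in for get_task_struct()/put_task_struct(), and a heap buffer
stands in for the kernel stack that free_thread_info() releases. The only
point is that data living in the "stack" stays valid after the thread has
exited for as long as someone still holds a reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
struct model_kthread {
	int should_stop;
	int exited;
};

struct model_task {
	int usage;                      /* models the task_struct refcount */
	void *stack;                    /* models the thread's kernel stack */
	struct model_kthread *kthread;  /* lives inside ->stack */
};

static int model_stack_frees;           /* how many stacks were released */

static struct model_task *model_task_create(void)
{
	struct model_task *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->stack = calloc(1, 4096);
	if (!t->stack) {
		free(t);
		return NULL;
	}
	t->usage = 1;                   /* the thread's own reference */
	t->kthread = (struct model_kthread *)t->stack;
	return t;
}

/* Models get_task_struct(): pin the task and, with it, its stack. */
static void model_get_task(struct model_task *t)
{
	t->usage++;
}

/* Models put_task_struct(); returns 1 if this call freed the stack. */
static int model_put_task(struct model_task *t)
{
	assert(t->usage > 0);
	if (--t->usage > 0)
		return 0;
	free(t->stack);                 /* models free_thread_info() */
	free(t);
	model_stack_frees++;
	return 1;
}

/* Models the thread exiting: it only drops its own reference. */
static int model_task_exit(struct model_task *t)
{
	t->kthread->exited = 1;
	return model_put_task(t);
}
```

A caller that did model_get_task() before model_task_exit() runs can still
read t->kthread afterwards. The stack only goes away on the caller's own
model_put_task().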

> Before this patch, kthread_stop() can only be used when we know that kthread
> must not exit on its own. And with this patch we are safe in this case, note
> that kthread_stop() does get_task_struct() before it sets ->should_stop = 1.
> And this also pins the memory pointed to by to_kthread().

>> (the case where the kthread fn calls do_exit() is fine: you're
>> not allowed to call kthread stop on such threads).
>
> This was not allowed, but now this is fine. Please look at the 4/4 patch.
> But, in that case you must pin the task_struct after kthread_create(),
> otherwise (with or without this patch) you just can't use this task_struct
> in any way.

To finish converting everything from kernel_thread to kthreads, we need
to be able to call kthread_stop on threads that call do_exit.  That
kthread_stop cannot be called on such threads is currently a major
deficiency of the kthread API.

>> In which case using vfork_done is really just a convenience pointer inside
>> struct task_struct to stash the struct kthread.  And that's horribly ugly,
>> which is why I stuck with a simple global.  Changing to a linked-list of things
>> to stop would avoid the deadlock you mentioned where a kthread stops another
>> kthread.
>
> Well, this patch overloads ->vfork_done, and I agree this is a bit ugly.
> But what you suggest (if I understand correctly) is more complex, and doesn't
> have any advantages, imho.

No.  This patch does not overload or really abuse vfork_done.
vfork_done is named a bit misleadingly.  It should arguably be called
mm_done.  vfork_done (if set) always points to a completion that will
be completed when do_exit() -> mm_release() is called.

What we want is some completion that is signalled from inside of
do_exit.  mm_release happens to signal such a completion, and the code
already exists.

If mm_release did not do: tsk->vfork_done = NULL then the entire
test of if (k->vfork_done != NULL) would be unnecessary.
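
As a single-threaded illustration of why that test exists, here is a small
user-space model. It is a sketch only: the model_* names and the trivial
"completion" with a done flag are assumptions made for the example, not the
kernel's definitions. It shows the two facts that matter: the struct kthread
is recovered from the completion pointer, and once the modelled mm_release
has run the pointer is gone, so there is nothing left to recover.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures discussed here. */
struct model_completion {
	int done;
};

struct model_kthread {
	int should_stop;
	struct model_completion exited;
};

struct model_task {
	struct model_completion *vfork_done;
};

/* Same idea as container_of(): step back from a member to its parent. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models the relevant part of mm_release(): clear the pointer, complete. */
static void model_mm_release(struct model_task *tsk)
{
	struct model_completion *c = tsk->vfork_done;

	if (c) {
		tsk->vfork_done = NULL;
		c->done = 1;
	}
}

/*
 * Unlike most to_xxx() helpers, this does not start from the task itself
 * but from the completion that ->vfork_done points at.  Returns NULL once
 * the modelled mm_release() has cleared the pointer.
 */
static struct model_kthread *model_to_kthread(struct model_task *tsk)
{
	struct model_completion *c;

	assert(tsk != NULL);
	c = tsk->vfork_done;            /* read the pointer exactly once */
	if (!c)
		return NULL;
	return model_container_of(c, struct model_kthread, exited);
}
```

If the modelled mm_release() left ->vfork_done alone, model_to_kthread()
could drop its NULL check and simply test the done flag.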

Oleg, on that note, we should not need a barrier at all.  We should be
able to simply say:

cmplp = k->vfork_done;
if (cmplp) {
	/* if vfork_done is NULL we have passed mm_release */
	kthread = container_of(cmplp, struct kthread, exited);
	kthread->should_stop = 1;
	wake_up_process(k);
	wait_for_completion(&kthread->exited);
}
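
For anyone who wants to play with the handshake outside the kernel, below is
a user-space approximation using pthreads. It is a sketch under loud
assumptions: all model_* names are invented, a mutex and condition variable
stand in for wake_up_process() and the completion, the struct kthread is
embedded in the task rather than living on the thread's stack, and the exit
code 42 is arbitrary. The mutex also makes the model stricter than the
lockless kernel code, so it says nothing about memory ordering.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct model_completion {
	int done;
};

struct model_kthread {
	int should_stop;
	struct model_completion exited;
};

struct model_task {
	pthread_t thread;
	pthread_mutex_t lock;           /* guards every field below */
	pthread_cond_t wake;            /* models wakeups and complete() */
	struct model_completion *vfork_done;
	struct model_kthread kthread;   /* on the thread's stack in the patch */
	int exit_code;
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* The thread body: sleep until asked to stop, then "exit". */
static void *model_thread_fn(void *arg)
{
	struct model_task *k = arg;

	pthread_mutex_lock(&k->lock);
	while (!k->kthread.should_stop)
		pthread_cond_wait(&k->wake, &k->lock);
	k->exit_code = 42;              /* arbitrary value for the example */
	k->vfork_done = NULL;           /* models mm_release(): clear ... */
	k->kthread.exited.done = 1;     /* ... then complete */
	pthread_cond_broadcast(&k->wake);
	pthread_mutex_unlock(&k->lock);
	return NULL;
}

static int model_kthread_start(struct model_task *k)
{
	pthread_mutex_init(&k->lock, NULL);
	pthread_cond_init(&k->wake, NULL);
	k->kthread.should_stop = 0;
	k->kthread.exited.done = 0;
	k->vfork_done = &k->kthread.exited;
	k->exit_code = 0;
	return pthread_create(&k->thread, NULL, model_thread_fn, k);
}

/* Mirrors the snippet above: read the pointer once, then act on the copy. */
static int model_kthread_stop(struct model_task *k)
{
	struct model_completion *cmplp;
	struct model_kthread *kthread;
	int ret;

	assert(k != NULL);
	pthread_mutex_lock(&k->lock);
	cmplp = k->vfork_done;
	if (cmplp) {
		/* a non-NULL pointer means mm_release has not run yet */
		kthread = model_container_of(cmplp, struct model_kthread, exited);
		kthread->should_stop = 1;
		pthread_cond_broadcast(&k->wake);
		while (!cmplp->done)
			pthread_cond_wait(&k->wake, &k->lock);
	}
	ret = k->exit_code;
	pthread_mutex_unlock(&k->lock);
	pthread_join(k->thread, NULL);
	return ret;
}
```

Calling model_kthread_start() followed by model_kthread_stop() returns the
thread's exit code whether the stop request arrives before or after the
thread first checks should_stop.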

Thinking about it, I wish we had somewhere to store a pointer that
would not be cleared, so we could remove that whole confusing
conditional.  I just looked through task_struct and there doesn't
appear to be anything promising.

Perhaps we could rename vfork_done to mm_done and not clear it in
mm_release.

Eric