linux-kernel - Re: [PATCH 02/14] aoe: kernel thread handles I/O completions for simple locking

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-Id: <1345944687.24968@cat.he.net>
Date:	Sat, 25 Aug 2012 18:31:27 -0700
From:	ecashin@....he.net
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org, Ed Cashin <ecashin@...aid.com>
Subject: Re: [PATCH 02/14] aoe: kernel thread handles I/O completions for
 simple locking

On Fri, 24 Aug 2012 14:22:57 -0700, Andrew Morton wrote:

> On Fri, 17 Aug 2012 21:24:08 -0400
> Ed Cashin <ecashin@...aid.com> wrote:
...
> > +#ifdef PF_NOFREEZE
> 
> PF_NOFREEZE can never be undefined.
> 
> > +	current->flags |= PF_NOFREEZE;
> > +#endif
> > +	set_user_nice(current, -10);
> > +	sigfillset(&blocked);
> > +	sigprocmask(SIG_BLOCK, &blocked, NULL);
> > +	flush_signals(current);
> 
> This is a kernel thread - it shouldn't need to fiddle with signals.
> 
> > +	complete(&k->rendez);
> 
> That's odd.  Why do a complete() before we even start?  A code comment
> is needed if this is indeed correct.

There is a whole class of races that goes away when the code starting
a thread waits for the thread to be running before proceeding, and that's
what this rendezvous is for.  I've added a comment that will appear next
time I send the patchset.

(More comments below.)

> > +	do {
> > +		__set_current_state(TASK_UNINTERRUPTIBLE);
> 
> I think this statement is simply unneeded.
> 
> > +		spin_lock_irq(k->lock);
> > +		more = k->fn();
> > +		if (!more) {
> > +			add_wait_queue(k->waitq, &wait);
> > +			__set_current_state(TASK_INTERRUPTIBLE);
> > +		}
> > +		spin_unlock_irq(k->lock);
> > +		if (!more) {
> > +			schedule();
> > +			remove_wait_queue(k->waitq, &wait);
> > +		} else
> > +			cond_resched();
> 
> Here we can do a cond_resched() when in state TASK_INTERRUPTIBLE.  Such
> a schedule() will never return unless some other thread flips this task
> into state TASK_RUNNING.  But if another thread does that, we should
> have been on that waitqueue!
> 
> It seems all confused and racy.

When we do a cond_resched, it's only when "more" is non-zero, in which
case, we did not set the state to TASK_INTERRUPTIBLE.  I do like
your suggestions, though.

Please check out the (post-patchset) changes below that I plan to
incorporate into the patchset for resubmission, and let me know if
you see a race now that your suggestions have been incorporated.

It seems to work just as well in the testing I did with the changes
below incorporated.

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index d91b8d0..97d05fa 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -1075,20 +1075,13 @@ kthread(void *vp)
 {
 	struct ktstate *k;
 	DECLARE_WAITQUEUE(wait, current);
-	sigset_t blocked;
 	int more;
 
 	k = vp;
-#ifdef PF_NOFREEZE
 	current->flags |= PF_NOFREEZE;
-#endif
 	set_user_nice(current, -10);
-	sigfillset(&blocked);
-	sigprocmask(SIG_BLOCK, &blocked, NULL);
-	flush_signals(current);
-	complete(&k->rendez);
+	complete(&k->rendez);	/* tell spawner we're running */
 	do {
-		__set_current_state(TASK_UNINTERRUPTIBLE);
 		spin_lock_irq(k->lock);
 		more = k->fn();
 		if (!more) {
@@ -1102,8 +1095,7 @@ kthread(void *vp)
 		} else
 			cond_resched();
 	} while (!kthread_should_stop());
-	__set_current_state(TASK_RUNNING);
-	complete(&k->rendez);
+	complete(&k->rendez);	/* tell spawner we're stopping */
 	return 0;
 }
 
@@ -1122,10 +1114,10 @@ aoe_ktstart(struct ktstate *k)
 	init_completion(&k->rendez);
 	task = kthread_run(kthread, k, k->name);
 	if (task == NULL || IS_ERR(task))
-		return -EFAULT;
+		return -ENOMEM;
 	k->task = task;
-	wait_for_completion(&k->rendez);
-	init_completion(&k->rendez);	/* for exit */
+	wait_for_completion(&k->rendez); /* allow kthread to start */
+	init_completion(&k->rendez);	/* for waiting for exit later */
 	return 0;
 }
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/