Message-ID: <20090401194511.GB16033@redhat.com>
Date: Wed, 1 Apr 2009 21:45:11 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Ingo Molnar <mingo@...e.hu>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Markus Metzger <markus.t.metzger@...el.com>,
linux-kernel@...r.kernel.org, tglx@...utronix.de, hpa@...or.com,
markus.t.metzger@...il.com, roland@...hat.com,
eranian@...glemail.com, juan.villacis@...el.com,
ak@...ux.jf.intel.com
Subject: Re: [patch 3/21] x86, bts: wait until traced task has been
scheduled out

On 04/01, Ingo Molnar wrote:
>
> * Oleg Nesterov <oleg@...hat.com> wrote:
>
> > On 03/31, Markus Metzger wrote:
> > >
> > > +static void wait_to_unschedule(struct task_struct *task)
> > > +{
> > > +	unsigned long nvcsw;
> > > +	unsigned long nivcsw;
> > > +
> > > +	if (!task)
> > > +		return;
> > > +
> > > +	if (task == current)
> > > +		return;
> > > +
> > > +	nvcsw = task->nvcsw;
> > > +	nivcsw = task->nivcsw;
> > > +	for (;;) {
> > > +		if (!task_is_running(task))
> > > +			break;
> > > +		/*
> > > +		 * The switch count is incremented before the actual
> > > +		 * context switch. We thus wait for two switches to be
> > > +		 * sure at least one completed.
> > > +		 */
> > > +		if ((task->nvcsw - nvcsw) > 1)
> > > +			break;
> > > +		if ((task->nivcsw - nivcsw) > 1)
> > > +			break;
> > > +
> > > +		schedule();
> >
> > schedule() is a nop here. We can wait unpredictably long...
> >
> > Ingo, do you have any ideas to improve this helper?
>
> hm, there's a similar looking existing facility:
> wait_task_inactive(). Have i missed some subtle detail that makes it
> inappropriate for use here?
Yes, they are similar, but still different.

wait_to_unschedule(task) waits until the task has done a context switch at
least once. It is fine if the task is running again by the time
wait_to_unschedule() returns (if !task_is_running(task), it has already
done a context switch).

wait_task_inactive() ensures that the task is deactivated. It can't be
used here, because the task may "never" be deactivated.
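To put the difference in code: a minimal illustrative sketch (not part of
the patch, the helper name is made up) of the only condition
wait_to_unschedule() really needs, built from the counters quoted above:

	/*
	 * Illustration only: "the task did at least one context switch
	 * since we sampled the counters, or is not on a CPU right now".
	 * The counters are incremented before the switch, hence "> 1".
	 */
	static int switched_at_least_once(struct task_struct *task,
					  unsigned long nvcsw,
					  unsigned long nivcsw)
	{
		return !task_is_running(task) ||
		       (task->nvcsw - nvcsw) > 1 ||
		       (task->nivcsw - nivcsw) > 1;
	}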
> > int force_unschedule(struct task_struct *p)
> > {
> > 	struct rq *rq;
> > 	unsigned long flags;
> > 	int running;
> >
> > 	rq = task_rq_lock(p, &flags);
> > 	running = task_running(rq, p);
> > 	task_rq_unlock(rq, &flags);
> >
> > 	if (running)
> > 		wake_up_process(rq->migration_thread);
> >
> > 	return running;
> > }
> >
> > which should be used instead of task_is_running() ?
>
> Yes - wait_task_inactive() should be switched to a scheme like that
Yes, I thought about this; perhaps we can improve wait_task_inactive()
a bit. Unfortunately, this is not enough to kill the schedule_timeout(1)
polling.
> - it would fix bugs like:
>
> 53da1d9: fix ptrace slowness
I don't think so. Quite the contrary: the problem behind "fix ptrace slowness"
is that we do not want the TASK_TRACED task to be preempted before it
does its voluntary schedule() (i.e. without PREEMPT_ACTIVE).
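For context, a much-simplified illustration (not the real ptrace_stop(),
which also takes siglock, notifies the tracer, etc.) of the voluntary
switch the traced task is supposed to perform:

	set_current_state(TASK_TRACED);	/* mark ourselves stopped	*/
	schedule();			/* voluntary context switch,	*/
					/* no PREEMPT_ACTIVE involved	*/

If the task is preempted between these two steps, the preemption path
schedules with PREEMPT_ACTIVE set and the task stays on the runqueue
despite its TASK_TRACED state.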
> > void wait_to_unschedule(struct task_struct *task)
> > {
> > 	struct migration_req req;
> > 	struct rq *rq;
> > 	unsigned long flags;
> > 	int running;
> >
> > 	rq = task_rq_lock(task, &flags);
> > 	running = task_running(rq, task);
> > 	if (running) {
> > 		/* make sure __migrate_task() will do nothing */
> > 		req.dest_cpu = NR_CPUS + 1;
> > 		init_completion(&req.done);
> > 		list_add(&req.list, &rq->migration_queue);
> > 	}
> > 	task_rq_unlock(rq, &flags);
> >
> > 	if (running) {
> > 		wake_up_process(rq->migration_thread);
> > 		wait_for_completion(&req.done);
> > 	}
> > }
> >
> > This way we don't poll, and we need only one helper.
>
> Looks even better. The migration thread would run complete(), right?
Yes.
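For reference, the relevant part of the migration thread's main loop in
kernel/sched.c does roughly this (simplified fragment, locking and error
handling elided; rq and cpu are the thread's own runqueue and CPU):

	struct migration_req *req;

	/* Pop a queued request, attempt the migration, wake the waiter. */
	req = list_entry(rq->migration_queue.next, struct migration_req, list);
	list_del_init(&req->list);
	__migrate_task(req->task, cpu, req->dest_cpu);
	complete(&req->done);

With the bogus dest_cpu set above, __migrate_task() returns without doing
anything, so the only visible effect is the complete().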
> A detail: i suspect this needs to be in a while() loop, for the case
> that the victim task raced with us and went to another CPU before we
> kicked it off via the migration thread.
I don't think this matters. If the task is not running, we don't care
and do nothing. If it is running and migrates, it has to go through a
context switch at least once, which is all we wait for.
But the code above is not right wrt CPU hotplug: wake_up_process()
can hit a NULL rq->migration_thread if we race with CPU_DEAD.

Hmm, don't we have the same problem in, say, set_cpus_allowed_ptr()?
Unless it is called under get_online_cpus(), ->migration_thread
can go away once we drop rq->lock.

Perhaps we need something like this:
--- kernel/sched.c
+++ kernel/sched.c
@@ -6132,8 +6132,10 @@ int set_cpus_allowed_ptr(struct task_str
 	if (migrate_task(p, cpumask_any_and(cpu_online_mask, new_mask), &req)) {
 		/* Need help from migration thread: drop lock and wait. */
+		preempt_disable();
 		task_rq_unlock(rq, &flags);
 		wake_up_process(rq->migration_thread);
+		preempt_enable();
 		wait_for_completion(&req.done);
 		tlb_migrate_finish(p->mm);
 		return 0;
?
Oleg.