Message-ID: <20090401114140.GB23678@elte.hu>
Date: Wed, 1 Apr 2009 13:41:40 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Oleg Nesterov <oleg@...hat.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Markus Metzger <markus.t.metzger@...el.com>,
	linux-kernel@...r.kernel.org, tglx@...utronix.de, hpa@...or.com,
	markus.t.metzger@...il.com, roland@...hat.com,
	eranian@...glemail.com, juan.villacis@...el.com,
	ak@...ux.jf.intel.com
Subject: Re: [patch 3/21] x86, bts: wait until traced task has been
	scheduled out

* Oleg Nesterov <oleg@...hat.com> wrote:

> On 03/31, Markus Metzger wrote:
> >
> > +static void wait_to_unschedule(struct task_struct *task)
> > +{
> > +	unsigned long nvcsw;
> > +	unsigned long nivcsw;
> > +
> > +	if (!task)
> > +		return;
> > +
> > +	if (task == current)
> > +		return;
> > +
> > +	nvcsw = task->nvcsw;
> > +	nivcsw = task->nivcsw;
> > +	for (;;) {
> > +		if (!task_is_running(task))
> > +			break;
> > +		/*
> > +		 * The switch count is incremented before the actual
> > +		 * context switch. We thus wait for two switches to be
> > +		 * sure at least one completed.
> > +		 */
> > +		if ((task->nvcsw - nvcsw) > 1)
> > +			break;
> > +		if ((task->nivcsw - nivcsw) > 1)
> > +			break;
> > +
> > +		schedule();
>
> schedule() is a nop here. We can wait unpredictably long...
>
> Ingo, do you have any ideas to improve this helper?

hm, there's a similar looking existing facility:
wait_task_inactive(). Have i missed some subtle detail that makes it
inappropriate for use here?

> Not that I really like it, but how about
>
> 	int force_unschedule(struct task_struct *p)
> 	{
> 		struct rq *rq;
> 		unsigned long flags;
> 		int running;
>
> 		rq = task_rq_lock(p, &flags);
> 		running = task_running(rq, p);
> 		task_rq_unlock(rq, &flags);
>
> 		if (running)
> 			wake_up_process(rq->migration_thread);
>
> 		return running;
> 	}
>
> which should be used instead of task_is_running() ?

Yes - wait_task_inactive() should be switched to a scheme like that
- it would fix bugs like:

   53da1d9: fix ptrace slowness

in a cleaner way.

> We can even do something like
>
> 	void wait_to_unschedule(struct task_struct *task)
> 	{
> 		struct migration_req req;
> 		struct rq *rq;
> 		unsigned long flags;
> 		int running;
>
> 		rq = task_rq_lock(task, &flags);
> 		running = task_running(rq, task);
> 		if (running) {
> 			// make sure __migrate_task() will do nothing
> 			req.dest_cpu = NR_CPUS + 1;
> 			init_completion(&req.done);
> 			list_add(&req.list, &rq->migration_queue);
> 		}
> 		task_rq_unlock(rq, &flags);
>
> 		if (running) {
> 			wake_up_process(rq->migration_thread);
> 			wait_for_completion(&req.done);
> 		}
> 	}
>
> This way we don't poll, and we need only one helper.

Looks even better. The migration thread would run complete(), right?

A detail: i suspect this needs to be in a while() loop, for the case
that the victim task raced with us and went to another CPU before we
kicked it off via the migration thread.
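
The retry could be sketched roughly as below. This is a toy userspace
model, not kernel code: struct toy_task, toy_sample_cpu(),
toy_kick_cpu() and toy_wait_to_unschedule() are all made-up names for
illustration, and the race is simulated with a simple counter instead
of real scheduling. It only shows why a single kick is not enough when
the task hops to another CPU between the sample and the kick.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy stand-in for a task (hypothetical, not struct task_struct). */
struct toy_task {
	int cpu;		/* CPU the task is currently on */
	bool running;		/* true while the task is on-CPU */
	int pending_migrations;	/* simulated races: hops before a kick lands */
};

/* Models the locked sample: the task's CPU, or -1 if it is not running. */
static int toy_sample_cpu(const struct toy_task *t)
{
	return t->running ? t->cpu : -1;
}

/*
 * Models kicking the migration thread on @cpu and waiting for it.
 * If a simulated race is pending, the task has already moved to the
 * next CPU, so the kick misses it. Otherwise the kick deschedules the
 * task if it is still running on @cpu.
 */
static void toy_kick_cpu(struct toy_task *t, int cpu)
{
	if (t->pending_migrations > 0) {
		t->pending_migrations--;
		t->cpu = (t->cpu + 1) % NR_CPUS;
		return;
	}
	if (t->running && t->cpu == cpu)
		t->running = false;
}

/*
 * Re-sample and kick until the task is seen off-CPU.
 * Returns the number of kicks that were needed.
 */
static int toy_wait_to_unschedule(struct toy_task *t)
{
	int kicks = 0;
	int cpu;

	while ((cpu = toy_sample_cpu(t)) >= 0) {
		toy_kick_cpu(t, cpu);
		kicks++;
	}

	assert(!t->running);
	return kicks;
}
```

With a task that starts on CPU 1 and migrates twice before a kick
lands, the loop needs three kicks; a single sample-and-kick pass would
return while the task is still running on another CPU.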

This looks very useful to me. It could also be tested easily: revert
53da1d9 and you should see:

   time strace dd if=/dev/zero of=/dev/null bs=1024 count=1000000

performance plummet on an SMP box. Then with your fix it should go up
to near full speed again.

	Ingo