Message-ID: <20130821113551.GA1472@redhat.com>
Date: Wed, 21 Aug 2013 13:35:51 +0200
From: Oleg Nesterov <oleg@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Arjan van de Ven <arjan@...ux.intel.com>,
Fernando Luis Vázquez Cao
<fernando_b1@....ntt.co.jp>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 2/4] nohz: Synchronize sleep time stats with seqlock
On 08/21, Peter Zijlstra wrote:
>
> On Tue, Aug 20, 2013 at 08:25:53PM +0200, Oleg Nesterov wrote:
> > On 08/20, Peter Zijlstra wrote:
> > >
> > > On Tue, Aug 20, 2013 at 06:33:12PM +0200, Oleg Nesterov wrote:
>
> > > > + if (unlikely(prev->in_iowait)) {
> > > > + raw_spin_lock_irq(&rq->lock);
> > > > + rq->nr_iowait--;
> > > > + raw_spin_unlock_irq(&rq->lock);
> > > > + }
> > >
> > > This seems like the wrong place, this is where you return from
> > > schedule() running another task,
> >
> > Yes, but prev is current, and rq should be "correct" for
> > rq->nr_iowait-- ?
>
> Yes its the right rq, but the wrong time.
Hmm. Just in case: it is not that I think this patch really makes sense,
but I'd like to understand why you think it is wrong.
> > This local var should be equal to its value when this task called
> > context_switch() in the past.
> >
> > Like any other variable, like "rq = raw_rq()" in io_schedule().
> >
> > > not where the task you just send to
> > > sleep wakes up.
> >
> > sure, but currently io_schedule() does the same.
>
> No it doesn't. It only does the decrement when the task is woken back
> up. Not right after it switches out.
But it is not "after it switches out", it is after it switched back.
Let's ignore the locking:

	if (prev->in_iowait)
		rq->nr_iowait++;

	context_switch(prev, next);

	if (prev->in_iowait)
		rq->nr_iowait--;

From the task_struct's (current's) pov, prev/rq are the same before and
after context_switch().
But from the CPU's pov they differ. And, ignoring more details, on UP the
code above is equivalent to

	if (prev->in_iowait)
		rq->nr_iowait++;

	if (next->in_iowait)
		rq->nr_iowait--;

	context_switch(prev, next);

No?
Yes, need_resched()/preemption can trigger more inc/dec's than io_schedule()
does, but I don't think this was your concern.
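
To make the UP equivalence above concrete, here is a small userspace
model, not kernel code. Everything in it (struct task, struct rq, struct
step, switch_task_pov(), switch_cpu_pov(), replay()) is a hypothetical
name made up for illustration, and it assumes, as io_schedule() does,
that a task clears in_iowait only after it is running again. It replays
the same sequence of switches through both variants and checks that
nr_iowait agrees after every switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTASKS 3

/* Hypothetical userspace stand-ins; these are not the kernel structures. */
struct task { bool in_iowait; };
struct rq   { int nr_iowait; struct task *curr; };

/* One scheduling event: the running task optionally blocks on IO,
 * then the (single) CPU switches to task number 'next'. */
struct step { int next; bool block_on_io; };

/* Task's pov: inc before the switch, dec after switching back. */
static void switch_task_pov(struct rq *rq, struct task *next)
{
	struct task *prev = rq->curr;

	if (prev->in_iowait)
		rq->nr_iowait++;

	rq->curr = next;	/* stands in for context_switch(prev, next) */

	/* 'next' now resumes after its own, earlier context_switch();
	 * its local 'prev' is the task itself. */
	struct task *resumed_prev = rq->curr;

	if (resumed_prev->in_iowait)
		rq->nr_iowait--;
}

/* CPU's pov on UP: both updates happen before the switch. */
static void switch_cpu_pov(struct rq *rq, struct task *next)
{
	struct task *prev = rq->curr;

	if (prev->in_iowait)
		rq->nr_iowait++;
	if (next->in_iowait)
		rq->nr_iowait--;

	rq->curr = next;	/* stands in for context_switch(prev, next) */
}

/* Replays 'n' steps through both variants. Returns the final nr_iowait,
 * or -1 if the two counters ever disagree. */
int replay(const struct step *steps, size_t n)
{
	struct task tasks[NTASKS] = { { false } };
	struct rq a = { 0, &tasks[0] };
	struct rq b = { 0, &tasks[0] };

	for (size_t i = 0; i < n; i++) {
		assert(steps[i].next >= 0 && steps[i].next < NTASKS);
		struct task *next = &tasks[steps[i].next];

		/* The running task marks itself before it switches out. */
		a.curr->in_iowait = steps[i].block_on_io;

		switch_task_pov(&a, next);
		switch_cpu_pov(&b, next);
		if (a.nr_iowait != b.nr_iowait)
			return -1;

		/* Back on the CPU, the task clears its flag. */
		next->in_iowait = false;
	}
	return a.nr_iowait;
}
```

Calling replay() from a small main() with a handful of steps is enough
to try it. The model deliberately ignores locking, SMP and migration,
which is exactly the "ignoring more details" part.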
> > Btw. Whatever we do, can't we unify io_schedule/io_schedule_timeout?
>
> I suppose we could, a timeout of MAX_SCHEDULE_TIMEOUT will act like a
> regular schedule, but it gets all the overhead of doing
> schedule_timeout(). So I don't think its a win.
Well, the only overhead is "if(to == MAX_SCHEDULE_TIMEOUT)" at the start.
I don't think it makes sense to copy-and-paste the identical code to
avoid it. But please ignore this, it is really minor and off-topic.
Oleg.