[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAFTL4hzxN2nj8H+ycPaFLzvM0NK5TtfWtfVJVTSEtNzofTvqSw@mail.gmail.com>
Date: Tue, 1 Oct 2013 16:26:01 +0200
From: Frederic Weisbecker <fweisbec@...il.com>
To: Oleg Nesterov <oleg@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: Arjan van de Ven <arjan@...ux.intel.com>,
Fernando Luis Vázquez Cao
<fernando_b1@....ntt.co.jp>, Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 2/4] nohz: Synchronize sleep time stats with seqlock
2013/10/1 Frederic Weisbecker <fweisbec@...il.com>:
> On Wed, Aug 21, 2013 at 06:41:46PM +0200, Oleg Nesterov wrote:
>> On 08/21, Peter Zijlstra wrote:
>> >
>> > The other consideration is that this adds two branches to the normal
>> > schedule path. I really don't know what the regular ratio between
>> > schedule() and io_schedule() is -- and I suspect it can very much depend
>> > on workload -- but it might be a net loss due to that, even if it makes
>> > io_schedule() 'lots' cheaper.
>>
>> Yes, agreed. Please ignore it for now, I didn't try to actually suggest
>> this change. And even if this is fine perfomance wise, this needs some
>> benchmarking.
>>
>> Well. actually I have a vague feeling that _perhaps_ this change can
>> help to solve other problems we are discussing, but I am not sure and
>> right now I can't even explain the idea to me.
>>
>> In short: please ignore ;)
>>
>> Oleg.
>>
>
> Ok, the discussion is hard to synthesize but I think I now have more
> clues to send a better new iteration.
>
> So we have the choice between keeping atomics or using rq->lock. It seems
> that using rq->lock is going to be worrisome for several reasons. So let's
> stick to atomics (but tell me if you prefer the other way).
>
> So the idea for the next try is to do something along the lines of:
>
> struct cpu_idletime {
> nr_iowait,
> seqlock,
> idle_start,
> idle_time,
> iowait_time,
> } __cacheline_aligned_in_smp;
>
> DEFINE_PER_CPU(struct cpu_idletime, cpu_idletime);
>
> io_schedule()
> {
> int prev_cpu;
>
> preempt_disable();
> prev_cpu_idletime = __this_cpu_ptr(&cpu_idletime);
> atomic_inc(prev_cpu_idletime->nr_iowait);
> WARN_ON_ONCE(is_idle_task(current));
> preempt_enable_no_resched();
>
> schedule();
>
> write_seqlock(prev_cpu_idletime->seqlock)
> if (!atomic_dec_return(prev_cpu_idletime->nr_iowait))
> flush_cpu_idle_time(prev_cpu_idletime, 1)
I forgot...
cpu_idletime->idle_start;
after the update.
Also now I wonder if we actually should lock the inc part. Otherwise
it may be hard to get the readers right...
Thanks.
> write_sequnlock(prev_cpu_idletime->seqlock)
>
> }
>
> flush_cpu_idle_time(cpu_idletime, iowait)
> {
> if (!cpu_idletime->idle_start)
> return;
>
> if (nr_iowait)
> cpu_idletime->iowait_time = NOW() - cpu_idletime->idle_start;
> else
> cpu_idletime->idle_time = NOW() - cpu_idletime->idle_start;
> }
>
> idle_entry()
> {
> write_seqlock(this_cpu_idletime->seqlock)
> this_cpu_idletime->idle_start = NOW();
> write_sequnlock(iowait(cpu)->seqlock)
> }
>
> idle_exit()
> {
> write_seqlock(this_cpu_idletime->seqlock)
> flush_cpu_idle_time(this_cpu_idletime, atomic_read(&this_cpu_idletime->nr_iowait));
> this_cpu_idletime->idle_start = 0;
> write_sequnlock(this_cpu_idletime->seqlock)
> }
>
>
> Now this all realy on the fact that atomic_inc(cpu_idletime->nr_iowait) can't happen
> in a CPU that is already idle. So it can't happen between idle_entry() and idle_exit().
> Hence the WARN_ON(is_idle_task(current)) below after the inc.
>
> Hmm?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists