[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <AANLkTinPtwSUjMtBRkb96DU4lym8ZNrYwPe_q6rstT4D@mail.gmail.com>
Date: Tue, 25 May 2010 14:40:42 -0700
From: Venkatesh Pallipadi <venki@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Paul Menage <menage@...gle.com>, linux-kernel@...r.kernel.org,
Paul Turner <pjt@...gle.com>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
Paul Mackerras <paulus@...ba.org>
Subject: Re: [RFC PATCH 2/4] x86: Add IRQ_TIME_ACCOUNTING, finer accounting of
irq time to task
On Mon, May 24, 2010 at 11:35 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> On Mon, 2010-05-24 at 17:11 -0700, Venkatesh Pallipadi wrote:
>> +void account_system_vtime(struct task_struct *tsk)
>> +{
>> + unsigned long flags;
>> + int cpu;
>> + u64 now;
>> +
>> + local_irq_save(flags);
>> + cpu = task_cpu(tsk);
>> + now = sched_clock_cpu(cpu);
>> + if (hardirq_count())
>> + tsk->hi_time += now - per_cpu(irq_start_time, cpu);
>> + else if (softirq_count())
>> + tsk->si_time += now - per_cpu(irq_start_time, cpu);
>> +
>> + per_cpu(irq_start_time, cpu) = now;
>> + local_irq_restore(flags);
>> +}
>
> Right, so this gets called from irq_enter/exit() and __do_softirq().
>
> The reason I never pressed onwards with this (I had patches to add IRQ
> time accounting) is that it sucks terribly for anything falling back to
> jiffies -- maybe find some smart way to disable the whole call when
> there's no TSC available, preferably without adding conditionals, using
> alternatives maybe?
>
> I guess you mostly side-stepped that by adding a IRQ_TIME_ACCOUNTING
> config (but forgot to make it depend on X86_TSC).
Yes. The TSC dependency is unfortunately not CONFIG time. Even with X86_TSC,
there is support for tsc_disabled. So, this should be a run time
conditional or an alternative.
> Another thing I dislike about this account_system_vtime() is that its
> the same call for both IRQ and SoftIRQ, leaving us to add conditionals
> inside the call to figure out what context we got called from.
>
> [ Adedd the s390 and ppc guys who already use this stuff ]
We can probably add a parameter on from where it is being called. Even
with that we still need some conditionals to handle hardirq overlapping a
softirq. That part kind of happens naturally with the current code.
> Anyway, once we have this, please also add it to sched_rt_avg_update()
> (which should really be called sched_!fair_avg_update()).
You mean take out the hi and si time from rt_delta?
> Also, did you measure the overhead of doing this? sched_clock_cpu() adds
> a cmpxchg64 on all systems that don't have a rock solid TSC (ie. most of
> todays machines).
>
> Another thing that would be real nice is if you could find a way to not
> make all of this x86 specific.
Yes. Thats one of the reason I made this based of sched_clock. May be
we can have
a has_fast_sched_clock feature that archs and opt in to that enables
this with an
alternative?
overhead: My initial testing was on reliable TSC system, where sched_clock_cpu()
takes <50 cycles. So, no noticable overhead there. I still have to run
this on other
platforms, will post the data once I do that.
Thanks,
Venki
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists