Message-ID: <20141219181613.GA86430@devbig257.prn2.facebook.com>
Date: Fri, 19 Dec 2014 10:16:13 -0800
From: Shaohua Li <shli@...com>
To: Andy Lutomirski <luto@...capital.net>
CC: Chris Mason <clm@...com>, Peter Zijlstra <peterz@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
X86 ML <x86@...nel.org>, <Kernel-team@...com>,
"H. Peter Anvin" <hpa@...or.com>, Ingo Molnar <mingo@...hat.com>,
John Stultz <john.stultz@...aro.org>
Subject: Re: [PATCH v2 3/3] X86: Add a thread cpu time implementation to vDSO
On Fri, Dec 19, 2014 at 09:53:24AM -0800, Andy Lutomirski wrote:
> On Fri, Dec 19, 2014 at 9:42 AM, Chris Mason <clm@...com> wrote:
> >
> >
> > On Fri, Dec 19, 2014 at 11:48 AM, Andy Lutomirski <luto@...capital.net>
> > wrote:
> >>
> >> On Fri, Dec 19, 2014 at 3:23 AM, Peter Zijlstra <peterz@...radead.org>
> >> wrote:
> >>>
> >>> On Thu, Dec 18, 2014 at 04:22:59PM -0800, Andy Lutomirski wrote:
> >>>>
> >>>> Bad news: this patch is incorrect, I think. Take a look at
> >>>> update_rq_clock -- it does fancy things involving irq time and
> >>>> paravirt steal time. So this patch could result in extremely
> >>>> non-monotonic results.
> >>>
> >>>
> >>> Yeah, I'm not sure how (and if) we could make all that work :/
> >>
> >>
> >> I obviously can't comment on what Facebook needs, but if I were
> >> rigging something up to profile my own code*, I'd want a count of
> >> elapsed time, including user, system, and probably interrupt as well.
> >> I would probably not want to count time during which I'm not
> >> scheduled, and I would also probably not want to count steal time.
> >> The latter makes any implementation kind of nasty.
> >>
> >> The API presumably doesn't need to be any particular clock id for
> >> clock_gettime, and it may not even need to be clock_gettime at all.
> >>
> >> Is perf self-monitoring good enough for this? If not, can we make it
> >> good enough?
> >>
> >> * I do this today using CLOCK_MONOTONIC
> >
> >
> > The clock_gettime calls are used for a wide variety of things, but usually
> > they are trying to instrument how much CPU the application is using. So for
> > example with the HHVM interpreter they have a ratio of the number of hhvm
> > instructions they were able to execute in N seconds of cputime. This gets
> > used to optimize the HHVM implementation and can be used as a push blocking
> > counter (code can't go in if it makes it slower).
> >
> > Wall time isn't a great representation of this because it includes factors
> > that might be outside a given HHVM patch, but it sounds like we're saying
> > almost the same thing.
> >
> > I'm not familiar with the perf self monitoring?
>
> You can call perf_event_open and mmap the result. Then you can read
> the docs^Wheader file.
>
> On the good side, it's an explicit mmap, so all the nasty preemption
> issues are entirely moot. And you can count cache misses and such if
> you want to be fancy.
>
> On the bad side, the docs are a bit weak, and the added context switch
> overhead might be higher.
I'll measure the overhead for sure. If the overhead isn't high, the perf
approach is very interesting. On the other hand, is it acceptable for
clock_gettime to fall back to the slow path if irq time is enabled (its
overhead is high, so we don't actually enable it)?
Thanks,
Shaohua
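
[Editorial illustration, not part of the original message.] The metric
Chris describes in the quoted text, work completed per second of CPU time
rather than per second of wall time, can be sketched roughly as follows.
This is a minimal sketch, assuming a Unix system where Python exposes
CLOCK_THREAD_CPUTIME_ID. The names busy_work and ops_per_cpu_second are
hypothetical and do not come from HHVM or from the patch under discussion.
Python's time.clock_gettime goes through the ordinary clock_gettime path,
so this shows what is being measured, not the vDSO fast path the patch
proposes.

```python
import time


def busy_work(n):
    """Hypothetical stand-in workload: n simple arithmetic operations."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def ops_per_cpu_second(n_ops):
    """Run busy_work(n_ops) and return (ops per CPU-second, cpu_s, wall_s).

    CPU time is read from CLOCK_THREAD_CPUTIME_ID, which only advances
    while the calling thread is running. Wall time is read from
    CLOCK_MONOTONIC, which also advances while the thread is off-CPU.
    """
    cpu0 = time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID)
    wall0 = time.clock_gettime(time.CLOCK_MONOTONIC)

    busy_work(n_ops)
    # Time spent sleeping is seen by the wall clock but not by the
    # thread CPU clock; this models factors outside the code under test.
    time.sleep(0.05)

    cpu1 = time.clock_gettime(time.CLOCK_THREAD_CPUTIME_ID)
    wall1 = time.clock_gettime(time.CLOCK_MONOTONIC)

    cpu_s = cpu1 - cpu0
    wall_s = wall1 - wall0
    # Guard against a coarse clock reporting no elapsed CPU time.
    rate = n_ops / cpu_s if cpu_s > 0 else float("inf")
    return rate, cpu_s, wall_s


if __name__ == "__main__":
    rate, cpu_s, wall_s = ops_per_cpu_second(1_000_000)
    print(f"cpu={cpu_s:.4f}s wall={wall_s:.4f}s ops/cpu-s={rate:.0f}")
```

The wall-clock figure includes the sleep and the CPU-time figure does not,
which is why Chris prefers CPU time for judging whether a change made the
code slower.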