Message-ID: <20190904042310.GA159235@google.com>
Date: Wed, 4 Sep 2019 00:23:10 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Valentin Schneider <valentin.schneider@....com>
Cc: Radim Krčmář <rkrcmar@...hat.com>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Steven Rostedt <rostedt@...dmis.org>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Jirka Hladký <jhladky@...hat.com>,
Jiří Vozár <jvozar@...hat.com>,
x86@...nel.org, Qais Yousef <qais.yousef@....com>
Subject: Re: [PATCH 2/2] sched/debug: add sched_update_nr_running tracepoint

On Tue, Sep 03, 2019 at 05:05:47PM +0100, Valentin Schneider wrote:
> On 03/09/2019 16:43, Radim Krčmář wrote:
> > The paper "The Linux Scheduler: a Decade of Wasted Cores" used several
> > custom data gathering points to better understand what was going on in
> > the scheduler.
> > Red Hat adapted one of them for the tracepoint framework and created a
> > tool to plot a heatmap of nr_running, where the sched_update_nr_running
> > tracepoint is being used for fine-grained monitoring of scheduling
> > imbalance.
> > The tool is available from https://github.com/jirvoz/plot-nr-running.
> >
> > The best place for the tracepoints is inside the add/sub_nr_running,
> > which requires some shenanigans to make it work as they are defined
> > inside sched.h.
> > The tracepoints have to be included from sched.h, which means that
> > CREATE_TRACE_POINTS has to be defined for the whole header and this
> > might cause problems if tree-wide headers expose tracepoints in sched.h
> > dependencies, but I'd argue it's the other side's misuse of tracepoints.
> >
> > Moving the import sched.h line lower would require fixes in s390 and ppc
> > headers, because they don't include dependencies properly and expect
> > sched.h to do it, so it is simpler to keep sched.h there and
> > preventively undefine CREATE_TRACE_POINTS right after.
> >
> > Exports of the pelt tracepoints remain because they don't need to be
> > protected by CREATE_TRACE_POINTS and moving them closer would be
> > unsightly.
> >
>
> Pure trace events are frowned upon in scheduler world, try going with
> trace points. Qais did something very similar recently:
>
> https://lore.kernel.org/lkml/20190604111459.2862-1-qais.yousef@arm.com/
>
> You'll have to implement the associated trace events in a module, which
> lets you define your own event format and doesn't form an ABI :).

Is that really true? eBPF programs loaded from userspace can attach to
tracepoints through BPF_RAW_TRACEPOINT_OPEN, which is UAPI:

https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h#L103

I don't have a strong opinion on whether tracepoints should be considered
ABI / API, but I just want to get the facts straight :)
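
To make the userspace path concrete, here is a minimal sketch of how a
program could attach to a raw tracepoint with the bpf(2) syscall. It assumes
prog_fd refers to a program already loaded with BPF_PROG_LOAD as type
BPF_PROG_TYPE_RAW_TRACEPOINT, and that the tracepoint name passed in (for
example the sched_update_nr_running name proposed in this patch) exists in
the running kernel. The helper names fill_raw_tp_attr and
raw_tracepoint_open are made up for illustration.

```c
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fill the bpf(2) attribute block used by BPF_RAW_TRACEPOINT_OPEN. */
void fill_raw_tp_attr(union bpf_attr *attr, const char *tp_name, int prog_fd)
{
	memset(attr, 0, sizeof(*attr));	/* unused fields must be zero */
	attr->raw_tracepoint.name = (uint64_t)(uintptr_t)tp_name;
	attr->raw_tracepoint.prog_fd = (uint32_t)prog_fd;
}

/*
 * Attach the loaded program to the named raw tracepoint.
 * Returns a new file descriptor on success, -1 with errno set on failure.
 */
int raw_tracepoint_open(const char *tp_name, int prog_fd)
{
	union bpf_attr attr;

	fill_raw_tp_attr(&attr, tp_name, prog_fd);
	return (int)syscall(__NR_bpf, BPF_RAW_TRACEPOINT_OPEN,
			    &attr, sizeof(attr));
}
```

The only thing userspace passes in is the tracepoint name as a string, so
the attach path is the same whether or not a trace event is defined on top
of the tracepoint.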

thanks,

- Joel