Message-ID: <CAM9d7cgqr0Op5UuTcV2q8-Ju3yB7cYPvG7=Nrb4K=oW4Lt4Lcg@mail.gmail.com>
Date: Wed, 22 May 2024 21:34:21 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Howard Chu <howardchu95@...il.com>, Arnaldo Carvalho de Melo <acme@...nel.org>, peterz@...radead.org,
mingo@...hat.com, mark.rutland@....com, alexander.shishkin@...ux.intel.com,
jolsa@...nel.org, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
zegao2021@...il.com, leo.yan@...ux.dev, ravi.bangoria@....com,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
bpf@...r.kernel.org
Subject: Re: [PATCH v2 0/4] Dump off-cpu samples directly
Hello,
On Wed, May 15, 2024 at 9:56 PM Ian Rogers <irogers@...gle.com> wrote:
>
> On Wed, May 15, 2024 at 9:24 PM Howard Chu <howardchu95@...il.com> wrote:
> >
> > Hello,
> >
> > Here is a little update on --off-cpu.
> >
> > > > It would be nice to start landing this work so I'm wondering what the
> > > > minimal way to do that is. It seems putting behavior behind a flag is
> > > > a first step.
> >
> > The flag to set the output threshold for off-cpu samples has been
> > implemented. If the accumulated off-cpu time exceeds this threshold,
> > the sample is dumped directly; otherwise it is saved and written out
> > later by off_cpu_write().
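
Just to check that I'm reading the intended behavior correctly, below is a
rough sketch of what I imagine the sched_switch BPF handler doing. The map
layout, the names (off_cpu_threshold_ns, off_cpu_map, offcpu_output) and the
single-u64 payload are all made up for illustration; presumably the real code
tracks more state (tgid, stack id, ...) and emits full sample data.

/*
 * Hypothetical sketch only: map layout, names and the single-u64
 * payload are made up; the real patch would emit full sample data.
 */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* filled in from user space (the --off-cpu-threshold value) */
const volatile __u64 off_cpu_threshold_ns = 500000;

struct off_cpu_val {
	__u64 off_ts;   /* timestamp when the task was switched out */
	__u64 total_ns; /* accumulated below-threshold off-cpu time */
};

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 10240);
	__type(key, __u32);               /* tid */
	__type(value, struct off_cpu_val);
} off_cpu_map SEC(".maps");

struct {
	__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} offcpu_output SEC(".maps");

SEC("tp_btf/sched_switch")
int BPF_PROG(on_switch, bool preempt, struct task_struct *prev,
	     struct task_struct *next)
{
	__u64 now = bpf_ktime_get_ns();
	__u32 prev_tid = prev->pid, next_tid = next->pid;
	struct off_cpu_val zero = {}, *val;

	/* remember when @prev left the CPU */
	bpf_map_update_elem(&off_cpu_map, &prev_tid, &zero, BPF_NOEXIST);
	val = bpf_map_lookup_elem(&off_cpu_map, &prev_tid);
	if (val)
		val->off_ts = now;

	/* @next goes back on-CPU: account its off-cpu time */
	val = bpf_map_lookup_elem(&off_cpu_map, &next_tid);
	if (val && val->off_ts) {
		__u64 delta = now - val->off_ts;

		if (delta >= off_cpu_threshold_ns)
			/* above the threshold: dump a sample right away */
			bpf_perf_event_output(ctx, &offcpu_output,
					      BPF_F_CURRENT_CPU,
					      &delta, sizeof(delta));
		else
			/* below: accumulate, off_cpu_write() flushes it later */
			val->total_ns += delta;

		val->off_ts = 0;
	}
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
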
> >
> > But adding an extra pass to handle off-cpu samples introduces
> > performance issues. Here's the processing rate of --off-cpu sampling
> > with the extra pass (to extract the raw sample data) and without it.
> > The --off-cpu-threshold value is in nanoseconds.
> >
> > +------------------------------------------------------+---------------------------------------+----------------------+
> > | comm                                                 | type                                  | process rate         |
> > +------------------------------------------------------+---------------------------------------+----------------------+
> > | -F 4999 -a                                           | regular samples (w/o extra pass)      | 13128.675 samples/ms |
> > +------------------------------------------------------+---------------------------------------+----------------------+
> > | -F 1 -a --off-cpu --off-cpu-threshold 100            | offcpu samples (extra pass)           | 2843.247 samples/ms  |
> > +------------------------------------------------------+---------------------------------------+----------------------+
> > | -F 4999 -a --off-cpu --off-cpu-threshold 100         | offcpu & regular samples (extra pass) | 3910.686 samples/ms  |
> > +------------------------------------------------------+---------------------------------------+----------------------+
> > | -F 4999 -a --off-cpu --off-cpu-threshold 1000000000  | few offcpu & regular (extra pass)     | 4661.229 samples/ms  |
> > +------------------------------------------------------+---------------------------------------+----------------------+
What does the process rate mean? Are the samples for the off-cpu
event or for other events (cpu-cycles)? And is it from a single CPU,
system-wide, or per-task?
> >
> > It's not ideal. I will find a way to reduce the overhead, for
> > example by processing the samples at save time, as Ian mentioned.
> >
> > > > To turn the bpf-output samples into off-cpu events, a pass is
> > > > added at save time. I wonder if that can be made more generic,
> > > > like a save-time perf inject.
> >
> > And I will find a default value for such a threshold based on performance
> > and common use cases.
> >
> > > Sounds good. We might add an option to specify the threshold to
> > > determine whether to dump the data or to save it for later. But ideally
> > > it should be able to find a good default.
> >
> > These will be done before the GSoC kick-off on May 27.
>
> This all sounds good. 100ns seems like quite a low threshold and 1s
> extremely high; it's a shame that even such a high threshold makes
> only a marginal difference to the context-switch performance. I
> wonder if 100 microseconds may be a more sensible threshold. It's 100
> times larger than the cost of one context switch but considerably
> less than a frame redraw at 60FPS (16 milliseconds).
I don't know what a sensible default would be, but 1 msec could be
another candidate for a similar reason. :)
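
(For reference, since --off-cpu-threshold takes nanoseconds, those two
candidates would look something like the lines below; hypothetical
invocations, just to illustrate the unit.)

  perf record -F 4999 -a --off-cpu --off-cpu-threshold 100000    # 100 usec
  perf record -F 4999 -a --off-cpu --off-cpu-threshold 1000000   # 1 msec
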
Thanks,
Namhyung