Message-ID: <CAADnVQ+m8F0BsJr_T1ePpB_zQ2vS+3OD2h+Wrfv1x+an9fSLkw@mail.gmail.com>
Date: Tue, 22 Apr 2025 10:28:42 -0700
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Jianlin Lv <iecedge@...il.com>
Cc: bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eduard <eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>, John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Benjamin Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>, LKML <linux-kernel@...r.kernel.org>, jianlv@...y.com
Subject: Re: [RFC PATCH bpf-next 1/2] Enhance BPF execution timing by excluding IRQ time

On Tue, Apr 22, 2025 at 6:47 AM Jianlin Lv <iecedge@...il.com> wrote:
>
> From: Jianlin Lv <iecedge@...il.com>
>
> Exclude IRQ time from the measured execution duration of BPF
> programs. When CONFIG_IRQ_TIME_ACCOUNTING is enabled, IRQ time is
> accounted separately, which gives a more accurate assessment of the
> CPU time a BPF program actually consumes.
>
> Signed-off-by: Jianlin Lv <iecedge@...il.com>
> ---
> include/linux/filter.h | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index f5cf4d35d83e..3e0f975176a6 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -703,12 +703,32 @@ static __always_inline u32 __bpf_prog_run(const struct bpf_prog *prog,
> cant_migrate();
> if (static_branch_unlikely(&bpf_stats_enabled_key)) {
> struct bpf_prog_stats *stats;
> - u64 duration, start = sched_clock();
> + u64 duration, start, start_time, end_time, irq_delta;
> unsigned long flags;
> + unsigned int cpu;
>
> - ret = dfunc(ctx, prog->insnsi, prog->bpf_func);
> + #ifdef CONFIG_IRQ_TIME_ACCOUNTING
> + if (in_task()) {
> + cpu = get_cpu();
> + put_cpu();
> + start_time = irq_time_read(cpu);
> + }
> + #endif
>
> + start = sched_clock();
> + ret = dfunc(ctx, prog->insnsi, prog->bpf_func);
> duration = sched_clock() - start;
> +
> + #ifdef CONFIG_IRQ_TIME_ACCOUNTING
> + if (in_task()) {
> + end_time = irq_time_read(cpu);
> + if (end_time > start_time) {
> + irq_delta = end_time - start_time;
> + duration -= irq_delta;
> + }
> + }
> + #endif
> +
This is way too much overhead.
This timing loop is optimized to measure bpf prog runtime;
see commit ce09cbdd9888 ("bpf: Improve program stats run-time calculation").
IRQs can happen and distort the numbers, but you shouldn't
be running with bpf_stats_enabled for a long time anyway.
You need to sample it instead:
every couple of minutes turn it on for a second, capture the stats,
and aggregate over time. Filter out outliers due to IRQs or whatever.
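
Concretely, the sampling could look something like this from userspace.
This is just a sketch: it assumes libbpf's bpf_enable_stats() (available
since kernel 5.8) and a prog_fd obtained elsewhere, and it leaves out the
outer every-couple-of-minutes loop and the outlier filtering:

/* Sketch: enable run-time stats for a ~1s window, then report the
 * average ns per run from the run_time_ns/run_cnt deltas.
 * Assumes a valid prog_fd; error handling is minimal on purpose.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <bpf/bpf.h>

static void sample_once(int prog_fd)
{
	struct bpf_prog_info info = {};
	__u32 len = sizeof(info);
	__u64 t0, c0;
	int stats_fd;

	/* Stats are collected only while this fd stays open. */
	stats_fd = bpf_enable_stats(BPF_STATS_RUN_TIME);
	if (stats_fd < 0)
		return;

	if (!bpf_obj_get_info_by_fd(prog_fd, &info, &len)) {
		t0 = info.run_time_ns;
		c0 = info.run_cnt;

		sleep(1);	/* the one-second sampling window */

		memset(&info, 0, sizeof(info));
		len = sizeof(info);
		if (!bpf_obj_get_info_by_fd(prog_fd, &info, &len) &&
		    info.run_cnt > c0)
			printf("avg %llu ns/run over window\n",
			       (unsigned long long)
			       ((info.run_time_ns - t0) /
				(info.run_cnt - c0)));
	}

	close(stats_fd);	/* turns stats back off */
}

The same knob is also exposed as the kernel.bpf_stats_enabled sysctl,
and bpftool prog show prints run_time_ns/run_cnt while stats are on,
so a shell loop works just as well.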