linux-kernel - Re: [PATCH v4 1/2] perf trace: Implement syscall summary in BPF

Message-ID: <aAkmY0hLXarmCSIA@google.com>
Date: Wed, 23 Apr 2025 10:41:55 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Howard Chu <howardchu95@...il.com>, Ian Rogers <irogers@...gle.com>,
	Kan Liang <kan.liang@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org, Song Liu <song@...nel.org>,
	bpf@...r.kernel.org
Subject: Re: [PATCH v4 1/2] perf trace: Implement syscall summary in BPF

Hi Arnaldo,

On Wed, Apr 23, 2025 at 01:26:48PM -0300, Arnaldo Carvalho de Melo wrote:
> On Fri, Mar 28, 2025 at 06:46:36PM -0700, Howard Chu wrote:
> > Hello Namhyung,
> > 
> > On Tue, Mar 25, 2025 at 9:40 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > When -s/--summary option is used, it doesn't need (augmented) arguments
> > > of syscalls.  Let's skip the augmentation and load another small BPF
> > > program to collect the statistics in the kernel instead of copying the
> > > data to the ring-buffer to calculate the stats in userspace.  This will
> > > be much more light-weight than the existing approach and remove any lost
> > > events.
> > >
> > > Let's add a new option --bpf-summary to control this behavior.  I cannot
> > > make it default because there's no way to get e_machine in the BPF which
> > > is needed for detecting different ABIs like 32-bit compat mode.
> > >
> > > No functional changes intended except for no more LOST events. :)
> > >
> > >   $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1
> > >
> > >    Summary of events:
> > >
> > >    total, 6194 events
> > >
> > >      syscall            calls  errors  total       min       avg       max       stddev
> > >                                        (msec)    (msec)    (msec)    (msec)        (%)
> > >      --------------- --------  ------ -------- --------- --------- ---------     ------
> > >      epoll_wait           561      0  4530.843     0.000     8.076   520.941     18.75%
> > >      futex                693     45  4317.231     0.000     6.230   500.077     21.98%
> > >      poll                 300      0  1040.109     0.000     3.467   120.928     17.02%
> > >      clock_nanosleep        1      0  1000.172  1000.172  1000.172  1000.172      0.00%
> > >      ppoll                360      0   872.386     0.001     2.423   253.275     41.91%
> > >      epoll_pwait           14      0   384.349     0.001    27.453   380.002     98.79%
> > >      pselect6              14      0   108.130     7.198     7.724     8.206      0.85%
> > >      nanosleep             39      0    43.378     0.069     1.112    10.084     44.23%
> > >      ...
> 
> I added the following to align sched_[gs]etaffinity,

Thanks for processing the patch and updating this.  But I'm afraid there
are syscalls with even longer names, and this is not the only place that
prints syscall names.  Also, I think we need to update the width of the
time fields.  So I'd prefer to handle them in a separate patch later.
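
As a rough sketch of one direction such a follow-up could take (not part
of this patch): instead of hard-coding the column width, compute it from
the longest name to be printed and pass it to fprintf() with "%-*s".
The helper names name_column_width() and print_name_cell() are
hypothetical and do not exist in perf; the three leading spaces mirror
the format string in print_common_stats().

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: return the width needed for the name column,
 * i.e. the length of the longest name, but never less than min_width.
 */
static int name_column_width(const char *const *names, size_t nr, int min_width)
{
	int width = min_width;

	for (size_t i = 0; i < nr; i++) {
		int len = (int)strlen(names[i]);

		if (len > width)
			width = len;
	}
	return width;
}

/*
 * Hypothetical helper: print one left-aligned name cell using a width
 * chosen at run time.  Returns the number of characters written, like
 * the "printed +=" accounting in the existing code.
 */
static int print_name_cell(FILE *fp, const char *name, int width)
{
	return fprintf(fp, "   %-*s", width, name);
}
```

With a minimum of 15 this keeps the current layout for short names and
widens the column only when a longer name such as sched_setaffinity
shows up.  The header and separator lines would need the same width, and
the other places that print syscall names would need the same treatment.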

Thanks,
Namhyung
 
> 
> diff --git a/tools/perf/util/bpf-trace-summary.c b/tools/perf/util/bpf-trace-summary.c
> index 114d8d9ed9b2d3f3..af37d3bb5f9c42e7 100644
> --- a/tools/perf/util/bpf-trace-summary.c
> +++ b/tools/perf/util/bpf-trace-summary.c
> @@ -139,9 +139,9 @@ static int print_common_stats(struct syscall_data *data, FILE *fp)
>  		/* TODO: support other ABIs */
>  		name = syscalltbl__name(EM_HOST, node->syscall_nr);
>  		if (name)
> -			printed += fprintf(fp, "   %-15s", name);
> +			printed += fprintf(fp, "   %-17s", name);
>  		else
> -			printed += fprintf(fp, "   syscall:%-7d", node->syscall_nr);
> +			printed += fprintf(fp, "   syscall:%-9d", node->syscall_nr);
>  
>  		printed += fprintf(fp, " %8u %6u %9.3f %9.3f %9.3f %9.3f %9.2f%%\n",
>  				   stat->count, stat->error, total, min, avg, max,