[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fUi4RYebxCGYZVHVEt0BpWVmUA6+-vDQfbai25_KJRs7A@mail.gmail.com>
Date: Tue, 11 Feb 2025 09:24:51 -0800
From: Ian Rogers <irogers@...gle.com>
To: Arnd Bergmann <arnd@...db.de>, Arnaldo Carvalho de Melo <acme@...nel.org>, Howard Chu <howardchu95@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Namhyung Kim <namhyung@...nel.org>, Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
John Garry <john.g.garry@...cle.com>, Will Deacon <will@...nel.org>,
James Clark <james.clark@...aro.org>, Mike Leach <mike.leach@...aro.org>,
Leo Yan <leo.yan@...ux.dev>, guoren <guoren@...nel.org>,
Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>, Charlie Jenkins <charlie@...osinc.com>,
Bibo Mao <maobibo@...ngson.cn>, Huacai Chen <chenhuacai@...nel.org>,
Catalin Marinas <catalin.marinas@....com>, Jiri Slaby <jirislaby@...nel.org>,
Björn Töpel <bjorn@...osinc.com>,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
"linux-csky@...r.kernel.org" <linux-csky@...r.kernel.org>, linux-riscv@...ts.infradead.org
Subject: Re: [PATCH v2 5/7] perf trace beauty: Add syscalltbl.sh generating
all system call tables
On Tue, Feb 11, 2025 at 12:09 AM Arnd Bergmann <arnd@...db.de> wrote:
>
> On Mon, Feb 10, 2025, at 17:51, Ian Rogers wrote:
>
> > +# Each line of the syscall table should have the following format:
> > +#
> > +# NR ABI NAME [NATIVE] [COMPAT]
> > +#
> > +# NR syscall number
> > +# ABI ABI name
> > +# NAME syscall name
> > +# NATIVE native entry point (optional)
> > +# COMPAT compat entry point (optional)
>
> On x86, there is now a sixth optional field.
Thanks, I'll add and repost a v3. I had some other questions below so
I'll try to do everything together to avoid noise.
> > +#if defined(ALL_SYSCALLTBL) || defined(__arm__) || defined(__aarch64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/arm/entry/syscalls/syscall.tbl"
> > "$outfile" common,32,oabi EM_ARM
> > +build_tables
>
> The oabi syscalls probably shouldn't be part of the default set here.
> Technically these are two separate ABIs, though EABI is a subset of
> OABI for the most most part. Some of the calling conventions are
> also different.
Ack. I was carrying forward:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm/entry/syscalls/Kbuild#n3
but noticed that we weren't adding this for arm64:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/arch/arm64/entry/syscalls/Kbuild
I'm happy to drop and have the ARM64 behavior. I'll make it a separate
patch in case there is a desire from someone to revert.
> > "$tools_dir/perf/arch/arm64/entry/syscalls/syscall_64.tbl" "$outfile"
> > common,64,renameat,rlimit,memfd_secret EM_AARCH64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__arm__) ||
> > defined(__aarch64__)
>
> Hardcoding the set of ABIs in the middle of the script seems
> too fragile to me, I'm worried that these get out of sync quickly.
I agree, again this is carrying forward a behavior and at least after
these changes the location is just one place. Do you have any
suggestions on how to do better?
Fwiw, I wonder a related problem/question that has come up primarily
with Arnaldo and Howard is in having a way to determine system call
argument types so that perf trace can pretty print them. For example,
if via BTF it is found an argument is a "const char*" then it is
assumed to be a string, but a "char *" is not as it may just be an out
argument. There's a source for more information in the syzkaller
project:
https://github.com/google/syzkaller/blob/master/sys/linux/sys.txt
Perhaps there's a way to generate this information from the Linux
build and feed it into perf's build. It is out-of-scope for what I'm
trying to do here, but I thought it worth a mention given my general
ignorance on wider things.
> > +#if defined(ALL_SYSCALLTBL) || defined(__mips__)
> > +EOF
> > +build_tables
> > "$tools_dir/perf/arch/mips/entry/syscalls/syscall_n64.tbl" "$outfile"
> > common,64,n64 EM_MIPS
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__mips__)
>
> What about n32/o32? The syscall tables are completely different here.
So perf hasn't historically supported them and no one is asking for
support. Generating more tables isn't the problem, but we need to have
some way of determining which table to use for n32/o32. I see
EF_MIPS_ABI_O32 and EF_MIPS_ABI_O64, so we could add support by
extending the lookup of the table to be both of e_machine and e_flags.
I'm less clear on choosing n32. That said, back in the 90s I was
working to port MIPS code to Itanium via binary translation. Given now
Itanium is obsolete, I'm not sure it is worth adding complexity for
the sake of MIPS. I'm happy to do what others feel is best here, but
my default position is just to carry what the existing behavior is
forward.
> > +#if defined(ALL_SYSCALLTBL) || defined(__powerpc__) ||
> > defined(__powerpc64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl"
> > "$outfile" common,32,nospu EM_PPC
> > +build_tables "$tools_dir/perf/arch/powerpc/entry/syscalls/syscall.tbl"
> > "$outfile" common,64,nospu EM_PPC64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__powerpc__) ||
> > defined(__powerpc64__)
>
> This skips the SPU table, but I think that's fine.
>
> > +EOF
> > +build_tables "$tools_dir/perf/arch/s390/entry/syscalls/syscall.tbl"
> > "$outfile" common,64,renameat,rlimit,memfd_secret EM_S390
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__s390x__)
>
> This skips the 32-bit table, though I think that one is already
> planned to be discontinued in the future.
Thankfully we have awesome s390 devs on the mailing list, hopefully
they'll shout out if I'm doing things wrong.
> > +#if defined(ALL_SYSCALLTBL) || defined(__i386__) || defined(__x86_64__)
> > +EOF
> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_32.tbl"
> > "$outfile" common,32,i386 EM_386
> > +build_tables "$tools_dir/perf/arch/x86/entry/syscalls/syscall_64.tbl"
> > "$outfile" common,64 EM_X86_64
> > +cat >> "$outfile" <<EOF
> > +#endif // defined(ALL_SYSCALLTBL) || defined(__i386__) ||
> > defined(__x86_64__)
>
> This misses the x32 table.
Again I'm carrying forward a behavior. Would it be worth adding x32?
Context: I handled x86 on Android over 10 years ago. x86 phones were
64-bit long before ARM or Apple phones, the kernel was 64-bit but the
userland was forced to be 32-bit. ARM32 has R15 be the program
counter, Sophie Wilson's idea, and when Android's security folks
experimented with ASLR they found it to be free as a consequence of
ARM32. On x86 not x32 then you're in the land of thunk bx, losing a
register and extra instructions. We never did x32 on Android as it
became irrelevant when I brought up x86-64 on Android. The desire for
x32 was for RIP encoding to fix the ASLR issue, not so much the extra
registers, and because Android used to mandate a 32-bit user land
(this was so extreme that even the developer's C to generally ARM
cross-compilers were built as 32-bit binaries). Given x32 was done for
Android, Android never used it, I have a hard time thinking we should
be adding support to perf. That said there is likely other context
that I'm unaware of as I'm surprised x32 still exists. Fwiw, it
saddens me that the x32 experience means that for APX we're still
getting the x86-64 ABI moved forward and silly things like the var
args convention on %al (there for C80 compatibility I once believed -
not really an issue today). On ARM64 registers there are at least 8
callee-save general purpose and floating point registers, so the x86
model of pretty much everything is caller-save means function calls
are expensive and you may need aggressive inlining. Anyway, sorry for
going on so much.
Thanks,
Ian
Powered by blists - more mailing lists