Message-ID: <CAEf4BzaL=qsSyDc8OxeN4pr7+Lvv+de4f+hM5a56LY8EABAk3w@mail.gmail.com>
Date: Tue, 9 Feb 2021 12:59:51 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Jiri Olsa <jolsa@...hat.com>
Cc: Nathan Chancellor <nathan@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>,
Nick Desaulniers <ndesaulniers@...gle.com>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
clang-built-linux <clang-built-linux@...glegroups.com>,
Veronika Kabatova <vkabatov@...hat.com>,
Jiri Olsa <jolsa@...nel.org>
Subject: Re: FAILED unresolved symbol vfs_truncate on arm64 with LLVM
On Tue, Feb 9, 2021 at 7:09 AM Jiri Olsa <jolsa@...hat.com> wrote:
>
> On Tue, Feb 09, 2021 at 01:36:41PM +0100, Jiri Olsa wrote:
> > On Tue, Feb 09, 2021 at 12:49:04AM -0700, Nathan Chancellor wrote:
> > > On Mon, Feb 08, 2021 at 10:56:36PM -0800, Andrii Nakryiko wrote:
> > > > On Mon, Feb 8, 2021 at 10:13 PM Andrii Nakryiko
> > > > <andrii.nakryiko@...il.com> wrote:
> > > > >
> > > > > On Mon, Feb 8, 2021 at 10:09 PM Andrii Nakryiko
> > > > > <andrii.nakryiko@...il.com> wrote:
> > > > > >
> > > > > > On Mon, Feb 8, 2021 at 9:23 PM Nathan Chancellor <nathan@...nel.org> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 08, 2021 at 08:45:43PM -0800, Andrii Nakryiko wrote:
> > > > > > > > On Mon, Feb 8, 2021 at 7:44 PM Nathan Chancellor <nathan@...nel.org> wrote:
> > > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > Recently, an issue with CONFIG_DEBUG_INFO_BTF was reported for arm64:
> > > > > > > > > https://groups.google.com/g/clang-built-linux/c/de_mNh23FOc/m/E7cu5BwbBAAJ
> > > > > > > > >
> > > > > > > > > $ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
> > > > > > > > > LLVM=1 O=build/aarch64 defconfig
> > > > > > > > >
> > > > > > > > > $ scripts/config \
> > > > > > > > > --file build/aarch64/.config \
> > > > > > > > > -e BPF_SYSCALL \
> > > > > > > > > -e DEBUG_INFO_BTF \
> > > > > > > > > -e FTRACE \
> > > > > > > > > -e FUNCTION_TRACER
> > > > > > > > >
> > > > > > > > > $ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
> > > > > > > > > LLVM=1 O=build/aarch64 olddefconfig all
> > > > > > > > > ...
> > > > > > > > > FAILED unresolved symbol vfs_truncate
> > > > > > > > > ...
> > > > > > > > >
> > > > > > > > > My bisect landed on commit 6e22ab9da793 ("bpf: Add d_path helper")
> > > > > > > > > although that seems obvious given that is what introduced
> > > > > > > > > BTF_ID(func, vfs_truncate).
> > > > > > > > >
> > > > > > > > > I am using the latest pahole v1.20 and LLVM is at
> > > > > > > > > https://github.com/llvm/llvm-project/commit/14da287e18846ea86e45b421dc47f78ecc5aa7cb
> > > > > > > > > although I can reproduce back to LLVM 10.0.1, which is the earliest
> > > > > > > > > version that the kernel supports. I am very unfamiliar with BPF so I
> > > > > > > > > have no idea what is going wrong here. Is this a known issue?
> > > > > > > > >
> > > > > > > >
> > > > > > > > I'll skip the reproduction games this time and will just request the
> > > > > > > > vmlinux image. Please upload somewhere so that we can look at DWARF
> > > > > > > > and see what's going on. Thanks.
> > > > > > > >
> > > > > > >
> > > > > > > Sure thing, let me know if this works. I uploaded in two places to make
> > > > > > > it easier to grab:
> > > > > > >
> > > > > > > zstd compressed:
> > > > > > > https://github.com/nathanchance/bug-files/blob/3b2873751e29311e084ae2c71604a1963f5e1a48/btf-aarch64/vmlinux.zst
> > > > > > >
> > > > > >
> > > > > > Thanks. I clearly see at least one instance of seemingly well-formed
> > > > > > vfs_truncate DWARF declaration. Also there is a proper ELF symbol for
> > > > > > it. Which means it should have been generated in BTF, but it doesn't
> > > > > > appear to be, so it does seem like a pahole bug. I (or someone else
> > > > > > before me) will continue tomorrow.
> > > > > >
> > > > > > $ llvm-dwarfdump vmlinux
> > > > > > ...
> > > > > >
> > > > > > 0x00052e6f: DW_TAG_subprogram
> > > > > > DW_AT_name ("vfs_truncate")
> > > > > > DW_AT_decl_file
> > > > > > ("/home/nathan/cbl/src/linux/include/linux/fs.h")
> > > > > > DW_AT_decl_line (2520)
> > > > > > DW_AT_prototyped (true)
> > > > > > DW_AT_type (0x000452cb "long int")
> > > > > > DW_AT_declaration (true)
> > > > > > DW_AT_external (true)
> > > > > >
> > > > > > 0x00052e7b: DW_TAG_formal_parameter
> > > > > > DW_AT_type (0x00045fc6 "const path*")
> > > > > >
> > > > > > 0x00052e80: DW_TAG_formal_parameter
> > > > > > DW_AT_type (0x00045213 "long long int")
> > > > > >
> > > > > > ...
> > > > > >
> > > > >
> > > > > ... and here's the *only* other one (not marked as declaration, but I
> > > > > thought we already handle that, Jiri?):
> > > > >
> > > > > 0x01d0da35: DW_TAG_subprogram
> > > > > DW_AT_low_pc (0xffff80001031f430)
> > > > > DW_AT_high_pc (0xffff80001031f598)
> > > > > DW_AT_frame_base (DW_OP_reg29)
> > > > > DW_AT_GNU_all_call_sites (true)
> > > > > DW_AT_name ("vfs_truncate")
> > > > > DW_AT_decl_file ("/home/nathan/cbl/src/linux/fs/open.c")
> > > > > DW_AT_decl_line (69)
> > > > > DW_AT_prototyped (true)
> > > > > DW_AT_type (0x01cfdfe4 "long int")
> > > > > DW_AT_external (true)
> > > > >
> > > >
> > > > Ok, the problem appears to be not in DWARF, but in mcount_loc data.
> > > > vfs_truncate's address is not recorded as ftrace-attachable, and thus
> > > > pahole ignores it. I don't know why this happens and it's quite
> > > > strange, given vfs_truncate is just a normal global function.
> >
> > right, I can't see it in mcount addresses.. but it begins with instructions
> > that appear to be nops, which would suggest it's traceable
> >
> > ffff80001031f430 <vfs_truncate>:
> > ffff80001031f430: 5f 24 03 d5 hint #34
> > ffff80001031f434: 1f 20 03 d5 nop
> > ffff80001031f438: 1f 20 03 d5 nop
> > ffff80001031f43c: 3f 23 03 d5 hint #25
> >
> > > >
> > > > I'd like to understand this issue before we try to fix it, but there
> > > > is at least one improvement we can make: pahole should check ftrace
> > > > addresses only for static functions, not the global ones (global ones
> > > > should be always attachable, unless they are special, e.g., notrace
> > > > and stuff). We can easily check that by looking at the corresponding
> > > > symbol. But I'd like to verify that vfs_truncate is ftrace-attachable
>
> I'm still trying to build the kernel.. however ;-)
>
> patch below adds the ftrace check only for static functions
> and lets the external ones go through.. but as you said, in this
> case we'll need to figure out the 'notrace' and other checks
> ftrace is doing
>
> jirka
>
>
> ---
> diff --git a/btf_encoder.c b/btf_encoder.c
> index b124ec20a689..4d147406cfa5 100644
> --- a/btf_encoder.c
> +++ b/btf_encoder.c
> @@ -734,7 +734,7 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> continue;
> if (!has_arg_names(cu, &fn->proto))
> continue;
> - if (functions_cnt) {
> + if (!fn->external && functions_cnt) {
I wouldn't trust DWARF, honestly. Wouldn't checking GLOBAL vs LOCAL
FUNC ELF symbol be more reliable?
> struct elf_function *func;
> const char *name;
>
> @@ -746,9 +746,6 @@ int cu__encode_btf(struct cu *cu, int verbose, bool force,
> if (!func || func->generated)
> continue;
> func->generated = true;
> - } else {
> - if (!fn->external)
> - continue;
> }
>
> btf_fnproto_id = btf_elf__add_func_proto(btfe, cu, &fn->proto, type_id_off);
>