Message-ID: <CAPhsuW7F4KritXPXixoPSw4zbCsqpfZaYBuw5BgD+KKXaoeGxg@mail.gmail.com>
Date: Mon, 24 Jan 2022 23:07:45 -0800
From: Song Liu <song@...nel.org>
To: Hao Luo <haoluo@...gle.com>
Cc: Alexei Starovoitov <ast@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
Martin KaFai Lau <kafai@...com>,
KP Singh <kpsingh@...nel.org>, bpf <bpf@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
Jiri Olsa <jolsa@...nel.org>,
Blake Jones <blakejones@...gle.com>,
Alexey Alexandrov <aalexand@...gle.com>,
Namhyung Kim <namhyung@...gle.com>,
Ian Rogers <irogers@...gle.com>,
"pasha.tatashin@...een.com" <pasha.tatashin@...een.com>
Subject: Re: [Question] How to reliably get BuildIDs from bpf prog
On Mon, Jan 24, 2022 at 2:43 PM Hao Luo <haoluo@...gle.com> wrote:
>
> Dear BPF experts,
>
> I'm working on collecting some kernel performance data using BPF
> tracing prog. Our performance profiling team wants to associate the
> data with user stack information. One of the requirements is to
> reliably get BuildIDs from bpf_get_stackid() and other similar helpers
> [1].
>
> As part of an early investigation, we found that there are a couple
> issues that make bpf_get_stackid() much less reliable than we'd like
> for our use:
>
> 1. The first page of many binaries (which contains the ELF headers and
> thus the BuildID that we need) is often not in memory. The failure of
> find_get_page() (called from build_id_parse()) is higher than we would
> want.
Our main use of bpf_get_stack() is from NMI context, where we cannot
sleep to read the page in, so there isn't much we can do. Maybe it is
possible to improve this by changing the layout of the binary and the
libraries? Specifically, if some text is also in the first page, is that
page more likely to stay in memory?
> 2. When anonymous huge pages are used to hold some regions of process
> text, build_id_parse() also fails to get a BuildID because
> vma->vm_file is NULL.
How did the text get into anonymous memory? I guess it is NOT from a JIT?
We had a hack that used transparent huge pages for application text. The
hack looks like this:
"At run time, the application creates an 8MB temporary buffer and the
hot section of the executable memory is copied to it. The 8MB region in
the executable memory is then converted to a huge page (by way of an
mmap() to anonymous pages and an madvise() to create a huge page), the
data is copied back to it, and it is made executable again using
mprotect()."
If your case is the same (or similar), it can probably be fixed with
CONFIG_READ_ONLY_THP_FOR_FS, which allows huge pages for file-backed
read-only text, plus modified user space.
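For reference, the user-space hack quoted above can be sketched roughly
as follows. This is only an illustration under stated assumptions, not
the code we actually ran: remap_as_thp() and HUGE_SZ are hypothetical
names, a 2MB huge page size is assumed, and the region is assumed to be
readable and huge-page aligned. A real version for program text must
also run from code that lives outside the region being replaced.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define HUGE_SZ (2UL << 20) /* assumed huge page size: 2MB */

/*
 * Hypothetical helper: replace [addr, addr + len) with anonymous memory
 * holding the same bytes, and ask the kernel to back it with huge pages.
 * addr and len must be multiples of HUGE_SZ, and the region must be
 * readable. Returns 0 on success, -1 on error.
 */
static int remap_as_thp(void *addr, size_t len, int final_prot)
{
	void *tmp, *p;

	assert((uintptr_t)addr % HUGE_SZ == 0 && len % HUGE_SZ == 0);

	/* 1. Copy the hot region to a temporary buffer. */
	tmp = malloc(len);
	if (!tmp)
		return -1;
	memcpy(tmp, addr, len);

	/* 2. Map anonymous pages over the original range. */
	p = mmap(addr, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	if (p == MAP_FAILED) {
		free(tmp);
		return -1;
	}

	/* 3. Request huge pages; treated as best effort in this sketch. */
	(void)madvise(addr, len, MADV_HUGEPAGE);

	/* 4. Copy the data back and restore the final protection. */
	memcpy(addr, tmp, len);
	free(tmp);
	return mprotect(addr, len, final_prot);
}
```

For an 8MB hot text section this would be called as something like
remap_as_thp(hot_start, 8UL << 20, PROT_READ | PROT_EXEC), where
hot_start is a hypothetical, suitably aligned address. After step 2 the
range is anonymous, so its vma->vm_file is NULL, which matches the
build_id_parse() failure described in (2).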
Thanks,
Song