[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4BzZ4XgSOHz0T5nXPyd+keo=rQvH5jc0Jghw1db0a7qR9GQ@mail.gmail.com>
Date: Thu, 14 Nov 2024 15:44:12 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Jiri Olsa <olsajiri@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Oleg Nesterov <oleg@...hat.com>,
Andrii Nakryiko <andrii@...nel.org>, bpf@...r.kernel.org, Song Liu <songliubraving@...com>,
Yonghong Song <yhs@...com>, John Fastabend <john.fastabend@...il.com>, Hao Luo <haoluo@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Alan Maguire <alan.maguire@...cle.com>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org
Subject: Re: [RFC perf/core 05/11] uprobes: Add mapping for optimized uprobe trampolines
On Tue, Nov 5, 2024 at 8:33 AM Jiri Olsa <olsajiri@...il.com> wrote:
>
> On Tue, Nov 05, 2024 at 03:23:27PM +0100, Peter Zijlstra wrote:
> > On Tue, Nov 05, 2024 at 02:33:59PM +0100, Jiri Olsa wrote:
> > > Adding interface to add special mapping for user space page that will be
> > > used as place holder for uprobe trampoline in following changes.
> > >
> > > The get_tramp_area(vaddr) function either finds 'callable' page or create
> > > new one. The 'callable' means it's reachable by call instruction (from
> > > vaddr argument) and is decided by each arch via new arch_uprobe_is_callable
> > > function.
> > >
> > > The put_tramp_area function either drops refcount or destroys the special
> > > mapping and all the maps are clean up when the process goes down.
> >
> > In another thread somewhere, Andrii mentioned that Meta has executables
> > with more than 4G of .text. This isn't going to work for them, is it?
> >
>
> not if you can't reach the trampoline from the probed address
That specific example was about 1.5GB (though we might have bigger
.text, I didn't do exhaustive research). As Jiri said, this would be
best effort trying to find closest free mapping to stay within +/-2GB
offset. If that fails, we always would be falling back to slower
int3-based uprobing, yep.
Jiri, we could also have an option to support 64-bit call, right? We'd
need nop9 for that, but it's an option as well to future-proofing this
approach, no?
Also, can we somehow use fs/gs-based indirect calls/jumps somehow to
have a guarantee that offset is always small (<2GB away relative to
the base stored in fs/gs). Not sure if this is feasible, but I thought
it would be good to bring this up just to make sure it doesn't work.
If segment based absolute call is somehow feasible, we can probably
simplify a bunch of stuff by allocating it eagerly, once, and
somewhere high up next to VDSO (or maybe even put it into VDSO, don't
now).
Anyways, let's brainstorm if there are any clever alternatives here.
>
> jirka
Powered by blists - more mailing lists