[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220713225511.70d03fc6@gandalf.local.home>
Date: Wed, 13 Jul 2022 22:55:11 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Song Liu <songliubraving@...com>
Cc: Song Liu <song@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"andrii@...nel.org" <andrii@...nel.org>,
Kernel Team <Kernel-team@...com>,
"jolsa@...nel.org" <jolsa@...nel.org>,
"mhiramat@...nel.org" <mhiramat@...nel.org>
Subject: Re: [PATCH v2 bpf-next 1/5] ftrace: allow customized flags for
ftrace_direct_multi ftrace_ops
On Thu, 14 Jul 2022 01:42:59 +0000
Song Liu <songliubraving@...com> wrote:
> > As I replied to patch 3, here's my thoughts.
> >
> > DIRECT is treated as though it changes the IP. If you register it to a
> > function that has an IPMODIFY already set to it, it will call the
> > ops->ops_func() asking if the ops can use SHARED_IPMODIFY (which means
> > a DIRECT can be shared with IPMODIFY). If it can, then it returns true,
> > and the SHARED_IPMODIFY is set *by ftrace*. The user of the ftrace APIs
> > should not have to manage this. It should be managed by the ftrace
> > infrastructure.
>
> Hmm... I don't think this gonna work.
>
> First, two IPMODIFY ftrace ops cannot work together on the same kernel
> function. So there won't be a ops with both IPMODIFY and SHARE_IPMODIFY.
I'm not saying that.
I'm saying that ftrace does not have any clue (nor cares) about what a
DIRECT ops does. It might modify the IP or it might not. It doesn't know.
But ftrace has control over the callbacks it does control.
A DIRECT ops knows if it can work with another ops that has IPMODIFY set.
If the DIRECT ops does not work with IPMODIFY (perhaps it wants to modify
the IP), then it will tell ftrace that it can't and ftrace will not allow
it.
That is, ftrace doesn't care if the DIRECT ops modifies the IP or not. It
can only ask (through the ops->ops_func()) if the direct trampoline can
honor the IP that is modified. If it can, then it reports back that it can,
and then ftrace will set that ops to SHARED_MODIFY, and the direct function
can switch the ops->func() to one that uses SHARED_MODIFY.
>
> non-direct ops without IPMODIFY can already share with IPMODIFY ops.
It can? ftrace sets IPMODIFY for all DIRECT callers to prevent that. Except
for this patch that removes that restriction (which I believe is broken).
> Only direct ops need SHARE_IPMODIFY flag, and it means "I can share with
> another ops with IPMODIFY". So there will be different flavors of
> direct ops:
I agree that only DIRECT ops can have SHARED_IPMODIFY set. That's what I'm
saying. But I'm saying it gets set by ftrace.
>
> 1. w/ IPMODIFY, w/o SHARE_IPMODIFY;
> 2. w/o IPMODIFY, w/o SHARE_IPMODIFY;
> 3. w/o IPMODIFY, w/ SHARE_IPMODIFY.
>
> #1 can never work on the same function with another IPMODIFY ops, and
> we don't plan to make it work. #2 does not work with another IPMODIFY
> ops. And #3 works with another IPMODIFY ops.
Lets look at this differently. What I'm proposing is that registering a
direct ops does not need to tell ftrace if it modifies the IP or not.
That's because ftrace doesn't care. Once ftrace calls the direct trampoline
it loses all control. With the ftrace ops callbacks, it is the one
responsible for setting up the modified IP. That's not the case with the
direct trampolines.
I'm saying that ftrace doesn't care what the DIRECT ops does. But it will
not, by default, allow an IPMODIFY to happen when a DIRECT ops is on the
same function, or vice versa.
What I'm suggesting is when a IPMODIFY tries to attach to a function that
also has a DIRECT ops, or a DIRECT tries to attach to a function that
already has an IPMODIFY ops on it, that ftrace calls the direct
ops->ops_func() asking if it is safe to use with an IPMODIFY function.
If the direct ops modifies the IP itself, it will return a "no", and ftrace
will reject the attachment. If the direct ops can, it returns a "yes" and
then ftrace will set the SHARED_IPMODIFY flag to that ops and continue.
The fact that the ops->ops_func was called will let the caller (bpf) know
that it needs to use the stack to return to the function, or to call it if
it is also tracing the return.
If the IPMODIFY ops is removed, then ftrace will call the ops->ops_func()
telling it it no longer has the IPMODIFY set, and will clear the
SHARED_IPMODIFY flag, and then the owner of the direct ops can go back to
doing whatever it did before (the calling the function directly, or
whatever).
>
> The owner of the direct trampoline uses these flags to tell ftrace
> infrastructure the property of this trampoline.
I disagree. The owner shouldn't have to care about the flags. Let ftrace
handle it. This will make things so much more simple for both BPF and
ftrace.
>
> BPF trampolines with only fentry calls are #3 direct ops. BPF
> trampolines with fexit or fmod_ret calls will be #2 trampoline by
> default, but it is also possible to generate #3 trampoline for it.
And ftrace doesn't care about this. But bpf needs to care if the IP is
being modified or not. Which can be done by the ops->ops_func() that you
added.
>
> BPF side will try to register #2 trampoline, If ftrace detects another
> IPMODIFY ops on the same function, it will let BPF trampoline know
> with -EAGAIN from register_ftrace_direct_multi(). Then, BPF side will
> regenerate a #3 trampoline and register it again.
This is too complex. You are missing the simple way.
>
> I know this somehow changes the policy with direct ops, but it is the
> only way this can work, AFAICT.
I disagree. There's a much better way that this can work.
>
> Does this make sense? Did I miss something?
Let me start from the beginning.
1. Live kernel patching patches function foo.
2. lkp has an ops->flags | IPMODIFY set when it registers.
3. bpf registers a direct trampoline to function foo.
4. bpf has an ops->flags | DIRECT set when it registers
5. ftrace sees that the function already has an ops->flag = IPMODIFY on it,
so it calls bpf ops->ops_func(SHARE_WITH_IPMODIFY)
6. bpf can and does the following
a. if it's the simple #1 trampoline (just traces the start of a function)
it doesn't need to do anything special returns "yes".
b. if it's the #2 trampoline, it will change the trampoline to use the
stack to find what to call, and returns "yes".
7. ftrace gets "yes" and sets the *ipmodify* ops with SHARED_IPMODIFY
(there's a reason for setting this flag for the ipmodify ops and not the
direct ops).
8. LKP is removed from the function foo.
9. ftrace sees the lkp IPMODIFY ops has SHARED_IPMODIFY on it, and knows
that there's a direct call here too. It removes the IPMODIFY ops, and
then calls the direct ops->ops_func(STOP_SHARE_WITH_IPMODIFY) to let the
direct code know that it is no longer sharing with an IPMODIFY such that
it can change to call the function directly and not use the stack.
See how simple this is? ftrace doesn't have to care if the direct caller
changes the IP or not. It just wants to know if it can be shared with an
IPMODIFY ops. And BPF doesn't have to care about extra flags used to manage
the ftrace infrastructure.
Does this make sense now?
-- Steve
Powered by blists - more mailing lists