[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z5eYzcF5JLR4o5Yl@pathway.suse.cz>
Date: Mon, 27 Jan 2025 15:31:41 +0100
From: Petr Mladek <pmladek@...e.com>
To: Yafang Shao <laoar.shao@...il.com>
Cc: jpoimboe@...nel.org, jikos@...nel.org, mbenes@...e.cz,
joe.lawrence@...hat.com, live-patching@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 2/2] livepatch: Implement livepatch hybrid mode
On Mon 2025-01-27 14:35:26, Yafang Shao wrote:
> The atomic replace livepatch mechanism was introduced to handle scenarios
> where we want to unload a specific livepatch without unloading others.
> However, its current implementation has significant shortcomings, making
> it less than ideal in practice. Below are the key downsides:
[...]
> In the hybrid mode:
>
> - Specific livepatches can be marked as "non-replaceable" to ensure they
> remain active and unaffected during replacements.
>
> - Other livepatches can be marked as "replaceable," allowing targeted
> replacements of only those patches.
>
> This selective approach would reduce unnecessary transitions, lower the
> risk of temporary patch loss, and mitigate performance issues during
> livepatch replacement.
>
> --- a/kernel/livepatch/core.c
> +++ b/kernel/livepatch/core.c
> @@ -658,6 +658,8 @@ static int klp_add_nops(struct klp_patch *patch)
> klp_for_each_object(old_patch, old_obj) {
> int err;
>
> + if (!old_patch->replaceable)
> + continue;
This is one example where things might get very complicated.
The same function might be livepatched by more livepatches, see
ops->func_stack. For example, let's have funcA and three livepatches:
a
+ lp1:
.replace = false,
.non-replace = true,
.func = {
.old_name = "funcA",
.new_func = lp1_funcA,
}, { }
+ lp2:
.replace = false,
.non-replace = false,
.func = {
.old_name = "funcA",
.new_func = lp2_funcA,
},{
.old_name = "funcB",
.new_func = lp2_funcB,
}, { }
+ lp3:
.replace = true,
.non-replace = false,
.func = {
.old_name = "funcB",
.new_func = lp3_funcB,
}, { }
Now, apply lp1:
+ funcA() gets redirected to lp1_funcA()
Then, apply lp2
+ funcA() gets redirected to lp2_funcA()
Finally, apply lp3:
+ The proposed code would add "nop()" for
funcA() because it exists in lp2 and does not exist in lp3.
+ funcA() will get redirected to the original code
because of the nop() during transition
+ nop() will get removed in klp_complete_transition() and
funcA() will get suddenly redirected to lp1_funcA()
because it will still be on ops->func_stack even
after the "nop" and lp2_funcA() gets removed.
=> The entire system will start using another funcA
implementation at some random time
=> this would violate the consistency model
The proper solution might be tricky:
1. We would need to detect this situation and do _not_ add
the "nop" for lp3 and funcA().
2. We would need a more complicate code for handling the task states.
klp_update_patch_state() sets task->patch_state using
the global "klp_target_state". But in the above example,
when enabling lp3:
+ funcA would need to get transitioned _backward_:
KLP_TRANSITION_PATCHED -> KLP_TRANSITION_UNPATCHED
, so that it goes on ops->func_stack:
lp2_funcA -> lp1->funcA
while:
+ funcA would need to get transitioned forward:
KLP_TRANSITION_UNPATCHED -> KLP_TRANSITION_PATCHED
, so that it goes on ops->func_stack:
lp2_funcB -> lp3->funcB
=> the hybrid mode would complicate the life for both livepatch
creators/maintainers and kernel code developers/maintainers.
I am afraid that this complexity is not acceptable if there are
better solutions for the original problem.
> err = klp_add_object_nops(patch, old_obj);
> if (err)
> return err;
I am sorry but I am quite strongly against this approach!
Best Regards,
Petr
Powered by blists - more mailing lists