[an error occurred while processing this directive]
|
|
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2810b68b-a6ec-86f9-6915-bda958b39d3a@linux.microsoft.com>
Date: Mon, 26 Jul 2021 08:49:58 -0500
From: "Madhavan T. Venkataraman" <madvenka@...ux.microsoft.com>
To: broonie@...nel.org, mark.rutland@....com, jpoimboe@...hat.com,
ardb@...nel.org, nobuta.keiya@...itsu.com,
sjitindarsingh@...il.com, catalin.marinas@....com, will@...nel.org,
jmorris@...ei.org, pasha.tatashin@...een.com, jthierry@...hat.com,
linux-arm-kernel@...ts.infradead.org,
live-patching@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v6 0/3] arm64: Implement stack trace reliability
checks
Hi Mark Brown, Mark Rutland,
Could you please review this version of reliable stack trace?
Thanks.
Madhavan
On 6/30/21 5:33 PM, madvenka@...ux.microsoft.com wrote:
> From: "Madhavan T. Venkataraman" <madvenka@...ux.microsoft.com>
>
> Unwinder return value
> =====================
>
> Currently, the unwinder returns a tri-state return value:
>
> 0 means "continue with the unwind"
> -ENOENT means "successful termination of the stack trace"
> -EINVAL means "fatal error, abort the stack trace"
>
> This is confusing. To fix this, define an enumeration of different return
> codes to make it clear.
>
> enum {
> UNWIND_CONTINUE, /* No errors encountered */
> UNWIND_ABORT, /* Fatal errors encountered */
> UNWIND_FINISH, /* End of stack reached successfully */
> };
>
> Reliability checks
> ==================
>
> There are a number of places in kernel code where the stack trace is not
> reliable. Enhance the unwinder to check for those cases.
>
> Return address check
> --------------------
>
> Check the return PC of every stack frame to make sure that it is a valid
> kernel text address (and not some generated code, for example).
>
> Low-level function check
> ------------------------
>
> Low-level assembly functions are, by nature, unreliable from an unwinder
> perspective. The unwinder must check for them in a stacktrace. See the
> "Assembly Functions" section below.
>
> Other checks
> ------------
>
> Other checks may be added in the future. Once all of the checks are in place,
> the unwinder can provide a reliable stack trace. But before this can be used
> for livepatch, some other entity needs to validate the frame pointer in kernel
> functions. objtool is currently being worked on to address that need.
>
> Return code
> -----------
>
> If a reliability check fails, it is a non-fatal error. The unwinder needs to
> return an appropriate code so the caller knows that some non-fatal error has
> occurred. Add another code to the enumeration:
>
> enum {
> UNWIND_CONTINUE, /* No errors encountered */
> UNWIND_CONTINUE_WITH_RISK, /* Non-fatal errors encountered */
> UNWIND_ABORT, /* Fatal errors encountered */
> UNWIND_FINISH, /* End of stack reached successfully */
> };
>
> When the unwinder returns UNWIND_CONTINUE_WITH_RISK:
>
> - Debug-type callers can choose to continue the unwind
>
> - Livepatch-type callers will stop unwinding
>
> So, arch_stack_walk_reliable() (implemented in the future) will look like
> this:
>
> /*
> * Walk the stack like arch_stack_walk() but stop the walk as soon as
> * some unreliability is detected in the stack.
> */
> int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
> void *cookie, struct task_struct *task)
> {
> struct stackframe frame;
> enum unwind_rc rc;
>
> if (task == current) {
> rc = start_backtrace(&frame,
> (unsigned long)__builtin_frame_address(0),
> (unsigned long)arch_stack_walk_reliable);
> } else {
> /*
> * The task must not be running anywhere for the duration of
> * arch_stack_walk_reliable(). The caller must guarantee
> * this.
> */
> rc = start_backtrace(&frame,
> thread_saved_fp(task),
> thread_saved_pc(task));
> }
>
> while (rc == UNWIND_CONTINUE) {
> if (!consume_entry(cookie, frame.pc))
> return -EINVAL;
> rc = unwind_frame(task, &frame);
> }
>
> return rc == UNWIND_FINISH ? 0 : -EINVAL;
> }
>
> Assembly functions
> ==================
>
> There are a number of assembly functions in arm64. Except for a couple of
> them, these functions do not have a frame pointer prolog or epilog. Also,
> many of them manipulate low-level state such as registers. These functions
> are, by definition, unreliable from a stack unwinding perspective. That is,
> when these functions occur in a stack trace, the unwinder would not be able
> to unwind through them reliably.
>
> Assembly functions are defined as SYM_FUNC_*() functions or SYM_CODE_*()
> functions. objtool peforms static analysis of SYM_FUNC functions. It ignores
> SYM_CODE functions because they have low level code that is difficult to
> analyze. When objtool becomes ready eventually, SYM_FUNC functions will
> be analyzed and "fixed" as necessary. So, they are not "interesting" for
> the reliable unwinder.
>
> That leaves SYM_CODE functions. It is for the unwinder to deal with these
> for reliable stack trace. The unwinder needs to do the following:
>
> - Recognize SYM_CODE functions in a stack trace.
>
> - If a particular SYM_CODE function can be unwinded through using
> some special logic, then do it. E.g., the return trampoline for
> Function Graph Tracing.
>
> - Otherwise, return UNWIND_CONTINUE_WITH_RISK.
>
> Current approach
> ================
>
> Define an ARM64 version of SYM_CODE_END() like this:
>
> #define SYM_CODE_END(name) \
> SYM_END(name, SYM_T_NONE) ;\
> 99: ;\
> .pushsection "sym_code_functions", "aw" ;\
> .quad name ;\
> .quad 99b ;\
> .popsection
>
> The above macro does the usual SYM_END(). In addition, it records the
> function's address range in a special section called "sym_code_functions".
> This way, all SYM_CODE functions get recorded in the section automatically.
>
> Implement an early_initcall() called init_sym_code_functions() that allocates
> an array called sym_code_functions[] and copies the function ranges from the
> section to the array.
>
> Add a reliability check in unwind_check_frame() that compares a return
> PC with sym_code_functions[]. If there is a match, then return
> UNWIND_CONTINUE_WITH_RISK.
>
> Call unwinder_is_unreliable() on every return PC from unwind_frame(). If there
> is a match, then return UNWIND_CONTINUE_WITH_RISK.
>
> Last stack frame
> ================
>
> If a SYM_CODE function occurs in the very last frame in the stack trace,
> then the stack trace is not considered unreliable. This is because there
> is no more unwinding to do. Examples:
>
> - EL0 exception stack traces end in the top level EL0 exception
> handlers.
>
> - All kernel thread stack traces end in ret_from_fork().
>
> Special SYM_CODE functions
> ==========================
>
> The return trampolines of the Function Graph Tracer and Kretprobe can
> be recognized by the unwinder. If the return PCs can be translated to the
> original PCs, then, the unwinder should perform that translation before
> checking for reliability. The final PC that we end up with after all the
> translations is the one we need to check for reliability.
>
> Accordingly, I have moved the reliability checks to after the logic that
> handles the Function Graph Tracer.
>
> So, the approach is - all SYM_CODE functions are unreliable. If a SYM_CODE
> function is "fixed" to make it reliable, then it should become a SYM_FUNC
> function. Or, if the unwinder has special logic to unwind through a SYM_CODE
> function, then that can be done.
>
> Special cases
> =============
>
> Some special cases need to be mentioned:
>
> - EL1 interrupt and exception handlers end up in sym_code_ranges[].
> So, all EL1 interrupt and exception stack traces will be considered
> unreliable. This the correct behavior as interrupts and exceptions
> can happen on any instruction including ones in the frame pointer
> prolog and epilog. Unless objtool generates metadata so the unwinder
> can unwind through these special cases, such stack traces will be
> considered unreliable.
>
> - A task can get preempted at the end of an interrupt. Stack traces
> of preempted tasks will show the interrupt frame in the stack trace
> and will be considered unreliable.
>
> - Breakpoints are exceptions. So, all stack traces in the break point
> handler (including probes) will be considered unreliable.
>
> - All of the ftrace trampolines end up in sym_code_functions[]. All
> stack traces taken from tracer functions will be considered
> unreliable.
> ---
> Changelog:
>
> v6:
> From Mark Rutland:
>
> - The per-frame reliability concept and flag are acceptable. But more
> work is needed to make the per-frame checks more accurate and more
> complete. E.g., some code reorg is being worked on that will help.
>
> I have now removed the frame->reliable flag and deleted the whole
> concept of per-frame status. This is orthogonal to this patch series.
> Instead, I have improved the unwinder to return proper return codes
> so a caller can take appropriate action without needing per-frame
> status.
>
> - Remove the mention of PLTs and update the comment.
>
> I have replaced the comment above the call to __kernel_text_address()
> with the comment suggested by Mark Rutland.
>
> Other comments:
>
> - Other comments on the per-frame stuff are not relevant because
> that approach is not there anymore.
>
> v5:
> From Keiya Nobuta:
>
> - The term blacklist(ed) is not to be used anymore. I have changed it
> to unreliable. So, the function unwinder_blacklisted() has been
> changed to unwinder_is_unreliable().
>
> From Mark Brown:
>
> - Add a comment for the "reliable" flag in struct stackframe. The
> reliability attribute is not complete until all the checks are
> in place. Added a comment above struct stackframe.
>
> - Include some of the comments in the cover letter in the actual
> code so that we can compare it with the reliable stack trace
> requirements document for completeness. I have added a comment:
>
> - above unwinder_is_unreliable() that lists the requirements
> that are addressed by the function.
>
> - above the __kernel_text_address() call about all the cases
> the call covers.
>
> v4:
> From Mark Brown:
>
> - I was checking the return PC with __kernel_text_address() before
> the Function Graph trace handling. Mark Brown felt that all the
> reliability checks should be performed on the original return PC
> once that is obtained. So, I have moved all the reliability checks
> to after the Function Graph Trace handling code in the unwinder.
> Basically, the unwinder should perform PC translations first (for
> rhe return trampoline for Function Graph Tracing, Kretprobes, etc).
> Then, the reliability checks should be applied to the resulting
> PC.
>
> - Mark said to improve the naming of the new functions so they don't
> collide with existing ones. I have used a prefix "unwinder_" for
> all the new functions.
>
> From Josh Poimboeuf:
>
> - In the error scenarios in the unwinder, the reliable flag in the
> stack frame should be set. Implemented this.
>
> - Some of the other comments are not relevant to the new code as
> I have taken a different approach in the new code. That is why
> I have not made those changes. E.g., Ard wanted me to add the
> "const" keyword to the global section array. That array does not
> exist in v4. Similarly, Mark Brown said to use ARRAY_SIZE() for
> the same array in a for loop.
>
> Other changes:
>
> - Add a new definition for SYM_CODE_END() that adds the address
> range of the function to a special section called
> "sym_code_functions".
>
> - Include the new section under initdata in vmlinux.lds.S.
>
> - Define an early_initcall() to copy the contents of the
> "sym_code_functions" section to an array by the same name.
>
> - Define a function unwinder_blacklisted() that compares a return
> PC against sym_code_sections[]. If there is a match, mark the
> stack trace unreliable. Call this from unwind_frame().
>
> v3:
> - Implemented a sym_code_ranges[] array to contains sections bounds
> for text sections that contain SYM_CODE_*() functions. The unwinder
> checks each return PC against the sections. If it falls in any of
> the sections, the stack trace is marked unreliable.
>
> - Moved SYM_CODE functions from .text and .init.text into a new
> text section called ".code.text". Added this section to
> vmlinux.lds.S and sym_code_ranges[].
>
> - Fixed the logic in the unwinder that handles Function Graph
> Tracer return trampoline.
>
> - Removed all the previous code that handles:
> - ftrace entry code for traced function
> - special_functions[] array that lists individual functions
> - kretprobe_trampoline() special case
>
> v2
> - Removed the terminating entry { 0, 0 } in special_functions[]
> and replaced it with the idiom { /* sentinel */ }.
>
> - Change the ftrace trampoline entry ftrace_graph_call in
> special_functions[] to ftrace_call + 4 and added explanatory
> comments.
>
> - Unnested #ifdefs in special_functions[] for FTRACE.
>
> v1
> - Define a bool field in struct stackframe. This will indicate if
> a stack trace is reliable.
>
> - Implement a special_functions[] array that will be populated
> with special functions in which the stack trace is considered
> unreliable.
>
> - Using kallsyms_lookup(), get the address ranges for the special
> functions and record them.
>
> - Implement an is_reliable_function(pc). This function will check
> if a given return PC falls in any of the special functions. If
> it does, the stack trace is unreliable.
>
> - Implement check_reliability() function that will check if a
> stack frame is reliable. Call is_reliable_function() from
> check_reliability().
>
> - Before a return PC is checked against special_funtions[], it
> must be validates as a proper kernel text address. Call
> __kernel_text_address() from check_reliability().
>
> - Finally, call check_reliability() from unwind_frame() for
> each stack frame.
>
> - Add EL1 exception handlers to special_functions[].
>
> el1_sync();
> el1_irq();
> el1_error();
> el1_sync_invalid();
> el1_irq_invalid();
> el1_fiq_invalid();
> el1_error_invalid();
>
> - The above functions are currently defined as LOCAL symbols.
> Make them global so that they can be referenced from the
> unwinder code.
>
> - Add FTRACE trampolines to special_functions[]:
>
> ftrace_graph_call()
> ftrace_graph_caller()
> return_to_handler()
>
> - Add the kretprobe trampoline to special functions[]:
>
> kretprobe_trampoline()
>
> Previous versions and discussion
> ================================
>
> v5: https://lore.kernel.org/linux-arm-kernel/20210526214917.20099-1-madvenka@linux.microsoft.com/
> v4: https://lore.kernel.org/linux-arm-kernel/20210516040018.128105-1-madvenka@linux.microsoft.com/
> v3: https://lore.kernel.org/linux-arm-kernel/20210503173615.21576-1-madvenka@linux.microsoft.com/
> v2: https://lore.kernel.org/linux-arm-kernel/20210405204313.21346-1-madvenka@linux.microsoft.com/
> v1: https://lore.kernel.org/linux-arm-kernel/20210330190955.13707-1-madvenka@linux.microsoft.com/
> Madhavan T. Venkataraman (3):
> arm64: Improve the unwinder return value
> arm64: Introduce stack trace reliability checks in the unwinder
> arm64: Create a list of SYM_CODE functions, check return PC against
> list
>
> arch/arm64/include/asm/linkage.h | 12 ++
> arch/arm64/include/asm/sections.h | 1 +
> arch/arm64/include/asm/stacktrace.h | 16 ++-
> arch/arm64/kernel/perf_callchain.c | 5 +-
> arch/arm64/kernel/process.c | 8 +-
> arch/arm64/kernel/return_address.c | 10 +-
> arch/arm64/kernel/stacktrace.c | 180 ++++++++++++++++++++++++----
> arch/arm64/kernel/time.c | 9 +-
> arch/arm64/kernel/vmlinux.lds.S | 7 ++
> 9 files changed, 213 insertions(+), 35 deletions(-)
>
>
> base-commit: bf05bf16c76bb44ab5156223e1e58e26dfe30a88
>
Powered by blists - more mailing lists