[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20221129100647.4957579e@gandalf.local.home>
Date: Tue, 29 Nov 2022 10:06:47 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Nadav Amit <namit@...are.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Arnd Bergmann <arnd@...db.de>,
Thomas Gleixner <tglx@...utronix.de>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
kernel list <linux-kernel@...r.kernel.org>,
"linux-ia64@...r.kernel.org" <linux-ia64@...r.kernel.org>,
"linux-um@...ts.infradead.org" <linux-um@...ts.infradead.org>,
Linux-Arch <linux-arch@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>, Andy Lutomirski <luto@...nel.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
X86 ML <x86@...nel.org>, Richard Weinberger <richard@....at>,
Anton Ivanov <anton.ivanov@...bridgegreys.com>,
Johannes Berg <johannes@...solutions.net>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 3/3] compiler: inline does not imply notrace
On Tue, 29 Nov 2022 04:25:38 +0000
Nadav Amit <namit@...are.com> wrote:
> I will need to further debug it, but this issue does not occur every time.
>
> The kernel didn’t crash exactly - it’s more of a deadlock. I have lockdep
> enabled, so it is not a deadlock that lockdep knows. Could it be that
> somehow things just slowed down due to IPIs and mostly-disabled IRQs? I have
> no idea. I would need to recreate the scenario.
You have lockdep enabled and you are running function tracing with stack
trace on? So you are doing a stack trace on *every* function that is traced?
I don't think you hit a deadlock, I think you hit a live lock. You could
possibly slow the system down so much that when an interrupt finishes it's
time for it to be triggered again, and you never make forward progress.
>
> For the record, I tried to saved some details in the previous email. It was
> kind of hard to understand what’s going on on the other cores, since the
> trace of other cores was interleaved. I extract the parts from that I think
> the refer to the another CPU (yes, the output is really slow, as seen in the
> timestamps):
>
> [531413.923628] Code: 00 00 31 c0 eb f1 0f 1f 80 00 00 00 00 e8 1b 2e 16 3e 55 48 89 e5 c6 07 00 0f 1f 00 f7 c6 00 02 00 00 74 06 fb 0f 1f 44 00 00 <bf> 01 00 00 00 e8 99 da f1 fe 65 8b 05 f2 99 d7 7d 85 c0 74 02 5d
>
> All code
> ========
> 0: 00 00 add %al,(%rax)
> 2: 31 c0 xor %eax,%eax
> 4: eb f1 jmp 0xfffffffffffffff7
> 6: 0f 1f 80 00 00 00 00 nopl 0x0(%rax)
> d: e8 1b 2e 16 3e call 0x3e162e2d
> 12: 55 push %rbp
> 13: 48 89 e5 mov %rsp,%rbp
> 16: c6 07 00 movb $0x0,(%rdi)
> 19: 0f 1f 00 nopl (%rax)
> 1c: f7 c6 00 02 00 00 test $0x200,%esi
> 22: 74 06 je 0x2a
> 24: fb sti
> 25: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 2a:* bf 01 00 00 00 mov $0x1,%edi <-- trapping instruction
> 2f: e8 99 da f1 fe call 0xfffffffffef1dacd
> 34: 65 8b 05 f2 99 d7 7d mov %gs:0x7dd799f2(%rip),%eax # 0x7dd79a2d
> 3b: 85 c0 test %eax,%eax
> 3d: 74 02 je 0x41
> 3f: 5d pop %rbp
>
> Code starting with the faulting instruction
> ===========================================
> 0: bf 01 00 00 00 mov $0x1,%edi
> 5: e8 99 da f1 fe call 0xfffffffffef1daa3
> a: 65 8b 05 f2 99 d7 7d mov %gs:0x7dd799f2(%rip),%eax # 0x7dd79a03
> 11: 85 c0 test %eax,%eax
> 13: 74 02 je 0x17
> 15: 5d pop %rbp
>
> [531414.066765] RSP: 0018:ffffc9000c9a77d8 EFLAGS: 00000206
> [531414.077943] RIP: 0010:smp_call_function_many_cond (kernel/smp.c:443 kernel/smp.c:988)
> [531416.987351] on_each_cpu_cond_mask (kernel/smp.c:1155)
> [531416.205862] ? text_poke_memset (arch/x86/kernel/alternative.c:1296)
> [531416.681294] ? text_poke_memset (arch/x86/kernel/alternative.c:1296)
> [531417.468443] text_poke_bp_batch (arch/x86/kernel/alternative.c:1553)
> [531418.939923] arch_ftrace_update_trampoline (arch/x86/kernel/ftrace.c:500)
> [531419.882055] ? ftrace_no_pid_write (kernel/trace/ftrace.c:7864)
> [531420.510376] ftrace_update_pid_func (kernel/trace/ftrace.c:374 (discriminator 1))
> [531420.784703] ftrace_pid_open (kernel/trace/ftrace.c:2918 kernel/trace/ftrace.c:2932 kernel/trace/ftrace.c:7725 kernel/trace/ftrace.c:7835 kernel/trace/ftrace.c:7865)
> [531421.851294] vfs_open (fs/open.c:1017)
>
>
>
Do you have an issue with normal function tracing, and not tracing every
function.
I should also add this, because it detects recursion faster than the
atomic_inc_return() does.
-- Steve
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index 9f1bfbe105e8..93ec756dc24b 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -221,12 +221,18 @@ function_stack_trace_call(unsigned long ip, unsigned long parent_ip,
struct trace_array_cpu *data;
unsigned long flags;
long disabled;
+ int bit;
int cpu;
unsigned int trace_ctx;
if (unlikely(!tr->function_enabled))
return;
+ /* Faster than atomic_inc_return() */
+ bit = ftrace_test_recursion_trylock(ip, parent_ip);
+ if (bit < 0)
+ return;
+
/*
* Need to use raw, since this must be called before the
* recursive protection is performed.
@@ -244,6 +250,7 @@ function_stack_trace_call(unsigned long ip, unsigned long parent_ip,
atomic_dec(&data->disabled);
local_irq_restore(flags);
+ ftrace_test_recursion_unlock(bit);
}
static inline bool is_repeat_check(struct trace_array *tr,
Powered by blists - more mailing lists