Date: Tue, 25 Jun 2024 08:20:15 -0500
From: Andrew Halaney <ahalaney@...hat.com>
To: Song Chen <chensong_2000@....cn>
Cc: Derek Barbosa <debarbos@...hat.com>, pmladek@...e.com, 
	rostedt@...dmis.org, john.ogness@...utronix.de, senozhatsky@...omium.org, 
	linux-rt-users@...r.kernel.org, linux-kernel@...r.kernel.org, williams@...hat.com, 
	jlelli@...hat.com
Subject: Re: a question about how to debug this case in ftrace

Hey Song,

I'll answer that bit as those were my hacks :P

I _thought_ that doing a ftrace_dump_on_oops + softlockup_panic on the
command line (or similar at runtime) would dump the logs, but that
wasn't working for me and I was in a bit of a rush (and already hacking
the kernel up anyways).
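
For the record, the knobs I mean are booting with something like:

    ftrace_dump_on_oops softlockup_panic=1

or the runtime equivalent:

    sysctl -w kernel.ftrace_dump_on_oops=1
    sysctl -w kernel.softlockup_panic=1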

So I ended up doing a sequence like:

    tracing_off();
    ftrace_dump(DUMP_ALL);

in the softlockup code when it was detected. Ideally I wanted to look at
the vmcore and look at the ftrace data in there (since debugging printk
by using printk is a little confusing), but there was a makedumpfile bug
I hit... so I went with the hacky route to prove to myself what was
going on. I think since then that's been resolved. Hope that helps!
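
In case it's useful, the hack looked roughly like this. This is an
illustrative hunk reconstructed from memory (not the exact patch), placed
in the soft lockup detection path, i.e. watchdog_timer_fn() in
kernel/watchdog.c:

    --- a/kernel/watchdog.c
    +++ b/kernel/watchdog.c
    @@ watchdog_timer_fn(): where the soft lockup is flagged @@
    +		/* freeze the ring buffer so the trace isn't clobbered */
    +		tracing_off();
    +		/* dump every CPU's trace buffer to the console */
    +		ftrace_dump(DUMP_ALL);
     		pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",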

Thanks,
Andrew

On Tue, Jun 25, 2024 at 09:36:54AM GMT, Song Chen wrote:
> Hi Derek,
> 
> I am working on a hung-task case, and I saw your trace_printk logs go to
> the same output as the softlockup warning. That's a smart way to easily
> build a timeline for analysis.
> 
> As far as I know, trace_printk goes to the ftrace ring buffer while
> softlockup info goes to dmesg. Could you please let me know how you did
> that? I marked the part I'm interested in below. Thanks a lot.
> 
> BR
> 
> Song
> 
> 
> On 2024/6/18 23:20, Derek Barbosa wrote:
> > lgoncalv@...hat.com, jwyatt@...hat.com, aubaker@...hat.com
> > Bcc:
> > Subject: watchdog BUG: soft lockup - CPU#x stuck for 78s
> > Reply-To:
> > 
> > Hi,
> > 
> > The realtime team at Red Hat has recently backported the latest printk changes
> > present in 6.6-rt stable (HEAD at 20fd4439f644 printk: nbcon: move locked_port flag to
> > struct uart_port) to CentOS Stream 9 for performance improvements and
> > printk-related bugfixes.
> > 
> > Since merging said code, we've hit an interesting bug during testing:
> > specifically, on larger systems, a softlockup may be reported by the
> > watchdog when there is a heavy amount of printing to tty devices (whether
> > through printk, /dev/kmsg, etc.).
> > 
> > We have reasonable suspicion that nbcon_reacquire(), or some other nbcon
> > mechanism, *may* be causing this behavior.
> > 
> > Since we've successfully reproduced this in the Kernel-ARK/Fedora-ELN
> > (osbuild-rt) and in linux-rt-devel 6.10-rc4-rt6, we are reporting this bug
> > upstream.
> > 
> > Anyway, here is a more in-depth description, along with some call traces.
> > 
> > Description:
> > 
> > On x86 systems with a large number of logical cores (nproc ~> 60), a
> > softlockup with an accompanying call trace can be observed when a large
> > amount of "printing" activity is taking place.
> > 
> > As shown in the call traces appended below, during some kind of NUMA
> > balancing/numa_migration after a task_numa_fault (where a set of processes
> > is being migrated/swapped between two CPUs), there is a busy thread being
> > waited on (on the order of seconds), causing a livelock. Additional
> > investigation of vmcores collected by toggling panic-on-softlockup shows
> > that the waiting thread may be waiting for a thread looping in
> > nbcon_reacquire().
> > 
> > I suspect that the logic within nbcon_context_try_acquire() may be a good
> > place to start. My understanding of the code becomes a bit fuzzy here, so
> > apologies in advance for any erroneous statements. As I see it, something
> > may be happening during migration (or under heavy load) in which
> > nbcon_reacquire() is in a non-migratable or non-preemptible state as it
> > attempts to regain access to a lost/taken console. It could very well be a
> > situation in which the context was forcefully taken from the printing
> > thread.
> > 
> > Alternatively, Andrew Halaney <ahalaney@...hat.com> suspects that it is the
> > loop within nbcon_kthread_func(): since there is nothing that would yield
> > the task in said loop (cant_migrate()), the migrate code would essentially
> > be waiting forever for the aforementioned loop to "finish". I believe that
> > under PREEMPT_RT there would be a preemption point here. Furthermore, in
> > his investigation there were signs that the loop was iterating right up
> > until the crash, giving reason to believe that task is the culprit.
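> > 
> > To illustrate (heavily paraphrased, not the actual code), the inner flush
> > loop in question is shaped something like:
> > 
> > ```
> > 	do {
> > 		backlog = false;
> > 		if (nbcon_context_try_acquire(ctxt)) {
> > 			/* one record is emitted here (elided) */
> > 			backlog = ctxt->backlog;
> > 			nbcon_context_release(ctxt);
> > 		}
> > 	} while (backlog);	/* nothing in here yields the task */
> > ```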
> > 
> > In fact, with the following diff, we noticed this output:
> > 
> > ```
> > ahalaney@...en2nano ~/git/linux-rt-devel (git)-[tags/v6.10-rc4-rt6-rebase] % git diff | cat
> > diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
> > index bb9689f94d30..d716b72bf2f8 100644
> > --- a/kernel/printk/nbcon.c
> > +++ b/kernel/printk/nbcon.c
> > @@ -1075,6 +1075,7 @@ static int nbcon_kthread_func(void *__console)
> >        *
> >        * This pairs with rcuwait_has_sleeper:A and nbcon_kthread_wake:A
> 

