Message-ID: <51dfc4a0-f6cf-092f-109f-a04eeb240655@samsung.com>
Date:   Fri, 29 Apr 2022 15:53:33 +0200
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     John Ogness <john.ogness@...utronix.de>,
        Petr Mladek <pmladek@...e.com>
Cc:     Sergey Senozhatsky <senozhatsky@...omium.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-amlogic@...ts.infradead.org
Subject: Re: [PATCH printk v5 1/1] printk: extend console_lock for
 per-console locking

Hi John,

On 27.04.2022 18:15, John Ogness wrote:
> On 2022-04-27, Marek Szyprowski <m.szyprowski@...sung.com> wrote:
>> Here is the full serial console log:
>>
>> https://pastebin.com/E5CDH88L
> Here are a few ideas from me:
>
> 1. For next-20220427 the printk-threaded series was slightly changed. I
> do not expect it to work any different, but I would prefer we are
> debugging the current version. If possible, could you move to
> next-20220427?

I've moved to next-20220429. Nothing changed compared to next-20220427.


> 2. I noticed you boot with the kernel boot arguments "earlycon" and
> "no_console_suspend". Could you try booting without this? I expect this
> will make no difference.

Well, nothing changed.
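
For reference, a minimal sketch of how one could confirm that the two
arguments are really absent from the running kernel's command line. The
helper name has_boot_arg and its optional file parameter are hypothetical
and exist only for illustration; on a real system it reads /proc/cmdline.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a boot argument (bare or "arg=value")
# is present in the kernel command line. The second parameter is only
# there so the function can be tried against a scratch file.
has_boot_arg() {
    arg=$1
    file=${2:-/proc/cmdline}
    # Split the command line on whitespace and compare word by word.
    for word in $(cat "$file"); do
        case $word in
            "$arg"|"$arg"=*) return 0 ;;
        esac
    done
    return 1
}
```

Usage: "has_boot_arg earlycon || echo 'earlycon not set'".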


> 3. It looks like the problem happens quite late in the boot process. I
> expect it is due to some userspace process that is running that is
> interacting with printk (either /dev/kmsg or /proc/kmsg) and is causing
> problems. If you boot with init=/bin/sh then I expect the system is
> running fine. (You don't have much of a system running, but it should
> not hang.) We need to isolate which userspace process is triggering the
> issue.

The same issue happens if I boot with init=/bin/bash


> 4. Have you tried issuing magic sysrq commands on the serial line? (For
> example, sending a break signal and then the letter 't' or sending a
> break signal and then the letter 'c'?) That might trigger various dumps
> so that we can see the system state.
>
> 5. You are not running a VT console, so the graphics driver should not
> be affecting the printk subsystem at all. I expect your autologin is
> also starting various services and programs. If you disable the
> automatic login and instead manually login (perhaps as another user) can
> you manually start those services one at a time to see at what point the
> system hangs?
>
> Thanks for your help with this!

I found something really interesting. When the lockup happens, I'm still
able to log in via ssh and trigger any magic sysrq action via
/proc/sysrq-trigger (triggering it from the UART console via break doesn't
work).

It turned out that the UART console is somehow blocked, but it still
receives and buffers all the input. For example, after issuing
"echo >/proc/sysrq-trigger" from the ssh console, the UART console was
updated and I saw the magic sysrq banner followed by all the commands I
had blindly typed into the UART console! However, this doesn't unblock
the console.
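
The procedure of triggering a sysrq action from the ssh session can be
sketched roughly as below. This is only an illustration: the helper name
sysrq_trigger and its optional file parameter are hypothetical, and
writing to the real /proc/sysrq-trigger needs root.

```shell
#!/bin/sh
# Hypothetical helper: write a single sysrq command key to the trigger
# file. The second parameter exists only so the function can be tried
# against a scratch file; by default it targets /proc/sysrq-trigger.
sysrq_trigger() {
    key=$1
    target=${2:-/proc/sysrq-trigger}
    # Accept exactly one character as the command key.
    case $key in
        ?) ;;
        *) echo "usage: sysrq_trigger <key> [file]" >&2; return 1 ;;
    esac
    printf '%s' "$key" > "$target"
}
```

From the ssh session, "sysrq_trigger t && dmesg | tail -n 50" would then
request the task dump and show the end of the kernel log without relying
on the blocked UART console.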

Here is the output of the 't' magic sysrq request:

https://pastebin.com/fjbRuy4f

If you have any more suggestions on what to check, let me know.

This issue must be somehow related to the way the UART driver works on 
the Amlogic Meson boards. The other boards based on different SoCs 
(Exynos, QCOM, BCM) I have in my test farm (with the same userspace and 
configuration) work fine with those patches.

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland
