Message-ID: <87eda0cljg.fsf@kernel.org>
Date: Fri, 17 May 2024 21:58:59 +0300
From: Kalle Valo <kvalo@...nel.org>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Borislav Petkov <bp@...en8.de>,  Pawan Gupta
 <pawan.kumar.gupta@...ux.intel.com>,  Thomas Gleixner
 <tglx@...utronix.de>,  Ingo Molnar <mingo@...hat.com>,  Dave Hansen
 <dave.hansen@...ux.intel.com>,  "Rafael J. Wysocki" <rafael@...nel.org>,
  x86@...nel.org,  linux-pm@...r.kernel.org,  linux-kernel@...r.kernel.org,
  regressions@...ts.linux.dev,  Jeff Johnson <quic_jjohnson@...cinc.com>
Subject: Re: [regression] suspend stress test stalls within 30 minutes

Dave Hansen <dave.hansen@...el.com> writes:

> On 5/17/24 11:37, Kalle Valo wrote:
>> While writing this email I found another way to continue the suspend
>> after a stall: terminate rtcwake with CTRL-C in the ssh session running
>> the for loop. That explains why 'sudo shutdown -h now' makes the suspend
>> go forward: it most likely kills the stalled rtcwake process.
>
> Could we try and figure out what rtcwake is doing during its stall?  A
> couple of ideas:
>
> You could strace it to see if it's hung in the kernel:
>
> 	strace -o strace.log rtcwake ... <args here>
>
> You could look at its stack in /proc, like this:
>
> # cat /proc/`pidof sleep`/stack
> [<0>] hrtimer_nanosleep+0xb5/0x190
> [<0>] common_nsleep+0x44/0x50
> [<0>] __x64_sys_clock_nanosleep+0xcb/0x140
> [<0>] do_syscall_64+0x65/0x140
> [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>
> Or you can use sysrq:
>
> 	echo t > /proc/sysrq-trigger
>
> to get *all* tasks' stacks dumped out to dmesg.
>
> I'd probably do all three in that order.
>
> Getting a function-graph trace of rtcwake during the stall would also be
> nice, but that's a lot of data so let's try the easier things first.

I can do all that, but most probably not this week. Luckily the bug is
quite easy to reproduce: once I even saw it in the first iteration, and
usually it shows up within 15 minutes or so.
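
For reference, the /proc and sysrq checks suggested above could be
collected in one go with something like the sketch below. It has not
been run against this bug. The function name collect_stall_info and the
output file names are made up for illustration, and reading
/proc/PID/stack or writing to /proc/sysrq-trigger is assumed to need
root.

```shell
#!/bin/sh
# Hypothetical helper: snapshot what a (possibly stalled) process is doing.
# Usage: collect_stall_info PID OUTDIR [sysrq]
collect_stall_info() {
    pid=$1
    outdir=$2
    mkdir -p "$outdir" || return 1

    # status: task state (R/S/D ...); wchan: where the task sleeps, if
    # anywhere; stack: kernel stack of the task.
    for f in status wchan stack; do
        # A read can fail (no such PID, not root). Note it and move on so
        # the remaining files are still collected.
        if ! cat "/proc/$pid/$f" > "$outdir/$f.txt" 2>&1; then
            echo "could not read /proc/$pid/$f" >> "$outdir/$f.txt"
        fi
    done

    # Opt-in only: dump all tasks' stacks to the kernel log and save it.
    # Needs root and sysrq enabled on the machine.
    if [ "${3:-}" = "sysrq" ]; then
        echo t > /proc/sysrq-trigger && dmesg > "$outdir/dmesg.txt"
    fi
    return 0
}
```

During a stall this might be called from a second ssh session, as root,
as collect_stall_info "$(pidof rtcwake)" /tmp/rtcwake-stall sysrq. The
directory name is only an example. For an rtcwake that is already
running, strace can be attached with "strace -p PID -o strace.log"
instead of starting rtcwake under it.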

And do let me know if there's anything else I should try.

-- 
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
