linux-kernel - Fail to freeze process

Message-ID: <DU0PR04MB9417DDFE48AF703586561C0A88AB9@DU0PR04MB9417.eurprd04.prod.outlook.com>
Date:   Thu, 23 Feb 2023 06:49:31 +0000
From:   Peng Fan <peng.fan@....com>
To:     "mingo@...hat.com" <mingo@...hat.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "juri.lelli@...hat.com" <juri.lelli@...hat.com>,
        "vincent.guittot@...aro.org" <vincent.guittot@...aro.org>,
        "dietmar.eggemann@....com" <dietmar.eggemann@....com>,
        "rostedt@...dmis.org" <rostedt@...dmis.org>,
        "bsegall@...gle.com" <bsegall@...gle.com>,
        "mgorman@...e.de" <mgorman@...e.de>,
        "bristot@...hat.com" <bristot@...hat.com>,
        "vschneid@...hat.com" <vschneid@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "Rafael J. Wysocki" <rafael@...nel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Jan Kiszka <jan.kiszka@...mens.com>
Subject: Fail to freeze process

Hi kernel experts,

I am facing a suspend/resume issue with Linux running on top of the Jailhouse
hypervisor on an ARM64 platform with a 6.1 kernel.
Without the Jailhouse hypervisor enabled, the kernel suspends and resumes fine,
so the hypervisor presumably introduces some interrupt, timer, or other bug that
causes this issue, but I have no idea yet which bug that is. I want to narrow it
down and debug from the Linux side first to see why freezing times out, then
move into the Jailhouse hypervisor to fix it.

I have tried enlarging the freeze timeout to 90s, but a similar issue remains:
the process fails to freeze. It does not happen every time, but it triggers
after a few rounds of suspend/resume. The CPU running the process has a very
large timer expiration value. Even when I use JTAG to trigger the timer
interrupt, the CPU goes back into idle.
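
The message does not say how the timeout was raised. One common way is the
sysfs file /sys/power/pm_freeze_timeout, which holds the freezer timeout in
milliseconds (20000 by default, matching the 20 s in the log below). A minimal
sketch, assuming that knob is used; the helper name set_freeze_timeout is
hypothetical:

```python
from pathlib import Path

# sysfs knob for the task freezer timeout, in milliseconds.
FREEZE_TIMEOUT_PATH = Path("/sys/power/pm_freeze_timeout")


def set_freeze_timeout(seconds, path=FREEZE_TIMEOUT_PATH):
    """Set the freezer timeout (given in seconds); return the previous value in ms."""
    path = Path(path)
    previous_ms = int(path.read_text().strip())
    path.write_text(f"{int(seconds * 1000)}\n")
    return previous_ms
```

Calling set_freeze_timeout(90) as root would raise the timeout to 90 s and
return the old value so it can be restored afterwards.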

I see the process has flags 0xa05, so it has a signal pending, but I am not
sure why it does not freeze.

It seems I have no way to wake the CPU from idle and make it schedule.

I hope you have some ideas.
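
For reference, the flags value can be decoded bit by bit. The sketch below
assumes that the "flags:" field in the task dump is the thread_info flags word
and that the bit positions match arch/arm64/include/asm/thread_info.h in a 6.1
tree; both should be checked against the kernel actually in use. The table and
the name decode_tif_flags are illustrative only:

```python
# Assumed arm64 TIF_* bit positions (v6.1); verify against your kernel tree.
ARM64_TIF_BITS = {
    0: "TIF_SIGPENDING",
    1: "TIF_NEED_RESCHED",
    2: "TIF_NOTIFY_RESUME",
    3: "TIF_FOREIGN_FPSTATE",
    4: "TIF_UPROBE",
    5: "TIF_MTE_ASYNC_FAULT",
    6: "TIF_NOTIFY_SIGNAL",
    8: "TIF_SYSCALL_TRACE",
    9: "TIF_SYSCALL_AUDIT",
    10: "TIF_SYSCALL_TRACEPOINT",
    11: "TIF_SECCOMP",
    12: "TIF_SYSCALL_EMU",
}


def decode_tif_flags(flags):
    """Return the names of all set bits, lowest bit first."""
    names = []
    for bit in range(flags.bit_length()):
        if flags & (1 << bit):
            names.append(ARM64_TIF_BITS.get(bit, f"unknown_bit_{bit}"))
    return names


print(decode_tif_flags(0xA05))
```

Under these assumptions, 0xa05 has bits 0, 2, 9 and 11 set, i.e. signal
pending, notify-resume, syscall audit and seccomp.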

---- Running < /unit_tests/SRTC/rtcwakeup.out > test ----

rtcwakeup.out: wakeup from "mem" using rtc0 at Fri Jan  2 00:20:51 1970
[ 1153.430758] PM: suspend entry (deep)
[ 1153.435689] Filesystems sync: 0.000 seconds
[ 1153.487507] Freezing user space processes ...
[ 1173.495070] Freezing of tasks failed after 20.003 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 1173.504091] task:systemd-userwor state:R stack:0     pid:1563  ppid:588    flags:0x00000a05
[ 1173.512457] Call trace:
[ 1173.514909]  __switch_to+0xf0/0x170
[ 1173.518416]  __schedule+0x28c/0x710
[ 1173.521916]  schedule+0x5c/0xd0
[ 1173.525064]  schedule_timeout+0x8c/0x100
[ 1173.528996]  __skb_wait_for_more_packets+0x128/0x190
[ 1173.533975]  __skb_recv_datagram+0x80/0xe0
[ 1173.538081]  skb_recv_datagram+0x34/0x90
[ 1173.542014]  unix_accept+0xa0/0x1c0
[ 1173.545511]  do_accept+0x114/0x190
[ 1173.548916]  __sys_accept4+0x70/0xe4
[ 1173.552503]  __arm64_sys_accept4+0x20/0x30
[ 1173.556609]  invoke_syscall+0x48/0x114
[ 1173.560368]  el0_svc_common.constprop.0+0xcc/0xec
[ 1173.565085]  do_el0_svc+0x2c/0xd0
[ 1173.568412]  el0_svc+0x2c/0x84
[ 1173.571472]  el0t_64_sync_handler+0xf4/0x120
[ 1173.575752]  el0t_64_sync+0x18c/0x190
[ 1173.579434]
[ 1173.580947] OOM killer enabled.
[ 1173.584095] Restarting tasks ... done.
[ 1173.589831] random: crng reseeded on system resumption
[ 1173.595422] PM: suspend exit
write /sys/power/state: Device or resource busy
===============================
suspend 57 times
===============================
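
Since the failure only shows up after many rounds, it can help to collect the
refusing tasks from each captured log automatically. A minimal sketch, assuming
the "task:" line format shown in the log above; the names TASK_RE and
parse_refusing_tasks are hypothetical:

```python
import re

# Matches the per-task summary line printed when freezing fails.
TASK_RE = re.compile(
    r"task:(?P<comm>.+?)\s+state:(?P<state>\S+)\s+stack:\s*(?P<stack>\d+)\s+"
    r"pid:\s*(?P<pid>\d+)\s+ppid:\s*(?P<ppid>\d+)\s+flags:(?P<flags>0x[0-9a-fA-F]+)"
)


def parse_refusing_tasks(log_text):
    """Return one dict per 'task:' line found in the log text."""
    tasks = []
    for m in TASK_RE.finditer(log_text):
        tasks.append({
            "comm": m["comm"],
            "state": m["state"],
            "pid": int(m["pid"]),
            "ppid": int(m["ppid"]),
            "flags": int(m["flags"], 16),
        })
    return tasks
```

Feeding it the console capture of each round gives the task name, pid and flags
for every refusal, which makes it easy to check whether the same task is always
the one that fails to freeze.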

Thanks,
Peng.