Message-ID: <bd6e3abddda328de20570e866b54c975127e202c.camel@HansenPartnership.com>
Date: Mon, 31 Mar 2025 10:49:22 -0400
From: James Bottomley <James.Bottomley@...senPartnership.com>
To: Jan Kara <jack@...e.cz>
Cc: Christian Brauner <brauner@...nel.org>, linux-fsdevel@...r.kernel.org, 
 linux-kernel@...r.kernel.org, mcgrof@...nel.org, hch@...radead.org, 
 david@...morbit.com, rafael@...nel.org, djwong@...nel.org,
 pavel@...nel.org,  peterz@...radead.org, mingo@...hat.com, will@...nel.org,
 boqun.feng@...il.com
Subject: Re: [PATCH v2 0/6] Extend freeze support to suspend and hibernate

On Mon, 2025-03-31 at 12:36 +0200, Jan Kara wrote:
> On Sat 29-03-25 13:02:32, James Bottomley wrote:
> > On Sat, 2025-03-29 at 10:04 -0400, James Bottomley wrote:
> > > On Sat, 2025-03-29 at 09:42 +0100, Christian Brauner wrote:
> > > > Add the necessary infrastructure changes to support freezing for
> > > > suspend and hibernate.
> > > > 
> > > > Just got back from LSFMM. So still jetlagged and likelihood of bugs
> > > > increased. This should be all that's needed to wire up power.
> > > > 
> > > > This will be in vfs-6.16.super shortly.
> > > > 
> > > > ---
> > > > Changes in v2:
> > > > - Don't grab reference in the iterator; make that a requirement for
> > > >   the callers that need custom behavior.
> > > > - Link to v1:
> > > > https://lore.kernel.org/r/20250328-work-freeze-v1-0-a2c3a6b0e7a6@kernel.org
> > > 
> > > Given I've been a bit quiet on this, I thought I'd better explain
> > > what's going on: I do have these built, but I made the mistake of
> > > doing a dist-upgrade on my testing VM master image and it pulled in a
> > > version of systemd (257.4-3) that has a broken hibernate.  Since I
> > > upgraded in place I don't have the old image so I'm spending my time
> > > currently debugging systemd ... normal service will hopefully resume
> > > shortly.
> > 
> > I found the systemd bug
> > 
> > https://github.com/systemd/systemd/issues/36888
> > 
> > And hacked around it, so I can confirm a simple hibernate/resume works
> > provided the sb_start_write() patches are applied (and the hooks are
> > plumbed in to pm).
> > 
> > There is an oddity: the systemd-journald process that would usually
> > hang hibernate in D wait goes into R but seems to be hung and can't
> > be killed by the watchdog even with a -9.  Its stack trace says
> > it's still stuck in sb_start_write:
> > 
> > [<0>] percpu_rwsem_wait.constprop.10+0xd1/0x140
> > [<0>] ext4_page_mkwrite+0x3c1/0x560 [ext4]
> > [<0>] do_page_mkwrite+0x38/0xa0
> > [<0>] do_wp_page+0xd5/0xba0
> > [<0>] __handle_mm_fault+0xa29/0xca0
> > [<0>] handle_mm_fault+0x16a/0x2d0
> > [<0>] do_user_addr_fault+0x3ab/0x810
> > [<0>] exc_page_fault+0x68/0x150
> > [<0>] asm_exc_page_fault+0x22/0x30
> > 
> > So I think there's something funny going on in thaw.
> 
> As Christian wrote, it seems systemd-journald does a memory store to
> mmapped file and gets blocked on sb_start_write() while doing the
> page fault. What's strange is that R state. Is the task really
> executing on some CPU or it only has 'R' state (i.e., got woken but
> never scheduled)?

Yes, ps shows it is definitely stuck in R state.  The trace above
places the rwsem wait at set_current_state(), which seems to imply
it never returns from schedule() even though it's in state R.
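
One way to tell "running on a CPU" apart from "marked R but never
scheduled" from userspace is to sample /proc/<pid>/stat twice and see
whether the task's CPU time advances. The following is a minimal sketch
of that check, not something used in this thread; the helper names
(parse_stat, read_stat, cpu_ticks_consumed) are made up for
illustration, and the field positions follow proc(5). For a
multi-threaded process, the same parsing would need to be applied to
/proc/<pid>/task/<tid>/stat to look at one thread.

```python
import time


def parse_stat(line):
    """Parse one /proc/<pid>/stat line into the fields of interest.

    comm (field 2) may itself contain spaces or parentheses, so split
    at the LAST ')' rather than on whitespace.
    """
    start = line.index("(")
    end = line.rfind(")")
    rest = line[end + 1:].split()  # rest[0] is field 3 (state)
    return {
        "pid": int(line[:start]),
        "comm": line[start + 1:end],
        "state": rest[0],            # field 3: R, S, D, ...
        "utime": int(rest[11]),      # field 14: user-mode clock ticks
        "stime": int(rest[12]),      # field 15: kernel-mode clock ticks
        "processor": int(rest[36]),  # field 39: CPU last executed on
    }


def read_stat(pid):
    """Read and parse /proc/<pid>/stat for a live process."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_stat(f.read())


def cpu_ticks_consumed(pid, interval=1.0):
    """Return (ticks consumed during interval, state at the end).

    A task that stays in state R while consuming zero ticks is
    runnable but not making progress on any CPU.
    """
    before = read_stat(pid)
    time.sleep(interval)
    after = read_stat(pid)
    delta = (after["utime"] + after["stime"]) - (
        before["utime"] + before["stime"]
    )
    return delta, after["state"]
```

Calling cpu_ticks_consumed(pid) on the stuck task and getting a zero
delta with state R would point at "woken but never scheduled"; a delta
that keeps growing would point at the task spinning in the kernel.
Tick accounting is coarse, so a longer interval gives a clearer answer.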

I've actually managed to reproduce this now just by doing a filesystem
freeze and thaw, without using the freezer, so I'll continue
investigating.

Regards,

James

