Message-ID: <20250330-heimweg-packen-b73908210f79@brauner>
Date: Sun, 30 Mar 2025 10:33:53 +0200
From: Christian Brauner <brauner@...nel.org>
To: James Bottomley <James.Bottomley@...senpartnership.com>, jack@...e.cz
Cc: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	mcgrof@...nel.org, hch@...radead.org, david@...morbit.com, rafael@...nel.org, 
	djwong@...nel.org, pavel@...nel.org, peterz@...radead.org, mingo@...hat.com, 
	will@...nel.org, boqun.feng@...il.com
Subject: Re: [PATCH v2 0/6] Extend freeze support to suspend and hibernate

On Sat, Mar 29, 2025 at 01:02:32PM -0400, James Bottomley wrote:
> On Sat, 2025-03-29 at 10:04 -0400, James Bottomley wrote:
> > On Sat, 2025-03-29 at 09:42 +0100, Christian Brauner wrote:
> > > Add the necessary infrastructure changes to support freezing for
> > > suspend and hibernate.
> > > 
> > > Just got back from LSFMM. So still jetlagged and likelihood of bugs
> > > increased. This should be all that's needed to wire up power.
> > > 
> > > This will be in vfs-6.16.super shortly.
> > > 
> > > ---
> > > Changes in v2:
> > > - Don't grab a reference in the iterator; make that a requirement for
> > > the callers that need custom behavior.
> > > - Link to v1:
> > > https://lore.kernel.org/r/20250328-work-freeze-v1-0-a2c3a6b0e7a6@kernel.org
> > 
> > Given I've been a bit quiet on this, I thought I'd better explain
> > what's going on: I do have these built, but I made the mistake of
> > doing a dist-upgrade on my testing VM master image and it pulled in a
> > version of systemd (257.4-3) that has a broken hibernate.  Since I
> > upgraded in place I don't have the old image so I'm spending my time
> > currently debugging systemd ... normal service will hopefully resume
> > shortly.
> 
> I found the systemd bug
> 
> https://github.com/systemd/systemd/issues/36888

I don't think that's a systemd bug.

> And hacked around it, so I can confirm a simple hibernate/resume works
> provided the sb_start_write() patches are applied (and the hooks are
> plumbed in to pm).
> 
> There is an oddity: the systemd-journald process that would usually
> hang hibernate in D wait goes into R but seems to be hung and can't be
> killed by the watchdog even with a -9.  Its stack trace says it's
> still stuck in sb_start_write:
> 
> [<0>] percpu_rwsem_wait.constprop.10+0xd1/0x140
> [<0>] ext4_page_mkwrite+0x3c1/0x560 [ext4]
> [<0>] do_page_mkwrite+0x38/0xa0
> [<0>] do_wp_page+0xd5/0xba0
> [<0>] __handle_mm_fault+0xa29/0xca0
> [<0>] handle_mm_fault+0x16a/0x2d0
> [<0>] do_user_addr_fault+0x3ab/0x810
> [<0>] exc_page_fault+0x68/0x150
> [<0>] asm_exc_page_fault+0x22/0x30
> 
> So I think there's something funny going on in thaw.

My uneducated guess is that it's probably an issue with ext4 freezing
and unfreezing. xfs stops its workqueues once all writers and pagefault
writers have stopped. This is done in ->sync_fs() when it's called from
freeze_super(). They are restarted when ->unfreeze_fs() is called.

But for ext4, the rsv_conversion_wq is flushed in ->sync_fs(). I think
that should be safe to do, but I'm not sure whether other work can still
come in on it before the actual freeze call. Jan will be able to
explain this a lot better. I don't have time today to figure out what
this does.
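
For readers following the ordering argument above, here is a minimal
userspace sketch of the sequence described for xfs: ->sync_fs() runs
from the freeze path after write and pagefault writers are blocked, the
filesystem stops its background work there, and ->unfreeze_fs() restarts
it. This is a toy model under stated assumptions, not kernel code: every
toy_* name, the struct layout, and the single bg_workers_running flag
are hypothetical stand-ins, and the real freeze_super() additionally
waits for in-flight writers at each level, which is omitted here.

```c
#include <stdbool.h>

/* Toy freeze levels, loosely mirroring the kernel's SB_FREEZE_* stages. */
enum toy_freeze_level {
	TOY_UNFROZEN,
	TOY_FREEZE_WRITE,	/* ordinary writers are blocked */
	TOY_FREEZE_PAGEFAULT,	/* page-fault writers are blocked too */
	TOY_FREEZE_FS,		/* filesystem-internal writers are blocked */
	TOY_FREEZE_COMPLETE,
};

/* Hypothetical, heavily simplified superblock. */
struct toy_sb {
	enum toy_freeze_level frozen;
	bool bg_workers_running;	/* stands in for background workqueues */
	void (*sync_fs)(struct toy_sb *sb);
	void (*unfreeze_fs)(struct toy_sb *sb);
};

/*
 * xfs-like ->sync_fs(): stop background work only when called from the
 * freeze path, i.e. once page-fault writers have already been blocked.
 * An ordinary sync leaves the workers alone.
 */
static void toy_xfs_sync_fs(struct toy_sb *sb)
{
	if (sb->frozen == TOY_FREEZE_PAGEFAULT)
		sb->bg_workers_running = false;
}

/* xfs-like ->unfreeze_fs(): restart the background work. */
static void toy_xfs_unfreeze_fs(struct toy_sb *sb)
{
	sb->bg_workers_running = true;
}

/* Walk the freeze levels in order; the real code waits at each step. */
static void toy_freeze_super(struct toy_sb *sb)
{
	sb->frozen = TOY_FREEZE_WRITE;
	sb->frozen = TOY_FREEZE_PAGEFAULT;
	sb->sync_fs(sb);		/* called with writers already stopped */
	sb->frozen = TOY_FREEZE_FS;
	sb->frozen = TOY_FREEZE_COMPLETE;
}

/* Thaw: let the filesystem restart its work, then unblock writers. */
static void toy_thaw_super(struct toy_sb *sb)
{
	sb->unfreeze_fs(sb);
	sb->frozen = TOY_UNFROZEN;
}

/*
 * Run one sync + freeze + thaw cycle. Returns 1 if the workers survived
 * a plain sync, were stopped while frozen, and run again after thaw.
 */
int toy_run_cycle(void)
{
	struct toy_sb sb = {
		.frozen = TOY_UNFROZEN,
		.bg_workers_running = true,
		.sync_fs = toy_xfs_sync_fs,
		.unfreeze_fs = toy_xfs_unfreeze_fs,
	};
	bool ok;

	sb.sync_fs(&sb);			/* plain sync, not freezing */
	ok = sb.bg_workers_running;

	toy_freeze_super(&sb);
	ok = ok && sb.frozen == TOY_FREEZE_COMPLETE && !sb.bg_workers_running;

	toy_thaw_super(&sb);
	ok = ok && sb.frozen == TOY_UNFROZEN && sb.bg_workers_running;

	return ok ? 1 : 0;
}
```

The point of the model is only the ordering: background work is stopped
after the writers it could race with are already blocked. A filesystem
that merely flushes a workqueue at that point, as described for ext4
above, gives no such guarantee about work queued afterwards.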
