lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0g6+CR80=zQi=aTbJDc3FAMRAFUE=e-QURcwi_25zgAfw@mail.gmail.com>
Date:   Mon, 11 Jul 2022 20:13:18 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Oliver Neukum <oneukum@...e.com>,
        Wedson Almeida Filho <wedsonaf@...gle.com>,
        "Rafael J. Wysocki" <rjw@...k.pl>,
        Arjan van de Ven <arjan@...ux.intel.com>,
        Len Brown <len.brown@...el.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Linux PM <linux-pm@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 3/4] PM: hibernate: allow wait_for_device_probe() to
 timeout when resuming from hibernation

On Sun, Jul 10, 2022 at 4:25 AM Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
>
> syzbot is reporting hung task at misc_open() [1], for there is a race
> window of AB-BA deadlock which involves probe_count variable.
>
> Even with "char: misc: allow calling open() callback without misc_mtx
> held" and "PM: hibernate: call wait_for_device_probe() without
> system_transition_mutex held", wait_for_device_probe() from snapshot_open()
> can sleep forever if probe_count cannot become 0.
>
> Since snapshot_open() is a userland-driven hibernation/resume request,
> it should be acceptable to fail if something is wrong.

Not really.

If you are resuming from hibernation and the image cannot be reached
(which is the situation described above), failing and continuing to
boot means discarding the image and possible user data loss.

There is no "graceful failure" in this case.

> Users would not want to wait for hours if device stopped responding.

If the device holding the image is not responding, we should better
wait for it or panic().  Or let the user make the system reboot.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ