lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0iUOv0MYOHP7938V=6qsURBYO6B4LHqfdm_DCHH8GvfLA@mail.gmail.com>
Date:   Sun, 30 Jun 2019 00:02:54 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Hans de Goede <hdegoede@...hat.com>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Linux PM <linux-pm@...r.kernel.org>,
        Linux PCI <linux-pci@...r.kernel.org>,
        Linux ACPI <linux-acpi@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Bjorn Helgaas <helgaas@...nel.org>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        "Robert R. Howell" <RHowell@...o.edu>
Subject: Re: [PATCH 4/6] ACPI: LPSS: Fix ->suspend_late callbacks handling

On Sat, Jun 29, 2019 at 1:34 PM Hans de Goede <hdegoede@...hat.com> wrote:
>
> Hi Rafael,
>
> On 29-06-19 11:50, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> >
> > If the resume_from_noirq flag is set in dev_desc, the ->suspend_late
> > callback provided by the device driver will be invoked at the "noirq"
> > stage of system suspend, via acpi_lpss_do_suspend_late(), which is
> > incorrect.
> >
> > To fix that, drop acpi_lpss_do_suspend_late() and rearrange
> > acpi_lpss_suspend_late() to call pm_generic_suspend_late()
> > directly, before calling acpi_lpss_suspend(), in analogy with
> > acpi_subsys_suspend_late().
>
> Ah now I see the logic in your previous test-patch.
>
> I'm afraid that this is going to break things though, the calling
> of the device-driver's suspend-late method at noirq time is
> *intentional* !

But it is a bug too.

> The resume_from_noirq flag is only set for i2c controllers which
> use: drivers/i2c/busses/i2c-designware-platdrv.c as driver.
>
> This driver's suspend late method looks like this:
>
> static int dw_i2c_plat_suspend(struct device *dev)
> {
>          struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
>
>          i_dev->suspended = true;
>
>          if (i_dev->shared_with_punit)
>                  return 0;
>
>          i_dev->disable(i_dev);
>          i2c_dw_prepare_clk(i_dev, false);
>
>          return 0;
> }
>
> The i_dev->disable(i_dev) and i2c_dw_prepare_clk(i_dev, false) calls here
> will make the i2c controller non functional. But (some of) these i2c
> controllers are used by code in the _PS0  / _PS3 methods of some PCI
> devices and the PCI core calls _PS0 / _PS3 at *noirq* time, so as explained
> in the commit message which introduced acpi_lpss_do_suspend_late():
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=48402cee6889fb3fce58e95fea1471626286dc63
>
> We must not only make sure that the suspending of the i2c controller is
> ordered so that it happens after these PCI devices are suspended, we must
> also make sure that the i2c controller stays functional until the
> i2c-controller is put in the suspend-noirq state.
>
> If you really want to go this route, we must duplicate the resume_from_noirq
> flag inside drivers/i2c/busses/i2c-designware-platdrv.c, setting it
> only for acpi_lpss enumerated devices (the driver handles a whole lot more
> devices) ans then make the driver's suspend_late method a no-op and instead
> to the suspend from its suspend_noirq callback.
>
> Since pm_generic_suspend_late() is just a wrapper to call dev->driver->pm->suspend_late
> duplicating the resume_from_noirq flag inside i2c-designware-platdrv.c seems
> unproductive.
>
> Note that we have the same thing going on in acpi_lpss.c with resume_early vs
> resume_noirq, we call the resume_early callback from acpi_lpss_resume_noirq
> if the resume_from_noirq flag is set.
>
> TL;DR: the behavior you are trying to fix here is intentional and
> IMHO this patch should be dropped.

I can drop the patch, but the current code is simply incorrect.

If the driver provided a ->suspend_late callback, it wanted that
callback to be invoked during the "late" stage of suspend.  Calling it
later simply papers over a driver bug.  If invoking that callback
during the "late" stage doesn't work, the driver should have provided
a "noirq" callback instead.

> I guess we could / should do a patch adding a comment that the calling
> the drivers' suspend_late / resume_early callback at noirq time is intentional
> to avoid this confusing people in the future.

No.  We need to fix drivers doing wrong things.

Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ