lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240205111254.70d5a5c1@xps-13>
Date: Mon, 5 Feb 2024 11:12:54 +0100
From: Miquel Raynal <miquel.raynal@...tlin.com>
To: Théo Lebrun <theo.lebrun@...tlin.com>
Cc: "Mark Brown" <broonie@...nel.org>, "Apurva Nandan" <a-nandan@...com>,
 "Dhruva Gole" <d-gole@...com>, <linux-spi@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>, "Gregory CLEMENT"
 <gregory.clement@...tlin.com>, "Vladimir Kondratiev"
 <vladimir.kondratiev@...ileye.com>, "Thomas Petazzoni"
 <thomas.petazzoni@...tlin.com>, "Tawfik Bayouk"
 <tawfik.bayouk@...ileye.com>
Subject: Re: [PATCH] spi: cadence-qspi: stop calling system-wide PM helpers
 for runtime PM

Hi Théo,

> > > The fatal conclusion of this is a deadlock: we acquire a lock on each
> > > operation but while running the operation, we might want to runtime
> > > resume and acquire the same lock.
> > > 
> > > Anyway, those helpers (spi_controller_{suspend,resume}) are aimed at
> > > system-wide suspend and resume and should NOT be called at runtime
> > > suspend & resume.
> > > 
> > > Side note: the previous implementation had a second issue. It acquired a
> > > pointer to both `struct cqspi_st` and `struct spi_controller` using
> > > dev_get_drvdata(). Neither embed the other. This lead to memory
> > > corruption that was being hidden inside the big cqspi->f_pdata array on
> > > my setup. It was working until I tried changing the array side to its
> > > theorical max of 4, which lead to the discovery of this gnarly bug.
> > > 
> > > Fixes: 0578a6dbfe75 ("spi: spi-cadence-quadspi: add runtime pm support")
> > > Fixes: 2087e85bb66e ("spi: cadence-quadspi: fix suspend-resume implementations")  
> >
> > Your commit log makes total sense but I believe the diff is gonna break
> > again the suspend to RAM operation. This is only my understanding
> > right after quickly going through the whole story, so maybe I'm
> > totally off topic.  
> 
> The current ->runtime_suspend() implementation would indeed (probably)
> work for suspend-to-RAM if it wasn't for the wrong pointers to cqspi
> and spi_controller (see side note from commit message).

Yeah, this probably needs to be fixed aside.

> I've not found a moment where `struct cqspi_st` embed `struct
> spi_controller` at its start, so I do not believe this has ever worked.
> It might be the result of a mistake while porting a patch from a branch
> that included other changes.
> 
> > What happened if I understand the two commits blamed above:
> >
> > - There were PM hooks.
> > - Someone turned them into runtime PM hooks (breaking regular
> >   suspend/resume).
> > - Someone else added the "missing" suspend/resume logic inside the
> >   runtime PM hooks to fix suspend and resume.
> > - You are removing this logic because it leads to deadlocks.
> >
> > There was likely a misconception of what is expected in both cases
> > (quick and small power savings vs. full power cycle/loosing the whole
> > configuration).
> >
> > I would propose instead to create two distinct set of functions:
> > - One for runtime PM
> > - One for suspend/resume
> > This way the runtime PM no longer deadlocks and people using
> > suspend/resume won't get affected? I don't know if your runtime hooks
> > *will* always be called during a suspend/resume. I hope so, which would
> > make the split quite easy and without any code duplication.  
> 
> That does indeed sound like the right approach. Runtime hooks can be
> called from suspend/resume if needs be. Runtime PM then gets disabled
> at the late stage.

Would make sense indeed.

> I do not believe currently system-wide suspend can be working.
> spi_controller_{suspend,resume} are being called with a bogus pointer.
> This makes me ask: should the system-wide suspend/resume part be
> addressed with this patch or a follow-up? It feels like a separate
> concern to me.

Probably two patches, yes.

Thanks,
Miquèl

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ