lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220929185305.GA1912842@bhelgaas>
Date:   Thu, 29 Sep 2022 13:53:05 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Krishna Chaitanya Chundru <quic_krichai@...cinc.com>
Cc:     linux-pci@...r.kernel.org, linux-arm-msm@...r.kernel.org,
        linux-kernel@...r.kernel.org, mka@...omium.org,
        quic_vbadigan@...cinc.com, quic_hemantk@...cinc.com,
        quic_nitegupt@...cinc.com, quic_skananth@...cinc.com,
        quic_ramkri@...cinc.com, manivannan.sadhasivam@...aro.org,
        swboyd@...omium.org, dmitry.baryshkov@...aro.org,
        svarbanov@...sol.com, agross@...nel.org, andersson@...nel.org,
        konrad.dybcio@...ainline.org, lpieralisi@...nel.org,
        robh@...nel.org, kw@...ux.com, bhelgaas@...gle.com,
        linux-phy@...ts.infradead.org, vkoul@...nel.org, kishon@...com,
        mturquette@...libre.com, linux-clk@...r.kernel.org,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org
Subject: Re: [PATCH v7 1/5] PCI: qcom: Add system suspend and resume support

On Mon, Sep 26, 2022 at 09:00:11PM +0530, Krishna Chaitanya Chundru wrote:
> On 9/23/2022 7:56 PM, Bjorn Helgaas wrote:
> > On Fri, Sep 23, 2022 at 07:29:31AM +0530, Krishna Chaitanya Chundru wrote:
> > > On 9/23/2022 12:12 AM, Bjorn Helgaas wrote:
> > > > On Thu, Sep 22, 2022 at 09:09:28PM +0530, Krishna Chaitanya Chundru wrote:
> > > > > On 9/21/2022 10:26 PM, Bjorn Helgaas wrote:
> > > > > > On Wed, Sep 21, 2022 at 03:23:35PM +0530, Krishna Chaitanya Chundru wrote:
> > > > > > > On 9/20/2022 11:46 PM, Bjorn Helgaas wrote:
> > > > > > > > On Tue, Sep 20, 2022 at 03:52:23PM +0530, Krishna chaitanya chundru wrote:
> > > > > > > > > In qcom platform PCIe resources( clocks, phy etc..) can
> > > > > > > > > released when the link is in L1ss to reduce the power
> > > > > > > > > consumption. So if the link is in L1ss, release the PCIe
> > > > > > > > > resources. And when the system resumes, enable the PCIe
> > > > > > > > > resources if they released in the suspend path.
> > > > > > > > What's the connection with L1.x?  Links enter L1.x based on
> > > > > > > > activity and timing.  That doesn't seem like a reliable
> > > > > > > > indicator to turn PHYs off and disable clocks.
> > > > > > > This is a Qcom PHY-specific feature (retaining the link state in
> > > > > > > L1.x with clocks turned off).  It is possible only with the link
> > > > > > > being in l1.x. PHY can't retain the link state in L0 with the
> > > > > > > clocks turned off and we need to re-train the link if it's in L2
> > > > > > > or L3. So we can support this feature only with L1.x.  That is
> > > > > > > the reason we are taking l1.x as the trigger to turn off clocks
> > > > > > > (in only suspend path).
> > > > > > This doesn't address my question.  L1.x is an ASPM feature, which
> > > > > > means hardware may enter or leave L1.x autonomously at any time
> > > > > > without software intervention.  Therefore, I don't think reading the
> > > > > > current state is a reliable way to decide anything.
> > > > > After the link enters the L1.x it will come out only if there is
> > > > > some activity on the link.  AS system is suspended and NVMe driver
> > > > > is also suspended( queues will  freeze in suspend) who else can
> > > > > initiate any data.
> > > > I don't think we can assume that nothing will happen to cause exit
> > > > from L1.x.  For instance, PCIe Messages for INTx signaling, LTR, OBFF,
> > > > PTM, etc., may be sent even though we think the device is idle and
> > > > there should be no link activity.
> > > I don't think after the link enters into L1.x there will some
> > > activity on the link as you mentioned, except for PCIe messages like
> > > INTx/MSI/MSIX. These messages also will not come because the client
> > > drivers like NVMe will keep their device in the lowest power mode.
> > > 
> > > The link will come out of L1.x only when there is config or memory
> > > access or some messages to trigger the interrupts from the devices.
> > > We are already making sure this access will not be there in S3.  If
> > > the link is in L0 or L0s what you said is expected but not in L1.x
> > Forgive me for being skeptical, but we just spent a few months
> > untangling the fact that some switches send PTM request messages even
> > when they're in a non-D0 state.  We expected that devices in D3hot
> > would not send such messages because "why would they?"  But it turns
> > out the spec allows that, and they actually *do*.
> > 
> > I don't think it's robust interoperable design for a PCI controller
> > driver like qcom to assume anything about PCI devices unless it's
> > required by the spec.
>
> From pci spec 4, in sec 5.5
> "Ports that support L1 PM Substates must not require a reference clock while
> in L1 PM Substates
> other than L1.0".
> If there is no reference clk we can say there is no activity on the link.
> If anything needs to be sent (such as LTR, or some messages ), the link
> needs to be back in L0 before it
> sends the packet to the link partner.
> 
> To exit from L1.x clkreq pin should be asserted.
> 
> In suspend after turning off clocks and phy we can enable to trigger an
> interrupt whenever the clk req pin asserts.
> In that interrupt handler, we can enable the pcie resources back.

>From the point of view of the endpoint driver, ASPM should be
invisible -- no software intervention required.  I think you're
suggesting that the PCIe controller driver could help exit L1.x by
handling a clk req interrupt and enabling clock and PHY then.

But doesn't L1.x exit also have to happen within the time the endpoint
can tolerate?  E.g., I think L1.2 exit has to happen within the LTR
time advertised by the endpoint (PCIe r6.0, sec 5.5.5).  How can we
guarantee that if software is involved?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ