Message-ID: <CAK00qKBiGd0vt5VHEBq25fp_r0OZa7qiWGWdYM0-fiYBZmCDgw@mail.gmail.com>
Date:   Wed, 18 Oct 2023 18:05:40 +0800
From:   Victor Shih <victorshihgli@...il.com>
To:     Kai-Heng Feng <kai.heng.feng@...onical.com>
Cc:     ulf.hansson@...aro.org, adrian.hunter@...el.com,
        linux-mmc@...r.kernel.org, linux-kernel@...r.kernel.org,
        benchuanggli@...il.com, HL.Liu@...esyslogic.com.tw,
        Greg.tu@...esyslogic.com.tw, kangzhen.lou@...l.com,
        Victor Shih <victor.shih@...esyslogic.com.tw>
Subject: Re: [PATCH V1] mmc: sdhci-pci-gli: GL975[05]: Mask the replay timer
 timeout of AER

On Wed, Oct 11, 2023 at 2:35 PM Kai-Heng Feng
<kai.heng.feng@...onical.com> wrote:
>
> On Fri, Oct 6, 2023 at 6:30 PM Victor Shih <victorshihgli@...il.com> wrote:
> >
> > On Mon, Oct 2, 2023 at 10:18 AM Kai-Heng Feng
> > <kai.heng.feng@...onical.com> wrote:
> > >
> > > Hi Victor,
> > >
> > > On Tue, Sep 26, 2023 at 4:21 PM Victor Shih <victorshihgli@...il.com> wrote:
> > > >
> > > > On Fri, Sep 22, 2023 at 3:11 PM Kai-Heng Feng
> > > > <kai.heng.feng@...onical.com> wrote:
> > > > >
> > > > > Hi Victor,
> > > > >
> > > > > On Wed, Sep 20, 2023 at 4:54 PM Victor Shih <victorshihgli@...il.com> wrote:
> > > > > >
> > > > > > On Tue, Sep 19, 2023 at 3:31 PM Kai-Heng Feng
> > > > > > <kai.heng.feng@...onical.com> wrote:
> > > > > > >
> > > > > > > Hi Victor,
> > > > > > >
> > > > > > > On Tue, Sep 19, 2023 at 3:10 PM Victor Shih <victorshihgli@...il.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Sep 19, 2023 at 12:24 PM Kai-Heng Feng
> > > > > > > > <kai.heng.feng@...onical.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi Victor,
> > > > > > > > >
> > > > > > > > > On Mon, Sep 18, 2023 at 6:31 PM Victor Shih <victorshihgli@...il.com> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Victor Shih <victor.shih@...esyslogic.com.tw>
> > > > > > > > > >
> > > > > > > > > > Due to a flaw in the hardware design, the GL975x replay timer frequently
> > > > > > > > > > times out when ASPM is enabled. As a result, the system will resume
> > > > > > > > > > immediately when it enters suspend. Therefore, the replay timer
> > > > > > > > > > timeout must be masked.
> > > > > > > > >
> > > > > > > > > This patch solves AER error when its PCI config gets accessed, but the
> > > > > > > > > AER still happens at system suspend:
> > > > > > > > >
> > > > > > > > > [ 1100.103603] ACPI: EC: interrupt blocked
> > > > > > > > > [ 1100.268244] ACPI: EC: interrupt unblocked
> > > > > > > > > [ 1100.326960] pcieport 0000:00:1c.0: AER: Corrected error received:
> > > > > > > > > 0000:00:1c.0
> > > > > > > > > [ 1100.326991] pcieport 0000:00:1c.0: PCIe Bus Error:
> > > > > > > > > severity=Corrected, type=Data Link Layer, (Transmitter ID)
> > > > > > > > > [ 1100.326993] pcieport 0000:00:1c.0:   device [8086:7ab9] error
> > > > > > > > > status/mask=00001000/00002000
> > > > > > > > > [ 1100.326996] pcieport 0000:00:1c.0:    [12] Timeout
> > > > > > > > >
> > > > > > > > > Kai-Heng
> > > > > > > > >
> > > > > > > >
> > > > > > > > Hi, Kai-Heng
> > > > > > > >
> > > > > > > > Could you try applying the patch and re-testing again after restarting
> > > > > > > > the system?
> > > > > > >
> > > > > > > Same issue happens after coldboot.
> > > > > > >
> > > > > > > > Because I applied the patch and restarted the system and it didn't happen.
> > > > > > > > The system can enter suspend normally.
> > > > > > > >
> > > > > > > > If you still have the issue after following the above instructions,
> > > > > > > > please provide me with your environment and I will verify it again.
> > > > > > >
> > > > > > > The patch gets applied on top of next-20230918. Please let me know
> > > > > > > what else you want to know.
> > > > > > >
> > > > > > > Kai-Heng
> > > > > > >
> > > > > >
> > > > > > Hi, Kai-Heng
> > > > > >
> > > > > > If I want to mask the replay timer timeout AER of the upper layer root port,
> > > > > > could you give me some suggestions?
> > > > > > Or could you provide sample code for my reference?
> > > > >
> > > > > I am not aware of any way to mask "replay timer timeout" from the root port.
> > > > > Does the device support D3hot? Or should it stay in D0 when
> > > > > ASPM L1.2 is enabled?
> > > > >
> > > > > Kai-Heng
> > > > >
> > > >
> > > > Hi, Kai-Heng
> > > >
> > > > Do you know any way to mask the replay timer timeout AER of the
> > > > upstream port from the device?
> > >
> > > Per PCIe Spec, I don't think it's possible to only mask 'replay timer timeout'.
> > >
> > > > The device supports D3hot.
> > >
> > > Do you think this error plays any crucial role? If not, disabling
> > > 'correctable' errors may be plausible.
> > >
> > > Kai-Heng
> > >
> >
> > Hi, Kai-Heng
> >
> > Due to a flaw in the hardware design, the GL975x replay timer frequently
> > times out when ASPM is enabled.
> > This patch solves the AER error of the replay timer timeout for GL975x.
> > We have not encountered any other errors so far.
>
> On the system I tested, this patch reduces the occurrence of the
> error, but does not completely eliminate it.
>
> > By 'correctable' errors, do you mean the AER error of the replay timer timeout?
> > May I ask if you have any other comments on this patch?
>
> Repeatedly running `lspci -vv -s` on the device still triggers the AER error.
>
> I think the "correctable" mask should be optional, let me send a patch
> to PCI for comment.
>
> Kai-Heng
>

Hi, Kai-Heng

As we discussed in another email, to solve the suspend issue you only need
to mask the replay timer timeout on the device's root port.
I haven't seen the PCI patch you submitted yet.
If you send me the PCI patch, I can help you test it.

This patch only addresses the warning messages that often appear in the
system log when the system accesses the GL975x PCI config space.
Therefore, I will revise the commit message and submit a V2.

Thanks, Victor Shih
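
For reference, the root-port log quoted earlier in this thread reports
"error status/mask=00001000/00002000". The sketch below decodes such a pair
using the bit positions of the AER Correctable Error Status/Mask registers.
It is only an illustration: the function and table names are hypothetical
and are not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Bit positions of the AER Correctable Error Status/Mask registers. */
struct cor_bit {
    unsigned bit;
    const char *name;
};

static const struct cor_bit cor_bits[] = {
    {  0, "Receiver Error" },
    {  6, "Bad TLP" },
    {  7, "Bad DLLP" },
    {  8, "REPLAY_NUM Rollover" },
    { 12, "Replay Timer Timeout" },
    { 13, "Advisory Non-Fatal Error" },
    { 14, "Corrected Internal Error" },
    { 15, "Header Log Overflow" },
};

/* Hypothetical helper: 1 if the bit is set in status and not masked. */
static int cor_is_reported(uint32_t status, uint32_t mask, unsigned bit)
{
    assert(bit < 32);
    return ((status & ~mask) >> bit) & 1u;
}

/*
 * Hypothetical helper: print every error bit set in status and say whether
 * the mask register suppresses it. Returns the number of unmasked errors.
 */
int cor_decode(uint32_t status, uint32_t mask, FILE *out)
{
    int reported = 0;

    for (size_t i = 0; i < sizeof(cor_bits) / sizeof(cor_bits[0]); i++) {
        unsigned bit = cor_bits[i].bit;
        int rep;

        if (!(status & (1u << bit)))
            continue;
        rep = cor_is_reported(status, mask, bit);
        fprintf(out, "[%2u] %s (%s)\n", bit, cor_bits[i].name,
                rep ? "unmasked" : "masked");
        reported += rep;
    }
    return reported;
}
```

Calling cor_decode(0x00001000, 0x00002000, stdout) shows bit 12 (Replay Timer
Timeout) set while only bit 13 is masked, so the root port still reports the
error. With bit 12 also set in the mask (0x00003000), nothing is reported.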

> >
> > Thanks, Victor Shih
> >
> > > >
> > > > Thanks, Victor Shih
> > > >
> > > > > >
> > > > > > Thanks, Victor Shih
> > > > > >
> > > > > > > >
> > > > > > > > Thanks, Victor Shih
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Victor Shih <victor.shih@...esyslogic.com.tw>
> > > > > > > > > > ---
> > > > > > > > > >  drivers/mmc/host/sdhci-pci-gli.c | 16 ++++++++++++++++
> > > > > > > > > >  1 file changed, 16 insertions(+)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/mmc/host/sdhci-pci-gli.c b/drivers/mmc/host/sdhci-pci-gli.c
> > > > > > > > > > index d83261e857a5..d8a991b349a8 100644
> > > > > > > > > > --- a/drivers/mmc/host/sdhci-pci-gli.c
> > > > > > > > > > +++ b/drivers/mmc/host/sdhci-pci-gli.c
> > > > > > > > > > @@ -28,6 +28,9 @@
> > > > > > > > > >  #define PCI_GLI_9750_PM_CTRL   0xFC
> > > > > > > > > >  #define   PCI_GLI_9750_PM_STATE          GENMASK(1, 0)
> > > > > > > > > >
> > > > > > > > > > +#define PCI_GLI_9750_CORRERR_MASK                              0x214
> > > > > > > > > > +#define   PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT         BIT(12)
> > > > > > > > > > +
> > > > > > > > > >  #define SDHCI_GLI_9750_CFG2          0x848
> > > > > > > > > >  #define   SDHCI_GLI_9750_CFG2_L1DLY    GENMASK(28, 24)
> > > > > > > > > >  #define   GLI_9750_CFG2_L1DLY_VALUE    0x1F
> > > > > > > > > > @@ -152,6 +155,9 @@
> > > > > > > > > >  #define PCI_GLI_9755_PM_CTRL     0xFC
> > > > > > > > > >  #define   PCI_GLI_9755_PM_STATE    GENMASK(1, 0)
> > > > > > > > > >
> > > > > > > > > > +#define PCI_GLI_9755_CORRERR_MASK                              0x214
> > > > > > > > > > +#define   PCI_GLI_9755_CORRERR_MASK_REPLAY_TIMER_TIMEOUT         BIT(12)
> > > > > > > > > > +
> > > > > > > > > >  #define SDHCI_GLI_9767_GM_BURST_SIZE                   0x510
> > > > > > > > > >  #define   SDHCI_GLI_9767_GM_BURST_SIZE_AXI_ALWAYS_SET    BIT(8)
> > > > > > > > > >
> > > > > > > > > > @@ -561,6 +567,11 @@ static void gl9750_hw_setting(struct sdhci_host *host)
> > > > > > > > > >         value &= ~PCI_GLI_9750_PM_STATE;
> > > > > > > > > >         pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value);
> > > > > > > > > >
> > > > > > > > > > +       /* mask the replay timer timeout of AER */
> > > > > > > > > > +       pci_read_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, &value);
> > > > > > > > > > +       value |= PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT;
> > > > > > > > > > +       pci_write_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, value);
> > > > > > > > > > +
> > > > > > > > > >         gl9750_wt_off(host);
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > @@ -770,6 +781,11 @@ static void gl9755_hw_setting(struct sdhci_pci_slot *slot)
> > > > > > > > > >         value &= ~PCI_GLI_9755_PM_STATE;
> > > > > > > > > >         pci_write_config_dword(pdev, PCI_GLI_9755_PM_CTRL, value);
> > > > > > > > > >
> > > > > > > > > > +       /* mask the replay timer timeout of AER */
> > > > > > > > > > +       pci_read_config_dword(pdev, PCI_GLI_9755_CORRERR_MASK, &value);
> > > > > > > > > > +       value |= PCI_GLI_9755_CORRERR_MASK_REPLAY_TIMER_TIMEOUT;
> > > > > > > > > > +       pci_write_config_dword(pdev, PCI_GLI_9755_CORRERR_MASK, value);
> > > > > > > > > > +
> > > > > > > > > >         gl9755_wt_off(pdev);
> > > > > > > > > >  }
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > 2.25.1
> > > > > > > > > >
