linux-kernel - [PATCH] ahci: libahci: clear pending interrupt status

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20230810093147.45678-1-Chloe_Chen@asmedia.com.tw>
Date:   Thu, 10 Aug 2023 17:31:47 +0800
From:   Szuying Chen <chensiying21@...il.com>
To:     dlemoal@...nel.org, linux-ide@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc:     Jesse1_Chang@...edia.com.tw, Richard_Hsu@...edia.com.tw,
        Szuying Chen <Chloe_Chen@...edia.com.tw>
Subject: [PATCH] ahci: libahci: clear pending interrupt status

On 8/10/23 14:12, Damien Le Moal wrote: 
>    On 8/10/23 14:05, Szuying Chen wrote:
>    > When ISR handle interface fatal error with error recovery after clear PxIS
>    > and PxIE. Another FIS(SDB FIS with err) that set PxIS.IFS to 1 is recevied
>    > during error recovery, which causing the HBA not issue any new commands
>    > after cmd.ST set 1.
>
>    This is not very clear. If there was a fatal error, the drive should be in
>    error state and no other SDB FIS can be received as the drive does absolutely
>    nothing while in error state (it only waits for a read log 10h command to be>
>    issued to get it out of error state). So if you are seeing 2 SDB FIS with
>    errors one after the other, you have a buggy device...
>
>    However, I may be misunderstanding your issue here. Could you provide more
>    details and a dmesg output example of the issue ?

According to AHCI 1.3.1 specification ch6.1.9, when an R_ERR is received
on an H2D Data FIS in normal operation, the HBA sets PxIS.IFS to 1
(fatal error) and halts operation. Referring to SATA 3.0 specification we
know the device will halt queued command processing and transmit SDB FIS to
host with ERR bit in Status field set to one(set PxIS.TFES to 1).

In our case, the ISR handles fatal errors(PxIS.IFS) and enters error 
recovery after cleaning up PxIS and PxIE. Then a SDB FIS is received 
with interrupt bit(PxIS.TFES) set to 1. According to AHCI 1.3.1 
specification ch6.2.2, HBA can't issue(cmd.ST set to 1) any new commands
under PxIS.TFES alive during error recovery.

>    >
>    > Signed-off-by: Szuying Chen <Chloe_Chen@...edia.com.tw>
>    > ---
>    >  drivers/ata/libahci.c | 12 ++++++++++++
>    >  1 file changed, 12 insertions(+)
>    >
>    > diff --git a/drivers/ata/libahci.c b/drivers/ata/libahci.c
>    > index 06aec35f88f2..0ae51fd95d46 100644
>    > --- a/drivers/ata/libahci.c
>    > +++ b/drivers/ata/libahci.c
>    > @@ -679,9 +679,21 @@ static int ahci_scr_write(struct ata_link *link, unsigned int sc_reg, u32 val)
>    >
>    >  void ahci_start_engine(struct ata_port *ap)
>    >  {
>    > +     struct ahci_host_priv *hpriv = ap->host->private_data;
>    >       void __iomem *port_mmio = ahci_port_base(ap);
>    >       u32 tmp;
>    >
>    > +     /* clear SError */
>    > +     tmp = readl(port_mmio + PORT_SCR_ERR);
>    > +     writel(tmp, port_mmio + PORT_SCR_ERR);
>    > +
>    > +     /* clear port IRQ */
>    > +     tmp = readl(port_mmio + PORT_IRQ_STAT);
>    > +     if (tmp)
>    > +             writel(tmp, port_mmio + PORT_IRQ_STAT);
>    > +
>    > +     writel(1 << ap->port_no, hpriv->mmio + PORT_IRQ_STAT);
>    > +
>    >       /* start DMA */
>    >       tmp = readl(port_mmio + PORT_CMD);
>    >       tmp |= PORT_CMD_START;
>    > --
>    > 2.39.2
>   >
>
>   -- 
>   Damien Le Moal
>    Western Digital Research
>
Thanks.