linux-kernel - Re: Re: [PATCH] ahci: libahci: clear pending interrupt status

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZOzyl0r2+ijSx7il@x1-carbon>
Date:   Mon, 28 Aug 2023 19:16:44 +0000
From:   Niklas Cassel <Niklas.Cassel@....com>
To:     Szuying Chen <chensiying21@...il.com>
CC:     "dlemoal@...nel.org" <dlemoal@...nel.org>,
        "linux-ide@...r.kernel.org" <linux-ide@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Jesse1_Chang@...edia.com.tw" <Jesse1_Chang@...edia.com.tw>,
        "Richard_Hsu@...edia.com.tw" <Richard_Hsu@...edia.com.tw>,
        "Chloe_Chen@...edia.com.tw" <Chloe_Chen@...edia.com.tw>
Subject: Re: Re: [PATCH] ahci: libahci: clear pending interrupt status

On Thu, Aug 24, 2023 at 06:51:59PM +0800, Szuying Chen wrote:
> 
> On 8/22/23 16:41, Niklas Cassel wrote: 
> >
> >On Mon, Aug 14, 2023 at 12:17:20PM +0800, Szuying Chen wrote:
> >> On 8/10/23 17:59, Damien Le Moal wrote: 
> >> >On 8/10/23 18:31, Szuying Chen wrote:
> >> >>> On 8/10/23 14:12, Damien Le Moal wrote: 
> >> >>>    On 8/10/23 14:05, Szuying Chen wrote:
> >> >>>    > When ISR handle interface fatal error with error recovery after clear PxIS
> >> >>>    > and PxIE. Another FIS(SDB FIS with err) that set PxIS.IFS to 1 is recevied
> >> >>>    > during error recovery, which causing the HBA not issue any new commands
> >> >>>    > after cmd.ST set 1.
> >> >>>
> >> >>>    This is not very clear. If there was a fatal error, the drive should be in
> >> >>>    error state and no other SDB FIS can be received as the drive does absolutely
> >> >>>    nothing while in error state (it only waits for a read log 10h command to be>
> >> >>>    issued to get it out of error state). So if you are seeing 2 SDB FIS with
> >> >>>    errors one after the other, you have a buggy device...
> >> >>>
> >> >>>    However, I may be misunderstanding your issue here. Could you provide more
> >> >>>    details and a dmesg output example of the issue ?
> >> >> 
> >> >> According to AHCI 1.3.1 specification ch6.1.9, when an R_ERR is 
> >> >> received on an H2D Data FIS in normal operation, the HBA sets 
> >> >> PxIS.IFS to 1 (fatal error) and halts operation. Referring to SATA 
> >> >> 3.0 specification we know the device will halt queued command 
> >> >> processing and transmit SDB FIS to host with ERR bit in Status field set to one(set PxIS.TFES to 1).
> >> >
> >> >Sure, but that SBD FIS should be completely ignored by the adapter 
> >> >since it stopped operation. If you see it, then it means that the 
> >> >handling of the first error was incomplete.
> >> >
> >> The SDB FIS  happens between pxIS.IE was cleared and PxCMD.ST not 
> >> cleared to 0. Because the PxCMD.ST still setting, so the SDB FIS can't ignored.
> >> 
> >> >> In our case, the ISR handles fatal errors(PxIS.IFS) and enters 
> >> >> error recovery after cleaning up PxIS and PxIE. Then a SDB FIS is 
> >> >> received with interrupt bit(PxIS.TFES) set to 1. According to AHCI 
> >> >> 1.3.1 specification ch6.2.2, HBA can't issue(cmd.ST set to 1) any 
> >> >> new commands under PxIS.TFES alive during error recovery.
> >> >
> >> >But how come you see a new command being issued ? This entire 
> >> >situation should result in a port reset with the first error. I do not see how this is possible.
> >> >Are you saying that the port reset is not cleaning things up properly 
> >> >? Could you share the dmesg output of this case ?
> >> >
> >> >
> >> dmesg:
> >> [  654.244958] nouveau 0000:09:00.0: AMD-Vi: Event logged 
> >> [IO_PAGE_FAULT
> >> domain=0x0019 address=0x0 flags=0x0000] [  654.460522] nouveau 
> >> 0000:09:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT
> >> domain=0x0019 address=0x0 flags=0x0000] [  656.180375] ata7.00: 
> >> exception Emask 0x10 SAct 0x7c0 SErr 0x200100
> >
> >Hello Szuying Chen,
> >
> >From your logs we can see that SErr 0x200100:
> >This means that PxSERR.DIAG.C was set.
> >
> >This means that:
> >"Calculated Different CRC than Received: When this occurs, the HBA returns R_ERR and returns to idle."
> >
> >(Since PxIS.IFS is set, we know that it was a D2H Data FIS that had invalid CRC. (If it was a D2H non-Data FIS PxIS.INFS would have been set instead.))
> >
> >
> >So your commit message is wrong. In your case it is the HBA that sends a R_ERR to the device, so you should look at "6.1.10 CRC error in received FIS" and not "6.1.9 Device responds to FIS with R_ERR".
> >
> >"When a CRC error occurs on a D2H Data FIS, the HBA sets PxIS.IFS to ‘1’.
> >After a CRC error on a D2H Data FIS, the device may send a D2H Register FIS to indicate the reason for the error. The host should allow posting of non-Data FISes before PxCMD.ST is cleared to ‘0’."
> >
> >Looking at the state machine "5.3.11.2 DR:Receive" you will jump to state ERR:Fatal. (Meaning that we need to clear PxCMD.ST in order to issue new
> >commands.)
> >
> >
> >The HBA should this trigger an error irq when noticing the bad CRC.
> >
> >It is possible that the HBA triggers another error IRQ when receiving a D2H with TFES bit set, if it is received before PxCMD.ST is cleared to 0, see NDR:Accept:
> >FIS Type is D2H Register and PxCMD.ST=0, jump to P:RegFISUpdate, which does not trigger any IRQ.
> >If PxCMD.ST is set, the state machine will instead jump to RegFIS:Entry.
> >
> >
> >So, in order to understand what is going on, can you please add unconditional prints in ahci_error_intr(), so we can actually see how many times this is called. (Right now it seems to only be called once). Please also print when PxCMD.ST is cleared to 0.
> >(So that we can verify that ahci_error_intr() is never called after that). Also please remove all prints from other drivers, e.g. nouveau.
> >
> >
> >Kind regards,
> >Niklas
> >
> >
> Hi Niklas,
> 
> In our case, receiving D2H with TFES bit set(PxIS.TFES set to 1) doesn't
> triggers aother error IRQ before PxCMD.ST is cleared. Because IRQ was 
> turn off(PxIE=0).
> 
> After received D2H with TFES bit set, port is restart(clearing PxCMD.ST to 
> '0' and then setting PxCMD.ST to '1').In the dmesg output, device is lose 
> after the port restart 3 times. Durring this time, the TFES bit was not 
> cleared.
> 
> The Linux kernel dmesg output as shown below:
>  [   40.251795] ahci_error_intr start (irq_stat 0x08000000) ****
> [   40.251799] ahci_error_intr: interface fatal error
> [   40.251800] ahci_error_intr: freeze (set pxIE to 0)
> [   40.251805] ahci_error_intr end (PORT_IRQ_MASK = 0x00000000 )****
> [   40.298113] ata8.00: exception Emask 0x10 SAct 0x70e000 SErr 0x280100 
> action 0x6 frozen
> [   40.298117] ata8.00: irq_stat 0x08000000, interface fatal error
> [   40.298119] ata8: SError: { UnrecovData 10B8B BadCRC }
> [   40.298121] ata8.00: failed command: READ FPDMA QUEUED
> [   40.298122] ata8.00: cmd 60/b8:68:c0:c3:03/08:00:00:00:00/40 tag 13 
> ncq dma 1142784 in
>          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
> [   40.298126] ata8.00: status: { DRDY }
> [   40.298128] ata8.00: failed command: READ FPDMA QUEUED
> [   40.298129] ata8.00: cmd 60/b8:70:78:cc:03/05:00:00:00:00/40 tag 14 
> ncq dma 749568 in
>          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
> [   40.298132] ata8.00: status: { DRDY }
> [   40.298133] ata8.00: failed command: READ FPDMA QUEUED
> [   40.298134] ata8.00: cmd 60/00:78:30:d2:03/0a:00:00:00:00/40 tag 15 
> ncq dma 1310720 in
>          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
> [   40.298137] ata8.00: status: { DRDY }
> [   40.298138] ata8.00: failed command: READ FPDMA QUEUED
> [   40.298139] ata8.00: cmd 60/00:a0:30:dc:03/0a:00:00:00:00/40 tag 20 
> ncq dma 1310720 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
> [   40.298142] ata8.00: status: { DRDY }
> [   40.298143] ata8.00: failed command: READ FPDMA QUEUED
> [   40.298144] ata8.00: cmd 60/40:a8:30:e6:03/08:00:00:00:00/40 tag 21 
> ncq dma 1081344 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
> [   40.298147] ata8.00: status: { DRDY }
> [   40.298148] ata8.00: failed command: READ FPDMA QUEUED
> [   40.298149] ata8.00: cmd 60/90:b0:70:ee:03/09:00:00:00:00/40 tag 22 
> ncq dma 1253376 in
>          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x10 (ATA bus error)
> [   40.298151] ata8.00: status: { DRDY }
> [   40.298154] ata8: hard resetting link
> [   40.298155] ata8: ahci_hardreset *****
> [   40.298160] ahci 0000:6f:00.0: ahci_stop_engine start ***
> ( PORT_IRQ_STAT 0x40000008 PORT_CMD 0x0004d517)

Hello Szuying Chen,


Here we can see that PORT_IRQ_STAT is still set after a hardreset (COMRESET).

This does not seem to be in-line with AHCI 1.3.1.

libata does clear PxIS in ahci_port_init(), but that function seems to only
be called at probe time.
libata does also clear PxIS in ahci_thaw(), but that is after issuing the
COMRESET.



The current libata flow, for your error, is like this:

ahci_error_intr() -> ata_port_freeze()
ata_port_abort() -> EH scheduled for all active QCs.

ata_scsi_error() -> ata_scsi_port_error_handler() -> ahci_error_handler() ->
sata_pmp_error_handler() -> ata_eh_autopsy() -> ata_eh_link_autopsy() ->
ata_eh_analyze_serror()

ata_eh_analyze_serror() will see that SError has SERR_DATA data set,
so it will mark it with:
	err_mask |= AC_ERR_ATA_BUS;
	action |= ATA_EH_RESET;

sata_pmp_eh_recover() -> ata_eh_recover() -> ata_eh_reset() ->
ahci_hardreset() -> ahci_do_hardreset().



ahci_do_hardreset() does:

hpriv->stop_engine() which clears PxCMD.ST.

then performs a COMRESET (calls sata_link_hardreset())

then calls hpriv->start_engine().


After that ata_eh_reset() (which called ahci_do_hardreset()),
will call ata_eh_thaw_port() which will clear PxIS and enable IRQs.

ata_eh_reset() will then call postreset (ahci_postreset()).

ahci_postreset() will call ata_std_postreset() which will clear SError.



The problem with this is that a lot of this is not in line with
AHCI 1.3.1 - section 6.2.2.2, which states that error recovery
should be performed in the following order:

- Reads PxSACT to see which commands have not yet completed
- Clears PxCMD.ST to ‘0’ to reset the PxCI register, waits for PxCMD.CR to
clear to ‘0’
- Clears any error bits in PxSERR to enable capturing new errors.
- Clears status bits in PxIS as appropriate
- If PxTFD.STS.BSY or PxTFD.STS.DRQ is set to ‘1’, issue a COMRESET to the
device to put it in an idle state.
- Sets PxCMD.ST to ‘1’ to enable issuing new commands
- Issue READ LOG EXT to determine the cause of the error condition if
  software did not have to perform a reset (COMRESET or software reset)
  as part of the error recovery


To be in-line with the spec, I think the clearing of PxIS (and SError)
should optimally be done in ahci_do_hardreset(), between the clearing
and setting of PxCMD.ST, and before the COMRESET (before calling
sata_link_hardreset()).



Your current patch proposal (which makes things work for you),
clears PxIS after the COMRESET... Which is also not in line with the spec.
Since your non-spec compliant patch makes things work for you, perhaps you
could instead try to put the PxIS clearing in ahci_postreset().
At least then the clearing of SError and PxIS is done at the same time
(even if both are done after setting PxCMD.ST to 1...)


Kind regards,
Niklas