[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <611dfd5e-33b7-12ad-5902-ad20edf3b02e@kernel.org>
Date: Tue, 15 Aug 2023 11:41:07 +0900
From: Damien Le Moal <dlemoal@...nel.org>
To: Li Nan <linan666@...weicloud.com>
Cc: linux-ide@...r.kernel.org, linux-kernel@...r.kernel.org,
linan122@...wei.com, yukuai3@...wei.com, yi.zhang@...wei.com,
houtao1@...wei.com, yangerkun@...wei.com, jianghong011@...wei.com,
zhangcheng75@...wei.com
Subject: Re: [PATCH] scsi: ata: Fix a race condition between scsi error
handler and ahci interrupt
On 8/14/23 22:20, Li Nan wrote:
>
> 在 2023/8/14 15:50, Damien Le Moal 写道:
>> On 8/14/23 15:41, Li Nan wrote:
>>>> This is definitely not correct because EH may have been scheduled for a non
>>>> fatal action like a device revalidate or to get sense data for successful
>>>> commands. With this change, the port will NOT be frozen when a hard error IRQ
>>>> comes while EH is waiting to start, that is, while EH waits for all commands to
>>>> complete first.
>>>>
>>>
>>> Yeah, we should find a better way to fix it. Do you have any suggesstions?
>>>
>>>> Furthermore, if you get an IRQ that requires the port to be frozen, it means
>>>> that you had a failed command. In that case, the drive is in error state per
>>>> ATA specs and stops all communication until a read log 10h command is issued.
>>>> So you should never ever see 2 error IRQs one after the other. If you do, it
>>>> very likely means that you have buggy hardware.
>>>>
>>>> How do you get into this situation ? What adapter and disk are you using ?
>>>>
>>>
>>> > How do you get into this situation ?
>>> The first IRQ is io error, the second IRQ is disk link flash break.
>>
>> What does "link flash break" mean ?
>>
>>>
>>> > What adapter and disk are you using ?
>>> It is a disk developed by our company, but we think the same issue
>>> exists when using other disks.
>>
>> As I said, I find this situation highly suspect because if the first IRQ was to
>> signal an IO error that the drive reported, then per ATA specifications, the
>> drive should be in error mode and should NOT have transmitted any other FIS
>> after the SDB FIS that signaled the error. Nothing at all should come after that
>> error SDB FIS, until the host issues a read log 10h to get thee drive out of
>> error state.
>>
>> If this is a prototype device, I would recommend that you take an ATA bus trace
>> and verify the FIS traffic. Something fishy is going on with the drive in my
>> opinion.
>>
>
> Thank you for your patient explanation. I'm sorry I didn't explain the
> problem clearly before. After discussing with my colleagues who know
> more about dirvers, Let me re-describe the problem.
>
> The problem`s situation is the SATA link is quickly disconnected and
> connected. For example, when an I/O error is processed in error handling
> thread, the disk is manually removed and inserted, and the AHCI chip
> reports a hot plug interrupt.
>
> This scenario is not just an NCQ error, but a disk is removed and
> quickly inserted before the error processing is completed. For the error
> handling process, the disk status needs to be restored after the error
> handling is complete.
In your original email, you showed:
interrupt scsi_eh
ahci_error_intr
=>ata_port_freeze
=>__ata_port_freeze
=>ahci_freeze (turn IRQ off)
=>ata_port_abort
=>ata_port_schedule_eh
=>shost->host_eh_scheduled++;
host_eh_scheduled = 1
scsi_error_handler
=>ata_scsi_error
=>ata_scsi_port_error_handler
=>ahci_error_handler
. =>sata_pmp_error_handler
. =>ata_eh_thaw_port
. =>ahci_thaw (turn IRQ on)
ahci_error_intr .
=>ata_port_freeze .
=>__ata_port_freeze .
=>ahci_freeze (turn IRQ off) .
=>ata_port_abort .
=>ata_port_schedule_eh .
=>shost->host_eh_scheduled++; .
host_eh_scheduled = 2 .
But here, I do not understand how host_eh_scheduled can be incremented since the
shost state should still be SHOST_RECOVERY until scsi_restart_operations() is
called at the end of scsi_error_handler(), which is after ata_std_end_eh() is
executed toward the end of ata_scsi_port_error_handler().
Not sure how what you are showing here can happen. Can you have a closer look ?
=>ata_std_end_eh
=>host->host_eh_scheduled = 0;
In any case, for you particular failure pattern, given that the disk "goes away"
while EH is running, I would expect the commands executed during EH (e.g. read
log 10h) to timeout, which would cause a reset and a revalidate after that. The
reset should clear the port interrupt error bits, which should allow everything
to recover after aborting all commands caught by the first EH run.
--
Damien Le Moal
Western Digital Research
Powered by blists - more mailing lists