Message-ID: <8dfe634c-7466-dca3-6838-b305b2eb465a@opensource.wdc.com>
Date:   Mon, 6 Dec 2021 20:57:58 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     Ayan Kumar Halder <ayan.kumar.halder@...inx.com>,
        linux-ide@...r.kernel.org
Cc:     axboe@...nel.dk, linux-kernel@...r.kernel.org,
        Stefano Stabellini <sstabellini@...nel.org>,
        Stefano Stabellini <stefano.stabellini@...inx.com>
Subject: Re: Need help to debug ata errors

On 2021/12/06 20:18, Ayan Kumar Halder wrote:
> Hi Damien,
> 
> Thanks a lot for your inputs.
> 
> On 06/12/2021 00:12, Damien Le Moal wrote:
>> On 2021/12/03 20:11, Ayan Kumar Halder wrote:
>>> Hi All,
>>>
>>> I am trying to run linux as a DomU guest on Xen with AHCI assigned to it.
>>> I can confirm that SATA works (ie able to detect sdb) as a Dom0 guest.
>>> However, it does not work as a DomU guest.
>>>
>>> Hardware :- ZCU102 board and it has two sata ports
>>> Kernel :- 5.10
>>>
>>> I have enabled the debug logs in drivers/ata
>>>
>>> 1. Logs from dom0 (where SATA works) https://pastebin.com/2BhMDq47
>>> 2. Logs from domU (where SATA does not work) https://pastebin.com/fE8WZnZ0
>>>
>>> Can someone help me answer these questions?
>>> 1. What does this mean "1st FIS failed" ?
>>>
>>> 2. In the dom0 logs, PORT_SCR_ERR = 0x41d0002 whereas in domU logs,
>>> PORT_SCR_ERR = 0. Does it give some hints ?
>>>
>>> 3. Any other issues or hints to debug this ?
>>>
>>> I can confirm that in domU scenario, we do not get any interrupts from
>>> the device. What might be going wrong here ?
>>
>> That would be the first thing to check, since without interrupts you will not get
>> any command completion. Commands will time out and probe will not work.
>> And this IRQ problem is Xen territory, not ata.
> 
> I am actually debugging the interrupts from the Xen's side. I can 
> confirm that do_IRQ() (Xen's irq handler) does not receive AHCI 
> interrupts. It does get invoked for interrupts from serial and other 
> devices.
> 
> I have seen commands being timed out which is due to the IRQ issue. But 
> surprisingly, ahci probe is successful.

That cannot be. Without any interrupt, there will be no command completion.
Commands that time out are retried. So you may have seen timeouts because the
platform or device is very slow to respond, but you must be getting interrupts
if you get a good device probe. Otherwise, you would not see any disk connected
to your ports.
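
One way to check this from inside the guest is to look at the interrupt count
for the AHCI line in /proc/interrupts before and after probe. Below is a
minimal Python sketch of that check. The helper name irq_counts and the
"ahci" search string are hypothetical, and the line name your platform
driver registers may differ, so adjust the needle to whatever your dom0
/proc/interrupts shows.

```python
def irq_counts(text, needle):
    """Return {irq_label: total_count} for /proc/interrupts lines whose
    description contains `needle` (case-insensitive).

    `text` is the full content of /proc/interrupts. `irq_counts` and
    `needle` are illustrative names, not part of any kernel tool.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    # The header row lists one column per online CPU: CPU0 CPU1 ...
    ncpu = len(lines[0].split())
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        # The per-CPU counters come first; stop at the first non-numeric field.
        counts = []
        for tok in fields:
            if len(counts) < ncpu and tok.isdigit():
                counts.append(int(tok))
            else:
                break
        desc = " ".join(fields[len(counts):])
        if needle.lower() in desc.lower():
            totals[label.strip()] = sum(counts)
    return totals

# Usage on a live system:
#   with open("/proc/interrupts") as f:
#       print(irq_counts(f.read(), "ahci"))
```

If the total stays at 0 in domU while commands are outstanding, the interrupt
is not reaching the guest. If it increases, the timeouts have another cause.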

>>
>> The 1st FIS failed error may be due to some problems with AHCI PCI BAR/register
>> accesses, which may not be working. This I think points again to the Xen setup with
>> domU, which may not have the necessary access rights for IRQs and PCI BAR
>> accesses? (I have no experience with Xen.)
> 
> This is the device tree https://pastebin.com/HtdLx63v . I think it is 
> not related to PCI bus. Please correct me if mistaken.

Well, since you have an ahci node, I do not think that adapter is behind the PCI
bus :) It is a child of the axi bus. Not familiar with that type of setup...
Are you sure all properties of the ahci node are correct ?

> I have the necessary debug support from Xen. Can you let me know what 
> bits I can debug from SATA side (for eg reading a particular register) 
> which will confirm if SATA has been programmed correctly or not ?

The device probe with domU should be no different than what it is with dom0, I
think. Again, I do not have experience with Xen, so not entirely sure.

Note that from the dmesg you sent, for the working case, the port seems to be
awfully slow to link up. Not sure if that is normal for this platform.
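
On the PORT_SCR_ERR values quoted earlier in the thread: that register is the
port's SError, and decoding it bit by bit may help when comparing the dom0 and
domU logs. The sketch below is Python and uses bit names taken from the SERR_*
definitions in include/linux/libata.h. The helper name decode_serror is
hypothetical, and the bit table should be checked against your 5.10 tree.

```python
# SError bit positions, named after the SERR_* constants in
# include/linux/libata.h (verify against the kernel tree in use).
SERR_BITS = {
    0: "DATA_RECOVERED",
    1: "COMM_RECOVERED",
    8: "DATA",
    9: "PERSISTENT",
    10: "PROTOCOL",
    11: "INTERNAL",
    16: "PHYRDY_CHG",
    17: "PHY_INT_ERR",
    18: "COMM_WAKE",
    19: "10B_8B_ERR",
    20: "DISPARITY",
    21: "CRC",
    22: "HANDSHAKE",
    23: "LINK_SEQ_ERR",
    24: "TRANS_ST_ERROR",
    25: "UNRECOG_FIS",
    26: "DEV_XCHG",
}

def decode_serror(value):
    """Return the names of the SError bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items())
            if value & (1 << bit)]

# The two values reported in the thread:
for v in (0x41d0002, 0x0):
    print(hex(v), decode_serror(v))
```

For the dom0 value, the set bits are all link-level events (PHY ready change,
comm wake, decode and disparity errors, device exchanged). A value of 0 in
domU only says that none of these are latched when the register is read.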


-- 
Damien Le Moal
Western Digital Research
