linux-kernel - Re: [PATCH v4 1/1] nvme: handle persistent internal error AER from NVMe controller


Message-ID: <4a0b859e-8e2f-8ae0-1485-0f1ddec190ed@nvidia.com>
Date:   Thu, 9 Jun 2022 05:06:02 +0000
From:   Chaitanya Kulkarni <chaitanyak@...dia.com>
To:     "Michael Kelley (LINUX)" <mikelley@...rosoft.com>
CC:     Caroline Subramoney <Caroline.Subramoney@...rosoft.com>,
        "linux-nvme@...ts.infradead.org" <linux-nvme@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "axboe@...com" <axboe@...com>,
        Richard Wurdack <riwurd@...rosoft.com>,
        Nathan Obr <Nathan.Obr@...rosoft.com>,
        "sagi@...mberg.me" <sagi@...mberg.me>,
        "kbusch@...nel.org" <kbusch@...nel.org>, "hch@....de" <hch@....de>
Subject: Re: [PATCH v4 1/1] nvme: handle persistent internal error AER from
 NVMe controller

On 6/8/2022 6:30 PM, Michael Kelley (LINUX) wrote:
> From: Chaitanya Kulkarni <chaitanyak@...dia.com>
>>
>> On 6/8/22 17:22, Chaitanya Kulkarni wrote:
>>> On 6/8/22 11:52, Michael Kelley wrote:
>>>> In the NVM Express Revision 1.4 spec, Figure 145 describes possible
>>>> values for an AER with event type "Error" (value 000b). For a
>>>> Persistent Internal Error (value 03h), the host should perform a
>>>> controller reset.
>>>>
>>>> Add support for this error using code that already exists for
>>>> doing a controller reset. As part of this support, introduce
>>>> two utility functions for parsing the AER type and subtype.
>>>>
>>>> This new support was tested in a lab environment where we can
>>>> generate the persistent internal error on demand, and observe
>>>> both the Linux side and NVMe controller side to see that the
>>>> controller reset has been done.
>>>>
>>>>
>>
>> Can you please clarify which transports you have tested,
>> such as RDMA, TCP, and PCIe?
>>
> 
> I've tested PCIe only -- that's all I have access to.  I can tweak
> the commit message to be more specific.
> 
> Michael

It's okay, we have it documented now. Thanks again.

-ck
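
For readers following the thread, the AER parsing described in the quoted
commit message can be sketched roughly as below. This is an illustration
only, not the code from the patch: the function and macro names are
hypothetical, and the bit layout (event type in bits 2:0 and event
information in bits 15:8 of completion dword 0) is an assumption taken from
the NVMe 1.4 description of the Asynchronous Event Request completion. The
values 000b ("Error") and 03h (Persistent Internal Error) are the ones the
commit message cites.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values as cited in the commit message. */
#define AER_TYPE_ERROR                 0x0u  /* event type "Error" (000b) */
#define AER_ERROR_PERSISTENT_INTERNAL  0x03u /* Persistent Internal Error */

/* Assumed layout: event type lives in bits 2:0 of completion dword 0. */
static inline uint32_t aer_type(uint32_t result)
{
	return result & 0x7u;
}

/* Assumed layout: event information (subtype) lives in bits 15:8. */
static inline uint32_t aer_subtype(uint32_t result)
{
	return (result >> 8) & 0xffu;
}

/*
 * True when the AER reports a persistent internal error, i.e. the case
 * in which the host is expected to perform a controller reset.
 */
static inline bool aer_needs_reset(uint32_t result)
{
	return aer_type(result) == AER_TYPE_ERROR &&
	       aer_subtype(result) == AER_ERROR_PERSISTENT_INTERNAL;
}
```

In a driver, a check like aer_needs_reset() would sit in the AER completion
handler and hand off to the existing controller reset path. That path is
omitted here because the thread does not show it.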