Message-ID: <58E6011F.6030002@cn.fujitsu.com>
Date: Thu, 6 Apr 2017 16:49:35 +0800
From: Cao jin <caoj.fnst@...fujitsu.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
CC: Alex Williamson <alex.williamson@...hat.com>,
<linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
<qemu-devel@...gnu.org>, <izumi.taku@...fujitsu.com>
Subject: Re: [PATCH v6] vfio error recovery: kernel support
On 04/06/2017 05:56 AM, Michael S. Tsirkin wrote:
> On Wed, Apr 05, 2017 at 04:54:33PM +0800, Cao jin wrote:
>> Apparently, I don't have experience to induce non-fatal error, device
>> error is more of a chance related with the environment(temperature,
>> humidity, etc) as I understand.
>
> I'm not sure how to interpret this statement. I think what Alex is
> saying is simply that patches should include some justification. They
> make changes but what are they improving?
> For example:
>
> I tested device ABC in conditions DEF. Without a patch VM
> stops. With the patches applied VM recovers and proceeds to
> use the device normally.
>
> is one reasonable justification imho.
>
Got it. Unfortunately, I have not yet seen a VM stop caused by a real
non-fatal device error during device assignment (I have only seen real
fatal errors after starting a VM).

On the one hand, AER errors can occur in theory; on the other hand, few
people have actually seen a VM stop because of AER. Now I am asked
whether I have real evidence or a real scenario proving that this
patchset is useful. I don't, and we all know it is hard to trigger a
real hardware error, so it seems I have been pushed into a corner. I
guess the same questions apply to the author of the AER driver: if the
scenario were easy to reproduce, there would be no need to write
aer_inject to fake errors.
--
Sincerely,
Cao jin