Message-ID: <3ce018d9-e005-f988-37ed-016c559973ec@huawei.com>
Date:   Wed, 18 Jan 2023 20:34:03 +0800
From:   "wangjie (L)" <wangjie125@...wei.com>
To:     Leon Romanovsky <leon@...nel.org>
CC:     Hao Lan <lanhao@...wei.com>, <davem@...emloft.net>,
        <kuba@...nel.org>, <yisen.zhuang@...wei.com>,
        <salil.mehta@...wei.com>, <edumazet@...gle.com>,
        <pabeni@...hat.com>, <richardcochran@...il.com>,
        <shenjian15@...wei.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next 2/2] net: hns3: add vf fault process in hns3 ras



On 2023/1/17 19:21, Leon Romanovsky wrote:
> On Tue, Jan 17, 2023 at 03:04:15PM +0800, wangjie (L) wrote:
>>
>>
>> On 2023/1/13 14:51, Leon Romanovsky wrote:
>>> On Fri, Jan 13, 2023 at 10:08:29AM +0800, Hao Lan wrote:
>>>> From: Jie Wang <wangjie125@...wei.com>
>>>>
>>>> Currently the hns3 driver supports the VF fault detection feature. Several
>>>> RAS errors caused by VF resources do not require a PF function reset for
>>>> recovery. The driver only needs to reset the specified VF.
>>>>
>>>> So this patch adds handling for them in the RAS module. The new handling
>>>> gets detailed information about the RAS error and takes the most
>>>> appropriate recovery action based on that information.
>>>>
>>>> Signed-off-by: Jie Wang <wangjie125@...wei.com>
>>>> Signed-off-by: Hao Lan <lanhao@...wei.com>
>>>> ---
>>>>  drivers/net/ethernet/hisilicon/hns3/hnae3.h   |   1 +
>>>>  .../hns3/hns3_common/hclge_comm_cmd.h         |   1 +
>>>>  .../hisilicon/hns3/hns3pf/hclge_err.c         | 113 +++++++++++++++++-
>>>>  .../hisilicon/hns3/hns3pf/hclge_err.h         |   2 +
>>>>  .../hisilicon/hns3/hns3pf/hclge_main.c        |   3 +-
>>>>  .../hisilicon/hns3/hns3pf/hclge_main.h        |   1 +
>>>>  6 files changed, 115 insertions(+), 6 deletions(-)
>>>
>>> Why is it good idea to reset VF from PF?
>>> What will happen with driver bound to this VF?
>>> Shouldn't PCI recovery handle it?
>>>
>>> Thanks
>>> .
>> The PF doesn't reset the VF directly. These VF faults are detected by
>> hardware and reported only to the PF. The PF gets the VF id from firmware,
>> then notifies the VF that it needs a reset. The VF resets itself after
>> receiving the request.
>
> This description isn't aligned with the code. You are issuing
> hclge_func_reset_cmd() command which will reset VF, while notification
> are handled by hclge_func_reset_notify_vf().
>
> It also doesn't make any sense to send notification event to VF through
> FW while the goal is to recover from stuck FW in that VF.
>
Yes, I misunderstood hclge_func_reset_notify_vf and
hclge_func_reset_cmd. The patch should use hclge_func_reset_notify_vf to
inform the VF that it needs to recover. I will fix and retest this in V2.
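
To make the intended V2 flow concrete, here is a rough user-space sketch of
"PF only notifies, VF resets itself". It is a toy model, not driver code: all
toy_* names, the structures and MAX_VFS are hypothetical, and the VF id that
the PF would obtain from firmware is simply passed in as a parameter.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_VFS 4 /* hypothetical VF count for this toy model */

/* Hypothetical model of one VF; not the hns3 driver's real structures. */
struct toy_vf {
	bool reset_requested; /* stands in for the PF-to-VF notification */
	int reset_count;      /* resets the VF has performed on itself */
};

struct toy_pf {
	struct toy_vf vfs[MAX_VFS];
	int pf_reset_count;   /* stays 0: no PF function reset is done */
};

/* PF side: only post a reset request to the faulty VF. */
static int toy_pf_notify_vf_reset(struct toy_pf *pf, int vf_id)
{
	if (vf_id < 0 || vf_id >= MAX_VFS)
		return -1;
	pf->vfs[vf_id].reset_requested = true;
	return 0;
}

/* VF side: on seeing a pending request, the VF resets itself. */
static void toy_vf_service(struct toy_vf *vf)
{
	if (vf->reset_requested) {
		vf->reset_requested = false;
		vf->reset_count++;
	}
}

/*
 * Fault handler on the PF. vf_id models the id reported by firmware
 * for the fault (assumption: it is already known here).
 */
static int toy_handle_vf_fault(struct toy_pf *pf, int vf_id)
{
	return toy_pf_notify_vf_reset(pf, vf_id);
}

/* Small scenario: VF 2 faults, only VF 2 resets, the PF is untouched. */
static int toy_demo(void)
{
	struct toy_pf pf = {0};

	assert(toy_handle_vf_fault(&pf, 2) == 0);
	toy_vf_service(&pf.vfs[2]);
	assert(pf.vfs[2].reset_count == 1);
	assert(pf.vfs[1].reset_count == 0);
	assert(pf.pf_reset_count == 0);
	assert(toy_handle_vf_fault(&pf, MAX_VFS) == -1);
	return 0;
}
```

Calling toy_demo() from a main() runs the scenario. The point of the split is
that the PF never touches VF state beyond posting the request, so a driver
bound to the VF stays in control of its own reset.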

This patch is used to recover from specific VF hardware errors, for example
TX queue configuration exceptions. It makes sense in these cases because the
firmware is still working properly and can carry out the recovery correctly.
>>
>> These hardware faults are not standard PCI RAS events, so we prefer to use
>> the MSI-X path.
>
> What is different here?
>
These hardware faults are reported through MSI-X interrupts instead of the
PCI RAS path.

Thanks!
>>>
> .
>
