netdev - Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down failed

Message-ID: <20210422065843.GA2743610@us.ibm.com>
Date:   Wed, 21 Apr 2021 23:58:43 -0700
From:   Sukadev Bhattiprolu <sukadev@...ux.ibm.com>
To:     Lijun Pan <lijunp213@...il.com>
Cc:     Lijun Pan <ljp@...ux.vnet.ibm.com>,
        Dany Madden <drt@...ux.ibm.com>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Tom Falcon <tlfalcon@...ux.ibm.com>, netdev@...r.kernel.org,
        Paul Mackerras <paulus@...ba.org>,
        linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH V2 net] ibmvnic: Continue with reset if set link down
 failed

Lijun Pan [lijunp213@...il.com] wrote:
> > Now, sure we can attempt a "thorough hard reset" which also does
> > the same hcalls to reestablish the connection. Is there any
> > other magic in do_hard_reset()? But in addition, it also frees lot
> > more Linux kernel buffers and reallocates them for instance.
> 
> Working around everything in do_reset will make the code very difficult

We are not working around everything. In do_reset() we do exactly
what a hard reset would do for this error: ignore the set link down
failure and try to reestablish the connection with the VIOS.

What we are avoiding is unnecessary work on the Linux side for a
communication problem on the VIOS side.
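
To make the comparison concrete, here is a rough user-space sketch of
the two paths as described above. It is not the driver code: every
name in it (struct fake_adapter, fake_set_link_down(), fake_reconnect(),
fake_do_reset(), fake_do_hard_reset()) is a hypothetical stand-in, and
the buffer counter only models the extra free/reallocate work that a
hard reset does.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the adapter state; not the real driver struct. */
struct fake_adapter {
	bool link_down_fails;     /* simulate a failing set-link-down request */
	int  buffers_reallocated; /* counts free/realloc cycles of Linux buffers */
	bool connected;           /* true once the connection is reestablished */
};

/* Hypothetical: ask the VIOS to take the link down. */
static int fake_set_link_down(struct fake_adapter *a)
{
	return a->link_down_fails ? -1 : 0;
}

/* Hypothetical: redo the handshake that reestablishes the connection. */
static int fake_reconnect(struct fake_adapter *a)
{
	a->connected = true;
	return 0;
}

/* Soft reset: note the link-down failure, then continue with the reset. */
static int fake_do_reset(struct fake_adapter *a)
{
	int rc = fake_set_link_down(a);

	if (rc)
		fprintf(stderr, "set link down failed (rc=%d), continuing\n", rc);

	return fake_reconnect(a);
}

/* Hard reset: same handling of the error, plus buffer free/realloc. */
static int fake_do_hard_reset(struct fake_adapter *a)
{
	(void)fake_set_link_down(a);   /* failure is ignored here as well */
	a->buffers_reallocated++;      /* the extra Linux-side work */

	return fake_reconnect(a);
}
```

In this model both paths end up reconnected after a failed set link
down; the only difference is the buffer reallocation on the hard reset
path.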

> to manage. Ultimately do_reset can do anything I am afraid, and do_hard_reset
> can be removed completely or merged into do_reset.
> 
> >
> > If we are having a communication problem with the VIOS, what is
> > the point of freeing and reallocating Linux kernel buffers? Beside
> > being inefficient, it would expose us to even more errors during
> > reset under heavy workloads?
> 
> No real customer runs the system under that heavy load created by
> HTX stress test, which can tear down any working system.

We need to talk to capacity planning and the test architects about
that, but all I want to know is what a hard reset would do differently
to fix this communication error with the VIOS.

Sukadev