Message-ID: <CAOhMmr6c2M68fj0Mec=vhHr7krYkB8Bih-koC9o9F=0CJOCQgQ@mail.gmail.com>
Date:   Wed, 23 Dec 2020 02:21:09 -0600
From:   Lijun Pan <lijunp213@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Lijun Pan <ljp@...ux.ibm.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net] ibmvnic: continue fatal error reset after passive init

On Tue, Dec 22, 2020 at 8:48 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Sat, 19 Dec 2020 15:40:34 -0600 Lijun Pan wrote:
> > Commit f9c6cea0b385 ("ibmvnic: Skip fatal error reset after passive init")
> > says "If the passive
> > CRQ initialization occurs before the FATAL reset task is processed,
> > the FATAL error reset task would try to access a CRQ message queue
> > that was freed, causing an oops. The problem may be most likely to
> > occur during DLPAR add vNIC with a non-default MTU, because the DLPAR
> > process will automatically issue a change MTU request.
> > Fix this by not processing fatal error reset if CRQ is passively
> > initialized after client-driven CRQ initialization fails."
> >
> > Even with this commit, we still see similar kernel crashes. In order
> > to completely solve this problem, we'd better continue the fatal error
> > reset, capture the kernel crash, and try to fix it from that end.
>
> This basically reverts the quoted fix. Does the quoted fix make things
> worse? Otherwise we should leave the code be until proper fix is found.

Yes, I think the quoted commit makes things worse. It skips the reset in that
specific condition, but that does not fix the problem it claims to fix.
The effective fixes are upstream commits 0e435befaea4 and a0faaa27c716. So I
think reverting to the original "else" condition is the right thing to do.
