Message-ID: <CACsaVZJYcGv26gM=Kcm2R3-A7Yqc9Jc1cd8TchA0POpxx1NHow@mail.gmail.com>
Date: Wed, 4 Oct 2023 20:55:34 -0700
From: Kyle Sanderson <kyle.leet@...il.com>
To: Giovanni Cabiddu <giovanni.cabiddu@...el.com>
Cc: Linux-Kernel <linux-kernel@...r.kernel.org>, qat-linux@...el.com,
Greg KH <gregkh@...uxfoundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Linux Crypto Mailing List <linux-crypto@...r.kernel.org>
Subject: Re: Linux 6.1.52 regression: Intel QAT kernel panic (memory corruption)

On Thu, Sep 14, 2023 at 10:55 PM Giovanni Cabiddu
<giovanni.cabiddu@...el.com> wrote:
>
> On Thu, Sep 14, 2023 at 10:27:22PM -0700, Kyle Sanderson wrote:
> > Hello Intel QAT Maintainers,
> >
> > It looks like QAT has regressed again. The present symptom is just
> > straight up memory corruption. I was running Canonical 6.1.0-1017-oem
> > and it doesn't happen, with 6.1.0-1020-oem and 6.1.0-1021-oem it does.
> > I don't know what these map to upstream, however with NixOS installed
> > the same corruption failure occurs on 6.1.52.
> This is probably related to [1].
> Versions from 6.1.39 to 6.1.52 are affected. Fixed in v6.1.53.
>
> [1] https://www.spinics.net/lists/stable/msg678947.html
>
> Regards,
>
> --
> Giovanni

Thank you Giovanni - that appears to have been it. Ubuntu
6.1.0-1023-oem (v6.1.53) no longer reproduces the issue.

K.

On Thu, Sep 14, 2023 at 10:55 PM Giovanni Cabiddu
<giovanni.cabiddu@...el.com> wrote:
>
> On Thu, Sep 14, 2023 at 10:27:22PM -0700, Kyle Sanderson wrote:
> > Hello Intel QAT Maintainers,
> >
> > It looks like QAT has regressed again. The present symptom is just
> > straight up memory corruption. I was running Canonical 6.1.0-1017-oem
> > and it doesn't happen, with 6.1.0-1020-oem and 6.1.0-1021-oem it does.
> > I don't know what these map to upstream, however with NixOS installed
> > the same corruption failure occurs on 6.1.52. The stack traces give
> > illegal instructions and all kinds of badness across all modules when
> > the device is simply present on the system, resulting in a hung
> > system, or a multitude of processes crashing and the system failing to
> > start. Disabling the device in the system BIOS results in a working
> > system, and no extreme corruption. kmem_cache_alloc_node is the common
> > fixture in the traces (I don't have a serial line), but I suspect
> > that's not where the problem is. The corruption this time happens
> > without block crypto being involved, and simply booting the installer
> > from a USB stick.
> This is probably related to [1].
> Versions from 6.1.39 to 6.1.52 are affected. Fixed in v6.1.53.
>
> [1] https://www.spinics.net/lists/stable/msg678947.html
>
> Regards,
>
> --
> Giovanni