Message-ID: <1c057afa-92df-ee3c-5978-3731d3db9345@kernel.dk>
Date:   Sun, 14 Aug 2022 19:04:22 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andres Freund <andres@...razel.de>,
        James Bottomley <James.Bottomley@...senpartnership.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>
Cc:     Guenter Roeck <linux@...ck-us.net>, linux-kernel@...r.kernel.org,
        Greg KH <gregkh@...uxfoundation.org>
Subject: Re: upstream kernel crashes

On 8/14/22 4:47 PM, Linus Torvalds wrote:
> On Sun, Aug 14, 2022 at 3:37 PM Andres Freund <andres@...razel.de> wrote:
>>
>> That range had different symptoms, I think (networking not working, but not
>> oopsing). I hit similar issues when I tried to reproduce the issue
>> interactively, to produce more details, and unwisely did git pull instead of
>> checking out the precise revision, ending up with aea23e7c464b. That's when
>> symptoms look similar to the above.  So it'd be 69dac8e431af..aea23e7c464b
>> that I'd be more suspicious of in the context of this thread.
> 
> Ok.
> 
>> Which would make me look at the following first:
>> e140f731f980 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
>> abe7a481aac9 Merge tag 'block-6.0-2022-08-12' of git://git.kernel.dk/linux-block
>> 1da8cf961bb1 Merge tag 'io_uring-6.0-2022-08-13' of git://git.kernel.dk/linux-block
> 
> All right, that makes sense. The reported oopses seem to be about block
> requests. Some of them were scsi in particular.
> 
> Let's bring in Jens and the SCSI people. Maybe that host reference
> counting? There's quite a lot of "move freeing around" in that late
> scsi pull, even if it was touted as "mostly small bug fixes and
> trivial updates".
> 
> Here's the two threads:
> 
>   https://lore.kernel.org/all/20220814212610.GA3690074@roeck-us.net/
>   https://lore.kernel.org/all/20220814043906.xkmhmnp23bqjzz4s@awork3.anarazel.de/
> 
> but I guess I'll do an -rc1 regardless of this, because I need to
> close the merge window.

I took a quick look and added more SCSI bits to my vm images, but
haven't been able to hit it. But if this is happening after the above
mentioned merges, does seem like it's more SCSI related. The block side
is only really an error handling fix on that front, the rest is just
nvme. Seems unlikely that'd be the culprit.

Sounds like Andres is already bisecting this, so I guess we'll be wiser
soon enough.

-- 
Jens Axboe
