Date: Wed, 24 Jan 2024 18:51:28 +0100
From: Thorsten Leemhuis <regressions@...mhuis.info>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Dave Airlie <airlied@...il.com>, Daniel Vetter <daniel.vetter@...ll.ch>,
 dri-devel <dri-devel@...ts.freedesktop.org>,
 LKML <linux-kernel@...r.kernel.org>,
 Linux regressions mailing list <regressions@...ts.linux.dev>,
 Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>,
 Mario Limonciello <mario.limonciello@....com>,
 Vlastimil Babka <vbabka@...e.cz>, Donald Carr <sirspudd@...il.com>
Subject: Re: [git pull] drm for 6.8

Linus, if you have a minute, I'd really like to know...

On 24.01.24 17:41, Mario Limonciello wrote:
> On 1/24/2024 10:24, Vlastimil Babka wrote:
>> On 1/24/24 16:31, Donald Carr wrote:
>>> On Wed, Jan 24, 2024 at 7:06 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>>>> When testing the rc1 on my openSUSE Tumbleweed desktop, I've started
>>>> experiencing "frozen desktop" (KDE/Wayland) issues. The symptoms are that
>>>> everything freezes including mouse cursor. After a while it either resolves,
>>>> or e.g. firefox crashes (if it was actively used when it froze) or it's
>>>> frozen for too long and I reboot with alt-sysrq-b. When it's frozen I can
>>>> still ssh to the machine, and there's nothing happening in dmesg.
>>>> The machine is based on Amd Ryzen 7 2700 and Radeon RX7600.
>>> [...]
>>> I am experiencing the exact same symptoms;
>>
>> Big thanks to Thorsten who suggested I look at the following:
>>
>> https://lore.kernel.org/all/20240123021155.2775-1-mario.limonciello@amd.com/
>> https://lore.kernel.org/all/CABXGCsM2VLs489CH-vF-1539-s3in37=bwuOWtoeeE+q26zE+Q@mail.gmail.com/
>>
>> Instead of further bisection I've applied Mario's revert from the first link
>> on top of 6.8-rc1 and the issue seems gone for me now.
> 
> Thanks for confirming.  I don't think we should jump right to the revert
> right now.  I posted it in case that is the direction we need to go
> (simple git revert didn't work due to contextual changes).
> 
> Let's give the folks who work on GPU scheduler some time to understand
> the failure and see if they can fix it.

...what you think about this and other situations like it. Given that
we have

* two affected people in this thread
* one earlier thread about it
* the machine that made Mario write the patch
* and I have someone in #fedora-kernel that likely is affected as well

it seems that this is not some corner case very few people run into.
Hence I tend to say that this should be dealt with rather sooner than
later. Maybe before rc2? Or is this asking too much?

From my point of view, the thing is that each such problem might
discourage testers from testing again or lead to thoughts like "I only
start testing after -rc4". Not to mention that other people will try to
bisect the problem like Vlastimil did, which will cost them quite some
time and effort -- only to find out that we knew about the problem
already and did not fix it quickly. That is discouraging for them as
well and thus bad for field testing, I'd assume.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
