Date: Sun, 5 May 2024 07:37:09 -0500
From: Mario Limonciello <superm1@...il.com>
To: Linux regressions mailing list <regressions@...ts.linux.dev>,
 Micha Albert <kernel@...ha.zone>
Cc: "stable@...r.kernel.org" <stable@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 Mario Limonciello <mario.limonciello@....com>
Subject: Re: [REGRESSION] Thunderbolt Host Reset Change Causes eGPU
 Disconnection from 6.8.7=>6.8.8



On 5/4/24 23:59, Linux regression tracking (Thorsten Leemhuis) wrote:
> [CCing Mario, who asked for the two suspected commits to be backported]
> 
> On 05.05.24 03:12, Micha Albert wrote:
>>
>>      I have an AMD Radeon 6600 XT GPU in a cheap Thunderbolt eGPU board.
>> In 6.8.7, this works as expected, and my Plymouth screen (including the
>> LUKS password prompt) shows on my 2 monitors connected to the GPU as
>> well as my main laptop screen. Upon entering the password, I'm put into
>> userspace as expected. However, upon upgrading to 6.8.8, I will be
>> greeted with the regular password prompt, but after entering my password
>> and waiting for it to be accepted, my eGPU will reset and not function.
>> I can tell that it resets since I can hear the click of my ATX power
>> supply turning off and on again, and the status LED of the eGPU board
>> goes from green to blue and back to green, all in less than a second.
>>
>>     I talked to a friend, and we found out that the kernel parameter
>> thunderbolt.host_reset=false fixes the issue. He also thinks that
>> commits cc4c94 (59a54c upstream) and 11371c (ec8162 upstream) look
>> suspicious. I've attached the output of dmesg when the error was
>> occurring, since I'm still able to use my laptop normally when this
>> happens, just not with my eGPU and its connected displays.
> 
> Thx for the report. Could you please test if 6.9-rc6 (or a later
> snapshot; or -rc7, which should be out in about ~18 hours) is affected
> as well? That would be really important to know.
> 
> It would also be great if you could try reverting the two patches you
> mentioned and see if they are really what's causing this. There iirc are
> two more; maybe you might need to revert some or all of them in the
> order they were applied.

There are two other things that I think would be good to check in order
to understand this issue.

1) Is it related to trusted devices handling?

You can try applying the following patch on top of either 6.8.y or 6.9-rc:

https://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu.git/commit/?h=iommu/fixes&id=0f91d0795741c12cee200667648669a91b568735

2) Is it because you have amdgpu in your initramfs but not thunderbolt?

If so, there's very likely an ordering issue. In your log the GPU is
posted about 28 seconds before the thunderbolt bus type is registered:

[    2.325788] [drm] GPU posting now...
[   30.360701] ACPI: bus type thunderbolt registered
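
The ordering visible in those two lines can also be checked mechanically against a full dmesg capture. The following is only a minimal sketch: the helper names `first_seen` and `gpu_posted_before_thunderbolt` are made up for illustration, and the only inputs it relies on are the two message strings quoted above.

```python
import re

# Matches the leading "[  seconds.micros]" timestamp of a dmesg line.
_TS = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def first_seen(dmesg_text, needle):
    """Return the timestamp (in seconds) of the first line containing needle, or None."""
    for line in dmesg_text.splitlines():
        m = _TS.match(line.strip())
        if m and needle in m.group(2):
            return float(m.group(1))
    return None


def gpu_posted_before_thunderbolt(dmesg_text):
    """True if the GPU was posted before the thunderbolt bus type registered.

    Returns None when either message is missing from the log.
    """
    gpu = first_seen(dmesg_text, "[drm] GPU posting now")
    tbt = first_seen(dmesg_text, "bus type thunderbolt registered")
    if gpu is None or tbt is None:
        return None
    return gpu < tbt


# The two lines quoted from the reporter's dmesg output.
sample = """\
[    2.325788] [drm] GPU posting now...
[   30.360701] ACPI: bus type thunderbolt registered
"""

print(gpu_posted_before_thunderbolt(sample))
```

For a real log, the text could come from a saved `dmesg` output file read into a string; a `True` result would be consistent with the ordering suspicion, but does not prove it.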

Can you remove amdgpu from your initramfs, so that it only starts up
after you pivot to the real rootfs?  Does the problem still happen?
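
To see which of the two drivers is actually in the initramfs, the file listing printed by a tool such as `lsinitrd` (dracut) or `lsinitramfs` (initramfs-tools) can be searched for the module files. Which tool applies depends on the distribution. The sketch below assumes the listing has already been captured as text; the function name `modules_in_listing` and the sample paths are hypothetical, and a driver built into the kernel rather than shipped as a module would not show up this way.

```python
import re


def modules_in_listing(listing, names):
    """Map each module name to whether a matching .ko file appears in the listing."""
    found = {}
    for name in names:
        # Modules may be compressed: foo.ko, foo.ko.xz, foo.ko.zst, foo.ko.gz
        pattern = re.compile(
            r"(^|/)" + re.escape(name) + r"\.ko(\.\w+)?\s*$",
            re.MULTILINE,
        )
        found[name] = bool(pattern.search(listing))
    return found


# Invented example listing, one path per line (not taken from the report).
sample_listing = """\
usr/lib/modules/6.8.8/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.xz
usr/lib/modules/6.8.8/kernel/drivers/nvme/host/nvme.ko.xz
"""

print(modules_in_listing(sample_listing, ["amdgpu", "thunderbolt"]))
```

A result with amdgpu present and thunderbolt absent would match the situation described in question 2.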

> 
> Ciao, Thorsten
> 
> P.s.: To be sure the issue doesn't fall through the cracks unnoticed,
> I'm adding it to regzbot, the Linux kernel regression tracking bot:
> 
> #regzbot ^introduced v6.8.7..v6.8.8
> #regzbot title thunderbolt: eGPU disconnected during boot
> 
