Message-ID:
 <PAWP195MB2417F79B41A8DD5383F041E9F2B5A@PAWP195MB2417.EURP195.PROD.OUTLOOK.COM>
Date: Tue, 23 Dec 2025 16:31:47 +0000
From: Matthew Stone <stoneygit@...mail.co.uk>
To: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Thermal shutdown of my PC seems to be AMD GPU related to kernel
 6.18.2

Hi

Apologies if this is a bad bug report; I've only just been reading the guidelines. Please let me know if you want me to run any tests or diagnostics, or if there is anything else you need. I'm happy to support! Linux Rocks! Great work!

I have bypassed the issue by downgrading the kernel to 6.17.6 (which I know was good on my machine). I'm quite new to this, so I have since also set up the LTS kernel as a boot option in GRUB.

System:
                 -`                     matt@...tArch
                .o+`                    -------------
               `ooo/                    OS: Arch Linux x86_64
              `+oooo:                   Host: MS-7C37 (3.0)
             `+oooooo:                  Kernel: Linux 6.17.6-arch1-1
             -+oooooo+:                 Uptime: 56 mins
           `/:-:++oooo+:                Packages: 1330 (pacman), 41 (flatpak), 12 (snap)
          `/++++/+++++++:               Shell: bash 5.3.9
         `/++++++++++++++:              Display (G274QPF-QD): 2560x1440 in 27", 144 Hz [External, HDR]
        `/+++ooooooooooooo/`            DE: KDE Plasma 6.5.4
       ./ooosssso++osssssso+`           WM: KWin (Wayland)
      .oossssso-````/ossssss+`          WM Theme: Oxygen
     -osssssso.      :ssssssso.         Theme: Default# (KvArcDark) [Qt], Breeze-Dark [GTK2], Breeze [GTK3/4]
    :osssssss/        osssso+++.        Icons: breeze-dark [Qt], breeze-dark [GTK2/3/4]
   /ossssssss/        +ssssooo/-        Font: Cantarell (10pt) [Qt], Cantarell (10pt) [GTK2/3/4]
 `/ossssso+/:-        -:/+osssso+-      Cursor: breeze (24px)
`+sso+:-`                 `.-/+oso:     Terminal: konsole 25.12.0
`++:.                           `-/+/    CPU: AMD Ryzen 9 3900X (24) @ 4.15 GHz
.`                                 `/    GPU: AMD Radeon RX 9070 XT [Discrete]
                                        Memory: 5.38 GiB / 31.28 GiB (17%)
                                        Swap: 0 B / 4.00 GiB (0%)
                                        Disk (/): 1.03 TiB / 1.79 TiB (58%) - ext4
                                        Local IP (enp39s0): 192.168.1.179/24
                                        Locale: en_GB.UTF-8


Issue:

On Friday evening I was running some local AI models using Ollama (qwen3:30b). After about 2 minutes the desktop and system froze. The same thing kept happening after a few restarts. I did run pacman -Syu and try again, so I'm not sure whether an updated kernel was loaded at that point, but the issue persisted anyway. Not wanting to do a deep dive on Friday, I left it for the evening and planned to deal with it later.

On Saturday I decided to play some Terminator Resistance, and after 20 to 30 minutes the system suddenly shut down. When I felt the desktop tower it was extremely hot, which led me to believe I'd experienced a thermal shutdown. Once the system had cooled I started it back up and concluded that the local AI issue and the thermal shutdown were likely related. So I loaded the local AI model and again had the system freeze after about 2 to 3 minutes. Once I had booted back up I ran journalctl -b.

Not resolved but bypassed:

The errors I found in journalctl -b related to amdgpu: ring comp_1.1.0. When I googled this I found forum posts about kernel 6.5 with the same error message, and at that point I decided to downgrade the kernel.
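
For anyone wanting to pull the same kind of lines out of their own journal, here is a minimal sketch in Python. It assumes the kernel messages of interest mention both "amdgpu" and a ring name such as comp_1.1.0; the exact wording of the message is not quoted in this report and may differ between kernel versions, so the pattern is deliberately loose. The function names are hypothetical. Note that --boot=0 reads the current boot, while --boot=-1 reads the previous one, which may be the more useful choice after a freeze and reboot.

```python
import re
import subprocess

# Assumption: relevant lines contain "amdgpu" followed later by "ring <name>".
# The real message text is not reproduced here, so this is a loose match.
RING_PATTERN = re.compile(r"amdgpu.*?\bring\s+(\S+)", re.IGNORECASE)


def find_amdgpu_ring_lines(lines):
    """Return (ring_name, line) pairs for lines that mention an amdgpu ring."""
    hits = []
    for line in lines:
        match = RING_PATTERN.search(line)
        if match:
            hits.append((match.group(1), line.rstrip("\n")))
    return hits


def read_kernel_log(boot="0"):
    """Read kernel messages for one boot via journalctl.

    boot="0" is the current boot, boot="-1" the previous one.
    Returns an empty list if journalctl is not available.
    """
    try:
        result = subprocess.run(
            ["journalctl", "-k", f"--boot={boot}", "--no-pager"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return []
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Print every kernel line from the current boot that names an amdgpu ring.
    for ring, line in find_amdgpu_ring_lines(read_kernel_log("0")):
        print(f"[{ring}] {line}")
```

Attaching the matching lines (or the full output of journalctl -k for the affected boot) to a report like this one makes it easier for others to compare against known issues.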

After downgrading the kernel to 6.17.6 I tested the AI model, which worked with no problems and continued to work (temps stable in btop). I then loaded Terminator Resistance and made sure the Steam overlay was showing temps. After gaming for 1.5 hours at around 250 fps, all temps were stable and there was no thermal shutdown.

Observations:

I did notice that when I ran the local AI model on kernel 6.18.2 after the thermal shutdown, it was pulling around 350 watts, although according to btop the temps were actually OK. After downgrading to 6.17.6, the power the local AI was pulling averaged around 140 watts and spiked once to around 220. (My GPU is designed to pull this amount of power and I have a 1000 watt PSU, although I can't rule out that my PSU could be on its way out.)
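
To compare power draw and temperature between kernels without relying on btop, the values can be read straight from the hwmon sysfs files. The sketch below is a minimal Python example under some assumptions: that the amdgpu driver registers a hwmon device whose name file contains "amdgpu", and that power is exposed as power1_average or power1_input (which of the two exists varies by GPU generation). The hwmon ABI reports power in microwatts and temperature in millidegrees Celsius. The function names are hypothetical.

```python
from pathlib import Path


def read_gpu_sensors(hwmon_dir):
    """Return power (watts) and temperature (deg C) from one hwmon directory.

    A value is None when the corresponding file is missing or unreadable.
    """
    hwmon = Path(hwmon_dir)

    def read_int(name):
        try:
            return int((hwmon / name).read_text().strip())
        except (OSError, ValueError):
            return None

    # Assumption: power lives in power1_average, else power1_input (microwatts).
    power_uw = read_int("power1_average")
    if power_uw is None:
        power_uw = read_int("power1_input")
    temp_mc = read_int("temp1_input")  # millidegrees Celsius

    return {
        "power_w": None if power_uw is None else power_uw / 1_000_000,
        "temp_c": None if temp_mc is None else temp_mc / 1000,
    }


def find_amdgpu_hwmon(root="/sys/class/hwmon"):
    """Return hwmon directories whose 'name' file reads 'amdgpu'."""
    found = []
    for candidate in sorted(Path(root).glob("hwmon*")):
        try:
            if (candidate / "name").read_text().strip() == "amdgpu":
                found.append(candidate)
        except OSError:
            continue
    return found


if __name__ == "__main__":
    # Print one reading per amdgpu hwmon device found on this machine.
    for directory in find_amdgpu_hwmon():
        print(directory, read_gpu_sensors(directory))
```

Running this in a loop on both kernels while the same workload is active would give numbers that can be compared directly, such as the 350 watt and 140 watt figures above.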

I also posted this on Facebook to the Linux Fan Page, and another Arch user had experienced the same issue; however, they dealt with it by setting new manual fan control profiles for their GPU. Personally I would rather leave mine on Auto for now if possible.

Conclusion:

I'm more than happy to reload the kernel and do any testing, or download any logs or anything else that might be needed. For now, though, the issue for me at least appears to be related to the later kernels. I'm not a software dev, so I'll hopefully leave this with people who know what they're doing.

Cheers

Matt
