Date:   Tue, 19 Dec 2017 05:47:22 +0000
From:   Jordan Henderson <jhenderson@...group.org>
To:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: PROBLEM: Possible bug in AMDGPU DC code?


Hello,

I have not posted to LKML before, so I apologize if this is the wrong place for this message.

I purchased the recently released HP Envy x360 laptop, which has a Ryzen 2500U APU with a Vega 10 GPU. After setting up Slackware on the laptop, I compiled kernel 4.15-rc2 with the AMDGPU DC code enabled to test its current functionality. The result is that, most of the time, the boot process hangs at

"Switching to amdgpudrmfb from EFI VGA"

Very rarely, the boot succeeds and everything seems to run smoothly. Adding "nomodeset" to the kernel parameters makes the boot succeed every time, at the cost of amdgpu not working correctly, since it requires modesetting.

I have also tried the same process on Ubuntu 17.10, and with kernels 4.15-rc3 and 4.15-rc4, with the same results. The only way I was able to capture relevant system output was to blacklist amdgpu and then modprobe it once I was in my desktop environment. That promptly froze the system, but it also revealed some information about an MCE hardware error. Unfortunately, mcelog does not seem to support Ryzen yet, so I can't retrieve anything useful that way. However, /var/log/syslog did cough up a little more, specifically:


Dec 19 04:23:44 darkstar kernel: [ 1139.605187] amdgpu 0000:03:00.0: [mmhub] VMC page fault (src_id:0 ring:153 vm_id:0 pas_id:0)
Dec 19 04:23:44 darkstar kernel: [ 1139.605191] amdgpu 0000:03:00.0:   at page 0x0000000000000000 from 18
Dec 19 04:23:44 darkstar kernel: [ 1139.605193] amdgpu 0000:03:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x00000000
Dec 19 04:23:44 darkstar kernel: [ 1139.605206] [Hardware Error]: Deferred error, no action required.
Dec 19 04:23:44 darkstar kernel: [ 1139.605212] [Hardware Error]: CPU:0 (17:11:0) MC20_STATUS[-|-|MiscV|-|AddrV|Deferred|-|SyndV|-|UECC]: 0x9c2030000001085b
Dec 19 04:23:44 darkstar kernel: [ 1139.605218] [Hardware Error]: Error Addr: 0x00007ffcffffff00
Dec 19 04:23:44 darkstar kernel: [ 1139.605220] [Hardware Error]: IPID: 0x0000002e00000000, Syndrome: 0x000000005b240205
Dec 19 04:23:44 darkstar kernel: [ 1139.605224] [Hardware Error]: Coherent Slave Extended Error Code: 1
Dec 19 04:23:44 darkstar kernel: [ 1139.605225] [Hardware Error]: Coherent Slave Error: Address violation.
Dec 19 04:23:44 darkstar kernel: [ 1139.605228] [Hardware Error]: cache level: L3/GEN, mem/io: IO, mem-tx: IRD, part-proc: SRC (no timeout)

These messages at least appear to be related.
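
As a cross-check on the MC20_STATUS value in the log above, here is a minimal Python sketch that decodes the flag bits by hand. The bit positions are my assumption, taken from the MCI_STATUS_* definitions in the kernel's arch/x86/include/asm/mce.h and the way the AMD MCE decoder prints them; the function name decode_mca_status is just a hypothetical helper, not anything from the kernel or mcelog.

```python
# Bit positions are assumed from the Linux kernel's MCI_STATUS_* definitions
# (arch/x86/include/asm/mce.h); verify them against your own kernel tree.
STATUS_BITS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60,
    "MiscV": 59, "AddrV": 58, "PCC": 57,
    "SyndV": 53, "Deferred": 44, "Poison": 43,
}


def decode_mca_status(status):
    """Split a 64-bit MCA status value into named flags (hypothetical helper)."""
    flags = [name for name, bit in STATUS_BITS.items() if (status >> bit) & 1]

    # Bits 46:45 are treated together, mirroring how the kernel decoder prints
    # "CECC" or "UECC": 0 means neither, 2 means CECC, anything else UECC.
    ecc_bits = (status >> 45) & 0x3
    if ecc_bits == 0:
        ecc = None
    elif ecc_bits == 2:
        ecc = "CECC"
    else:
        ecc = "UECC"

    # Extended error code, assumed to live in bits 21:16 on this family.
    xec = (status >> 16) & 0x3F
    return {"flags": flags, "ecc": ecc, "xec": xec}


if __name__ == "__main__":
    # Value copied from the MC20_STATUS line in the syslog excerpt.
    print(decode_mca_status(0x9C2030000001085B))
```

Under those assumptions, the decoded flags (MiscV, AddrV, Deferred, SyndV, UECC) and the extended error code of 1 agree with what the kernel printed, so the log line itself looks internally consistent.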

As I have not heard of many other issues with the AMDGPU DC code, I believe this problem is specific to this particular laptop/BIOS/hardware configuration. Using the modprobe method, I have attached everything I was able to capture up to the system hang that I believe is relevant or that the bug reporting FAQ suggests; please let me know if more information would be useful.
    
Download attachment "cpuinfo" of type "application/octet-stream" (10704 bytes)

Download attachment "dmesg" of type "application/octet-stream" (59769 bytes)

Download attachment "iomem" of type "application/octet-stream" (2803 bytes)

Download attachment "ioports" of type "application/octet-stream" (1379 bytes)

Download attachment "lspci" of type "application/octet-stream" (42444 bytes)

Download attachment "messages" of type "application/octet-stream" (85485 bytes)

Download attachment "modules" of type "application/octet-stream" (6093 bytes)

Download attachment "scsi" of type "application/octet-stream" (336 bytes)

Download attachment "syslog" of type "application/octet-stream" (9432 bytes)

Download attachment "ver_linux" of type "application/octet-stream" (2071 bytes)
