Date:   Fri, 12 Aug 2022 23:24:11 +0200
From:   Baltazár Radics <baltazar.radics@...il.com>
To:     intel-gfx@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Intel gpu memory corruption

Hello!

My laptop (ThinkPad T460) seems to have a memory corruption issue that
only occurs when the GPU is in use (it has `Intel Corporation Skylake
GT2 [HD Graphics 520] (rev 07)` as reported by lspci).

I haven't been able to reproduce the corruption with standard memory
testing utilities like Lenovo's built-in hardware diagnostic tool,
memtest86+, or even the user-space program memtester when it's the only
thing running.

However, running memtester alongside vkmark, for example, reproduces
it quite consistently. Within a given instance of memtester it is
always a single address that fails, and looking into
/proc/[pid]/pagemap revealed that it seemingly is always the same
physical address across instances.
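
For anyone who wants to repeat the pagemap lookup, here is a minimal
sketch of the translation, not the exact script I used. It assumes
4 KiB pages and the documented pagemap layout (one 64-bit entry per
virtual page, PFN in bits 0-54, "present" flag in bit 63). The names
`entry_to_phys` and `virt_to_phys` are made up for illustration. It
has to run as root, since recent kernels report the PFN as zero to
unprivileged readers.

```python
import struct

PAGE_SIZE = 4096          # assumption: 4 KiB pages, as on x86-64
PFN_MASK = (1 << 55) - 1  # bits 0-54 hold the page frame number
PRESENT_BIT = 1 << 63     # bit 63: page is present in RAM


def entry_to_phys(entry, vaddr):
    """Turn one pagemap entry into a physical address, or None."""
    if not entry & PRESENT_BIT:
        return None  # swapped out or never touched
    pfn = entry & PFN_MASK
    if pfn == 0:
        return None  # PFN hidden (not root); frame 0 is not user memory
    return pfn * PAGE_SIZE + vaddr % PAGE_SIZE


def virt_to_phys(pid, vaddr):
    """Look up the physical address behind vaddr in process pid."""
    with open(f"/proc/{pid}/pagemap", "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)  # one 8-byte entry per page
        data = f.read(8)
    if len(data) != 8:
        return None
    (entry,) = struct.unpack("=Q", data)  # native byte order
    return entry_to_phys(entry, vaddr)
```

Calling `hex(virt_to_phys(pid, vaddr))` with the failing virtual
address that memtester prints gives the physical address to compare
between runs.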

With this information, I think I managed to stop it from happening by
appending `memmap=4K$0x1F9D7C000` to my kernel command line to stop
that address from being allocated. Since then I haven't been able to
catch it with memtester, but I did have a crash that resembled the ones
I had earlier. Many processes segfaulted and I had some `Bad swap file
entry` errors in my dmesg.
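
The memmap argument above is just the failing physical address rounded
down to its page boundary, with a one-page length. A small sketch of
that arithmetic, assuming 4 KiB pages (`memmap_reserve_arg` is a
hypothetical helper name, not an existing tool):

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def memmap_reserve_arg(phys_addr, page_size=PAGE_SIZE):
    """Build a memmap=<size>$<start> argument reserving one page."""
    base = phys_addr - phys_addr % page_size  # round down to page start
    return f"memmap={page_size // 1024}K$0x{base:X}"


# Any address inside the bad page yields the same argument.
print(memmap_reserve_arg(0x1F9D7C010))
```

Depending on the bootloader, the `$` may need escaping in the config
file so that it reaches the kernel intact.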

I haven't been able to do testing on other OSes yet, but since none of
the regular memtests have found any issues, I'm fairly certain this is
not a hardware issue with my RAM. It could still be a hardware issue
with the GPU itself, but for now I'm guessing this is a GPU driver bug.

Is there anything else I can test to confirm that this is i915's fault,
and if so, anything I can do to help track down the bug?

Thanks!
