Message-ID: <20250322122351.3268-1-spasswolf@web.de>
Date: Sat, 22 Mar 2025 13:23:48 +0100
From: Bert Karwatzki <spasswolf@....de>
To: Balbir Singh <balbirs@...dia.com>
Cc: Bert Karwatzki <spasswolf@....de>,
	Ingo Molnar <mingo@...nel.org>,
	Kees Cook <kees@...nel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andy Lutomirski <luto@...nel.org>,
	Christian König <christian.koenig@....com>,
	Alex Deucher <alexander.deucher@....com>,
	linux-kernel@...r.kernel.org,
	amd-gfx@...ts.freedesktop.org
Subject: RE: commit 7ffb791423c7 breaks steam game

The problem occurs in this part of ttm_tt_populate(). In the nokaslr case
the loop is entered and runs repeatedly because ttm_dma32_pages_allocated exceeds
ttm_dma32_pages_limit, which leads to lots of calls to ttm_global_swapout().

if (!strcmp(get_current()->comm, "stellaris"))
	printk(KERN_INFO "%s: ttm_pages_allocated=0x%llx ttm_pages_limit=0x%lx ttm_dma32_pages_allocated=0x%llx ttm_dma32_pages_limit=0x%lx\n",
			__func__, ttm_pages_allocated.counter, ttm_pages_limit, ttm_dma32_pages_allocated.counter, ttm_dma32_pages_limit);
while (atomic_long_read(&ttm_pages_allocated) > ttm_pages_limit ||
       atomic_long_read(&ttm_dma32_pages_allocated) >
       ttm_dma32_pages_limit) {

	if (!strcmp(get_current()->comm, "stellaris"))
		printk(KERN_INFO "%s: count=%d ttm_pages_allocated=0x%llx ttm_pages_limit=0x%lx ttm_dma32_pages_allocated=0x%llx ttm_dma32_pages_limit=0x%lx\n",
				__func__, count++, ttm_pages_allocated.counter, ttm_pages_limit, ttm_dma32_pages_allocated.counter, ttm_dma32_pages_limit);
	ret = ttm_global_swapout(ctx, GFP_KERNEL);
	if (ret == 0)
		break;
	if (ret < 0)
		goto error;
}

Without nokaslr, ttm_dma32_pages_allocated is 0 because use_dma32 == false
in that case.
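
To make the loop behaviour above easier to reason about, here is a minimal
userspace sketch of the loop condition. It is not kernel code: struct
ttm_counters, over_limit() and count_swapouts() are hypothetical names, and
the assumption that each swapout call releases a fixed number of pages from
both counters is a simplification made only for illustration.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the four values printed by the debug printk()s. */
struct ttm_counters {
	long pages_allocated;
	long pages_limit;
	long dma32_pages_allocated;
	long dma32_pages_limit;
};

/* Same condition as the while loop in ttm_tt_populate() shown above. */
static bool over_limit(const struct ttm_counters *c)
{
	return c->pages_allocated > c->pages_limit ||
	       c->dma32_pages_allocated > c->dma32_pages_limit;
}

/*
 * Count how many swapout calls the loop would make.
 * Assumption: every call frees freed_per_call pages, which stands in for
 * the return value of ttm_global_swapout(); 0 means nothing could be freed.
 */
static long count_swapouts(struct ttm_counters c, long freed_per_call)
{
	long calls = 0;

	while (over_limit(&c)) {
		long ret = freed_per_call;

		calls++;
		if (ret == 0)
			break;
		c.pages_allocated -= ret;
		if (c.dma32_pages_allocated > ret)
			c.dma32_pages_allocated -= ret;
		else
			c.dma32_pages_allocated = 0;
	}
	return calls;
}
```

With dma32_pages_allocated == 0 (use_dma32 == false) and the general limit
not exceeded, the loop body never runs. Once the dma32 counter exceeds its
limit, the loop keeps calling the swapout path until the counter drops back
to the limit.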

So why is use_dma32 enabled with nokaslr? Some more printk()s give this result:

The GPUs:
built-in:
08:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cezanne [Radeon Vega Series / Radeon Vega Mobile Series] (rev c5)
discrete:
03:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 [Radeon RX 6600/6600 XT/6600M] (rev c3)

With nokaslr:
[    1.266517] [    T328] dma_addressing_limited: mask = 0xfffffffffff bus_dma_limit = 0x0 required_mask = 0xfffffffff
[    1.266519] [    T328] dma_addressing_limited: ops = 0000000000000000 use_dma_iommu(dev) = 0
[    1.266520] [    T328] dma_direct_all_ram_mapped: returning true
[    1.266521] [    T328] dma_addressing_limited: returning ret = 0
[    1.266521] [    T328] amdgpu 0000:03:00.0: amdgpu: amdgpu_ttm_init: calling ttm_device_init() with use_dma32 = 0
[    1.266525] [    T328] entering ttm_device_init, use_dma32 = 0
[    1.267115] [    T328] entering ttm_pool_init, use_dma32 = 0

[    3.965669] [    T328] dma_addressing_limited: mask = 0xfffffffffff bus_dma_limit = 0x0 required_mask = 0x3fffffffffff
[    3.965671] [    T328] dma_addressing_limited: returning true
[    3.965672] [    T328] amdgpu 0000:08:00.0: amdgpu: amdgpu_ttm_init: calling ttm_device_init() with use_dma32 = 1
[    3.965674] [    T328] entering ttm_device_init, use_dma32 = 1
[    3.965747] [    T328] entering ttm_pool_init, use_dma32 = 1

Without nokaslr:
[    1.300907] [    T351] dma_addressing_limited: mask = 0xfffffffffff bus_dma_limit = 0x0 required_mask = 0xfffffffff
[    1.300909] [    T351] dma_addressing_limited: ops = 0000000000000000 use_dma_iommu(dev) = 0
[    1.300910] [    T351] dma_direct_all_ram_mapped: returning true
[    1.300910] [    T351] dma_addressing_limited: returning ret = 0
[    1.300911] [    T351] amdgpu 0000:03:00.0: amdgpu: amdgpu_ttm_init: calling ttm_device_init() with use_dma32 = 0
[    1.300915] [    T351] entering ttm_device_init, use_dma32 = 0
[    1.301210] [    T351] entering ttm_pool_init, use_dma32 = 0

[    4.000602] [    T351] dma_addressing_limited: mask = 0xfffffffffff bus_dma_limit = 0x0 required_mask = 0xfffffffffff
[    4.000603] [    T351] dma_addressing_limited: ops = 0000000000000000 use_dma_iommu(dev) = 0
[    4.000604] [    T351] dma_direct_all_ram_mapped: returning true
[    4.000605] [    T351] dma_addressing_limited: returning ret = 0
[    4.000606] [    T351] amdgpu 0000:08:00.0: amdgpu: amdgpu_ttm_init: calling ttm_device_init() with use_dma32 = 0
[    4.000610] [    T351] entering ttm_device_init, use_dma32 = 0
[    4.000687] [    T351] entering ttm_pool_init, use_dma32 = 0

So with nokaslr the required mask for the built-in GPU changes from 0xfffffffffff
to 0x3fffffffffff, which causes dma_addressing_limited() to return true, which in
turn causes ttm_device_init() to be called with use_dma32 = true.
It also shows that nothing changes for the discrete GPU, so the bug does not occur
there.
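
The mask comparison can be reproduced in userspace with the values from the
logs above. This is a minimal sketch, assuming that the early "returning true"
path in dma_addressing_limited() is taken when the smaller non-zero value of
the DMA mask and bus_dma_limit is below the required mask. The helper names
min_not_zero_u64() and mask_check_limited() are hypothetical, and the later
checks (ops, use_dma_iommu(), dma_direct_all_ram_mapped()) are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Smaller of two values, ignoring a value of 0 (meaning "no limit"). */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/*
 * Assumed model of the first check only: the device is treated as
 * addressing-limited when its effective mask is below the required mask.
 */
static bool mask_check_limited(uint64_t mask, uint64_t bus_dma_limit,
			       uint64_t required_mask)
{
	return min_not_zero_u64(mask, bus_dma_limit) < required_mask;
}
```

Feeding in the logged values for the built-in GPU, mask = 0xfffffffffff with
required_mask = 0x3fffffffffff (nokaslr) trips the check, while
required_mask = 0xfffffffffff (without nokaslr) does not. For the discrete GPU,
required_mask = 0xfffffffff is below the mask in both boots.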

I was also able to work around the bug by calling ttm_device_init() with use_dma32=false
from amdgpu_ttm_init() (drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c), but I'm not sure whether
this has unwanted side effects.

int amdgpu_ttm_init(struct amdgpu_device *adev)
{
	uint64_t gtt_size;
	int r;

	mutex_init(&adev->mman.gtt_window_lock);

	dma_set_max_seg_size(adev->dev, UINT_MAX);
	/* No others user of address space so set it to 0 */
	dev_info(adev->dev, "%s: calling ttm_device_init() with use_dma32 = 0 ignoring %d\n", __func__, dma_addressing_limited(adev->dev));
	r = ttm_device_init(&adev->mman.bdev, &amdgpu_bo_driver, adev->dev,
			       adev_to_drm(adev)->anon_inode->i_mapping,
			       adev_to_drm(adev)->vma_offset_manager,
			       adev->need_swiotlb,
			       false /* use_dma32 */);
	if (r) {
		DRM_ERROR("failed initializing buffer object driver(%d).\n", r);
		return r;
	}


Bert Karwatzki
