Message-ID: <12d950ee-4152-4ad6-b93e-7c5b75804b1a@nvidia.com>
Date: Wed, 12 Mar 2025 12:24:15 +1100
From: Balbir Singh <balbirs@...dia.com>
To: Bert Karwatzki <spasswolf@....de>
Cc: Ingo Molnar <mingo@...nel.org>, Kees Cook <kees@...nel.org>,
 Bjorn Helgaas <bhelgaas@...gle.com>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Peter Zijlstra <peterz@...radead.org>, Andy Lutomirski <luto@...nel.org>,
 linux-kernel@...r.kernel.org
Subject: Re: commit 7ffb791423c7 breaks steam game

On 3/12/25 10:09, Bert Karwatzki wrote:
> On Wednesday, 12.03.2025 at 09:10 +1100, Balbir Singh wrote:
>>
>>
>> Thanks, so the issue is specific to the game and running it?
>>
>>>> 3. For some weird reason my kernel does not recognize the nokaslr cmdline
>>>> parameter, so I built a kernel without CONFIG_RANDOMIZE_BASE and this does NOT
>>>> fix the issue.
>>
>> Can you clarify if you're booting with the compressed image bzImage/vmlinuz or
>> with vmlinux?
> 
> I'm booting vmlinuz images (the kernel is compiled via make -j16 bindeb-pkg, which
> gives Debian packages that I install with dpkg).

Thanks

> 
>>>> 4. Most surprisingly removing CONFIG_PCI_P2PDMA also does NOT fix the issue.
>>>>
>>
>>
>>>
>>> I've done more experimenting regarding 4.:
>>> next-20250307 with "CONFIG_RANDOMIZE_BASE=y" AND "CONFIG_PCI_P2PDMA is not set"
>>> works as expected (i.e. no input lag when stellaris is running)
>>>
>>> next-20250307 with "CONFIG_RANDOMIZE_BASE is not set" AND "CONFIG_PCI_P2PDMA is
>>> not set" also shows the buggy behaviour (i.e. input lag when stellaris is
>>> running) (this was the configuration I tested before)
>>
>> This is an interesting experiment. I am beginning to wonder if the system relies
>> on a reduced direct map for the game to work correctly. Can you also check whether
>> CONFIG_RANDOMIZE_MEMORY is disabled in this scenario?
>>
> I'm on it.
> 
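When comparing builds like the ones above, it can help to read the option states straight from each kernel's .config instead of from memory. The following is a minimal sketch, assuming the usual .config conventions ("CONFIG_X=y" for enabled, "# CONFIG_X is not set" for disabled, and no line at all when a symbol's dependencies are unmet). The helper name config_state and the sample text are hypothetical.

```python
def config_state(config_text: str, symbol: str) -> str:
    """Return the value of a Kconfig symbol from .config text.

    Returns the assigned value (e.g. 'y' or 'm'), 'not set' when the
    file carries the explicit '# SYMBOL is not set' marker, and
    'absent' when the symbol does not appear in the file at all.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == f"# {symbol} is not set":
            return "not set"
        if line.startswith(f"{symbol}="):
            return line.split("=", 1)[1]
    return "absent"


# Hypothetical excerpt mirroring the first configuration described above.
SAMPLE = """\
CONFIG_RANDOMIZE_BASE=y
# CONFIG_PCI_P2PDMA is not set
"""

for sym in ("CONFIG_RANDOMIZE_BASE", "CONFIG_PCI_P2PDMA", "CONFIG_RANDOMIZE_MEMORY"):
    print(sym, "->", config_state(SAMPLE, sym))
```

On a real system the text would come from the .config in the build tree or, on Debian-style installs, from /boot/config-$(uname -r).
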
>> Can you please share:
>> 1. dmesg output before and after the changes?
>> 2. Any warnings/errors in journalctl or game-specific log files?
>> 3. lspci -vvv output before and after the changes?
>>
> 
> My dmesg shows a warning, but this seems to be unrelated (it's present in
> both the working and non-working case and also in 6.12.17). I have not bisected
> this yet. I also tried CONFIG_LOCKDEP=y in next-20250307 (with and without the
> revert) and got a warning about a possible deadlock in NetworkManager in both
> cases (also not bisected yet).
> 
> [ 11.241282] [ T1751] WARNING: CPU: 14 PID: 1751 at mm/util.c:674
> __kvmalloc_node_noprof+0xa2/0xb0
> [   11.241289] [   T1751] Modules linked in: snd_seq_dummy snd_hrtimer
> snd_seq_midi snd_seq_midi_event snd_seq rfcomm bnep nls_ascii nls_cp437 vfat fat
> snd_ctl_led snd_hda_codec_realtek snd_hda_codec_generic snd_hda_scodec_component
> btusb snd_hda_codec_hdmi btrtl btintel btbcm btmtk snd_hda_intel snd_usb_audio
> snd_intel_dspcfg uvcvideo snd_acp3x_pdm_dma snd_soc_dmic snd_acp3x_rn
> snd_usbmidi_lib snd_hda_codec videobuf2_vmalloc snd_ump videobuf2_memops uvc
> bluetooth snd_soc_core videobuf2_v4l2 snd_hwdep snd_hda_core snd_rawmidi
> videodev snd_seq_device snd_pcm_oss snd_mixer_oss snd_rn_pci_acp3x snd_pcm
> snd_acp_config videobuf2_common msi_wmi snd_soc_acpi ecdh_generic ecc mc
> sparse_keymap edac_mce_amd snd_timer wmi_bmof k10temp snd snd_pci_acp3x ccp
> soundcore ac battery button joydev hid_sensor_als hid_sensor_gyro_3d
> hid_sensor_prox hid_sensor_accel_3d hid_sensor_magn_3d hid_sensor_trigger
> amd_pmc industrialio_triggered_buffer kfifo_buf industrialio evdev
> hid_sensor_iio_common mt7921e mt7921_common mt792x_lib mt76_connac_lib mt76
> [   11.241354] [   T1751]  mac80211 libarc4 cfg80211 rfkill msr fuse
> nvme_fabrics efi_pstore configfs efivarfs autofs4 ext4 mbcache jbd2 amdgpu
> usbhid amdxcp i2c_algo_bit drm_client_lib drm_ttm_helper ttm drm_exec gpu_sched
> xhci_pci drm_suballoc_helper drm_panel_backlight_quirks xhci_hcd cec
> hid_sensor_hub hid_multitouch mfd_core drm_buddy hid_generic drm_display_helper
> usbcore i2c_hid_acpi psmouse nvme amd_sfh i2c_hid drm_kms_helper serio_raw hid
> nvme_core r8169 i2c_piix4 usb_common i2c_smbus crc16 i2c_designware_platform
> i2c_designware_core
> [   11.241391] [   T1751] CPU: 14 UID: 1000 PID: 1751 Comm: gst-plugin-scan Not
> tainted 6.14.0-rc6-nop2pdma #559
> [   11.241394] [   T1751] Hardware name: Micro-Star International Co., Ltd.
> Alpha 15 B5EEK/MS-158L, BIOS E158LAMS.107 11/10/2021
> [   11.241396] [   T1751] RIP: 0010:__kvmalloc_node_noprof+0xa2/0xb0
> [   11.241400] [   T1751] Code: 00 49 b9 63 01 00 00 00 00 00 80 68 00 04 00 00
> 4c 23 0d 79 0d ea 00 48 01 d1 e8 c9 af 03 00 48 83 c4 18 eb 9a 80 e7 20 75 95
> <0f> 0b eb 91 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90
> [   11.241402] [   T1751] RSP: 0018:ffffa32dc63abcc0 EFLAGS: 00010246
> [   11.241403] [   T1751] RAX: 0000000000000000 RBX: 00000000000000c0 RCX:
> 0000000000000000
> [   11.241405] [   T1751] RDX: 0000000000000000 RSI: 0000000000000017 RDI:
> 0000000000052cc0
> [   11.241406] [   T1751] RBP: 00000005c2980d00 R08: ffffa32dc63abe00 R09:
> ffffa32dc63abe10
> [   11.241407] [   T1751] R10: 0000000000000018 R11: 0000000000000000 R12:
> 00000000ffffffff
> [   11.241408] [   T1751] R13: ffff8c7e8d480010 R14: 00000005c2980d00 R15:
> ffffa32dc63abd28
> [   11.241410] [   T1751] FS:  00007fc1f34ed680(0000) GS:ffff8c8d2e780000(0000)
> knlGS:0000000000000000
> [   11.241412] [   T1751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   11.241413] [   T1751] CR2: 00007fc1e0e2850b CR3: 00000002a28b8000 CR4:
> 0000000000750ef0
> [   11.241414] [   T1751] PKRU: 55555554
> [   11.241415] [   T1751] Call Trace:
> [   11.241417] [   T1751]  <TASK>
> [   11.241419] [   T1751]  ? __warn.cold+0x90/0x9e
> [   11.241423] [   T1751]  ? __kvmalloc_node_noprof+0xa2/0xb0
> [   11.241426] [   T1751]  ? report_bug+0xfa/0x140
> [   11.241430] [   T1751]  ? handle_bug+0x53/0x90
> [   11.241432] [   T1751]  ? exc_invalid_op+0x17/0x70
> [   11.241435] [   T1751]  ? asm_exc_invalid_op+0x1a/0x20
> [   11.241438] [   T1751]  ? __kvmalloc_node_noprof+0xa2/0xb0
> [   11.241442] [   T1751]  amdgpu_bo_create_list_entry_array+0x38/0x150 [amdgpu]
> [   11.241810] [   T1751]  ? rt_spin_unlock+0x12/0x40
> [   11.241815] [   T1751]  ? srso_alias_return_thunk+0x5/0xfbef5
> [   11.241821] [   T1751]  amdgpu_bo_list_ioctl+0x47/0x340 [amdgpu]
> [   11.242282] [   T1751]  ? __pfx_amdgpu_bo_list_ioctl+0x10/0x10 [amdgpu]
> [   11.242622] [   T1751]  drm_ioctl_kernel+0xa3/0xf0
> [   11.242627] [   T1751]  drm_ioctl+0x25e/0x4e0
> [   11.242630] [   T1751]  ? __pfx_amdgpu_bo_list_ioctl+0x10/0x10 [amdgpu]
> [   11.242930] [   T1751]  ? srso_alias_return_thunk+0x5/0xfbef5
> [   11.242934] [   T1751]  ? srso_alias_return_thunk+0x5/0xfbef5
> [   11.242936] [   T1751]  ? srso_alias_return_thunk+0x5/0xfbef5
> [   11.242938] [   T1751]  ? srso_alias_return_thunk+0x5/0xfbef5
> [   11.242941] [   T1751]  amdgpu_drm_ioctl+0x46/0x80 [amdgpu]
> [   11.243238] [   T1751]  __x64_sys_ioctl+0x92/0xc0
> [   11.243244] [   T1751]  do_syscall_64+0x5f/0x1a0
> [   11.243249] [   T1751]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [   11.243253] [   T1751] RIP: 0033:0x7fc1f381c8db
> [   11.243255] [   T1751] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24
> 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05
> <89> c2 3d 00 f0 ff ff 77 1c 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> [   11.243257] [   T1751] RSP: 002b:00007ffcb8530080 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> [   11.243260] [   T1751] RAX: ffffffffffffffda RBX: 0000564fd56bc7a0 RCX:
> 00007fc1f381c8db
> [   11.243261] [   T1751] RDX: 00007ffcb8530130 RSI: 00000000c0106443 RDI:
> 0000000000000006
> [   11.243262] [   T1751] RBP: 00007ffcb8530130 R08: 0000000000000000 R09:
> 0000000000000000
> [   11.243263] [   T1751] R10: 000000000000002b R11: 0000000000000246 R12:
> 00000000c0106443
> [   11.243265] [   T1751] R13: 0000000000000006 R14: 00007ffcb85301a0 R15:
> 0000564fd56bc7b8
> [   11.243269] [   T1751]  </TASK>
> [   11.243270] [   T1751] ---[ end trace 0000000000000000 ]---
> 
> 

This warning indicates that kvmalloc() was called with a size > INT_MAX; such a
request fails and NULL is returned.
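
For reference, the size check behind this warning can be modelled in a few lines. The sketch below is a userspace Python illustration, not the kernel code. It assumes a 64-bit size_t, that a kvmalloc_array()-style helper returns NULL without warning when the multiplication overflows, and that the allocation path warns and returns NULL for sizes above INT_MAX unless __GFP_NOWARN is set. The function name classify_kvmalloc_array and the example values are made up for illustration and are not taken from the trace above.

```python
INT_MAX = 2**31 - 1   # C INT_MAX on Linux x86-64
SIZE_MAX = 2**64 - 1  # largest size_t value, assuming a 64-bit kernel


def classify_kvmalloc_array(n: int, elem_size: int) -> str:
    """Classify an array allocation request of n elements of elem_size bytes.

    'overflow'  : n * elem_size does not fit in size_t (fails, no warning)
    'too_large' : the total exceeds INT_MAX (fails; this is the case that
                  would trigger a warning like the one quoted above)
    'ok'        : the size check passes (the allocation may still fail
                  later for lack of memory, which is not modelled here)
    """
    total = n * elem_size
    if total > SIZE_MAX:
        return "overflow"
    if total > INT_MAX:
        return "too_large"
    return "ok"


# Hypothetical requests: a small one, and one with a huge element count.
print(classify_kvmalloc_array(16, 32))
print(classify_kvmalloc_array(0xFFFFFFFF, 32))
```

A caller that takes the element count from userspace, as an ioctl handler might, would hit the 'too_large' case whenever the count is not bounded before the allocation.
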

>>
>>>
>>> As a side note, I've tested several kernels with nokaslr as a command line parameter
>>> (6.1.128, 6.8.12, 6.12.17 (the Debian sid distribution kernel)) and nokaslr is
>>> not recognized as a command line parameter in any of them.
>>>
>>
>> Please see my comment above about booting. How did you check whether nokaslr is
>> being recognized? Was it by looking at dmesg?
>>
> When I boot with nokaslr I get the following messages in dmesg
> [    T0] Unknown kernel command line parameters "nokaslr
> BOOT_IMAGE=/boot/vmlinuz-6.14.0-rc5-next-20250307-master", will be passed to
> user space.
> 
> This also happens when I use the Debian kernel with the standard .config.

That is quite strange. I can see nokaslr handling in choose_random_location() in
arch/x86/boot/compressed/kaslr.c (which depends on CONFIG_RANDOMIZE_BASE).
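
To double-check what the bootloader actually passed, independent of the dmesg message, the command line can be inspected as whole tokens. This is a minimal sketch in Python: it assumes /proc/cmdline is readable and that a boolean option counts as present only when it appears as a complete whitespace-separated word. The helper names has_bool_option and read_cmdline are hypothetical, and this is a userspace approximation, not the parser the boot code uses.

```python
def has_bool_option(cmdline: str, option: str) -> bool:
    """Return True if option appears as a whole word in the command line.

    Substring matches such as 'foo=nokaslr' or 'nokaslr2' do not count.
    """
    return option in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line (Linux only)."""
    with open(path, encoding="ascii", errors="replace") as f:
        return f.read().strip()


# The parameters exactly as they appear in the dmesg message quoted above.
reported = "nokaslr BOOT_IMAGE=/boot/vmlinuz-6.14.0-rc5-next-20250307-master"
print(has_bool_option(reported, "nokaslr"))
```

Seeing the token only shows that it reached the kernel, not that randomization was skipped. Comparing the address of a symbol such as _text in /proc/kallsyms (read as root) across a few reboots is one way to see whether the base actually changes.
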

Thanks,
Balbir
