Message-ID: <5e95a560-ec78-4e9f-a4f6-fb98f033eab2@alu.unizg.hr>
Date: Wed, 24 Jan 2024 18:48:39 +0100
From: Mirsad Todorovac <mirsad.todorovac@....unizg.hr>
To: "Ma, Jun" <majun@....com>, linux-kernel@...r.kernel.org,
amd-gfx@...ts.freedesktop.org
Cc: Sathishkumar S <sathishkumar.sundararaju@....com>,
Lijo Lazar <lijo.lazar@....com>,
Srinivasan Shanmugam <srinivasan.shanmugam@....com>,
Guchun Chen <guchun.chen@....com>, Lang Yu <Lang.Yu@....com>,
Felix Kuehling <Felix.Kuehling@....com>, "Pan, Xinhui" <Xinhui.Pan@....com>,
dri-devel@...ts.freedesktop.org, Marek Olšák
<marek.olsak@....com>, Boyuan Zhang <boyuan.zhang@....com>,
Daniel Vetter <daniel@...ll.ch>, David Francis <David.Francis@....com>,
Alex Deucher <alexander.deucher@....com>, David Airlie <airlied@...il.com>,
Christian König <christian.koenig@....com>
Subject: Re: BUG [RESEND][NEW BUG]: kernel NULL pointer dereference, address:
0000000000000008
Hi, Ma Jun,
Normally, I would reply under the quoted text, but I will adjust to your convention.
I have just discovered that your patch causes the Ubuntu 22.04 LTS GNOME XWayland session
to hang after typing the password and pressing ENTER on the graphical login screen (tested several times).
After that, I could not even log in from another box over ssh: the session would
hang (on the first and second attempts; the third attempt succeeded because I connected before
attempting to log in on the XWayland console).
You might find the syslog and dmesg of the freeze useful; they are at this link (they were over 100K):
https://magrf.grf.hr/~mtodorov/linux/bugreports/6.7.0/amdgpu/6.7.0-xway-09721-g61da593f4458/
The exact applied patch was this:
marvin@...iant:~/linux/kernel/linux_torvalds$ git diff
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
index 73f6d7e72c73..6ef333df9adf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c
@@ -3996,16 +3996,13 @@ static int gfx_v10_0_init_microcode(struct amdgpu_device *adev)
 	if (!amdgpu_sriov_vf(adev)) {
 		snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", ucode_prefix);
-		err = amdgpu_ucode_request(adev, &adev->gfx.rlc_fw, fw_name);
-		/* don't check this. There are apparently firmwares in the wild with
-		 * incorrect size in the header
-		 */
-		if (err == -ENODEV)
-			goto out;
+		err = request_firmware(&adev->gfx.rlc_fw, fw_name, adev->dev);
 		if (err)
-			dev_dbg(adev->dev,
-				"gfx10: amdgpu_ucode_request() failed \"%s\"\n",
-				fw_name);
+			goto out;
+
+		/* don't validate this firmware. There are apparently firmwares
+		 * in the wild with incorrect size in the header
+		 */
 		rlc_hdr = (const struct rlc_firmware_header_v2_0 *)adev->gfx.rlc_fw->data;
 		version_major = le16_to_cpu(rlc_hdr->header.header_version_major);
 		version_minor = le16_to_cpu(rlc_hdr->header.header_version_minor);
marvin@...iant:~/linux/kernel/linux_torvalds$ uname -rms
Linux 6.7.0-xway-09721-g61da593f4458 x86_64
marvin@...iant:~/linux/kernel/linux_torvalds$
So, there seems to be a problem with the way the patch affects XWayland.
I checked the exact commit multiple times, with and without the diff.
I hope this helps, as I am not familiar with the amdgpu driver.
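For anyone reading the Oops quoted below, here is a minimal userspace sketch (plain C, not kernel code) of how two details in it might be read. It is only an assumption on my side that the call just before the trapping instruction is the firmware request and that R10 holds its return value; `struct fw_mirror` and `reg_to_errno()` are hypothetical names made up for illustration. The sketch shows that a `data` pointer placed after a `size_t` sits at offset 8 on a 64-bit build, which would be consistent with the fault address 0000000000000008 and RAX = 0, and that the values 0xffffffea (R10) and 0xffffffed (the cmp immediate) are -22 and -19 when read as signed 32-bit integers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace mirror of a firmware descriptor: a size field
 * followed by a data pointer. An assumption for illustration only, not
 * the kernel header. */
struct fw_mirror {
	size_t size;
	const uint8_t *data;
};

/* On a 64-bit build the data pointer lands at offset 8, so reading
 * ->data through a NULL descriptor would fault at address 0x8. */
static_assert(sizeof(size_t) != 8 || offsetof(struct fw_mirror, data) == 8,
	      "data is expected at offset 8 when size_t is 8 bytes");

/* Interpret the low 32 bits of a dumped register as a signed int,
 * e.g. R10: 00000000ffffffea. */
int reg_to_errno(uint64_t reg)
{
	uint32_t low = (uint32_t)reg;

	if (low > (uint32_t)INT32_MAX)
		return -(int)(UINT32_MAX - low) - 1;
	return (int)low;
}
```

On Linux, `reg_to_errno(0x00000000ffffffea)` equals `-EINVAL` and `reg_to_errno(0xffffffed)` equals `-ENODEV`. If my assumption holds, the request failed with an error other than -ENODEV and the code went on to use the firmware pointer anyway.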
Best regards,
Mirsad Todorovac
On 1/22/24 09:34, Ma, Jun wrote:
> This is perhaps similar to the problem I encountered earlier; you can
> try the following patch:
>
> https://lists.freedesktop.org/archives/amd-gfx/2024-January/103259.html
>
> Regards,
> Ma Jun
>
> On 1/21/2024 3:54 AM, Mirsad Todorovac wrote:
>> Hi,
>>
>> The last email did not reach most of the recipients because of the banned .xz attachment.
>>
>> As the .config is too big to send either inline or uncompressed, I will omit it in this
>> attempt. In the meantime, I had some success in decoding the stack trace, but sadly it is
>> not complete.
>>
>> I don't think this Oops is deterministic, but I am working on a reproducer.
>>
>> The platform is Ubuntu 22.04 LTS.
>>
>> Complete list of hardware and .config is available here:
>>
>> https://domac.alu.unizg.hr/~mtodorov/linux/bugreports/amdgpu/6.7.0-rtl-v02-nokcsan-09928-g052d534373b7/
>>
>> Best regards,
>> Mirsad
>>
>> -------------------------------------------------------------------------------------------
>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>> kernel: [ 5.576712] PGD 0 P4D 0
>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> All code
>> ========
>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>> 3: 4c 89 ff mov %r15,%rdi
>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>> b: 41 89 c2 mov %eax,%r10d
>> e: 83 f8 ed cmp $0xffffffed,%eax
>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>> 17: 85 c0 test %eax,%eax
>> 19: 74 05 je 0x20
>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>> 27: 4c 89 ff mov %r15,%rdi
>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>> 3b: 41 89 c2 mov %eax,%r10d
>> 3e: 85 c0 test %eax,%eax
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>> 11: 41 89 c2 mov %eax,%r10d
>> 14: 85 c0 test %eax,%eax
>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.576903] PKRU: 55555554
>> kernel: [ 5.576905] Call Trace:
>> kernel: [ 5.576907] <TASK>
>> kernel: [ 5.576909] ? show_regs (arch/x86/kernel/dumpstack.c:479)
>> kernel: [ 5.576914] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
>> kernel: [ 5.576917] ? page_fault_oops (arch/x86/mm/fault.c:707)
>> kernel: [ 5.576921] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0 (crypto/api.c:497)
>> kernel: [ 5.576930] ? do_user_addr_fault (arch/x86/mm/fault.c:1264)
>> kernel: [ 5.576934] ? exc_page_fault (./arch/x86/include/asm/paravirt.h:693 arch/x86/mm/fault.c:1515 arch/x86/mm/fault.c:1563)
>> kernel: [ 5.576937] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
>> kernel: [ 5.576942] ? gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.577056] amdgpu_device_init (drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:2465 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:4042) amdgpu
>> kernel: [ 5.577158] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577161] ? pci_bus_read_config_word (drivers/pci/access.c:67 (discriminator 2))
>> kernel: [ 5.577166] ? pci_read_config_word (drivers/pci/access.c:563)
>> kernel: [ 5.577168] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577171] ? do_pci_enable_device (drivers/pci/pci.c:1975 drivers/pci/pci.c:1949)
>> kernel: [ 5.577176] amdgpu_driver_load_kms (drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c:146) amdgpu
>> kernel: [ 5.577275] amdgpu_pci_probe (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2237) amdgpu
>> kernel: [ 5.577373] local_pci_probe (drivers/pci/pci-driver.c:324)
>> kernel: [ 5.577377] pci_device_probe (drivers/pci/pci-driver.c:392 drivers/pci/pci-driver.c:417 drivers/pci/pci-driver.c:460)
>> kernel: [ 5.577381] really_probe (drivers/base/dd.c:579 drivers/base/dd.c:658)
>> kernel: [ 5.577386] __driver_probe_device (drivers/base/dd.c:800)
>> kernel: [ 5.577389] driver_probe_device (drivers/base/dd.c:830)
>> kernel: [ 5.577392] __driver_attach (drivers/base/dd.c:1217)
>> kernel: [ 5.577396] ? __pfx___driver_attach (drivers/base/dd.c:1157)
>> kernel: [ 5.577399] bus_for_each_dev (drivers/base/bus.c:368)
>> kernel: [ 5.577402] driver_attach (drivers/base/dd.c:1234)
>> kernel: [ 5.577405] bus_add_driver (drivers/base/bus.c:674)
>> kernel: [ 5.577409] driver_register (drivers/base/driver.c:246)
>> kernel: [ 5.577411] ? __pfx_amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2497) amdgpu
>> kernel: [ 5.577521] __pci_register_driver (drivers/pci/pci-driver.c:1456)
>> kernel: [ 5.577524] amdgpu_init (drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2805) amdgpu
>> kernel: [ 5.577628] do_one_initcall (init/main.c:1236)
>> kernel: [ 5.577632] ? kmalloc_trace (mm/slub.c:3816 mm/slub.c:3860 mm/slub.c:4007)
>> kernel: [ 5.577637] do_init_module (kernel/module/main.c:2533)
>> kernel: [ 5.577640] load_module (kernel/module/main.c:2984)
>> kernel: [ 5.577647] init_module_from_file (kernel/module/main.c:3151)
>> kernel: [ 5.577649] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577652] ? init_module_from_file (kernel/module/main.c:3151)
>> kernel: [ 5.577657] idempotent_init_module (kernel/module/main.c:3168)
>> kernel: [ 5.577661] __x64_sys_finit_module (./include/linux/file.h:45 kernel/module/main.c:3190 kernel/module/main.c:3172 kernel/module/main.c:3172)
>> kernel: [ 5.577664] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
>> kernel: [ 5.577668] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577671] ? ksys_mmap_pgoff (mm/mmap.c:1428)
>> kernel: [ 5.577675] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577678] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577681] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
>> kernel: [ 5.577684] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577687] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577689] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577692] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577695] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577698] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
>> kernel: [ 5.577700] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
>> kernel: [ 5.577703] ? sysvec_call_function (arch/x86/kernel/smp.c:253 (discriminator 69))
>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>> All code
>> ========
>> 0: 5b pop %rbx
>> 1: 41 5c pop %r12
>> 3: c3 ret
>> 4: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
>> b: 00 00
>> d: f3 0f 1e fa endbr64
>> 11: 48 89 f8 mov %rdi,%rax
>> 14: 48 89 f7 mov %rsi,%rdi
>> 17: 48 89 d6 mov %rdx,%rsi
>> 1a: 48 89 ca mov %rcx,%rdx
>> 1d: 4d 89 c2 mov %r8,%r10
>> 20: 4d 89 c8 mov %r9,%r8
>> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
>> 28: 0f 05 syscall
>> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
>> 30: 73 01 jae 0x33
>> 32: c3 ret
>> 33: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb5ad
>> 3a: f7 d8 neg %eax
>> 3c: 64 89 01 mov %eax,%fs:(%rcx)
>> 3f: 48 rex.W
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
>> 6: 73 01 jae 0x9
>> 8: c3 ret
>> 9: 48 8b 0d 73 b5 0f 00 mov 0xfb573(%rip),%rcx # 0xfb583
>> 10: f7 d8 neg %eax
>> 12: 64 89 01 mov %eax,%fs:(%rcx)
>> 15: 48 rex.W
>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>> kernel: [ 5.577748] </TASK>
>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>> kernel: [ 5.577817] CR2: 0000000000000008
>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init (drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:4009 drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:7478) amdgpu
>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>> All code
>> ========
>> 0: 8d 55 a8 lea -0x58(%rbp),%edx
>> 3: 4c 89 ff mov %r15,%rdi
>> 6: e8 e4 83 ec ff call 0xffffffffffec83ef
>> b: 41 89 c2 mov %eax,%r10d
>> e: 83 f8 ed cmp $0xffffffed,%eax
>> 11: 0f 84 b3 fd ff ff je 0xfffffffffffffdca
>> 17: 85 c0 test %eax,%eax
>> 19: 74 05 je 0x20
>> 1b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
>> 20: 49 8b 87 08 87 01 00 mov 0x18708(%r15),%rax
>> 27: 4c 89 ff mov %r15,%rdi
>> 2a:* 48 8b 40 08 mov 0x8(%rax),%rax <-- trapping instruction
>> 2e: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 32: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> 36: e8 e4 42 fb ff call 0xfffffffffffb431f
>> 3b: 41 89 c2 mov %eax,%r10d
>> 3e: 85 c0 test %eax,%eax
>>
>> Code starting with the faulting instruction
>> ===========================================
>> 0: 48 8b 40 08 mov 0x8(%rax),%rax
>> 4: 0f b7 50 0a movzwl 0xa(%rax),%edx
>> 8: 0f b7 70 08 movzwl 0x8(%rax),%esi
>> c: e8 e4 42 fb ff call 0xfffffffffffb42f5
>> 11: 41 89 c2 mov %eax,%r10d
>> 14: 85 c0 test %eax,%eax
>> rsyslogd: rsyslogd's groupid changed to 111
>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>> kernel: [ 5.914419] PKRU: 55555554
>>
>> Best regards,
>> Mirsad
>>
>> On 1/18/24 18:23, Mirsad Todorovac wrote:
>>> Hi,
>>>
>>> Unfortunately, I was not able to reboot in this kernel again to do the stack decode, but I thought
>>> that any information about the NULL pointer dereference is better than no info.
>>>
>>> The system is Ubuntu 23.10 Mantic with an AMD Navi 23 [Radeon RX 6600/6600 XT/6600M]
>>> graphics card.
>>>
>>> Please find the config and the hw listing attached.
>>>
>>> Best regards,
>>> Mirsad
>>
>>
>>
>>> kernel: [ 5.576702] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> kernel: [ 5.576707] #PF: supervisor read access in kernel mode
>>> kernel: [ 5.576710] #PF: error_code(0x0000) - not-present page
>>> kernel: [ 5.576712] PGD 0 P4D 0
>>> kernel: [ 5.576715] Oops: 0000 [#1] PREEMPT SMP NOPTI
>>> kernel: [ 5.576718] CPU: 9 PID: 650 Comm: systemd-udevd Not tainted 6.7.0-rtl-v0.2-nokcsan-09928-g052d534373b7 #2
>>> kernel: [ 5.576723] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
>>> kernel: [ 5.576726] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.576872] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> kernel: [ 5.576878] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.576881] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.576884] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.576886] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.576889] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.576892] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.576895] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.576898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.576900] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.576903] PKRU: 55555554
>>> kernel: [ 5.576905] Call Trace:
>>> kernel: [ 5.576907] <TASK>
>>> kernel: [ 5.576909] ? show_regs+0x72/0x90
>>> kernel: [ 5.576914] ? __die+0x25/0x80
>>> kernel: [ 5.576917] ? page_fault_oops+0x154/0x4c0
>>> kernel: [ 5.576921] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.576925] ? crypto_alloc_tfmmem.isra.0+0x35/0x70
>>> kernel: [ 5.576930] ? do_user_addr_fault+0x30e/0x6e0
>>> kernel: [ 5.576934] ? exc_page_fault+0x84/0x1b0
>>> kernel: [ 5.576937] ? asm_exc_page_fault+0x27/0x30
>>> kernel: [ 5.576942] ? gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.577056] amdgpu_device_init+0xefa/0x2de0 [amdgpu]
>>> kernel: [ 5.577158] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577161] ? pci_bus_read_config_word+0x47/0x90
>>> kernel: [ 5.577166] ? pci_read_config_word+0x27/0x60
>>> kernel: [ 5.577168] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577171] ? do_pci_enable_device+0xe1/0x110
>>> kernel: [ 5.577176] amdgpu_driver_load_kms+0x1a/0x1c0 [amdgpu]
>>> kernel: [ 5.577275] amdgpu_pci_probe+0x1a8/0x5e0 [amdgpu]
>>> kernel: [ 5.577373] local_pci_probe+0x48/0xb0
>>> kernel: [ 5.577377] pci_device_probe+0xc8/0x290
>>> kernel: [ 5.577381] really_probe+0x1d2/0x440
>>> kernel: [ 5.577386] __driver_probe_device+0x8a/0x190
>>> kernel: [ 5.577389] driver_probe_device+0x23/0xd0
>>> kernel: [ 5.577392] __driver_attach+0x10f/0x220
>>> kernel: [ 5.577396] ? __pfx___driver_attach+0x10/0x10
>>> kernel: [ 5.577399] bus_for_each_dev+0x7a/0xe0
>>> kernel: [ 5.577402] driver_attach+0x1e/0x30
>>> kernel: [ 5.577405] bus_add_driver+0x127/0x240
>>> kernel: [ 5.577409] driver_register+0x64/0x140
>>> kernel: [ 5.577411] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu]
>>> kernel: [ 5.577521] __pci_register_driver+0x68/0x80
>>> kernel: [ 5.577524] amdgpu_init+0x69/0xff0 [amdgpu]
>>> kernel: [ 5.577628] do_one_initcall+0x46/0x330
>>> kernel: [ 5.577632] ? kmalloc_trace+0x136/0x370
>>> kernel: [ 5.577637] do_init_module+0x6a/0x280
>>> kernel: [ 5.577640] load_module+0x2419/0x2500
>>> kernel: [ 5.577647] init_module_from_file+0x9c/0xf0
>>> kernel: [ 5.577649] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577652] ? init_module_from_file+0x9c/0xf0
>>> kernel: [ 5.577657] idempotent_init_module+0x184/0x240
>>> kernel: [ 5.577661] __x64_sys_finit_module+0x64/0xd0
>>> kernel: [ 5.577664] do_syscall_64+0x76/0x140
>>> kernel: [ 5.577668] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577671] ? ksys_mmap_pgoff+0x123/0x270
>>> kernel: [ 5.577675] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577678] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577681] ? syscall_exit_to_user_mode+0x97/0x1e0
>>> kernel: [ 5.577684] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577687] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577689] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577692] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577695] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577698] ? do_syscall_64+0x85/0x140
>>> kernel: [ 5.577700] ? srso_alias_return_thunk+0x5/0xfbef5
>>> kernel: [ 5.577703] ? sysvec_call_function+0x4e/0xb0
>>> kernel: [ 5.577707] entry_SYSCALL_64_after_hwframe+0x6e/0x76
>>> kernel: [ 5.577709] RIP: 0033:0x7fdaa331e88d
>>> kernel: [ 5.577724] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48
>>> kernel: [ 5.577729] RSP: 002b:00007ffeb4f87d28 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
>>> kernel: [ 5.577733] RAX: ffffffffffffffda RBX: 000055aedf3eeeb0 RCX: 00007fdaa331e88d
>>> kernel: [ 5.577736] RDX: 0000000000000000 RSI: 000055aedf3efb80 RDI: 000000000000001a
>>> kernel: [ 5.577738] RBP: 0000000000020000 R08: 0000000000000000 R09: 0000000000000002
>>> kernel: [ 5.577741] R10: 000000000000001a R11: 0000000000000246 R12: 000055aedf3efb80
>>> kernel: [ 5.577744] R13: 000055aedf3f2060 R14: 0000000000000000 R15: 000055aedf2b1220
>>> kernel: [ 5.577748] </TASK>
>>> kernel: [ 5.577750] Modules linked in: intel_rapl_msr intel_rapl_common amdgpu(+) edac_mce_amd kvm_amd kvm snd_hda_codec_realtek snd_hda_codec_generic irqbypass ledtrig_audio crct10dif_pclmul polyval_clmulni polyval_generic snd_hda_codec_hdmi ghash_clmulni_intel sha512_ssse3 sha256_ssse3 sha1_ssse3 amdxcp snd_hda_intel aesni_intel drm_exec snd_intel_dspcfg crypto_simd gpu_sched snd_intel_sdw_acpi cryptd nls_iso8859_1 drm_buddy snd_hda_codec snd_seq_midi drm_suballoc_helper snd_seq_midi_event drm_ttm_helper joydev snd_hda_core input_leds ttm rapl snd_rawmidi snd_hwdep drm_display_helper snd_seq snd_pcm wmi_bmof cec k10temp snd_seq_device ccp rc_core snd_timer snd drm_kms_helper i2c_algo_bit soundcore mac_hid tcp_bbr sch_fq msr parport_pc ppdev lp drm parport efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq libcrc32c hid_generic usbhid hid crc32_pclmul nvme r8169 ahci nvme_core i2c_piix4 xhci_pci libahci xhci_pci_renesas realtek video wmi gpio_amdpt
>>> kernel: [ 5.577817] CR2: 0000000000000008
>>> kernel: [ 5.577820] ---[ end trace 0000000000000000 ]---
>>> kernel: [ 5.914230] RIP: 0010:gfx_v10_0_early_init+0x5ab/0x8d0 [amdgpu]
>>> kernel: [ 5.914388] Code: 8d 55 a8 4c 89 ff e8 e4 83 ec ff 41 89 c2 83 f8 ed 0f 84 b3 fd ff ff 85 c0 74 05 0f 1f 44 00 00 49 8b 87 08 87 01 00 4c 89 ff <48> 8b 40 08 0f b7 50 0a 0f b7 70 08 e8 e4 42 fb ff 41 89 c2 85 c0
>>> rsyslogd: rsyslogd's groupid changed to 111
>>> kernel: [ 5.914394] RSP: 0018:ffffa5b3c103f720 EFLAGS: 00010282
>>> kernel: [ 5.914397] RAX: 0000000000000000 RBX: ffffffffc1d73489 RCX: 0000000000000000
>>> kernel: [ 5.914399] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff91ae4fa80000
>>> kernel: [ 5.914402] RBP: ffffa5b3c103f7b0 R08: 0000000000000000 R09: 0000000000000000
>>> kernel: [ 5.914405] R10: 00000000ffffffea R11: 0000000000000000 R12: ffff91ae4fa986e8
>>> kernel: [ 5.914408] R13: ffff91ae4fa986d8 R14: ffff91ae4fa986f8 R15: ffff91ae4fa80000
>>> kernel: [ 5.914410] FS: 00007fdaa343c8c0(0000) GS:ffff91bd58440000(0000) knlGS:0000000000000000
>>> kernel: [ 5.914414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> kernel: [ 5.914416] CR2: 0000000000000008 CR3: 00000001222d0000 CR4: 0000000000750ef0
>>> kernel: [ 5.914419] PKRU: 55555554