Date:   Thu, 7 Jul 2022 12:10:30 +0200
From:   Thomas Zimmermann <tzimmermann@...e.de>
To:     Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>,
        amd-gfx list <amd-gfx@...ts.freedesktop.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Christian König <ckoenig.leichtzumerken@...il.com>
Subject: Re: [Bug][5.19-rc0] Between commits fdaf9a5840ac and babf0bb978e3 GPU
 stopped entering in graphic mode.

Hi

On 07.07.22 at 02:20, Mikhail Gavrilov wrote:
> On Tue, Jun 28, 2022 at 2:21 PM Mikhail Gavrilov
> <mikhail.v.gavrilov@...il.com> wrote:
>>
> 
> Christian, can you look into why
> drm_aperture_remove_conflicting_pci_framebuffers causes this kernel bug
> on my machine?

Thanks for reporting. This bug has been fixed in

https://cgit.freedesktop.org/drm/drm/commit/?h=drm-fixes&id=ee7a69aa38d87a3bbced7b8245c732c05ed0c6ec

The patch should reach mainline next week or so.

Best regards
Thomas

> 
> [    6.822385] amdgpu: Ignoring ACPI CRAT on non-APU system
> [    6.822462] amdgpu: Virtual CRAT table created for CPU
> [    6.822654] amdgpu: Topology: Add CPU node
> [    6.827643] Console: switching to colour dummy device 80x25
> [    6.845504] BUG: kernel NULL pointer dereference, address: 0000000000000038
> [    6.845509] #PF: supervisor read access in kernel mode
> [    6.845512] #PF: error_code(0x0000) - not-present page
> [    6.845515] PGD 0 P4D 0
> [    6.845518] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [    6.845522] CPU: 27 PID: 612 Comm: systemd-udevd Tainted: G W        --------  --- 5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64 #1
> [    6.845528] Hardware name: System manufacturer System Product Name/ROG STRIX X570-I GAMING, BIOS 4403 04/27/2022
> [    6.845533] RIP: 0010:kernfs_find_and_get_ns+0x11/0x70
> [    6.845539] Code: 78 e8 c3 fa 31 00 48 85 c0 75 e1 eb 93 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 41 55 49 89 d5 41 54 49 89 f4 55 53 <48> 8b 47 38 48 89 fb 48 85 c0 48 0f 44 c7 48 8b a8 80 00 00 00 48
> [    6.845546] RSP: 0018:ffffa98c022f3aa0 EFLAGS: 00010246
> [    6.845550] RAX: 0000000000000000 RBX: ffffffffaf52c3c0 RCX: ffff9e150147b640
> [    6.845553] RDX: 0000000000000000 RSI: ffffffffaf52c508 RDI: 0000000000000000
> [    6.845557] RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000249249d4
> [    6.845560] R10: 0000000000000001 R11: 0000000000000000 R12: ffffffffaf52c508
> [    6.845563] R13: 0000000000000000 R14: ffff9e157aa93900 R15: 0000000000000000
> [    6.845567] FS:  00007fabaafbf680(0000) GS:ffff9e23e6a00000(0000) knlGS:0000000000000000
> [    6.845571] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    6.845574] CR2: 0000000000000038 CR3: 000000017cb56000 CR4: 0000000000350ee0
> [    6.845578] Call Trace:
> [    6.845579]  <TASK>
> [    6.845582]  sysfs_unmerge_group+0x18/0x60
> [    6.845585]  dpm_sysfs_remove+0x20/0x60
> [    6.845590]  device_del+0xa4/0x3f0
> [    6.845594]  platform_device_del.part.0+0x13/0x70
> [    6.845599]  platform_device_unregister+0x1c/0x30
> [    6.845602]  sysfb_disable+0x2d/0x60
> [    6.845605]  remove_conflicting_framebuffers+0x1b/0xc0
> [    6.845610]  remove_conflicting_pci_framebuffers+0xce/0x120
> [    6.845614]  drm_aperture_remove_conflicting_pci_framebuffers+0x57/0x80
> [    6.845620]  amdgpu_pci_probe+0xcb/0x360 [amdgpu]
> [    6.845760]  local_pci_probe+0x41/0x80
> [    6.845764]  pci_device_probe+0xaa/0x210
> [    6.845768]  really_probe+0x1bf/0x390
> [    6.845771]  __driver_probe_device+0xfc/0x170
> [    6.845775]  driver_probe_device+0x1f/0x90
> [    6.845778]  __driver_attach+0xbf/0x1b0
> [    6.845782]  ? __device_attach_driver+0xe0/0xe0
> [    6.845785]  bus_for_each_dev+0x65/0x90
> [    6.845789]  bus_add_driver+0x15c/0x200
> [    6.845792]  driver_register+0x89/0xe0
> [    6.845796]  ? 0xffffffffc0c8d000
> [    6.845801]  do_one_initcall+0x69/0x350
> [    6.845806]  ? rcu_read_lock_sched_held+0x3c/0x70
> [    6.845810]  ? trace_kmalloc+0x3c/0x100
> [    6.845814]  ? kmem_cache_alloc_trace+0x1e8/0x350
> [    6.845818]  do_init_module+0x4a/0x200
> [    6.845822]  __do_sys_init_module+0x13a/0x190
> [    6.845827]  do_syscall_64+0x5b/0x80
> [    6.845832]  ? asm_exc_page_fault+0x27/0x30
> [    6.845835]  ? lockdep_hardirqs_on+0x7d/0x100
> [    6.845839]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [    6.845842] RIP: 0033:0x7fababb7463e
> [    6.845845] Code: 48 8b 0d e5 57 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b2 57 0c 00 f7 d8 64 89 01 48
> [    6.845852] RSP: 002b:00007ffc6a6c9658 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
> [    6.845857] RAX: ffffffffffffffda RBX: 00005620deef53f0 RCX: 00007fababb7463e
> [    6.845860] RDX: 00005620deeb2df0 RSI: 00000000010bfac6 RDI: 00007faba943e010
> [    6.845864] RBP: 00005620deeb2df0 R08: 00005620deef4880 R09: 0000000000000000
> [    6.845867] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000020000
> [    6.845870] R13: 00005620deeb5330 R14: 0000000000000000 R15: 00005620deef0410
> [    6.845875]  </TASK>
> [    6.845877] Modules linked in: amdgpu(+) drm_ttm_helper ttm iommu_v2 crct10dif_pclmul gpu_sched crc32_pclmul crc32c_intel drm_buddy drm_display_helper ucsi_ccg nvme igb typec_ucsi ghash_clmulni_intel ccp cec typec sp5100_tco nvme_core dca wmi ip6_tables ip_tables ipmi_devintf ipmi_msghandler fuse
> [    6.845898] CR2: 0000000000000038
> [    6.845900] ---[ end trace 0000000000000000 ]---
> 
> 
> $ /usr/src/kernels/5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/scripts/faddr2line /lib/debug/lib/modules/5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko.debug amdgpu_pci_probe+0xcb
> amdgpu_pci_probe+0xcb/0x360:
> amdgpu_pci_probe at /usr/src/debug/kernel-5.19-rc5-49-gc1084b6c5620/linux-5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c:2061
> 
> 
> $ cat -s -n /usr/src/debug/kernel-5.19-rc5-49-gc1084b6c5620/linux-5.19.0-0.rc5.20220705gitc1084b6c5620.40.fc37.x86_64/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | head -2071 | tail -20
>    2052 "Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
>    2053 );
>    2054 return -ENODEV;
>    2055 }
>    2056 }
>    2057 #endif
>    2058
>    2059 /* Get rid of things like offb */
>    2060 ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &amdgpu_kms_driver);
>    2061 if (ret)
>    2062 return ret;
>    2063
>    2064 adev = devm_drm_dev_alloc(&pdev->dev, &amdgpu_kms_driver, typeof(*adev), ddev);
>    2065 if (IS_ERR(adev))
>    2066 return PTR_ERR(adev);
>    2067
>    2068 adev->dev  = &pdev->dev;
>    2069 adev->pdev = pdev;
>    2070 ddev = adev_to_drm(adev);
> 
> $ git blame -L 2052,2070 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> Blaming lines: 100% (19/19), done.
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2052)                        dev_info(&pdev->dev,
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2053)                                 "Use radeon.cik_support=0 amdgpu.cik_support=1 to override.\n"
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2054)                                );
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2055)                        return -ENODEV;
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2056)                }
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2057)        }
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2058) #endif
> 984d7a929ad68 (Hans de Goede     2019-10-10 18:28:17 +0200 2059)
> d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2060)        /* Get rid of things like offb */
> 97c9bfe3f6605 (Thomas Zimmermann 2021-06-29 15:58:33 +0200 2061)        ret = drm_aperture_remove_conflicting_pci_framebuffers(pdev, &amdgpu_kms_driver);
> d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2062)        if (ret)
> d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2063)                return ret;
> d38ceaf99ed01 (Alex Deucher      2015-04-20 16:55:21 -0400 2064)
> 5088d6572e8ff (Luben Tuikov      2020-11-04 11:04:25 +0100 2065)        adev = devm_drm_dev_alloc(&pdev->dev, &amdgpu_kms_driver, typeof(*adev), ddev);
> df2ce4596c044 (Luben Tuikov      2020-09-18 15:25:04 +0200 2066)        if (IS_ERR(adev))
> df2ce4596c044 (Luben Tuikov      2020-09-18 15:25:04 +0200 2067)                return PTR_ERR(adev);
> 8aba21b75136c (Luben Tuikov      2020-08-14 20:41:55 -0400 2068)
> 8aba21b75136c (Luben Tuikov      2020-08-14 20:41:55 -0400 2069)        adev->dev  = &pdev->dev;
> 8aba21b75136c (Luben Tuikov      2020-08-14 20:41:55 -0400 2070)        adev->pdev = pdev;
> 
> Thomas, you recently changed this line. Can you tell why we are
> hitting a kernel Oops here?
> 
> Full kernel log (5.19-rc5): https://pastebin.com/5Ag804bd
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
