Message-ID: <bb671243-7031-31ff-c6c3-dc1e4192ef71@amd.com>
Date:   Thu, 21 Jan 2021 14:27:02 +0100
From:   Christian König <christian.koenig@....com>
To:     Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Cc:     "Deucher, Alexander" <alexander.deucher@....com>,
        Harry Wentland <harry.wentland@....com>,
        dri-devel <dri-devel@...ts.freedesktop.org>,
        amd-gfx list <amd-gfx@...ts.freedesktop.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin
 framebuffer with error -12

I still have no idea what's going on here.

The KASAN messages from the DC code are completely unrelated.

Please add the full dmesg to your bug report.

Christian.

On 20.01.21 at 01:59, Mikhail Gavrilov wrote:
> On Fri, 15 Jan 2021 at 03:43, Mikhail Gavrilov
> <mikhail.v.gavrilov@...il.com> wrote:
> In rc4, the number of warnings has dropped dramatically.
> The "kasan slab-out-of-bounds" errors and the "DMA-API device
> driver failed to check map error" warning are gone.
> Still not fixed are "sleeping function called from invalid context at
> include/linux/sched/mm.h:196" and "BUG: key ffff88810b0d9148 has not
> been registered!"
> The second issue is Navi-specific, because it started to happen on the
> 5.10 kernel after I replaced the Radeon VII with a 6900XT.
>
> 1.
> BUG: sleeping function called from invalid context at
> include/linux/sched/mm.h:196
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 500, name: systemd-udevd
> 1 lock held by systemd-udevd/500:
>   #0: ffff888107690258 (&dev->mutex){....}-{3:3}, at:
> device_driver_attach+0xa3/0x250
> CPU: 9 PID: 500 Comm: systemd-udevd Not tainted
> 5.11.0-0.rc4.129.fc34.x86_64+debug #1
> Hardware name: System manufacturer System Product Name/ROG STRIX
> X570-I GAMING, BIOS 2802 10/21/2020
> Call Trace:
>   dump_stack+0xae/0xe5
>   ___might_sleep.cold+0x150/0x17e
>   ? dcn30_clock_source_create+0x53/0x110 [amdgpu]
>   kmem_cache_alloc_trace+0x23f/0x270
>   dcn30_clock_source_create+0x53/0x110 [amdgpu]
>   dcn30_create_resource_pool+0x998/0x4890 [amdgpu]
>   ? dcn30_calc_max_scaled_time+0x40/0x40 [amdgpu]
>   ? lock_is_held_type+0xb8/0xf0
>   ? unpoison_range+0x3a/0x60
>   ? ____kasan_kmalloc.constprop.0+0x84/0xa0
>   ? dc_create_resource_pool+0x26e/0x5e0 [amdgpu]
>   dc_create_resource_pool+0x26e/0x5e0 [amdgpu]
>   dc_create+0x636/0x1bc0 [amdgpu]
>   ? lock_acquire+0x2dd/0x7a0
>   ? sched_clock+0x5/0x10
>   ? sched_clock_cpu+0x18/0x170
>   ? find_held_lock+0x33/0x110
>   ? dc_create_state+0xa0/0xa0 [amdgpu]
>   ? lock_downgrade+0x6b0/0x6b0
>   ? module_assert_mutex_or_preempt+0x3e/0x70
>   ? lock_is_held_type+0xb8/0xf0
>   ? unpoison_range+0x3a/0x60
>   ? ____kasan_kmalloc.constprop.0+0x84/0xa0
>   amdgpu_dm_init.isra.0+0x479/0x640 [amdgpu]
>   ? vprintk_emit+0x1c0/0x460
>   ? dev_vprintk_emit+0x2d8/0x31a
>   ? sched_clock+0x5/0x10
>   ? dm_resume+0x13b0/0x13b0 [amdgpu]
>   ? dev_attr_show.cold+0x35/0x35
>   ? lock_downgrade+0x6b0/0x6b0
>   ? dev_printk_emit+0x8c/0xa8
>   ? dev_vprintk_emit+0x31a/0x31a
>   ? wait_for_completion_io+0x240/0x240
>   ? __dev_printk+0x71/0xdf
>   ? smu_hw_init.cold+0x16b/0x18a [amdgpu]
>   ? smu_suspend+0x240/0x240 [amdgpu]
>   ? navi10_ih_irq_init+0xea3/0x2420 [amdgpu]
>   dm_hw_init+0xe/0x20 [amdgpu]
>   amdgpu_device_init.cold+0x3031/0x4940 [amdgpu]
>   ? amdgpu_device_cache_pci_state+0xf0/0xf0 [amdgpu]
>   ? pci_bus_read_config_byte+0x140/0x140
>   ? do_pci_enable_device+0x1f8/0x260
>   ? pci_find_saved_ext_cap+0x110/0x110
>   ? pci_enable_bridge+0xf9/0x1e0
>   ? pci_dev_check_d3cold+0x107/0x250
>   ? pci_enable_device_flags+0x201/0x340
>   amdgpu_driver_load_kms+0x167/0x8a0 [amdgpu]
>   amdgpu_pci_probe+0x235/0x360 [amdgpu]
>   ? amdgpu_pci_remove+0xd0/0xd0 [amdgpu]
>   local_pci_probe+0xd8/0x170
>   pci_device_probe+0x318/0x5c0
>   ? kernfs_create_link+0x16c/0x230
>   ? pci_device_remove+0x1d0/0x1d0
>   really_probe+0x224/0xc40
>   driver_probe_device+0x1f2/0x380
>   device_driver_attach+0x1df/0x250
>   __driver_attach+0xf6/0x260
>   ? device_driver_attach+0x250/0x250
>   bus_for_each_dev+0x114/0x180
>   ? subsys_dev_iter_exit+0x10/0x10
>   bus_add_driver+0x352/0x570
>   driver_register+0x20f/0x390
>   ? __pci_register_driver+0x13a/0x210
>   ? 0xffffffffc1d8d000
>   do_one_initcall+0xfb/0x530
>   ? perf_trace_initcall_level+0x3d0/0x3d0
>   ? __memset+0x2b/0x30
>   ? unpoison_range+0x3a/0x60
>   do_init_module+0x1ce/0x7a0
>   load_module+0x9841/0xa380
>   ? module_frob_arch_sections+0x20/0x20
>   ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
>   ? sched_clock_cpu+0x18/0x170
>   ? sched_clock+0x5/0x10
>   ? lock_acquire+0x2dd/0x7a0
>   ? sched_clock+0x5/0x10
>   ? lock_is_held_type+0xb8/0xf0
>   ? __do_sys_init_module+0x18b/0x220
>   __do_sys_init_module+0x18b/0x220
>   ? load_module+0xa380/0xa380
>   ? ktime_get_coarse_real_ts64+0x12f/0x160
>   do_syscall_64+0x33/0x40
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f2c109da07e
> Code: 48 8b 0d f5 1d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
> 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d c2 1d 0c 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffc84d33f88 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
> RAX: ffffffffffffffda RBX: 000055b87f8260a0 RCX: 00007f2c109da07e
> RDX: 000055b87f834060 RSI: 0000000001e2cbf6 RDI: 00007f2c0b7e0010
> RBP: 00007f2c0b7e0010 R08: 000055b87f8281e0 R09: 00007ffc84d30a26
> R10: 000055bd2404cc18 R11: 0000000000000246 R12: 000055b87f834060
> R13: 000055b87f831ca0 R14: 0000000000000000 R15: 000055b87f832640
> [drm] Display Core initialized with v3.2.116!
> [drm] DMUB hardware initialized: version=0x02000001
> usb 1-3.2: Device not responding to setup address.
> usb 1-3.2: device not accepting address 5, error -71
> [drm] REG_WAIT timeout 1us * 100000 tries - mpc2_assert_idle_mpcc line:480
>
>
> 2.
> BUG: key ffff88810b0d9148 has not been registered!
> ------------[ cut here ]------------
> DEBUG_LOCKS_WARN_ON(1)
> WARNING: CPU: 25 PID: 500 at kernel/locking/lockdep.c:4618
> lockdep_init_map_waits+0x592/0x770
> Modules linked in: amdgpu(+) drm_ttm_helper ttm iommu_v2 gpu_sched
> drm_kms_helper cec crct10dif_pclmul crc32_pclmul crc32c_intel drm
> ghash_clmulni_intel ccp igb nvme dca nvme_core i2c_algo_bit xhci_pci
> xhci_pci_renesas wmi pinctrl_amd fuse
> CPU: 25 PID: 500 Comm: systemd-udevd Tainted: G        W
> --------- ---  5.11.0-0.rc4.129.fc34.x86_64+debug #1
> Hardware name: System manufacturer System Product Name/ROG STRIX
> X570-I GAMING, BIOS 2802 10/21/2020
> RIP: 0010:lockdep_init_map_waits+0x592/0x770
> Code: 08 84 d2 0f 85 d8 01 00 00 8b 3d e1 02 38 04 85 ff 0f 85 7e fc
> ff ff 48 c7 c6 e0 04 ca 8e 48 c7 c7 40 fd c9 8e e8 01 8e 23 02 <0f> 0b
> e9 64 fc ff ff 48 89 df 44 89 4c 24 0c 44 89 44 24 08 48 89
> RSP: 0018:ffffc900029bef88 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000000
> RDX: 0000000000000027 RSI: 0000000000000004 RDI: fffff52000537de7
> RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8886f9fe72ab
> R10: ffffed10df3fce55 R11: 0000000000000001 R12: ffff88810b0d9148
> R13: 0000000000000000 R14: ffffffff8edbda60 R15: ffff88810b0db690
> FS:  00007f2c0fdda140(0000) GS:ffff8886f9e00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055b8800aec68 CR3: 0000000127fd0000 CR4: 0000000000350ee0
> Call Trace:
>   ? lockdep_hardirqs_on+0x75/0xf0
>   __kernfs_create_file+0x102/0x2f0
>   sysfs_add_file_mode_ns+0x1af/0x500
>   sysfs_create_bin_file+0x100/0x160
>   ? lock_is_held_type+0xb8/0xf0
>   ? sysfs_add_file_to_group+0x150/0x150
>   ? static_obj+0x8a/0xc0
>   ? lockdep_init_map_waits+0x2a2/0x770
>   hdcp_create_workqueue+0x879/0xb50 [amdgpu]
>   amdgpu_dm_init.isra.0.cold+0x7f2/0x374c [amdgpu]
>   ? vprintk_emit+0x140/0x460
>   ? dev_vprintk_emit+0x2d8/0x31a
>   ? sched_clock+0x5/0x10
>   ? dm_resume+0x13b0/0x13b0 [amdgpu]
>   ? dev_attr_show.cold+0x35/0x35
>   ? psp_set_srm+0x250/0x250 [amdgpu]
>   ? hdcp_update_display+0x5b0/0x5b0 [amdgpu]
>   ? lock_downgrade+0x6b0/0x6b0
>   ? dev_printk_emit+0x8c/0xa8
>   ? dev_vprintk_emit+0x31a/0x31a
>   ? wait_for_completion_io+0x240/0x240
>   ? __dev_printk+0x71/0xdf
>   ? smu_hw_init.cold+0x16b/0x18a [amdgpu]
>   ? smu_suspend+0x240/0x240 [amdgpu]
>   ? navi10_ih_irq_init+0xea3/0x2420 [amdgpu]
>   dm_hw_init+0xe/0x20 [amdgpu]
>   amdgpu_device_init.cold+0x3031/0x4940 [amdgpu]
>   ? amdgpu_device_cache_pci_state+0xf0/0xf0 [amdgpu]
>   ? pci_bus_read_config_byte+0x140/0x140
>   ? do_pci_enable_device+0x1f8/0x260
>   ? pci_find_saved_ext_cap+0x110/0x110
>   ? pci_enable_bridge+0xf9/0x1e0
>   ? pci_dev_check_d3cold+0x107/0x250
>   ? pci_enable_device_flags+0x201/0x340
>   amdgpu_driver_load_kms+0x167/0x8a0 [amdgpu]
>   amdgpu_pci_probe+0x235/0x360 [amdgpu]
>   ? amdgpu_pci_remove+0xd0/0xd0 [amdgpu]
>   local_pci_probe+0xd8/0x170
>   pci_device_probe+0x318/0x5c0
>   ? kernfs_create_link+0x16c/0x230
>   ? pci_device_remove+0x1d0/0x1d0
>   really_probe+0x224/0xc40
>   driver_probe_device+0x1f2/0x380
>   device_driver_attach+0x1df/0x250
>   __driver_attach+0xf6/0x260
>   ? device_driver_attach+0x250/0x250
>   bus_for_each_dev+0x114/0x180
>   ? subsys_dev_iter_exit+0x10/0x10
>   bus_add_driver+0x352/0x570
>   driver_register+0x20f/0x390
>   ? __pci_register_driver+0x13a/0x210
>   ? 0xffffffffc1d8d000
>   do_one_initcall+0xfb/0x530
>   ? perf_trace_initcall_level+0x3d0/0x3d0
>   ? __memset+0x2b/0x30
>   ? unpoison_range+0x3a/0x60
>   do_init_module+0x1ce/0x7a0
>   load_module+0x9841/0xa380
>   ? module_frob_arch_sections+0x20/0x20
>   ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
>   ? sched_clock_cpu+0x18/0x170
>   ? sched_clock+0x5/0x10
>   ? lock_acquire+0x2dd/0x7a0
>   ? sched_clock+0x5/0x10
>   ? lock_is_held_type+0xb8/0xf0
>   ? __do_sys_init_module+0x18b/0x220
>   __do_sys_init_module+0x18b/0x220
>   ? load_module+0xa380/0xa380
>   ? ktime_get_coarse_real_ts64+0x12f/0x160
>   do_syscall_64+0x33/0x40
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x7f2c109da07e
> Code: 48 8b 0d f5 1d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
> 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 8b 0d c2 1d 0c 00 f7 d8 64 89 01 48
> RSP: 002b:00007ffc84d33f88 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
> RAX: ffffffffffffffda RBX: 000055b87f8260a0 RCX: 00007f2c109da07e
> RDX: 000055b87f834060 RSI: 0000000001e2cbf6 RDI: 00007f2c0b7e0010
> RBP: 00007f2c0b7e0010 R08: 000055b87f8281e0 R09: 00007ffc84d30a26
> R10: 000055bd2404cc18 R11: 0000000000000246 R12: 000055b87f834060
> R13: 000055b87f831ca0 R14: 0000000000000000 R15: 000055b87f832640
> irq event stamp: 593331
> hardirqs last  enabled at (593331): [<ffffffff8c3602f0>]
> console_unlock+0x7c0/0x9a0
> hardirqs last disabled at (593330): [<ffffffff8c3601e8>]
> console_unlock+0x6b8/0x9a0
> softirqs last  enabled at (593162): [<ffffffff8e801112>]
> asm_call_irq_on_stack+0x12/0x20
> softirqs last disabled at (593157): [<ffffffff8e801112>]
> asm_call_irq_on_stack+0x12/0x20
> ---[ end trace 37dc3a4a3aa1704a ]---
>
> The monitor switch-off issue still happens too, but the messages
> in the logs have become more detailed:
> [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -4!
> amdgpu 0000:0b:00.0: amdgpu: 0000000087613007 pin failed
> [drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin
> framebuffer with error -12
> [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -4!
> [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -4!
>
> I hope "[drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the
> buffer list -4!" gives an idea of what happened.
>
> Full kernel log is here: https://pastebin.com/nX69zgvf
>
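
The numeric codes in the quoted log lines (-12 for the failed framebuffer pin, -4 for the buffer list, -71 for the USB device) are negated Linux errno values. As a reading aid, here is a minimal sketch that maps them to symbolic names. The helper name decode_kernel_error is hypothetical, and the sketch assumes it runs on a Linux machine whose userspace errno numbering matches the kernel that produced the log.

```python
import errno
import os


def decode_kernel_error(code: int) -> str:
    """Map a negated errno value from a kernel log to its symbolic name.

    Hypothetical helper for illustration; it relies on the errno table of
    the machine it runs on, so run it on Linux for Linux kernel logs.
    """
    number = abs(code)
    # errno.errorcode maps errno numbers to names such as 'ENOMEM'.
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{code}: {name} ({os.strerror(number)})"


if __name__ == "__main__":
    # Codes taken from the log lines quoted in this message.
    for code in (-12, -4, -71):
        print(decode_kernel_error(code))
```

On x86-64 Linux these three codes correspond to ENOMEM, EINTR and EPROTO, so "Failed to pin framebuffer with error -12" reports an out-of-memory condition at the pin call. The mapping alone does not explain why the pin ran out of memory.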
