lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABXGCsPAdv6wCWmzh6OQmDX1LOf_FEu_wH=4K9HDd_rToTdwrQ@mail.gmail.com>
Date:   Wed, 7 Apr 2021 14:30:49 +0500
From:   Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To:     amd-gfx list <amd-gfx@...ts.freedesktop.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Cc:     Christian König <christian.koenig@....com>,
        Dave Airlie <airlied@...il.com>
Subject: Unexpected multihop in swaput - likely driver bug.

Hi!
During the 5.12 testing cycle I observed the repeatable bug when
launching heavy graphic applications.
The kernel log is flooded with the message "Unexpected multihop in
swaput - likely driver bug.".

Trace:
[ 8707.814899] ------------[ cut here ]------------
[ 8707.814920] Unexpected multihop in swaput - likely driver bug.
[ 8707.814998] WARNING: CPU: 19 PID: 28231 at
drivers/gpu/drm/ttm/ttm_bo.c:1484 ttm_bo_swapout+0x40b/0x420 [ttm]
[ 8707.815011] Modules linked in: tun uinput snd_seq_dummy rfcomm
snd_hrtimer netconsole nft_objref nf_conntrack_netbios_ns
nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib
nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct
nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set
nf_tables nfnetlink cmac bnep sunrpc vfat fat hid_logitech_hidpp
hid_logitech_dj intel_rapl_msr snd_hda_codec_realtek intel_rapl_common
mt76x2u snd_hda_codec_generic mt76x2_common mt76x02_usb iwlmvm
ledtrig_audio snd_hda_codec_hdmi mt76_usb mt76x02_lib snd_hda_intel
mt76 snd_intel_dspcfg snd_intel_sdw_acpi mac80211 joydev snd_usb_audio
snd_hda_codec uvcvideo edac_mce_amd videobuf2_vmalloc snd_hda_core
snd_usbmidi_lib videobuf2_memops snd_hwdep iwlwifi snd_rawmidi btusb
videobuf2_v4l2 kvm_amd snd_seq videobuf2_common btrtl btbcm videodev
btintel snd_seq_device kvm mc cfg80211 bluetooth snd_pcm libarc4
eeepc_wmi snd_timer asus_wmi irqbypass xpad sp5100_tco
[ 8707.815065]  sparse_keymap ecdh_generic ff_memless video ecc
wmi_bmof i2c_piix4 snd rapl k10temp soundcore rfkill acpi_cpufreq
ip_tables amdgpu drm_ttm_helper ttm iommu_v2 gpu_sched drm_kms_helper
crct10dif_pclmul crc32_pclmul crc32c_intel cec drm ghash_clmulni_intel
igb ccp nvme dca nvme_core i2c_algo_bit wmi pinctrl_amd fuse
[ 8707.815096] CPU: 19 PID: 28231 Comm: kworker/u64:1 Tainted: G
 W        --------- ---  5.12.0-0.rc6.184.fc35.x86_64+debug #1
[ 8707.815101] Hardware name: System manufacturer System Product
Name/ROG STRIX X570-I GAMING, BIOS 3603 03/20/2021
[ 8707.815106] Workqueue: ttm_swap ttm_shrink_work [ttm]
[ 8707.815114] RIP: 0010:ttm_bo_swapout+0x40b/0x420 [ttm]
[ 8707.815122] Code: 10 00 00 48 c1 e2 0c 48 c1 e6 0c e8 3f 37 fa c8
e9 71 fe ff ff 83 f8 b8 0f 85 a9 fe ff ff 48 c7 c7 28 32 37 c0 e8 02
2b 98 c9 <0f> 0b e9 96 fe ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f
00 0f
[ 8707.815126] RSP: 0018:ffffa306d20e7d58 EFLAGS: 00010292
[ 8707.815130] RAX: 0000000000000032 RBX: ffffffffc0379260 RCX: 0000000000000027
[ 8707.815133] RDX: ffff918c091daae8 RSI: 0000000000000001 RDI: ffff918c091daae0
[ 8707.815136] RBP: ffff918602210058 R08: 0000000000000000 R09: 0000000000000000
[ 8707.815138] R10: ffffa306d20e7b90 R11: ffff918c2e2fffe8 R12: 00000000ffffffb8
[ 8707.815141] R13: ffffffffc03792a0 R14: ffff9186022102c0 R15: 0000000000000001
[ 8707.815145] FS:  0000000000000000(0000) GS:ffff918c09000000(0000)
knlGS:0000000000000000
[ 8707.815148] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8707.815151] CR2: 0000325c84d12000 CR3: 0000000776c28000 CR4: 0000000000350ee0
[ 8707.815154] Call Trace:
[ 8707.815164]  ttm_shrink+0xa6/0xe0 [ttm]
[ 8707.815171]  ttm_shrink_work+0x36/0x40 [ttm]
[ 8707.815177]  process_one_work+0x2b0/0x5e0
[ 8707.815185]  worker_thread+0x55/0x3c0
[ 8707.815188]  ? process_one_work+0x5e0/0x5e0
[ 8707.815192]  kthread+0x13a/0x150
[ 8707.815196]  ? __kthread_bind_mask+0x60/0x60
[ 8707.815199]  ret_from_fork+0x22/0x30
[ 8707.815207] irq event stamp: 0
[ 8707.815209] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[ 8707.815213] hardirqs last disabled at (0): [<ffffffff890ddafb>]
copy_process+0x91b/0x1e10
[ 8707.815218] softirqs last  enabled at (0): [<ffffffff890ddafb>]
copy_process+0x91b/0x1e10
[ 8707.815222] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 8707.815224] ---[ end trace 29252aa87289bbaa ]---

Full kernel log: https://pastebin.com/mmAxwBYc

$ /usr/src/kernels/`uname -r`/scripts/faddr2line
/lib/debug/lib/modules/`uname
-r`/kernel/drivers/gpu/drm/ttm/ttm.ko.debug ttm_bo_swapout+0x40b
ttm_bo_swapout+0x40b/0x420:
ttm_bo_swapout at
/usr/src/debug/kernel-5.12-rc6/linux-5.12.0-0.rc6.184.fc35.x86_64/drivers/gpu/drm/ttm/ttm_bo.c:1484
(discriminator 1)


$ git blame drivers/gpu/drm/ttm/ttm_bo.c -L 1475,1494
Blaming lines:   1% (20/1530), done.
ebdf565169af0 (Dave Airlie      2020-10-29 13:58:52 +1000 1475)
         memset(&hop, 0, sizeof(hop));
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1476)
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1477)
         evict_mem = bo->mem;
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1478)
         evict_mem.mm_node = NULL;
ce65b874001d7 (Christian König  2020-09-30 16:44:16 +0200 1479)
         evict_mem.placement = 0;
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1480)
         evict_mem.mem_type = TTM_PL_SYSTEM;
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1481)
ebdf565169af0 (Dave Airlie      2020-10-29 13:58:52 +1000 1482)
         ret = ttm_bo_handle_move_mem(bo, &evict_mem, true, &ctx,
&hop);
ebdf565169af0 (Dave Airlie      2020-10-29 13:58:52 +1000 1483)
         if (unlikely(ret != 0)) {
ebdf565169af0 (Dave Airlie      2020-10-29 13:58:52 +1000 1484)
                 WARN(ret == -EMULTIHOP, "Unexpected multihop in
swaput - likely driver bug.\n");
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1485)
                 goto out;
ebdf565169af0 (Dave Airlie      2020-10-29 13:58:52 +1000 1486)
         }
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1487)         }
ba4e7d973dd09 (Thomas Hellstrom 2009-06-10 15:20:19 +0200 1488)
61ede07055539 (Christian König  2016-06-06 10:17:57 +0200 1489)         /**
61ede07055539 (Christian König  2016-06-06 10:17:57 +0200 1490)
  * Make sure BO is idle.
61ede07055539 (Christian König  2016-06-06 10:17:57 +0200 1491)          */
61ede07055539 (Christian König  2016-06-06 10:17:57 +0200 1492)
61ede07055539 (Christian König  2016-06-06 10:17:57 +0200 1493)
 ret = ttm_bo_wait(bo, false, false);
61ede07055539 (Christian König  2016-06-06 10:17:57 +0200 1494)
 if (unlikely(ret != 0))




-- 
Best Regards,
Mike Gavrilov.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ