Date:   Mon, 18 Apr 2022 21:52:01 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Robert Beckett <bob.beckett@...labora.com>
Cc:     0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, dri-devel@...ts.freedesktop.org,
        intel-gfx@...ts.freedesktop.org,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        Robert Beckett <bob.beckett@...labora.com>,
        Thomas Hellström <thomas.hellstrom@...ux.intel.com>,
        Matthew Auld <matthew.auld@...el.com>
Subject: [drm/i915]  9c20c625e8: WARNING:possible_recursive_locking_detected



Greetings,

FYI, we noticed the following commit (built with gcc-11):

commit: 9c20c625e8f84a42c7963db854df4439553e8bc4 ("[PATCH v3 5/5] drm/i915: stolen memory use ttm backend")
url: https://github.com/intel-lab-lkp/linux/commits/Robert-Beckett/drm-i915-ttm-for-stolen-region/20220413-034123
base: git://anongit.freedesktop.org/drm/drm-tip drm-tip
patch link: https://lore.kernel.org/dri-devel/20220412193817.2098308-6-bob.beckett@collabora.com

in testcase: kernel-selftests
version: kernel-selftests-x86_64-a17aac1b-1_20220413
with the following parameters:

	group: x86
	ucode: 0xec

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: 4 threads Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 32G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


[   47.240273][  T197] WARNING: possible recursive locking detected
[   47.246317][  T197] 5.18.0-rc2-00646-g9c20c625e8f8 #1 Not tainted
[   47.252442][  T197] --------------------------------------------
[   47.258478][  T197] systemd-udevd/197 is trying to acquire lock:
[ 47.264516][ T197] ffffc90000fe72b0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: i915_gem_object_pin_pages_unlocked (drivers/gpu/drm/i915/gem/i915_gem_pages.c:150) i915
[   47.277112][  T197]
[   47.277112][  T197] but task is already holding lock:
[ 47.284369][ T197] ffffc90000fe7578 (reservation_ww_class_acquire){+.+.}-{0:0}, at: __intel_context_do_pin (drivers/gpu/drm/i915/gt/intel_context.c:301) i915
[   47.295899][  T197]
[   47.295899][  T197] other info that might help us debug this:
[   47.303847][  T197]  Possible unsafe locking scenario:
[   47.303847][  T197]
[   47.311190][  T197]        CPU0
[   47.314346][  T197]        ----
[   47.317501][  T197]   lock(reservation_ww_class_acquire);
[   47.322925][  T197]   lock(reservation_ww_class_acquire);
[   47.328349][  T197]
[   47.328349][  T197]  *** DEADLOCK ***
[   47.328349][  T197]
[   47.336386][  T197]  May be due to missing lock nesting notation
[   47.336386][  T197]
[   47.344598][  T197] 3 locks held by systemd-udevd/197:
[ 47.349760][ T197] #0: ffff888103a96248 (&dev->mutex){....}-{3:3}, at: __driver_attach (drivers/base/dd.c:1140) 
[ 47.358945][ T197] #1: ffffc90000fe7578 (reservation_ww_class_acquire){+.+.}-{0:0}, at: __intel_context_do_pin (drivers/gpu/drm/i915/gt/intel_context.c:301) i915
[ 47.370902][ T197] #2: ffff8882069c1068 (&ce->pin_mutex){+.+.}-{3:3}, at: intel_context_alloc_state (drivers/gpu/drm/i915/gt/intel_context.c:55) i915
[   47.381889][  T197]
[   47.381889][  T197] stack backtrace:
[   47.387661][  T197] CPU: 2 PID: 197 Comm: systemd-udevd Not tainted 5.18.0-rc2-00646-g9c20c625e8f8 #1
[   47.396923][  T197] Hardware name: Dell Inc. OptiPlex 7040/0Y7WYT, BIOS 1.8.1 12/05/2017
[   47.405051][  T197] Call Trace:
[   47.408209][  T197]  <TASK>
[ 47.411020][ T197] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
[ 47.415405][ T197] validate_chain.cold (kernel/locking/lockdep.c:2958 kernel/locking/lockdep.c:3001 kernel/locking/lockdep.c:3790) 
[ 47.420400][ T197] ? check_prev_add (kernel/locking/lockdep.c:3759) 
[ 47.425307][ T197] __lock_acquire (kernel/locking/lockdep.c:5029) 
[ 47.429952][ T197] ? lock_downgrade (kernel/locking/lockdep.c:5652) 
[ 47.434680][ T197] lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5646 kernel/locking/lockdep.c:5609) 
[ 47.439062][ T197] ? i915_gem_object_pin_pages_unlocked (drivers/gpu/drm/i915/gem/i915_gem_pages.c:150) i915
[ 47.446226][ T197] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5)) 
[ 47.450865][ T197] ? __mutex_unlock_slowpath (arch/x86/include/asm/atomic64_64.h:190 include/linux/atomic/atomic-long.h:449 include/linux/atomic/atomic-instrumented.h:1790 kernel/locking/mutex.c:910) 
[ 47.456379][ T197] ? wait_for_completion_io_timeout (kernel/locking/mutex.c:888) 
[ 47.462338][ T197] ? _raw_spin_unlock (arch/x86/include/asm/preempt.h:85 include/linux/spinlock_api_smp.h:143 kernel/locking/spinlock.c:186) 
[ 47.467076][ T197] i915_gem_ww_ctx_init (include/linux/ww_mutex.h:152 drivers/gpu/drm/i915/i915_gem_ww.c:11) i915
[ 47.472948][ T197] ? i915_gem_object_pin_pages_unlocked (drivers/gpu/drm/i915/gem/i915_gem_pages.c:150) i915
[ 47.480109][ T197] i915_gem_object_pin_pages_unlocked (drivers/gpu/drm/i915/gem/i915_gem_pages.c:150) i915
[ 47.487094][ T197] ? __i915_gem_ttm_object_init (drivers/gpu/drm/i915/gem/i915_gem_object.h:228 drivers/gpu/drm/i915/gem/i915_gem_ttm.c:1238) i915
[ 47.493632][ T197] ? __i915_gem_object_get_pages (drivers/gpu/drm/i915/gem/i915_gem_pages.c:146) i915
[ 47.500260][ T197] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) 
[ 47.505685][ T197] ? trace_i915_gem_object_create (drivers/gpu/drm/i915/i915_trace.h:22) i915
[ 47.512317][ T197] ? __i915_gem_object_create_region (drivers/gpu/drm/i915/gem/i915_gem_region.c:102) i915
[ 47.519291][ T197] i915_gem_object_create_stolen (drivers/gpu/drm/i915/gem/i915_gem_stolen.c:547) i915
[ 47.525832][ T197] intel_engine_create_ring (drivers/gpu/drm/i915/gt/intel_ring.c:120 drivers/gpu/drm/i915/gt/intel_ring.c:173) i915
[ 47.532021][ T197] lrc_alloc (drivers/gpu/drm/i915/gt/intel_lrc.c:990) i915
[ 47.536898][ T197] intel_context_alloc_state (drivers/gpu/drm/i915/gt/intel_context.c:64) i915
[ 47.543078][ T197] __intel_context_do_pin_ww (drivers/gpu/drm/i915/gt/intel_context.c:211) i915
[ 47.549516][ T197] ? lock_downgrade (kernel/locking/lockdep.c:5652) 
[ 47.554245][ T197] ? rwlock_bug+0xc0/0xc0 
[ 47.559062][ T197] ? intel_context_alloc_state (drivers/gpu/drm/i915/gt/intel_context.c:205) i915
[ 47.565508][ T197] ? i915_gem_ww_ctx_init (include/linux/ww_mutex.h:152 drivers/gpu/drm/i915/i915_gem_ww.c:11) i915
[ 47.571532][ T197] __intel_context_do_pin (drivers/gpu/drm/i915/gt/intel_context.c:304) i915
[ 47.577452][ T197] ? __module_address (kernel/module.c:4737) 
[ 47.582878][ T197] ? __intel_context_do_pin_ww (drivers/gpu/drm/i915/gt/intel_context.c:297) i915
[ 47.589500][ T197] ? __i915_active_init (include/linux/list.h:38 drivers/gpu/drm/i915/i915_active.c:360) i915
[ 47.595344][ T197] ? intel_engine_print_breadcrumbs (drivers/gpu/drm/i915/gt/intel_context.c:371) i915
[ 47.602220][ T197] intel_engine_create_pinned_context (drivers/gpu/drm/i915/gt/intel_context.h:157 drivers/gpu/drm/i915/gt/intel_engine_cs.c:1060) i915
[ 47.609273][ T197] intel_engines_init (drivers/gpu/drm/i915/gt/intel_engine_cs.c:1100 drivers/gpu/drm/i915/gt/intel_engine_cs.c:1131 drivers/gpu/drm/i915/gt/intel_engine_cs.c:1176) i915
[ 47.614928][ T197] ? execlists_unwind_incomplete_requests (drivers/gpu/drm/i915/gt/intel_execlists_submission.c:3485) i915
[ 47.622148][ T197] intel_gt_init (drivers/gpu/drm/i915/gt/intel_gt.c:716) i915
[ 47.627370][ T197] i915_gem_init (drivers/gpu/drm/i915/i915_gem.c:1121) i915
[ 47.632605][ T197] ? intel_modeset_init_nogem (drivers/gpu/drm/i915/display/intel_display.c:9780) i915
[ 47.638984][ T197] i915_driver_probe (drivers/gpu/drm/i915/i915_driver.c:872) i915
[ 47.644545][ T197] ? i915_print_iommu_status (drivers/gpu/drm/i915/i915_driver.c:822) i915
[ 47.650627][ T197] ? drm_privacy_screen_get (drivers/gpu/drm/drm_privacy_screen.c:167) 
[ 47.656054][ T197] i915_pci_probe (drivers/gpu/drm/i915/i915_pci.c:1202) i915
[ 47.661357][ T197] ? i915_pci_remove (drivers/gpu/drm/i915/i915_pci.c:1202) i915
[ 47.666741][ T197] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4501) 
[ 47.673211][ T197] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194) 
[ 47.678899][ T197] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:50 (discriminator 22)) 
[ 47.683804][ T197] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194) 
[ 47.689497][ T197] ? i915_pci_remove (drivers/gpu/drm/i915/i915_pci.c:1202) i915
[ 47.694894][ T197] local_pci_probe (drivers/pci/pci-driver.c:323) 
[ 47.699455][ T197] pci_call_probe (drivers/pci/pci-driver.c:391) 
[ 47.704017][ T197] ? rwlock_bug+0xc0/0xc0 
[ 47.708840][ T197] ? pci_pm_suspend_noirq (drivers/pci/pci-driver.c:351) 
[ 47.714106][ T197] ? pci_match_device (drivers/pci/pci-driver.c:107 drivers/pci/pci-driver.c:158) 
[ 47.719016][ T197] ? kernfs_put (arch/x86/include/asm/atomic.h:123 (discriminator 1) include/linux/atomic/atomic-instrumented.h:576 (discriminator 1) fs/kernfs/dir.c:513 (discriminator 1)) 
[ 47.723233][ T197] pci_device_probe (drivers/pci/pci-driver.c:460) 
[ 47.727881][ T197] ? pci_dma_configure (drivers/pci/pci-driver.c:1620) 
[ 47.732789][ T197] really_probe (drivers/base/dd.c:541 drivers/base/dd.c:620) 
[ 47.737174][ T197] __driver_probe_device (drivers/base/dd.c:751) 
[ 47.742339][ T197] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4501) 
[ 47.748815][ T197] driver_probe_device (drivers/base/dd.c:781) 
[ 47.753721][ T197] __driver_attach (drivers/base/dd.c:1141) 
[ 47.758366][ T197] ? __device_attach_driver (drivers/base/dd.c:1093) 
[ 47.763792][ T197] bus_for_each_dev (drivers/base/bus.c:301) 
[ 47.768523][ T197] ? lockdep_init_map_type (kernel/locking/lockdep.c:4812) 
[ 47.773864][ T197] ? subsys_dev_iter_exit (drivers/base/bus.c:290) 
[ 47.778948][ T197] bus_add_driver (drivers/base/bus.c:619) 
[ 47.783507][ T197] driver_register (drivers/base/driver.c:171) 
[ 47.788151][ T197] ? __pci_register_driver (include/linux/list.h:37 drivers/pci/pci-driver.c:1413) 
[ 47.793491][ T197] i915_init (drivers/gpu/drm/i915/i915_driver.c:1008) i915
[   47.798293][  T197]  ? 0xffffffffc1201000
[ 47.802324][ T197] do_one_initcall (init/main.c:1298) 
[ 47.806878][ T197] ? trace_event_raw_event_initcall_level (init/main.c:1289) 
[ 47.813525][ T197] ? memset_erms (arch/x86/lib/memset_64.S:65) 
[ 47.817734][ T197] ? kasan_unpoison (mm/kasan/shadow.c:108 mm/kasan/shadow.c:142) 
[ 47.822290][ T197] do_init_module (kernel/module.c:3731) 
[ 47.826848][ T197] __do_sys_finit_module (kernel/module.c:4222) 
[ 47.832015][ T197] ? __ia32_sys_init_module (kernel/module.c:4190) 
[ 47.837269][ T197] ? __seccomp_filter (arch/x86/include/asm/bitops.h:214 include/asm-generic/bitops/instrumented-non-atomic.h:135 kernel/seccomp.c:351 kernel/seccomp.c:378 kernel/seccomp.c:410 kernel/seccomp.c:1183) 
[ 47.842180][ T197] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[ 47.846475][ T197] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4501) 
[ 47.852948][ T197] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:115) 
[   47.858725][  T197] RIP: 0033:0x7fc6644eb989
[ 47.863018][ T197] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d7 64 0c 00 f7 d8 64 89 01 48
All code
========
   0:	00 c3                	add    %al,%bl
   2:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
   9:	00 00 00 
   c:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  11:	48 89 f8             	mov    %rdi,%rax
  14:	48 89 f7             	mov    %rsi,%rdi
  17:	48 89 d6             	mov    %rdx,%rsi
  1a:	48 89 ca             	mov    %rcx,%rdx
  1d:	4d 89 c2             	mov    %r8,%r10
  20:	4d 89 c8             	mov    %r9,%r8
  23:	4c 8b 4c 24 08       	mov    0x8(%rsp),%r9
  28:	0f 05                	syscall 
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	retq   
  33:	48 8b 0d d7 64 0c 00 	mov    0xc64d7(%rip),%rcx        # 0xc6511
  3a:	f7 d8                	neg    %eax
  3c:	64 89 01             	mov    %eax,%fs:(%rcx)
  3f:	48                   	rex.W

Code starting with the faulting instruction
===========================================
   0:	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax
   6:	73 01                	jae    0x9
   8:	c3                   	retq   
   9:	48 8b 0d d7 64 0c 00 	mov    0xc64d7(%rip),%rcx        # 0xc64e7
  10:	f7 d8                	neg    %eax
  12:	64 89 01             	mov    %eax,%fs:(%rcx)
  15:	48                   	rex.W
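
As a reading aid for the "Possible unsafe locking scenario" in the log above: the task opens a second reservation_ww_class_acquire context in i915_gem_object_pin_pages_unlocked while it is still inside the one opened by __intel_context_do_pin. The sketch below is a toy userspace model in Python, not kernel code and not how lockdep is implemented. The class HeldLocks and the function replay_report are hypothetical names, and the model deliberately ignores the nesting annotations that real lockdep honours. It only shows why a second acquisition of the same lock class by the same task gets flagged.

```python
from dataclasses import dataclass, field


@dataclass
class HeldLocks:
    """Toy per-task list of held lock classes (hypothetical, not lockdep's data structure)."""
    held: list = field(default_factory=list)  # (lock_class, call_site) pairs

    def acquire(self, lock_class, site):
        """Record an acquisition; return a warning string if the class is already held."""
        warning = None
        for held_class, held_site in self.held:
            if held_class == lock_class:
                warning = (
                    f"possible recursive locking: {lock_class} "
                    f"acquired at {site}, already held from {held_site}"
                )
                break
        self.held.append((lock_class, site))
        return warning


def replay_report():
    """Replay the acquisition order listed under '3 locks held' plus the new acquire."""
    task = HeldLocks()
    sequence = [
        ("&dev->mutex", "__driver_attach"),
        ("reservation_ww_class_acquire", "__intel_context_do_pin"),
        ("&ce->pin_mutex", "intel_context_alloc_state"),
        # The acquisition that triggers the warning in the log:
        ("reservation_ww_class_acquire", "i915_gem_object_pin_pages_unlocked"),
    ]
    warnings = []
    for lock_class, site in sequence:
        message = task.acquire(lock_class, site)
        if message is not None:
            warnings.append(message)
    return warnings


if __name__ == "__main__":
    for line in replay_report():
        print(line)
```

Only the fourth acquisition produces a warning in this model, because it is the only one whose class is already in the held list.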


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.
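
After a run, the resulting kernel log can be checked for the same warning. The following is a minimal sketch, assuming the log is available as plain text lines in the format shown in this report. The function name find_recursive_lock_reports and the regular expressions are illustrative assumptions, not part of lkp-tests.

```python
import re

# Matches the "[   47.240273][  T197] " style prefix seen in this report.
PREFIX = re.compile(r"^\[\s*\d+\.\d+\]\[\s*T\d+\]\s*")
# Matches "(lock_class){usage}-{wait types}, at: function".
AT_SITE = re.compile(r"\(([^)]+)\)\{[^}]*\}-\{[^}]*\}, at: (\S+)")
MARKER = "WARNING: possible recursive locking detected"


def find_recursive_lock_reports(lines):
    """Return one dict per warning with the lock class and the two call sites."""
    reports = []
    current = None
    for raw in lines:
        text = PREFIX.sub("", raw.rstrip("\n"))
        if MARKER in text:
            current = {"lock_class": None, "acquiring_at": None, "held_at": None}
            reports.append(current)
            continue
        if current is None:
            continue
        match = AT_SITE.search(text)
        if match is None:
            continue
        if current["acquiring_at"] is None:
            # First site after the marker: the lock being acquired.
            current["lock_class"] = match.group(1)
            current["acquiring_at"] = match.group(2)
        else:
            # Second site: where the already-held lock was taken.
            current["held_at"] = match.group(2)
            current = None  # this report is complete
    return reports


# Lines copied from the log in this report.
SAMPLE = [
    "[   47.240273][  T197] WARNING: possible recursive locking detected",
    "[   47.258478][  T197] systemd-udevd/197 is trying to acquire lock:",
    "[ 47.264516][ T197] ffffc90000fe72b0 (reservation_ww_class_acquire){+.+.}-{0:0}, "
    "at: i915_gem_object_pin_pages_unlocked (drivers/gpu/drm/i915/gem/i915_gem_pages.c:150) i915",
    "[   47.277112][  T197] but task is already holding lock:",
    "[ 47.284369][ T197] ffffc90000fe7578 (reservation_ww_class_acquire){+.+.}-{0:0}, "
    "at: __intel_context_do_pin (drivers/gpu/drm/i915/gt/intel_context.c:301) i915",
]

if __name__ == "__main__":
    for report in find_recursive_lock_reports(SAMPLE):
        print(report)
```

An empty result on a patched kernel's log would suggest the warning no longer triggers on that run. The attached dmesg.xz could be read with Python's lzma.open(path, "rt") and its lines passed to the same function.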



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.18.0-rc2-00646-g9c20c625e8f8" of type "text/plain" (167279 bytes)

View attachment "job-script" of type "text/plain" (5916 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (42448 bytes)

View attachment "kernel-selftests" of type "text/plain" (141694 bytes)

View attachment "job.yaml" of type "text/plain" (4947 bytes)

View attachment "reproduce" of type "text/plain" (146 bytes)
