lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZG3Ho7eK+KqoAdda@yujie-X299>
Date:   Wed, 24 May 2023 16:15:31 +0800
From:   Yujie Liu <yujie.liu@...el.com>
To:     Luis Chamberlain <mcgrof@...nel.org>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <linux-modules@...r.kernel.org>
Subject: Re: [linus:master] [module] 8660484ed1:
 WARNING:at_kernel/module/dups.c:#kmod_dup_request_exists_wait

Hi Luis,

On Tue, May 23, 2023 at 10:16:17PM -0700, Luis Chamberlain wrote:
> On Mon, May 22, 2023 at 03:51:59PM +0800, kernel test robot wrote:
> > Hello,
> > 
> > kernel test robot noticed "WARNING:at_kernel/module/dups.c:#kmod_dup_request_exists_wait" on:
> > 
> > commit: 8660484ed1cf3261e89e0bad94c6395597e87599 ("module: add debugging auto-load duplicate module support")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > [test failed on linux-next/master 798d276b39e984345d52b933a900a71fa0815928]
> > 
> > in testcase: boot
> > 
> > compiler: gcc-11
> > test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > 
> > (please refer to attached dmesg/kmsg for entire log/backtrace)
> > 
> > 
> > We are not sure if this is expected as this patch is for debugging
> > duplicate module requests issues, so we are sending this report for
> > your information. Thanks.
> > 
> > 
> > If you fix the issue, kindly add following tag
> > | Reported-by: kernel test robot <yujie.liu@...el.com>
> > | Closes: https://lore.kernel.org/oe-lkp/202305221508.753c7b78-yujie.liu@intel.com
> > 
> > 
> > [   48.705567][    T1] ------------[ cut here ]------------
> > [   48.706519][    T1] module-autoload: duplicate request for module intel_qat
> > [ 48.707866][ T1] WARNING: CPU: 0 PID: 1 at kernel/module/dups.c:183 kmod_dup_request_exists_wait (kernel/module/dups.c:183 (discriminator 1)) 
> 
> This is a real issue since CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE was enabled.
> If you enable CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE and these warnings
> come up a bette detailed report would be better. In this case the goal
> was to capture races of request_module() and so the idea for developers
> is to identify the module that caused this, in this case intel_qat, and
> then see who called the request_module() for it. They could try things
> like try_module_get(), but could also ensure that multiple requests
> won't be done in the code by using locking schemes or whatever. Only
> one request should be issued.
> 
> The trace below would show possible users but I don't even see
> drivers/crypto/qat/qat_c3xxx/adf_drv.c in my kernel tree.
> 
> If you don't to deal with this reporting you can just disable this
> config option.

Thanks a lot for the information. Does this indicate that there is
indeed a multiple requests issue in that crypto driver and we could
report to its author/owner? Thanks.

Not sure if I undertand this correctly: enabling
CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE can help detect multiple
requests issues in existing code, so we should report to the original
author of the modules that have this issue, but shouldn't report on
this commit that introduces the CONFIG_MODULE_DEBUG_AUTOLOAD_DUPS_TRACE
option?

--
Best Regards,
Yujie

> 
>   Luis
> 
> > [   48.709747][    T1] Modules linked in:
> > [   48.710593][    T1] CPU: 0 PID: 1 Comm: swapper Not tainted 6.3.0-rc1-00121-g8660484ed1cf #1
> > [ 48.712267][ T1] RIP: 0010:kmod_dup_request_exists_wait (kernel/module/dups.c:183 (discriminator 1)) 
> > [ 48.713599][ T1] Code: c7 80 e3 2a 83 e8 ba 9f fe 00 48 89 ef e8 fe 4c 17 00 80 3d 87 15 0e 02 00 4c 89 e6 74 10 48 c7 c7 20 c0 47 82 e8 d8 ba f2 ff <0f> 0b eb 0c 48 c7 c7 80 c0 47 82 e8 f7 73 fc ff 45 84 f6 75 3a 48
> > All code
> > ========
> >    0:	c7 80 e3 2a 83 e8 ba 	movl   $0xfe9fba,-0x177cd51d(%rax)
> >    7:	9f fe 00 
> >    a:	48 89 ef             	mov    %rbp,%rdi
> >    d:	e8 fe 4c 17 00       	call   0x174d10
> >   12:	80 3d 87 15 0e 02 00 	cmpb   $0x0,0x20e1587(%rip)        # 0x20e15a0
> >   19:	4c 89 e6             	mov    %r12,%rsi
> >   1c:	74 10                	je     0x2e
> >   1e:	48 c7 c7 20 c0 47 82 	mov    $0xffffffff8247c020,%rdi
> >   25:	e8 d8 ba f2 ff       	call   0xfffffffffff2bb02
> >   2a:*	0f 0b                	ud2		<-- trapping instruction
> >   2c:	eb 0c                	jmp    0x3a
> >   2e:	48 c7 c7 80 c0 47 82 	mov    $0xffffffff8247c080,%rdi
> >   35:	e8 f7 73 fc ff       	call   0xfffffffffffc7431
> >   3a:	45 84 f6             	test   %r14b,%r14b
> >   3d:	75 3a                	jne    0x79
> >   3f:	48                   	rex.W
> > 
> > Code starting with the faulting instruction
> > ===========================================
> >    0:	0f 0b                	ud2
> >    2:	eb 0c                	jmp    0x10
> >    4:	48 c7 c7 80 c0 47 82 	mov    $0xffffffff8247c080,%rdi
> >    b:	e8 f7 73 fc ff       	call   0xfffffffffffc7407
> >   10:	45 84 f6             	test   %r14b,%r14b
> >   13:	75 3a                	jne    0x4f
> >   15:	48                   	rex.W
> > [   48.717441][    T1] RSP: 0000:ffffc9000001fc48 EFLAGS: 00010286
> > [   48.718679][    T1] RAX: 0000000000000000 RBX: ffffc9000001fcb8 RCX: 0000000000000000
> > [   48.720263][    T1] RDX: ffffc9000001f96f RSI: 0000000000000008 RDI: fffff52000003f7f
> > [   48.721791][    T1] RBP: ffff88812ec34000 R08: 0000000000000000 R09: ffffc9000001fa17
> > [   48.723349][    T1] R10: 0000000000000001 R11: ffffffff81199faf R12: ffffc9000001fd08
> > [   48.724701][    T1] R13: ffff88812a9cd800 R14: 0000000000000001 R15: 0000000000000001
> > [   48.726139][    T1] FS:  0000000000000000(0000) GS:ffffffff82cb6000(0000) knlGS:0000000000000000
> > [   48.727857][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   48.729151][    T1] CR2: ffff88843ffff000 CR3: 0000000002c6d000 CR4: 00000000000406b0
> > [   48.730703][    T1] Call Trace:
> > [   48.731336][    T1]  <TASK>
> > [ 48.731859][ T1] ? adfdrv_init (drivers/crypto/qat/qat_c3xxx/adf_drv.c:254) 
> > [ 48.732687][ T1] __request_module (kernel/module/kmod.c:168) 
> > [ 48.733637][ T1] ? free_modprobe_argv (kernel/module/kmod.c:133) 
> > [ 48.734681][ T1] ? w1_ds2760_family_init (drivers/power/supply/ds2760_battery.c:801) 
> > [ 48.735699][ T1] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) 
> > [ 48.736791][ T1] ? adfdrv_init (drivers/crypto/qat/qat_c62x/adf_drv.c:251) 
> > [ 48.737658][ T1] adfdrv_init (drivers/crypto/qat/qat_c3xxx/adf_drv.c:254) 
> > [ 48.738518][ T1] do_one_initcall (init/main.c:1306) 
> > [ 48.739479][ T1] ? rcu_lock_acquire (kernel/notifier.c:262) 
> > [ 48.740637][ T1] ? rcu_read_lock_sched_held (kernel/rcu/update.c:125) 
> > [ 48.741731][ T1] ? lock_is_held (include/linux/sched.h:1572) 
> > [ 48.742565][ T1] do_initcalls (init/main.c:1378 init/main.c:1395) 
> > [ 48.743358][ T1] kernel_init_freeable (init/main.c:1638) 
> > [ 48.744281][ T1] ? rest_init (init/main.c:1514) 
> > [ 48.745140][ T1] kernel_init (init/main.c:1524) 
> > [ 48.745996][ T1] ret_from_fork (arch/x86/entry/entry_64.S:314) 
> > [   48.746959][    T1]  </TASK>
> > [   48.747563][    T1] irq event stamp: 2359969
> > [ 48.748417][ T1] hardirqs last enabled at (2359977): __up_console_sem (arch/x86/include/asm/irqflags.h:26 (discriminator 3) arch/x86/include/asm/irqflags.h:67 (discriminator 3) arch/x86/include/asm/irqflags.h:127 (discriminator 3) kernel/printk/printk.c:345 (discriminator 3)) 
> > [ 48.750315][ T1] hardirqs last disabled at (2359986): __up_console_sem (kernel/printk/printk.c:343 (discriminator 3)) 
> > [ 48.752231][ T1] softirqs last enabled at (2359662): __do_softirq (arch/x86/include/asm/preempt.h:27 kernel/softirq.c:415 kernel/softirq.c:600) 
> > [ 48.754058][ T1] softirqs last disabled at (2359653): irq_exit (kernel/softirq.c:445 kernel/softirq.c:650 kernel/softirq.c:674) 
> > [   48.755866][    T1] ---[ end trace 0000000000000000 ]---
> > [ 48.757648][ T1] initcall adfdrv_init+0x0/0x45 returned 0 after 52170 usecs 
> > [ 48.759132][ T1] calling adfdrv_init+0x0/0x45 @ 1 
> > 
> > 
> > To reproduce:
> > 
> >         # build kernel
> > 	cd linux
> > 	cp config-6.3.0-rc1-00121-g8660484ed1cf .config
> > 	make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
> > 	make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
> > 	cd <mod-install-dir>
> > 	find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
> > 
> > 
> >         git clone https://github.com/intel/lkp-tests.git
> >         cd lkp-tests
> >         bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
> > 
> >         # if come across any failure that blocks the test,
> >         # please remove ~/.lkp and /lkp dir to run from a clean state.
> > 
> > 
> > -- 
> > 0-DAY CI Kernel Test Service
> > https://github.com/intel/lkp-tests/wiki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ