Message-ID: <202210271500.d4d75b8e-yujie.liu@intel.com>
Date: Thu, 27 Oct 2022 15:38:16 +0800
From: kernel test robot <yujie.liu@...el.com>
To: Anna-Maria Behnsen <anna-maria@...utronix.de>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
John Stultz <john.stultz@...aro.org>,
Eric Dumazet <edumazet@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
<linux-pm@...r.kernel.org>, Arjan van de Ven <arjan@...radead.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Frederic Weisbecker <fweisbec@...il.com>,
"Rik van Riel" <riel@...hat.com>,
Anna-Maria Behnsen <anna-maria@...utronix.de>
Subject: Re: [PATCH v3 14/17] timer: Implement the hierarchical pull model
Greetings,
FYI, we noticed BUG:KASAN:stack-out-of-bounds_in_tmigr_inactive_up due to the following commit (built with gcc-11):
commit: b57766ae36b5e3dd11225f0259f9fd7d39a79e94 ("[PATCH v3 14/17] timer: Implement the hierarchical pull model")
url: https://github.com/intel-lab-lkp/linux/commits/Anna-Maria-Behnsen/timer-Move-from-a-push-remote-at-enqueue-to-a-pull-at-expiry-model/20221025-220106
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 8be3f96ceddb911539a53d87a66da84a04502366
patch link: https://lore.kernel.org/lkml/20221025135850.51044-15-anna-maria@linutronix.de
patch subject: [PATCH v3 14/17] timer: Implement the hierarchical pull model
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
[ 387.110466][ T0] BUG: KASAN: stack-out-of-bounds in tmigr_inactive_up (include/linux/find.h:168 kernel/time/timer_migration.c:518)
[ 387.112931][ T0] Read of size 8 at addr ffffffff84a07b30 by task swapper/0/0
[ 387.115269][ T0]
[ 387.116037][ T0] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rc1-00015-gb57766ae36b5 #1
[ 387.118685][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 387.121748][ T0] Call Trace:
[ 387.122894][ T0] <TASK>
[ 387.123933][ T0] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 387.125423][ T0] print_address_description+0x87/0x2a5
[ 387.127400][ T0] print_report (mm/kasan/report.c:396)
[ 387.128880][ T0] ? kasan_addr_to_slab (mm/kasan/common.c:35)
[ 387.130367][ T0] ? tmigr_inactive_up (include/linux/find.h:168 kernel/time/timer_migration.c:518)
[ 387.131888][ T0] kasan_report (mm/kasan/report.c:497)
[ 387.133264][ T0] ? tmigr_inactive_up (include/linux/find.h:168 kernel/time/timer_migration.c:518)
[ 387.134852][ T0] tmigr_inactive_up (include/linux/find.h:168 kernel/time/timer_migration.c:518)
[ 387.136406][ T0] ? kasan_save_stack (mm/kasan/common.c:47)
[ 387.137843][ T0] ? tmigr_handle_remote_up (kernel/time/timer_migration.c:494)
[ 387.139488][ T0] ? secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358)
[ 387.141281][ T0] __tmigr_cpu_deactivate (kernel/time/timer_migration.c:143 kernel/time/timer_migration.c:155 kernel/time/timer_migration.c:599)
[ 387.142987][ T0] ? tmigr_inactive_up (kernel/time/timer_migration.c:583)
[ 387.144609][ T0] ? _raw_write_lock_irq (kernel/locking/spinlock.c:153)
[ 387.146255][ T0] ? sched_clock_local (kernel/sched/clock.c:290)
[ 387.147849][ T0] tmigr_cpu_deactivate (kernel/time/timer_migration.c:652)
[ 387.149436][ T0] ? tmigr_cpu_activate (kernel/time/timer_migration.c:613)
[ 387.150980][ T0] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:186 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 387.152436][ T0] ? _raw_write_lock_irq (kernel/locking/spinlock.c:153)
[ 387.154068][ T0] ? update_load_avg (kernel/sched/fair.c:3851 kernel/sched/fair.c:4186)
[ 387.155707][ T0] forward_and_idle_timer_bases (kernel/time/timer.c:1868)
[ 387.157540][ T0] tick_nohz_next_event (kernel/time/tick-sched.c:839)
[ 387.159056][ T0] ? tick_nohz_full_kick (kernel/time/tick-sched.c:804)
[ 387.160631][ T0] ? __switch_to (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/thread_info.h:89 arch/x86/include/asm/fpu/sched.h:65 arch/x86/kernel/process_64.c:623)
[ 387.162049][ T0] tick_nohz_idle_stop_tick (kernel/time/tick-sched.c:1119 kernel/time/tick-sched.c:1149)
[ 387.163559][ T0] cpuidle_idle_call (kernel/sched/idle.c:191)
[ 387.164948][ T0] ? arch_cpu_idle_exit+0xc0/0xc0
[ 387.166569][ T0] do_idle (kernel/sched/idle.c:303)
[ 387.167905][ T0] cpu_startup_entry (kernel/sched/idle.c:399 (discriminator 1))
[ 387.169470][ T0] rest_init (init/main.c:702)
[ 387.170873][ T0] arch_call_rest_init+0xf/0x19
[ 387.172468][ T0] start_kernel (init/main.c:1147)
[ 387.173851][ T0] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:358)
[ 387.175657][ T0] </TASK>
[ 387.176668][ T0]
[ 387.177433][ T0] The buggy address belongs to stack of task swapper/0/0
[ 387.179259][ T0] and is located at offset 32 in frame:
[ 387.180879][ T0] tmigr_inactive_up (kernel/time/timer_migration.c:494)
[ 387.182424][ T0]
[ 387.183293][ T0] This frame has 1 object:
[ 387.184728][ T0] [32, 36) 'newstate'
[ 387.184736][ T0]
[ 387.186953][ T0] The buggy address belongs to the physical page:
[ 387.188951][ T0] page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4a07
[ 387.192040][ T0] flags: 0xfffffc0001000(reserved|node=0|zone=1|lastcpupid=0x1fffff)
[ 387.194464][ T0] raw: 000fffffc0001000 ffffea00001281c8 ffffea00001281c8 0000000000000000
[ 387.196942][ T0] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
[ 387.199575][ T0] page dumped because: kasan: bad access detected
[ 387.201611][ T0] page_owner info is not present (never set?)
[ 387.206375][ T0]
[ 387.207215][ T0] Memory state around the buggy address:
[ 387.208801][ T0] ffffffff84a07a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 387.211366][ T0] ffffffff84a07a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 387.213864][ T0] >ffffffff84a07b00: 00 00 f1 f1 f1 f1 04 f3 f3 f3 00 00 00 00 00 00
[ 387.216353][ T0] ^
[ 387.218190][ T0] ffffffff84a07b80: 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00 f3 f3 f3
[ 387.220848][ T0] ffffffff84a07c00: f3 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
[ 387.223223][ T0] ==================================================================
[ 387.225589][ T0] Disabling lock debugging due to kernel taint
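For readers less familiar with the "Memory state around the buggy address" dump, the following is a minimal sketch of how the flagged row can be decoded. It assumes the usual generic KASAN encoding: one shadow byte covers 8 bytes of memory, 00 means all 8 bytes are accessible, 01..07 means only that many leading bytes are accessible, and f1/f3 are the left/right stack redzones. The helper names (accessible_bytes, shadow_for, read_is_valid) are hypothetical and exist only for this illustration; the row contents and addresses are copied from the log above. The result is consistent with an 8-byte read landing on the 4-byte 'newstate' object, but it does not by itself say how the code should be fixed.

```python
# Illustrative decoder for one row of a KASAN shadow dump.
# Assumption: generic KASAN, 8 bytes of memory per shadow byte.
SHADOW_GRANULE = 8

# Values copied from the report: the row marked with '>' and the buggy address.
ROW_BASE = 0xFFFFFFFF84A07B00
ROW = bytes.fromhex("00 00 f1 f1 f1 f1 04 f3 f3 f3 00 00 00 00 00 00")
ADDR = 0xFFFFFFFF84A07B30


def accessible_bytes(shadow: int) -> int:
    """Return how many leading bytes of an 8-byte granule may be accessed."""
    if shadow == 0x00:
        return SHADOW_GRANULE          # whole granule is valid
    if 0x01 <= shadow <= 0x07:
        return shadow                  # only the first N bytes are valid
    return 0                           # poisoned, e.g. f1/f3 stack redzones


def shadow_for(addr: int, row_base: int, row: bytes) -> int:
    """Look up the shadow byte that describes addr within the given row."""
    return row[(addr - row_base) // SHADOW_GRANULE]


def read_is_valid(addr: int, size: int, row_base: int, row: bytes) -> bool:
    """Check every granule touched by the access [addr, addr + size)."""
    end = addr + size
    granule = addr - (addr % SHADOW_GRANULE)
    while granule < end:
        allowed = accessible_bytes(shadow_for(granule, row_base, row))
        # Highest byte offset (exclusive) the access needs in this granule.
        needed = min(end, granule + SHADOW_GRANULE) - granule
        if needed > allowed:
            return False
        granule += SHADOW_GRANULE
    return True


if __name__ == "__main__":
    print("shadow byte:", hex(shadow_for(ADDR, ROW_BASE, ROW)))
    print("8-byte read valid:", read_is_valid(ADDR, 8, ROW_BASE, ROW))
    print("4-byte read valid:", read_is_valid(ADDR, 4, ROW_BASE, ROW))
```

The shadow byte for the reported address is the 04 entry in the marked row, which matches the frame description "[32, 36) 'newstate'": the object is 4 bytes long, preceded by four f1 bytes (32 bytes of left redzone) and followed by f3 right-redzone bytes, while the report says "Read of size 8".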
If you fix the issue, kindly add the following tags:
| Reported-by: kernel test robot <yujie.liu@...el.com>
| Link: https://lore.kernel.org/oe-lkp/202210271500.d4d75b8e-yujie.liu@intel.com
To reproduce:
# build kernel
cd linux
cp config-6.1.0-rc1-00015-gb57766ae36b5 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-6.1.0-rc1-00015-gb57766ae36b5" of type "text/plain" (170371 bytes)
View attachment "job-script" of type "text/plain" (4759 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (29216 bytes)