[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20140730035231.GC16537@localhost>
Date: Wed, 30 Jul 2014 11:52:31 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Christoph Lameter <cl@...ux-foundation.org>
Cc: Jet Chen <jet.chen@...el.com>, Su Tao <tao.su@...el.com>,
Yuanhan Liu <yuanhan.liu@...el.com>, LKP <lkp@...org>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: [vmstat] BUG: using smp_processor_id() in preemptible [00000000]
code: kworker/0:1/36
Hi Christoph,
This is another BUG message for the same commit.
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453
Author: Christoph Lameter <cl@...two.org>
AuthorDate: Wed Jul 23 09:11:43 2014 +1000
Commit: Stephen Rothwell <sfr@...b.auug.org.au>
CommitDate: Wed Jul 23 09:11:43 2014 +1000
vmstat: On demand vmstat workers V8
vmstat workers are used for folding counter differentials into the zone,
per node and global counters at certain time intervals. They currently
run at defined intervals on all processors which will cause some holdoff
for processors that need minimal intrusion by the OS.
The current vmstat_update mechanism depends on a deferrable timer firing
every other second by default which registers a work queue item that runs
on the local CPU, with the result that we have 1 interrupt and one
additional schedulable task on each CPU every 2 seconds If a workload
indeed causes VM activity or multiple tasks are running on a CPU, then
there are probably bigger issues to deal with.
However, some workloads dedicate a CPU for a single CPU bound task. This
is done in high performance computing, in high frequency financial
applications, in networking (Intel DPDK, EZchip NPS) and with the advent
of systems with more and more CPUs over time, this may become more and
more common to do since when one has enough CPUs one cares less about
efficiently sharing a CPU with other tasks and more about efficiently
monopolizing a CPU per task.
The difference of having this timer firing and workqueue kernel thread
scheduled per second can be enormous. An artificial test measuring the
worst case time to do a simple "i++" in an endless loop on a bare metal
system and under Linux on an isolated CPU with dynticks and with and
without this patch, have Linux match the bare metal performance (~700
cycles) with this patch and loose by couple of orders of magnitude (~200k
cycles) without it[*]. The loss occurs for something that just calculates
statistics. For networking applications, for example, this could be the
difference between dropping packets or sustaining line rate.
Statistics are important and useful, but it would be great if there would
be a way to not cause statistics gathering produce a huge performance
difference. This patche does just that.
This patch creates a vmstat shepherd worker that monitors the per cpu
differentials on all processors. If there are differentials on a
processor then a vmstat worker local to the processors with the
differentials is created. That worker will then start folding the diffs
in regular intervals. Should the worker find that there is no work to be
done then it will make the shepherd worker monitor the differentials
again.
With this patch it is possible then to have periods longer than
2 seconds without any OS event on a "cpu" (hardware thread).
The patch shows a very minor increased in system performance.
hackbench -s 512 -l 2000 -g 15 -f 25 -P
Results before the patch:
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.992
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.971
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 5.063
Hackbench after the patch:
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.973
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.990
Running in process mode with 15 groups using 50 file descriptors each (== 750 tasks)
Each sender will pass 2000 messages of 512 bytes
Time: 4.993
Signed-off-by: Christoph Lameter <cl@...ux.com>
Reviewed-by: Gilad Ben-Yossef <gilad@...yossef.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Tejun Heo <tj@...nel.org>
Cc: John Stultz <johnstul@...ibm.com>
Cc: Mike Frysinger <vapier@...too.org>
Cc: Minchan Kim <minchan.kim@...il.com>
Cc: Hakan Akkan <hakanakkan@...il.com>
Cc: Max Krasnyansky <maxk@....qualcomm.com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Hugh Dickins <hughd@...gle.com>
Cc: Viresh Kumar <viresh.kumar@...aro.org>
Cc: H. Peter Anvin <hpa@...or.com>
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
+---------------------------------------------------------+------------+------------+---------------+
| | 4020841d46 | 6e0a6b18b6 | next-20140725 |
+---------------------------------------------------------+------------+------------+---------------+
| boot_successes | 1012 | 246 | 9 |
| boot_failures | 188 | 54 | 2 |
| BUG:kernel_boot_hang | 188 | 42 | |
| BUG:using_smp_processor_id()in_preemptible_code:kworker | 0 | 12 | 2 |
| backtrace:vmstat_update | 0 | 12 | 2 |
+---------------------------------------------------------+------------+------------+---------------+
[ 16.503488] hwclock (152) used greatest stack depth: 6364 bytes left
[ 16.507349] plymouthd (151) used greatest stack depth: 6292 bytes left
[ 16.697360] chmod (170) used greatest stack depth: 6036 bytes left
[ 17.346850] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/0:1/36
[ 17.347917] caller is debug_smp_processor_id+0x12/0x20
[ 17.348597] CPU: 1 PID: 36 Comm: kworker/0:1 Not tainted 3.16.0-rc6-00251-g6e0a6b1 #7
[ 17.349662] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 17.350603] Workqueue: events vmstat_update
[ 17.351211] 00000001 914e9e4c 828277f8 00000000 00000001 8305dad3 914e9e7c 8176b314
[ 17.352422] 8306afe0 83069fba 8305dad3 00000000 914e1924 00000024 83069fba 83a99220
[ 17.353626] 00000129 914c47e0 914e9e84 8176b342 914e9e98 811aa410 00000000 9237d200
[ 17.354767] Call Trace:
[ 17.355104] [<828277f8>] dump_stack+0x7f/0xf3
[ 17.355734] [<8176b314>] check_preemption_disabled+0x164/0x180
[ 17.356555] [<8176b342>] debug_smp_processor_id+0x12/0x20
[ 17.357404] [<811aa410>] vmstat_update+0x40/0x80
[ 17.358065] [<8109bad9>] process_one_work+0x3a9/0xa40
[ 17.358766] [<8109bf8c>] ? process_one_work+0x85c/0xa40
[ 17.359454] [<8109c19d>] ? worker_thread+0x2d/0xb40
[ 17.360141] [<8109c72c>] worker_thread+0x5bc/0xb40
[ 17.360782] [<8109c170>] ? process_one_work+0xa40/0xa40
[ 17.361473] [<810a8932>] kthread+0xe2/0xf0
[ 17.362036] [<8109c170>] ? process_one_work+0xa40/0xa40
[ 17.362784] [<810b0000>] ? SyS_setns+0x90/0x160
[ 17.363492] [<82853c01>] ret_from_kernel_thread+0x21/0x30
[ 17.364254] [<810a8850>] ? insert_kthread_work+0x100/0x100
Elapsed time: 25
qemu-system-x86_64 -enable-kvm -cpu Haswell,+smep,+smap -kernel /kernel/i386-randconfig-ib0-07271715/6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453/vmlinuz-3.16.0-rc6-00251-g6e0a6b1 -append 'hung_task_panic=1 earlyprintk=ttyS0,115200 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=10 softlockup_panic=1 nmi_watchdog=panic prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal root=/dev/ram0 rw link=/kbuild-tests/run-queue/kvm/i386-randconfig-ib0-07271715/next:master:6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453:bisect-linux5/.vmlinuz-6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453-20140728053554-192-kbuild branch=next/master BOOT_IMAGE=/kernel/i386-randconfig-ib0-07271715/6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453/vmlinuz-3.16.0-rc6-00251-g6e0a6b1 drbd.minor_count=8' -initrd /kernel-tests/initrd/quantal-core-i386.cgz -m 320 -smp 2 -net nic,vlan=1,model=e1000 -net user,vlan=1 -boot order=nc -no-reboot -watchdog i6300esb -rtc base=localtime -pidfile /dev/shm/kboot/pid-quantal-kbuild-1 -serial file:/dev/shm/kboot/serial-quantal-kbuild-1 -daemonize -display none -monitor null
git bisect start 5a7439efd1c5c416f768fc550048ca130cf4bf99 9a3c4145af32125c5ee39c0272662b47307a8323 --
git bisect good 38efad9af81d145f07f592f618c76c78cf141e5b # 19:10 300+ 0 Merge remote-tracking branch 'libata/for-next'
git bisect good 7ed8accbe1d061e1dfe4ce7a8681495595ebe1da # 19:19 300+ 6 next-20140724/tip
git bisect good 7d3ce0493347c0176b37d877be1bc2204c2314b4 # 19:26 300+ 20 Merge remote-tracking branch 'staging/staging-next'
git bisect good 550c5daec4f343ffaf1a1e069e1f47275e12b369 # 19:28 300+ 70 Merge remote-tracking branch 'ktest/for-next'
git bisect good dd7314beaded523afff8444fa8d471446fb27172 # 19:32 300+ 96 Merge branch 'rd-docs/master'
git bisect good 4d1954347c000af3ee37661dc3acfe0ae8f59348 # 19:53 300+ 46 PKCS#7: include linux-err.h for PTR_ERR and IS_ERR
git bisect bad 590deb1467ccd5b89a40441542eed94a20fde9cd # 19:54 30- 1 Merge branch 'akpm-current/current'
git bisect bad a85e2d130331aa9885cbba74ae1a604dce709482 # 20:00 8- 2 include/linux/kernel.h:744:28: note: in expansion of macro 'min'
git bisect good 4ac25431a42651458ee8fe31358d714aa18ee9aa # 20:45 300+ 25 mm: memcontrol: rearrange charging fast path
git bisect good 84334f9696fba65dac01b6896e728ed64f25b0bb # 20:50 300+ 49 mm,hugetlb: simplify error handling in hugetlb_cow()
git bisect bad de32ada9f1bb4fd7673ed245ba2b1a9103ec50ae # 20:53 53- 63 slub: remove kmemcg id from create_unique_id
git bisect good e28c951ff01a805eacae2f67a96e0f29e32cebd1 # 21:03 300+ 45 mm: pagemap: avoid unnecessary overhead when tracepoints are deactivated
git bisect good 5860f33b9ac1c224a399736358d83693fe78ce82 # 21:15 300+ 82 mm: describe mmap_sem rules for __lock_page_or_retry() and callers
git bisect bad e7943023cfcac3c9a7fe5a23713aa5723386d83b # 21:34 13- 4 cpu_stat_off can be static
git bisect bad 6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453 # 21:36 5- 6 vmstat: On demand vmstat workers V8
git bisect good 4020841d464d689c045ad77f091f6f7fa211663d # 22:20 300+ 48 mm/shmem.c: remove the unused gfp arg to shmem_add_to_page_cache()
# first bad commit: [6e0a6b18b63e2c0a45ff47ab633dd6f3ad417453] vmstat: On demand vmstat workers V8
git bisect good 4020841d464d689c045ad77f091f6f7fa211663d # 22:34 900+ 188 mm/shmem.c: remove the unused gfp arg to shmem_add_to_page_cache()
git bisect bad 5a7439efd1c5c416f768fc550048ca130cf4bf99 # 22:34 0- 2 Add linux-next specific files for 20140725
git bisect good 2062afb4f804afef61cbe62a30cac9a46e58e067 # 22:47 900+ 167 Fix gcc-4.9.0 miscompilation of load_balance() in scheduler
git bisect bad 5a7439efd1c5c416f768fc550048ca130cf4bf99 # 22:47 0- 2 Add linux-next specific files for 20140725
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash
kernel=$1
initrd=quantal-core-i386.cgz
wget --no-clobber https://github.com/fengguang/reproduce-kernel-bug/blob/master/initrd/$initrd
kvm=(
qemu-system-x86_64
-enable-kvm
-cpu Haswell,+smep,+smap
-kernel $kernel
-initrd $initrd
-m 320
-smp 2
-net nic,vlan=1,model=e1000
-net user,vlan=1
-boot order=nc
-no-reboot
-watchdog i6300esb
-rtc base=localtime
-serial stdio
-display none
-monitor null
)
append=(
hung_task_panic=1
earlyprintk=ttyS0,115200
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
panic=10
softlockup_panic=1
nmi_watchdog=panic
prompt_ramdisk=0
console=ttyS0,115200
console=tty0
vga=normal
root=/dev/ram0
rw
drbd.minor_count=8
)
"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
Thanks,
Fengguang
View attachment "dmesg-quantal-kbuild-1:20140727214816:i386-randconfig-ib0-07271715:3.16.0-rc6-00251-g6e0a6b1:7" of type "text/plain" (59227 bytes)
Download attachment "i386-randconfig-ib0-07271715-5a7439efd1c5c416f768fc550048ca130cf4bf99-BUG:-using----in-preemptible----code:-110167.log" of type "application/octet-stream" (53631 bytes)
View attachment "config-3.16.0-rc6-00251-g6e0a6b1" of type "text/plain" (95748 bytes)
_______________________________________________
LKP mailing list
LKP@...ux.intel.com
Powered by blists - more mailing lists