Message-ID: <5979B8A5.205@huawei.com>
Date: Thu, 27 Jul 2017 17:55:49 +0800
From: Gu Zheng <guzheng1@...wei.com>
To: <eparis@...hat.com>
CC: Zhaohongjiang <zhaohongjiang@...wei.com>,
"miaoxie@...wei.com" <miaoxie@...wei.com>,
Qiuxishi <qiuxishi@...wei.com>, <linux-kernel@...r.kernel.org>
Subject: trinity test fanotify cause hungtasks on kernel 4.13
Hi Eric Paris,
When we use trinity to test the fanotify interfaces, it causes many hung tasks.
The kernel is built with CONFIG_FANOTIFY_ACCESS_PERMISSIONS=y.
The reproducer script is simple:
#!/bin/bash

while true
do
    ./trinity -c fanotify_init -l off -C 2 -X > /dev/null 2>&1 &
    sleep 1
    ./trinity -c fanotify_mark -l off -C 2 -X > /dev/null 2>&1 &
    sleep 10
done
The trinity processes enter the D state very quickly.
We checked the kernel stacks of the affected PIDs:
[root@...alhost ~]# ps -aux | grep D
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 977 0.0 0.0 207992 7904 ? Ss 15:23 0:00 /usr/bin/abrt-watch-log -F BUG: WARNING: at WARNING: CPU: INFO: possible recursive locking detected ernel BUG at list_del corruption list_add corruption do_IRQ: stack overflow: ear stack overflow (cur: eneral protection fault nable to handle kernel ouble fault: RTNL: assertion failed eek! page_mapcount(page) went negative! adness at NETDEV WATCHDOG ysctl table check failed : nobody cared IRQ handler type mismatch Machine Check Exception: Machine check events logged divide error: bounds: coprocessor segment overrun: invalid TSS: segment not present: invalid opcode: alignment check: stack segment: fpu exception: simd exception: iret exception: /var/log/messages -- /usr/bin/abrt-dump-oops -xtD
root 997 0.0 0.0 203360 3188 ? Ssl 15:23 0:00 /usr/sbin/gssproxy -D
root 1549 0.0 0.0 82552 6012 ? Ss 15:23 0:00 /usr/sbin/sshd -D
root 2807 3.5 0.2 59740 35416 pts/0 DL 15:24 0:00 ./trinity -c fanotify_init -l off -C 2 -X
root 2809 3.1 0.2 53712 35332 pts/0 DL 15:24 0:00 ./trinity -c fanotify_mark -l off -C 2 -X
root 2915 0.0 0.0 136948 1776 pts/0 D 15:24 0:00 ps ax
root 2919 0.0 0.0 112656 2100 pts/1 S+ 15:24 0:00 grep --color=auto D
[root@...alhost ~]# cat /proc/2807/stack
[<ffffffff95287551>] fanotify_handle_event+0x2a1/0x2f0
[<ffffffff95283c13>] fsnotify+0x2d3/0x4f0
[<ffffffff952f3a89>] security_file_open+0x89/0x90
[<ffffffff95239819>] do_dentry_open+0x139/0x330
[<ffffffff9523ad9f>] vfs_open+0x4f/0x70
[<ffffffff9524c428>] path_openat+0x548/0x1350
[<ffffffff9524ea51>] do_filp_open+0x91/0x100
[<ffffffff9523b174>] do_sys_open+0x124/0x210
[<ffffffff9523b27e>] SyS_open+0x1e/0x20
[<ffffffff95003857>] do_syscall_64+0x67/0x150
[<ffffffff95741de7>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
[root@...alhost ~]# cat /proc/2915/stack
[<ffffffff95287551>] fanotify_handle_event+0x2a1/0x2f0
[<ffffffff95283c13>] fsnotify+0x2d3/0x4f0
[<ffffffff952f3a89>] security_file_open+0x89/0x90
[<ffffffff95239819>] do_dentry_open+0x139/0x330
[<ffffffff9523ad9f>] vfs_open+0x4f/0x70
[<ffffffff9524c428>] path_openat+0x548/0x1350
[<ffffffff9524ea51>] do_filp_open+0x91/0x100
[<ffffffff9523b174>] do_sys_open+0x124/0x210
[<ffffffff9523b27e>] SyS_open+0x1e/0x20
[<ffffffff95003857>] do_syscall_64+0x67/0x150
[<ffffffff95741de7>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
[root@...alhost ~]# cat /proc/2809/stack
[<ffffffff95287551>] fanotify_handle_event+0x2a1/0x2f0
[<ffffffff95283c13>] fsnotify+0x2d3/0x4f0
[<ffffffff952f3a89>] security_file_open+0x89/0x90
[<ffffffff95239819>] do_dentry_open+0x139/0x330
[<ffffffff9523ad9f>] vfs_open+0x4f/0x70
[<ffffffff9524c428>] path_openat+0x548/0x1350
[<ffffffff9524ea51>] do_filp_open+0x91/0x100
[<ffffffff9523b174>] do_sys_open+0x124/0x210
[<ffffffff9523b27e>] SyS_open+0x1e/0x20
[<ffffffff95003857>] do_syscall_64+0x67/0x150
[<ffffffff95741de7>] entry_SYSCALL64_slow_path+0x25/0x25
[<ffffffffffffffff>] 0xffffffffffffffff
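Note that `ps -aux | grep D` also matches any line that merely contains a capital D, such as the `sshd -D` entry above. A minimal sketch of a narrower check follows, assuming a procps-style `ps` and root access to `/proc/<pid>/stack`; the function names `d_state_pids` and `dump_d_state_stacks` are hypothetical helpers, not part of any existing tool.

```shell
#!/bin/bash

# Print the PIDs of tasks whose STAT field starts with "D"
# (uninterruptible sleep). Expects "pid stat" pairs on stdin.
d_state_pids() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Dump the kernel stack of every D-state task on the system.
# Reading /proc/<pid>/stack normally requires root; tasks that
# exit in the meantime are skipped silently.
dump_d_state_stacks() {
    local pid
    for pid in $(ps -e -o pid= -o stat= | d_state_pids); do
        echo "== pid $pid =="
        cat "/proc/$pid/stack" 2>/dev/null
    done
}
```

After sourcing the snippet, running `dump_d_state_stacks` as root should print one stack per blocked task. Filtering on the STAT column rather than the whole line keeps unrelated processes out of the result.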
All of these PIDs are waiting for a response in fanotify_handle_event -> fanotify_get_response,
but the monitor cannot reply to anything, either because of permissions or because the monitor has been killed.
Any other task that then uses fanotify or calls synchronize_srcu gets stuck as well.
If we disable CONFIG_FANOTIFY_ACCESS_PERMISSIONS,
memory is consumed quickly, because the fsnotify_mark_srcu read lock is always held.
If we add a timeout, safety cannot be guaranteed.
Do you have any ideas?
Thanks.