Date:   Thu, 1 Sep 2016 20:04:24 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Dmitry Vyukov <dvyukov@...gle.com>
Cc:     eparis@...hat.com, LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>,
        syzkaller <syzkaller@...glegroups.com>,
        Kostya Serebryany <kcc@...gle.com>,
        Alexander Potapenko <glider@...gle.com>
Subject: Re: fanotify: unkillable hanged processes

On Thu, Sep 01, 2016 at 07:53:44PM +0200, Dmitry Vyukov wrote:
> Hello,
> 
> The following program:
> 
> https://gist.githubusercontent.com/dvyukov/0952eeac71069b46b3fe0e28bd1a02bf/raw/396b9dcce2636cecab1a4161c15d3f066e6ef639/gistfile1.txt
> 
> if run in a parallel loop, creates unkillable hung processes:
> 
> -bash-4.3# ps afxu | grep a.out
> root      4489  0.0  0.0   8868   340 ttyp1    S+   17:19   0:00
>    \_ grep a.out
> root      4423  0.0  0.0      0     0 ttyp0    D    17:16   0:00 [a.out]
> root      4424  0.0  0.0      0     0 ttyp0    D    17:16   0:00 [a.out]
> root      4425  0.0  0.0      0     0 ttyp0    D    17:16   0:00 [a.out]
> root      4470  0.0  0.1   7016  2316 ttyp0    D    17:16   0:00
> ./stress ./a.out
> 
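
For context, the reproducer boils down to the standard fanotify
permission-event sequence sketched below (a schematic reconstruction,
not the gist's exact code; see the URL above for that). A group created
with FAN_CLASS_CONTENT and marked with FAN_OPEN_PERM makes every open()
on the marked mount block in the kernel until the listener writes back
a struct fanotify_response:

	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/fanotify.h>

	int main(void)
	{
		/* Permission-event group: open() on marked objects
		 * blocks in the kernel until we answer. */
		int fan = fanotify_init(FAN_CLASS_CONTENT, O_RDONLY);

		fanotify_mark(fan, FAN_MARK_ADD | FAN_MARK_MOUNT,
			      FAN_OPEN_PERM, AT_FDCWD, "/");

		struct fanotify_event_metadata meta;
		while (read(fan, &meta, sizeof(meta)) > 0) {
			struct fanotify_response resp = {
				.fd = meta.fd,
				.response = FAN_ALLOW,	/* or FAN_DENY */
			};
			write(fan, &resp, sizeof(resp));
			close(meta.fd);
		}
		return 0;
	}

If the listener exits without replying, its exit path (fanotify_release()
in the stacks below) is supposed to release any blocked waiters, and
that teardown is exactly where the quoted tasks are stuck.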
> This looks like a classic deadlock, but LOCKDEP is silent, so
> +LOCKDEP maintainers.

fanotify_get_response() uses wait_event(), which is an asymmetric API
and therefore cannot be included in block-on chains. That is, while a
task blocks there, it doesn't know whom it blocks on, so lockdep has
nothing to build a cycle from.
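
To make that concrete, here is a condensed, hypothetical sketch of the
pattern (simplified names, not the actual fanotify source). The waiter
parks on a wait queue holding no lock and with no record of which task
is expected to issue the wake_up(), so lockdep never sees an edge it
could close into a cycle:

	#include <linux/wait.h>

	static DECLARE_WAIT_QUEUE_HEAD(access_waitq);
	static int response;

	/* Waiter side, roughly what fanotify_get_response() does:
	 * block until a listener fills in a response. */
	static void wait_for_response(void)
	{
		wait_event(access_waitq, response != 0);
	}

	/* Waker side, roughly a listener replying FAN_ALLOW/FAN_DENY. */
	static void post_response(int r)
	{
		response = r;
		wake_up(&access_waitq);
	}

Compare this with a mutex: acquiring one tells lockdep both what is
being waited for and who currently owns it; wait_event() records
neither.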

> Most of the processes are hung at (from /proc/pid/stack):
> 
> [<ffffffff814dd6c8>] __synchronize_srcu+0x248/0x380 kernel/rcu/srcu.c:448
> [<ffffffff814dd81e>] synchronize_srcu+0x1e/0x40 kernel/rcu/srcu.c:492
> [<ffffffff819448d7>] fsnotify_mark_destroy_list+0x107/0x370 fs/notify/mark.c:551
> [<ffffffff8194163e>] fsnotify_destroy_group+0x1e/0xc0 fs/notify/group.c:57
> [<ffffffff8194a84b>] fanotify_release+0x20b/0x2d0
> fs/notify/fanotify/fanotify_user.c:392
> [<ffffffff81868bbc>] __fput+0x28c/0x780 fs/file_table.c:208
> [<ffffffff81869135>] ____fput+0x15/0x20 fs/file_table.c:244
> [<ffffffff813ebf63>] task_work_run+0xf3/0x170 kernel/task_work.c:116
> [<     inline     >] exit_task_work ./include/linux/task_work.h:21
> [<ffffffff81394218>] do_exit+0x868/0x2e70 kernel/exit.c:828
> [<ffffffff81396998>] do_group_exit+0x108/0x330 kernel/exit.c:958
> [<     inline     >] SYSC_exit_group kernel/exit.c:969
> [<ffffffff81396bdd>] SyS_exit_group+0x1d/0x20 kernel/exit.c:967
> [<ffffffff86e10700>] entry_SYSCALL_64_fastpath+0x23/0xc1
> arch/x86/entry/entry_64.S:208
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> One process holds the SRCU read lock (&fsnotify_mark_srcu):
> 
> stress          D ffffffff86dfe09a 28024  4470      1 0x00000004
>  ffff880064d9c440 0000000000000000 ffff8800686ab0c0 ffff88006cecc080
>  ffff88006d322bd8 ffff8800653f7630 ffffffff86dfe09a ffff8800653f76e0
>  0000000000000282 ffff88006d323568 ffff88006d323540 ffff880064d9c448
> Call Trace:
>  [<ffffffff86dff497>] schedule+0x97/0x1c0 kernel/sched/core.c:3414
>  [<     inline     >] fanotify_get_response fs/notify/fanotify/fanotify.c:70
>  [<ffffffff8194a347>] fanotify_handle_event+0x537/0x830
> fs/notify/fanotify/fanotify.c:233
>  [<     inline     >] send_to_group fs/notify/fsnotify.c:179
>  [<ffffffff8193fc8d>] fsnotify+0x73d/0x1020 fs/notify/fsnotify.c:275
>  [<     inline     >] fsnotify_perm ./include/linux/fsnotify.h:55
>  [<ffffffff82acefb1>] security_file_open+0x151/0x190 security/security.c:887
>  [<ffffffff8185a8db>] do_dentry_open+0x2ab/0xd30 fs/open.c:736
>  [<ffffffff8185e445>] vfs_open+0x105/0x220 fs/open.c:860
>  [<     inline     >] do_last fs/namei.c:3374
>  [<ffffffff81895909>] path_openat+0x12f9/0x2ab0 fs/namei.c:3497
>  [<ffffffff8189a7ac>] do_filp_open+0x18c/0x250 fs/namei.c:3532
>  [<ffffffff81873528>] do_open_execat+0xe8/0x4d0 fs/exec.c:818
>  [<ffffffff8187a18f>] do_execveat_common.isra.35+0x71f/0x1d80 fs/exec.c:1679
>  [<     inline     >] do_execve fs/exec.c:1783
>  [<     inline     >] SYSC_execve fs/exec.c:1864
>  [<ffffffff8187c262>] SyS_execve+0x42/0x50 fs/exec.c:1859
>  [<ffffffff810088ff>] do_syscall_64+0x1df/0x640 arch/x86/entry/common.c:288
>  [<ffffffff86e107c3>] entry_SYSCALL64_slow_path+0x25/0x25
> arch/x86/entry/entry_64.S:249
> 
> 
> [  467.548485] Showing all locks held in the system:
> [  467.548981] 2 locks held by bash/4044:
> [  467.549313]  #0:  (&tty->ldisc_sem){.+.+.+}, at:
> [<ffffffff86e0f7f7>] ldsem_down_read+0x37/0x40
> [  467.550076]  #1:  (&ldata->atomic_read_lock){+.+...}, at:
> [<ffffffff8322ec95>] n_tty_read+0x1e5/0x1860
> [  467.550897] 3 locks held by bash/4062:
> [  467.551210]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff81863c04>]
> vfs_write+0x3a4/0x4e0
> [  467.551923]  #1:  (rcu_read_lock){......}, at: [<ffffffff83245c30>]
> __handle_sysrq+0x0/0x4d0
> [  467.552655]  #2:  (tasklist_lock){.+.+..}, at: [<ffffffff81496fc4>]
> debug_show_all_locks+0x74/0x290
> [  467.553454] 2 locks held by stress/4470:
> [  467.553754]  #0:  (&sig->cred_guard_mutex){+.+.+.}, at:
> [<ffffffff818799b3>] prepare_bprm_creds+0x53/0x110
> [  467.554590]  #1:  (&fsnotify_mark_srcu){......}, at:
> [<ffffffff8193f71d>] fsnotify+0x1cd/0x1020

Per the lock dump it is stress/4470 that is holding the fanotify srcu
reference, and its stack above shows what it is blocked on:
fanotify_get_response().

So it looks like a scenario where everyone is waiting for SRCU to
complete, while the task holding up SRCU completion is itself waiting
for something else: a userspace reply to a permission event that can
never come, because the would-be repliers are the very tasks stuck in
synchronize_srcu() on exit.
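
In SRCU terms the shape of the cycle is roughly this (a schematic,
hypothetical sketch, not the fsnotify source; names invented):

	#include <linux/srcu.h>

	DEFINE_STATIC_SRCU(mark_srcu);

	/* Event delivery, analogous to fsnotify(): the handler runs
	 * inside an SRCU read-side section. */
	static void deliver_event(void)
	{
		int idx = srcu_read_lock(&mark_srcu);
		/* fanotify_get_response() can block here indefinitely,
		 * pinning the grace period. */
		srcu_read_unlock(&mark_srcu, idx);
	}

	/* Teardown, analogous to fsnotify_mark_destroy_list(): waits
	 * for every reader above to finish, including the blocked one. */
	static void destroy_marks(void)
	{
		synchronize_srcu(&mark_srcu);
	}

synchronize_srcu() cannot return until every read-side section has
exited, so a single reader that blocks indefinitely stalls teardown
for good.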

But I'm not at all familiar with fanotify.
