lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 7 Sep 2018 08:39:27 +0200
From:   Jiri Slaby <jslaby@...e.cz>
To:     kernel test robot <rong.a.chen@...el.com>
Cc:     lkp@...org, Dmitry Safonov <dima@...sta.com>,
        Daniel Axtens <dja@...ens.net>,
        Dmitry Safonov <0x7f454c46@...il.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Tan Xiaojun <tanxiaojun@...wei.com>,
        Peter Hurley <peter@...leysoftware.com>,
        Pasi Kärkkäinen <pasik@....fi>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Michael Neuling <mikey@...ling.org>,
        Mikulas Patocka <mpatocka@...hat.com>,
        linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [LKP] [tty] 0b4f83d510: INFO:task_blocked_for_more_than#seconds

On 09/07/2018, 06:50 AM, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
> 
> commit: 0b4f83d510f6fef6bb9da25f122c8d733d50516f ("[PATCH 2/4] tty: Hold tty_ldisc_lock() during tty_reopen()")
> url: https://github.com/0day-ci/linux/commits/Dmitry-Safonov/tty-Hold-write-ldisc-sem-in-tty_reopen/20180829-165618
> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/tty.git tty-testing
> 
> in testcase: trinity
> with following parameters:
> 
> 	runtime: 300s
> 
> test-description: Trinity is a linux system call fuzz tester.
> test-url: http://codemonkey.org.uk/projects/trinity/
> 
> 
> on test machine: qemu-system-x86_64 -enable-kvm -m 256M
> 
> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> 
> 
> +--------------------------------------------------+------------+------------+
> |                                                  | 58dd163974 | 0b4f83d510 |
> +--------------------------------------------------+------------+------------+
> | boot_successes                                   | 14         | 4          |
> | boot_failures                                    | 0          | 6          |
> | INFO:task_blocked_for_more_than#seconds          | 0          | 6          |
> | Kernel_panic-not_syncing:hung_task:blocked_tasks | 0          | 6          |
> +--------------------------------------------------+------------+------------+
> 
> 
> 
> [  244.816801] INFO: task validate_data:655 blocked for more than 120 seconds.
> [  244.818833]       Not tainted 4.18.0-11684-g0b4f83d #1
> [  244.820028] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  244.826965] validate_data   D    0   655    623 0x20020000
> [  244.828279] Call Trace:
> [  244.828958]  ? __schedule+0x843/0x950
> [  244.830173]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.834903]  schedule+0x31/0x70
> [  244.835665]  schedule_timeout+0x34/0x760
> [  244.836613]  ? ftrace_likely_update+0x35/0x60
> [  244.837683]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.838818]  ? ftrace_likely_update+0x35/0x60
> [  244.840127]  ? ftrace_likely_update+0x35/0x60
> [  244.845947]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.847882]  __ldsem_down_read_nested+0x23a/0x3b0
> [  244.849886]  ? tty_ldisc_ref_wait+0x25/0x50
> [  244.853807]  tty_ldisc_ref_wait+0x25/0x50
> [  244.854946]  tty_compat_ioctl+0x8a/0x120
> [  244.855928]  ? this_tty+0x80/0x80
> [  244.856742]  __ia32_compat_sys_ioctl+0xc28/0x1ce0
> [  244.857981]  do_int80_syscall_32+0x1d2/0x5f0
> [  244.859003]  entry_INT80_compat+0x88/0xa0
> [  244.859972] INFO: task dnsmasq:668 blocked for more than 120 seconds.
> [  244.868315]       Not tainted 4.18.0-11684-g0b4f83d #1
> [  244.869583] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  244.871744] dnsmasq         D    0   668      1 0x20020000
> [  244.873063] Call Trace:
> [  244.873697]  ? __schedule+0x843/0x950
> [  244.874572]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.875725]  schedule+0x31/0x70
> [  244.876576]  schedule_timeout+0x34/0x760
> [  244.877573]  ? ftrace_likely_update+0x35/0x60
> [  244.878660]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.879872]  ? ftrace_likely_update+0x35/0x60
> [  244.890522]  ? ftrace_likely_update+0x35/0x60
> [  244.891572]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.892746]  __ldsem_down_read_nested+0x23a/0x3b0
> [  244.893861]  ? tty_ldisc_ref_wait+0x25/0x50
> [  244.894841]  tty_ldisc_ref_wait+0x25/0x50
> [  244.895911]  tty_compat_ioctl+0x8a/0x120
> [  244.896916]  ? this_tty+0x80/0x80
> [  244.897717]  __ia32_compat_sys_ioctl+0xc28/0x1ce0
> [  244.898821]  do_int80_syscall_32+0x1d2/0x5f0
> [  244.899830]  entry_INT80_compat+0x88/0xa0
> [  244.909466] INFO: task dropbear:734 blocked for more than 120 seconds.
> [  244.911173]       Not tainted 4.18.0-11684-g0b4f83d #1
> [  244.912394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  244.914176] dropbear        D    0   734      1 0x20020000
> [  244.915446] Call Trace:
> [  244.916068]  ? __schedule+0x843/0x950
> [  244.916945]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.918076]  schedule+0x31/0x70
> [  244.918832]  schedule_timeout+0x34/0x760
> [  244.919781]  ? ftrace_likely_update+0x35/0x60
> [  244.921104]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.922304]  ? ftrace_likely_update+0x35/0x60
> [  244.923347]  ? ftrace_likely_update+0x35/0x60
> [  244.924369]  ? __ldsem_down_read_nested+0x1c4/0x3b0
> [  244.925496]  __ldsem_down_read_nested+0x23a/0x3b0
> [  244.926598]  ? tty_ldisc_ref_wait+0x25/0x50
> [  244.927578]  tty_ldisc_ref_wait+0x25/0x50
> [  244.928526]  tty_compat_ioctl+0x8a/0x120
> [  244.929449]  ? this_tty+0x80/0x80
> [  244.930240]  __ia32_compat_sys_ioctl+0xc28/0x1ce0
> [  244.940083]  do_int80_syscall_32+0x1d2/0x5f0
> [  244.941310]  entry_INT80_compat+0x88/0xa0
> [  244.944070] 
> [  244.944070] Showing all locks held in the system:
> [  244.945558] 1 lock held by khungtaskd/18:
> [  244.946495]  #0: (____ptrval____) (rcu_read_lock){....}, at: debug_show_all_locks+0x16/0x190
> [  244.948503] 2 locks held by askfirst/235:
> [  244.949439]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x25/0x50
> [  244.951762]  #1: (____ptrval____) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x13d/0xa00

Here, it just seems to wait for input from the user.

> [  244.953799] 1 lock held by validate_data/655:
> [  244.954814]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x25/0x50
> [  244.956764] 1 lock held by dnsmasq/668:
> [  244.957649]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x25/0x50
> [  244.959598] 1 lock held by dropbear/734:
> [  244.967564]  #0: (____ptrval____) (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x25/0x50

Hmm, I assume there is another task waiting for write_ldsem and that one
prevents these readers to proceed. Unfortunately, due to the defunct
__ptrval__ pointer hashing crap, we do not see who is waiting for what.
But I am guessing all are the same locks.

So I think, we are forced to limit the waiting to 5 seconds in reopen in
the end too (the same as we do for new open in tty_init_dev).

Dmitry, could you add the limit and handle the return value of
tty_ldisc_lock now?

thanks,
-- 
js
suse labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ