Message-ID: <5156DC21.20901@synopsys.com>
Date:	Sat, 30 Mar 2013 18:05:45 +0530
From:	Vineet Gupta <Vineet.Gupta1@...opsys.com>
To:	lkml <linux-kernel@...r.kernel.org>, <linux-serial@...r.kernel.org>
CC:	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Jiri Slaby <jslaby@...e.cz>,
	Peter Hurley <peter@...leysoftware.com>
Subject: n_tty_write() going into schedule but NOT coming out

Hi,

I've been stress testing ARC Linux 3.8 (the same happens with 3.9-rc3). The
setup has 3 telnet sessions, each running find . -name "*" in a loop.
The platform is an FPGA @ 80 MHz running a single-core ARC700, so the kernel
.config has !SMP and PREEMPT_NONE.

After ~10 mins of running, one of the telnet sessions gets stuck (and later
a second one as well), while the system is still alive and the 3rd telnet
session keeps running find merrily.

[ARCLinux]$ ps
....
    7 root       0:00 inetd
   62 root       0:00 -/bin/sh
   64 root       1:34 telnetd -i -l /bin/sh
   65 root       0:00 /bin/sh
   75 root       1:47 telnetd -i -l /bin/sh
   76 root       0:00 /bin/sh
   79 root       0:53 telnetd -i -l /bin/sh
   80 root       0:00 /bin/sh
  281 root       0:00 find / -name *	<--- stuck
  358 root       0:03 find / -name *	<--- stuck
  377 root       0:00 find / -name *
  378 root       0:00 ps

The hung find task is sitting in the schedule() call in n_tty_write():

[ARCLinux]$ cat /proc/281/stack
[<8065945e>] n_tty_write+0x23a/0x424
[<80655cd4>] tty_write+0x1ac/0x2d4
[<805976ba>] vfs_write+0x92/0x110
[<80597816>] sys_write+0x4e/0x88
[<8050e780>] ret_from_system_call+0x0/0x4
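
For anyone who wants to post-process such a trace on a host machine, here is a minimal sketch in Python (not something used in the original report). It splits each /proc/<pid>/stack line into address, symbol, offset and function size, and derives the function start address, which can help when lining the trace up against a disassembly of vmlinux. The names FRAME_RE, parse_stack and function_start are made up for this illustration, and the regular expression assumes the "[<addr>] symbol+0xoff/0xsize" layout shown above.

```python
import re

# Stack text copied from the /proc/281/stack output above.
STACK = """\
[<8065945e>] n_tty_write+0x23a/0x424
[<80655cd4>] tty_write+0x1ac/0x2d4
[<805976ba>] vfs_write+0x92/0x110
[<80597816>] sys_write+0x4e/0x88
[<8050e780>] ret_from_system_call+0x0/0x4
"""

# Assumed line layout: "[<addr>] symbol+0xoffset/0xsize" (all hex).
FRAME_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+([\w.]+)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)"
)


def parse_stack(text):
    """Return one dict per recognised frame, innermost frame first."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue  # skip lines that do not look like a frame
        addr, symbol, offset, size = match.groups()
        frames.append({
            "addr": int(addr, 16),
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
        })
    return frames


def function_start(frame):
    """Address of the first instruction of the frame's function."""
    return frame["addr"] - frame["offset"]


for frame in parse_stack(STACK):
    print(f"{frame['symbol']:<22} starts at {function_start(frame):#010x}")
```

The start address is only as good as the symbol information the kernel printed, so treat it as a hint for where to look, not as a verified location.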

This task never resumes out of schedule(). I verified this by putting a
hardware breakpoint on the next instruction, using a JTAG host debugger.
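
A software-only cross-check is also possible without JTAG. The sketch below is an illustration, not what was done here: it assumes Python is available on the target (or that two copies of /proc/<pid>/status are captured and compared on a host), and it relies on the State, voluntary_ctxt_switches and nonvoluntary_ctxt_switches fields of that file. The helper names are hypothetical. If a task is sleeping in both samples and its context-switch counters have not moved, it was not scheduled in between. Note that this cannot tell a stuck task from a legitimately idle one, such as a shell waiting for input.

```python
import time


def parse_status(text):
    """Turn the 'Key:<tab>value' lines of /proc/<pid>/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def switch_count(fields):
    """Total number of times the task has been switched out so far."""
    return (int(fields["voluntary_ctxt_switches"])
            + int(fields["nonvoluntary_ctxt_switches"]))


def slept_throughout(before_text, after_text):
    """True if the task is asleep in both samples and never ran in between."""
    before = parse_status(before_text)
    after = parse_status(after_text)
    asleep = (before["State"].startswith(("S", "D"))
              and after["State"].startswith(("S", "D")))
    return asleep and switch_count(before) == switch_count(after)


def sample_pid(pid, interval=10.0):
    """Read /proc/<pid>/status twice, 'interval' seconds apart, and compare."""
    path = f"/proc/{pid}/status"
    with open(path) as handle:
        before_text = handle.read()
    time.sleep(interval)
    with open(path) as handle:
        after_text = handle.read()
    return slept_throughout(before_text, after_text)
```

Calling sample_pid(281) for the hung task and sample_pid(377) for the healthy one would be one way to compare them: the healthy find should return False because its counters keep moving.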

Attached are .config, /proc/281/sched, /proc/schedstat, /proc/sched_debug

My knowledge of the scheduler is close to none, so any tips to debug this
further would be much appreciated.


TIA,
-Vineet

View attachment ".config" of type "text/plain" (21892 bytes)

View attachment "proc_pid_sched.log" of type "text/x-log" (2365 bytes)

View attachment "proc_sched_debug.log" of type "text/x-log" (2577 bytes)

View attachment "proc_schedstat.log" of type "text/x-log" (96 bytes)
