linux-kernel - Re: Scheduling issue with signal handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20080710085500.GA19918@elte.hu>
Date:	Thu, 10 Jul 2008 10:55:00 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Elias Oltmanns <eo@...ensachen.de>
Cc:	linux-kernel@...r.kernel.org, Roland McGrath <roland@...hat.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Dmitry Adamushko <dmitry.adamushko@...il.com>
Subject: Re: Scheduling issue with signal handling

* Elias Oltmanns <eo@...ensachen.de> wrote:

> By sprinkling some printk()s all over the place, I've managed to 
> establish the following sequence of events taking place in the event 
> of delayed signal handling as described above: The first Ctrl+Z event 
> enqueues a SIGTSTP signal which eventually results in a call to 
> kick_process(). For some reason though, the signal isn't handled 
> straight away but remains on the queue for some time. Consequently, 
> subsequent Ctrl+Z events result in echoing another ^Z to the terminal 
> but everything related to sending a signal is skipped (and rightly so) 
> because the kernel detects that a SIGTSTP is still pending. 
> Eventually, get_signal_to_deliver() dequeues the SIGTSTP signal and 
> the shell propt appears.
> 
> My question is this: Even under high disk I/O pressure, the threads 
> dealing with I/O to the terminal evidently still get their turn as 
> indicated by the sequence of ^Z appearing on screen. Why is it then, 
> that the threads which are meant to process the SIGTSTP or SIGINT 
> signals aren't scheduled for some seconds and is there a way to change 
> this?
> 
> Please let me know if there is anything I can try to investigate this 
> any further or if you need further information.
> 
> Thanks in advance,
> 
> Elias
> 
> [1] http://lkml.org/lkml/2008/6/28/50

hm, kick_process() is a no-op on !SMP.

Basically, when a new signal is queued and a task is already running, it 
will run in due course and process the signal the moment it's scheduled 
again. (unless the signal is blocked)

If a task is not already running, then the signal code will wake up the 
task and it will then process the signal the moment it's executed.

The maximum latency of a runnable task hitting the CPU is controlled via 
/proc/sys/kernel/sched_latency [available if CONFIG_SCHED_DEBUG=y in the 
.config] - 20 milliseconds on uniprocessors.

Several seconds of lag is almost out of question and would indicate a 
serious scheduler bug, or - which is far more likely - either an 
application signal processing hickup or a kernel signal processing 
hickup.

If the lag happens with the task you can observe its worst-case 
scheduling delay by looking at /proc/<PID>/sched, if you also have 
CONFIG_SCHEDSTAT=y in your .config.

For example, a random shell's delays on a testbox:

  phoenix:~> grep se.wait_max /proc/$$/sched
  se.wait_max                        :             3.338588

That's 3.3 msecs _worst case_, on a system that has otherwise quite 
insane load:

 10:53:57 up  2:48,  2 users,  load average: 77.56, 94.33, 102.75

So several seconds of delay, if it came from the scheduler, would be 
really anomalous.

As a final measure, instead of printk's, you could try the scheduler 
tracer in linux-next (CONFIG_CONTEXT_SWITCH_TRACER=y), to have an exact 
idea about what is going on and when. (see /debug/tracing/README)

[ You might also want to try CONFIG_FTRACE=y and CONFIG_DYNAMIC_FTRACE=y
  for extremely finegrained kernel tracing - available in linux-next 
  too. ]

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/