Date:	Tue, 31 Mar 2009 22:51:08 +1100
From:	Neil Brown <neilb@...e.de>
To:	Theodore Tso <tytso@....edu>
Cc:	Ingo Molnar <mingo@...e.hu>, Jan Kara <jack@...e.cz>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Arjan van de Ven <arjan@...radead.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Nick Piggin <npiggin@...e.de>,
	Jens Axboe <jens.axboe@...cle.com>,
	David Rees <drees76@...il.com>, Jesper Krogh <jesper@...gh.cc>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Roland McGrath <roland@...hat.com>
Subject: Re: ext3 IO latency measurements (was: Linux 2.6.29)

On Thursday March 26, tytso@....edu wrote:
> Ingo,
.....
> 
> > Oh, and while at it - also a job control complaint. I tried to 
> > Ctrl-C the above script:
> > 
> > I had to hit Ctrl-C numerous times before Bash would honor it. This 
> > too is a very common thing on large SMP systems.
> 
> Well, the script you sent runs the compile in the background.  It did:
> 
> >   while :; do
> >     date
> >     make mrproper      2>/dev/null >/dev/null
> >     make defconfig     2>/dev/null >/dev/null
> >     make -j32 bzImage  2>/dev/null >/dev/null
> >   done &
>          ^^
> 
> So there would have been nothing to ^C; I assume you were running this
> with a variant that didn't have the ampersand, which would have run
> the whole shell pipeline in a detached background process?
> 
> In any case, the workaround for this is to ^Z the script, and then
> "kill %" it.
> 
> I'm pretty sure this is actually a bash problem.  When you send a
> Ctrl-C, it sends a SIGINT to all of the members of the tty's
> foreground process group.  Under some circumstances, bash sets the
> signal handler for SIGINT to be SIG_IGN.  I haven't looked at this
> super closely (it would require diving into the bash sources), but you
> can see it if you attach an strace to the bash shell driving a script
> such as
> 
> #!/bin/bash
> 
> while /bin/true; do
>       date
>       sleep 60
> done &
> 
> If you do a "ps axo pid,ppid,pgrp,args", you'll see that the bash and
> the sleep 60 have the same process group.  If you emulate hitting ^C
> by sending a SIGINT to pid of the shell, you'll see that it ignores
> it.  Sleep also seems to be ignoring the SIGINT when run in the
> background; but it does honor SIGINT in the foreground --- I didn't
> have time to dig into that.
> 
> In any case, bash appears to SIG_IGN the INT signal if there is a child
> process running, and only takes the ^C if bash itself is actually
> "running" the shell script.  For example, if you run the command
> "date;sleep 10;date;sleep 10;date", the ^C only interrupts the sleep
> command.  It doesn't stop the series of commands which bash is
> running.

This is something that is really hard to get right.

If the shell is running a program when SIGINT arrives, it needs to
wait until the program exits, and then try to decide if the program
died because of the signal, or actually caught the signal (from the
user's perspective), did something useful, and then chose to exit.

If the program's exit status shows that it died due to SIGINT, it is
easy to know what to do.  But lots of non-trivial programs, probably
including 'make', catch SIGINT, do some quick cleanup and then exit.
In that case the shell has a hard time deciding what to do.
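
To make the "easy" case concrete, here is a minimal sketch of how a
script might recognise a child that died due to SIGINT.  It assumes the
common convention (used by bash and dash) that a signal death shows up
in $? as 128 plus the signal number, and that SIGINT is signal 2, as it
is on Linux.  The function name died_of_sigint is made up for this
example.

```shell
# Hypothetical helper, not taken from any real shell's source.
# Succeeds if the given exit status (as seen in $?) means the child
# was killed by SIGINT.  Assumes the 128+N convention and SIGINT == 2.
died_of_sigint() {
    [ "$1" -eq $((128 + 2)) ]
}

# Typical use after running some command:
#   some_command; status=$?
#   if died_of_sigint "$status"; then echo "interrupted"; fi
```

A program that catches SIGINT, cleans up and then calls exit() will
not produce this status, which is exactly the hard case described
above.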

I wrote a job-controlling shell many years ago and I think the
heuristic I came up with was that if the process exited with the
SIGINT status, or with a non-zero error status less than 3 seconds
after the signal actually arrived, then react to the signal and abort
any script.  However, if the process takes longer to exit or returns a
zero exit status, assume that it was interactive and handled the
interrupt to the user's satisfaction, and continue with any script.
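
The heuristic could be sketched roughly as below.  This is only an
illustration of the rule as described, not the original shell's code.
The function name sigint_decision and its two arguments are invented
here, and it assumes that the 3 second limit applies only to the
non-zero status case and that a SIGINT death appears as status 130
(128 + 2).

```shell
# Hypothetical sketch of the heuristic described above.
#   $1: the child's exit status as seen in $?
#   $2: whole seconds between SIGINT arriving and the child exiting
# Prints "abort" if the script should be abandoned, else "continue".
sigint_decision() {
    if [ "$1" -eq 130 ]; then
        # The child died due to SIGINT itself (assumed 128+2).
        echo abort
    elif [ "$1" -ne 0 ] && [ "$2" -lt 3 ]; then
        # The child exited quickly with an error: treat it as a
        # catch-cleanup-exit program and honour the interrupt.
        echo abort
    else
        # Slow exit or zero status: assume the program handled the
        # interrupt itself and carry on with the script.
        echo continue
    fi
}
```

For example, "sigint_decision 2 1" prints "abort", while
"sigint_decision 0 1" and "sigint_decision 2 5" both print "continue".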


I don't know what bash does, and it is possible that it could do a
better job.  But it is a problem for which there is no
straightforward solution (a bit like filesystem data safety, it would
seem :-)

NeilBrown