Message-ID: <alpine.LFD.1.10.0807121339430.2875@woody.linux-foundation.org>
Date:	Sat, 12 Jul 2008 13:47:36 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Török Edwin <edwintorok@...il.com>
cc:	Ingo Molnar <mingo@...e.hu>, Roland McGrath <roland@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Elias Oltmanns <eo@...ensachen.de>,
	Arjan van de Ven <arjan@...radead.org>,
	Oleg Nesterov <oleg@...sign.ru>
Subject: Re: [PATCH] x86_64: fix delayed signals



On Sat, 12 Jul 2008, Török Edwin wrote:
> 
> A bit off-topic, but something I noticed during the tests:
> In my original test I have rm-ed the files right after launching dd in
> the background, yet it still continued to write to the disk.
> I can understand that if the file is opened O_RDWR, you might seek back
> and read what you wrote, so Linux needs to actually do the write,
> but why does it insist on writing to the disk, on a file opened with
> O_WRONLY, after the file itself got unlinked?

Linux itself doesn't insist on writing to disk. In fact, at least with 
traditional UNIX filesystems (e.g. minix, ext2), the pending writes to a 
deleted file would simply be thrown away.

But some filesystems can't just invalidate dirty buffers (some won't do it 
for meta-data, others won't do it for _any_ data). So again, this 
behaviour depends on the filesystem. And sadly, the more "advanced" the 
filesystem, the worse it usually behaves here.
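
To illustrate the situation being discussed, here is a minimal sketch of a 
process that keeps writing to a file after its name has been removed. The 
helper name write_after_unlink and the file name are made up for this 
example. It only shows that the writes still succeed from userspace; 
whether the dirty data then reaches the disk is up to the filesystem, as 
described above.

```python
import os
import tempfile


def write_after_unlink(directory, chunk=b"x" * 4096, chunks=4):
    """Open a file write-only, unlink it, then keep writing to it.

    Returns (link_count, size_in_bytes, name_still_visible).
    """
    path = os.path.join(directory, "unlinked-demo.bin")  # hypothetical name
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        # The name goes away, but the open descriptor keeps the inode alive.
        os.unlink(path)
        for _ in range(chunks):
            # These writes still succeed and still dirty the page cache.
            os.write(fd, chunk)
        st = os.fstat(fd)
        return st.st_nlink, st.st_size, os.path.exists(path)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_after_unlink(d))
```

The file keeps growing even though no directory entry refers to it any 
more, which is why the kernel cannot tell from the unlink alone that the 
data is uninteresting until the last descriptor is closed.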


> I have my filesystems mounted as noatime already.
> But yes, I am using different filesystems, the x86-64 box has reiserfs,
> and the x86-32 box has xfs.
> 
> > You can try to limit the amount of dirty data in flight by tweaking 
> > /proc/sys/vm/dirty*ratio
> 
> I have these in my /etc/rc.local:
> echo 5 > /proc/sys/vm/dirty_background_ratio
> echo 10 >/proc/sys/vm/dirty_ratio

That matches the modern defaults. You can try playing with them if you 
want to. And yes, it's worth testing nr_requests too.
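
Before changing any of these knobs it helps to see what they are currently 
set to. The sketch below reads the dirty*ratio files and a block device's 
nr_requests. The function names are made up for this example, "sda" is 
only a placeholder device, and /sys/block/<device>/queue/nr_requests is 
assumed to be where the request queue depth is exposed on the system at 
hand.

```python
import glob
import os


def read_dirty_ratios(vm_dir="/proc/sys/vm"):
    """Return {name: value} for every file matching dirty*ratio in vm_dir."""
    settings = {}
    for path in sorted(glob.glob(os.path.join(vm_dir, "dirty*ratio"))):
        with open(path) as f:
            settings[os.path.basename(path)] = int(f.read().strip())
    return settings


def read_nr_requests(device, sys_block="/sys/block"):
    """Return the request queue depth for a block device such as 'sda'."""
    path = os.path.join(sys_block, device, "queue", "nr_requests")
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    print(read_dirty_ratios())
    try:
        print(read_nr_requests("sda"))  # placeholder device name
    except OSError as exc:
        print("could not read nr_requests:", exc)
```

Both directories are parameters so the helpers can be pointed at a copy 
of the files when experimenting.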

> > Ok, that is definitely not related to signals at all. You're simply stuck 
> > waiting for IO - or perhaps some fundamental filesystem semaphore which is 
> > held while some IO needs to be flushed.
> 
> AFAICT reiserfs still uses the BKL, could that explain why one I/O
> delays another?

The BKL should be ok in this respect - it gets automatically dropped when 
doing synchronous waiting (this is something that will possibly go away as 
we try to convince people to get rid of the BKL, but it certainly hasn't 
happened yet).

So it actually gets worse with other locks - semaphores or mutexes - that 
stay held over IO. And reiserfs has a journal lock (and a "commit" lock), 
but I don't know how they are held and whether this could be part of the 
issue.
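
The effect of a lock that stays held over IO can be shown with a small 
userspace analogy. This is only a sketch: the sleep stands in for slow 
disk IO, an ordinary Python lock stands in for a mutex or semaphore, and 
nothing here models the reiserfs journal or commit locks specifically.

```python
import threading
import time


def measure_wait_behind_lock(io_seconds=0.2):
    """Return how long a second thread waits for a lock held across 'IO'."""
    lock = threading.Lock()
    holder_has_lock = threading.Event()

    def holder():
        with lock:
            holder_has_lock.set()
            # Stand-in for slow disk IO performed with the lock still held.
            time.sleep(io_seconds)

    t = threading.Thread(target=holder)
    t.start()
    holder_has_lock.wait()  # make sure the holder really owns the lock

    start = time.monotonic()
    with lock:  # blocks until the holder's "IO" has finished
        waited = time.monotonic() - start
    t.join()
    return waited


if __name__ == "__main__":
    print("waited %.3f seconds" % measure_wait_behind_lock())
```

The waiter does no IO of its own, yet it is delayed for roughly as long 
as the holder's IO takes. That is the pattern that makes one I/O appear 
to delay another.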

> > This is also why your trace on just 'kill_pgrp' and 'detach_pid' is not 
> > interesting. It's _normal_ to have a delay between them. It can happen 
> > because the process blocks (or catches) signals, but it will also happen 
> > if some system call waits for disk.
> 
> Is there a way to trace what happens between those 2 functions?

You could try to trace not just those functions, but scheduling events 
too. Or yes, do something special-cased.

Trying to figure out latencies in the block trace is likely also going to 
be interesting (although you won't see any signal issues there - but any 
long read latencies will automatically tend to imply latency issues not 
just for signals, but for pretty much any operations).
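
As a rough sketch of how such latencies could be pulled out of a block 
trace, the code below pairs each request's issue event with its 
completion event. It assumes the trace has already been parsed into 
(timestamp, action, sector) tuples, with "D" marking a request issued to 
the driver and "C" marking its completion, as blktrace labels them. The 
tuple layout and the function names are made up for this example.

```python
def request_latencies(events):
    """Pair issue ('D') and completion ('C') events by starting sector.

    events: iterable of (timestamp_seconds, action, sector) in time order.
    Returns a list of (sector, latency_seconds) in completion order.
    """
    issued = {}
    latencies = []
    for timestamp, action, sector in events:
        if action == "D":
            issued[sector] = timestamp
        elif action == "C" and sector in issued:
            latencies.append((sector, timestamp - issued.pop(sector)))
    return latencies


def slow_requests(events, threshold_seconds):
    """Return only the requests whose latency reaches the threshold."""
    return [
        (sector, latency)
        for sector, latency in request_latencies(events)
        if latency >= threshold_seconds
    ]


if __name__ == "__main__":
    sample = [
        (0.0, "D", 1000),
        (0.5, "C", 1000),
        (1.0, "D", 2000),
        (1.25, "C", 2000),
    ]
    print(slow_requests(sample, 0.4))
```

Matching on the sector alone is a simplification. A real trace may merge 
or split requests, so treat the result as a first approximation.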

		Linus
