linux-kernel - Re: 2.6.39-rc4+: oom-killer busy killing tasks

Message-ID: <alpine.DEB.2.01.1105031345090.18728@trent.utfs.org>
Date:	Tue, 3 May 2011 13:53:31 -0700 (PDT)
From:	Christian Kujau <lists@...dbynature.de>
To:	Dave Chinner <david@...morbit.com>
cc:	Markus Trippelsdorf <markus@...ppelsdorf.de>,
	LKML <linux-kernel@...r.kernel.org>, xfs@....sgi.com,
	minchan.kim@...il.com
Subject: Re: 2.6.39-rc4+: oom-killer busy killing tasks

On Tue, 3 May 2011 at 10:51, Dave Chinner wrote:
> Can you run an event trace of all the XFS events during a find for
> me? Don't do it over the entire subset of the filesystem - only
> 100,000 inodes is sufficient (i.e. kill the find once the xfs inode
> cache slab reaches 100k inodes). While still running the event trace,
> can you then drop the caches (echo 3 > /proc/sys/vm/drop_caches) and
> check that the xfs inode cache is emptied? If it isn't emptied, drop
> caches again to see if that empties it. If you could then post the
> event trace, I might be able to see what is going strange with the
> shrinker and/or reclaim.

OK, I've done something. Not sure if I got everything right:

  https://trent.utfs.org/p/bits/2.6.39-rc4/oom/trace/
  (new URL, the other one ran out of webspace. Omit the s in https
   if you don't have the CAcert.org root cert imported)

* I've started 'trace-cmd record -e xfs /usr/bin/find /mnt/backup'
  in one (screen-)window, which produced trace-14.dat.bz2

* I've started my oom-debug.sh script in another, which produced
  slabinfo-14.txt.bz2

* In another window, I dropped the caches and looked at
  /proc/slabinfo again, see drop_caches-14.txt
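
For anyone repeating the check above, here is a minimal sketch of reading
the XFS inode cache object count out of /proc/slabinfo, e.g. to see when it
reaches 100k or whether it empties after dropping caches. It is not the
oom-debug.sh script mentioned above. The function names are hypothetical,
the slab name "xfs_inode" and the column layout (name, active_objs, ...)
are assumptions based on the usual slabinfo format, and reading the file
normally requires root.

```python
def slab_active_objects(slabinfo_text, cache_name="xfs_inode"):
    """Return active_objs for cache_name, or None if the cache is not listed.

    Assumes the usual /proc/slabinfo layout: two header lines, then one
    line per cache starting with "<name> <active_objs> <num_objs> ...".
    """
    for line in slabinfo_text.splitlines():
        # Skip the "slabinfo - version" and "# name ..." header lines.
        if line.startswith("slabinfo") or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) >= 3 and fields[0] == cache_name:
            return int(fields[1])
    return None


def read_slab_active_objects(cache_name="xfs_inode", path="/proc/slabinfo"):
    """Read the live file (hypothetical helper; usually needs root)."""
    with open(path) as f:
        return slab_active_objects(f.read(), cache_name)
```

Calling read_slab_active_objects() before and after
"echo 3 > /proc/sys/vm/drop_caches" would show whether the cache shrank.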

Somehow "trace-cmd report" segfaults here, but I hope "trace-14.report" 
contains enough details already. If not, I can do this again.

Thanks,
Christian.
-- 
BOFH excuse #314:

You need to upgrade your VESA local bus to a MasterCard local bus.