Message-ID: <4B72E74C.9040001@nortel.com>
Date:	Wed, 10 Feb 2010 11:05:16 -0600
From:	"Chris Friesen" <cfriesen@...tel.com>
To:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
CC:	Rik van Riel <riel@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-mm@...ck.org, Balbir Singh <balbir@...ux.vnet.ibm.com>
Subject: Re: tracking memory usage/leak in "inactive" field in /proc/meminfo?

On 02/09/2010 06:32 PM, KOSAKI Motohiro wrote:

> can you please post your /proc/meminfo?


On 02/09/2010 09:50 PM, Balbir Singh wrote:
> Do you have swap enabled? Can you help with the OOM killed dmesg log?
> Does the situation get better after OOM killing.


On 02/09/2010 10:09 PM, KOSAKI Motohiro wrote:

> Chris, 2.6.27 is a bit old. please test it on latest kernel. and please
> don't use any proprietary drivers.


Thanks for the replies.

Swap support is enabled in the kernel, but no swap device is configured.
ipcs shows little System V IPC consumption.

The test load relies on a number of kernel modifications, making it
difficult to use newer kernels. (This is an embedded system.)  There are
no closed-source drivers loaded, though there are some that are not in
vanilla kernels.  I haven't yet tried to reproduce the problem with a
minimal load--I've been more focused on trying to understand what's
going on in the code first.  It's on my list to try though.

Here are some /proc/meminfo outputs from a test run where we
artificially chewed most of the free memory to try and force the oom
killer to fire sooner (otherwise it takes days for the problem to trigger).

It's spaced with tabs so I'm not sure if it'll stay aligned.  The first
row is the sample number.  All the HugePages entries were 0.  The
DirectMap entries were constant. SwapTotal/SwapFree/SwapCached were 0,
as were Writeback/NFS_Unstable/Bounce/WritebackTmp.

Samples were taken 10 minutes apart.  Between samples 49 and 50 the
oom-killer fired.

		13		49		50
MemTotal	4042848		4042848		4042848
MemFree		113512		52668		69536
Buffers		20		24		76
Cached		1285588		1287456		1295128
Active		2883224		3369440		2850172
Inactive	913756		487944		990152
Dirty		36		216		252
AnonPages	2274756		2305448		2279216
Mapped		10804		12772		15760
Slab		62324		62568		63608
SReclaimable	24092		23912		24848
SUnreclaim	38232		38656		38760
PageTables	11960		12144		11848
CommitLimit	2021424		2021424		2021424
Committed_AS	12666508	12745200	7700484
VmallocUsed	23256		23256		23256
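
For anyone wanting to collect a comparable table, here is a minimal Python
sketch. It is not the script used for the run above; the function names
(parse_meminfo, read_meminfo, collect) and the default 600-second interval
are assumptions chosen to match the 10-minute spacing described earlier. It
relies only on the usual "Name:   value kB" layout of /proc/meminfo.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Values are returned as printed by the kernel (kB for most fields,
    plain counts for the HugePages_* entries).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not "Name: <integer> [unit]".
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Read and parse one snapshot (path is overridable for testing)."""
    with open(path) as f:
        return parse_meminfo(f.read())


def collect(count, interval_s=600, path="/proc/meminfo"):
    """Take `count` snapshots, sleeping `interval_s` seconds between them."""
    snapshots = []
    for i in range(count):
        snapshots.append(read_meminfo(path))
        if i + 1 < count:
            time.sleep(interval_s)
    return snapshots
```

Calling collect(50) would take roughly eight hours at the default interval;
each returned dict can then be written out as one column of a table like the
one above.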

It's hard to get a good picture from just a few samples, so I've
attached an OpenOffice spreadsheet showing three separate runs.  The
samples above are from sheet 3 in the document.

In those spreadsheets I notice that
memfree+active+inactive+slab+pagetables is basically a constant.
However, if I don't use active+inactive then I can't make the numbers
add up.  And the difference between active+inactive and
buffers+cached+anonpages+dirty+mapped+pagetables+vmallocused grows
almost monotonically.
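
The two sums above can be checked against the three samples in the table
with a short Python sketch. The values are copied from the table (in kB);
the names SAMPLES, CONSTANT_FIELDS, ACCOUNTED_FIELDS, total and lru_gap are
only illustrative and do not come from the spreadsheet.

```python
# Values in kB, copied from samples 13, 49 and 50 in the table above.
SAMPLES = {
    13: dict(MemFree=113512, Buffers=20, Cached=1285588, Active=2883224,
             Inactive=913756, Dirty=36, AnonPages=2274756, Mapped=10804,
             Slab=62324, PageTables=11960, VmallocUsed=23256),
    49: dict(MemFree=52668, Buffers=24, Cached=1287456, Active=3369440,
             Inactive=487944, Dirty=216, AnonPages=2305448, Mapped=12772,
             Slab=62568, PageTables=12144, VmallocUsed=23256),
    50: dict(MemFree=69536, Buffers=76, Cached=1295128, Active=2850172,
             Inactive=990152, Dirty=252, AnonPages=2279216, Mapped=15760,
             Slab=63608, PageTables=11848, VmallocUsed=23256),
}

# memfree+active+inactive+slab+pagetables: the "basically constant" sum.
CONSTANT_FIELDS = ("MemFree", "Active", "Inactive", "Slab", "PageTables")

# buffers+cached+anonpages+dirty+mapped+pagetables+vmallocused:
# the fields compared against active+inactive.
ACCOUNTED_FIELDS = ("Buffers", "Cached", "AnonPages", "Dirty", "Mapped",
                    "PageTables", "VmallocUsed")


def total(sample, fields):
    """Sum the named fields of one sample."""
    return sum(sample[name] for name in fields)


def lru_gap(sample):
    """(active + inactive) minus the sum of ACCOUNTED_FIELDS."""
    return sample["Active"] + sample["Inactive"] - total(sample, ACCOUNTED_FIELDS)


for number in sorted(SAMPLES):
    s = SAMPLES[number]
    print(number, total(s, CONSTANT_FIELDS), lru_gap(s))
```

For these three samples the first sum is 3984776, 3984764 and 3985316 kB,
a spread of under 600 kB, while the gap is 190560, 216068 and 214788 kB.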

Thanks,

Chris

Download attachment "meminfo.ods" of type "application/vnd.oasis.opendocument.spreadsheet" (76528 bytes)
