linux-kernel - Very odd memory behavior. Is this normal?

Message-ID: <0683F5F5A5C7FE419A752A034B4A0B97973F7C13@sswchi5pmbx2.peak6.net>
Date:	Wed, 9 Oct 2013 19:31:54 +0000
From:	Shaun Thomas <sthomas@...ionshouse.com>
To:	"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>
Subject: Very odd memory behavior. Is this normal?

Devs,

Ever since we moved from 2.6.* to 3.*, I've noticed some very odd MM behavior I'd like to run past you. It's pretty difficult to replicate, but I've figured out a fairly straightforward method. But first, the issue:

Our systems with 72GB of RAM on a 3.2 kernel eventually converge on this, from /proc/meminfo:

Active(file):   29059980 kB
Inactive(file): 29069296 kB

Basically some kind of even split across Inactive and Active, suggesting a hard IO loop that's purging as fast as it can promote. It's pretty easy to cause, too. If you have PostgreSQL handy, any DB that just barely fits in memory should be sufficient.

createdb pgbench
pgbench -i -s 4000 pgbench
pgbench -T 1800 -c 24 -S pgbench

I let that run for a while to cache everything, until read IO slows down to a trickle according to iostat. Memory at that point looks like this:

Active(file):   61056316 kB
Inactive(file):   249968 kB

And free -m looks like this:

             total       used       free     shared    buffers     cached
Mem:         72485      66963       5521          0          2      64183
-/+ buffers/cache:       2776      69708
Swap:         2047          0       2047

So I have 5GB available, 64GB cached... looks pretty normal. I leave the pgbench running, then in another terminal, I waste more memory than I have free:

python - <<EOF
# Allocate 8GB and hold on to it so the page cache is forced to shrink.
from time import sleep
foo = bytearray(8 * 1024 * 1024 * 1024)
sleep(1000)
EOF

After a minute or two, Active(file) and Inactive(file) are even again, and IO to the underlying device is extremely high. In our case, that's 350MB/s in random reads. Every kernel I've tested does this: 3.2, 3.5, and 3.8. I haven't been able to test anything newer than that, but the behavior seems pretty consistent. This is with zone_reclaim_mode disabled, and numactl --interleave=all doesn't change anything.
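
For anyone trying to reproduce this, the split is easier to follow if the two counters are sampled over time rather than read off by hand. The following is only a rough sketch and not part of the original test: the function names file_lru_split and watch are made up for illustration, and the only thing assumed about /proc/meminfo is the "Name:   value kB" layout shown in the excerpts above.

```python
import time


def file_lru_split(meminfo_text):
    """Return (active_kb, inactive_kb, active_fraction) for the file LRU lists.

    meminfo_text is the raw contents of /proc/meminfo (or any excerpt that
    contains the Active(file) and Inactive(file) lines).
    """
    fields = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    active = fields["Active(file)"]
    inactive = fields["Inactive(file)"]
    total = active + inactive
    fraction = active / float(total) if total else 0.0
    return active, inactive, fraction


def watch(interval=5, samples=12, path="/proc/meminfo"):
    """Print the file LRU split every `interval` seconds, `samples` times."""
    for _ in range(samples):
        with open(path) as f:
            active, inactive, fraction = file_lru_split(f.read())
        print("Active(file) %d kB  Inactive(file) %d kB  active share %.1f%%"
              % (active, inactive, fraction * 100))
        time.sleep(interval)
```

Running watch() in a third terminal while pgbench and the allocation are going should show the active share moving from nearly all of the file cache toward roughly half, if the behavior described above reproduces.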

I get that it needs to balance memory, but this seems excessive somehow. I'm probably missing something, though. Any insight would be helpful, and I can try to provide anything else you need if this is a legitimate complaint.

Thanks!

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd | Suite 500 | Chicago IL, 60604
312-676-8870
sthomas@...ionshouse.com
