Message-ID: <20151017160230.GA19968@thunk.org>
Date: Sat, 17 Oct 2015 12:02:30 -0400
From: Theodore Ts'o <tytso@....edu>
To: linux-ext4@...r.kernel.org
Cc: Namjae Jeon <namjae.jeon@...sung.com>
Subject: memory leak: data=journal and {collapse,insert,zero}_range
On the last conference call I mentioned that I had recently been
running into memory leaks with data=journal. I've traced it down to
{collapse,insert,zero}_range calls.
It bisected down to 1ce01c4a199: "ext4: fix COLLAPSE_RANGE test
failure in data journalling mode", and specifically, this bit of code
in the collapse/insert/zero range functions:
	/* Call ext4_force_commit to flush all data in case of data=journal. */
	if (ext4_should_journal_data(inode)) {
		ret = ext4_force_commit(inode->i_sb);
		if (ret)
			return ret;
	}
This is needed to prevent data from being lost with the insert and
collapse range calls. (It is not obvious to me whether this bit of
code, and the filemap_write_and_wait_range() call, is needed for
zero_range, but I haven't had a chance to investigate this more
deeply.)
Further investigation shows that the problem seems to be a buffer_head
leak. Running generic/091 with data=journal in a loop, and running the
following hook script between test runs:
#!/bin/sh
# /tmp/hooks/logger
echo 3 > /proc/sys/vm/drop_caches
logger $(grep MemFree /proc/meminfo)
logger $(grep buffer_head /proc/slabinfo)
logger $(grep jbd2_journal_head /proc/slabinfo)
what we find is:
logger: MemFree: 7348684 kB
logger: buffer_head 608 1600 128 32 1 : tunables 32 16 8 : slabdata 50 50 0 : globalstat 108003 36592 3294 0 0 1 0 0 0 : cpustat 102424 6803 101863 6760
logger: jbd2_journal_head 120 120 136 30 1 : tunables 32 16 8 : slabdata 4 4 0 : globalstat 203 120 4 0 0 0 0 0 0 : cpustat 179 14 84 0
logger: MemFree: 7155576 kB
logger: buffer_head 45008 65920 128 32 1 : tunables 32 16 8 : slabdata 2060 2060 0 : globalstat 207971 79408 5753 0 0 1 0 0 0 : cpustat 354158 14482 311984 11664
logger: jbd2_journal_head 206 690 136 30 1 : tunables 32 16 8 : slabdata 23 23 128 : globalstat 38817 11086 1142 0 0 0 0 0 0 : cpustat 68275 3482 68402 3326
kernel: logger (5838): drop_caches: 3
logger: MemFree: 6967656 kB
logger: buffer_head 89552 111616 128 32 1 : tunables 32 16 8 : slabdata 3488 3488 0 : globalstat 308435 124352 7589 0 0 1 0 0 0 : cpustat 606287 22193 522363 16591
logger: jbd2_journal_head 210 660 136 30 1 : tunables 32 16 8 : slabdata 22 22 128 : globalstat 77205 11086 2158 0 0 0 0 0 0 : cpustat 136335 6954 136593 6669
logger: MemFree: 6780176 kB
logger: buffer_head 134080 155328 128 32 1 : tunables 32 16 8 : slabdata 4854 4854 0 : globalstat 408163 168144 9369 0 0 1 0 0 0 : cpustat 857710 29874 732032 21489
logger: jbd2_journal_head 210 570 136 30 1 : tunables 32 16 8 : slabdata 19 19 128 : globalstat 115685 11086 3198 0 0 0 0 0 0 : cpustat 204408 10417 204798 9998
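To make the trend in these numbers explicit, here is a minimal Python
sketch that takes the MemFree and buffer_head lines quoted above and
prints the change between consecutive samples. The LOG excerpt is
truncated after the first few slabinfo columns, and the helper names
(samples, deltas) are made up for illustration; they are not part of
xfstests or any kernel tooling.

```python
import re

# Excerpt of the logged lines quoted above; each buffer_head line is
# truncated after the first few slabinfo columns (active objects first).
LOG = """\
logger: MemFree: 7348684 kB
logger: buffer_head 608 1600 128 32 1
logger: MemFree: 7155576 kB
logger: buffer_head 45008 65920 128 32 1
logger: MemFree: 6967656 kB
logger: buffer_head 89552 111616 128 32 1
logger: MemFree: 6780176 kB
logger: buffer_head 134080 155328 128 32 1
"""


def samples(log, key):
    """Return the first integer that follows `key` on each matching line."""
    pattern = re.compile(r"\b" + re.escape(key) + r":?\s+(\d+)")
    return [int(m.group(1)) for m in pattern.finditer(log)]


def deltas(values):
    """Return the differences between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


buffer_heads = samples(LOG, "buffer_head")  # active buffer_head objects
mem_free_kb = samples(LOG, "MemFree")       # free memory in kB

print("buffer_head growth per sample:", deltas(buffer_heads))
print("MemFree change per sample (kB):", deltas(mem_free_kb))
```

On the samples above, the active buffer_head count grows by roughly
44,500 objects from one sample to the next, while MemFree drops by
roughly 190 MB each time, even though drop_caches is written before
every sample.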
The problem then appears to be that the buffer heads associated with
pages that have been recently written in data=journal mode are not
getting freed. I haven't had time to take this any further, but I
wanted to document what I've found to date as a checkpoint. If
someone has time to take this further, or has any additional insight,
I'd certainly appreciate it!
- Ted
P.S. I will shortly be releasing an enhancement to kvm-xfstests and
gce-xfstests which allows collapse/insert/zero range to be easily
disabled, like this:
gce-xfstests --hooks /tmp/hooks -C 20 -c data_journal --no-collapse --no-zero --no-insert generic/091
(The above traces were taken without --no-collapse, --no-zero, and
--no-insert. With those options added, the memory leak goes away.)