Message-ID: <20151021145214.GC2165@thunk.org>
Date:	Wed, 21 Oct 2015 10:52:14 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Namjae Jeon <namjae.jeon@...sung.com>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: memory leak: data=journal and {collapse,insert,zero}_range

On Wed, Oct 21, 2015 at 06:44:10PM +0900, Namjae Jeon wrote:
> > Interestingly we're not seeing these memory leaks on the truncate
> > path, so I suspect the issue is in how collapse range is clearing
> > pages from the page cache, especially pages that were freshly written
> > to the journal by the commit but which hadn't yet been written to
> > disk and then marked as complete so we can allow the relevant
> > transaction to be checkpointed.  (Although we're not leaking the
> > journal head structures, but only the buffer heads, so the story most
> > must be a bit more complicated than that.)
> 
> Okay, thanks for sharing your view and points!
> 
> Currently I can reproduce the memory leak without collapse/insert/zero range,
> under the following conditions (collapse/insert/zero range are disabled with
> the -I -C -z options, and the -y option is used instead of -W):
>   1. a small partition (1GB)
>   2. run fsx with these options: "./fsx -N 30000 -o 128000 -l 500000 -r 4096 -t 512 -w 512 -Z -R -y -I -C -z testfile"
> The result is the same as with generic/091 (a buffer_head leak).
> 
> So I am starting to look for the root cause based on your points.
> I will share the result or the patch.

Thanks, that's a very interesting data point.  So this makes it appear
that the problem *is* probably with how we deal with checkpointing
buffers after the pages get discarded using either a truncate or a
collapse_range, since the 'y' option causes a lot of fsyncs, and hence
commits, some of which are happening after a truncate command.
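
For anyone trying to confirm the buffer_head leak discussed above, here
is a minimal sketch (not part of the original report) that compares the
active buffer_head object count in two snapshots of /proc/slabinfo.  The
helper names parse_slabinfo and buffer_head_delta are hypothetical, and
the sketch assumes the usual slabinfo 2.x column order (name,
active_objs, num_objs, ...).

```python
def parse_slabinfo(text):
    """Return {slab_name: (active_objs, num_objs)} from /proc/slabinfo text."""
    counts = {}
    for line in text.splitlines():
        # Skip blank lines, the version banner, and the column-header comment.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        try:
            counts[fields[0]] = (int(fields[1]), int(fields[2]))
        except ValueError:
            # Ignore lines that do not follow the assumed column layout.
            continue
    return counts


def buffer_head_delta(before_text, after_text, slab="buffer_head"):
    """Return the change in active objects for one slab between snapshots."""
    before = parse_slabinfo(before_text).get(slab, (0, 0))[0]
    after = parse_slabinfo(after_text).get(slab, (0, 0))[0]
    return after - before


if __name__ == "__main__":
    # Reading /proc/slabinfo normally requires root.
    with open("/proc/slabinfo") as f:
        snapshot = f.read()
    print(parse_slabinfo(snapshot).get("buffer_head"))
```

Taking one snapshot before the fsx run and another after the test file
system has been unmounted should make a leak show up as a positive
delta; page cache activity on other file systems will add noise, so
treat the number as a hint rather than proof.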

Thanks for taking a look at this.  I really appreciate it.

Cheers,

					- Ted








