Message-ID: <87r2f26qng.fsf@linkitivity.dja.id.au>
Date:   Sat, 01 Dec 2018 11:23:47 +1100
From:   Daniel Axtens <dja@...ens.net>
To:     David Howells <dhowells@...hat.com>, torvalds@...ux-foundation.org
Cc:     Shantanu Goel <sgoel01@...oo.com>,
        Kiran Kumar Modukuri <kiran.modukuri@...il.com>,
        dhowells@...hat.com, linux-cachefs@...hat.com,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/7] cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active

David Howells <dhowells@...hat.com> writes:

> From: Kiran Kumar Modukuri <kiran.modukuri@...il.com>
>
> [Description]
>
> In a heavily loaded system where the system pagecache is nearing memory
> limits and fscache is enabled, pages can be leaked by fscache while trying
> to read pages from the cachefiles backend.  This can happen because two
> applications can be reading the same page from a single mount, so two
> threads can be trying to read the backing page at the same time.  This
> results in one of the threads finding that a page for the backing file or
> netfs file is already in the radix tree.  During the error handling,
> cachefiles does not clean up the reference on the backing page, leading to
> a page leak.
>
> [Fix]
> The fix is straightforward: decrement the reference on the backing page
> when the error is encountered.
>
>   [dhowells: Note that I've removed the clearance and put of newpage as
>    they aren't attested in the commit message and don't appear to actually
>    achieve anything since a new page is only allocated if newpage!=NULL and
>    any residual new page is cleared before returning.]

Sorry I hadn't got back to you on this; I think we also discussed this
with the Ubuntu kernel team and concluded - as you did - that these
didn't fix any bugs but did make things seem more consistent.

Regards,
Daniel
>
> [Testing]
> I have tested the fix using the following method for 12+ hrs.
>
> 1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc <server_ip>:/export /mnt/nfs
> 2) create 10000 files of 2.8MB each in an NFS mount.
> 3) start a thread to simulate heavy VM pressure
>    (while true ; do echo 3 > /proc/sys/vm/drop_caches ; sleep 1 ; done)&
> 4) start multiple parallel readers for the data set at the same time
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    ..
>    ..
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
>    find /mnt/nfs -type f | xargs -P 80 cat > /dev/null &
> 5) finally check using cat /proc/fs/fscache/stats | grep -i pages ;
>    free -h , cat /proc/meminfo and page-types -r -b lru
>    to ensure all pages are freed.
>
> Reviewed-by: Daniel Axtens <dja@...ens.net>
> Signed-off-by: Shantanu Goel <sgoel01@...oo.com>
> Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@...il.com>
> [dja: forward ported to current upstream]
> Signed-off-by: Daniel Axtens <dja@...ens.net>
> Signed-off-by: David Howells <dhowells@...hat.com>
> ---
>
>  fs/cachefiles/rdwr.c |    6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
> index 40f7595aad10..db233588a69a 100644
> --- a/fs/cachefiles/rdwr.c
> +++ b/fs/cachefiles/rdwr.c
> @@ -535,7 +535,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
>  					    netpage->index, cachefiles_gfp);
>  		if (ret < 0) {
>  			if (ret == -EEXIST) {
> +				put_page(backpage);
> +				backpage = NULL;
>  				put_page(netpage);
> +				netpage = NULL;
>  				fscache_retrieval_complete(op, 1);
>  				continue;
>  			}
> @@ -608,7 +611,10 @@ static int cachefiles_read_backing_file(struct cachefiles_object *object,
>  					    netpage->index, cachefiles_gfp);
>  		if (ret < 0) {
>  			if (ret == -EEXIST) {
> +				put_page(backpage);
> +				backpage = NULL;
>  				put_page(netpage);
> +				netpage = NULL;
>  				fscache_retrieval_complete(op, 1);
>  				continue;
>  			}
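
For readers following the reference counting, the -EEXIST path in the hunks
above can be modelled in userspace.  This is only a rough sketch, not kernel
code: struct toy_page and every toy_* helper are hypothetical stand-ins
invented for illustration, and the model assumes the loop holds exactly one
reference on each page when the insertion fails.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for struct page: just a reference count. */
struct toy_page {
	int refcount;
};

static void toy_get_page(struct toy_page *page)
{
	page->refcount++;
}

static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Hypothetical model of inserting a page into the page cache.  When
 * slot_taken is nonzero, another thread has already inserted a page at
 * that index and the call fails with -EEXIST without taking a reference.
 */
static int toy_add_to_page_cache(struct toy_page *page, int slot_taken)
{
	if (slot_taken)
		return -EEXIST;
	toy_get_page(page);	/* the cache holds its own reference */
	return 0;
}

/*
 * Runs one iteration of the modelled loop with the race forced to happen
 * and returns how many backing-page references are still held afterwards.
 * apply_fix selects whether the extra put from the patch is performed.
 */
static int toy_leaked_backpage_refs(int apply_fix)
{
	struct toy_page backpage = { 0 };
	struct toy_page netpage = { 0 };
	int ret;

	toy_get_page(&backpage);	/* reference held on the backing page */
	toy_get_page(&netpage);		/* reference held on the netfs page */

	ret = toy_add_to_page_cache(&netpage, 1);
	if (ret == -EEXIST) {
		if (apply_fix)
			toy_put_page(&backpage);	/* the added put */
		toy_put_page(&netpage);
		/* the kernel loop would "continue" here */
	}

	return backpage.refcount;
}
```

Calling toy_leaked_backpage_refs(0) models the code before the patch and
leaves one reference outstanding on the backing page; calling it with 1
models the patched path and leaves none.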