Message-ID: <CAH2r5mtD7tF7UH0sXbvX2PASV5a63X0bmGhXK6KU3UJf+HB1zg@mail.gmail.com>
Date: Wed, 1 Mar 2023 12:24:50 -0600
From: Steve French <smfrench@...il.com>
To: Paulo Alcantara <pc@...guebit.com>
Cc: David Howells <dhowells@...hat.com>,
Shyam Prasad N <nspmangalore@...il.com>,
Rohith Surabattula <rohiths.msft@...il.com>,
Tom Talpey <tom@...pey.com>,
Stefan Metzmacher <metze@...ba.org>,
Jeff Layton <jlayton@...nel.org>, linux-cifs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Murphy Zhou <jencce.kernel@...il.com>,
Steve French <sfrench@...ba.org>
Subject: Re: [PATCH 1/1] cifs: Fix memory leak in direct I/O

I also verified that this fixes the problem that Murphy pointed out - thx

On Wed, Mar 1, 2023 at 9:11 AM Paulo Alcantara <pc@...guebit.com> wrote:
>
> David Howells <dhowells@...hat.com> writes:
>
> > When __cifs_readv() and __cifs_writev() extract pages from a user-backed
> > iterator into a BVEC-type iterator, they set ->bv_need_unpin to note
> > whether they need to unpin the pages later. However, in both cases they
> > examine the BVEC-type iterator and not the source iterator - and so
> > bv_need_unpin doesn't get set and the pages are leaked.
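
To illustrate the mix-up described above, here is a minimal userspace sketch, not the actual patch. All names in it (iter_kind, io_ctx, will_pin, and so on) are hypothetical stand-ins, and it assumes only what the description states: extraction pins pages when the source iterator is user-backed, and the extracted iterator is always BVEC-type.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two iterator kinds mentioned above. */
enum iter_kind { KIND_USER, KIND_BVEC };

struct iter {
	enum iter_kind kind;
};

/* Hypothetical per-I/O context holding the extracted iterator. */
struct io_ctx {
	struct iter bvec_iter;
	bool need_unpin;
};

/* Assumption: only a user-backed iterator has its pages pinned on extraction. */
static bool will_pin(const struct iter *it)
{
	return it->kind == KIND_USER;
}

/* Models extraction: whatever the source is, the result is BVEC-type. */
static void extract(const struct iter *src, struct io_ctx *ctx)
{
	(void)src;
	ctx->bvec_iter.kind = KIND_BVEC;
}

/* Buggy pattern: asks the question of the extracted iterator. */
static bool need_unpin_buggy(const struct iter *src)
{
	struct io_ctx ctx;

	extract(src, &ctx);
	ctx.need_unpin = will_pin(&ctx.bvec_iter);	/* always false */
	return ctx.need_unpin;
}

/* Intended pattern: asks the question of the source iterator. */
static bool need_unpin_fixed(const struct iter *src)
{
	struct io_ctx ctx;

	extract(src, &ctx);
	ctx.need_unpin = will_pin(src);
	return ctx.need_unpin;
}
```

With a user-backed source, need_unpin_buggy() reports false, so the pins taken during extraction would never be dropped, while need_unpin_fixed() reports true.
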
> >
> > I think this may be responsible for the generic/208 xfstest failing
> > occasionally with:
> >
> > WARNING: CPU: 0 PID: 3064 at mm/gup.c:218 try_grab_page+0x65/0x100
> > RIP: 0010:try_grab_page+0x65/0x100
> > follow_page_pte+0x1a7/0x570
> > __get_user_pages+0x1a2/0x650
> > __gup_longterm_locked+0xdc/0xb50
> > internal_get_user_pages_fast+0x17f/0x310
> > pin_user_pages_fast+0x46/0x60
> > iov_iter_extract_pages+0xc9/0x510
> > ? __kmalloc_large_node+0xb1/0x120
> > ? __kmalloc_node+0xbe/0x130
> > netfs_extract_user_iter+0xbf/0x200 [netfs]
> > __cifs_writev+0x150/0x330 [cifs]
> > vfs_write+0x2a8/0x3c0
> > ksys_pwrite64+0x65/0xa0
> >
> > with the page refcount going negative. This is less unlikely than it seems
> > because the page is being pinned, not simply got, so the refcount is
> > increased by 1024 each time, and try_grab_page() only needs to be called
> > about 2097152 times (2^31 / 1024) for the refcount to go negative.
> >
> > Further, the test program (aio-dio-invalidate-failure) uses a 32MiB static
> > buffer and all the PTEs covering it refer to the same page because it's
> > never written to.
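
The arithmetic in the two paragraphs above can be checked with a short sketch. It assumes a signed 32-bit refcount and 4 KiB pages, neither of which is stated in the description, and the helper names are made up for illustration. The last helper additionally assumes each I/O covers the whole 32MiB buffer, which may not match what the test actually does.

```c
#include <stdint.h>

/* Refcount increment per pin, as given in the description. */
#define PIN_BIAS		1024LL
/* Assumption: 4 KiB pages. */
#define ASSUMED_PAGE_SIZE	4096LL
/* Size of the test program's static buffer, per the description. */
#define TEST_BUF_SIZE		(32LL * 1024 * 1024)

/* Leaked pins needed to push a signed 32-bit count past INT32_MAX. */
static long long pins_to_overflow(void)
{
	return ((long long)INT32_MAX + 1) / PIN_BIAS;
}

/* PTEs covering the buffer, all referring to the same page. */
static long long pages_in_buffer(void)
{
	return TEST_BUF_SIZE / ASSUMED_PAGE_SIZE;
}

/* Whole-buffer I/Os with leaked pins needed to reach the overflow. */
static long long whole_buffer_ios_to_overflow(void)
{
	return pins_to_overflow() / pages_in_buffer();
}
```

Under these assumptions, pins_to_overflow() gives 2097152, pages_in_buffer() gives 8192, and so roughly 256 whole-buffer I/Os with leaked pins would be enough to wrap the shared page's refcount.
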
> >
> > The warning in try_grab_page():
> >
> >       if (WARN_ON_ONCE(folio_ref_count(folio) <= 0))
> >               return -ENOMEM;
> >
> > then trips and prevents us ever using the page again for DIO at least.
> >
> > Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list")
> > Reported-by: Murphy Zhou <jencce.kernel@...il.com>
> > Link: https://lore.kernel.org/r/CAH2r5mvaTsJ---n=265a4zqRA7pP+o4MJ36WCQUS6oPrOij8cw@mail.gmail.com
> > Signed-off-by: David Howells <dhowells@...hat.com>
> > cc: Steve French <sfrench@...ba.org>
> > cc: Shyam Prasad N <nspmangalore@...il.com>
> > cc: Rohith Surabattula <rohiths.msft@...il.com>
> > cc: Paulo Alcantara <pc@....nz>
> > cc: Jeff Layton <jlayton@...nel.org>
> > cc: linux-cifs@...r.kernel.org
> > ---
> > fs/cifs/file.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
>
> Reviewed-by: Paulo Alcantara (SUSE) <pc@...guebit.com>
--
Thanks,
Steve