lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101202082913.7cb98444@notabene.brown>
Date:	Thu, 2 Dec 2010 08:29:13 +1100
From:	Neil Brown <neilb@...e.de>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Trond Myklebust <Trond.Myklebust@...app.com>,
	Hugh Dickins <hughd@...gle.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Nick Bowler <nbowler@...iptictech.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-nfs@...r.kernel.org, Rik van Riel <riel@...hat.com>,
	Christoph Hellwig <hch@....de>,
	Al Viro <viro@...iv.linux.org.uk>,
	Nick Piggin <npiggin@...nel.dk>
Subject: Re: [PATCH v2 3/3] NFS: Fix a memory leak in nfs_readdir

On Wed, 1 Dec 2010 12:39:29 -0800 Andrew Morton <akpm@...ux-foundation.org>
wrote:

> On Wed, 01 Dec 2010 15:05:38 -0500
> Trond Myklebust <Trond.Myklebust@...app.com> wrote:
> 
> > On Wed, 2010-12-01 at 11:23 -0800, Hugh Dickins wrote:
> > > On Wed, 1 Dec 2010, Trond Myklebust wrote:
> > > > On Wed, 2010-12-01 at 08:17 -0800, Linus Torvalds wrote:
> > > > > include/linux/fs.h |    1 +
> > > > >  mm/vmscan.c        |    3 +++
> > > > >  2 files changed, 4 insertions(+), 0 deletions(-)
> > > > > 
> > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > > > index c9e06cc..090f0ea 100644
> > > > > --- a/include/linux/fs.h
> > > > > +++ b/include/linux/fs.h
> > > > > @@ -602,6 +602,7 @@ struct address_space_operations {
> > > > >         sector_t (*bmap)(struct address_space *, sector_t);
> > > > >         void (*invalidatepage) (struct page *, unsigned long);
> > > > >         int (*releasepage) (struct page *, gfp_t);
> > > > > +       void (*freepage)(struct page *);
> > > > >         ssize_t (*direct_IO)(int, struct kiocb *, const struct iovec
> > > > > *iov,
> > > > >                         loff_t offset, unsigned long nr_segs);
> > > > >         int (*get_xip_mem)(struct address_space *, pgoff_t, int,
> > > > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > > > index d31d7ce..1accb01 100644
> > > > > --- a/mm/vmscan.c
> > > > > +++ b/mm/vmscan.c
> > > > > @@ -499,6 +499,9 @@ static int __remove_mapping(struct address_space
> > > > > *mapping, struct page *page)
> > > > >                 mem_cgroup_uncharge_cache_page(page);
> > > > >         }
> > > > >  
> > > > > +       if (mapping->a_ops->freepage)
> > > > > +               mapping->a_ops->freepage(page);
> > > > 
> > > > Hmm... Looking again at the problem, it appears that the same callback
> > > > needs to be added to truncate_complete_page() and
> > > > invalidate_complete_page2(). Otherwise we end up in a situation where
> > > > the page can sometimes be removed from the page cache without calling
> > > > freepage().
> > > > 
> > > > > +
> > > > >         return 1;
> > > > >  
> > > > >  cannot_free:
> > > 
> > > Yes, I was wondering quite how we would define this ->freepage thing,
> > > if it gets called from one place that removes from page cache and not
> > > from others.
> > > 
> > > Another minor problem with it: it would probably need to take the
> > > struct address_space *mapping as arg as well as struct page *page:
> > > because by this time page->mapping has been reset to NULL.
> > > 
> > > But I'm not at all keen on adding a calllback in this very special
> > > frozen-to-0-references place: please let's not do it without an okay
> > > from Nick Piggin (now Cc'ed).
> > > 
> > > I agree completely with what Linus said originally about how the
> > > page cannot be freed while there's a reference to it, and it should
> > > be possible to work this without your additional page locks.
> > > 
> > > Your ->releasepage should be able to judge whether the page is likely
> > > (not certain) to be freed - page_count 3? 1 for the page cache, 1 for
> > > the page_private reference, 1 for vmscan's reference, I think.  Then
> > > it can mark !PageUptodate and proceed with freeing the stuff you had
> > > allocated, undo page_has_private and its reference, and return 1 (or
> > > return 0 if it decides to hold on to the page).
> > 
> > That is very brittle. I'd prefer not to have to scan linux-mm every week
> > in order to find out if someone changed the page_count.
> > 
> > However, while reading Documentation/filesystems/vfs.txt (in order to
> > add documentation for freepage) I was surprised to read that the
> > ->releasepage() is itself supposed to be allowed to actually remove the
> > page from the address space if it so desires.
> 
> That doesn't sound right.  It came from Neil in 2006.
> 
> Neil, what were you thinking there?   Did you find such a ->releasepage()?

Nope, no idea, sorry.

No releasepage functions do anything like that, and no call sites suggest it
could be a possibility.  Quite the reverse - they are likely to remove the
page from the mapping without checking that it is still in the mapping.

So that sentence should be deleted.

(Getting good review for doco updates is sooo hard :-)

NeilBrown

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ