linux-kernel - Re: [RFC PATCH v3 14/15] dcache: Implement partial shrink via Slab Movable Objects

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190411050538.GA22216@eros.localdomain>
Date:   Thu, 11 Apr 2019 15:05:38 +1000
From:   "Tobin C. Harding" <me@...in.cc>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     "Tobin C. Harding" <tobin@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <guro@...com>,
        Alexander Viro <viro@....linux.org.uk>,
        Christoph Hellwig <hch@...radead.org>,
        Pekka Enberg <penberg@...helsinki.fi>,
        David Rientjes <rientjes@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Christopher Lameter <cl@...ux.com>,
        Matthew Wilcox <willy@...radead.org>,
        Miklos Szeredi <mszeredi@...hat.com>,
        Andreas Dilger <adilger@...ger.ca>,
        Waiman Long <longman@...hat.com>,
        Tycho Andersen <tycho@...ho.ws>, Theodore Ts'o <tytso@....edu>,
        Andi Kleen <ak@...ux.intel.com>,
        David Chinner <david@...morbit.com>,
        Nick Piggin <npiggin@...il.com>,
        Rik van Riel <riel@...hat.com>,
        Hugh Dickins <hughd@...gle.com>,
        Jonathan Corbet <corbet@....net>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v3 14/15] dcache: Implement partial shrink via Slab
 Movable Objects

On Thu, Apr 11, 2019 at 05:47:46AM +0100, Al Viro wrote:
> On Thu, Apr 11, 2019 at 12:48:21PM +1000, Tobin C. Harding wrote:
> 
> > Oh, so putting entries on a shrink list is enough to pin them?
> 
> Not exactly pin, but __dentry_kill() has this:
>         if (dentry->d_flags & DCACHE_SHRINK_LIST) {
>                 dentry->d_flags |= DCACHE_MAY_FREE;
>                 can_free = false;
>         }
>         spin_unlock(&dentry->d_lock);
>         if (likely(can_free))
>                 dentry_free(dentry);
> and shrink_dentry_list() - this:
>                         if (dentry->d_lockref.count < 0)
>                                 can_free = dentry->d_flags & DCACHE_MAY_FREE;
>                         spin_unlock(&dentry->d_lock);
>                         if (can_free)
>                                 dentry_free(dentry);
> 			continue;
> so if dentry destruction comes before we get around to
> shrink_dentry_list(), it'll stop short of dentry_free() and mark it for
> shrink_dentry_list() to do just dentry_free(); if it overlaps with
> shrink_dentry_list(), but doesn't progress all the way to freeing,
> we will
> 	* have dentry removed from shrink list
> 	* notice the negative ->d_count (i.e. that it has already reached
> __dentry_kill())
> 	* see that __dentry_kill() is not through with tearing the sucker
> apart (no DCACHE_MAY_FREE set)
> ... and just leave it alone, letting __dentry_kill() do the rest of its
> thing - it's already off the shrink list, so __dentry_kill() will do
> everything, including dentry_free().
> 
> The reason for that dance is the locking - shrink list belongs to whoever
> has set it up and nobody else is modifying it.  So __dentry_kill() doesn't
> even try to remove the victim from there; it does all the teardown
> (detaches from inode, unhashes, etc.) and leaves removal from the shrink
> list and actual freeing to the owner of shrink list.  That way we don't
> have to protect all shrink lists a single lock (contention on it would
> be painful) and we don't have to play with per-shrink-list locks and
> all the attendant headaches (those lists usually live on stack frame
> of some function, so just having the lock next to the list_head would
> do us no good, etc.).  Much easier to have the shrink_dentry_list()
> do all the manipulations...
> 
> The bottom line is, once it's on a shrink list, it'll stay there
> until shrink_dentry_list().  It may get extra references after
> being inserted there (e.g. be found by hash lookup), it may drop
> those, whatever - it won't get freed until we run shrink_dentry_list().
> If it ends up with extra references, no problem - shrink_dentry_list()
> will just kick it off the shrink list and leave it alone.
> 
> Note, BTW, that umount coming between isolate and drop is not a problem;
> it call shrink_dcache_parent() on the root.  And if shrink_dcache_parent()
> finds something on (another) shrink list, it won't put it to the shrink
> list of its own, but it will make note of that and repeat the scan in
> such case.  So if we find something with zero refcount and not on
> shrink list, we can move it to our shrink list and be sure that its
> superblock won't go away under us...

Man, that was good to read.  Thanks for taking the time to write this.


	Tobin