linux-kernel - Re: [PATCH 13/17] fs: Implement lazy LRU updates for inodes.

Message-ID: <20101016075457.GF19147@amd>
Date:	Sat, 16 Oct 2010 18:54:57 +1100
From:	Nick Piggin <npiggin@...nel.dk>
To:	Christoph Hellwig <hch@...radead.org>
Cc:	Dave Chinner <david@...morbit.com>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 13/17] fs: Implement lazy LRU updates for inodes.

On Wed, Sep 29, 2010 at 10:05:17PM -0400, Christoph Hellwig wrote:
> > @@ -1058,8 +1051,6 @@ static void wait_sb_inodes(struct super_block *sb)
> >  	 */
> >  	WARN_ON(!rwsem_is_locked(&sb->s_umount));
> >  
> > -	spin_lock(&sb_inode_list_lock);
> > -
> >  	/*
> >  	 * Data integrity sync. Must wait for all pages under writeback,
> >  	 * because there may have been pages dirtied before our sync
> > @@ -1067,6 +1058,7 @@ static void wait_sb_inodes(struct super_block *sb)
> >  	 * In which case, the inode may not be on the dirty list, but
> >  	 * we still have to wait for that writeout.
> >  	 */
> > +	spin_lock(&sb_inode_list_lock);
> 
> I think this should be folded back into the patch introducing
> sb_inode_list_lock.
> 
> > @@ -1083,10 +1075,10 @@ static void wait_sb_inodes(struct super_block *sb)
> >  		spin_unlock(&sb_inode_list_lock);
> >  		/*
> >  		 * We hold a reference to 'inode' so it couldn't have been
> > -		 * removed from s_inodes list while we dropped the
> > -		 * sb_inode_list_lock.  We cannot iput the inode now as we can
> > -		 * be holding the last reference and we cannot iput it under
> > -		 * spinlock. So we keep the reference and iput it later.
> > +		 * removed from s_inodes list while we dropped the i_lock.  We
> > +		 * cannot iput the inode now as we can be holding the last
> > +		 * reference and we cannot iput it under spinlock. So we keep
> > +		 * the reference and iput it later.
> 
> This also looks like a hunk that got in by accident and should be merged
> into an earlier patch.

These two actually came from a patch to do RCU locking (which Dave has
changed a bit, but the stray hunks were originally my fault), so I'll fix
those, thanks.

 
> > @@ -431,11 +412,12 @@ static int invalidate_list(struct list_head *head, struct list_head *dispose)
> >  		invalidate_inode_buffers(inode);
> >  		if (!inode->i_count) {
> >  			spin_lock(&wb_inode_list_lock);
> > -			list_move(&inode->i_list, dispose);
> > +			list_del(&inode->i_list);
> >  			spin_unlock(&wb_inode_list_lock);
> >  			WARN_ON(inode->i_state & I_NEW);
> >  			inode->i_state |= I_FREEING;
> >  			spin_unlock(&inode->i_lock);
> > +			list_add(&inode->i_list, dispose);
> 
> Moving the list_add out of the lock looks fine, but I can't really
> see how it's related to the rest of the patch.

It just helps show that dispose isn't protected by
wb_inode_list_lock, I guess.
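
To illustrate the point about the dispose list, here is a minimal user-space
sketch, not the kernel code. The names toy_inode, toy_lock, lock_acquire,
lock_release and move_to_dispose are hypothetical, and the "lock" only records
whether it is held. The shared list is modified under the lock, while the
caller's private dispose list is modified after the lock has been dropped.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

/* Insert n right after head (i.e. at the front of the list). */
static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n from whatever list it is on and leave it self-linked. */
static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

/* Hypothetical stand-in for a spinlock: it only tracks the held state. */
struct toy_lock { int held; };
static void lock_acquire(struct toy_lock *l) { l->held = 1; }
static void lock_release(struct toy_lock *l) { l->held = 0; }

struct toy_inode {
	struct list_head i_list;
	int i_count;
};

/* Plays the role of wb_inode_list_lock in this sketch. */
static struct toy_lock shared_list_lock;

/*
 * Take an unreferenced inode off the shared list and put it on the
 * caller's private dispose list. Returns 1 if the inode was moved.
 */
static int move_to_dispose(struct toy_inode *inode, struct list_head *dispose)
{
	if (inode->i_count)
		return 0;

	lock_acquire(&shared_list_lock);
	list_del(&inode->i_list);		/* shared list: needs the lock */
	lock_release(&shared_list_lock);

	list_add(&inode->i_list, dispose);	/* private list: no lock held */
	return 1;
}
```

Because dispose is only reachable from the caller in this sketch, splitting
list_move into list_del plus a later list_add makes the locking scope visible
without changing the result.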

> 
> > +		if (inode->i_count || (inode->i_state & ~I_REFERENCED)) {
> > +			list_del_init(&inode->i_list);
> > +			spin_unlock(&inode->i_lock);
> > +			atomic_dec(&inodes_stat.nr_unused);
> > +			continue;
> > +		}
> > +		if (inode->i_state) {
> 
> Slightly confusing but okay given the only i_state that will get us here
> is I_REFERENCED.  Do we really care about the additional cycle or two a
> dumb compiler might generate when writing
> 
> 	if (inode->i_state & I_REFERENCED)

Sure, why not.

> 
> ?
> 
> >  		if (inode_has_buffers(inode) || inode->i_data.nrpages) {
> > +			list_move(&inode->i_list, &inode_unused);
> 
> Why are we now moving the inode to the front of the list? 

It was always being moved to the front of the list, but with lazy LRU,
iput_final doesn't move it for us, hence the list_move here.

Without this, it busy-spins and locks badly under heavy reclaim load
when buffers or pagecache can't be invalidated.

Seeing as it wasn't obvious to you, I'll add a comment here.

I was thinking we should probably have a shortcut to go back to the
tail of the LRU in case of invalidation success, but that's out of the
scope of this patch and I haven't got around to testing such a change
yet.
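
For readers following the thread, the scan behaviour discussed above can be
sketched in user space roughly as below. This is a simplified model under
stated assumptions, not the code from the patch: toy_inode, prune_lru,
pages_pinned and the flag values are hypothetical, there is no locking, and a
successful invalidation leads straight to reclaim. It shows the lazy removal
of in-use inodes, the I_REFERENCED second chance, and the rotation to the
front of the list before invalidation is attempted.

```c
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

/* Insert n right after head (the front of the list). */
static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = n;
}

static void list_move(struct list_head *n, struct list_head *head)
{
	list_del_init(n);
	list_add(n, head);
}

#define I_REFERENCED 0x1u	/* hypothetical bit values for this sketch */
#define I_DIRTY      0x2u

struct toy_inode {
	struct list_head i_list;
	unsigned int i_count;
	unsigned int i_state;
	unsigned int nrpages;	/* cached pages still attached */
	int pages_pinned;	/* nonzero: invalidation would fail */
};

#define inode_of(p) \
	((struct toy_inode *)((char *)(p) - offsetof(struct toy_inode, i_list)))

/* Scan up to nr_to_scan inodes from the LRU tail; return how many were freed. */
static int prune_lru(struct list_head *lru, int nr_to_scan, int *nr_unused)
{
	int reclaimed = 0;

	while (nr_to_scan-- > 0 && lru->prev != lru) {
		struct toy_inode *inode = inode_of(lru->prev);

		/* In use, or in any state besides "referenced": drop it lazily. */
		if (inode->i_count || (inode->i_state & ~I_REFERENCED)) {
			list_del_init(&inode->i_list);
			(*nr_unused)--;
			continue;
		}
		/* Second chance: clear the flag and rotate to the front. */
		if (inode->i_state & I_REFERENCED) {
			inode->i_state &= ~I_REFERENCED;
			list_move(&inode->i_list, lru);
			continue;
		}
		if (inode->nrpages) {
			/* Rotate first so a failed invalidation is not rescanned at once. */
			list_move(&inode->i_list, lru);
			if (inode->pages_pinned)
				continue;
			inode->nrpages = 0;
		}
		list_del_init(&inode->i_list);
		(*nr_unused)--;
		reclaimed++;
	}
	return reclaimed;
}
```

Without the list_move before the invalidation attempt, an inode whose pages
cannot be dropped would stay at the tail and be picked again on every
iteration, which is the busy-spin described above.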
