lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 1 Aug 2015 07:58:45 +0200
From:	Dominique Martinet <asmadeus@...ewreck.org>
To:	Hugh Dickins <hughd@...gle.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	Dominique Martinet <dominique.martinet@....fr>,
	Al Viro <viro@...iv.linux.org.uk>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	David Howells <dhowells@...hat.com>
Subject: Re: v4.2-rc dcache regression, probably 75a6f82a0d10

Hugh Dickins wrote on Fri, Jul 31, 2015:
> On Fri, 31 Jul 2015, Linus Torvalds wrote:
>> I'd be more suspicious about other effects. For example, iot's not at
>> all obvious that the commit in question just changes the order of the
>> flags/inode field accesses, there are potentialy bigger changes there.
>> For example, this part (in __d_obtain_alias):
>> 
>> -       tmp->d_inode = inode;
>> -       tmp->d_flags |= add_flags;
>> +       __d_set_inode_and_type(tmp, inode, add_flags);
>> 
>> looks a bit off, because it *used* to just add those flags, but now,
>> through __d_set_inode_and_type, it does
>> 
>> +       dentry->d_inode = inode;
>> +       smp_wmb();
>> +       flags = READ_ONCE(dentry->d_flags);
>> +       flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
>> +       flags |= type_flags;
>> +       WRITE_ONCE(dentry->d_flags, flags);
>> 
>> so it clears DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU.
>> 
>> Is that correct? Maybe, I haven't checked. And maybe it's a big bad
>> bug. Regardless, it sure as hell isn't just changing the order of the
>> access to those fields. That "DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU"
>> clearing came from __d_instantiate(), but now it hits __d_obtain_alias
>> too.

Yeah, had spotted this and tried to fix just this bit, but it didn't
seem to change much for my problem.
Not saying it isn't a bug, but I have no idea what __d_obtain_alias
does and nobody seemed to care about this bit in my previous thread.

> Yes, the one which grabbed my attention is:
> 
> @@ -311,7 +346,7 @@ static void dentry_iput(struct dentry * dentry)
>  {
>  	struct inode *inode = dentry->d_inode;
>  	if (inode) {
> -		dentry->d_inode = NULL;
> +		__d_clear_type_and_inode(dentry);
>  		hlist_del_init(&dentry->d_u.d_alias);
>  		spin_unlock(&dentry->d_lock);
>  		spin_unlock(&inode->i_lock);

Oh, missed that one... Running tests with just that over the weekend,
it's a good candidate and the first 10 minutes of tests sound quite
positive!

I think it's wrong to call it a "fix" even if it stops the bug from
reproducing though, the way I understand the intent of the patch, they
want that checking d_flags be enough to take decisions without having to
check d_inode as well - so now things rely on that, it's still going to
be wrong on HEAD... I think?
(I'm running tests at this commit, so I don't have the patches that rely
on that yet)

As I understand things, the fact that lookup managed to get a
being-removed entry from rcu/wherever isn't changed, it's just that it
won't fail as fast and maybe something later will notice the lack of
inode and fallback graciously instead of that ENOTDIR I've been
tracking -- so that commit makes it possible to hit the bug, but there's
another issue about taking the dentry in the first place?


I really need to spend some time to understand vfs/dcache better as a
whole at some point, I've been looking at a small part of it without
context for too long...


Cheers,
-- 
Dominique
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ