Message-ID: <4DA8B3FB.5020401@linux.intel.com>
Date:	Fri, 15 Apr 2011 14:09:15 -0700
From:	Andi Kleen <ak@...ux.intel.com>
To:	Tim Chen <tim.c.chen@...ux.intel.com>
CC:	Alexander Viro <viro@...iv.linux.org.uk>,
	Nick Piggin <npiggin@...nel.dk>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, shaohua.li@...el.com,
	alex.shi@...el.com, torvalds@...ux-foundation.org,
	akpm@...ux-foundation.org
Subject: Re: [PATCH] vfs: Fix RCU path walk failiures due to uninitialized
 nameidata seq number for root directory

On 4/15/2011 11:39 AM, Tim Chen wrote:
> During RCU walk in path_lookupat and path_openat, the rcu lookup
> frequently failed because when root directory was looked up, seq number
> was not properly set in nameidata.  We dropped out of RCU walk in
> nameidata_drop_rcu due to mismatch in directory entry's seq number.  We
> reverted to slow path walk that needs to take references.

Thanks Tim. Adding Andrew, Linus too. IMHO this fix is quite important to
actually make the fabled RCU dcache work. Without it, the RCU walk is just
slower, because it will fall back to the slow path nearly always.

And it's a correctness fix, because with the bogus sequence number you
could fail to detect a race on root's dentry, leading to very subtle
malfunction.

Could it be merged ASAP please?
It should also be a stable candidate for .38 (whoever merges it, please
add a Cc: stable@...nel.org # .38)

Reviewed-by: Andi Kleen <ak@...ux.intel.com>

-Andi

> With the following patch, I saw a 50% increase in an exim mail server
> benchmark throughput on a 4-socket Nehalem-EX system.
>
> Thanks.
>
> Tim
>
> Signed-off-by: Tim Chen <tim.c.chen@...ux.intel.com>
> diff --git a/fs/namei.c b/fs/namei.c
> index 3cb616d..e4b27a6 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -697,6 +697,7 @@ static __always_inline void set_root_rcu(struct nameidata *nd)
>   		do {
>   			seq = read_seqcount_begin(&fs->seq);
>   			nd->root = fs->root;
> +			nd->seq = __read_seqcount_begin(&nd->root.dentry->d_seq);
>   		} while (read_seqcount_retry(&fs->seq, seq));
>   	}
>   }
>
>
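
To illustrate both points above (the spurious fallback and the missed race), here is a rough userspace sketch of the sequence-count check. It is only a single-threaded model under stated assumptions: every name in it (seqcount_model, dentry_model, walk_model, set_root_model, walk_root) is hypothetical and not a kernel type or function, and it leaves out the memory barriers and the odd-count handling that a real seqcount needs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT the kernel's seqcount_t, dentry or nameidata. */
struct seqcount_model { unsigned sequence; };
struct dentry_model   { struct seqcount_model d_seq; };
struct walk_model     { struct dentry_model *root; unsigned seq; };

/* Sample the counter (single-threaded model: no barriers, no odd-value spin). */
static unsigned seq_begin(const struct seqcount_model *s)
{
	return s->sequence;
}

/* True if the counter moved since 'start', i.e. the reader must retry. */
static bool seq_retry(const struct seqcount_model *s, unsigned start)
{
	return s->sequence != start;
}

/* One complete writer section: the counter advances by two. */
static void seq_write(struct seqcount_model *s)
{
	s->sequence += 2;
}

/* Models the root setup; with_fix adds the sampling step from the patch. */
static void set_root_model(struct walk_model *nd, struct dentry_model *root,
			   bool with_fix)
{
	nd->root = root;
	if (with_fix)
		nd->seq = seq_begin(&root->d_seq);
}

/*
 * Returns true if the lockless walk would consider the root still valid.
 * stale_seq is whatever happened to be in nd->seq before the root lookup,
 * root_seq is the root's counter at lookup time, and writes_after is the
 * number of modifications that race in before the validity check.
 */
bool walk_root(unsigned stale_seq, unsigned root_seq, unsigned writes_after,
	       bool with_fix)
{
	struct dentry_model root = { .d_seq = { .sequence = root_seq } };
	struct walk_model nd = { .root = NULL, .seq = stale_seq };

	set_root_model(&nd, &root, with_fix);
	while (writes_after--)
		seq_write(&root.d_seq);

	return !seq_retry(&root.d_seq, nd.seq);
}
```

In this model, walk_root(0, 4, 0, false) reports a mismatch although nothing changed, which corresponds to the needless drop out of RCU walk. walk_root(6, 4, 1, false) reports the root as valid although a write raced in, because the stale value happens to match the new count. With with_fix set to true, both cases give the expected answer.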
