lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Thu, 9 Aug 2012 08:18:32 -0400
From:	Jeff Layton <jlayton@...hat.com>
To:	Namjae Jeon <linkinjeon@...il.com>
Cc:	viro@...iv.linux.org.uk, linux-fsdevel@...r.kernel.org,
	linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org,
	michael.brantley@...haw.com, hch@...radead.org, miklos@...redi.hu,
	pstaubach@...grid.com
Subject: Re: [PATCH v5 00/19] vfs: add the ability to retry on ESTALE to
 several syscalls

On Thu, 9 Aug 2012 20:57:14 +0900
Namjae Jeon <linkinjeon@...il.com> wrote:

> Hi Jeff.
> 
> I still found ESTALE error although patching these patch-set.
> Is test method correct that I try to run estale_test on each nfs
> server and client at the same time ?
> 
> ./estale_test
> chmod: Stale NFS[  281.720000] ##### send signal from USER, SIG : 2,
> estale_test(107)->estale_test(102) sys_kill
> [  281.728000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(103) sys_kill
> [  281.736000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(104) sys_kill
> [  281.744000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(105) sys_kill
> [  281.752000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(106) sys_kill
> [  281.760000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(107) sys_kill
> [  281.768000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(108) sys_kill
> [  281.780000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(109) sys_kill
> [  281.788000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(110) sys_kill
> [  281.796000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(111) sys_kill
> [  281.804000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(112) sys_kill
> [  281.812000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(113) sys_kill
> [  281.820000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(114) sys_kill
> [  281.828000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(115) sys_kill
> [  281.840000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(116) sys_kill
> [  281.848000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(117) sys_kill
> [  281.856000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(118) sys_kill
> [  281.864000] ##### send signal from USER, SIG : 15,
> estale_test(102)->estale_test(119) sys_kill
>  file handle
> VDLinux#> chdir: Stale NFS[  282.664000] ##### send signal from USER,
> SIG : 2, estale_test(120)->???(102) sys_kill
>  file handle
> 
> Thanks.
> 

I guess you didn't read my response earlier? I'll re-post it here...

> It's a bit labor intensive, I'm afraid...
>
> Attached is a cleaned-up copy of the test program that Peter wrote to
> test his original patchset. The basic idea is to run this on both the
> client and server at the same time so they race against each other. He
> was able to run it overnight when testing with his patchset.
>
> With this patchset, that doesn't work since we're only retrying the
> lookup and call once. So, what I've been doing is modifying the program
> so that it just runs one test at a time, and sniffing traffic to see
> whether the lookups and calls are retried after an ESTALE return from
> the server. 


So, ESTALE errors are still expected when running that test. This
patchset only fixes a very specific set of circumstances where an entry
goes stale once between the lookup and the actual operation(s).
Anything outside of that, and it won't help.

That test is very aggressive, and can cause it to race multiple times.
You actually have to sniff traffic and look to see if the lookup and
call were reattempted after the ESTALE error.

-- 
Jeff Layton <jlayton@...hat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ