Message-ID: <20100525125833.GB9731@ics.muni.cz>
Date:	Tue, 25 May 2010 14:58:33 +0200
From:	Lukas Hejtmanek <xhejtman@....muni.cz>
To:	Trond Myklebust <trond.myklebust@....uio.no>
Cc:	Pavel Machek <pavel@....cz>, linux-nfs@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	salvet@....muni.cz
Subject: Re: Deadlock in NFSv4 in all kernels

Hi,

On Tue, May 25, 2010 at 08:28:40AM -0400, Trond Myklebust wrote:
> > Seems like pretty fundamental problem in nfs :-(. Limiting writeback
> > caches for nfs, so that system has enough memory to perform rpc calls
> > with the rest might do the trick, but...
> > 
> 
> It's the same problem that you have for any file or storage system that
> has initiators in userland. On the storage side, iSCSI in particular has
> the same problem. On the filesystem side, CIFS, AFS, coda, .... do too.
> The clustered filesystems can deadlock if the node that is running the
> DLM runs out of memory...
> 
> A few years ago there were several people proposing various solutions
> for allowing these daemons to run in a protected memory environment to
> avoid deadlocks, but those efforts have since petered out. Perhaps it is
> time to review the problem?

I saw some patches targeting 2.6.35 that should prevent some deadlocks, but
they do not seem to be enough in some cases. The rpc.* daemons should certainly
be mlocked, but there is a problem with libkrb, which reads files using
fread(). fread() uses an anonymous mmap, and under mlockall(MCL_FUTURE) that
anonymous mapping is populated immediately, which deadlocks.
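
A minimal sketch of what "mlocked" means for such a daemon, assuming Linux and
a process that has CAP_IPC_LOCK or a large enough RLIMIT_MEMLOCK. The helper
name lock_daemon_memory() is hypothetical and is not taken from nfs-utils or
libkrb:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: pin the calling process's memory in RAM.
 * MCL_CURRENT locks the pages that are mapped right now; MCL_FUTURE
 * also locks every mapping created later, as soon as it is created.
 * Returns 0 on success, -1 on failure with errno left as set by mlockall().
 */
static int lock_daemon_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int saved = errno;  /* fprintf() may overwrite errno */

        fprintf(stderr, "mlockall: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

Because of MCL_FUTURE, any later anonymous mapping has to be backed by real
pages at mapping time, which is the behaviour described above for the buffer
behind fread().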

IBM GPFS also uses a userspace daemon, but it seems that the daemon is mlocked,
does not open any files, and does not create new connections.

My problem was quite easily reproducible.

I started an application that eats 80% of free memory. Then I started:
for i in `seq 1 10`; do dd if=/dev/zero of=/mnt/nfs4/file$i bs=1M count=2048 & done
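
For the memory-eating step, something along these lines would do. This is only
a sketch, not the application I actually ran: eat_memory() and
percent_of_free_ram() are hypothetical helpers, and the "free" figure is
whatever Linux sysinfo(2) reports at the moment of the call:

```c
#include <stdlib.h>
#include <string.h>
#include <sys/sysinfo.h>

/*
 * Allocate `bytes` and write to every byte so that the pages are
 * really backed by memory instead of merely reserved.
 * Returns the buffer, or NULL if the allocation failed.
 */
static void *eat_memory(size_t bytes)
{
    void *p = malloc(bytes);

    if (p != NULL)
        memset(p, 0xA5, bytes);
    return p;
}

/* `percent` percent of the currently free RAM, in bytes; 0 on error. */
static unsigned long long percent_of_free_ram(unsigned int percent)
{
    struct sysinfo si;

    if (sysinfo(&si) != 0)
        return 0;
    return (unsigned long long)si.freeram * si.mem_unit / 100 * percent;
}
```

A test program would call eat_memory(percent_of_free_ram(80)) and then sleep
while the dd jobs run.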

It deadlocks within 2 minutes unless this patch is applied:
commit 3d7b08945e54a3a5358d5890240619a013cb7388
Author: Trond Myklebust <Trond.Myklebust@...app.com>
Date:   Thu Apr 22 15:35:55 2010 -0400

    SUNRPC: Fix a bug in rpcauth_prune_expired
    
    Don't want to evict a credential if cred->cr_expire == jiffies, since that
    means that it was just placed on the cred_unused list. We therefore need to
    use time_in_range() rather than time_in_range_open().
    
    Signed-off-by: Trond Myklebust <Trond.Myklebust@...app.com>

diff --git a/net/sunrpc/auth.c b/net/sunrpc/auth.c
index f394fc1..95afe79 100644
--- a/net/sunrpc/auth.c
+++ b/net/sunrpc/auth.c
@@ -237,7 +237,7 @@ rpcauth_prune_expired(struct list_head *free, int nr_to_scan)
        list_for_each_entry_safe(cred, next, &cred_unused, cr_lru) {
 
                /* Enforce a 60 second garbage collection moratorium */
-               if (time_in_range_open(cred->cr_expire, expired, jiffies) &&
+               if (time_in_range(cred->cr_expire, expired, jiffies) &&
                    test_bit(RPCAUTH_CRED_HASHED, &cred->cr_flags) != 0)
                        continue;


but I believe this only hides the real problem.
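
To make the one-line change concrete, here is a userspace sketch of the two
range checks. The helpers restate the kernel's time_in_range() and
time_in_range_open() macros for illustration and are not the kernel headers
themselves. It assumes that `expired` is the current jiffies value minus the
moratorium, and it leaves out the RPCAUTH_CRED_HASHED test:

```c
#include <stdbool.h>

/* Wraparound-safe comparisons in the style of the jiffies helpers. */
static bool after_eq(unsigned long a, unsigned long b)  { return (long)(a - b) >= 0; }
static bool before(unsigned long a, unsigned long b)    { return (long)(a - b) < 0; }
static bool before_eq(unsigned long a, unsigned long b) { return (long)(b - a) >= 0; }

/* b <= a <= c: closed interval, like time_in_range(). */
static bool in_range(unsigned long a, unsigned long b, unsigned long c)
{
    return after_eq(a, b) && before_eq(a, c);
}

/* b <= a < c: half-open interval, like time_in_range_open(). */
static bool in_range_open(unsigned long a, unsigned long b, unsigned long c)
{
    return after_eq(a, b) && before(a, c);
}

/* True if the credential is skipped (kept) by the moratorium check. */
static bool kept_before_patch(unsigned long cr_expire, unsigned long now,
                              unsigned long moratorium)
{
    return in_range_open(cr_expire, now - moratorium, now);
}

static bool kept_after_patch(unsigned long cr_expire, unsigned long now,
                             unsigned long moratorium)
{
    return in_range(cr_expire, now - moratorium, now);
}
```

The only input on which the two versions differ is cr_expire == now: the
half-open check does not keep such a credential, the closed check does.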

-- 
Lukáš Hejtmánek