Message-ID: <1335200249.5607.41.camel@rocinante.mythic-beasts.com>
Date: Mon, 23 Apr 2012 17:57:28 +0100
From: Toby Goodwin <toby@...hic-beasts.com>
To: LKML <linux-kernel@...r.kernel.org>
Subject: Fixing NFS over OpenVPN
NFS over OpenVPN is unreliable. On systems with modest RAM, it quite
quickly locks up in a nasty way, with unkillable processes, and the
system unable even to shut down or reboot. Ouch.
I've been looking at this for some time, and have finally made some
progress. The problem is that when the openvpn process needs more
kernel memory, the kernel will sometimes ask nfs to release some pages.
To do that, nfs calls commit_inode(), which wants to write to the
openvpn tunnel. But openvpn is still blocked waiting for that memory,
so the write can never complete, and we have deadlock.
I believe I know how to fix this, but would appreciate some guidance.
Part of the solution is to specify the "--mlock" flag to openvpn -- this
exists so that secrets are never written to swap, but as a side effect
it prevents openvpn from ever page faulting.
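For reference, here is a minimal sketch of what "--mlock" amounts to,
assuming it is implemented with mlockall(). The helper name
lock_all_memory is mine, purely for illustration. Note that this only
pins the process's own pages; it does nothing for memory the kernel
allocates on the process's behalf, which is why the sockets still need
attention.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not taken from the openvpn source): lock all
 * current and future pages of the calling process into RAM, so that
 * the process never has to fault a page back in.
 *
 * Returns 0 on success, or the errno value on failure (typically
 * EPERM or ENOMEM when RLIMIT_MEMLOCK is too low for an
 * unprivileged process).
 */
static int lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        int err = errno;
        fprintf(stderr, "mlockall failed: %s\n", strerror(err));
        return err;
    }
    return 0;
}
```

A daemon would call this once, early in startup and before dropping
privileges, and decide for itself whether a failure is fatal.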
The other kernel allocation culprits are the sockets that openvpn holds.
To prevent them triggering the deadlock, we need to set sk_allocation =
GFP_NOIO. For the socket created by the tun driver it's a simple change
to drivers/net/tun.c. (This is the socket that applications use when
they want to communicate with the tunnel.)
The other socket created by openvpn is trickier, as it's just a normal
userland AF_INET socket. (This is the socket through which OpenVPN sends
its encrypted packets to the remote end of the tunnel.) I haven't
discovered any way for userland to request a particular allocation
policy for a socket; please let me know if I've missed something.
So I've added a setsockopt() call that can do it:
--- net/core/sock.c.orig	2012-03-18 23:15:34.000000000 +0000
+++ net/core/sock.c	2012-04-22 22:55:34.320023377 +0100
@@ -793,6 +793,10 @@
 		sock_valbool_flag(sk, SOCK_WIFI_STATUS, valbool);
 		break;
 
+	case SO_RECURSIVE:
+		sk->sk_allocation = GFP_NOIO;
+		break;
+
 	default:
 		ret = -ENOPROTOOPT;
 		break;
(I picked the SO_RECURSIVE name late at night -- suggestions for a
better name are very welcome!)
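From userland, the call would look something like the sketch below.
The option number is a placeholder I made up for illustration (the
patch above does not show a definition for it), and mark_socket_noio is
a hypothetical helper name. The patch ignores the option's value, but
as far as I can see sock_setsockopt() rejects an optlen shorter than an
int before it reaches the switch, so an int is passed anyway.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * ASSUMPTION: placeholder value only. A real patch would have to
 * allocate an unused SO_* number in the kernel's socket.h.
 */
#ifndef SO_RECURSIVE
#define SO_RECURSIVE 0x7fff
#endif

/*
 * Hypothetical helper: ask the kernel to use GFP_NOIO for allocations
 * made on behalf of this socket.
 *
 * Returns 0 on success, or the errno value on failure. On a kernel
 * without the patch this is expected to fail with ENOPROTOOPT, via
 * the default case shown in the diff.
 */
static int mark_socket_noio(int fd)
{
    int one = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_RECURSIVE, &one, sizeof(one)) != 0)
        return errno;
    return 0;
}
```

openvpn would call this once on its AF_INET socket, right after
socket() returns, and could treat ENOPROTOOPT as "kernel not patched"
and carry on.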
I've dealt with these three troublesome memory allocations as I've
discovered them, and initial testing shows that I can now keep nfs over
openvpn going, even on a system with limited memory that is also
thrashing with a large compile.
But this seems like a fragile process: I don't know if there are other
trouble spots lurking. Really I'd like openvpn to say to the kernel
"whenever you allocate memory for me, use GFP_NOIO", but I haven't
discovered a way to do this: none of the process flags quite seem to fit
the bill. (I did consider PF_KSWAPD, but thought better of it. :-)
Is there a better way to solve this?
Thanks,
Toby.