lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 15 Dec 2008 15:44:58 -0800
From:	Harry Edmon <harry@...os.washington.edu>
To:	Trond Myklebust <trond.myklebust@....uio.no>
CC:	Max Kellermann <max@...mpel.org>, linux-kernel@...r.kernel.org,
	linux-nfs@...r.kernel.org
Subject: Re: High load in 2.6.27, NFS / rpcauth_lookup_credcache()?

Trond Myklebust wrote:
> On Thu, 2008-10-23 at 14:36 +0200, Max Kellermann wrote:
>   
>> On 2008/10/22 11:12, Max Kellermann <max@...mpel.org> wrote:
>>     
>>> after I was able to fix http://lkml.org/lkml/2008/10/17/147, the
>>> server which was already upgraded to 2.6.27.2 still gets very high
>>> load.  It is a web server with NFS file storage (NetApp), and while
>>> the others in the cluster (kernel 2.6.25) have a load of 1-3, 2.6.27.2
>>> gets 30-50.
>>>
>>> I did an oprofile, with the following results (server just started,
>>> load "only" 5-10):
>>>
>>> 87593    56.1116  (no location information)   vmlinux
>>> vmlinux                  rpcauth_lookup_credcache
>>> 16037    10.2732  auth_generic.c:0            vmlinux
>>> vmlinux                  generic_match
>>> 6460      4.1382  (no location information)   php4
>>> php4                     (no symbols)
>>> 2478      1.5874  (no location information)   libc-2.7.so
>>> libc-2.7.so              (no symbols)
>>> [...]
>>>
>>> We havn't configured any special authentication method.  It is a NFSv3
>>> over UDP mount, but the kernel has NFSv4 and therefore KRB5 enabled.
>>>
>>> Any ideas why rpcauth_lookup_credcache() goes overboard with CPU
>>> usage?
>>>       
>> I have bisected the problem: 98a8e323 is the result ("SUNRPC: Add a
>> helper rpcauth_lookup_generic_cred()").  5c691044 is ok.
>>
>> See the attached oprofile annotation data for both commits.  I guess
>> that the function rpcauth_lookup_credcache() is waiting for a spinlock
>> too often and too long.  Trond, any idea?
>>     
>
> Can you add a '-v' to the rpc.gssd daemon startup line? I'd like to see
> how often you are creating new gss contexts.
>
>   
>> Harry: added you to Cc because your problem sounds similar.
>>     
>
> Harry's problem is should be unrelated. afaik, he is seeing a problem
> with userland RPC code, not kernel rpc code.
>
> Trond
>
>   
I am finally getting some time to look at my problem that I originally 
reported in October (SUNRPC problem with 2.6.26 and beyond), and I am 
seeing the same behavior as Max Kellermann when my machine slows as I 
described earlier.  The system in question is currently running 
2.6.27.7.  Here is what I see when it is misbehaving:

samples  %        image name               app name                 
symbol name
11380517 57.4191  sunrpc.ko                sunrpc                   
rpcauth_lookup_credcache
3263657  16.4664  sunrpc.ko                sunrpc                   
generic_match
1081287   5.4555  vmlinux                  vmlinux                  
copy_user_generic_string
499407    2.5197  vmlinux                  vmlinux                  
__posix_lock_file
[...]

And here is what I see when I stop the programs that are chewing up all 
the system time, and then starting them up again:

samples  %        image name               app name                 
symbol name
6372650  21.7978  vmlinux                  vmlinux                  
copy_user_generic_string
5401386  18.4755  sunrpc.ko                sunrpc                   
rpcauth_lookup_credcache
3018753  10.3257  vmlinux                  vmlinux                  
__posix_lock_file
1050095   3.5919  sunrpc.ko                sunrpc                   
generic_match


and I am not using Kerberos with NFSv4 (i.e. no rpc.gssd).  Did you ever 
find a solution for this problem with rpcauth_lookup_credcache?

-- 

 Dr. Harry Edmon			E-MAIL: harry@...os.washington.edu
 206-543-0547				harry@...hington.edu
 Dept of Atmospheric Sciences		FAX:	206-543-0308
 University of Washington, Box 351640, Seattle, WA 98195-1640

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ