Date:	Tue, 28 Aug 2012 10:54:58 +0200
From:	Heiko Nardmann <heiko.nardmann@...chnical.de>
To:	linux-kernel@...r.kernel.org
Subject: Q: dlm_recoverd takes 100%

Hi all,

If this is the wrong list, could someone point me to the right one?

In a two-node cluster I see 'dlm_recoverd' using 100% of one CPU for
around 6 minutes. Here is a small excerpt of 'top' output from that
period:

top - 10:51:01 up 3 days, 17:21,  5 users,  load average: 10.19, 5.39, 2.76
Tasks: 536 total,   3 running, 533 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.2%us,  6.6%sy,  0.0%ni, 92.1%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  12183344k total, 11827540k used,   355804k free,   160332k buffers
Swap: 14417912k total,        0k used, 14417912k free,  8299364k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
  3121 root      20   0     0    0    0 R 100.0  0.0   3:36.15 dlm_recoverd

The cluster nodes use a shared SAN (GFS2). The second node was being
rebooted while I observed this behaviour. The real problem is that my
application is unable to open a file on the SAN during these 6 minutes.
Once the second node has finished rebooting, everything is fine again
and the application succeeds in opening the file. I am not sure what
could cause these two symptoms.
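
A minimal sketch for quantifying the stall described above: it times how
long opening a file takes, repeated a few times. The function names
`time_open` and `probe`, the interval, and any path passed in are
hypothetical choices for illustration, not part of the original setup.

```python
import os
import time


def time_open(path):
    """Return the seconds taken to open `path` read-only and close it again."""
    start = time.monotonic()
    # Assumption: if the filesystem stalls, this call is where the caller waits.
    fd = os.open(path, os.O_RDONLY)
    os.close(fd)
    return time.monotonic() - start


def probe(path, interval=5.0, rounds=3):
    """Time `rounds` open attempts on `path`, sleeping `interval` seconds between them."""
    results = []
    for i in range(rounds):
        results.append(time_open(path))
        if i < rounds - 1:
            time.sleep(interval)
    return results
```

Running `probe()` against a file on the GFS2 mount while the other node
reboots would show whether each open blocks for the whole period or only
the first one does.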

Thanks in advance for any hint!


Kind regards,

     Heiko