linux-kernel - Q: dlm_recoverd takes 100%


Message-ID: <503C8762.5070607@itechnical.de>
Date:	Tue, 28 Aug 2012 10:54:58 +0200
From:	Heiko Nardmann <heiko.nardmann@...chnical.de>
To:	linux-kernel@...r.kernel.org
Subject: Q: dlm_recoverd takes 100%

Hi all,

If this is the wrong list for this question, could someone point me to
the right mailing list?

In a two-node cluster I see 'dlm_recoverd' using 100% of one CPU for
around 6 minutes. Here is a small excerpt from the 'top' output during
that period:

top - 10:51:01 up 3 days, 17:21,  5 users,  load average: 10.19, 5.39, 2.76
Tasks: 536 total,   3 running, 533 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.2%us,  6.6%sy,  0.0%ni, 92.1%id,  0.1%wa,  0.0%hi, 0.0%si,  0.0%st
Mem:  12183344k total, 11827540k used,   355804k free,   160332k buffers
Swap: 14417912k total,        0k used, 14417912k free,  8299364k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
  3121 root      20   0     0    0    0 R 100.0  0.0   3:36.15 dlm_recoverd

The cluster nodes use a shared SAN (GFS2). The second node was being
rebooted while I observed this behaviour. The real problem is that my
application is unable to open a file on the SAN during these 6 minutes.
Once the second node has finished rebooting, everything is fine again
and the application succeeds in opening the file. I am not sure what
causes these two symptoms (the busy 'dlm_recoverd' and the blocked
open).

Thanks in advance for any hint!

Kind regards,

     Heiko