Message-ID: <20131118091428.6360d82a@samsung-9>
Date: Mon, 18 Nov 2013 09:14:28 -0800
From: Stephen Hemminger <stephen@...workplumber.org>
To: netdev@...r.kernel.org
Subject: Fw: [Bug 65131] New: kernel panic (BUG_ON raised) in SCTP function
sctp_cmd_interpreter
Begin forwarded message:
Date: Sun, 17 Nov 2013 19:38:56 -0800
From: "bugzilla-daemon@...zilla.kernel.org" <bugzilla-daemon@...zilla.kernel.org>
To: "stephen@...workplumber.org" <stephen@...workplumber.org>
Subject: [Bug 65131] New: kernel panic (BUG_ON raised) in SCTP function sctp_cmd_interpreter
https://bugzilla.kernel.org/show_bug.cgi?id=65131
Bug ID: 65131
Summary: kernel panic (BUG_ON raised) in SCTP function
sctp_cmd_interpreter
Product: Networking
Version: 2.5
Kernel Version: 3.11.8 custom build, repeated on 3.11.2
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: blocking
Priority: P1
Component: IPV4
Assignee: shemminger@...ux-foundation.org
Reporter: yuras@....net
Regression: No
Created attachment 114991
--> https://bugzilla.kernel.org/attachment.cgi?id=114991&action=edit
Screenshot of panic
Two-node cluster configured using the latest corosync (also DRBD 8.4.4, LVM2, and
GFS2, but these are not essential).
Steps to reproduce:
1. Start corosync on both nodes.
2. Start dlm_controld (version 4.0.2) on both nodes (using the SCTP protocol,
since TCP cannot be used on multi-homed hosts). This adds the following lines to kern.log:
kernel: [ 580.428664] sctp: Hash tables configured (established 65536 bind
65536)
kernel: [ 580.441779] DLM installed
3. Start clvmd on either node. This adds the following lines to kern.log:
kernel: [ 1345.259502] dlm: Using SCTP for communications
kernel: [ 1345.260699] dlm: clvmd: joining the lockspace group...
kernel: [ 1345.262962] dlm: clvmd: dlm_recover 1
kernel: [ 1345.262968] dlm: clvmd: group event done 0 0
kernel: [ 1345.262992] dlm: clvmd: add member 1024
kernel: [ 1345.262995] dlm: clvmd: dlm_recover_members 1 nodes
kernel: [ 1345.262996] dlm: clvmd: join complete
kernel: [ 1345.262998] dlm: clvmd: generation 1 slots 1 1:1024
kernel: [ 1345.262999] dlm: clvmd: dlm_recover_directory
kernel: [ 1345.263000] dlm: clvmd: dlm_recover_directory 0 in 0 new
kernel: [ 1345.263002] dlm: clvmd: dlm_recover_directory 0 out 0 messages
kernel: [ 1345.263019] dlm: clvmd: dlm_recover 1 generation 1 done: 0 ms
4. Start clvmd on the second node. With high probability, one or both nodes
panic in a similar way. A screenshot is attached.
The stack trace can differ slightly above the EOI line, but the RIP was always the
same. I suppose the CPU opcodes shown correspond to one of the BUG_ON macros inside
sctp_cmd_interpreter, so this is a kernel bug.
This bug currently prevents me from using my cluster at all, since DLM refuses to
use TCP on multi-homed hosts.
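For reference, the reproduction above depends on dlm_controld being configured for
SCTP rather than TCP. A minimal sketch of the relevant setting (the "protocol"
option described in the dlm_controld documentation; the exact config-file path is
an assumption and may differ by distribution):

```
# /etc/dlm/dlm.conf -- select the lowcomms transport used by DLM.
# "sctp" is needed for multi-homed hosts; "tcp" only works with a
# single interface. Option name per dlm_controld(8); verify locally.
protocol=sctp
```

With this setting, kern.log should report "dlm: Using SCTP for communications"
when the lockspace is joined, as seen in step 3 above.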
--
You are receiving this mail because:
You are the assignee for the bug.
--