Message-ID: <249b36ec1afa43a699b281a4192c6afd@EX132MBOX1A.de2.local>
Date:	Mon, 12 Jan 2015 11:48:28 +0000
From:	"Stoidner, Christoph" <c.stoidner@...ero.de>
To:	"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: AW: Question concerning RCU

Hi Paul,

> You got stack traces with the stall warnings, correct?  If so, please look
> at them and at Documentation/RCU/stallwarn.txt and see if the kernel is
> looping somewhere inappropriate.

Yes and no. I have a stack trace, but it was not generated by a stall warning. More
precisely: I never see any stall warning, because the system freezes at the moment
it is about to print one. The stack trace was instead captured with gdb over JTAG
hardware debugging, after the freeze had occurred.

So I am not sure whether there really is a CPU stall or whether the stall detection
is a false positive. Either way, the attempt to print the stall warning is what
freezes the system, so the warning itself is never seen.

> I am not familiar with the low-level ARM kernel code, but the stack below
> leads me to suspect that your kernel is interrupting itself to death or
> is improperly handling interrupts.

The stack trace must be read from bottom to top. The repeated occurrence of
"__irq_svc () at arch/arm/kernel/entry-armv.S:202" at the bottom of the stack trace
is caused by the stack frame of the interrupt context. This is completely legal and
also appears in normal situations. The problem is instead at the top of the stack
trace, in the function rcu_print_task_stall(). The loop in rcutree_plugin.h at line 528
never terminates:

static int rcu_print_task_stall(struct rcu_node *rnp)
{
	...
	...

	list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
		printk(KERN_CONT " P%d", t->pid);
		ndetected++;
	}

	...    
	...
}        

That means list_for_each_entry_continue() never terminates, because
rcu_node_entry.next seems to point to itself and so never leads back to
rnp->blkd_tasks. I have no idea how this can happen.
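
To illustrate the failure mode, here is a minimal userspace sketch. It is not
kernel code: struct list_node, list_add_tail_node(), bounded_walk() and
walk_demo() are hypothetical names made up for this example, and the step limit
is an assumption added only so that the demonstration terminates. It shows that
a walk which starts after the current entry and stops only on reaching the list
head can never finish once some entry's next pointer refers to itself. It says
nothing about how the pointer got corrupted in the first place.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

/* An empty circular list: the head points to itself in both directions. */
static void list_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

/* Append n just before head, i.e. at the tail of the list. */
static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Walk from the entry after start until head is reached, the same
 * termination rule a "continue"-style list iterator uses, but give up
 * after max_steps entries.
 * Returns the number of entries visited, or -1 if head was not reached.
 */
static int bounded_walk(const struct list_node *start,
			const struct list_node *head, int max_steps)
{
	const struct list_node *pos;
	int n = 0;

	assert(start != NULL && head != NULL);
	for (pos = start->next; pos != head; pos = pos->next) {
		if (++n > max_steps)
			return -1;	/* would loop forever without the limit */
	}
	return n;
}

/* Build head -> a -> b -> c -> head and walk it starting after a. */
static int walk_demo(int corrupt)
{
	struct list_node head, a, b, c;

	list_init(&head);
	list_add_tail_node(&a, &head);
	list_add_tail_node(&b, &head);
	list_add_tail_node(&c, &head);

	if (corrupt)
		b.next = &b;	/* simulate an entry whose next points to itself */

	return bounded_walk(&a, &head, 100);
}
```

With an intact list, walk_demo(0) visits the two entries after a and stops at
the head. With the simulated corruption, walk_demo(1) hits the step limit,
which corresponds to the endless loop seen under the debugger.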

One more thing: just as a test, I have now enabled CONFIG_TINY_PREEMPT_RCU.
Since then the problem has not occurred again. Do you have any idea what makes
the difference here?

Thanks and regards,
Christoph
