lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1351003871.8467.18.camel@gandalf.local.home>
Date:	Tue, 23 Oct 2012 10:51:11 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Fengguang Wu <fengguang.wu@...el.com>
Cc:	Steven Rostedt <srostedt@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [tracing] WARNING: at kernel/lockdep.c:3545 check_flags()

On Thu, 2012-10-18 at 19:41 +0800, Fengguang Wu wrote:
> Hi Steven,

Hi Wu,

I was out of town and I don't read my redhat email address while
traveling.

> 
> This warning occur from time to time in my tests:
> 
> [  593.080133] Testing tracer function: 
> [  593.149895] ------------[ cut here ]------------
> [  593.149895] WARNING: at /c/kernel-tests/src/stable/kernel/lockdep.c:3545 check_flags+0x1c8/0x1d0()

This is:

        if (!hardirq_count()) {
                if (softirq_count()) {
                        /* like the above, but with softirqs */
                        DEBUG_LOCKS_WARN_ON(current->softirqs_enabled);
                } else {

Which means that some how softirqs are disabled but lockdep thinks they
are enabled.

> [  593.149895] Modules linked in:
> [  593.149895] Pid: 1, comm: swapper/0 Not tainted 3.7.0-rc1-12129-gee2dd3c #3
> [  593.149895] Call Trace:
> [  593.149895]  <IRQ>  [<ffffffff810734cf>] warn_slowpath_common+0x8f/0xc0
> [  593.149895]  [<ffffffff8107351a>] warn_slowpath_null+0x1a/0x20
> [  593.149895]  [<ffffffff810fac08>] check_flags+0x1c8/0x1d0
> [  593.149895]  [<ffffffff810fbfed>] lock_is_held+0x4d/0xa0
> [  593.149895]  [<ffffffff8156d570>] ? notify_die+0x30/0x30
> [  593.149895]  [<ffffffff81081b13>] ? __local_bh_enable+0x1d3/0x230
> [  593.149895]  [<ffffffff81133cdf>] rcu_read_lock_held+0x7f/0x90
> [  593.149895]  [<ffffffff81134208>] ftrace_ops_no_ops+0x188/0x190
> [  593.149895]  [<ffffffff81573872>] trace+0x55/0x93
> [  593.149895]  [<ffffffff81573872>] ? trace+0x55/0x93
> [  593.149895]  [<ffffffff81566fc9>] ? _raw_spin_unlock_irq+0x99/0xc0
> [  593.149895]  [<ffffffff8156d575>] ? sub_preempt_count+0x5/0x210
> [  593.149895]  [<ffffffff8156d575>] ? sub_preempt_count+0x5/0x210
> [  593.149895]  [<ffffffff81081b13>] ? __local_bh_enable+0x1d3/0x230

This makes sense, as function tracing of softirqs sometimes causes this
bug. It's a race where a function is traced inside enabling softirqs and
the tracer's lock causes a check of softirqs after lockdep disabled it
but before it was actually disabled. I'll try to figure out how this
happens again. It has been fixed in the past before.


> [  593.149895]  [<ffffffff810827ea>] __do_softirq+0x42a/0x710
> [  593.149895]  [<ffffffff81574c7c>] call_softirq+0x1c/0x30
> [  593.149895]  [<ffffffff81028a05>] do_softirq+0x1d5/0x2a0
> [  593.149895]  [<ffffffff81082d3d>] irq_exit+0xad/0xc0
> [  593.149895]  [<ffffffff81028035>] do_IRQ+0x75/0x100
> [  593.149895]  [<ffffffff81567c72>] common_interrupt+0x72/0x72
> [  593.149895]  <EOI>  [<ffffffff810fafbb>] ? mark_held_locks+0x9b/0x100
> [  593.149895]  [<ffffffff815675f3>] ? _raw_spin_unlock_irqrestore+0xd3/0x130
> [  593.149895]  [<ffffffff81307487>] __debug_object_init+0x1b7/0x580
> [  593.149895]  [<ffffffff813078c6>] debug_object_init_on_stack+0x36/0x40
> [  593.149895]  [<ffffffff8108a565>] init_timer_on_stack_key+0x35/0x80
> [  593.149895]  [<ffffffff815616fb>] schedule_timeout+0x8b/0x470
> [  593.149895]  [<ffffffff81573872>] ? trace+0x55/0x93
> [  593.149895]  [<ffffffff81562fb7>] ? __mutex_unlock_slowpath+0x247/0x380
> [  593.149895]  [<ffffffff81133487>] ? update_ftrace_function+0x17/0x150
> [  593.149895]  [<ffffffff81561afe>] schedule_timeout_uninterruptible+0x1e/0x20
> [  593.149895]  [<ffffffff8108ea60>] msleep+0x20/0x30
> [  593.149895]  [<ffffffff81145a6a>] trace_selftest_startup_function+0xda/0x2b0
> [  593.149895]  [<ffffffff81146c4c>] register_tracer+0x32c/0x430
> [  593.149895]  [<ffffffff81c3aee4>] ? init_trace_printk_function_export+0x48/0x48
> [  593.149895]  [<ffffffff81c3aef4>] init_function_trace+0x10/0x12
> [  593.149895]  [<ffffffff8100223a>] do_one_initcall+0x1aa/0x270
> [  593.149895]  [<ffffffff815313bc>] kernel_init+0x11c/0x370
> [  593.149895]  [<ffffffff81c17878>] ? do_early_param+0xc6/0xc6
> [  593.149895]  [<ffffffff810ca474>] ? schedule_tail+0x34/0x110
> [  593.149895]  [<ffffffff815312a0>] ? rest_init+0x170/0x170
> [  593.149895]  [<ffffffff815739ac>] ret_from_fork+0x7c/0xb0
> [  593.149895]  [<ffffffff815312a0>] ? rest_init+0x170/0x170
> 
> And it's bisected down to an unbelievable simple commit..
> 
> commit 17bb615ad4f8d2d2c0f02794d27d7f83e0009ef4
> Author: Steven Rostedt <srostedt@...hat.com>
> Date:   Mon May 23 15:27:46 2011 -0400
> 
>     tracing: Have event with function tracer check error return

This change is simply:

--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1657,7 +1657,12 @@ static struct ftrace_ops trace_ops __initdata  =
 
 static __init void event_trace_self_test_with_function(void)
 {
-       register_ftrace_function(&trace_ops);
+       int ret;
+       ret = register_ftrace_function(&trace_ops);
+       if (WARN_ON(ret < 0)) {
+               pr_info("Failed to enable function tracer for event tests\n")
+               return;
+       }


I highly doubt that this has anything to do with the bug. The bug is a
race, and that means you can't bisect it with single runs to determine
good and bad. It can only detect bad, but just because it passes,
doesn't actually mean its good.

My search is to see what is being traced again when it shouldn't be.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ