linux-kernel - lockdep and preemptoff tracer are fighting again.

Message-ID: <alpine.DEB.1.10.0901221523400.5838@gandalf.stny.rr.com>
Date:	Thu, 22 Jan 2009 15:40:23 -0500 (EST)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Ingo Molnar <mingo@...e.hu>, Peter Zijlstra <peterz@...radead.org>
cc:	LKML <linux-kernel@...r.kernel.org>
Subject: lockdep and preemptoff tracer are fighting again.



Hey guys, I can consistently hit this bug when running the preempt tracer:

------------[ cut here ]------------
WARNING: at kernel/lockdep.c:2899 check_flags+0x154/0x18b()
Hardware name: Precision WorkStation 470    
Modules linked in: radeon drm autofs4 hidp rfcomm l2cap bluetooth sunrpc nf_conn
track_netbios_ns ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_fi
lter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 sb
s sbshc battery ac snd_intel8x0 snd_ac97_codec sg ac97_bus snd_seq_dummy snd_seq
_oss snd_seq_midi_event floppy snd_seq snd_seq_device snd_pcm_oss ide_cd_mod snd
_mixer_oss cdrom snd_pcm e1000 serio_raw snd_timer snd i2c_i801 button soundcore
 ata_generic i2c_core iTCO_wdt snd_page_alloc e752x_edac iTCO_vendor_support shp
chp edac_core pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod 
ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Pid: 3855, comm: sshd Not tainted 2.6.29-rc2-tip #366
Call Trace:
 [<ffffffff80245e9f>] warn_slowpath+0xd8/0xf7
 [<ffffffff80297154>] ? ring_buffer_unlock_commit+0x24/0xa3
 [<ffffffff80299501>] ? trace_function+0xad/0xbc
 [<ffffffff8025c1ff>] ? remove_wait_queue+0x4d/0x52
 [<ffffffff8029e5dc>] ? trace_preempt_on+0x113/0x130
 [<ffffffff8029e4ba>] ? check_critical_timing+0x12e/0x13d
 [<ffffffff8025c1ff>] ? remove_wait_queue+0x4d/0x52
 [<ffffffff8029f75b>] ? stack_trace_call+0x249/0x25d
 [<ffffffff802da06e>] ? fput+0x4/0x1c
 [<ffffffff802e7edc>] ? free_poll_entry+0x26/0x2a
 [<ffffffff802da06e>] ? fput+0x4/0x1c
 [<ffffffff8020c2d6>] ? ftrace_call+0x5/0x2b
 [<ffffffff8029f75b>] ? stack_trace_call+0x249/0x25d
 [<ffffffff80543dec>] ? _spin_lock_irqsave+0xb/0x59
 [<ffffffff802699bf>] check_flags+0x154/0x18b
 [<ffffffff8026de66>] lock_acquire+0x41/0xa9
 [<ffffffff80543dfd>] ? _spin_lock_irqsave+0x1c/0x59
 [<ffffffff80543e27>] _spin_lock_irqsave+0x46/0x59
 [<ffffffff8029519c>] ? ring_buffer_reset_cpu+0x31/0x6b
 [<ffffffff8029519c>] ring_buffer_reset_cpu+0x31/0x6b
 [<ffffffff80299ec6>] tracing_reset+0x46/0x9b
 [<ffffffff8029e33f>] trace_preempt_off+0x100/0x14d
 [<ffffffff8024b491>] ? local_bh_disable+0x12/0x14
 [<ffffffff8024b44f>] ? __local_bh_disable+0xc0/0xf0
 [<ffffffff8024b491>] ? local_bh_disable+0x12/0x14
 [<ffffffff80543b95>] ? _spin_lock_bh+0x16/0x4c
 [<ffffffff80546df1>] add_preempt_count+0x12d/0x132
 [<ffffffff8024b44f>] __local_bh_disable+0xc0/0xf0
 [<ffffffff8024b491>] local_bh_disable+0x12/0x14
 [<ffffffff80543b95>] _spin_lock_bh+0x16/0x4c
 [<ffffffff804ab49a>] lock_sock_nested+0x28/0xe5
 [<ffffffff80292c90>] ? ftrace_list_func+0x24/0x39
 [<ffffffff8020c2d6>] ? ftrace_call+0x5/0x2b
 [<ffffffff804eff87>] tcp_sendmsg+0x27/0xac2
 [<ffffffff803556c7>] ? cap_socket_sendmsg+0x4/0xd
 [<ffffffff80292c90>] ? ftrace_list_func+0x24/0x39
 [<ffffffff8020c2d6>] ? ftrace_call+0x5/0x2b
 [<ffffffff804a82b0>] sock_aio_write+0x109/0x11d
 [<ffffffff8029f75b>] ? stack_trace_call+0x249/0x25d
 [<ffffffff8020c2d6>] ? ftrace_call+0x5/0x2b
 [<ffffffff802d8881>] do_sync_write+0xf0/0x137
 [<ffffffff8025c002>] ? autoremove_wake_function+0x0/0x3d
 [<ffffffff8020c2d6>] ? ftrace_call+0x5/0x2b
 [<ffffffff803553ca>] ? cap_file_permission+0x9/0xd
 [<ffffffff80353c88>] ? security_file_permission+0x16/0x18
 [<ffffffff802d921c>] vfs_write+0x103/0x17d
 [<ffffffff802d978f>] sys_write+0x4e/0x8c
 [<ffffffff8020c64b>] system_call_fastpath+0x16/0x1b
---[ end trace 713cc9df66b54d6e ]---


The cause is simple. The following happens:

local_bh_disable() is called, which calls __local_bh_disable(), which does an
add_preempt_count(SOFTIRQ_OFFSET).

Thus, add_preempt_count adds the SOFTIRQ_OFFSET to the preempt_count of 
current, and then calls trace_preempt_off.

This goes into the preempt tracer which calls start_critical_timing, and 
this will reset the ring buffer for the CPU, because this is the start of 
the trace.

ring_buffer_reset_cpu() calls spin_lock_irqsave() which eventually calls 
spin_acquire which is lock_acquire in lockdep.

lock_acquire calls check_flags which performs this check:

	if (!hardirq_count()) {
		if (softirq_count())
			DEBUG_LOCKS_WARN_ON(current->softirqs_enabled);
		else
			DEBUG_LOCKS_WARN_ON(!current->softirqs_enabled);
	}

With this:

#define hardirq_count()	(preempt_count() & HARDIRQ_MASK)
#define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
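
To make the mask arithmetic concrete, here is a small user-space sketch. The
field widths (8 preempt bits, 8 softirq bits, 12 hardirq bits) are an
assumption for illustration, and the model_* helpers are hypothetical names
that take the counter as an argument instead of reading preempt_count().

```c
#include <assert.h>

/* Assumed field widths, for illustration only. */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	12

#define SOFTIRQ_SHIFT	PREEMPT_BITS
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)

#define SOFTIRQ_OFFSET	(1UL << SOFTIRQ_SHIFT)
#define SOFTIRQ_MASK	(((1UL << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK	(((1UL << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT)

/* Hypothetical stand-ins for hardirq_count() and softirq_count(). */
static unsigned long model_hardirq_count(unsigned long pc)
{
	return pc & HARDIRQ_MASK;
}

static unsigned long model_softirq_count(unsigned long pc)
{
	return pc & SOFTIRQ_MASK;
}

/* Counter value right after add_preempt_count(SOFTIRQ_OFFSET). */
static unsigned long model_after_bh_disable(unsigned long pc)
{
	/* The softirq field must not already be saturated. */
	assert((pc & SOFTIRQ_MASK) != SOFTIRQ_MASK);
	return pc + SOFTIRQ_OFFSET;
}
```

Starting from a counter of 0 (plain task context), model_after_bh_disable()
yields a value whose hardirq field is still zero while its softirq field is
nonzero, which is the combination check_flags sees in the trace above.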


The hardirq_count returns false, but the softirq_count returns true and 
softirqs_enabled is also true. The problem lies in __local_bh_disable:

static void __local_bh_disable(unsigned long ip)
{
	unsigned long flags;

	WARN_ON_ONCE(in_irq());

	raw_local_irq_save(flags);
	add_preempt_count(SOFTIRQ_OFFSET); <-- here softirq_count is true
	/*
	 * Were softirqs turned off above:
	 */
	if (softirq_count() == SOFTIRQ_OFFSET)
		trace_softirqs_off(ip); <-- here softirqs_enabled is false
	raw_local_irq_restore(flags);
}

If we call into lockdep between softirq_count == true and 
softirqs_enabled == false, we hit the WARN_ON.

The trace_softirqs_off() sets softirqs_enabled to false. But because the 
tracer calls into lockdep between the two, we hit this warning.

If we try to swap the trace_softirqs_off with the add_preempt_count we hit 
another warning that checks to make sure softirq_count is true in the 
trace_softirqs_off code.
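
Both orderings can be modeled in user space. This is only a sketch: the
struct, the model_* names and the warning counter are made up for
illustration, the mask values are assumed, and the real DEBUG_LOCKS_WARN_ON
also turns lock debugging off after the first hit, which is not modeled.

```c
#include <stdbool.h>

/* Assumed values, for illustration only. */
#define SOFTIRQ_OFFSET	0x100UL
#define SOFTIRQ_MASK	0xff00UL
#define HARDIRQ_MASK	0x0fff0000UL

/* Hypothetical stand-in for the bits of "current" that matter here. */
struct model_task {
	unsigned long preempt_count;
	bool softirqs_enabled;	/* lockdep's view of softirq state */
	int warnings;		/* how often a WARN_ON would have fired */
};

/* Same test as the check_flags snippet quoted above. */
static void model_check_flags(struct model_task *t)
{
	if (t->preempt_count & HARDIRQ_MASK)
		return;
	if (t->preempt_count & SOFTIRQ_MASK) {
		if (t->softirqs_enabled)
			t->warnings++;
	} else {
		if (!t->softirqs_enabled)
			t->warnings++;
	}
}

/* add_preempt_count with the preempt tracer hooked in. */
static void model_add_preempt_count(struct model_task *t, unsigned long val)
{
	t->preempt_count += val;
	/* tracer -> ring_buffer_reset_cpu -> spin_lock_irqsave -> lock_acquire */
	model_check_flags(t);
}

/* Flips lockdep's flag and expects softirq_count to be set already. */
static void model_trace_softirqs_off(struct model_task *t)
{
	if (t->softirqs_enabled) {
		t->softirqs_enabled = false;
		if (!(t->preempt_count & SOFTIRQ_MASK))
			t->warnings++;
	}
}

/* Ordering as in __local_bh_disable today. */
static int model_bh_disable_current(void)
{
	struct model_task t = { 0, true, 0 };

	model_add_preempt_count(&t, SOFTIRQ_OFFSET);
	if ((t.preempt_count & SOFTIRQ_MASK) == SOFTIRQ_OFFSET)
		model_trace_softirqs_off(&t);
	return t.warnings;
}

/* Swapped ordering: tell lockdep first, then bump the count. */
static int model_bh_disable_swapped(void)
{
	struct model_task t = { 0, true, 0 };

	model_trace_softirqs_off(&t);
	model_add_preempt_count(&t, SOFTIRQ_OFFSET);
	return t.warnings;
}
```

In this model each ordering produces exactly one warning: the current one in
model_check_flags, the swapped one in model_trace_softirqs_off. So reordering
alone just moves the warning around.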

We need a way for lockdep and the preempt tracer to talk to each other, so 
that lockdep knows it should not fail here.

Any ideas?

-- Steve

