lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 29 Dec 2008 09:47:11 -0500
From:	Steven Rostedt <srostedt@...hat.com>
To:	Indan Zupancic <indan@....nu>
Cc:	Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] Fix race in ring_buffer_consume()


On Mon, 2008-12-29 at 04:59 +0100, Indan Zupancic wrote:
> Hello,
> 
> While doing a grep -r as root somewhere in /proc, grep founds its way
> into /mnt/debug/tracing and caused the below snippet:
> 
> ------------[ cut here ]------------
> kernel BUG at /home/indan/src/linux-2.6/kernel/trace/ring_buffer.c:1676!
> invalid opcode: 0000 [#1] PREEMPT
> last sysfs file: /sys/devices/virtual/backlight/thinkpad_screen/brightness
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in: i915 drm pl2303 usbserial usb_storage uhci_hcd ehci_hcd
> usbcore
> 
> Pid: 10660, comm: grep Not tainted (2.6.28phc #12) 2371GHG
> EIP: 0060:[<c0241464>] EFLAGS: 00010246 CPU: 0
> EIP is at rb_advance_reader+0xe/0xd7
> EAX: 00000000 EBX: f703c000 ECX: f717ea80 EDX: 00000003
> ESI: f70081c0 EDI: f7008240 EBP: 00000fff ESP: f6ac2f3c
>  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
> Process grep (pid: 10660, ti=f6ac2000 task=f689b900 task.ti=f6ac2000)
> Stack:
>  f703c000 f70081c0 00000000 c024171e f2de2000 f2de2000 f2de3038 c0242fc8
>  00000a6a c0245105 089adff4 f2dca400 f2de2010 00005000 f2dca400 c0244f8f
>  089adff4 c0263633 f6ac2fa0 f2dca400 fffffff7 00001000 f6ac2000 c02636f9
> Call Trace:
>  [<c024171e>] ring_buffer_consume+0x2b/0x31
>  [<c0242fc8>] trace_consume+0x1d/0x23
>  [<c0245105>] tracing_read_pipe+0x176/0x1dc
>  [<c0244f8f>] tracing_read_pipe+0x0/0x1dc
>  [<c0263633>] vfs_read+0x73/0xa1
>  [<c02636f9>] sys_read+0x3c/0x63
>  [<c0202ef5>] sysenter_do_call+0x12/0x2a
> Code: 00 25 00 f0 ff ff e8 f9 fb ff ff 85 c0 74 05 e8 8c 61 1c 00
> 89 f0 5e 5f 5b 5e 5f 5d c3 57 89 c7 56 53 e8 c2 fe ff ff 85 c0 75
> 04 <0f> 0b eb fe 8b 47 1c 8b 70 1c 03 70 10 8a 06 24 03 3c 03 75 03
> EIP: [<c0241464>] rb_advance_reader+0xe/0xd7 SS:ESP 0068:f6ac2f3c
> ---[ end trace 77bf081d52bacbc7 ]---
> note: grep[10660] exited with preempt_count 1
> 
> I think preemptirqsoff tracing is enabled, can't check because doing
> cat on most files in debug/tracing hangs. Kernel was booted with
> initcall_debug. Kernel is 2.6.28.
> 
> $ zcat /proc/config.gz |grep TRACER=
> CONFIG_NOP_TRACER=y
> CONFIG_HAVE_FUNCTION_TRACER=y
> CONFIG_IRQSOFF_TRACER=y
> CONFIG_PREEMPT_TRACER=y
> CONFIG_CONTEXT_SWITCH_TRACER=y
> CONFIG_BOOT_TRACER=y
> 
> ring_buffer.c:1676:
> 
> static void rb_advance_reader(struct ring_buffer_per_cpu *cpu_buffer)
> {
> 	struct ring_buffer_event *event;
> 	struct buffer_page *reader;
> 	unsigned length;
> 
> 	reader = rb_get_reader_page(cpu_buffer);
> 
> 	/* This function should not be called when buffer is empty */
> 	BUG_ON(!reader);
> 
> Called by ring_buffer_consume():
> 
> ring_buffer_consume(struct ring_buffer *buffer, int cpu, u64 *ts)
> {
> 	struct ring_buffer_per_cpu *cpu_buffer;
> 	struct ring_buffer_event *event;
> 
> 	if (!cpu_isset(cpu, buffer->cpumask))
> 		return NULL;
> 
> 	event = ring_buffer_peek(buffer, cpu, ts);
> 	if (!event)
> 		return NULL;
> 
> 	cpu_buffer = buffer->buffers[cpu];
> 	rb_advance_reader(cpu_buffer);
> 
> ring_buffer_peek() is complicated, but boils down to:
> 
> ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
> {
> [...]
> 	cpu_buffer = buffer->buffers[cpu];
> [...]
> 	reader = rb_get_reader_page(cpu_buffer);
> 	if (!reader)
> 		return NULL;
> 
> 	event = rb_reader_event(cpu_buffer);
> [...]
> 	return event or return NULL;
> }
> 
> So between the rb_get_reader_page() call in ring_buffer_peek() and
> the one in rb_advance_reader() something happened causing the event
> to disappear. Also note: grep[10660] exited with preempt_count 1.
> There's no locking between the ring_buffer_peek and rb_advance_reader
> calls in ring_buffer_consume, so a tentative guess is that there should
> be some.
> 
> Then again, the double tracing_read_pipe() in the backtrace seems weird,
> so maybe something else is going on.
> 
> Easiest fix seems to merge ring_buffer_peek and ring_buffer_consume into
> ring_buffer_get, which is the same as the current peek, but with an extra
> parameter telling it to consume the buffer or not.
> 
> The same might be needed for ring_buffer_iter_peek() and ring_buffer_read().
> 
> Patch doing this:
> 
> commit 648cf957210619595856b78f12bd11752ae22aa5
> Author: Indan Zupancic <indan@....nu>
> Date:   Mon Dec 29 14:27:18 2008 +1100
> 
>     Fix race in ring_buffer_consume(): Replace ring_buffer_consume and
> ring_buffer_peek with ring_buffer_get_event
> 
>     Signed-off-by: Indan Zupancic <indan@....nu>
> 
>  include/linux/ring_buffer.h   |    4 +---
>  kernel/trace/ring_buffer.c    |   39 ++++++++-------------------------------
>  kernel/trace/trace.c          |   15 ++++++++-------
>  kernel/trace/trace_selftest.c |    2 +-
>  4 files changed, 18 insertions(+), 42 deletions(-)

Hi Indan,

I really would like to analyze this more, but I'm on leave till
January 5th. I'll definitely take a good look at what is happening then.

Thanks,

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ