Message-ID: <20240207072812.4a29235f@rorschach.local.home>
Date: Wed, 7 Feb 2024 07:28:12 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Mete Durlu <meted@...ux.ibm.com>
Cc: Sven Schnelle <svens@...ux.ibm.com>, Masami Hiramatsu
<mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH] tracing: use ring_buffer_record_is_set_on() in
tracer_tracing_is_on()
On Wed, 7 Feb 2024 13:07:36 +0100
Mete Durlu <meted@...ux.ibm.com> wrote:
> wouldn't the following scenario explain the behavior we are seeing.
> When using event triggers, trace uses lockless ringbuffer control paths.
> If cmdline update and trace output reading is happening on different
> cpus, the ordering can get messed up.
>
> 1. event happens and trace trigger tells ring buffer to stop writes
> 2. (on cpu#1)test calculates checksum on current state of trace
> output.
> 3. (on cpu#2)not knowing about the trace buffers status yet, writer adds
> a one last entry which would collide with a pid in cmdline map before
> actually stopping. While (on cpu#1) checksum is being calculated, new
> saved cmdlines entry is waiting for spinlocks to be unlocked and then
> gets added.
> 4. test calculates checksum again and finds that the trace output has
> changed. <...> is put on collided pid.
But the failure is here:
    on=`cat tracing_on`
    if [ $on != "0" ]; then
        fail "Tracing is not off"
    fi

    csum1=`md5sum trace`
    sleep $SLEEP_TIME
    csum2=`md5sum trace`

    if [ "$csum1" != "$csum2" ]; then
        fail "Tracing file is still changing"
    fi
1. tracing is off
2. do checksum of trace
3. sleep
4. do another checksum of trace
5. compare the two checksums
Now how did they come up differently in that amount of time? The
saved_cmdlines really should not have been updated.
(note, I'm not against the patch, I just want to understand why this
test really failed).
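One way to narrow this down might be to keep both snapshots of the trace file and diff them, rather than comparing only checksums, so that the lines that changed are visible. A rough sketch follows; check_stable and its arguments are hypothetical names and not part of the existing selftest:

```shell
# Hypothetical helper, not part of the selftest: read a file twice with a
# delay in between and show a diff if the two reads differ.
# Usage: check_stable FILE SLEEP_SECONDS
# Returns 0 if the file is unchanged, 1 if it changed, 2 on setup failure.
check_stable() {
    file=$1
    delay=$2
    snap1=$(mktemp) || return 2
    snap2=$(mktemp) || return 2

    # Take the two snapshots, mirroring the two md5sum calls in the test
    cat "$file" > "$snap1"
    sleep "$delay"
    cat "$file" > "$snap2"

    if cmp -s "$snap1" "$snap2"; then
        rc=0
    else
        # Print exactly which lines changed between the two reads
        diff -u "$snap1" "$snap2"
        rc=1
    fi

    rm -f "$snap1" "$snap2"
    return $rc
}
```

In the test this could be called as check_stable trace "$SLEEP_TIME", with the existing fail invoked on a non-zero return. If the diff shows only a comm name turning into <...>, that would point at saved_cmdlines; if it shows new events, something is still writing to the buffer.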
-- Steve