Date: Thu, 8 Feb 2024 11:25:50 +0100
From: Mete Durlu <meted@...ux.ibm.com>
To: Steven Rostedt <rostedt@...dmis.org>, Sven Schnelle <svens@...ux.ibm.com>
Cc: Masami Hiramatsu <mhiramat@...nel.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org
Subject: Re: [PATCH] tracing: use ring_buffer_record_is_set_on() in
 tracer_tracing_is_on()

On 2/7/24 16:47, Steven Rostedt wrote:
> On Wed, 07 Feb 2024 14:33:21 +0100
> Sven Schnelle <svens@...ux.ibm.com> wrote:
> 
>> My assumption without reading the code is that something like this
>> happens:
>>
>> CPU0                             CPU1
>> [ringbuffer enabled]
>>                                   ring_buffer_write()
>>                                       if (atomic_read(&buffer->record_disabled))
>>                                              goto out;
>> echo 0 > tracing_on
>> record_disabled |= RB_BUFFER_OFF
>> csum1=`md5sum trace`
> 
> Note, CPU1 is performing with preemption disabled, so for this to
> happen, something really bad must have happened on CPU0 to delay the
> preempt-disabled section long enough to allow the trace to be read.
> Perhaps we should have the echo 0 > tracing_on do a synchronize_rcu()
> to make sure all ring buffers see it disabled before it returns.
> 
> But unless your system is really stressed, causing the preempt-disabled
> section to take that long, I highly doubt this was the race.
> 
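
To make sure I understand the window being described, here is a
simplified sketch of the writer side (not the real code from
kernel/trace/ring_buffer.c, just the shape of it as I read the scenario
above; the names mirror the scenario, not the actual implementation):

/*
 * Simplified sketch only -- assumes the writer runs with preemption
 * disabled and bails out when record_disabled is set, as in the
 * interleaving above.
 */
int sketch_ring_buffer_write(struct trace_buffer *buffer)
{
        int ret = -EBUSY;

        preempt_disable_notrace();      /* the writer is non-preemptible */

        if (atomic_read(&buffer->record_disabled))
                goto out;               /* tracing is already off */

        /* ... reserve, copy and commit the event ... */
        ret = 0;
 out:
        preempt_enable_notrace();
        return ret;
}

The check and the commit are not atomic with respect to the echo on the
other CPU, so a writer that passed the check before RB_BUFFER_OFF was
set can still commit its event after the echo has returned, and the two
md5sum runs can then differ.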

I have only been able to reliably reproduce this issue when the system
is under load from stressors, but I am not sure whether that counts as
*really stressed*.

system : 8 cpus (4 physical cores)
load   : stress-ng --fanotify 1 (or --context 2)
result : ~5/10 test runs fail

Of course, as the load increases the test starts failing more often, but
a single stressor doesn't seem like much to me for a 4-core machine.

After adding synchronize_rcu() plus the patch from Sven, I am no longer
seeing failures with the setup above. So it seems like synchronize_rcu()
did the trick (or at least helps a lot) for the case described in the
previous mail. I haven't been able to trigger the failure yet, not even
with increased load (but now the test case takes > 5 mins to finish :) ).

Here is the diff:

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
@@ -9328,10 +9328,12 @@ rb_simple_write(struct file *filp, const char __user *ubuf,
                         val = 0; /* do nothing */
                 } else if (val) {
                         tracer_tracing_on(tr);
+                       synchronize_rcu();
                         if (tr->current_trace->start)
                                 tr->current_trace->start(tr);
                 } else {
                         tracer_tracing_off(tr);
+                       synchronize_rcu();
                         if (tr->current_trace->stop)
                                 tr->current_trace->stop(tr);

I am not 100% sure whether these are the correct places to add them.
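
If I read Steven's suggestion correctly, the intent of the placement
after tracer_tracing_off() is roughly the following (my understanding
only, not authoritative):

        tracer_tracing_off(tr);  /* sets RB_BUFFER_OFF in record_disabled */
        synchronize_rcu();       /* writers run with preemption disabled, so
                                  * this waits for any writer that sampled the
                                  * old value of record_disabled to finish its
                                  * commit */
        /* only now does the write to tracing_on return to user space, so a
         * subsequent read of the trace file should see a stable buffer */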


