[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230816152308.5f887721@gandalf.local.home>
Date: Wed, 16 Aug 2023 15:23:08 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Zheng Yejian <zhengyejian1@...wei.com>
Cc: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
<laijs@...fujitsu.com>, <linux-kernel@...r.kernel.org>,
<linux-trace-kernel@...r.kernel.org>
Subject: Re: [PATCH] tracing: Fix race when concurrently splice_read
trace_pipe
On Sat, 12 Aug 2023 10:22:43 +0800
Zheng Yejian <zhengyejian1@...wei.com> wrote:
>
> >>>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> >>>> index b8870078ef58..f169d33b948f 100644
> >>>> --- a/kernel/trace/trace.c
> >>>> +++ b/kernel/trace/trace.c
> >>>> @@ -7054,14 +7054,16 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
> >>>> if (ret <= 0)
> >>>> goto out_err;
> >>>>
> >>>> - if (!iter->ent && !trace_find_next_entry_inc(iter)) {
> >>>> + trace_event_read_lock();
> >>>> + trace_access_lock(iter->cpu_file);
> >>>> +
> >>>> + if (!trace_find_next_entry_inc(iter)) {
> >>>
> >>> It seems you skips '!iter->ent' check. Is there any reason for this change?
> >>
> >> IIUC, 'iter->ent' may be the entry that was found but not consumed
> >> in last call tracing_splice_read_pipe(), and in this call, 'iter->ent'
> >> may have being consumed, so we may should find a new 'iter->ent' before
> >> printing it in tracing_fill_pipe_page(), see following reduced codes:
> >
> > And if it wasn't consumed? We just lost it?
>
> If 'iter->ent' was not consumed, trace_find_next_entry_inc() will find
> it again, will it?
>
> -- Zheng Yejian
>
> >
> >>
> >> tracing_splice_read_pipe() {
> >> if (!iter->ent && !trace_find_next_entry_inc(iter)) { // 1. find
> >> entry here
> >> ... ...
> >> }
> >> tracing_fill_pipe_page() {
> >> for (;;) {
> >> ... ...
> >> ret = print_trace_line(iter); // 2. print entry
> >> ... ...
> >
> > You missed:
> >
> > count = trace_seq_used(&iter->seq) - save_len;
> > if (rem < count) {
> > rem = 0;
> > iter->seq.seq.len = save_len;
> >
> > Where the above just threw away what was printed in the above
> > "print_trace_line()", and it never went to console.
> >
> > break;
> > }
> >
>
> Thanks for pointing this out!
Just to get this moving again, I believe we should add a ref count to
trace_pipe and the per_cpu trace_pipes, where if they are opened, nothing else can read it.
Opening trace_pipe locks all per_cpu ref counts, if any of them are open,
then the trace_pipe open will fail (and releases any ref counts it had
taken).
Opening a per_cpu trace_pipe will up the ref count for just that CPU
buffer. This will allow multiple tasks to read different per_cpu trace_pipe
files, but will prevent the main trace_pipe file from being opened.
Does that work for this?
-- Steve
Powered by blists - more mailing lists