Message-ID: <3a89d49ae0da6378de83bc68690cf6664d97cb24.camel@mediatek.com>
Date: Thu, 7 Sep 2023 09:42:41 +0000
From: Tze-nan Wu (吳澤南)
<Tze-nan.Wu@...iatek.com>
To: "rostedt@...dmis.org" <rostedt@...dmis.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-trace-kernel@...r.kernel.org"
<linux-trace-kernel@...r.kernel.org>,
"linux-mediatek@...ts.infradead.org"
<linux-mediatek@...ts.infradead.org>,
Cheng-Jui Wang (王正睿)
<Cheng-Jui.Wang@...iatek.com>,
wsd_upstream <wsd_upstream@...iatek.com>,
Bobule Chang (張弘義)
<bobule.chang@...iatek.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"mhiramat@...nel.org" <mhiramat@...nel.org>,
"angelogioacchino.delregno@...labora.com"
<angelogioacchino.delregno@...labora.com>
Subject: Re: [PATCH] ring-buffer: Do not read at &event->array[0] if it across
the page
On Tue, 2023-09-05 at 12:39 -0400, Steven Rostedt wrote:
>
> On Tue, 5 Sep 2023 22:12:45 +0800
> Tze-nan Wu <Tze-nan.Wu@...iatek.com> wrote:
>
> > While reading from tracing/trace, the ftrace reader rarely
> > encounters a KASAN invalid-access issue.
> > It is likely that the writer has disrupted the ring_buffer that the
> > reader is currently parsing. The KASAN report is as below:
>
> What kernel are you using? There's been recent fixes that handle the
> race between reading the last entry and the writer adding to it. This
> looks like you may have hit that bug.
>
I am currently using kernel version 6.1.24, but I have applied the
ring_buffer.c patches up to kernel version 6.1.52. However, I am still
able to reproduce the issue.
BTW, my device is ARCH=ARM64 with PAGE_SIZE=4K.
> >
> > [name:report&]BUG: KASAN: invalid-access in rb_iter_head_event+0x27c/0x3d0
> > [name:report&]Read of size 4 at addr 71ffff8111a18000 by task xxxx
> > [name:report_sw_tags&]Pointer tag: [71], memory tag: [0f]
> > [name:report&]
> > CPU: 2 PID: 380 Comm: xxxx
> > Call trace:
> > dump_backtrace+0x168/0x1b0
> > show_stack+0x2c/0x3c
> > dump_stack_lvl+0xa4/0xd4
> > print_report+0x268/0x9b0
> > kasan_report+0xdc/0x148
> > kasan_tag_mismatch+0x28/0x3c
> > __hwasan_tag_mismatch+0x2c/0x58
> > rb_event_length() [inline]
> > rb_iter_head_event+0x27c/0x3d0
> > ring_buffer_iter_peek+0x23c/0x6e0
> > __find_next_entry+0x1ac/0x3d8
> > s_next+0x1f0/0x310
> > seq_read_iter+0x4e8/0x77c
> > seq_read+0xf8/0x150
> > vfs_read+0x1a8/0x4cc
> >
> > In some edge cases, the ftrace reader could access an invalid
> > address, specifically when reading 4 bytes beyond the end of the
> > current page. While the issue happened, the dump of
> > rb_iter_head_event is shown below:
> >
> > in function rb_iter_head_event:
> > - iter->head = 0xFEC
> > - iter->next_event = 0xFEC
> > - commit = 0xFF0
> > - read_stamp = 0x2955AC46DB0
> > - page_stamp = 0x2955AC2439A
> > - iter->head_page->page = 0x71FFFF8111A17000
> > - iter->head_page->time_stamp = 0x2956A142267
> > - iter->head_page->page->commit = 0xFF0
> > - the content in iter->head_page->page:
> >   0x71FFFF8111A17FF0: 01010075 00002421 0A123B7C FFFFFFC0
> >
> > In rb_iter_head_event, the reader will call rb_event_length with the
> > argument (struct ring_buffer_event *event = 0x71FFFF8111A17FFC).
>
> As the commit is considered the end of buffer, it should never read
> past that, which the above appears to be doing. That's why I think it
> hit that race.
>
That was my thought as well, since every time the reader tries to
access an address in the next page, the condition below holds in
rb_iter_head_event():

    if (iter->page_stamp != iter_head_page->page->time_stamp ||
        commit > rb_page_commit(iter_head_page))
            goto reset;
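
For reference, that check cannot prevent the bad access because of the
ordering in rb_iter_head_event(): the event is decoded before the
page_stamp/commit recheck runs. A heavily condensed paraphrase of the
~6.1 code path (my annotations, not the exact source):

    commit = rb_page_commit(iter_head_page);
    smp_rmb();

    event  = __rb_page_index(iter_head_page, iter->head);
    length = rb_event_length(event);   /* KASAN hit: may read array[0] off-page */
    barrier();

    if ((iter->head + length) > commit || length > BUF_PAGE_SIZE)
        goto reset;                    /* too late, the off-page read already happened */

    memcpy(iter->event, event, length);
    smp_rmb();

    /* the recheck quoted above only runs after the event was decoded */
    if (iter->page_stamp != iter_head_page->page->time_stamp ||
        commit > rb_page_commit(iter_head_page))
        goto reset;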
I observed this with the debug patch below:
@@ -2378,6 +2378,19 @@ rb_iter_head_event(struct ring_buffer_iter *iter)
 	commit = rb_page_commit(iter_head_page);
 	smp_rmb();
+	if ((iter->head >= 0xFECUL) && commit == 0xFF0UL) {
+		pr_info("rbdbg: cpu = %d, event = 0x%lx, iter->head = 0x%lx,\
+			commit = 0xFF0, type = 0x%x, ts = 0x%x, array addr = 0x%lx\n",\
+			iter->cpu_buffer->cpu, (unsigned long)event, iter->head,\
+			event->type_len, event->time_delta, (unsigned long)(&(event->array[0])));
+		mdelay(500);
+		pr_info("rbdbg2: cpu = %d, event = 0x%lx, iter->head = 0x%lx,\
+			commit = 0xFF0, type = 0x%x, ts = 0x%x, array addr = 0x%lx\n",\
+			iter->cpu_buffer->cpu, (unsigned long)event, iter->head,\
+			event->type_len, event->time_delta, (unsigned long)(&(event->array[0])));
+		if (iter->page_stamp != iter_head_page->page->time_stamp || commit > rb_page_commit(iter_head_page))
+			pr_info("rbdbg: writer corrupt reader\n");
+	}
 	event = __rb_page_index(iter_head_page, iter->head);
 	length = rb_event_length(event);
Note that the mdelay(500) in the debug patch makes the issue easier to
reproduce with the same test in my environment; with the debug patch
applied, I can now reproduce it within 15 minutes.
> > Since the data starting at address 0x71FFFF8111A17FFC is 0xFFFFFFC0,
> > event->type_len will be interpreted as 0x0, and then the reader will
> > try to get the length by accessing event->array[0], which is an
> > invalid address:
> > &event->array[0] = 0x71FFFF8111A18000
> >
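To spell out the decoding that makes this happen: the word dumped at
0x71FFFF8111A17FFC is 0xFFFFFFC0, whose low 5 bits (type_len) are 0, so
the reader treats it as a data event whose length is stored in
array[0], i.e. the first 4 bytes of the following page. A simplified,
self-contained paraphrase of that decoding (mirroring
include/linux/ring_buffer.h and the data case of rb_event_length() in
~6.1; not the verbatim source):

    #include <linux/types.h>
    #include <linux/stddef.h>

    /* Event header layout (mirrors include/linux/ring_buffer.h). */
    struct ring_buffer_event {
            u32     type_len:5, time_delta:27;
            u32     array[];        /* for type_len == 0, array[0] holds the payload length */
    };

    #define RB_ALIGNMENT            4U
    #define RB_EVNT_HDR_SIZE        offsetof(struct ring_buffer_event, array)   /* 4 bytes */

    /*
     * Simplified paraphrase of the data-event case of rb_event_length():
     * when type_len == 0 the length lives in array[0], the 4 bytes right
     * after the header, so a header at offset 0xFFC of a 4K page makes
     * the reader touch the first word of the *next* page.
     */
    static unsigned int data_event_length(struct ring_buffer_event *event)
    {
            if (event->type_len)
                    return event->type_len * RB_ALIGNMENT + RB_EVNT_HDR_SIZE;

            return event->array[0] + RB_EVNT_HDR_SIZE;
    }

So any data-event header landing in the last 4 bytes of a page forces a
read from the following page before the existing checks get a chance to
catch it.
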
> > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@...iatek.com>
> > ---
> > The following patch may not be the final solution; it merely checks,
> > inside rb_event_length, whether the address to be accessed is valid
> > before accessing it. I am not sure whether it can lead to any
> > side effects.
> >
> > I am curious what a better solution for this issue would look like.
> > Should we address the problem from the writer side or the reader
> > side?
> >
> > Also, I wonder if the cause of the issue is exactly as I suspected.
> > Any suggestion will be appreciated.
>
> I guess this depends on if you have the fixes or not?
>
Yes, I can try to pick the ring_buffer.c patches that are only in
mainline and not yet in kernel 6.1.52, and run the same test to see
whether the issue is still reproducible.
> >
> > The test below can reproduce the issue in 2 hours on kernel 6.1.24:
> >     $ cd /sys/kernel/tracing/
> >     # make the reader and writer race more by resizing the buffer to 8kb
> >     $ echo 8 > buffer_size_kb
> >     # enable all events
> >     $ echo 1 > events/enable
> >     # enable tracing
> >     $ echo 1 > tracing_on
> >
> >     # write and run a script that keeps reading the trace
> >     $ ./read_trace.sh
> >
> > ``` read_trace.sh
> > while :
> > do
> >     cat /sys/kernel/tracing/trace > /dev/null
> > done
> > ```
>
> Thanks, I'll look at that when I finish debugging the eventfs bug.
>
> -- Steve
Also, thanks for your reply,
-- Tze-nan