lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20220222204600.mgtt3ihtvlptjh2l@riteshh-domain>
Date:   Wed, 23 Feb 2022 02:16:00 +0530
From:   Ritesh Harjani <riteshh@...ux.ibm.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Theodore Ts'o" <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        linux-ext4@...r.kernel.org,
        Harshad Shirwadkar <harshadshirwadkar@...il.com>
Subject: Re: [BUG] NEVER dereference pointers from the tracing ring buffer!

On 22/02/21 04:09PM, Steven Rostedt wrote:
> While working on libtraceevent, it stumbled over a "->" in the print
> formats that wasn't parsing properly. Well, the reason it does not
> handle this particular case is because it is A MAJOR BUG IN THE KERNEL!
>
> DO NOT DEREFERENCE ANYTHING FROM THE RING BUFFER.
>
> Sorry, for yelling, but I really want to stress this is not something
> to do. The ring buffer is referenced seconds, minutes, hours, days,
> months after the data has been loaded. This is the same as using
> something after it is freed. Just don't do it.
>
> The culprit is:
>
>   ext4_fc_stats
>
> Introduced by: aa75f4d3daaeb ("ext4: main fast-commit commit path")
>
> That has in its print fmt in include/trace/events/ext4.h:
>
>
>             TP_printk("dev %d:%d fc ineligible reasons:\n"
>                       "%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
>                       "num_commits:%ld, ineligible: %ld, numblks: %ld",
>                       MAJOR(__entry->dev), MINOR(__entry->dev),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
>                       FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
>                       __entry->sbi->s_fc_stats.fc_num_commits,
>                       __entry->sbi->s_fc_stats.fc_ineligible_commits,
>                       __entry->sbi->s_fc_stats.fc_numblks)
>
> That __entry->sbi can be LONG GONE by the time it is read. You have no
> idea what it is pointing to. And even if it is still around, there's no
> way that any of those numbers are accurate from the time the event
> triggered. Might as well just expose them by some other interface.

Thanks Steven for spotting this and providing details of the problem.
I have submitted a patch to fix this problem at [1].
But I am facing a small challenge w.r.t __array field type w.r.t perf record
which I have explained in the cover letter.
Would it be possible for you to take a look at it [2] and also at the patch [1].

Thanks a lot for your help!!

[1]: https://lore.kernel.org/all/9a8c359270a6330ed384ea0a75441e367ecde924.1645558375.git.riteshh@linux.ibm.com/
[2]: https://lore.kernel.org/all/cover.1645558375.git.riteshh@linux.ibm.com/


-ritesh

>
> This event must be modified or removed from the current release and all
> stable kernels that have it.
>
> In the mean time, I need to update my event verifier to detect this and
> fail to register the event if it has such cases.
>
> -- Steve

Powered by blists - more mailing lists