[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220221160916.333e6491@rorschach.local.home>
Date: Mon, 21 Feb 2022 16:09:16 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
"Theodore Ts'o" <tytso@....edu>,
Andreas Dilger <adilger.kernel@...ger.ca>,
linux-ext4@...r.kernel.org,
Harshad Shirwadkar <harshadshirwadkar@...il.com>
Subject: [BUG] NEVER dereference pointers from the tracing ring buffer!
While working on libtraceevent, it stumbled over a "->" in the print
formats that wasn't parsing properly. Well, the reason it does not
handle this particular case is because it is A MAJOR BUG IN THE KERNEL!
DO NOT DEREFERENCE ANYTHING FROM THE RING BUFFER.
Sorry, for yelling, but I really want to stress this is not something
to do. The ring buffer is referenced seconds, minutes, hours, days,
months after the data has been loaded. This is the same as using
something after it is freed. Just don't do it.
The culprit is:
ext4_fc_stats
Introduced by: aa75f4d3daaeb ("ext4: main fast-commit commit path")
That has in its print fmt in include/trace/events/ext4.h:
TP_printk("dev %d:%d fc ineligible reasons:\n"
"%s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d, %s:%d; "
"num_commits:%ld, ineligible: %ld, numblks: %ld",
MAJOR(__entry->dev), MINOR(__entry->dev),
FC_REASON_NAME_STAT(EXT4_FC_REASON_XATTR),
FC_REASON_NAME_STAT(EXT4_FC_REASON_CROSS_RENAME),
FC_REASON_NAME_STAT(EXT4_FC_REASON_JOURNAL_FLAG_CHANGE),
FC_REASON_NAME_STAT(EXT4_FC_REASON_NOMEM),
FC_REASON_NAME_STAT(EXT4_FC_REASON_SWAP_BOOT),
FC_REASON_NAME_STAT(EXT4_FC_REASON_RESIZE),
FC_REASON_NAME_STAT(EXT4_FC_REASON_RENAME_DIR),
FC_REASON_NAME_STAT(EXT4_FC_REASON_FALLOC_RANGE),
FC_REASON_NAME_STAT(EXT4_FC_REASON_INODE_JOURNAL_DATA),
__entry->sbi->s_fc_stats.fc_num_commits,
__entry->sbi->s_fc_stats.fc_ineligible_commits,
__entry->sbi->s_fc_stats.fc_numblks)
That __entry->sbi can be LONG GONE by the time it is read. You have no
idea what it is pointing to. And even if it is still around, there's no
way that any of those numbers are accurate from the time the event
triggered. Might as well just expose them by some other interface.
This event must be modified or removed from the current release and all
stable kernels that have it.
In the mean time, I need to update my event verifier to detect this and
fail to register the event if it has such cases.
-- Steve
Powered by blists - more mailing lists