[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ccc4234d-8a47-4c0f-808a-95e61c9c9171@efficios.com>
Date: Mon, 22 Jan 2024 12:14:36 -0500
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Steven Rostedt <rostedt@...dmis.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>, Kees Cook
<keescook@...omium.org>, Linus Torvalds <torvalds@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, Masami Hiramatsu <mhiramat@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Christian Brauner <brauner@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
Ajay Kaher <ajay.kaher@...adcom.com>
Subject: Re: [for-linus][PATCH 1/3] eventfs: Have the inodes all for files and
directories all be the same
On 2024-01-22 10:06, Steven Rostedt wrote:
> On Mon, 22 Jan 2024 11:38:52 +0100
> Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
>
[...]
>>
>> On Wed, Jan 17, 2024 at 3:37 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>>> From: "Steven Rostedt (Google)" <rostedt@...dmis.org>
>>>
>>> The dentries and inodes are created in the readdir for the sole purpose of
>>> getting a consistent inode number. Linus stated that is unnecessary, and
>>> that all inodes can have the same inode number. For a virtual file system
>>> they are pretty meaningless.
>>>
>>> Instead use a single unique inode number for all files and one for all
>>> directories.
>>>
>>> Link: https://lore.kernel.org/all/20240116133753.2808d45e@gandalf.local.home/
>
> Yeah, Linus wanted me to try this first and see if there's any regressions.
> Well, I guess you just answered that.
>
>> This confuses "find".
>> Running "find /sys/" now prints lots of error messages to stderr:
>>
>> find: File system loop detected;
>> ‘/sys/kernel/debug/tracing/events/initcall/initcall_finish’ is part of
>> the same file system loop as
>> ‘/sys/kernel/debug/tracing/events/initcall’.
>
> So at a minimum, the directories need to have unique inode numbers.
[...]
> I'm using the eventfs_inode pointer to create a unique value for the inode.
> But it's being salted, hashed and then truncated. As it is very easy to
> read inodes (although by default, only root has access to read these
> inodes), the inode numbers themselves shouldn't be able to leak kernel
> addresses via the results of these inode numbers, would it?
Why use an improvised hashing function (re-purposed from
scripts/kconfig/symbol.c to a use-case which is exposed through a
userspace ABI prone to kernel address leaks) rather than simply
reserving values by setting bits in a bitmap ?
How many inodes do we realistically expect to have there ?
On my 6.1.0 kernel:
find /sys/kernel/tracing | wc -l
15598
(mainly due to TRACE_EVENT ABI files)
Hashing risks:
- Exposing kernel addresses if the hashing algorithm is broken,
- Collisions if users are unlucky (which could trigger those
'find' errors).
Those 15598 inode values fit within a single page (bitmap of
1922 bytes).
So I would recommend simply adding a bitmap per tracefs filesystem
instance to keep track of inode number allocation.
Creation/removal of files/directories in tracefs should not be
a fast-path anyway, so who cares about the speed of a find first
bit within a single page ?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
Powered by blists - more mailing lists