linux-kernel - Re: [PATCH] eventfs: Have inodes have unique inode numbers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wg0bfqL9Yn-KnamLTvTpw+zbAa=og_JRPjZHgJ5m9iCZA@mail.gmail.com>
Date: Sun, 28 Jan 2024 14:17:43 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Masami Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, 
	LKML <linux-kernel@...r.kernel.org>, 
	Linux Trace Devel <linux-trace-devel@...r.kernel.org>, Christian Brauner <brauner@...nel.org>, 
	Ajay Kaher <ajay.kaher@...adcom.com>, Geert Uytterhoeven <geert@...ux-m68k.org>, 
	linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH] eventfs: Have inodes have unique inode numbers

On Sun, 28 Jan 2024 at 14:01, Steven Rostedt <rostedt@...dmis.org> wrote:
>
> Basically you are saying that when the ei is created, it should have a
> ref count of 1. If the lookup happens and does the
> atomic_inc_not_zero() it will only increment if the ref count is not 0
> (which is basically equivalent to is_freed).

Exactly.

That's what we do with almost all our data structures.

You can use the kref() infrastructure to give this a bit more
structure, and avoid doing the atomics by hand. So then "get a ref" is
literally

    kref_get(&ei->kref);

and releasing it is

    kref_put(&ei->kref, ei_release);

so you don't have the write out that "if (atomic_dec_and_test(..) kfree();".

And "ei_release()" looks just something like this:

    static void ei_release(struct kref *ref)
    {
        kfree(container_of(ref, struct eventfs_inode, kref);
    }

and that's literally all you need (ok, you need to add the 'kref' to
the eventfs_inode, and initialize it at allocation time, of course).

> And then we can have deletion of the object happen in both where the
> caller (kprobes) deletes the directory and in the final iput()
> reference (can I use iput and not the d_release()?), that it does the
> same as well.
>
> Where whatever sees the refcount of zero calls rcu_free?

Having looked more at your code, I actually don't see you even needing
rcu_free().

It's not that you don't need SRCU (which is *much* heavier than RCU) -
you don't even need the regular quick RCU at all.

As far as I can see, you do all the lookups under eventfs_mutex, so
there are actually no RCU users.

And the VFS code (that obviously uses RCU internally) has a ref to it
and just a direct pointer, so again, there's no real RCU involved.

You seem to have used SRCU as a "I don't want to do refcounts" thing.
I bet you'll notice that it clarifies things *enormously* to just use
refcounts.

                Linus