lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 29 Jan 2024 15:26:00 -0500
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: kernel test robot <oliver.sang@...el.com>, oe-lkp@...ts.linux.dev,
 lkp@...el.com, linux-kernel@...r.kernel.org, Masami Hiramatsu
 <mhiramat@...nel.org>, Mark Rutland <mark.rutland@....com>, Mathieu
 Desnoyers <mathieu.desnoyers@...icios.com>, Christian Brauner
 <brauner@...nel.org>, Al Viro <viro@...iv.linux.org.uk>, Ajay Kaher
 <ajay.kaher@...adcom.com>, linux-trace-kernel@...r.kernel.org
Subject: Re: [linus:master] [eventfs] 852e46e239:
 BUG:unable_to_handle_page_fault_for_address

On Mon, 29 Jan 2024 11:51:52 -0800
Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Mon, 29 Jan 2024 at 11:24, Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > So the patch was completely broken. Here's the one that should
> > actually compile (although still not actually *tested*).  
> 
> Note that this fixes the d_instantiate() ordering wrt initializing the inode.
> 
> But as I look up the call chain, I see many more fundamental mistakes.
> 
> Steven - the reason you think that the VFS doesn't have documentation
> is that we *do* have tons of documentation, but it's of the kind "Here
> is what you should do".
> 
> It is *not* of the kind that says "You messed up and did something
> else, and how do you recover from it?".

When I first worked on this, I did read all the VFS documentation, and it
was difficult to understand what I needed and what I didn't. It is focused
on a real file system and not a pseudo ones. Another mistake I made was
thinking that debugfs was doing things the "right" way as well. And having
a good idea of how debugfs worked, and thinking it was correct, just made
reading VFS documentation even more difficult as I couldn't relate what I
knew (about debugfs) with what was being explained. I thought it was
perfectly fine to use dentry as a handle for the file system. I did until
you told me it wasn't. That made a profound change in my understanding of
how things are supposed to work.

> 
> So the fundamental bug I now find is that eventfs_root_lookup() gets a
> target dentry, and for some unfathomable reason it then does
> 
>         ret = simple_lookup(dir, dentry, flags);
> 
> on it. Which is *completely* broken, because what "simple_lookup()"
> does is just say "oh, you didn't have a dentry of this kind before, so
> clearly a lookup must be a non-existent file". Remember: this is for
> 'tmpfs' kinds of filesystems where the dentry cache cotnains *ALL*
> files.

Sorry, I don't really understand what you mean by "ALL files"? You mean
that all files in the pseudo file system has a dentry to it (like debugfs,
and the rest of tracefs)?

> 
> For the tracefs kind of filesystem, it's TOTALLY BOGUS. What the
> "simple_lookup()" will do is just a plain
> 
>         d_add(dentry, NULL);
> 
> and nothing else. And guess what *that* does? It basically
> instantiates a negative dentry, telling all other lookups that the
> path does not exist.
> 
> So if you have two concurrent lookups, one will do that
> simple_lookup(), and the other will then - depending on timing -
> either see the negative dentry and return -ENOENT, or - if it comes in
> a bit later - see the new inode that then later gets added by the
> first lookup with d_instantiate().
> 
> See? That simple_lookup() is not just unnecessary, but it's also
> actively completely WRONG. Because it instantiates a NULL pointer,
> other processes that race with the lookup may now end up saying "that
> file doesn't exist", even though it should.
> 
> Basically, you can't use *any* of the "simple" filesystem helpers.
> Because they are all designed for that "the dentry tree is all there
> is" case.

Yeah, the above code did come about with me not fully understanding the
above. It's not that it wasn't documented, but I admit, when I read the VFS
documentation, I had a lot of trouble trying to make sense of things like
negative dentries and how they relate.

I now have a much better understanding of most of this, thanks to our
discussion here, and also knowing that using dentry as the main handle to a
file is *not* how to do it. When thinking it was, it made things much more
difficult to comprehend.

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ