Message-ID: <CAHk-=wjex4GE-HXFNPzi+xE+w2hkZTQrACgAaScNdf-8hnMHKA@mail.gmail.com>
Date: Mon, 15 May 2023 12:13:18 -0700
From: Linus Torvalds <torvalds@...uxfoundation.org>
To: Christian Brauner <brauner@...nel.org>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@...dex-team.ru>,
Alexander Viro <viro@...iv.linux.org.uk>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
ptikhomirov@...tuozzo.com, Andrey Ryabinin <arbn@...dex-team.com>
Subject: Re: [PATCH] fs/coredump: open coredump file in O_WRONLY instead of O_RDWR
On Mon, May 15, 2023 at 11:50 AM Linus Torvalds
<torvalds@...uxfoundation.org> wrote:
>
> It's strange, because the "O_WRONLY" -> "2" change that changes to a
> magic raw number is right next to changing "(unsigned short) 0x10" to
> "KERNEL_DS", so we're getting *rid* of a magic raw number there.
Oh, no, never mind. I see what is going on.
Back then, "open_namei()" didn't actually take O_RDWR style flags AT ALL.
The O_RDONLY-style flags are broken, because you cannot say "open with no
permissions", which we used internally. You have
    0 - read-only
    1 - write-only
    2 - read-write
but the internal code actually wants to match that up with the
read-write permission bits (FMODE_READ etc).
And then we've long had a special value for "open for special
accesses" (format etc), which (naturally) was 3.
So then the open code would do
        f->f_flags = flag = flags;
        f->f_mode = (flag+1) & O_ACCMODE;
        if (f->f_mode)
                flag++;
which means that "f_mode" now becomes that FMODE_READ | FMODE_WRITE
mask, and "flag" ends up being a translation from that O_RDWR space
(0/1/2/3) into the FMODE_READ/WRITE space (1/2/3/3, where "special"
required read-write permissions, and 0 was only used for symlinks).
We still have that, although the code looks different.
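
For illustration only, the translation described above can be modeled in
userspace C. This is a sketch, not the historical kernel code: the struct,
the function names and the ACC_* constants are hypothetical, and the
access-mode mask is hard-coded to 3 as an assumption.

```c
#include <assert.h>

/* Hypothetical names for the user-visible access-mode space (0/1/2/3). */
enum { ACC_RDONLY = 0, ACC_WRONLY = 1, ACC_RDWR = 2, ACC_SPECIAL = 3 };
#define ACC_MASK 3 /* assumed value of O_ACCMODE */

/* Permission-bit space: read = 1, write = 2, both = 3. */
#define FMODE_READ  1
#define FMODE_WRITE 2

struct translated {
    int f_mode; /* permission bits stored in the file struct */
    int flag;   /* value handed on to the lookup code */
};

/* Mirrors the three lines quoted above: add one, mask, and bump the
 * flag only when the resulting mode is non-zero. */
static struct translated translate_acc(int flags)
{
    struct translated t;

    t.flag = flags;
    t.f_mode = (t.flag + 1) & ACC_MASK;
    if (t.f_mode)
        t.flag++;
    return t;
}

/* Self-check of the 0/1/2/3 -> 1/2/3/3 mapping for "flag". */
static void check_translate_acc(void)
{
    assert(translate_acc(ACC_RDONLY).flag == FMODE_READ);
    assert(translate_acc(ACC_WRONLY).flag == FMODE_WRITE);
    assert(translate_acc(ACC_RDWR).flag == (FMODE_READ | FMODE_WRITE));
    assert(translate_acc(ACC_SPECIAL).flag == (FMODE_READ | FMODE_WRITE));
}
```

Note that for the special value 3 the masked sum wraps to zero, so f_mode
ends up 0 while "flag" stays at 3.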
So back then, "open_namei()" took that FMODE_READ/WRITE flag as an
argument, and the "O_WRONLY" -> "2" change is actually a bugfix and
makes sense. The O_WRONLY thing was wrong, because it was 1, which
actually meant FMODE_READ.
And back then, we didn't *have* FMODE_READ and FMODE_WRITE.
So just writing it as "2" made sense, even if it was horrible. We
added FMODE_WRITE later, but never fixed up those core file writers.
So that 0.99pl10 commit from 1993 is actually correct, and the bug
happened *later*.
I think the real bug may have been in 2.2.4pre4 (February 16, 1999),
when this happened:
-       dentry = open_namei(corefile,O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);
...
+       file = filp_open(corefile,O_CREAT | 2 | O_TRUNC | O_NOFOLLOW, 0600);
without realizing that the "2" in open_namei() should have become an
O_WRONLY for filp_open().
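
A small userspace sketch of why the raw "2" changes meaning between the two
calls. It assumes the usual Linux values (O_WRONLY == 1, O_RDWR == 2,
O_ACCMODE == 3); the helper names are hypothetical, and O_NOFOLLOW is left
out only to keep the example portable.

```c
#include <assert.h>
#include <fcntl.h>

/* Permission-bit space, as taken by the old open_namei() argument. */
#define FMODE_READ  1
#define FMODE_WRITE 2

/* Interpret the low two bits as permission bits (mask 3 is assumed). */
static int fmode_is_write_only(int v)
{
    return (v & 3) == FMODE_WRITE;
}

/* Interpret the same bits as a user-visible O_* access mode. */
static int oflag_is_write_only(int v)
{
    return (v & O_ACCMODE) == O_WRONLY;
}

static int oflag_is_read_write(int v)
{
    return (v & O_ACCMODE) == O_RDWR;
}

/* The same literal is write-only in one space and read-write in the other. */
static void check_raw_two(void)
{
    int raw = O_CREAT | 2 | O_TRUNC;

    assert(fmode_is_write_only(raw));  /* permission-bit reading */
    assert(oflag_is_read_write(raw));  /* O_* reading: this is O_RDWR */
    assert(!oflag_is_write_only(raw));
}
```

Under those assumptions, spelling the flag as O_WRONLY instead of 2 is what
keeps the open write-only in the O_* space.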
So I think this explains it all.
Very understandable mistake after all.
Linus