[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+55aFxt1FFJ0DPpHcANvHiOO1JLcupDok+mHKG7D+zzC-TeYw@mail.gmail.com>
Date: Mon, 16 Apr 2018 12:19:51 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Sasha Levin <Alexander.Levin@...rosoft.com>,
Pavel Machek <pavel@....cz>, Petr Mladek <pmladek@...e.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Cong Wang <xiyou.wangcong@...il.com>,
Dave Hansen <dave.hansen@...el.com>,
Johannes Weiner <hannes@...xchg.org>,
Mel Gorman <mgorman@...e.de>, Michal Hocko <mhocko@...nel.org>,
Vlastimil Babka <vbabka@...e.cz>,
Peter Zijlstra <peterz@...radead.org>, Jan Kara <jack@...e.cz>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Byungchul Park <byungchul.park@....com>,
Tejun Heo <tj@...nel.org>, Greg KH <gregkh@...uxfoundation.org>
Subject: Re: [PATCH AUTOSEL for 4.14 015/161] printk: Add console owner and
waiter logic to load balance console writes
On Mon, Apr 16, 2018 at 11:52 AM, Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> And yes, sometimes that means jumping through hoops. But that's what
> it takes to keep users happy.
The example of "jumping through hoops" I tend to give is the pipe "packet mode".
The kernel actually has a magic pipe mode for "packet buffers", so
that if you do two small writes, the other side of the pipe can
actually say "I want to read not the pipe buffers, but the individual
packets".
Why would we do that? That's not how pipes work! If you want to send
and receive messages, use a socket, for chrissake! A pipe is just a
stream of bytes - as the elder Gods of Unix decreed!
But it turns out that we added the notion of a packetized pipe writer,
and you can actually access it in user space by setting the O_DIRECT
flag (so after you do the "pipe()" system call, do a fcntl(SETFL,
O_DIRECT) on it).
Absolutely nobody uses it, afaik, because you'd be crazy to do it.
What would be the point? sockets work, and are portable.
So why do we have it?
We have it for one single user: autofs. The way you talk to the kernel
side of things is with a magic pipe that you get at mount time, and
the user-space autofs daemon might be 32-bit even if the kernel is
64-bit, and we had a horrible ABI mistake which meant that sending the
binary data over that pipe had different format for a 32-bit autofsd.
And systemd and automount had different workarounds (or lack
there-of), for the ABI issue.
So the whole "ok, allow people to send packets, and read them as
packets" came about purely because the ABI was broken, and there was
no other way to fix things.
See 64f371bc3107 ("autofs: make the autofsv5 packet file descriptor
use a packetized pipe") for (some) of the sad details.
That commit kind of makes it sound like it's a nice solution that just
takes advantage of the packetized pipes. Nice and clean fix, right?
No. The packetized pipes exist in the first place _purely_ to make
that nice solution possible. It's literally just a "this allows us to
be ABI compatible with two different users that were confused about
the compatibility issue we had due to a broken binary structure format
acrss x86-32 and x86-64".
See commit 9883035ae7ed ("pipes: add a "packetized pipe" mode for
writing") for the other side of that.
All this just because _we_ made a mistake in our ABI, and then real
life users started using that mistake, including one user that
literally *knew* about the mistake and worked around it and started
depending on the fact t hat our compatibility mode was buggy because
of it.
So it was a bug in our ABI. But since people depended on the bug, the
bug was a feature, and needed to be kept around. In this case by
adding a totally new and unrelated feature, and using that new feature
to make those old users happy. The whole "set packetized mode on the
autofs pipe" is all done transparently inside the kernel, and
automount never knew or needed to care that we started to use a
packetized pipe to get the data it sent us in the chunks it intended.
Linus
Powered by blists - more mailing lists