Message-ID: <8e60555e-9247-e03f-e8b4-1d31f70f1221@redhat.com>
Date: Fri, 6 Sep 2019 17:12:13 +0100
From: Steven Whitehouse <swhiteho@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>,
David Howells <dhowells@...hat.com>
Cc: Ray Strode <rstrode@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>, raven@...maw.net,
keyrings@...r.kernel.org, linux-usb@...r.kernel.org,
linux-block <linux-block@...r.kernel.org>,
Christian Brauner <christian@...uner.io>,
LSM List <linux-security-module@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
"Ray, Debarshi" <debarshi.ray@...il.com>,
Robbie Harwood <rharwood@...hat.com>
Subject: Re: Why add the general notification queue and its sources
Hi,
On 06/09/2019 16:53, Linus Torvalds wrote:
> On Fri, Sep 6, 2019 at 8:35 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
>> This is why I like pipes. You can use them today. They are simple, and
>> extensible, and you don't need to come up with a new subsystem and
>> some untested ad-hoc thing that nobody has actually used.
> The only _real_ complexity is to make sure that events are reliably parseable.
>
> That's where you really want to use the Linux-only "packet pipe"
> thing, because otherwise you have to have size markers or other things
> to delineate events. But if you do that, then it really becomes
> trivial.
>
> And I checked, we made it available to user space, even if the
> original reason for that code was kernel-only autofs use: you just
> need to make the pipe be O_DIRECT.
>
> This overly stupid program shows off the feature:
>
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <unistd.h>
>
> int main(int argc, char **argv)
> {
>         int fd[2];
>         char buf[10];
>
>         pipe2(fd, O_DIRECT | O_NONBLOCK);
>         write(fd[1], "hello", 5);
>         write(fd[1], "hi", 2);
>         read(fd[0], buf, sizeof(buf));
>         read(fd[0], buf, sizeof(buf));
>         return 0;
> }
>
> and if you strace it (because I was too lazy to add error handling or
> printing of results), you'll see
>
> write(4, "hello", 5) = 5
> write(4, "hi", 2) = 2
> read(3, "hello", 10) = 5
> read(3, "hi", 10) = 2
>
> note how you got packets of data on the reader side, instead of
> getting the traditional "just buffer it as a stream".
>
> So now you can even have multiple readers of the same event pipe, and
> packetization is obvious and trivial. Of course, I'm not sure why
> you'd want to have multiple readers, and you'd lose _ordering_, but if
> all events are independent, this _might_ be a useful thing in a
> threaded environment. Maybe.
>
> (Side note: a zero-sized write will not cause a zero-sized packet. It
> will just be dropped).
>
> Linus
The events are generally not independent - we would need ordering, either
implicit in the protocol or explicit in the messages. We also need to
know when messages have been dropped - it doesn't need to be anything
fancy, just some indication that, since we last did a read, messages
have been lost, most likely due to buffer overrun.
That is why the initial idea was to use netlink, since it solves a lot
of those issues. The downside was that the indirect nature of netlink
sockets made it tricky to know the namespace of the process to which
the message was to be delivered (and hence whether it should be
delivered at all).
Steve.