Message-Id: <930B6F39-4174-46C2-B556-E98ED72E27F8@amacapital.net>
Date: Fri, 6 Sep 2019 10:14:17 -0700
From: Andy Lutomirski <luto@...capital.net>
To: Steven Whitehouse <swhiteho@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
David Howells <dhowells@...hat.com>,
Ray Strode <rstrode@...hat.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Nicolas Dichtel <nicolas.dichtel@...nd.com>, raven@...maw.net,
keyrings@...r.kernel.org, linux-usb@...r.kernel.org,
linux-block <linux-block@...r.kernel.org>,
Christian Brauner <christian@...uner.io>,
LSM List <linux-security-module@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Al Viro <viro@...iv.linux.org.uk>,
"Ray, Debarshi" <debarshi.ray@...il.com>,
Robbie Harwood <rharwood@...hat.com>
Subject: Re: Why add the general notification queue and its sources
> On Sep 6, 2019, at 9:12 AM, Steven Whitehouse <swhiteho@...hat.com> wrote:
>
> Hi,
>
>> On 06/09/2019 16:53, Linus Torvalds wrote:
>> On Fri, Sep 6, 2019 at 8:35 AM Linus Torvalds
>> <torvalds@...ux-foundation.org> wrote:
>>> This is why I like pipes. You can use them today. They are simple, and
>>> extensible, and you don't need to come up with a new subsystem and
>>> some untested ad-hoc thing that nobody has actually used.
>> The only _real_ complexity is to make sure that events are reliably parseable.
>>
>> That's where you really want to use the Linux-only "packet pipe"
>> thing, because otherwise you have to have size markers or other things
>> to delineate events. But if you do that, then it really becomes
>> trivial.
>>
>> And I checked, we made it available to user space, even if the
>> original reason for that code was kernel-only autofs use: you just
>> need to make the pipe be O_DIRECT.
>>
>> This overly stupid program shows off the feature:
>>
>> #define _GNU_SOURCE
>> #include <fcntl.h>
>> #include <unistd.h>
>>
>> int main(int argc, char **argv)
>> {
>>         int fd[2];
>>         char buf[10];
>>
>>         pipe2(fd, O_DIRECT | O_NONBLOCK);
>>         write(fd[1], "hello", 5);
>>         write(fd[1], "hi", 2);
>>         read(fd[0], buf, sizeof(buf));
>>         read(fd[0], buf, sizeof(buf));
>>         return 0;
>> }
>>
>> and if you strace it (because I was too lazy to add error handling or
>> printing of results), you'll see
>>
>> write(4, "hello", 5) = 5
>> write(4, "hi", 2) = 2
>> read(3, "hello", 10) = 5
>> read(3, "hi", 10) = 2
>>
>> note how you got packets of data on the reader side, instead of
>> getting the traditional "just buffer it as a stream".
>>
>> So now you can even have multiple readers of the same event pipe, and
>> packetization is obvious and trivial. Of course, I'm not sure why
>> you'd want to have multiple readers, and you'd lose _ordering_, but if
>> all events are independent, this _might_ be a useful thing in a
>> threaded environment. Maybe.
>>
>> (Side note: a zero-sized write will not cause a zero-sized packet. It
>> will just be dropped).
>>
>> Linus
>
> The events are generally not independent - we would need ordering, either implicit in the protocol or explicit in the messages. We also need to know when messages are dropped - it doesn't need to be anything fancy, just some indication that, since we last did a read, messages were lost, most likely due to buffer overrun.
This could be a bit fancier: if the pipe recorded the bitwise OR of the first few bytes of each dropped message, then messages could set a bit in their header indicating their type, and readers could then learn which *types* of messages were dropped.
Or they could just use multiple pipes.
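
To make the dropped-type idea concrete, here is a rough userspace sketch. It is only a model, not kernel code and not an existing pipe feature: struct msg_queue, mq_post(), mq_read(), QUEUE_SLOTS and MSG_MAX are all made-up names, and it assumes the type bits live in the first byte of each message rather than the first few bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_SLOTS 4   /* hypothetical queue capacity, in messages */
#define MSG_MAX     16  /* hypothetical maximum message size, in bytes */

/* Toy bounded message queue that remembers which types were dropped. */
struct msg_queue {
        uint8_t  data[QUEUE_SLOTS][MSG_MAX];
        size_t   len[QUEUE_SLOTS];
        unsigned head, count;
        uint8_t  dropped_types; /* OR of byte 0 of every dropped message */
};

/* Post a message. Returns 0 if queued, -1 if it was dropped on overrun. */
static int mq_post(struct msg_queue *q, const void *msg, size_t len)
{
        assert(len > 0 && len <= MSG_MAX);

        if (q->count == QUEUE_SLOTS) {
                /* Queue full: drop the message but keep its type bits. */
                q->dropped_types |= *(const uint8_t *)msg;
                return -1;
        }

        unsigned tail = (q->head + q->count) % QUEUE_SLOTS;
        memcpy(q->data[tail], msg, len);
        q->len[tail] = len;
        q->count++;
        return 0;
}

/*
 * Read the oldest message. Returns its length, or 0 if the queue is empty.
 * *dropped receives the accumulated drop mask, which is then cleared, so
 * it describes what was lost since the previous read.
 */
static size_t mq_read(struct msg_queue *q, void *buf, size_t buflen,
                      uint8_t *dropped)
{
        *dropped = q->dropped_types;
        q->dropped_types = 0;

        if (q->count == 0)
                return 0;

        size_t n = q->len[q->head] < buflen ? q->len[q->head] : buflen;
        memcpy(buf, q->data[q->head], n);
        q->head = (q->head + 1) % QUEUE_SLOTS;
        q->count--;
        return n;
}
```

With one bit per event source in the first byte, a reader that sees a nonzero mask knows which sources it needs to resynchronize with, without being told how many messages were lost.
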
If this whole mechanism catches on, I wonder if implementing recvmmsg() on pipes would be worthwhile.
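
For comparison, this is roughly what a reader has to do today to pull several events out of an O_DIRECT pipe: one read() per packet, until the pipe would block. It is only a sketch; struct pkt, drain_packets() and PKT_MAX are made-up names, and it assumes the read end is O_NONBLOCK and that no event is larger than PKT_MAX.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define PKT_MAX 64      /* hypothetical upper bound on one event's size */

struct pkt {
        char   data[PKT_MAX];
        size_t len;
};

/*
 * Read up to max packets from a non-blocking packet-mode pipe.
 * Returns the number of packets read, or -1 on a read error.
 * Each successful read() returns at most one packet; a packet longer
 * than PKT_MAX would have its excess bytes discarded by the kernel.
 */
static int drain_packets(int fd, struct pkt *pkts, int max)
{
        int n = 0;

        assert(pkts != NULL && max >= 0);

        while (n < max) {
                ssize_t r = read(fd, pkts[n].data, PKT_MAX);

                if (r > 0) {
                        pkts[n].len = (size_t)r;
                        n++;
                } else if (r == 0) {
                        break;          /* every writer has closed */
                } else if (errno == EINTR) {
                        continue;       /* interrupted, try again */
                } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
                        break;          /* pipe is empty for now */
                } else {
                        return -1;
                }
        }
        return n;
}
```

A batched receive on pipes would collapse that loop into a single system call, which only matters if the event rate is high enough for the per-read() overhead to show up.
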