Message-ID: <CAGXxSxU+wbAfXtoAWHKD=UZ3u7pt5UNW-=OR6qCUuCJ8XZ48Eg@mail.gmail.com>
Date: Fri, 31 Jul 2015 18:09:43 +0800
From: cee1 <fykcee1@...il.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: Greg KH <gregkh@...uxfoundation.org>,
Lennart Poettering <lennart@...ttering.net>,
David Herrmann <dh.herrmann@...il.com>,
gnomes@...rguk.ukuu.org.uk, luto@...capital.net
Subject: Re: Revisit AF_BUS: is it a better way to implement KDBUS?
In a nutshell, this version of AF_BUS works as follows:
1. For privileged operations, bus endpoints send requests to the bus
master, and the bus master replies with a cmsg (control message) that,
for example, tells the kernel to assign the specified sockaddr_bus.
2. The bus master allocates every sockaddr_bus.
3. There are three kinds of sockaddr_bus:
* Normal addresses
* Multicast addresses (the last char of sbus_path is '*')
* The kernel notification address (sbus_addr == NULL)
4. It is Bloom-filter friendly (i.e. the multicast matching logic).
2015-07-30 21:09 GMT+08:00 cee1 <fykcee1@...il.com>:
> Hi all,
>
> I'm interested in the idea of AF_BUS.
>
> There have already been various discussions about it:
> * Missing the AF_BUS - https://lwn.net/Articles/504970/
> * Kroah-Hartman: AF_BUS, D-Bus, and the Linux kernel -
> http://lwn.net/Articles/537021/
> * presentation-kdbus -
> https://github.com/gregkh/presentation-kdbus/blob/master/kdbus.txt
> * Re: [GIT PULL] kdbus for 4.1-rc1 - https://lwn.net/Articles/641278/
> * The kdbuswreck - https://lwn.net/Articles/641275/
>
> I'm wondering whether it is a better way, that is, a general mechanism
> to implement various __Bus__-oriented IPCs, such as Binder[2],
> D-Bus[1], etc.
>
> The original design of AF_BUS is at
> https://github.com/Airtau/genivi/blob/master/af_bus-linux/0002-net-bus-Add-AF_BUS-documentation.patch.
> What follows is my version of AF_BUS.
>
> Some characteristics of a Bus-oriented IPC:
> 1. A process creates a Bus, the process is then called 'bus master'.
> 2. Connect to a Bus and be assigned Bus address(es).
> 3. Sending/Receiving multicast messages, in addition to P2P communication.
> 4. The implementation may be based on a shared memory model to avoid unnecessary copies.
>
> ## How to map point 1: """A process creates a Bus, the process is then
> called 'bus master'"""
> The [bus master] acts:
>
> struct sockaddr_bus {
>     sa_family_t sbus_family;               /* AF_BUS */
>     unsigned short sbus_addr_ncomp;        /* number of components of sbus_addr */
>     char sbus_path[BUS_PATH_MAX];          /* pathname of this bus */
>     uint64_t sbus_addr[BUS_ADDR_COMP_MAX]; /* address within the bus */
> };
> #define BUS_ADDR_MAX (BUS_ADDR_COMP_MAX * sizeof(uint64_t))
>
> char bus_path[] = "/tmp/test"; /* non-abstract path */
> char bus_addr[] = "org.example.bus";
> struct sockaddr_bus addr = { .sbus_family = AF_BUS };
>
> strncpy(addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
> memcpy(addr.sbus_addr, bus_addr, MIN(sizeof(bus_addr), BUS_ADDR_MAX));
> addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(bus_addr), 8) / 8, BUS_ADDR_COMP_MAX);
>
> bus_fd = socket(AF_BUS, SOCK_DGRAM, 0);
> /* creates a Bus, becomes the master of the bus */
> bind(bus_fd, (struct sockaddr *)&addr, sizeof(struct sockaddr_bus));
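The address packing used above (copy the name including its NUL, then round up to whole 64-bit components) can be sketched in isolation. This is only an illustration: AF_BUS is not in mainline, so the struct is declared locally, the values of BUS_PATH_MAX and BUS_ADDR_COMP_MAX are assumptions, and bus_addr_set_name() is a hypothetical helper, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>   /* sa_family_t */

/* Assumed limits; the proposal does not fix these values. */
#define BUS_PATH_MAX      108
#define BUS_ADDR_COMP_MAX 8
#define BUS_ADDR_MAX      (BUS_ADDR_COMP_MAX * sizeof(uint64_t))

/* Local copy of the proposed layout, for illustration only. */
struct sockaddr_bus {
    sa_family_t    sbus_family;
    unsigned short sbus_addr_ncomp;
    char           sbus_path[BUS_PATH_MAX];
    uint64_t       sbus_addr[BUS_ADDR_COMP_MAX];
};

/*
 * Hypothetical helper: pack a NUL-terminated name into sbus_addr and
 * return the number of 64-bit components it occupies.
 */
static unsigned short bus_addr_set_name(struct sockaddr_bus *sa, const char *name)
{
    size_t len;

    assert(sa != NULL && name != NULL);
    len = strlen(name) + 1;              /* count the NUL, like sizeof() on an array */
    if (len > BUS_ADDR_MAX)
        len = BUS_ADDR_MAX;              /* truncate over-long names */

    memset(sa->sbus_addr, 0, sizeof(sa->sbus_addr));
    memcpy(sa->sbus_addr, name, len);
    sa->sbus_addr_ncomp = (unsigned short)((len + 7) / 8);  /* round up to 8 bytes */
    return sa->sbus_addr_ncomp;
}
```

With these assumptions, "org.example.bus" (15 characters plus NUL) occupies two components, and "1.1" occupies one.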
>
>
> ## How to map point 2: """Connect to a Bus and be assigned Bus address(es)"""
> ### The [bus endpoint] acts:
> fd = socket(AF_BUS, SOCK_DGRAM, 0);
>
> /* AUTH message setup */
> struct msghdr msghdr = {
> .msg_name = &addr, /* bus master's addr */
> .msg_namelen = sizeof(struct sockaddr_bus),
> .msg_iov = &auth_iovec,
> .msg_iovlen = 1,
> };
>
> msghdr.msg_controllen = CMSG_SPACE(sizeof(struct ucred));
> msghdr.msg_control = alloca(msghdr.msg_controllen);
> cmsg = CMSG_FIRSTHDR(&msghdr);
> cmsg->cmsg_level = SOL_SOCKET;
> cmsg->cmsg_type = SCM_CREDENTIALS;
> cmsg->cmsg_len = CMSG_LEN(sizeof(struct ucred));
> ucred = (struct ucred *) CMSG_DATA(cmsg);
> ucred->pid = getpid();
> ucred->uid = getuid();
> ucred->gid = getgid();
>
> sendmsg(fd, &msghdr, MSG_NOSIGNAL);
>
> ### The [bus master] acts:
> int optval = 1;
> setsockopt(bus_fd, SOL_SOCKET, SO_PASSCRED, &optval, sizeof(optval));
> recvmsg(bus_fd, &msghdr, 0); /* MSG_NOSIGNAL only applies to sending */
>
> /* do AUTH ... */
>
> msghdr.msg_iov = &reply_iovec;
> msghdr.msg_iovlen = 1;
> msghdr.msg_controllen = 0;
> msghdr.msg_control = NULL;
>
> if (auth_ok) {
> /* bus master allocates a bus addr */
> char bus_path[] = "/tmp/test";
> char ret_bus_addr[] = "1.1";
> struct sockaddr_bus ret_addr = { .sbus_family = AF_BUS };
>
> strncpy(ret_addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
> memcpy(ret_addr.sbus_addr, ret_bus_addr,
> MIN(sizeof(ret_bus_addr), BUS_ADDR_MAX));
> ret_addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(ret_bus_addr), 8)
> / 8, BUS_ADDR_COMP_MAX);
>
> /*
> * 1. the bus master returns the bus addr
> * 2. the kernel applies it to the bus endpoint
> * 3. the bus endpoint is then able to talk with endpoints on the bus.
> */
> msghdr.msg_controllen = CMSG_SPACE(sizeof(struct sockaddr_bus));
> msghdr.msg_control = alloca(msghdr.msg_controllen);
> cmsg = CMSG_FIRSTHDR(&msghdr);
> cmsg->cmsg_level = BUS_SOCKET;
> cmsg->cmsg_type = SCM_OWNED_ADDR;
> cmsg->cmsg_len = CMSG_LEN(sizeof(struct sockaddr_bus));
> memcpy(CMSG_DATA(cmsg), &ret_addr, sizeof(struct sockaddr_bus));
> }
> sendmsg(bus_fd, &msghdr, MSG_NOSIGNAL);
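The credential part of this handshake can already be tried today on an AF_UNIX datagram socketpair, which uses the same SCM_CREDENTIALS and SO_PASSCRED mechanism. The sketch below is an analogue under that assumption, not AF_BUS itself: the "endpoint" and "master" are two ends of one socketpair in a single process, and send_auth(), recv_cred() and cred_roundtrip() are hypothetical names.

```c
#define _GNU_SOURCE            /* for struct ucred */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for one struct ucred. */
union cred_ctl {
    char buf[CMSG_SPACE(sizeof(struct ucred))];
    struct cmsghdr align;
};

/* Endpoint side: send an AUTH payload with explicit credentials attached. */
static int send_auth(int fd)
{
    char auth[] = "AUTH";
    struct iovec iov = { .iov_base = auth, .iov_len = sizeof(auth) };
    struct ucred cred = { .pid = getpid(), .uid = getuid(), .gid = getgid() };
    union cred_ctl ctl;

    memset(&ctl, 0, sizeof(ctl));
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_CREDENTIALS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(cred));
    memcpy(CMSG_DATA(cmsg), &cred, sizeof(cred));
    return sendmsg(fd, &msg, MSG_NOSIGNAL) < 0 ? -1 : 0;
}

/* Master side: receive the message and extract the kernel-checked credentials. */
static int recv_cred(int fd, struct ucred *out)
{
    char payload[16];
    struct iovec iov = { .iov_base = payload, .iov_len = sizeof(payload) };
    union cred_ctl ctl;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
    };

    assert(out != NULL);
    if (recvmsg(fd, &msg, 0) < 0)
        return -1;
    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c; c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == SOL_SOCKET && c->cmsg_type == SCM_CREDENTIALS) {
            memcpy(out, CMSG_DATA(c), sizeof(*out));
            return 0;
        }
    }
    return -1;
}

/* Run both halves over a socketpair; returns 0 and fills *out on success. */
static int cred_roundtrip(struct ucred *out)
{
    int sv[2], on = 1, rc = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    /* The receiver must opt in before the message is sent. */
    if (setsockopt(sv[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) == 0 &&
        send_auth(sv[0]) == 0)
        rc = recv_cred(sv[1], out);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

In the AF_BUS proposal the master would then answer with a BUS_SOCKET/SCM_OWNED_ADDR control message; that step has no existing kernel counterpart, so it is left out of the sketch.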
>
>
> ## How to map point 3: """Sending/Receiving multicast messages, in
> addition to P2P communication""".
> ### P2P communication
> Sometimes a bus endpoint may be assigned multiple addresses, and it may
> want to send a message from a specific one.
>
> struct msghdr msghdr = {
> .msg_name = &dst_addr,
> .msg_namelen = sizeof(struct sockaddr_bus),
> .msg_iov = &msg_iovec,
> .msg_iovlen = 1,
> };
>
> char bus_path[] = "/tmp/test";
> char bus_addr[] = "com.example.service1";
> struct sockaddr_bus src_addr = { .sbus_family = AF_BUS };
>
> strncpy(src_addr.sbus_path, bus_path, BUS_PATH_MAX - 2);
> memcpy(src_addr.sbus_addr, bus_addr, MIN(sizeof(bus_addr), BUS_ADDR_MAX));
> src_addr.sbus_addr_ncomp = MIN(ALIGN(sizeof(bus_addr), 8) / 8,
> BUS_ADDR_COMP_MAX);
>
> msghdr.msg_controllen = CMSG_SPACE(sizeof(struct sockaddr_bus));
> msghdr.msg_control = alloca(msghdr.msg_controllen);
> cmsg = CMSG_FIRSTHDR(&msghdr);
> cmsg->cmsg_level = BUS_SOCKET;
> cmsg->cmsg_type = SCM_SRC_ADDR;
> cmsg->cmsg_len = CMSG_LEN(sizeof(struct sockaddr_bus));
> memcpy(CMSG_DATA(cmsg), &src_addr, sizeof(struct sockaddr_bus));
>
> sendmsg(my_sock_fd, &msghdr, MSG_NOSIGNAL);
>
> ### Multicast
> The multicast address may look like:
> {
> .sbus_family = AF_BUS,
>
> /* In a multicast addr, sbus_path is '*'-terminated */
> .sbus_path = "/tmp/test\0\0\0\0\0...*",
>
> .sbus_addr_ncomp = 8,
> .sbus_addr = /* 8 * 64-bit bit array, for example */
> }
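A small sketch of the path convention described above, assuming (as the example layout suggests) that the marker sits in the very last byte of sbus_path and that the pathname itself stays NUL-terminated. BUS_PATH_MAX's value and both helper names are hypothetical. Copying at most BUS_PATH_MAX - 2 bytes, as the earlier snippets do, would leave room for a terminating NUL plus this marker byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define BUS_PATH_MAX 108   /* assumed value, not given in the proposal */

/* Hypothetical helper: fill sbus_path, optionally marking it as multicast. */
static void bus_path_set(char path[BUS_PATH_MAX], const char *name, bool multicast)
{
    memset(path, 0, BUS_PATH_MAX);
    strncpy(path, name, BUS_PATH_MAX - 2);   /* keep a NUL at index BUS_PATH_MAX - 2 */
    if (multicast)
        path[BUS_PATH_MAX - 1] = '*';        /* marker lives in the last byte */
}

/* Hypothetical helper: a path is multicast if its last byte is '*'. */
static bool bus_path_is_multicast(const char path[BUS_PATH_MAX])
{
    assert(path != NULL);
    return path[BUS_PATH_MAX - 1] == '*';
}
```

Because the marker is outside the NUL-terminated string, ordinary string functions still see "/tmp/test" for both the unicast and the multicast form of the address.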
>
> The receiver asks the [bus master] for permission to receive
> messages from a set of multicast addresses, and the bus master grants
> it by replying with a control message:
> {
> .cmsg_level = BUS_SOCKET,
> .cmsg_type = SCM_MULTICAST_MATCH,
> .cmsg_data = /* the requested struct sockaddr_bus */
> }
>
> How does matching happen?
> Let's assume someone sends a message to multicast address maddr1, and
> the receiver has been granted a match of maddr2:
>
> The [kernel]:
> is_matched = (maddr1 & maddr2) == maddr2;
>
> In this way, userspace can deploy Bloom filters, and may then
> further apply eBPF to filter out "false positive" cases.
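To make the matching rule concrete, here is a userspace sketch of the subset test applied component by component, together with one possible way to fill the bit array. The hashing scheme (two FNV-1a probes per key) is purely an assumption for illustration; the proposal leaves the Bloom filter layout to userspace, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BUS_ADDR_COMP_MAX 8                      /* assumed: 8 x 64 = 512 bits */
#define BLOOM_BITS        (BUS_ADDR_COMP_MAX * 64)

/* 64-bit FNV-1a, perturbed by a seed so that two probes differ. */
static uint64_t fnv1a(const char *s, uint64_t seed)
{
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 1099511628211ULL;
    }
    return h;
}

/* Set the filter bits for one key (e.g. an interface or signal name). */
static void bloom_add(uint64_t bits[BUS_ADDR_COMP_MAX], const char *key)
{
    assert(bits != NULL && key != NULL);
    for (uint64_t probe = 0; probe < 2; probe++) {
        uint64_t bit = fnv1a(key, probe) % BLOOM_BITS;
        bits[bit / 64] |= UINT64_C(1) << (bit % 64);
    }
}

/*
 * The proposed kernel test: every bit the receiver was granted (maddr2)
 * must also be set in the destination address (maddr1).
 */
static bool bus_multicast_match(const uint64_t *maddr1, const uint64_t *maddr2,
                                unsigned ncomp)
{
    for (unsigned i = 0; i < ncomp; i++)
        if ((maddr1[i] & maddr2[i]) != maddr2[i])
            return false;
    return true;
}
```

A sender would add every property of the message to maddr1, while a receiver adds only the properties it requires to maddr2. A match can be a false positive when unrelated keys happen to hash onto the same bits, which is where a second filtering stage would come in.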
>
> ## How to avoid unnecessary copy?
> A sockopt similar to PACKET_RX_RING[3] may be introduced, which brings
> a mmap/shared memory style API.
>
>
> ## Other thoughts
> 1. The bus master may want to receive notifications from the kernel,
> such as "a bus endpoint died". A special sockaddr_bus "{
> .sbus_addr_ncomp = 0, .sbus_addr = NULL }" indicates a message from
> kernel.
> 2. A bus endpoint may pass a memfd to another bus endpoint, and then
> they communicate under the mmap/shared memory model if they need ultimate
> performance.
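The memfd idea can be tried with existing APIs: memfd_create(), SCM_RIGHTS over an AF_UNIX socket, and mmap(). The sketch below stands in for two bus endpoints with the two ends of a socketpair in one process; that setup, the 4096-byte region size and all function names are assumptions made for the example.

```c
#define _GNU_SOURCE            /* for memfd_create() */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for one file descriptor. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Pass fd to the peer as SCM_RIGHTS ancillary data (with a 1-byte payload). */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;

    memset(&ctl, 0, sizeof(ctl));
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
    return sendmsg(sock, &msg, MSG_NOSIGNAL) < 0 ? -1 : 0;
}

/* Receive one descriptor; returns the new fd or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctl.buf, .msg_controllen = sizeof(ctl.buf),
    };
    int fd;

    if (recvmsg(sock, &msg, 0) < 0)
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* Write text via the sender's mapping, read it back via the receiver's. */
static int share_via_memfd(const char *text, char *out, size_t outlen)
{
    enum { REGION = 4096 };
    int sv[2], rc = -1;

    assert(text != NULL && out != NULL && outlen > 0 && strlen(text) < REGION);
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;
    int mfd = memfd_create("bus-payload", MFD_CLOEXEC);
    if (mfd >= 0 && ftruncate(mfd, REGION) == 0 && send_fd(sv[0], mfd) == 0) {
        int rfd = recv_fd(sv[1]);
        if (rfd >= 0) {
            char *tx = mmap(NULL, REGION, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0);
            char *rx = mmap(NULL, REGION, PROT_READ, MAP_SHARED, rfd, 0);
            if (tx != MAP_FAILED && rx != MAP_FAILED) {
                strcpy(tx, text);              /* no copy through the socket */
                strncpy(out, rx, outlen - 1);
                out[outlen - 1] = '\0';
                rc = 0;
            }
            if (tx != MAP_FAILED) munmap(tx, REGION);
            if (rx != MAP_FAILED) munmap(rx, REGION);
            close(rfd);
        }
    }
    if (mfd >= 0)
        close(mfd);
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Only the descriptor crosses the socket; the payload is visible to the receiver as soon as it is written into the shared mapping.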
>
>
>
> ---
> 1. http://www.freedesktop.org/wiki/Software/dbus/
> 2. http://elinux.org/Android_Binder
> 3. http://man7.org/linux/man-pages/man7/packet.7.html
>
>
>
> Regards,
>
> - cee1
--
Regards,
- cee1