lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250622041200.GA3703@wendao-VirtualBox>
Date: Sun, 22 Jun 2025 12:12:00 +0800
From: Dawei Li <dawei.li@...ux.dev>
To: Arnaud POULIQUEN <arnaud.pouliquen@...s.st.com>
Cc: andersson@...nel.org, mathieu.poirier@...aro.org,
	linux-remoteproc@...r.kernel.org, linux-kernel@...r.kernel.org,
	set_pte_at@...look.com, dawei.li@...ux.dev
Subject: Re: [PATCH v4 0/3] rpmsg: Introduce RPMSG_CREATE_EPT_FD_IOCTL uAPI

Hi Arnaud,

Thanks for the reply.

On Fri, Jun 20, 2025 at 09:52:03AM +0200, Arnaud POULIQUEN wrote:
> 
> 
> On 6/19/25 16:43, Dawei Li wrote:
> > Hi Arnaud, 
> > Thanks for review.
> > 
> > On Wed, Jun 18, 2025 at 03:07:36PM +0200, Arnaud POULIQUEN wrote:
> >> Hello Dawei,
> >>
> >>
> >> Please find a few comments below. It is not clear to me which parts of your
> >> implementation are mandatory and which are optional "nice-to-have" optimizations.
> > 
> > It's more like an improvement.
> > 
> >>
> >> Based on (potentially erroneous) hypothesis, you will find a suggestion for an
> >> alternative to the anonymous inode approach, which does not seem to be a common
> >> interface.
> > 
> > AFAIC, annoymous inode is a common interface and used extensivly in kernel development.
> > Some examples below.
> > 
> >>
> >>
> >> On 6/9/25 17:15, Dawei Li wrote:
> >>> Hi,
> >>>
> >>> This is V4 of series which introduce new uAPI(RPMSG_CREATE_EPT_FD_IOCTL)
> >>> for rpmsg subsystem.
> >>>
> >>> Current uAPI implementation for rpmsg ctrl & char device manipulation is
> >>> abstracted in procedures below:
> >>> - fd = open("/dev/rpmsg_ctrlX")
> >>> - ioctl(fd, RPMSG_CREATE_EPT_IOCTL, &info); /dev/rpmsgY devnode is
> >>>   generated.
> >>> - fd_ep = open("/dev/rpmsgY", O_RDWR) 
> >>> - operations on fd_ep(write, read, poll ioctl)
> >>> - ioctl(fd_ep, RPMSG_DESTROY_EPT_IOCTL)
> >>> - close(fd_ep)
> >>> - close(fd)
> >>>
> >>> This /dev/rpmsgY abstraction is less favorable for:
> >>> - Performance issue: It's time consuming for some operations are
> >>> invovled:
> >>>   - Device node creation.
> >>>     Depends on specific config, especially CONFIG_DEVTMPFS, the overall
> >>>     overhead is based on coordination between DEVTMPFS and userspace
> >>>     tools such as udev and mdev.
> >>>
> >>>   - Extra kernel-space switch cost.
> >>>
> >>>   - Other major costs brought by heavy-weight logic like device_add().
> >>
> >> Is this a blocker of just optimization?
> > 
> > Yep, performance is one of motivations of this change.
> > 
> >>
> >>>
> >>> - /dev/rpmsgY node can be opened only once. It doesn't make much sense
> >>>     that a dynamically created device node can be opened only once.
> >>
> >>
> >> I assume this is blocker with the fact that you need to open the /dev/rpmsg<x>
> >> to create the endpoint.
> > 
> > Yes. You have to open /dev/rpmsgX which is generated by legacy ioctl to
> > instantiate a new endpoint.
> > 
> >>
> >>
> >>>
> >>> - For some container application such as docker, a client can't access
> >>>   host's dev unless specified explicitly. But in case of /dev/rpmsgY, which
> >>>   is generated dynamically and whose existence is unknown for clients in
> >>>   advance, this uAPI based on device node doesn't fit well.
> >>
> >> does this could be solve in userspace parsing /sys/class/rpmsg/ directory to
> >> retreive the device?
> > 
> > Hardly, because client still can't access /dev/rpmsgX which is generated
> > by host _after_ client is launched.
> 
> 
> This part is not clear to me; could you provide more details?
> I cannot figure out why a client can access /dev/rpmsg_ctrlX but not /dev/rpmsgX.

Well, let's take docker as example:

For docker, when a client is launched and it wants to access host's
device, it must make explicit request when it's launched:

docker run --device=/dev/xxx

Let's presume that xxx is /dev/rpmsgX generated dynamically by _host_.
Docker command above knows nothing about these rpmsg nodes which are
generated by host _after_ client is launched. And yes, parsing
/sys/class/rpmsg may acquire info about rpmsg devices, but client still
can't access /dev/rpmsgX.

> 
> 
> > 
> >>
> >> You could face same kind of random instantiation for serial peripherals ( UART;
> >> USb, I2C,...) based on a device tree enumeration. I suppose that user space
> >> use to solve this.
> >>
> >>>
> >>> An anonymous inode based approach is introduced to address the issues above.
> >>> Rather than generating device node and opening it, rpmsg code just creates
> >>> an anonymous inode representing eptdev and return the fd to userspace.
> >>
> >> A drawback is that you need to share fb passed between processes.
> > 
> > Fd is the abstraction of an unique endpoint device, it holds true for
> > both legacy and new approach.
> > 
> > So I guess what you mean is that /dev/rpmsgX is global to all so other process
> > can access it?
> > 
> > But /dev/rpmsgX is designed to be opened only once, it's implemented as
> > singleton pattern.
> > 
> > static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
> > {
> > ...
> >         if (eptdev->ept) {
> >                 mutex_unlock(&eptdev->ept_lock);
> >                 return -EBUSY;
> >         }
> > ...
> >         eptdev->ept = ept;
> > ...
> > }
> > 
> > [...]
> >  
> >>> 	printf("loop[%d]\n", loop);
> >>>
> >>> 	gettimeofday(&start, NULL);
> >>>
> >>> 	while (loop--) {
> >>
> >> Do you need to create /close Endpoint sevral times in your real use case with
> >> high timing
> >> constraint?
> > 
> > No, it's just a silly benchmark demo, large sample reduces noise statistically.
> > 
> >>
> >>> 		fd_info.fd = -1;
> >>> 		fd_info.flags = O_RDWR | O_CLOEXEC | O_NONBLOCK;
> >>> 		ret = ioctl(fd, RPMSG_CREATE_EPT_FD_IOCTL, &fd_info);
> >>> 		if (ret < 0 || fd_info.fd < 0) {
> >>> 			printf("ioctl[RPMSG_CREATE_EPT_FD_IOCTL] failed, ret[%d]\n", ret);
> >>> 		}
> >>>
> >>
> >>
> >>> 		ret = ioctl(fd_info.fd, RPMSG_DESTROY_EPT_IOCTL, &info);
> >>> 		if (ret < 0) {
> >>> 			printf("new ioctl[RPMSG_DESTROY_EPT_IOCTL] failed, ret[%d]\n", ret);
> >>> 		}
> >>>
> >>> 		close(fd_info.fd);
> >>
> >> It seems strange to me to use ioctl() for opening and close() for closing, from
> >> a symmetry point of view.
> > 
> > Sorry to hear that. But no, it's a pretty normal skill in kernel codebase
> > , I had to copy some examples from reply to other reviewer[1].
> 
> I missed this one, apologize for the duplication.
> 
> > 
> > anon_inode_get_{fd,file} are used extensively in kernel for returning a new
> > fd to userspace which is associated with an unique data structure in kernel
> > space, in different ways:
> > 
> > - via ioctl(), some examples are:
> > 
> >  - KVM ioctl(s)
> >    - KVM_CREATE_VCPU -> kvm_vm_ioctl_create_vcpu
> >    - KVM_GET_STATS_FD -> kvm_vcpu_ioctl_get_stats_fd
> >    - KVM_CREATE_DEVICE -> kvm_ioctl_create_device
> >    - KVM_CREATE_VM -> kvm_dev_ioctl_create_vm
> > 
> >  - DMA buf/fence/sync ioctls
> >    - DMA_BUF_IOCTL_EXPORT_SYNC_FILE -> dma_buf_export_sync_file
> >    - SW_SYNC_IOC_CREATE_FENCE -> sw_sync_ioctl_create_fence
> >    - Couples of driver implement DMA buf by using anon file _implicitly_:
> >      - UDMABUF_CREATE -> udmabuf_ioctl_create
> >      - DMA_HEAP_IOCTL_ALLOC -> dma_heap_ioctl_allocate
> > 
> >  - gpiolib ioctls:
> >    - GPIO_GET_LINEHANDLE_IOCTL -> linehandle_create
> >    - GPIO_V2_GET_LINE_IOCTL
> > 
> >  -  IOMMUFD ioctls:
> > 
> >  -  VFIO Ioctls:
> > 
> >  - ....
> > 
> > 
> > - via other specific syscalls:
> >  - epoll_create1
> >  - bpf
> >  - perf_event_open
> >  - inotify_init
> >  - ...
> 
> If we put the optimization aspect aside, what seems strange to me is that the
> purpose of rpmsg_char was to expose a FS character device to user space. If we
> need tobypass the use of /dev/rpmsgX, does it make sense to support an anonymous
> inode in this driver?  I am clearly not legitimate to answer this question...

You have every right to do so, after all, it's purely a technical
discussion :).

I admit it's bit confusing to add annoymous inode logic to a file named
rpmsg_char.c which implies 'character' device. That's why I rename API
following Mathieu's comment:
  - __rpmsg_chrdev_eptdev_alloc ->  rpmsg_eptdev_alloc
  - __rpmsg_chrdev_eptdev_add ->  rpmsg_eptdev_add

As to topic how these two uAPI(s) co-exist and affect each other. This
change is based on rules:

1. Never break existing uAPI.
2. Try best to reuse existing codebase.
3. Userspace can choose whatever approach they want to.

Thanks,

	Dawei
> 
> 
> Thanks,
> Arnaud
> 
> > 
> > [1] https://lore.kernel.org/all/20250530125008.GA5355@wendao-VirtualBox/
> > 
> >>
> >> Regarding your implementation, I wonder if we could keep the /dev/rpmsg<x>
> >> device with specific open() and close() file operations associated with your new
> >> ioctl.
> >>
> >> - The ioctl would create the endpoint.
> >> - The open() and close() operations would simply manage the file descriptor and
> >> increment/decrement a counter to prevent premature endpoint destruction.
> >>
> >>
> >> Regards,
> >> Arnaud
> >>
> > 
> > [...]
> > 
> > Thanks,
> > 
> > 	Dawei

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ