Message-ID: <ce95f80b-4a36-457e-aedd-d59a325a711c@foss.st.com>
Date: Mon, 30 Jun 2025 09:54:40 +0200
From: Arnaud POULIQUEN <arnaud.pouliquen@...s.st.com>
To: Dawei Li <dawei.li@...ux.dev>
CC: <andersson@...nel.org>, <mathieu.poirier@...aro.org>,
<linux-remoteproc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<set_pte_at@...look.com>
Subject: Re: [PATCH v4 0/3] rpmsg: Introduce RPMSG_CREATE_EPT_FD_IOCTL uAPI
Hello Dawei,
Sorry for the late answer.
On 6/22/25 06:12, Dawei Li wrote:
> Hi Arnaud,
>
> Thanks for the reply.
>
> On Fri, Jun 20, 2025 at 09:52:03AM +0200, Arnaud POULIQUEN wrote:
>>
>>
>> On 6/19/25 16:43, Dawei Li wrote:
>>> Hi Arnaud,
>>> Thanks for the review.
>>>
>>> On Wed, Jun 18, 2025 at 03:07:36PM +0200, Arnaud POULIQUEN wrote:
>>>> Hello Dawei,
>>>>
>>>>
>>>> Please find a few comments below. It is not clear to me which parts of your
>>>> implementation are mandatory and which are optional "nice-to-have" optimizations.
>>>
>>> It's more like an improvement.
>>>
>>>>
>>>> Based on a (potentially erroneous) hypothesis, you will find a suggestion for an
>>>> alternative to the anonymous inode approach, which does not seem to be a common
>>>> interface.
>>>
>>> AFAIC, anonymous inodes are a common interface and used extensively in kernel development.
>>> Some examples below.
>>>
>>>>
>>>>
>>>> On 6/9/25 17:15, Dawei Li wrote:
>>>>> Hi,
>>>>>
>>>>> This is V4 of the series, which introduces a new uAPI (RPMSG_CREATE_EPT_FD_IOCTL)
>>>>> for the rpmsg subsystem.
>>>>>
>>>>> The current uAPI implementation for rpmsg ctrl & char device manipulation is
>>>>> abstracted in the procedure below:
>>>>> - fd = open("/dev/rpmsg_ctrlX")
>>>>> - ioctl(fd, RPMSG_CREATE_EPT_IOCTL, &info); /dev/rpmsgY devnode is
>>>>> generated.
>>>>> - fd_ep = open("/dev/rpmsgY", O_RDWR)
>>>>> - operations on fd_ep (write, read, poll, ioctl)
>>>>> - ioctl(fd_ep, RPMSG_DESTROY_EPT_IOCTL)
>>>>> - close(fd_ep)
>>>>> - close(fd)
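
For readers following the thread, the legacy sequence above boils down to
something like the sketch below (struct rpmsg_endpoint_info and the two ioctls
are the existing uAPI from <linux/rpmsg.h>; device paths, endpoint name and
addresses are illustrative, error handling omitted):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/rpmsg.h>

    int main(void)
    {
        struct rpmsg_endpoint_info info = { .src = 0, .dst = 0xFFFFFFFF }; /* wildcard-style addresses, illustrative */
        int ctrl_fd, ept_fd;

        strcpy(info.name, "rpmsg-demo");                /* illustrative service name */

        ctrl_fd = open("/dev/rpmsg_ctrl0", O_RDWR);
        ioctl(ctrl_fd, RPMSG_CREATE_EPT_IOCTL, &info);  /* creates /dev/rpmsgY */

        ept_fd = open("/dev/rpmsg0", O_RDWR);           /* node name depends on udev/devtmpfs */
        write(ept_fd, "ping", 4);                       /* regular fd operations */

        ioctl(ept_fd, RPMSG_DESTROY_EPT_IOCTL);         /* tear the endpoint down */
        close(ept_fd);
        close(ctrl_fd);
        return 0;
    }
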
>>>>>
>>>>> This /dev/rpmsgY abstraction is less favorable because of:
>>>>> - Performance issues: it's time consuming, since several operations are
>>>>>   involved:
>>>>>   - Device node creation.
>>>>>     Depending on the specific config, especially CONFIG_DEVTMPFS, the overall
>>>>>     overhead is based on coordination between DEVTMPFS and userspace
>>>>>     tools such as udev and mdev.
>>>>>
>>>>>   - Extra kernel-space switching cost.
>>>>>
>>>>>   - Other major costs brought by heavy-weight logic like device_add().
>>>>
>>>> Is this a blocker or just an optimization?
>>>
>>> Yep, performance is one of the motivations for this change.
>>>
>>>>
>>>>>
>>>>> - The /dev/rpmsgY node can be opened only once. It doesn't make much sense
>>>>> that a dynamically created device node can be opened only once.
>>>>
>>>>
>>>> I assume this is a blocker, combined with the fact that you need to open the
>>>> /dev/rpmsg<x> to create the endpoint.
>>>
>>> Yes. You have to open /dev/rpmsgX, which is generated by the legacy ioctl, to
>>> instantiate a new endpoint.
>>>
>>>>
>>>>
>>>>>
>>>>> - For some container applications such as Docker, a client can't access the
>>>>> host's devices unless they are specified explicitly. But in the case of /dev/rpmsgY, which
>>>>> is generated dynamically and whose existence is unknown to clients in
>>>>> advance, this device-node-based uAPI doesn't fit well.
>>>>
>>>> Could this be solved in userspace by parsing the /sys/class/rpmsg/ directory to
>>>> retrieve the device?
>>>
>>> Hardly, because the client still can't access /dev/rpmsgX, which is generated
>>> by the host _after_ the client is launched.
>>
>>
>> This part is not clear to me; could you provide more details?
>> I cannot figure out why a client can access /dev/rpmsg_ctrlX but not /dev/rpmsgX.
>
> Well, let's take docker as example:
>
> For Docker, when a client is launched and wants to access the host's
> devices, it must make an explicit request at launch time:
>
> docker run --device=/dev/xxx
>
> Let's presume that xxx is /dev/rpmsgX, generated dynamically by the _host_.
> The Docker command above knows nothing about these rpmsg nodes, which are
> generated by the host _after_ the client is launched. And yes, parsing
> /sys/class/rpmsg may provide info about rpmsg devices, but the client still
> can't access /dev/rpmsgX.
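
(If I understand correctly, the benefit of the fd-based ioctl here is that only
/dev/rpmsg_ctrlX has to be passed at container start time, e.g. something like
"docker run --device=/dev/rpmsg_ctrl0 ...", and every endpoint fd is then
obtained through the ioctl without any new device node having to appear.)
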
One extra question: are you using RPMsg over virtio?
If yes, have you tested the RPMsg name service (NS) announcement? It might also
address your needs.
The principle is that the remote processor sends a name service announcement to
Linux, which probes the rpmsg character device and creates the /dev/rpmsgX
device in a predefined order known by the remote processor.
In such a case, the /dev/rpmsgX usage would be determined by the remote
processor itself.
Another advantage is that the RPMsg channel creation is not driven by either the
host or the client. In that case, the host does not need to define/know the RPMsg
endpoint addresses.
You still need to call the open() file system operation, but this only has to be
done once, during Docker client initialization.
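
To make this concrete, a minimal sketch of what the client side could look like
(the /dev/rpmsg0 path and the helper name are only illustrative; they depend on
the naming convention agreed with the remote firmware):

    #include <fcntl.h>
    #include <unistd.h>

    /*
     * Illustrative only: assumes the remote processor's NS announcement has
     * already made the kernel probe the rpmsg char device, so the node name
     * is known in advance and can be passed with --device at container start.
     */
    static int rpmsg_client_init(void)
    {
        /* One open() for the whole client lifetime; read()/write()/poll()
         * are then used directly on this fd. */
        return open("/dev/rpmsg0", O_RDWR);
    }
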
Regards
Arnaud
>
>>
>>
>>>
>>>>
>>>> You could face the same kind of random instantiation for serial peripherals (UART,
>>>> USB, I2C, ...) based on device tree enumeration. I suppose that user space
>>>> usually solves this.
>>>>
>>>>>
>>>>> An anonymous-inode-based approach is introduced to address the issues above.
>>>>> Rather than generating a device node and opening it, the rpmsg code just creates
>>>>> an anonymous inode representing the eptdev and returns the fd to userspace.
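
For context, the core of this approach boils down to anon_inode_getfd(); a
simplified sketch of what such an ioctl path could look like (not the actual
patch, the function name is illustrative):

    #include <linux/anon_inodes.h>

    /* Sketch only: hand the eptdev back to user space as an anonymous-inode
     * fd instead of registering a character device for it. */
    static int rpmsg_eptdev_get_fd(struct rpmsg_eptdev *eptdev, int flags)
    {
        /* rpmsg_eptdev_fops are the existing endpoint file operations;
         * no device_add() and no devtmpfs/udev round trip is involved. */
        return anon_inode_getfd("[rpmsg-ept]", &rpmsg_eptdev_fops,
                                eptdev, flags);
    }
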
>>>>
>>>> A drawback is that you need to share the fd by passing it between processes.
>>>
>>> The fd is the abstraction of a unique endpoint device; this holds true for
>>> both the legacy and the new approach.
>>>
>>> So I guess what you mean is that /dev/rpmsgX is global, so other processes
>>> can access it?
>>>
>>> But /dev/rpmsgX is designed to be opened only once; it's implemented as a
>>> singleton pattern.
>>>
>>> static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
>>> {
>>>     ...
>>>     if (eptdev->ept) {
>>>         mutex_unlock(&eptdev->ept_lock);
>>>         return -EBUSY;
>>>     }
>>>     ...
>>>     eptdev->ept = ept;
>>>     ...
>>> }
>>>
>>> [...]
>>>
>>>>> printf("loop[%d]\n", loop);
>>>>>
>>>>> gettimeofday(&start, NULL);
>>>>>
>>>>> while (loop--) {
>>>>
>>>> Do you need to create/close the endpoint several times in your real use case,
>>>> with a tight timing constraint?
>>>
>>> No, it's just a silly benchmark demo; a large sample reduces noise statistically.
>>>
>>>>
>>>>>     fd_info.fd = -1;
>>>>>     fd_info.flags = O_RDWR | O_CLOEXEC | O_NONBLOCK;
>>>>>     ret = ioctl(fd, RPMSG_CREATE_EPT_FD_IOCTL, &fd_info);
>>>>>     if (ret < 0 || fd_info.fd < 0) {
>>>>>         printf("ioctl[RPMSG_CREATE_EPT_FD_IOCTL] failed, ret[%d]\n", ret);
>>>>>     }
>>>>>
>>>>
>>>>
>>>>>     ret = ioctl(fd_info.fd, RPMSG_DESTROY_EPT_IOCTL, &info);
>>>>>     if (ret < 0) {
>>>>>         printf("new ioctl[RPMSG_DESTROY_EPT_IOCTL] failed, ret[%d]\n", ret);
>>>>>     }
>>>>>
>>>>>     close(fd_info.fd);
>>>>
>>>> It seems strange to me to use ioctl() for opening and close() for closing, from
>>>> a symmetry point of view.
>>>
>>> Sorry to hear that. But no, it's a pretty common pattern in the kernel codebase;
>>> I have copied some examples from my reply to another reviewer [1].
>>
>> I missed this one, apologies for the duplication.
>>
>>>
>>> anon_inode_get{fd,file} are used extensively in the kernel to return a new
>>> fd to userspace that is associated with a unique data structure in kernel
>>> space, in different ways:
>>>
>>> - via ioctl(), some examples are:
>>>
>>> - KVM ioctl(s)
>>> - KVM_CREATE_VCPU -> kvm_vm_ioctl_create_vcpu
>>> - KVM_GET_STATS_FD -> kvm_vcpu_ioctl_get_stats_fd
>>> - KVM_CREATE_DEVICE -> kvm_ioctl_create_device
>>> - KVM_CREATE_VM -> kvm_dev_ioctl_create_vm
>>>
>>> - DMA buf/fence/sync ioctls
>>> - DMA_BUF_IOCTL_EXPORT_SYNC_FILE -> dma_buf_export_sync_file
>>> - SW_SYNC_IOC_CREATE_FENCE -> sw_sync_ioctl_create_fence
>>> - A couple of drivers implement DMA buf by using anon files _implicitly_:
>>> - UDMABUF_CREATE -> udmabuf_ioctl_create
>>> - DMA_HEAP_IOCTL_ALLOC -> dma_heap_ioctl_allocate
>>>
>>> - gpiolib ioctls:
>>> - GPIO_GET_LINEHANDLE_IOCTL -> linehandle_create
>>> - GPIO_V2_GET_LINE_IOCTL
>>>
>>> - IOMMUFD ioctls:
>>>
>>> - VFIO Ioctls:
>>>
>>> - ....
>>>
>>>
>>> - via other specific syscalls:
>>> - epoll_create1
>>> - bpf
>>> - perf_event_open
>>> - inotify_init
>>> - ...
>>
>> If we put the optimization aspect aside, what seems strange to me is that the
>> purpose of rpmsg_char was to expose a filesystem character device to user space. If we
>> need to bypass the use of /dev/rpmsgX, does it make sense to support an anonymous
>> inode in this driver? I am clearly not the right person to answer this question...
>
> You have every right to do so, after all, it's purely a technical
> discussion :).
>
> I admit it's a bit confusing to add anonymous inode logic to a file named
> rpmsg_char.c, which implies a 'character' device. That's why I renamed the APIs
> following Mathieu's comment:
> - __rpmsg_chrdev_eptdev_alloc -> rpmsg_eptdev_alloc
> - __rpmsg_chrdev_eptdev_add -> rpmsg_eptdev_add
>
> As to the topic of how these two uAPIs co-exist and affect each other: this
> change is based on these rules:
>
> 1. Never break existing uAPI.
> 2. Try our best to reuse the existing codebase.
> 3. Userspace can choose whatever approach they want to.
>
> Thanks,
>
> Dawei
>>
>>
>> Thanks,
>> Arnaud
>>
>>>
>>> [1] https://lore.kernel.org/all/20250530125008.GA5355@wendao-VirtualBox/
>>>
>>>>
>>>> Regarding your implementation, I wonder if we could keep the /dev/rpmsg<x>
>>>> device with specific open() and close() file operations associated with your new
>>>> ioctl.
>>>>
>>>> - The ioctl would create the endpoint.
>>>> - The open() and close() operations would simply manage the file descriptor and
>>>> increment/decrement a counter to prevent premature endpoint destruction.
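
To make the idea above more concrete, a rough sketch of what I have in mind
(the 'users' refcount field and the destroy helper are only illustrative, this
is not a tested implementation):

    #include <linux/cdev.h>
    #include <linux/fs.h>
    #include <linux/refcount.h>

    /* The new ioctl on /dev/rpmsg_ctrlX would create the endpoint and the
     * /dev/rpmsgY node; open()/release() then only track the users. */
    static int rpmsg_eptdev_open(struct inode *inode, struct file *filp)
    {
        struct rpmsg_eptdev *eptdev =
            container_of(inode->i_cdev, struct rpmsg_eptdev, cdev);

        refcount_inc(&eptdev->users);   /* hypothetical counter, set to 1 by the ioctl */
        filp->private_data = eptdev;
        return 0;
    }

    static int rpmsg_eptdev_release(struct inode *inode, struct file *filp)
    {
        struct rpmsg_eptdev *eptdev = filp->private_data;

        /* Destroy the endpoint only when the last user has closed it. */
        if (refcount_dec_and_test(&eptdev->users))
            rpmsg_eptdev_destroy(eptdev);   /* illustrative helper */
        return 0;
    }
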
>>>>
>>>>
>>>> Regards,
>>>> Arnaud
>>>>
>>>
>>> [...]
>>>
>>> Thanks,
>>>
>>> Dawei