Message-ID: <1596dbc6-65cb-4d3f-8e56-33842e3dcd2b@bytedance.com>
Date: Mon, 1 Jul 2024 12:46:05 -0700
From: Zijian Zhang <zijianzhang@...edance.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>, netdev@...r.kernel.org
Cc: edumazet@...gle.com, cong.wang@...edance.com, xiaochun.lu@...edance.com
Subject: Re: [External] Re: [PATCH net-next v6 2/4] sock: support copy cmsg to userspace in TX path

On 6/30/24 7:43 AM, Willem de Bruijn wrote:
> zijianzhang@ wrote:
>> From: Zijian Zhang <zijianzhang@...edance.com>
>>
>> Since ____sys_sendmsg creates a kernel copy of msg_control and passes
>> that to the callees, put_cmsg will write into this kernel buffer. If
>> people want to piggyback some information, such as timestamps, on the
>> return of sendmsg, ____sys_sendmsg would have to copy_to_user to the
>> original buffer, which is not supported. As a result, users typically
>> have to call recvmsg on the socket's error queue (MSG_ERRQUEUE),
>> incurring extra system call overhead.
>>
>> This commit supports copying cmsg to userspace in the TX path by
>> introducing a flag, MSG_CMSG_COPY_TO_USER, in struct msghdr to guide
>> the copy logic when ___sys_sendmsg returns.
>>
>> Signed-off-by: Zijian Zhang <zijianzhang@...edance.com>
>> Signed-off-by: Xiaochun Lu <xiaochun.lu@...edance.com>
>> ---
>> include/linux/socket.h | 6 ++++++
>> net/core/sock.c | 2 ++
>> net/ipv4/ip_sockglue.c | 2 ++
>> net/ipv6/datagram.c | 3 +++
>> net/socket.c | 45 ++++++++++++++++++++++++++++++++++++++++++
>> 5 files changed, 58 insertions(+)
>>
>> diff --git a/include/linux/socket.h b/include/linux/socket.h
>> index 89d16b90370b..35adc30c9db6 100644
>> --- a/include/linux/socket.h
>> +++ b/include/linux/socket.h
>> @@ -168,6 +168,11 @@ static inline struct cmsghdr * cmsg_nxthdr (struct msghdr *__msg, struct cmsghdr
>> 	return __cmsg_nxthdr(__msg->msg_control, __msg->msg_controllen, __cmsg);
>> }
>>
>> +static inline bool cmsg_copy_to_user(struct cmsghdr *__cmsg)
>> +{
>> +	return 0;
>> +}
>> +
>> static inline size_t msg_data_left(struct msghdr *msg)
>> {
>> 	return iov_iter_count(&msg->msg_iter);
>> @@ -329,6 +334,7 @@ struct ucred {
>>
>> #define MSG_ZEROCOPY 0x4000000 /* Use user data in kernel path */
>> #define MSG_SPLICE_PAGES 0x8000000 /* Splice the pages from the iterator in sendmsg() */
>> +#define MSG_CMSG_COPY_TO_USER 0x10000000 /* Copy cmsg to user space */
>
> Careful that userspace must not be able to set this bit. See also
> MSG_INTERNAL_SENDMSG_FLAGS.
>
> Perhaps better to define a bit like msg_control_is_user.
>
>> #define MSG_FASTOPEN 0x20000000 /* Send data in TCP SYN */
>> #define MSG_CMSG_CLOEXEC 0x40000000 /* Set close_on_exec for file
>> descriptor received through
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index 9abc4fe25953..4a766a91ff5c 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -2879,6 +2879,8 @@ int sock_cmsg_send(struct sock *sk, struct msghdr *msg,
>> 	for_each_cmsghdr(cmsg, msg) {
>> 		if (!CMSG_OK(msg, cmsg))
>> 			return -EINVAL;
>> +		if (cmsg_copy_to_user(cmsg))
>> +			msg->msg_flags |= MSG_CMSG_COPY_TO_USER;
>
> Probably better to pass msg to __sock_cmsg_send and only set this
> field in the specific cmsg handler that uses it.
>
Thanks for the above suggestions!
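To illustrate the first suggestion, here is a minimal userspace sketch of masking internal-only bits out of flags supplied by the caller. The two flag values are copied from the quoted diff; the SKETCH_-prefixed names and the helper are hypothetical stand-ins, and the set of bits treated as internal is an assumption, not the kernel's actual MSG_INTERNAL_SENDMSG_FLAGS definition.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted diff. The SKETCH_ names are hypothetical
 * userspace stand-ins, not kernel definitions. */
#define SKETCH_MSG_SPLICE_PAGES      0x8000000u
#define SKETCH_MSG_CMSG_COPY_TO_USER 0x10000000u

/* Assumed set of kernel-internal bits that userspace must never supply. */
#define SKETCH_INTERNAL_FLAGS \
	(SKETCH_MSG_SPLICE_PAGES | SKETCH_MSG_CMSG_COPY_TO_USER)

/* Drop internal-only bits from flags that arrived from userspace,
 * leaving every other bit untouched. */
static inline uint32_t sketch_sanitize_user_flags(uint32_t user_flags)
{
	return user_flags & ~SKETCH_INTERNAL_FLAGS;
}
```

With this masking applied at syscall entry, a caller that sets the new bit by hand would have it cleared before any handler sees it. Tracking the request in a separate field, like msg_control_is_user, would avoid consuming a flag bit at all.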
>> 		if (cmsg->cmsg_level != SOL_SOCKET)
>> 			continue;
>> 		ret = __sock_cmsg_send(sk, cmsg, sockc);
...
>> +static int sendmsg_copy_cmsg_to_user(struct msghdr *msg_sys,
>> +				     struct user_msghdr __user *umsg)
>> +{
>> +	struct compat_msghdr __user *umsg_compat =
>> +		(struct compat_msghdr __user *)umsg;
>> +	unsigned long cmsg_ptr = (unsigned long)umsg->msg_control;
>> +	unsigned int flags = msg_sys->msg_flags;
>> +	struct msghdr msg_user = *msg_sys;
>> +	struct cmsghdr *cmsg;
>> +	int err;
>> +
>> +	msg_user.msg_control = umsg->msg_control;
>> +	msg_user.msg_control_is_user = true;
>> +	for_each_cmsghdr(cmsg, msg_sys) {
>> +		if (!CMSG_OK(msg_sys, cmsg))
>> +			break;
>> +		if (cmsg_copy_to_user(cmsg))
>> +			put_cmsg(&msg_user, cmsg->cmsg_level, cmsg->cmsg_type,
>> +				 cmsg->cmsg_len - sizeof(*cmsg), CMSG_DATA(cmsg));
>> +	}
>
> Alternatively just copy the entire msg_control if any cmsg wants to
> be copied back. The others will be unmodified. No need to iterate
> then.
>
Copying the entire msg_control with copy_to_user does not take
MSG_CMSG_COMPAT into account. I may have to use put_cmsg to handle
the compat case, and thus keep the for loop?
If so, I may also keep the function cmsg_copy_to_user to avoid extra copies?
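To make the compat concern concrete, here is a minimal userspace sketch comparing the native control header layout with a 32-bit one. struct sketch_compat_cmsghdr and the sketch_ helpers are hypothetical stand-ins written for this illustration; the 4-byte alignment is an assumption about the compat layout, not a copy of the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the 32-bit control header layout:
 * a 32-bit length followed by level and type. */
struct sketch_compat_cmsghdr {
	uint32_t cmsg_len;
	int32_t  cmsg_level;
	int32_t  cmsg_type;
};

/* Assumption: compat headers are padded to 4 bytes. */
#define SKETCH_COMPAT_ALIGN(len) (((size_t)(len) + 3u) & ~(size_t)3u)

/* cmsg_len a 32-bit task would expect for a payload of the given size. */
static inline size_t sketch_compat_cmsg_len(size_t payload)
{
	return SKETCH_COMPAT_ALIGN(sizeof(struct sketch_compat_cmsghdr)) + payload;
}

/* cmsg_len in the native layout, using the libc macro. */
static inline size_t sketch_native_cmsg_len(size_t payload)
{
	return CMSG_LEN(payload);
}
```

On a 64-bit build the two lengths differ for the same payload, so a flat copy of the kernel buffer would hand a 32-bit task headers it cannot parse. Converting each cmsg individually, as put_cmsg does, avoids that.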
>> +
>> +	err = __put_user((msg_sys->msg_flags & ~MSG_CMSG_COMPAT), COMPAT_FLAGS(umsg));
>> +	if (err)
>> +		return err;
>
> Does this value need to be written?
>
I did this following ____sys_recvmsg; maybe it's useful to export
flags like MSG_CTRUNC to users?
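As a sketch of how a caller could use that, assuming the write-back proposed in this patch (sendmsg does not report msg_flags back to the caller today), a check on the userspace side might look like the following. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Returns true if msg_flags reports that control data was truncated.
 * Assumption: the kernel has written msg_flags back after sendmsg(),
 * which is the behaviour under discussion, not current behaviour. */
static inline bool sketch_control_truncated(const struct msghdr *msg)
{
	return (msg->msg_flags & MSG_CTRUNC) != 0;
}
```

A caller would run this check right after sendmsg() returns and, if it is true, retry with a larger msg_control buffer or fall back to reading the error queue.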
>> +	if (MSG_CMSG_COMPAT & flags)
>> +		err = __put_user((unsigned long)msg_user.msg_control - cmsg_ptr,
>> +				 &umsg_compat->msg_controllen);
>> +	else
>> +		err = __put_user((unsigned long)msg_user.msg_control - cmsg_ptr,
>> +				 &umsg->msg_controllen);
>> +	return err;
>> +}
>> +
>> static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
>> 			  struct msghdr *msg_sys, unsigned int flags,
>> 			  struct used_address *used_address,
>> @@ -2638,6 +2671,18 @@ static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
>>
>> 	err = ____sys_sendmsg(sock, msg_sys, flags, used_address,
>> 			      allowed_msghdr_flags);
>> +	if (err < 0)
>> +		goto out;
>> +
>> +	if (msg_sys->msg_flags & MSG_CMSG_COPY_TO_USER) {
>> +		ssize_t len = err;
>> +
>> +		err = sendmsg_copy_cmsg_to_user(msg_sys, msg);
>> +		if (err)
>> +			goto out;
>> +		err = len;
>> +	}
>> +out:
>> 	kfree(iov);
>> 	return err;
>> }
>> --
>> 2.20.1
>>
>
>