[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKm6_Rv2-0BgpzAKCBPsi9TJbTPq5q0maC1odUAdNEQV7GegiQ@mail.gmail.com>
Date: Thu, 31 Oct 2024 13:39:03 +0100
From: Anna Nyiri <annaemesenyiri@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, fejes@....elte.hu
Subject: Re: [PATCH net-next] support SO_PRIORITY cmsg
Willem de Bruijn <willemdebruijn.kernel@...il.com> ezt írta (időpont:
2024. okt. 29., K, 16:15):
>
> Anna Emese Nyiri wrote:
> > The Linux socket API currently supports setting SO_PRIORITY at the socket
> > level, which applies a uniform priority to all packets sent through that
> > socket. The only exception is IP_TOS, if that is specified as ancillary
> > data, the packet does not inherit the socket's priority. Instead, the
> > priority value is computed when handling the ancillary data (as implemented
> > in commit <f02db315b8d888570cb0d4496cfbb7e4acb047cb>: "ipv4: IP_TOS
> > and IP_TTL can be specified as ancillary data").
>
> Please use commit format <$SHA1:12> ("subject"). Checkpatch might also
> flag this.
>
> > Currently, there is no option to set the priority directly from userspace
> > on a per-packet basis. The following changes allow SO_PRIORITY to be set
> > through control messages (CMSG), giving userspace applications more
> > granular control over packet priorities.
> >
> > This patch enables setting skb->priority using CMSG. If SO_PRIORITY is
> > specified as ancillary data, the packet is sent with the priority value
> > set through sockc->priority_cmsg_value, overriding the socket-level
> > values set via the traditional setsockopt() method.
>
> Please also describe how this interacts with priority set from IP_TOS or
> IPV6_TCLASS.
>
> > This is analogous to
> > existing support for SO_MARK (as implemented in commit
> > <c6af0c227a22bb6bb8ff72f043e0fb6d99fd6515>, “ip: support SO_MARK
> > cmsg”).
> >
> > Suggested-by: Ferenc Fejes <fejes@....elte.hu>
> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@...il.com>
> > ---
> > include/net/inet_sock.h | 2 ++
> > include/net/sock.h | 5 ++++-
> > net/can/raw.c | 6 +++++-
> > net/core/sock.c | 12 ++++++++++++
> > net/ipv4/ip_output.c | 11 ++++++++++-
> > net/ipv4/raw.c | 5 ++++-
> > net/ipv6/ip6_output.c | 8 +++++++-
> > net/ipv6/raw.c | 6 +++++-
> > net/packet/af_packet.c | 6 +++++-
> > 9 files changed, 54 insertions(+), 7 deletions(-)
> >
> > diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
> > index f9ddd47dc4f8..9d4e4e2a8232 100644
> > --- a/include/net/inet_sock.h
> > +++ b/include/net/inet_sock.h
> > @@ -175,6 +175,8 @@ struct inet_cork {
> > __u16 gso_size;
> > u64 transmit_time;
> > u32 mark;
> > + __u8 priority_cmsg_set;
> > + u32 priority_cmsg_value;
>
> Just priority, drop the cmsg value.
>
> Instead of an explicit "is set" bit, preferred is to initialize the
> cookie field from the sock. See sockcm_init(), below, and also
> ipcm_init_sk(). That also avoids the branches later in the datapath.
>
> > };
> >
> > struct inet_cork_full {
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index cce23ac4d514..e02170977165 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -1794,13 +1794,16 @@ struct sockcm_cookie {
> > u64 transmit_time;
> > u32 mark;
> > u32 tsflags;
> > + u32 priority_cmsg_value;
> > + u8 priority_cmsg_set;
> > };
> >
> > static inline void sockcm_init(struct sockcm_cookie *sockc,
> > const struct sock *sk)
> > {
> > *sockc = (struct sockcm_cookie) {
> > - .tsflags = READ_ONCE(sk->sk_tsflags)
> > + .tsflags = READ_ONCE(sk->sk_tsflags),
> > + .priority_cmsg_set = 0
> > };
> > }
> >
> > diff --git a/net/can/raw.c b/net/can/raw.c
> > index 00533f64d69d..cf7e7ae64cde 100644
> > --- a/net/can/raw.c
> > +++ b/net/can/raw.c
> > @@ -962,7 +962,11 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> > }
> >
> > skb->dev = dev;
> > - skb->priority = READ_ONCE(sk->sk_priority);
> > + if (sockc.priority_cmsg_set)
> > + skb->priority = sockc.priority_cmsg_value;
> > + else
> > + skb->priority = READ_ONCE(sk->sk_priority);
> > +
> > skb->mark = READ_ONCE(sk->sk_mark);
> > skb->tstamp = sockc.transmit_time;
> >
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index 9abc4fe25953..899bf850b52a 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -2863,6 +2863,18 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
> > case SCM_RIGHTS:
> > case SCM_CREDENTIALS:
> > break;
> > + case SO_PRIORITY:
> > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
> > + return -EINVAL;
> > +
> > + if ((*(u32 *)CMSG_DATA(cmsg) >= 0 && *(u32 *)CMSG_DATA(cmsg) <= 6) ||
> > + sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
> > + sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
> > + sockc->priority_cmsg_value = *(u32 *)CMSG_DATA(cmsg);
> > + sockc->priority_cmsg_set = 1;
> > + break;
> > + }
>
> What is the magic constant 6 here?
The mechanism for setting the priority value via cmsg mirrors that of
setting the priority value through setsockopt. The control of the
priority value is managed by the sk_setsockopt function, which allows
setting the priority within the range of 0 to 6. However, if the user
has CAP_NET_ADMIN or CAP_NET_RAW capability, they are permitted to set
any priority value without restriction. The specified range of 0 to 6
was selected to align with existing priority value check.
Powered by blists - more mailing lists