[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKm6_RtYXpa5HnTNe+b1xy9p4BsdD8JnG30F+_ktBYcd2QSyfQ@mail.gmail.com>
Date: Sun, 3 Nov 2024 14:39:17 +0100
From: Anna Nyiri <annaemesenyiri@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: netdev@...r.kernel.org, fejes@....elte.hu, edumazet@...gle.com,
kuba@...nel.org, pabeni@...hat.com
Subject: Re: [PATCH net-next v2 2/2] support SO_PRIORITY cmsg
Willem de Bruijn <willemdebruijn.kernel@...il.com> ezt írta (időpont:
2024. nov. 3., V, 2:27):
>
> Anna Emese Nyiri wrote:
> > The Linux socket API currently supports setting SO_PRIORITY at the
> > socket level, which applies a uniform priority to all packets sent
> > through that socket. The only exception is IP_TOS; if specified as
> > ancillary data, the packet does not inherit the socket's priority.
> > Instead, the priority value is computed when handling the ancillary
> > data (as implemented in commit <f02db315b8d88>
>
> nit: drop the brackets
>
> > ("ipv4: IP_TOS and IP_TTL can be specified as ancillary data")). If
> > the priority is set via IP_TOS, then skb->priority derives its value
> > from the rt_tos2priority function, which calculates the priority
> > based on the value of ipc->tos obtained from IP_TOS. However, if
> > IP_TOS is not used and the priority has been set through a control
> > message, skb->priority will take the value provided by that control
> > message.
>
> The above describes the new situation? There is no way to set
> priority to a control message prior to this patch.
>
> > Therefore, when both options are available, the primary
> > source for skb->priority is the value set via IP_TOS.
> >
> > Currently, there is no option to set the priority directly from
> > userspace on a per-packet basis. The following changes allow
> > SO_PRIORITY to be set through control messages (CMSG), giving
> > userspace applications more granular control over packet priorities.
> >
> > This patch enables setting skb->priority using CMSG. If SO_PRIORITY
>
> Duplicate statement. Overall, the explanation can perhaps be
> condensed and made more clear.
>
> > is specified as ancillary data, the packet is sent with the priority
> > value set through sockc->priority_cmsg_value, overriding the
>
> No longer matches the code.
>
> > socket-level values set via the traditional setsockopt() method. This
> > is analogous to existing support for SO_MARK (as implemented in commit
> > <c6af0c227a22> ("ip: support SO_MARK cmsg")).
> >
> > Suggested-by: Ferenc Fejes <fejes@....elte.hu>
> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@...il.com>
> > ---
> > include/net/inet_sock.h | 2 +-
> > include/net/ip.h | 3 ++-
> > include/net/sock.h | 4 +++-
> > net/can/raw.c | 2 +-
> > net/core/sock.c | 8 ++++++++
> > net/ipv4/ip_output.c | 7 +++++--
> > net/ipv4/raw.c | 2 +-
> > net/ipv6/ip6_output.c | 3 ++-
> > net/ipv6/raw.c | 2 +-
> > net/packet/af_packet.c | 2 +-
> > 10 files changed, 25 insertions(+), 10 deletions(-)
> >
> > diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
> > index 56d8bc5593d3..3ccbad881d74 100644
> > --- a/include/net/inet_sock.h
> > +++ b/include/net/inet_sock.h
> > @@ -172,7 +172,7 @@ struct inet_cork {
> > u8 tx_flags;
> > __u8 ttl;
> > __s16 tos;
> > - char priority;
> > + u32 priority;
>
> Let's check with pahole how this affects struct size and holes.
> It likely adds a hole, but unavoidably so.
>
> > __u16 gso_size;
> > u32 ts_opt_id;
> > u64 transmit_time;
> > diff --git a/include/net/ip.h b/include/net/ip.h
> > index 0e548c1f2a0e..e8f71a191277 100644
> > --- a/include/net/ip.h
> > +++ b/include/net/ip.h
> > @@ -81,7 +81,7 @@ struct ipcm_cookie {
> > __u8 protocol;
> > __u8 ttl;
> > __s16 tos;
> > - char priority;
> > + u32 priority;
>
> No need for a field in ipcm_cookie, when also present in
> sockcm_cookie. As SO_PRIORITY is not limited to IP, sockcm_cookie is
> the right location.
I think there could be a problem if the priority is set by IP_TOS for
some reason, and then also via cmsg. The latter value may overwrite
it. In the ip_setup_cork() function, there is therefore a check for
the value cork->tos != -1 to give priority to the value set by IP_TOS.
And that's why I thought that there should be a priority field in both
ipcm_cookie and sockcm_cookie. The priority field already existed in
ipcm_cookie, I didn't add it. I just changed the type.
>
> If cmsg IP_TOS is present, that can overridde ipc->sockc.priority with
> rt_tos2priority.
>
> Interesting that this override by IP_TOS seems to be IPV4 only. There
> is no equivalent call to rt_tos2priority when setting IPV6_TCLASS.
>
> > __u16 gso_size;
> > };
> >
> > @@ -96,6 +96,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
> > ipcm_init(ipcm);
> >
> > ipcm->sockc.mark = READ_ONCE(inet->sk.sk_mark);
> > + ipcm->sockc.priority = READ_ONCE(inet->sk.sk_priority);
> > ipcm->sockc.tsflags = READ_ONCE(inet->sk.sk_tsflags);
> > ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
> > ipcm->addr = inet->inet_saddr;
> > diff --git a/include/net/sock.h b/include/net/sock.h
> > index 7464e9f9f47c..316a34d6c48b 100644
> > --- a/include/net/sock.h
> > +++ b/include/net/sock.h
> > @@ -1814,13 +1814,15 @@ struct sockcm_cookie {
> > u32 mark;
> > u32 tsflags;
> > u32 ts_opt_id;
> > + u32 priority;
> > };
> >
> > static inline void sockcm_init(struct sockcm_cookie *sockc,
> > const struct sock *sk)
> > {
> > *sockc = (struct sockcm_cookie) {
> > - .tsflags = READ_ONCE(sk->sk_tsflags)
> > + .tsflags = READ_ONCE(sk->sk_tsflags),
> > + .priority = READ_ONCE(sk->sk_priority),
> > };
> > }
> >
> > diff --git a/net/can/raw.c b/net/can/raw.c
> > index 255c0a8f39d6..46e8ed9d64da 100644
> > --- a/net/can/raw.c
> > +++ b/net/can/raw.c
> > @@ -962,7 +962,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
> > }
> >
> > skb->dev = dev;
> > - skb->priority = READ_ONCE(sk->sk_priority);
> > + skb->priority = sockc.priority;
> > skb->mark = READ_ONCE(sk->sk_mark);
> > skb->tstamp = sockc.transmit_time;
> >
> > diff --git a/net/core/sock.c b/net/core/sock.c
> > index 5ecf6f1a470c..d5586b9212dd 100644
> > --- a/net/core/sock.c
> > +++ b/net/core/sock.c
> > @@ -2941,6 +2941,14 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
> > case SCM_RIGHTS:
> > case SCM_CREDENTIALS:
> > break;
> > + case SO_PRIORITY:
> > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
> > + return -EINVAL;
> > + if (sk_set_prio_allowed(sk, *(u32 *)CMSG_DATA(cmsg))) {
> > + sockc->priority = *(u32 *)CMSG_DATA(cmsg);
> > + break;
> > + }
> > + return -EPERM;
>
> nit: invert to make the error case the (speculated as unlikely) branch
> and have the common path unindented.
>
> > default:
> > return -EINVAL;
> > }
Powered by blists - more mailing lists