Message-ID: <0f956d40-44da-b048-2534-96fa8afb8d1c@iogearbox.net>
Date: Wed, 28 Mar 2018 21:09:47 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Nikita Shirokov <tehnerd@...com>, Lawrence Brakmo <brakmo@...com>,
"ast@...nel.org" <ast@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Cc: Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next]: add sock_ops R/W access to ipv4 tos
On 03/28/2018 07:41 PM, Nikita Shirokov wrote:
>>> On 03/26/2018 05:36 PM, Nikita V. Shirokov wrote:
>>> bpf: Add sock_ops R/W access to ipv4 tos
>>>
>>> Sample usage for tos:
>>>
>>> bpf_getsockopt(skops, SOL_IP, IP_TOS, &v, sizeof(v))
>>>
>>> where skops is a pointer to the ctx (struct bpf_sock_ops).
>>>
>>> Signed-off-by: Nikita V. Shirokov <tehnerd@...com>
>>> ---
>>> net/core/filter.c | 35 +++++++++++++++++++++++++++++++++++
>>> 1 file changed, 35 insertions(+)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 00c711c..afd8255 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -3462,6 +3462,27 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>>> ret = -EINVAL;
>>> }
>>> #ifdef CONFIG_INET
>>> + } else if (level == SOL_IP) {
>>> + if (optlen != sizeof(int) || sk->sk_family != AF_INET)
>>> + return -EINVAL;
>>> +
>>> + val = *((int *)optval);
>>> + /* Only some options are supported */
>>> + switch (optname) {
>>> + case IP_TOS:
>>> + if (val < -1 || val > 0xff) {
>>> + ret = -EINVAL;
>>> + } else {
>>> + struct inet_sock *inet = inet_sk(sk);
>>> +
>>> + if (val == -1)
>>> + val = 0;
>>> + inet->tos = val;
>>
>> Should this not have the exact same semantics given the helper resembles
>> the normal setsockopt? do_ip_setsockopt() does the following when setting
>> IP_TOS:
>>
>> case IP_TOS: /* This sets both TOS and Precedence */
>> if (sk->sk_type == SOCK_STREAM) {
>> val &= ~INET_ECN_MASK;
>> val |= inet->tos & INET_ECN_MASK;
>> }
>> if (inet->tos != val) {
>> inet->tos = val;
>> sk->sk_priority = rt_tos2priority(val);
>> sk_dst_reset(sk);
>> }
>> break;
>>
>> E.g., why don't we need to set sk->sk_priority as well or reset the dst
>> entry here?
>
> It feels like initially (with the commit for IP_TOS in ip_sockglue.c) there was some use case in mind
> where reflecting TOS into priority was needed, plus some policy-based routing (that's why the dst reset).
> But e.g. for IPv6 (IPV6_TCLASS, same as TOS but in the IPv6 world) we do just this: set the new tclass value
> and call it a day. In my opinion this approach is more flexible (e.g. we have a separate
> bpf_setsockopt for SO_PRIORITY) as it does only what we want (I can imagine a few use cases
> where we want to change just the TOS without changing the priority).

Ok, fair point, that way the behavior is exactly the same as in the v6 case.
Applied to bpf-next, thanks Nikita!
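
For readers of the archive, the difference discussed above can be sketched
as a small userspace model. This is only an illustration, not kernel code:
struct model_sock and the model_* functions are hypothetical names invented
for this sketch, and model_tos2priority() is a made-up stand-in, not the
kernel's rt_tos2priority() mapping. The first function follows the quoted
do_ip_setsockopt() excerpt, the second follows the IP_TOS branch of the patch.

```c
#include <assert.h>
#include <errno.h>

/* Same value as the kernel's INET_ECN_MASK (low two bits of the TOS byte). */
#define MODEL_ECN_MASK 0x3

/* Hypothetical socket types for this model only. */
enum model_sock_type { MODEL_SOCK_STREAM = 1, MODEL_SOCK_DGRAM = 2 };

/* Hypothetical stand-in for the few socket fields the thread talks about. */
struct model_sock {
	enum model_sock_type type;
	int tos;        /* models inet->tos */
	int priority;   /* models sk->sk_priority */
	int dst_cached; /* 1 while a cached route exists, 0 after a reset */
};

/* Made-up mapping for illustration; NOT the kernel's ip_tos2prio table. */
static int model_tos2priority(int tos)
{
	return (tos >> 5) & 0x7;
}

/* Mirrors the quoted do_ip_setsockopt() excerpt for IP_TOS. */
static int model_ip_setsockopt_tos(struct model_sock *sk, int val)
{
	assert(sk);
	if (sk->type == MODEL_SOCK_STREAM) {
		/* Stream sockets keep their current ECN bits. */
		val &= ~MODEL_ECN_MASK;
		val |= sk->tos & MODEL_ECN_MASK;
	}
	if (sk->tos != val) {
		sk->tos = val;
		sk->priority = model_tos2priority(val);
		sk->dst_cached = 0; /* models sk_dst_reset(sk) */
	}
	return 0;
}

/* Mirrors the IP_TOS branch of the patch: range check, then store only. */
static int model_bpf_setsockopt_tos(struct model_sock *sk, int val)
{
	assert(sk);
	if (val < -1 || val > 0xff)
		return -EINVAL;
	if (val == -1)
		val = 0;
	sk->tos = val; /* priority and cached route are left untouched */
	return 0;
}
```

In this model, setting 0x10 on a stream socket whose TOS currently has ECN
bits 0x02 yields 0x12 via the setsockopt-style path (with priority updated
and the cached route dropped), and plain 0x10 via the BPF-style path (with
priority and route left alone), which is the behavior Nikita argues for.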