[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM6PR15MB23807AF16244A1173A717D85D5A30@DM6PR15MB2380.namprd15.prod.outlook.com>
Date: Wed, 28 Mar 2018 17:41:46 +0000
From: Nikita Shirokov <tehnerd@...com>
To: Daniel Borkmann <daniel@...earbox.net>,
Lawrence Brakmo <brakmo@...com>,
"ast@...nel.org" <ast@...nel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC: Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH bpf-next]: add sock_ops R/W access to ipv4 tos
>>On 03/26/2018 05:36 PM, Nikita V. Shirokov wrote:
>> bpf: Add sock_ops R/W access to ipv4 tos
>>
>> Sample usage for tos:
>>
>> bpf_getsockopt(skops, SOL_IP, IP_TOS, &v, sizeof(v))
>>
>> where skops is a pointer to the ctx (struct bpf_sock_ops).
>>
>> Signed-off-by: Nikita V. Shirokov <tehnerd@...com>
>> ---
>> net/core/filter.c | 35 +++++++++++++++++++++++++++++++++++
>> 1 file changed, 35 insertions(+)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 00c711c..afd8255 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -3462,6 +3462,27 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>> ret = -EINVAL;
>> }
>> #ifdef CONFIG_INET
>> + } else if (level == SOL_IP) {
>> + if (optlen != sizeof(int) || sk->sk_family != AF_INET)
>> + return -EINVAL;
>> +
>> + val = *((int *)optval);
>> + /* Only some options are supported */
>> + switch (optname) {
>> + case IP_TOS:
>> + if (val < -1 || val > 0xff) {
>> + ret = -EINVAL;
>> + } else {
>> + struct inet_sock *inet = inet_sk(sk);
>> +
>> + if (val == -1)
>> + val = 0;
>> + inet->tos = val;
>
>Should this not have the exact same semantics given the helper resembles
>the normal setsockopt? do_ip_setsockopt() does the following when setting
>IP_TOS:
>
> case IP_TOS: /* This sets both TOS and Precedence */
> if (sk->sk_type == SOCK_STREAM) {
> val &= ~INET_ECN_MASK;
> val |= inet->tos & INET_ECN_MASK;
> }
> if (inet->tos != val) {
> inet->tos = val;
> sk->sk_priority = rt_tos2priority(val);
> sk_dst_reset(sk);
> }
> break;
>
>E.g. why we don't need to set sk->sk_priority as well or reset the dst
>entry here?
>
it feels like initially (w/ commit for IP_TOS in ip_sockglue.c) there were some usecase in mind
where reflection of tos to prio was needed + some policy based routing (thats why dst_reset).
but e.g. for ipv6 (IPV6_TCLASS, same as TOS but in ipv6 world) we do just this - set new tclass value
and call it the day. in my opinion this aproach is more flexible (e.g. we have separate
bpf_setsockopt for SOL_PRIORITY) as it did only what we want (i can imagine few usecases
where we want just to change TOS w/o changing priority)
however if you feel strong and want to reproduce the same behavior as in regular setsockopt, i can
totally make v2. please let me know
>> + }
>> + break;
>> + default:
>> + ret = -EINVAL;
>> + }
>> #if IS_ENABLED(CONFIG_IPV6)
>> } else if (level == SOL_IPV6) {
>> if (optlen != sizeof(int) || sk->sk_family != AF_INET6)
>> @@ -3561,6 +3582,20 @@ BPF_CALL_5(bpf_getsockopt, struct bpf_sock_ops_kern *, bpf_sock,
>> } else {
>> goto err_clear;
>> }
>> + } else if (level == SOL_IP) {
>> + struct inet_sock *inet = inet_sk(sk);
>> +
>> + if (optlen != sizeof(int) || sk->sk_family != AF_INET)
>> + goto err_clear;
>> +
>> + /* Only some options are supported */
>> + switch (optname) {
>> + case IP_TOS:
>> + *((int *)optval) = (int)inet->tos;
>
>This part is fine though, same as in do_ip_getsockopt().
>
>> + break;
>> + default:
>> + goto err_clear;
>> + }
>> #if IS_ENABLED(CONFIG_IPV6)
>> } else if (level == SOL_IPV6) {
>> struct ipv6_pinfo *np = inet6_sk(sk);
>>
Powered by blists - more mailing lists