Message-ID: <EEB42709-8A7F-4833-BA66-F1841DA12E0D@fb.com>
Date:   Tue, 5 Dec 2017 00:21:01 +0000
From:   Lawrence Brakmo <brakmo@...com>
To:     Daniel Borkmann <daniel@...earbox.net>,
        netdev <netdev@...r.kernel.org>
CC:     Alexei Starovoitov <ast@...com>, Kernel Team <Kernel-team@...com>,
        "Blake Matheny" <bmatheny@...com>,
        Vlad Dumitrescu <vladum@...gle.com>
Subject: Re: [PATCH net-next] bpf: Add access to snd_cwnd and others in
 sock_ops

On 12/4/17, 2:32 PM, Daniel Borkmann <daniel@...earbox.net> wrote:

    Hi Lawrence,
    
    On 12/01/2017 07:15 PM, Lawrence Brakmo wrote:
    > Adds read access to snd_cwnd and srtt_us fields of tcp_sock. Since these
    > fields are only valid if the socket associated with the sock_ops program
    > call is a full socket, the field is_fullsock is also added to the
    > bpf_sock_ops struct. If the socket is not a full socket, reading these
    > fields returns 0.
    > 
    > Note that in most cases it will not be necessary to check is_fullsock to
    > know if there is a full socket. The context of the call, as specified by
    > the 'op' field, can sometimes determine whether there is a full socket.
    > 
    > The struct bpf_sock_ops has the following fields added:
    > 
    >   __u32 is_fullsock;      /* Some TCP fields are only valid if
    >                            * there is a full socket. If not, the
    >                            * fields read as zero.
    > 			   */
    >   __u32 snd_cwnd;
    >   __u32 srtt_us;          /* Averaged RTT << 3 in usecs */
    > 
    > There is a new macro, SOCK_OPS_GET_TCP32(NAME), to make it easier to add
    > read access to more 32 bit tcp_sock fields.
    > 
    > Signed-off-by: Lawrence Brakmo <brakmo@...com>
    > ---
    >  include/linux/filter.h   |  1 +
    >  include/net/tcp.h        |  6 ++++--
    >  include/uapi/linux/bpf.h |  6 ++++++
    >  net/core/filter.c        | 36 ++++++++++++++++++++++++++++++++++++
    >  4 files changed, 47 insertions(+), 2 deletions(-)
    > 
    > diff --git a/include/linux/filter.h b/include/linux/filter.h
    > index 80b5b48..0062302 100644
    > --- a/include/linux/filter.h
    > +++ b/include/linux/filter.h
    > @@ -985,6 +985,7 @@ struct bpf_sock_ops_kern {
    >  		u32 reply;
    >  		u32 replylong[4];
    >  	};
    > +	u32	is_fullsock;
    >  };
    >  
    >  #endif /* __LINUX_FILTER_H__ */
    > diff --git a/include/net/tcp.h b/include/net/tcp.h
    > index 4e09398..89a6560 100644
    > --- a/include/net/tcp.h
    > +++ b/include/net/tcp.h
    > @@ -2012,10 +2012,12 @@ static inline int tcp_call_bpf(struct sock *sk, int op)
    >  	struct bpf_sock_ops_kern sock_ops;
    >  	int ret;
    >  
    > -	if (sk_fullsock(sk))
    > +	memset(&sock_ops, 0, sizeof(sock_ops));
    > +	if (sk_fullsock(sk)) {
    > +		sock_ops.is_fullsock = 1;
    >  		sock_owned_by_me(sk);
    > +	}
    >  
    > -	memset(&sock_ops, 0, sizeof(sock_ops));
    >  	sock_ops.sk = sk;
    >  	sock_ops.op = op;
    >  
    > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
    > index 4c223ab..80d62e8 100644
    > --- a/include/uapi/linux/bpf.h
    > +++ b/include/uapi/linux/bpf.h
    > @@ -941,6 +941,12 @@ struct bpf_sock_ops {
    >  	__u32 local_ip6[4];	/* Stored in network byte order */
    >  	__u32 remote_port;	/* Stored in network byte order */
    >  	__u32 local_port;	/* stored in host byte order */
    > +	__u32 is_fullsock;	/* Some TCP fields are only valid if
    > +				 * there is a full socket. If not, the
    > +				 * fields read as zero.
    > +				 */
    
    Just a question for clarification: do you need to have this in
    the uapi struct as well? Meaning, do you have a specific use case
    where you do this check from the BPF program, given that the two
    snd_cwnd/srtt_us accesses below check this internally in the ctx
    rewrite?
    
Although is_fullsock is checked internally, the BPF program may want to
verify whether a return value of zero is due to a non-fullsock state. The
issue is that some ops are called from both the active and passive paths,
so it is not possible to tell from the op type alone.

    Do you plan to reuse the ctx assignment also for bpf_setsockopt()
    and bpf_getsockopt() internally?

If you mean checking is_fullsock instead of calling sk_fullsock, then yes,
that is a great idea. However, I would rather do it in another patch so we
don't delay this one (I have other things almost ready that depend on this
patch).

    > +	__u32 snd_cwnd;
    > +	__u32 srtt_us;		/* Averaged RTT << 3 in usecs */
    >  };
    >  
    >  /* List of known BPF sock_ops operators.
    > diff --git a/net/core/filter.c b/net/core/filter.c
    > index 6a85e67..bf31236 100644
    > --- a/net/core/filter.c
    > +++ b/net/core/filter.c
    > @@ -4437,6 +4437,42 @@ static u32 sock_ops_convert_ctx_access(enum bpf_access_type type,
    >  		*insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->dst_reg,
    >  				      offsetof(struct sock_common, skc_num));
    >  		break;
    > +
    > +	case offsetof(struct bpf_sock_ops, is_fullsock):
    > +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(
    > +						struct bpf_sock_ops_kern,
    > +						is_fullsock),
    > +				      si->dst_reg, si->src_reg,
    > +				      offsetof(struct bpf_sock_ops_kern,
    > +					       is_fullsock));
    > +		break;
    > +
    > +/* Helper macro for adding read access to tcp_sock fields. */
    > +#define SOCK_OPS_GET_TCP32(FIELD_NAME)					      \
    > +	do {								      \
    > +		BUILD_BUG_ON(FIELD_SIZEOF(struct tcp_sock, FIELD_NAME) != 4); \
    > +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(			      \
    > +						struct bpf_sock_ops_kern,     \
    > +						is_fullsock),		      \
    > +				      si->dst_reg, si->src_reg,		      \
    > +				      offsetof(struct bpf_sock_ops_kern,      \
    > +					       is_fullsock));		      \
    > +		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 2);	      \
    > +		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(			      \
    > +						struct bpf_sock_ops_kern, sk),\
    > +				      si->dst_reg, si->src_reg,		      \
    > +				      offsetof(struct bpf_sock_ops_kern, sk));\
    > +		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,        \
    > +				      offsetof(struct tcp_sock, FIELD_NAME)); \
    > +	} while (0)
    > +
    > +	case offsetof(struct bpf_sock_ops, snd_cwnd):
    > +		SOCK_OPS_GET_TCP32(snd_cwnd);
    > +		break;
    > +
    > +	case offsetof(struct bpf_sock_ops, srtt_us):
    > +		SOCK_OPS_GET_TCP32(srtt_us);
    > +		break;
    >  	}
    >  	return insn - insn_buf;
    >  }
    > 
    
    
