netdev - Re: [PATCH net-next 3/4] soreuseport: setsockopt SO_ATTACH_REUSEPORT

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20151224163639.GA90724@ast-mbp.thefacebook.com>
Date:	Thu, 24 Dec 2015 09:36:40 -0700
From:	Alexei Starovoitov <alexei.starovoitov@...il.com>
To:	Craig Gallek <kraigatgoog@...il.com>
Cc:	netdev@...r.kernel.org, David Miller <davem@...emloft.net>
Subject: Re: [PATCH net-next 3/4] soreuseport: setsockopt
 SO_ATTACH_REUSEPORT_[CE]BPF

On Tue, Dec 22, 2015 at 03:05:09PM -0500, Craig Gallek wrote:
> From: Craig Gallek <kraig@...gle.com>
> 
> Expose socket options for setting a classic or extended BPF program
> for use when selecting sockets in an SO_REUSEPORT group.  These options
> can be used on the first socket to belong to a group before bind or
> on any socket in the group after bind.
> 
> This change includes refactoring of the existing sk_filter code to
> allow reuse of the existing BPF filter validation checks.
> 
> Signed-off-by: Craig Gallek <kraig@...gle.com>

interesting stuff.

> +static struct sock *run_bpf(struct sock_reuseport *reuse, u16 socks,
> +			    struct bpf_prog *prog, struct sk_buff *skb,
> +			    int hdr_len)
> +{
> +	struct sk_buff *nskb = NULL;
> +	u32 index;
> +
> +	if (skb_shared(skb)) {
> +		nskb = skb_clone(skb, GFP_ATOMIC);
> +		if (!nskb)
> +			return NULL;
> +		skb = nskb;
> +	}

what is the typical case here skb_shared or not?

> +	/* temporarily advance data past protocol header */
> +	if (skb_headlen(skb) < hdr_len || !skb_pull_inline(skb, hdr_len)) {

though bpf core will read just fine past linear part of the packet,
here we're limiting this feature only to packets where udp header is
part of headlen. Will it make it somewhat unreliable?
May be we can avoid doing this pull/push and use negative load
instructions with SKF_NET_OFF ? Something like:
load_word(skb, SKF_NET_OFF + sizeof(struct udphdr)));

>  /**
>   *  reuseport_select_sock - Select a socket from an SO_REUSEPORT group.
>   *  @sk: First socket in the group.
> - *  @hash: Use this hash to select.
> + *  @hash: When no BPF filter is available, use this hash to select.
> + *  @skb: skb to run through BPF filter.
> + *  @hdr_len: BPF filter expects skb data pointer at payload data.  If
> + *    the skb does not yet point at the payload, this parameter represents
> + *    how far the pointer needs to advance to reach the payload.

what is the use case of this?
Do you expect programs to be stateful?

> +				sk2 = reuseport_select_sock(sk, hash, NULL, 0);
...
> +				sk2 = reuseport_select_sock(sk, hash, skb,
> +							sizeof(struct udphdr));

these are the cases that comment is trying to explain?
Meaning the bpf program needs to understand well enough when udp stack
is calling it ?

Will do more careful review of bpf bits once I'm back from PTO.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html