netdev - Re: [bpf-next PATCH 05/16] bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <54a57db7-904e-39c4-b434-0dfff8b2accc@iogearbox.net>
Date:   Wed, 7 Mar 2018 14:03:35 +0100
From:   Daniel Borkmann <daniel@...earbox.net>
To:     John Fastabend <john.fastabend@...il.com>,
        David Miller <davem@...emloft.net>
Cc:     ast@...nel.org, netdev@...r.kernel.org, davejwatson@...com
Subject: Re: [bpf-next PATCH 05/16] bpf: create tcp_bpf_ulp allowing BPF to
 monitor socket TX/RX data

On 03/07/2018 04:25 AM, John Fastabend wrote:
[...]
> Thought about this a bit more and chatted with Daniel a bit. I think
> a better solution is to set data_start = data_end = 0 by default in the
> sendpage case. This will disallow any read/writes into the sendpage
> data. Then if the user needs to read/write data we can use a helper
> bpf_sk_msg_pull_data(start_byte, end_byte) which can pull the data into a
> linear buffer as needed. This will ensure any user writes will not
> change data after the BPF verdict (your concern). Also it will minimize
> the amount of data that needs to be copied (my concern). In some of my
> use cases where no data is needed we can simple not use the helper. Then
> on the sendmsg side we can continue to set the (data_start, data_end)
> pointers to the first scatterlist element. But, also use this helper to
> set the data pointers past the first scatterlist element if needed. So
> if someone wants to read past the first 4k bytes on a large send for
> example this can be done with the above helper. BPF programs just
> need to check (start,end) data pointers and can be oblivious to
> if the program is being invoked by a call from sendpage or sendmsg.
> 
> I think this is a fairly elegant solution. Finally we can further
> optimize later with a flag if needed to cover the case where we
> want to read lots of bytes but _not_ do the copy. We can debate
> the usefulness of this later with actual perf data.
> 
> All this has the added bonus that all I need is another patch on
> top to add the helper. Pseudo code might look like this,
> 
> my_bpf_prog(struct sk_msg_md *msg) {
> 	void *data_end = msg->data_end;
> 	void *data_start = msg->data_start;
> 
> 	need = PARSE_BYTES;
> 
> 	// ensure user actually sent full header
> 	if (msg->size < PARSE_BYTES) {
> 		bpf_msg_cork(PARSE_BYTES);
> 		return SK_DROP;
> 	}
> 
> 	/* ensure we can read full header, if this is a
> 	 * sendmsg system call AND PARSE_BYTES are all in
> 	 * the first scatterlist elem this is a no-op.
> 	 * If this is a sendpage call will put PARSE_BYTES
> 	 * in a psock buffer to avoid user modifications.
> 	 */
> 	if (data_end - data_start < PARSE_BYTES) {

I think it might need to look like 'data_start + PARSE_BYTES > data_end'
for verifier to recognize (unless LLVM generates code that way).

> 		err = bpf_sk_msg_pull_data(0, PARSE_BYTES, flags);
> 		if (err)
> 			return SK_DROP;

Above should be:

		if (unlikely(err || data_start + PARSE_BYTES > data_end))
			return SK_DROP;

Here for the successful case, you need to recheck since data pointers
were invalidated due to the helper call. bpf_sk_msg_pull_data() would
for the very first case potentially be called unconditionally at prog
start though when you start out with 0 len anyway, basically right after
msg->size test.

> 	}
> 
> 	// we have the full header parse it now
> 	verdict = my_bpf_header_parser(msg);
> 	return verdict;
> }
> 
> Future optimization can work with prologue to pull in bytes
> more efficiently. And for what its worth I found a couple bugs
> in the error path of the sendpage hook I can fix in the v2 as well.
> 
> What do you think? 
> 
> @Daniel, sound more or less like what you were thinking?

Yes, absolutely what I was thinking.

We have exactly the same logic in tc/BPF today for the case when the direct
packet access test fails and we want to pull in skb data from non-linear
area, so we can in such case just call bpf_skb_pull_data(skb, len) and redo
the test to access it privately after that.

Thanks,
Daniel