[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <54a57db7-904e-39c4-b434-0dfff8b2accc@iogearbox.net>
Date: Wed, 7 Mar 2018 14:03:35 +0100
From: Daniel Borkmann <daniel@...earbox.net>
To: John Fastabend <john.fastabend@...il.com>,
David Miller <davem@...emloft.net>
Cc: ast@...nel.org, netdev@...r.kernel.org, davejwatson@...com
Subject: Re: [bpf-next PATCH 05/16] bpf: create tcp_bpf_ulp allowing BPF to
monitor socket TX/RX data
On 03/07/2018 04:25 AM, John Fastabend wrote:
[...]
> Thought about this a bit more and chatted with Daniel a bit. I think
> a better solution is to set data_start = data_end = 0 by default in the
> sendpage case. This will disallow any read/writes into the sendpage
> data. Then if the user needs to read/write data we can use a helper
> bpf_sk_msg_pull_data(start_byte, end_byte) which can pull the data into a
> linear buffer as needed. This will ensure any user writes will not
> change data after the BPF verdict (your concern). Also it will minimize
> the amount of data that needs to be copied (my concern). In some of my
> use cases where no data is needed we can simple not use the helper. Then
> on the sendmsg side we can continue to set the (data_start, data_end)
> pointers to the first scatterlist element. But, also use this helper to
> set the data pointers past the first scatterlist element if needed. So
> if someone wants to read past the first 4k bytes on a large send for
> example this can be done with the above helper. BPF programs just
> need to check (start,end) data pointers and can be oblivious to
> if the program is being invoked by a call from sendpage or sendmsg.
>
> I think this is a fairly elegant solution. Finally we can further
> optimize later with a flag if needed to cover the case where we
> want to read lots of bytes but _not_ do the copy. We can debate
> the usefulness of this later with actual perf data.
>
> All this has the added bonus that all I need is another patch on
> top to add the helper. Pseudo code might look like this,
>
> my_bpf_prog(struct sk_msg_md *msg) {
> void *data_end = msg->data_end;
> void *data_start = msg->data_start;
>
> need = PARSE_BYTES;
>
> // ensure user actually sent full header
> if (msg->size < PARSE_BYTES) {
> bpf_msg_cork(PARSE_BYTES);
> return SK_DROP;
> }
>
> /* ensure we can read full header, if this is a
> * sendmsg system call AND PARSE_BYTES are all in
> * the first scatterlist elem this is a no-op.
> * If this is a sendpage call will put PARSE_BYTES
> * in a psock buffer to avoid user modifications.
> */
> if (data_end - data_start < PARSE_BYTES) {
I think it might need to look like 'data_start + PARSE_BYTES > data_end'
for verifier to recognize (unless LLVM generates code that way).
> err = bpf_sk_msg_pull_data(0, PARSE_BYTES, flags);
> if (err)
> return SK_DROP;
Above should be:
if (unlikely(err || data_start + PARSE_BYTES > data_end))
return SK_DROP;
Here for the successful case, you need to recheck since data pointers
were invalidated due to the helper call. bpf_sk_msg_pull_data() would
for the very first case potentially be called unconditionally at prog
start though when you start out with 0 len anyway, basically right after
msg->size test.
> }
>
> // we have the full header parse it now
> verdict = my_bpf_header_parser(msg);
> return verdict;
> }
>
> Future optimization can work with prologue to pull in bytes
> more efficiently. And for what its worth I found a couple bugs
> in the error path of the sendpage hook I can fix in the v2 as well.
>
> What do you think?
>
> @Daniel, sound more or less like what you were thinking?
Yes, absolutely what I was thinking.
We have exactly the same logic in tc/BPF today for the case when the direct
packet access test fails and we want to pull in skb data from non-linear
area, so we can in such case just call bpf_skb_pull_data(skb, len) and redo
the test to access it privately after that.
Thanks,
Daniel
Powered by blists - more mailing lists