Message-ID: <36057f08-dc87-c0f5-591f-859eaa508f2d@gmail.com>
Date: Mon, 5 Mar 2018 23:06:01 -0800
From: John Fastabend <john.fastabend@...il.com>
To: David Miller <davem@...emloft.net>
Cc: ast@...nel.org, daniel@...earbox.net, netdev@...r.kernel.org,
davejwatson@...com
Subject: Re: [bpf-next PATCH 05/16] bpf: create tcp_bpf_ulp allowing BPF to
monitor socket TX/RX data
On 03/05/2018 10:42 PM, David Miller wrote:
> From: John Fastabend <john.fastabend@...il.com>
> Date: Mon, 5 Mar 2018 22:22:21 -0800
>
>> All I meant by this is if an application uses sendfile() call
>> there is no good way to know when/if the kernel side will copy or
>> xmit the data. So a reliable user space application will need to
>> only modify the data if it "knows" there are no outstanding sends
>> in-flight. So if we assume applications follow this then it
>> is OK to avoid the copy. Of course this is not good enough for
>> security, but for monitoring/statistics (my use case 1) it works.
>
> For an application implementing a networking file system, it's pretty
> legitimate for file contents to change before the page gets DMA'd to
> the networking card.
>
Still, there are useful BPF programs that can tolerate this, so I
would prefer to allow BPF programs to operate in no-copy mode if
wanted. It doesn't have to be the default, though, as it currently
is. An L7 load balancer is a good example of this.
> And that's perfectly fine, and we do everything such that this will work
> properly.
>
> The card checksums what ends up being DMA'd so nothing from the
> networking side is broken.
Assuming the card has checksum support, correct? That is why we have
SKBTX_SHARED_FRAG, checked in skb_has_shared_frag(), and the checksum
helpers that drivers call when they do not support the protocol being
used. So it is probably an OK assumption with supported protocols and
hardware? Perhaps in general folks just use normal protocols and
hardware, so it works.
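To make the concern concrete, here is a rough userspace sketch of why a
checksum computed in software goes stale if the page changes before it
is sent. It is not kernel code: inet_checksum() and checksum_matches()
are hypothetical names, and the sum is a plain RFC 1071 style 16-bit
one's-complement checksum rather than the kernel's csum helpers.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: RFC 1071 style 16-bit one's-complement checksum. */
static uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Sum the buffer as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];

    /* Pad a trailing odd byte with zero. */
    if (i < len)
        sum += (uint32_t)buf[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/*
 * Hypothetical helper: returns 1 if the checksum saved earlier still
 * describes the buffer, 0 if the data changed underneath it.
 */
static int checksum_matches(const uint8_t *buf, size_t len, uint16_t saved)
{
    return inet_checksum(buf, len) == saved;
}
```

If the checksum is saved first and the page is modified afterwards,
checksum_matches() fails. A card that checksums what it actually DMAs
does not have this window, which is the case described above.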
>
> So this assumption you mention really does not hold.
>
OK.
> There needs to be some feedback from the BPF program that parses the
> packet. This way it can say, "I need at least X more bytes before I
> can generate a verdict". And you keep copying more and more bytes
> into a linear buffer and calling the parser over and over until it can
> generate a full verdict or you run out of networking data.
>
So the "I need at least X more bytes" is the msg_cork_bytes() in patch
7. I could handle the sendpage case the same way I handle the sendmsg
case and copy the data into the buffer until N bytes are received. I
had planned to add this mode in a follow-up series, but could add it in
this series so we have all the pieces in one submission.
Note that I used a scatterlist instead of a linear buffer, though. I
was planning to add a helper to pull in the next sg list item if
needed, rather than try to allocate a large linear block up front.
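The cork-until-N-bytes flow can be sketched in userspace roughly as
follows. This is only an illustration under stated assumptions, not the
code from the series: struct msg_queue, toy_parser() and queue_add()
are hypothetical names, the fragment array stands in for a scatterlist,
and the toy parser expects a one-byte length prefix.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAGS 8

/* Hypothetical stand-in for one scatterlist entry (data is not copied). */
struct frag {
    const char *data;
    size_t len;
};

/* Hypothetical per-socket queue of fragments plus the cork request. */
struct msg_queue {
    struct frag frags[MAX_FRAGS];
    size_t nfrags;
    size_t total;       /* bytes queued so far */
    size_t cork_bytes;  /* bytes the parser asked for, 0 if none */
};

enum verdict { VERDICT_NEED_MORE, VERDICT_PASS, VERDICT_DROP };

/*
 * Toy parser: the first byte is a payload length N, so a full verdict
 * needs 1 + N bytes. Until then it records how many bytes it wants.
 */
static enum verdict toy_parser(struct msg_queue *q)
{
    size_t need = 1 + (size_t)(unsigned char)q->frags[0].data[0];

    if (q->total < need) {
        q->cork_bytes = need;
        return VERDICT_NEED_MORE;
    }
    q->cork_bytes = 0;
    return VERDICT_PASS;
}

/*
 * Queue one fragment by reference and only call the parser once the
 * requested number of bytes has accumulated.
 */
static enum verdict queue_add(struct msg_queue *q, const char *data, size_t len)
{
    assert(q != NULL && data != NULL);

    if (len == 0 || q->nfrags == MAX_FRAGS)
        return VERDICT_DROP;

    q->frags[q->nfrags].data = data;
    q->frags[q->nfrags].len = len;
    q->nfrags++;
    q->total += len;

    if (q->total < q->cork_bytes)
        return VERDICT_NEED_MORE;  /* still corked, skip the parser */

    return toy_parser(q);
}
```

Appending list entries avoids allocating a large linear block up front;
the cost is that a parser reading past the first entry needs a helper
to pull in the next one, which this sketch leaves out.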