Message-ID: <58EBE21B.5000602@iogearbox.net>
Date:   Mon, 10 Apr 2017 21:50:51 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>,
        David Miller <davem@...emloft.net>
CC:     netdev@...r.kernel.org, xdp-newbies@...r.kernel.org
Subject: Re: [PATCH v2 net-next RFC] Generic XDP

On 04/10/2017 04:18 AM, Alexei Starovoitov wrote:
[...]
>> +	xdp.data_end = xdp.data + hlen;
>> +	xdp.data_hard_start = xdp.data - skb_headroom(skb);
>> +	orig_data = xdp.data;
>> +	act = bpf_prog_run_xdp(xdp_prog, &xdp);
>> +
>> +	off = xdp.data - orig_data;
>> +	if (off)
>> +		__skb_push(skb, off);
>
> and restore l2 back somehow and get new skb->protocol ?
> if we simply do __skb_pull(skb, skb->mac_len); like
> we do with cls_bpf, it will not work correctly,
> since if the program did ip->ipip encap (like our balancer
> does and the test tools/testing/selftests/bpf/test_xdp.c)
> the skb metadata fields will be wrong.
> So we need to repeat eth_type_trans() here if (xdp.data != orig_data)

Yeah, agreed. Also, when we have a GSO skb and rewrite/resize
parts of the packet, we would need to update the GSO-related
shinfo metadata accordingly (e.g. a rewrite between v4 and v6, a
rewrite of the whole packet as an ICMP reply, etc.)?

Also, what about encap/decap, should inner skb headers get
updated as well along with skb->encapsulation, etc? How do we
handle checksumming on this layer?
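
To make the eth_type_trans() point concrete, here is a minimal
userland sketch, not kernel code. It assumes a hypothetical
struct pkt_sketch standing in for the few skb fields involved, and
it models only the EtherType part of what eth_type_trans() derives
(not pkt_type, the mac header offset, GSO, or checksum state). The
names pkt_sketch, parse_ethertype() and pkt_apply_new_head() are
made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userland stand-in for the few skb fields discussed here. */
struct pkt_sketch {
	uint8_t *head;      /* start of the buffer, i.e. start of headroom */
	uint8_t *data;      /* current start of the L2 header */
	size_t len;         /* bytes from data to the end of the packet */
	uint16_t protocol;  /* cached EtherType, host byte order */
};

#define ETH_HLEN_SKETCH 14

/* Read the EtherType (bytes 12..13 of an Ethernet header, big endian). */
static uint16_t parse_ethertype(const uint8_t *l2)
{
	return (uint16_t)((l2[12] << 8) | l2[13]);
}

/*
 * Accept the data pointer left behind by the program and, if it moved,
 * re-derive the cached protocol from the new L2 header.
 * Returns 0 on success, -1 if the new pointer is out of bounds.
 */
static int pkt_apply_new_head(struct pkt_sketch *p, uint8_t *new_data)
{
	uint8_t *end;

	assert(p && p->head && p->data >= p->head);
	end = p->data + p->len;

	/* The new head must stay inside [head, end] and leave room for L2. */
	if (new_data < p->head || new_data > end)
		return -1;
	if ((size_t)(end - new_data) < ETH_HLEN_SKETCH)
		return -1;

	if (new_data != p->data) {
		p->data = new_data;
		p->len = (size_t)(end - new_data);
		/* Metadata cached before the program ran is stale now. */
		p->protocol = parse_ethertype(p->data);
	}
	return 0;
}
```

The only thing this shows is the ordering: take the new head, fix
up the length, then re-parse L2 rather than trusting what was
cached before the program ran.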

> In case of cls_bpf when we mess with skb sizes we always
> adjust skb metafields in helpers, so there it's fine
> and __skb_pull(skb, skb->mac_len); is enough.
> Here we need to be a bit more careful.

In cls_bpf I was looking into something generic and fast for
encap/decap, like bpf_xdp_adjust_head() but for skbs. The problem
is that skbs can be received from ingress/egress and transmitted
further from cls_bpf to ingress/egress, so it is crucial to keep
skb metadata correct and up to date without exposing skb
(implementation) details like header pointers to users; otherwise
these can get messed up, potentially affecting the rest of the
system. We restricted helpers in cls_bpf to avoid that. Perhaps
we could make easier assumptions when this generic callback is
known to be called out of a physical driver's rx path, but when
it is an skb already (as mentioned in Alexei's thoughts below) ...

>>   static int netif_receive_skb_internal(struct sk_buff *skb)
>>   {
>>   	int ret;
>> @@ -4258,6 +4336,21 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
>>
>>   	rcu_read_lock();
>>
>> +	if (static_key_false(&generic_xdp_needed)) {
>> +		struct bpf_prog *xdp_prog = rcu_dereference(skb->dev->xdp_prog);
>> +
>> +		if (xdp_prog) {
>> +			u32 act = netif_receive_generic_xdp(skb, xdp_prog);
>
> That's indeed the best attachment point in the stack.
> I was trying to see whether it can be lowered into something like
> dev_gro_receive(), but not everyone calls it.
> Another option to put it into eth_type_trans() itself, then
> there are no problems with gro, l2 headers, and adjust_head,
> but changing all drivers is too much.
>
>> +
>> +			if (act != XDP_PASS) {
>> +				rcu_read_unlock();
>> +				if (act == XDP_TX)
>> +					dev_queue_xmit(skb);
>
> It should be fine. For cls_bpf we do recursion check __bpf_tx_skb()
> but I forgot specific details. Maybe here it's fine as-is.
> Daniel, do we need recursion check here?

Yeah, Willem is correct. That check was for the
sch_handle_egress() to sch_handle_egress() case, which is
otherwise not accounted for by the main xmit_recursion check we
have in __dev_queue_xmit().
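
For readers not familiar with that kind of check, the general idea
of a depth-limited transmit guard can be sketched in userland as
below. This is an illustration only, not the kernel's
implementation: struct xmit_ctx_sketch stands in for a per-CPU
counter, the limit value is an assumption, and guarded_xmit() and
requeue_always() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed limit, chosen for illustration only. */
#define XMIT_RECURSION_LIMIT_SKETCH 10

/* Stands in for a per-CPU nesting counter. */
struct xmit_ctx_sketch {
	int depth;
};

typedef int (*xmit_fn)(struct xmit_ctx_sketch *ctx, void *arg);

/*
 * Run one transmit step unless we are already nested too deeply.
 * Returns -1 when the limit is hit, else whatever fn returns.
 */
static int guarded_xmit(struct xmit_ctx_sketch *ctx, xmit_fn fn, void *arg)
{
	int ret;

	assert(ctx && fn);
	if (ctx->depth >= XMIT_RECURSION_LIMIT_SKETCH)
		return -1;	/* refuse instead of recursing further */

	ctx->depth++;
	ret = fn(ctx, arg);
	ctx->depth--;
	return ret;
}

/* Worst-case callback: every transmit re-enters the transmit path. */
static int requeue_always(struct xmit_ctx_sketch *ctx, void *arg)
{
	int *calls = arg;

	(*calls)++;
	return guarded_xmit(ctx, requeue_always, arg);
}
```

With a zeroed context, guarded_xmit(&ctx, requeue_always, &calls)
unwinds with -1 once the limit is reached and leaves depth at 0
again, which is the property such a check is there to provide.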

Thanks,
Daniel
