Date:   Wed, 7 Oct 2020 23:37:00 +0200
From:   Daniel Borkmann <daniel@...earbox.net>
To:     Jesper Dangaard Brouer <brouer@...hat.com>, bpf@...r.kernel.org
Cc:     netdev@...r.kernel.org, Daniel Borkmann <borkmann@...earbox.net>,
        Alexei Starovoitov <alexei.starovoitov@...il.com>,
        maze@...gle.com, lmb@...udflare.com, shaun@...era.io,
        Lorenzo Bianconi <lorenzo@...nel.org>, marek@...udflare.com,
        John Fastabend <john.fastabend@...il.com>,
        Jakub Kicinski <kuba@...nel.org>, eyal.birger@...il.com
Subject: Re: [PATCH bpf-next V2 5/6] bpf: Add MTU check for TC-BPF packets
 after egress hook

On 10/7/20 6:23 PM, Jesper Dangaard Brouer wrote:
[...]
>   net/core/dev.c |   24 ++++++++++++++++++++++--
>   1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b433098896b2..19406013f93e 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3870,6 +3870,7 @@ sch_handle_egress(struct sk_buff *skb, int *ret, struct net_device *dev)
>   	switch (tcf_classify(skb, miniq->filter_list, &cl_res, false)) {
>   	case TC_ACT_OK:
>   	case TC_ACT_RECLASSIFY:
> +		*ret = NET_XMIT_SUCCESS;
>   		skb->tc_index = TC_H_MIN(cl_res.classid);
>   		break;
>   	case TC_ACT_SHOT:
> @@ -4064,9 +4065,12 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>   {
>   	struct net_device *dev = skb->dev;
>   	struct netdev_queue *txq;
> +#ifdef CONFIG_NET_CLS_ACT
> +	bool mtu_check = false;
> +#endif
> +	bool again = false;
>   	struct Qdisc *q;
>   	int rc = -ENOMEM;
> -	bool again = false;
>   
>   	skb_reset_mac_header(skb);
>   
> @@ -4082,14 +4086,28 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>   
>   	qdisc_pkt_len_init(skb);
>   #ifdef CONFIG_NET_CLS_ACT
> +	mtu_check = skb_is_redirected(skb);
>   	skb->tc_at_ingress = 0;
>   # ifdef CONFIG_NET_EGRESS
>   	if (static_branch_unlikely(&egress_needed_key)) {
> +		unsigned int len_orig = skb->len;
> +
>   		skb = sch_handle_egress(skb, &rc, dev);
>   		if (!skb)
>   			goto out;
> +		/* BPF-prog ran and could have changed packet size beyond MTU */
> +		if (rc == NET_XMIT_SUCCESS && skb->len > len_orig)
> +			mtu_check = true;
>   	}
>   # endif
> +	/* MTU-check only happens on "last" net_device in a redirect sequence
> +	 * (e.g. above sch_handle_egress can steal SKB and skb_do_redirect it
> +	 * either ingress or egress to another device).
> +	 */

Hmm, quite some overhead in the fast path. Also, won't this be checked multiple
times on stacked devices? :( Moreover, this misses the fact that 'real' qdiscs
can have filters attached too, and those would run after this check. Couldn't
this instead live in the driver layer for those that really need it? I would
probably just drop the check as done in 1/6 and allow the BPF prog to do the
validation itself if needed.

> +	if (mtu_check && !is_skb_forwardable(dev, skb)) {
> +		rc = -EMSGSIZE;
> +		goto drop;
> +	}
>   #endif
>   	/* If device/qdisc don't need skb->dst, release it right now while
>   	 * its hot in this cpu cache.
> @@ -4157,7 +4175,9 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
>   
>   	rc = -ENETDOWN;
>   	rcu_read_unlock_bh();
> -
> +#ifdef CONFIG_NET_CLS_ACT
> +drop:
> +#endif
>   	atomic_long_inc(&dev->tx_dropped);
>   	kfree_skb_list(skb);
>   	return rc;
> 
