[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHA+R7PqC=cfb+WTrBrQjNkio9-bWXX9mX43xcRism4qTNiGog@mail.gmail.com>
Date: Fri, 10 Apr 2015 17:45:19 -0700
From: Cong Wang <cwang@...pensource.com>
To: Alexei Starovoitov <ast@...mgrid.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Daniel Borkmann <daniel@...earbox.net>,
Thomas Graf <tgraf@...g.ch>, Jiri Pirko <jiri@...nulli.us>,
Jamal Hadi Salim <jhs@...atatu.com>,
netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH v4 net-next 2/2] tc: add 'needs_l2' flag to ingress qdisc
On Fri, Apr 10, 2015 at 4:33 PM, Alexei Starovoitov <ast@...mgrid.com> wrote:
> TC classifers and actions attached to ingress and egress qdiscs see
> inconsistent skb->data. For ingress L2 header is already pulled, whereas
> for egress it's present. Introduce an optional flag for ingress qdisc
> which if set will cause ingress to push L2 header before calling
> into classifiers/actions and pull L2 back afterwards.
>
> The cls_bpf/act_bpf are now marked as 'needs_l2'. The users can use them
> on ingress qdisc created with 'needs_l2' flag and on any egress qdisc.
> The use of them with vanilla ingress is disallowed.
>
> The ingress_l2 qdisc can only be attached to devices that provide headers_ops.
>
> When ingress is not enabled static_key avoids *(skb->dev->ingress_queue)
>
> When ingress is enabled the difference old vs new to reach qdisc spinlock:
> old:
> *(skb->dev->ingress_queue), if, *(rxq->qdisc), if, *(rxq->qdisc), if
> new:
> *(skb->dev->ingress_queue), if, *(rxq->qdisc), if, if
>
> This patch provides a foundation to use ingress_l2+cls_bpf to filter
> interesting traffic and mirror small part of it to a different netdev for
> capturing. This approach is significantly faster than traditional af_packet,
> since skb_clone is called after filtering. dhclient and other tap-based tools
> may consider switching to this style.
>
Why the flag needs to be visible to user-space? IOW, why users
should learn the knowledge of the need of this flag?
The problem you try to solve is kernel's implementation problem,
users should not care, so we should make some effort to make ingress
align with egress, rather than making more differences.
The more I look at you patch, no matter where you put that flag into,
the more I think it is a workaround. _Maybe_ it fits for short-term,
but you introduce a new API which can't be removed once merged.
This issue should have been there for a rather long time, so it is not
urgent, take some effect to make ingress align with egress, we would
get more benefits.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists