[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <vbfimq88bx5.fsf@mellanox.com>
Date: Wed, 4 Sep 2019 12:48:49 +0000
From: Vlad Buslov <vladbu@...lanox.com>
To: Pablo Neira Ayuso <pablo@...filter.org>
CC: "netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"jakub.kicinski@...ronome.com" <jakub.kicinski@...ronome.com>,
"jiri@...nulli.us" <jiri@...nulli.us>,
Saeed Mahameed <saeedm@...lanox.com>,
"vishal@...lsio.com" <vishal@...lsio.com>,
Vlad Buslov <vladbu@...lanox.com>
Subject: Re: [PATCH net-next,v2 3/4] net: flow_offload: mangle action at byte
level
On Tue 03 Sep 2019 at 19:45, Pablo Neira Ayuso <pablo@...filter.org> wrote:
> The flow mangle action is originally modeled after the tc pedit action,
> this has a number of shortcomings:
>
> 1) The tc pedit offset must be set on the 32-bits boundaries. Many
> protocol header field offsets are not aligned to 32-bits, eg. port
> destination, port source and ethernet destination. This patch adjusts
> the offset accordingly and trim off length in these case, so drivers get
> an exact offset and length to the header fields.
>
> 2) The maximum mangle length is one word of 32-bits, hence you need to
> up to four actions to mangle an IPv6 address. This patch coalesces
> consecutive tc pedit actions into one single action so drivers can
> configure the IPv6 mangling in one go. Ethernet address fields now
> require one single action instead of two too.
>
> The following drivers have been updated accordingly to use this new
> mangle action layout:
>
> 1) The cxgb4 driver does not need to split protocol field matching
> larger than one 32-bit words into multiple definitions. Instead one
> single definition per protocol field is enough. Checking for
> transport protocol ports is also simplified.
>
> 2) The mlx5 driver logic to disallow IPv4 ttl and IPv6 hoplimit fields
> becomes more simple too.
>
> 3) The nfp driver uses the nfp_fl_set_helper() function to configure the
> payload mangling. The memchr_inv() function is used to check for
> proper initialization of the value and mask. The driver has been
> updated to refer to the exact protocol header offsets too.
>
> As a result, this patch reduces code complexity on the driver side at
> the cost of adding ~100 LOC at the core to perform offset and length
> adjustment; and to coalesce consecutive actions.
>
> Signed-off-by: Pablo Neira Ayuso <pablo@...filter.org>
> ---
> .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c | 162 +++++-----------
> .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.h | 40 ++--
> drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 90 +++------
> drivers/net/ethernet/netronome/nfp/flower/action.c | 203 ++++++++++-----------
> include/net/flow_offload.h | 7 +-
> net/sched/cls_api.c | 145 ++++++++++++---
> 6 files changed, 309 insertions(+), 338 deletions(-)
[...]
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> index f29895b3a947..b7b88bc22cf7 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
> @@ -2201,19 +2201,24 @@ static int pedit_header_offsets[] = {
>
> #define pedit_header(_ph, _htype) ((void *)(_ph) + pedit_header_offsets[_htype])
>
> -static int set_pedit_val(u8 hdr_type, u32 mask, u32 val, u32 offset,
> +static int set_pedit_val(u8 hdr_type, const struct flow_action_entry *act,
> struct pedit_headers_action *hdrs)
> {
> - u32 *curr_pmask, *curr_pval;
> + u32 offset = act->mangle.offset;
> + u8 *curr_pmask, *curr_pval;
> + int i;
>
> - curr_pmask = (u32 *)(pedit_header(&hdrs->masks, hdr_type) + offset);
> - curr_pval = (u32 *)(pedit_header(&hdrs->vals, hdr_type) + offset);
> + curr_pmask = (u8 *)(pedit_header(&hdrs->masks, hdr_type) + offset);
> + curr_pval = (u8 *)(pedit_header(&hdrs->vals, hdr_type) + offset);
>
> - if (*curr_pmask & mask) /* disallow acting twice on the same location */
> - goto out_err;
> + for (i = 0; i < act->mangle.len; i++) {
> + /* disallow acting twice on the same location */
> + if (curr_pmask[i] & act->mangle.mask[i])
> + goto out_err;
>
> - *curr_pmask |= mask;
> - *curr_pval |= val;
> + curr_pmask[i] |= act->mangle.mask[i];
> + curr_pval[i] |= act->mangle.val[i];
> + }
>
> return 0;
>
> @@ -2487,7 +2492,6 @@ static int parse_tc_pedit_action(struct mlx5e_priv *priv,
> {
> u8 cmd = (act->id == FLOW_ACTION_MANGLE) ? 0 : 1;
> int err = -EOPNOTSUPP;
> - u32 mask, val, offset;
> u8 htype;
>
> htype = act->mangle.htype;
> @@ -2504,11 +2508,7 @@ static int parse_tc_pedit_action(struct mlx5e_priv *priv,
> goto out_err;
> }
>
> - mask = act->mangle.mask;
> - val = act->mangle.val;
> - offset = act->mangle.offset;
> -
> - err = set_pedit_val(htype, mask, val, offset, &hdrs[cmd]);
> + err = set_pedit_val(htype, act, &hdrs[cmd]);
> if (err)
> goto out_err;
>
> @@ -2589,50 +2589,18 @@ static bool csum_offload_supported(struct mlx5e_priv *priv,
> return true;
> }
>
> -struct ip_ttl_word {
> - __u8 ttl;
> - __u8 protocol;
> - __sum16 check;
> -};
> -
> -struct ipv6_hoplimit_word {
> - __be16 payload_len;
> - __u8 nexthdr;
> - __u8 hop_limit;
> -};
> -
> static bool is_action_keys_supported(const struct flow_action_entry *act)
> {
> - u32 mask, offset;
> - u8 htype;
> + u32 offset = act->mangle.offset;
> + u8 htype = act->mangle.htype;
>
> - htype = act->mangle.htype;
> - offset = act->mangle.offset;
> - mask = act->mangle.mask;
> - /* For IPv4 & IPv6 header check 4 byte word,
> - * to determine that modified fields
> - * are NOT ttl & hop_limit only.
> - */
> - if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4) {
> - struct ip_ttl_word *ttl_word =
> - (struct ip_ttl_word *)&mask;
> -
> - if (offset != offsetof(struct iphdr, ttl) ||
> - ttl_word->protocol ||
> - ttl_word->check) {
> - return true;
> - }
> - } else if (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6) {
> - struct ipv6_hoplimit_word *hoplimit_word =
> - (struct ipv6_hoplimit_word *)&mask;
> -
> - if (offset != offsetof(struct ipv6hdr, payload_len) ||
> - hoplimit_word->payload_len ||
> - hoplimit_word->nexthdr) {
> - return true;
> - }
> - }
> - return false;
> + if ((htype == FLOW_ACT_MANGLE_HDR_TYPE_IP4 &&
> + offset == offsetof(struct iphdr, ttl)) ||
> + (htype == FLOW_ACT_MANGLE_HDR_TYPE_IP6 &&
> + offset == offsetof(struct ipv6hdr, hop_limit)))
> + return false;
> +
> + return true;
> }
With this change is_action_keys_supported() incorrectly returns true for
non-IP{4|6} mangles. I guess naming of the functions doesn't help
because it should be something like is_action_iphdr_keys_supported()...
Anyway, this results following rule to be incorrectly rejected by
driver:
tc filter add dev ens1f0_0 protocol ip parent ffff: prio 3
flower dst_mac e4:1d:2d:fd:8b:02 skip_sw
action pedit ex munge eth src set 11:22:33:44:55:66 munge eth dst set
aa:bb:cc:dd:ee:ff pipe
action csum ip pipe
action tunnel_key set id 98 src_ip 2.2.2.2 dst_ip 2.2.2.3 dst_port 1234
action mirred egress redirect dev vxlan1
The pedit action is rejected by conditional that follows the loop in
modify_header_match_supported() which calls is_action_keys_supported().
With this change modify_ip_header==true (even though the pedit only
modifies eth header), which causes failure because ip proto is not
supported:
Error: mlx5_core: can't offload re-write of non TCP/UDP.
ERROR: [ 3345.830338] can't offload re-write of ip proto 0
Powered by blists - more mailing lists