[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAOrHB_DHAjUp8Mf1_d-uKiiJPqUmwOvQDCTveFuvEhj9oJ-DhQ@mail.gmail.com>
Date: Mon, 23 Apr 2018 23:30:59 -0700
From: Pravin Shelar <pshelar@....org>
To: Yi-Hung Wei <yihung.wei@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next v2 2/2] openvswitch: Support conntrack zone limit
On Tue, Apr 17, 2018 at 5:30 PM, Yi-Hung Wei <yihung.wei@...il.com> wrote:
> Currently, nf_conntrack_max is used to limit the maximum number of
> conntrack entries in the conntrack table for every network namespace.
> For the VMs and containers that reside in the same namespace,
> they share the same conntrack table, and the total # of conntrack entries
> for all the VMs and containers are limited by nf_conntrack_max. In this
> case, if one of the VM/container abuses the usage the conntrack entries,
> it blocks the others from committing valid conntrack entries into the
> conntrack table. Even if we can possibly put the VM in different network
> namespace, the current nf_conntrack_max configuration is kind of rigid
> that we cannot limit different VM/container to have different # conntrack
> entries.
>
> To address the aforementioned issue, this patch proposes to have a
> fine-grained mechanism that could further limit the # of conntrack entries
> per-zone. For example, we can designate different zone to different VM,
> and set conntrack limit to each zone. By providing this isolation, a
> mis-behaved VM only consumes the conntrack entries in its own zone, and
> it will not influence other well-behaved VMs. Moreover, the users can
> set various conntrack limit to different zone based on their preference.
>
> The proposed implementation utilizes Netfilter's nf_conncount backend
> to count the number of connections in a particular zone. If the number of
> connection is above a configured limitation, ovs will return ENOMEM to the
> userspace. If userspace does not configure the zone limit, the limit
> defaults to zero that is no limitation, which is backward compatible to
> the behavior without this patch.
>
> The following high leve APIs are provided to the userspace:
> - OVS_CT_LIMIT_CMD_SET:
> * set default connection limit for all zones
> * set the connection limit for a particular zone
> - OVS_CT_LIMIT_CMD_DEL:
> * remove the connection limit for a particular zone
> - OVS_CT_LIMIT_CMD_GET:
> * get the default connection limit for all zones
> * get the connection limit for a particular zone
>
> Signed-off-by: Yi-Hung Wei <yihung.wei@...il.com>
> ---
> net/openvswitch/Kconfig | 3 +-
> net/openvswitch/conntrack.c | 498 +++++++++++++++++++++++++++++++++++++++++++-
> net/openvswitch/conntrack.h | 9 +-
> net/openvswitch/datapath.c | 7 +-
> net/openvswitch/datapath.h | 1 +
> 5 files changed, 512 insertions(+), 6 deletions(-)
>
> diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
> index 2650205cdaf9..89da9512ec1e 100644
> --- a/net/openvswitch/Kconfig
> +++ b/net/openvswitch/Kconfig
> @@ -9,7 +9,8 @@ config OPENVSWITCH
> (NF_CONNTRACK && ((!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6) && \
> (!NF_NAT || NF_NAT) && \
> (!NF_NAT_IPV4 || NF_NAT_IPV4) && \
> - (!NF_NAT_IPV6 || NF_NAT_IPV6)))
> + (!NF_NAT_IPV6 || NF_NAT_IPV6) && \
> + (!NETFILTER_CONNCOUNT || NETFILTER_CONNCOUNT)))
> select LIBCRC32C
> select MPLS
> select NET_MPLS_GSO
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index c5904f629091..d09b572f72b4 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -17,7 +17,9 @@
> #include <linux/udp.h>
> #include <linux/sctp.h>
> #include <net/ip.h>
> +#include <net/genetlink.h>
> #include <net/netfilter/nf_conntrack_core.h>
> +#include <net/netfilter/nf_conntrack_count.h>
> #include <net/netfilter/nf_conntrack_helper.h>
> #include <net/netfilter/nf_conntrack_labels.h>
> #include <net/netfilter/nf_conntrack_seqadj.h>
> @@ -76,6 +78,38 @@ struct ovs_conntrack_info {
> #endif
> };
>
> +#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> +#define OVS_CT_LIMIT_UNLIMITED 0
> +#define OVS_CT_LIMIT_DEFAULT OVS_CT_LIMIT_UNLIMITED
> +#define CT_LIMIT_HASH_BUCKETS 512
> +
Can you use static key when the limit is not set.
This would avoid overhead in datapath when these limits are not used.
> +struct ovs_ct_limit {
> + /* Elements in ovs_ct_limit_info->limits hash table */
> + struct hlist_node hlist_node;
> + struct rcu_head rcu;
> + u16 zone;
> + u32 limit;
> +};
> +
...
> +#endif
> +
> /* Lookup connection and confirm if unconfirmed. */
> static int ovs_ct_commit(struct net *net, struct sw_flow_key *key,
> const struct ovs_conntrack_info *info,
> @@ -1054,6 +1176,13 @@ static int ovs_ct_commit(struct net *net, struct sw_flow_key *key,
> if (!ct)
> return 0;
>
> +#if IS_ENABLED(CONFIG_NETFILTER_CONNCOUNT)
> + err = ovs_ct_check_limit(net, info,
> + &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple);
> + if (err)
> + return err;
> +#endif
> +
This could be checked during flow install time, so that only permitted
flows would have 'ct commit' action, we can avoid per packet cost
checking the limit.
returning error code form ovs_ct_commit() is lost in datapath and it
would be hard to debug packet lost in case of the limit is reached. So
another advantage of checking the limit in flow install be better
traceability. datapath would return error to usespace and it can log
the error code.
Powered by blists - more mailing lists