[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20170417081936.GA1892@nanopsycho.orion>
Date: Mon, 17 Apr 2017 10:19:36 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Jamal Hadi Salim <jhs@...atatu.com>
Cc: davem@...emloft.net, netdev@...r.kernel.org,
eric.dumazet@...il.com, xiyou.wangcong@...il.com
Subject: Re: [PATCH net-next 1/1] net sched actions: dump more than
TCA_ACT_MAX_PRIO actions per batch
Sun, Apr 16, 2017 at 02:34:30PM CEST, jhs@...atatu.com wrote:
>From: Jamal Hadi Salim <jhs@...atatu.com>
>
>When you dump hundreds of thousands of actions, getting only 32 per
>dump batch even when the socket buffer and memory allocations allow
>is inefficient.
>
>With this change, the user will get as many as possibly fitting
>within the given constraints.
>
>We reuse the pad fields in tcamsg. pad1 is used as a flag space.
>User explicitly requests for the large dump in order to maintain
>backwards compatibility with user space by setting bit
>ACT_LARGE_DUMP_ON; older user space which doesnt set this flag
>doesnt get the large (than 32) batches - so continues to work
>as before. Older kernels ignore this flag, so legacy behavior
>is maintained.
>The kernel uses pad2 to tell the user how many actions are put in
>a single batch. As such user space app(like tc) knows how long
>to iterate instead of hardcoded maximum of 32.
>
>Some results dumping 1.5M actions, first unpatched tc which the
>kernel doesnt help:
>
>prompt$ time -p tc actions ls action gact | grep index | wc -l
>1500000
>real 1388.43
>user 2.07
>sys 1386.79
>
>Now patched tc which sets the correct flags when requesting a dump:
>
>prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
>1500000
>real 178.13
>user 2.02
>sys 176.96
>
>Improvement: Dump time from about 20 minutes to about 2 minutes.
>
>Signed-off-by: Jamal Hadi Salim <jhs@...atatu.com>
>---
> include/uapi/linux/rtnetlink.h | 7 +++++++
> net/sched/act_api.c | 17 ++++++++++++++---
> 2 files changed, 21 insertions(+), 3 deletions(-)
>
>diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
>index cce0613..3434d98 100644
>--- a/include/uapi/linux/rtnetlink.h
>+++ b/include/uapi/linux/rtnetlink.h
>@@ -678,6 +678,13 @@ struct tcamsg {
> #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
> #define TCA_ACT_TAB 1 /* attr type must be >=1 */
> #define TCAA_MAX 1
>+/* tcamsg flags stored in tca__pad
>+ *
>+ * ACT_LARGE_DUMP_ON user->kernel to request for larger than TCA_ACT_MAX_PRIO
>+ * actions in a dump. All dump responses will contain the number of actions
>+ * being dumped stored in tca__pad2 for user app's info
>+ */
>+#define ACT_LARGE_DUMP_ON (1 << 0)
>
> /* New extended info filters for IFLA_EXT_MASK */
> #define RTEXT_FILTER_VF (1 << 0)
>diff --git a/net/sched/act_api.c b/net/sched/act_api.c
>index 79d875c..90cc774 100644
>--- a/net/sched/act_api.c
>+++ b/net/sched/act_api.c
>@@ -83,6 +83,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> struct netlink_callback *cb)
> {
> int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
>+ unsigned short act_flags = cb->args[2];
> struct nlattr *nest;
>
> spin_lock_bh(&hinfo->lock);
>@@ -111,14 +112,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, struct sk_buff *skb,
> }
> nla_nest_end(skb, nest);
> n_i++;
>- if (n_i >= TCA_ACT_MAX_PRIO)
>+ if (!(act_flags & ACT_LARGE_DUMP_ON) &&
>+ n_i >= TCA_ACT_MAX_PRIO)
> goto done;
> }
> }
> done:
> spin_unlock_bh(&hinfo->lock);
>- if (n_i)
>+ if (n_i) {
> cb->args[0] += n_i;
>+ if (act_flags & ACT_LARGE_DUMP_ON)
>+ cb->args[1] = n_i;
>+ }
> return n_i;
>
> nla_put_failure:
>@@ -1081,6 +1086,7 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> struct tc_action_ops *a_o;
> int ret = 0;
> struct tcamsg *t = (struct tcamsg *) nlmsg_data(cb->nlh);
>+ unsigned char act_flags = t->tca__pad1;
> struct nlattr *kind = find_dump_kind(cb->nlh);
>
> if (kind == NULL) {
>@@ -1096,9 +1102,12 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> cb->nlh->nlmsg_type, sizeof(*t), 0);
> if (!nlh)
> goto out_module_put;
>+
>+ cb->args[2] = act_flags;
>+
> t = nlmsg_data(nlh);
> t->tca_family = AF_UNSPEC;
>- t->tca__pad1 = 0;
>+ t->tca__pad1 = act_flags;
I don't like this one bit. Pad is no longer a pad but by name it still
is. Why don't you introduce new attributes for this?
> t->tca__pad2 = 0;
>
> nest = nla_nest_start(skb, TCA_ACT_TAB);
>@@ -1112,6 +1121,8 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb)
> if (ret > 0) {
> nla_nest_end(skb, nest);
> ret = skb->len;
>+ t->tca__pad2 = cb->args[1];
>+ cb->args[1] = 0;
> } else
> nlmsg_trim(skb, b);
>
>--
>1.9.1
>
Powered by blists - more mailing lists