netdev - Re: [PATCH RFC net-next] net/sched: act_police: add support for packet-per-second policing

Message-ID: <20210127110222.GA29081@netronome.com>
Date:   Wed, 27 Jan 2021 12:02:23 +0100
From:   Simon Horman <simon.horman@...ronome.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Jamal Hadi Salim <jhs@...atatu.com>,
        Cong Wang <xiyou.wangcong@...il.com>,
        Jiri Pirko <jiri@...lanox.com>, netdev@...r.kernel.org,
        oss-drivers@...ronome.com,
        Baowen Zheng <baowen.zheng@...igine.com>,
        Louis Peens <louis.peens@...ronome.com>
Subject: Re: [PATCH RFC net-next] net/sched: act_police: add support for
 packet-per-second policing

Hi Jakub,

On Tue, Jan 26, 2021 at 06:38:12PM -0800, Jakub Kicinski wrote:
> On Mon, 25 Jan 2021 16:18:19 +0100 Simon Horman wrote:
> > From: Baowen Zheng <baowen.zheng@...igine.com>
> > 
> > Allow a policer action to enforce a rate-limit based on packets-per-second,
> > configurable using a packet-per-second rate and burst parameters. This may
> > be used in conjunction with existing byte-per-second rate limiting in the
> > same policer action.
> > 
> > e.g.
> > tc filter add dev tap1 parent ffff: u32 match \
> >               u32 0 0 police pkts_rate 3000 pkts_burst 1000
> > 
> > Testing was unable to uncover a performance impact of this change on
> > existing features.
> > 
> > Signed-off-by: Baowen Zheng <baowen.zheng@...igine.com>
> > Signed-off-by: Simon Horman <simon.horman@...ronome.com>
> > Signed-off-by: Louis Peens <louis.peens@...ronome.com>
> 
> > diff --git a/net/sched/act_police.c b/net/sched/act_police.c
> > index 8d8452b1cdd4..d700b2105535 100644
> > --- a/net/sched/act_police.c
> > +++ b/net/sched/act_police.c
> > @@ -42,6 +42,8 @@ static const struct nla_policy police_policy[TCA_POLICE_MAX + 1] = {
> >  	[TCA_POLICE_RESULT]	= { .type = NLA_U32 },
> >  	[TCA_POLICE_RATE64]     = { .type = NLA_U64 },
> >  	[TCA_POLICE_PEAKRATE64] = { .type = NLA_U64 },
> > +	[TCA_POLICE_PKTRATE64]  = { .type = NLA_U64 },
> > +	[TCA_POLICE_PKTBURST64] = { .type = NLA_U64 },
> 
> Should we set the policy so that .min = 1?

Yes, I think so.
Thanks for spotting that.

> >  };
> >  
> >  static int tcf_police_init(struct net *net, struct nlattr *nla,
> > @@ -61,6 +63,7 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
> >  	bool exists = false;
> >  	u32 index;
> >  	u64 rate64, prate64;
> > +	u64 pps, ppsburst;
> >  
> >  	if (nla == NULL)
> >  		return -EINVAL;
> > @@ -183,6 +186,16 @@ static int tcf_police_init(struct net *net, struct nlattr *nla,
> >  	if (tb[TCA_POLICE_AVRATE])
> >  		new->tcfp_ewma_rate = nla_get_u32(tb[TCA_POLICE_AVRATE]);
> >  
> > +	if (tb[TCA_POLICE_PKTRATE64] && tb[TCA_POLICE_PKTBURST64]) {
> 
> Should we reject if only one is present?

Again, yes I think so.
I'll confirm this with the author too.
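
For illustration, a minimal userspace sketch of the check being discussed.
The function name police_check_pkt_attrs() is hypothetical and plain
pointers stand in for the tb[TCA_POLICE_PKTRATE64] and
tb[TCA_POLICE_PKTBURST64] entries; this is not code from the patch.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): the packet rate and packet
 * burst attributes are only meaningful together, so accept "both
 * present" or "both absent" and reject a request that carries only one.
 */
static int police_check_pkt_attrs(const void *pktrate_attr,
				  const void *pktburst_attr)
{
	if ((pktrate_attr == NULL) != (pktburst_attr == NULL))
		return -EINVAL;

	return 0;
}
```

In the kernel such a check would presumably sit before the attributes
are read, so that a half-specified request fails instead of being
silently ignored.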

> > +		pps = nla_get_u64(tb[TCA_POLICE_PKTRATE64]);
> > +		ppsburst = nla_get_u64(tb[TCA_POLICE_PKTBURST64]);
> > +		if (pps) {
> > +			new->pps_present = true;
> > +			new->tcfp_pkt_burst = PSCHED_TICKS2NS(ppsburst);
> > +			psched_ppscfg_precompute(&new->ppsrate, pps);
> > +		}
> > +	}
> > +
> >  	spin_lock_bh(&police->tcf_lock);
> >  	spin_lock_bh(&police->tcfp_lock);
> >  	police->tcfp_t_c = ktime_get_ns();
> 
> > +void psched_ppscfg_precompute(struct psched_pktrate *r,
> > +			      u64 pktrate64)
> > +{
> > +	memset(r, 0, sizeof(*r));
> > +	r->rate_pkts_ps = pktrate64;
> > +	r->mult = 1;
> > +	/* The deal here is to replace a divide by a reciprocal one
> > +	 * in fast path (a reciprocal divide is a multiply and a shift)
> > +	 *
> > +	 * Normal formula would be :
> > +	 *  time_in_ns = (NSEC_PER_SEC * pkt_num) / pktrate64
> > +	 *
> > +	 * We compute mult/shift to use instead :
> > +	 *  time_in_ns = (len * mult) >> shift;
> > +	 *
> > +	 * We try to get the highest possible mult value for accuracy,
> > +	 * but have to make sure no overflows will ever happen.
> > +	 */
> > +	if (r->rate_pkts_ps > 0) {
> > +		u64 factor = NSEC_PER_SEC;
> > +
> > +		for (;;) {
> > +			r->mult = div64_u64(factor, r->rate_pkts_ps);
> > +			if (r->mult & (1U << 31) || factor & (1ULL << 63))
> > +				break;
> > +			factor <<= 1;
> > +			r->shift++;
> 
> Aren't there helpers somewhere for the reciprocal divide
> pre-calculation?

Now that you mention it, yes.

Looking over reciprocal_divide(), I don't think it is a good fit here as it
operates on 32-bit values, whereas the packet rate is 64-bit.

Packet rate could be changed to a 32-bit entity if we convince ourselves we
don't want more than 2^32 - 1 packets per second (a plausible position
IMHO) - but that leads us to a secondary issue.

The code above is very similar to the long-standing byte rate variant
of this helper - psched_ratecfg_precompute().
And I do think we want to:
a) Support 64-bit byte rates. Indeed such support seems to have
   been added to support 25G use-cases
b) Calculate byte and packet rates the same way

So I feel less and less that reciprocal_divide() is a good fit.
But perhaps I am mistaken.

In the meantime I will take a look to see if a common helper function can
be made to do 64-bit reciprocal divides for the packet and byte rate
use-cases, i.e. the code shared by psched_ppscfg_precompute() and
psched_ratecfg_precompute().
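
To make the idea concrete, here is a rough userspace sketch of what such
a shared helper might look like. The names struct rate_recip,
rate_recip_precompute() and units_to_ns() are hypothetical, and plain
division stands in for div64_u64(); the loop itself mirrors the one in
the quoted patch. "Units" are bytes for the byte rate case and packets
for the packet rate case.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical container for the precomputed multiplier and shift. */
struct rate_recip {
	uint32_t mult;
	uint8_t shift;
};

/*
 * Precompute mult/shift so that
 *   time_in_ns = (units * mult) >> shift
 * approximates
 *   time_in_ns = (NSEC_PER_SEC * units) / rate
 * where rate is in units per second (64-bit).
 */
static void rate_recip_precompute(struct rate_recip *r, uint64_t rate)
{
	uint64_t factor = NSEC_PER_SEC;
	uint64_t mult;

	r->mult = 1;
	r->shift = 0;
	if (rate == 0)
		return;

	for (;;) {
		mult = factor / rate;
		/* Stop once mult uses bit 31 or factor cannot be doubled. */
		if (mult & (1U << 31) || factor & (1ULL << 63))
			break;
		factor <<= 1;
		r->shift++;
	}
	r->mult = (uint32_t)mult;
}

/* Fast path: a multiply and a shift instead of a 64-bit divide. */
static uint64_t units_to_ns(const struct rate_recip *r, uint64_t units)
{
	return (units * r->mult) >> r->shift;
}
```

With a rate of 1000 units per second this yields a shift of 12 and one
unit converts to 1000000 ns. As in the existing code, units * mult can
overflow 64 bits for very large unit counts, so callers would still
need to bound their input.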

> > +		}
> > +	}
> > +}
> > +EXPORT_SYMBOL(psched_ppscfg_precompute);
> 
>