netdev - Re: [PATCHv3 net-next] sched: add dualpi2 scheduler module

Message-ID: <c4901510-385f-7b3f-1334-30f83f114572@gmail.com>
Date:   Thu, 28 Mar 2019 02:35:38 -0700
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Olga Albisser <olgabnd@...il.com>, netdev@...r.kernel.org
Cc:     Olga Albisser <olga@...isser.org>,
        Koen De Schepper <koen.de_schepper@...ia-bell-labs.com>,
        Oliver Tilmans <olivier.tilmans@...ia-bell-labs.com>,
        Bob Briscoe <research@...briscoe.net>,
        Henrik Steen <henrist@...rist.net>
Subject: Re: [PATCHv3 net-next] sched: add dualpi2 scheduler module



On 03/28/2019 01:12 AM, Olga Albisser wrote:
> DUALPI2 provides extremely low latency & loss to traffic that uses a
> scalable congestion controller (e.g. L4S, DCTCP) without degrading the
> performance of 'classic' traffic (e.g. Reno, Cubic etc.). It is intended
> to be the reference implementation of the IETF's DualQ Coupled AQM.
> 
> The qdisc provides two queues called low latency and classic. It
> classifies packets based on the ECN field in their IP headers. By default
> it directs non-ECN and ECT(0) into the Classic queue and ECT(1) and CE
> into the low latency queue, as per the IETF spec.
> 
> There is an AQM in each queue.
> * The Classic AQM is called PI2, which is similar to the PIE AQM but more
> responsive and simpler. Classic traffic requires a decent target queue
> (default 15ms for Internet deployment) to fully utilize the link.
> * The low latency AQM is, by default, a simple very shallow ECN marking
> threshold similar to that used for DCTCP.
> 
> The DualQ isolates the extremely low queuing delay of the Low Latency
> queue from the larger delay of the 'Classic' queue. However, from a
> bandwidth perspective, flows in either queue will share out the link
> capacity as if there was just a single queue. This bandwidth pooling
> effect is achieved by coupling together the drop and ECN-marking
> probabilities of the two AQMs.
> 
> The PI2 AQM has two main parameters in addition to its target delay. All
> the defaults are suitable for any Internet setting, but it can be
> reconfigured for a Data Centre setting. The integral gain factor alpha is
> used to slowly correct any persistent standing queue error from the
> target delay, while the proportional gain factor beta is used to quickly
> compensate for queue changes (growth or shrinkage).
> 
> Internally, the output of a simple linear Proportional Integral (PI)
> controller is used for both queues. This output is squared to calculate
> the drop or ECN-marking probability of the classic queue. This
> counterbalances the square-root rate equation of Reno/Cubic, which is the
> trick that balances flow rates across the queues. For the ECN-marking
> probability of the low latency queue, the output of the base AQM is
> multiplied by a coupling parameter k . This determines the balance
> between the flow rates in each queue. The default setting makes the flow
> rates roughly equal, which should be generally applicable.
> 
> If DUALPI2 AQM has detected overload (when excessive non-responsive
> traffic is sent), it will switch to signalling congestion solely using
> drop, irrespective of the ECN field, or alternatively it can be
> configured to limit the drop probability and let the queue grow and
> eventually overflow (like tail-drop).
> 
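
To make the coupling described in the commit message concrete, here is a
minimal userspace sketch (not the patch's code) of the two probabilities
derived from one PI output p'. It assumes probabilities are u32 fixed-point
values where 0xffffffff stands for 1.0, as MAX_PROB does in the patch. The
helper names classic_prob() and l4s_prob() are hypothetical, and the clamp
to 1.0 is a choice made for the illustration.

```c
#include <stdint.h>

#define MAX_PROB 0xffffffffu	/* fixed-point 1.0, as in the patch */

/* Classic queue: square the PI output, p_C = p'^2 (hypothetical helper). */
static uint32_t classic_prob(uint32_t p)
{
	return (uint32_t)(((uint64_t)p * p) >> 32);
}

/* Low latency queue: p_L = k * p', clamped to 1.0 (hypothetical helper). */
static uint32_t l4s_prob(uint32_t p, uint32_t k)
{
	uint64_t scaled = (uint64_t)p * k;

	return scaled > MAX_PROB ? MAX_PROB : (uint32_t)scaled;
}
```

With p' at 0.25 (0x40000000) and k = 2, the classic probability is 0.0625
and the low latency probability is 0.5. should_drop() in the patch obtains
the square by requiring two independent random draws below p' rather than
by multiplying.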

What changes are in v3 exactly ?

Please include the changes in the changelog to ease code review.

Otherwise we have to look again at the whole thing.

More comments inline

> Additional details can be found in the draft:
> https://www.ietf.org/id/draft-ietf-tsvwg-aqm-dualq-coupled
> 
> Signed-off-by: Olga Albisser <olga@...isser.org>
> Signed-off-by: Koen De Schepper <koen.de_schepper@...ia-bell-labs.com>
> Signed-off-by: Oliver Tilmans <olivier.tilmans@...ia-bell-labs.com>
> Signed-off-by: Bob Briscoe <research@...briscoe.net>
> Signed-off-by: Henrik Steen <henrist@...rist.net>
> ---
>  include/uapi/linux/pkt_sched.h |  33 ++
>  net/sched/Kconfig              |  18 +
>  net/sched/Makefile             |   1 +
>  net/sched/sch_dualpi2.c        | 688 +++++++++++++++++++++++++++++++++
>  4 files changed, 740 insertions(+)
>  create mode 100644 net/sched/sch_dualpi2.c
> 
> diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
> index 7ee74c3474bf..b845274937f2 100644
> --- a/include/uapi/linux/pkt_sched.h
> +++ b/include/uapi/linux/pkt_sched.h
> @@ -1161,4 +1161,37 @@ enum {
>  
>  #define TCA_TAPRIO_ATTR_MAX (__TCA_TAPRIO_ATTR_MAX - 1)
>  
> +/* DUALPI2 */
> +enum {
> +	TCA_DUALPI2_UNSPEC,
> +	TCA_DUALPI2_ALPHA,
> +	TCA_DUALPI2_BETA,
> +	TCA_DUALPI2_DUALQ,
> +	TCA_DUALPI2_ECN,
> +	TCA_DUALPI2_K,
> +	TCA_DUALPI2_L_DROP,
> +	TCA_DUALPI2_ET_PACKETS,
> +	TCA_DUALPI2_L_THRESH,
> +	TCA_DUALPI2_LIMIT,
> +	TCA_DUALPI2_T_SHIFT,
> +	TCA_DUALPI2_T_SPEED,
> +	TCA_DUALPI2_TARGET,
> +	TCA_DUALPI2_TUPDATE,
> +	TCA_DUALPI2_DROP_EARLY,
> +	TCA_DUALPI2_WRR_RATIO,
> +	__TCA_DUALPI2_MAX
> +};
> +
> +#define TCA_DUALPI2_MAX   (__TCA_DUALPI2_MAX - 1)
> +struct tc_dualpi2_xstats {
> +	__u32 prob;             /* current probability */
> +	__u32 delay_c;          /* current delay in C queue */
> +	__u32 delay_l;          /* current delay in L queue */
> +	__u32 packets_in;       /* total number of packets enqueued */
> +	__u32 dropped;          /* packets dropped due to pie_action */
> +	__u32 overlimit;        /* dropped due to lack of space in queue */
> +	__u32 maxq;             /* maximum queue size */
> +	__u32 ecn_mark;         /* packets marked with ecn*/
> +};
> +
>  #endif
> diff --git a/net/sched/Kconfig b/net/sched/Kconfig
> index 1b9afdee5ba9..0b0fb11b8c72 100644
> --- a/net/sched/Kconfig
> +++ b/net/sched/Kconfig
> @@ -409,6 +409,24 @@ config NET_SCH_PLUG
>  	  To compile this code as a module, choose M here: the
>  	  module will be called sch_plug.
>  
> +config NET_SCH_DUALPI2
> +        tristate "Dual Queue Proportional Integral Controller Improved with a Square (DUALPI2)"
> +        help
> +
> +	  Say Y here if you want to use the Dual Queue Proportional Integral Controller AQM -
> +          Improved with a square (DUALPI2).
> +          DUALPI2 AQM is a combination of the DUALQ Coupled-AQM with a PI2 base-AQM. The PI2 AQM
> +          is in turn both an extension and a simplification of the PIE AQM. PI2 makes quite some
> +          PIE heuristics unnecessary, while being able to control scalable congestion controls
> +          like DCTCP and TCP-Prague. With PI2, both Reno/Cubic can be used in parallel with DCTCP,
> +          maintaining window fairness. DUALQ provides latency separation between low latency
> +          DCTCP flows and Reno/Cubic flows that need a bigger queue.
> +          For more information, please see
> +          https://www.ietf.org/id/draft-ietf-tsvwg-aqm-dualq-coupled
> +
> +          To compile this driver as a module, choose M here: the module
> +          will be called sch_dualpi2.
> +
>  menuconfig NET_SCH_DEFAULT
>  	bool "Allow override default queue discipline"
>  	---help---
> diff --git a/net/sched/Makefile b/net/sched/Makefile
> index 8a40431d7b5c..383eb9dd1fcc 100644
> --- a/net/sched/Makefile
> +++ b/net/sched/Makefile
> @@ -58,6 +58,7 @@ obj-$(CONFIG_NET_SCH_PIE)	+= sch_pie.o
>  obj-$(CONFIG_NET_SCH_CBS)	+= sch_cbs.o
>  obj-$(CONFIG_NET_SCH_ETF)	+= sch_etf.o
>  obj-$(CONFIG_NET_SCH_TAPRIO)	+= sch_taprio.o
> +obj-$(CONFIG_NET_SCH_DUALPI2)   += sch_dualpi2.o
>  
>  obj-$(CONFIG_NET_CLS_U32)	+= cls_u32.o
>  obj-$(CONFIG_NET_CLS_ROUTE4)	+= cls_route.o
> diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
> new file mode 100644
> index 000000000000..6b696a546492
> --- /dev/null
> +++ b/net/sched/sch_dualpi2.c
> @@ -0,0 +4,691 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (C) 2018 Nokia.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License
> + * as published by the Free Software Foundation; either version 2
> + * of the License.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * Author: Koen De Schepper <koen.de_schepper@...ia-bell-labs.com>
> + * Author: Olga Albisser <olga@...isser.org>
> + * Author: Henrik Steen <henrist@...rist.net>
> + * Author: Olivier Tilmans <olivier.tilmans@...ia-bell-labs.com>
> + *
> + * DualPI Improved with a Square (dualpi2)
> + * Supports controlling scalable congestion controls (DCTCP, etc...)
> + * Supports DualQ with PI2
> + * Supports L4S ECN identifier
> + *
> + * References:
> + * IETF draft submission:
> + *   http://tools.ietf.org/html/draft-ietf-tsvwg-aqm-dualq-coupled-08
> + * ACM CoNEXT’16, Conference on emerging Networking EXperiments
> + * and Technologies :
> + * "PI2: PI Improved with a Square to support Scalable Congestion Controllers"
> + * IETF draft submission:
> + *   http://tools.ietf.org/html/draft-pan-aqm-pie-00
> + * IEEE  Conference on High Performance Switching and Routing 2013 :
> + * "PIE: A * Lightweight Control Scheme to Address the Bufferbloat Problem"
> + * Partially based on the PIE implementation:
> + * Copyright (C) 2013 Cisco Systems, Inc, 2013.
> + * Author: Vijay Subramanian <vijaynsu@...co.com>
> + * Author: Mythili Prabhu <mysuryan@...co.com>
> + * ECN support is added by Naeem Khademi <naeemk@....uio.no>
> + * University of Oslo, Norway.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/errno.h>
> +#include <linux/skbuff.h>
> +#include <linux/version.h>
> +#include <net/pkt_sched.h>
> +#include <net/inet_ecn.h>
> +#include <net/dsfield.h>
> +
> +#define QUEUE_THRESHOLD 10000
> +#define MAX_PROB  0xffffffff
> +
> +/* parameters used */
> +struct dualpi2_params {
> +	psched_time_t	target;	/* user specified target delay in pschedtime */
> +	u64     tshift;         /* L4S FIFO time shift (in ns) */
> +	u32	tupdate;	/* timer frequency (in jiffies) */
> +	u32	limit;		/* number of packets that can be enqueued */
> +	u32	alpha;		/* alpha and beta are user specified values
> +				 * scaled by factor of 256
> +				 */
> +	u32	beta;		/* and are used for shift relative to 1 */
> +	u32	k;		/* coupling rate between Classic and L4S */
> +	u32	queue_mask:2,	/* Mask on ecn bits to determine if packet
> +				 * goes in l-queue
> +				 * 0 (00): single queue
> +				 * 1 (01): dual queue for ECT(1) and CE
> +				 * 3 (11): dual queue for ECT(0), ECT(1) and CE
> +				 *	  (DCTCP compatibility)
> +				 */
> +		mark_mask:2,	/* Mask on ecn bits to determine marking
> +				 * (instead of dropping)
> +				 * 0 (00): no ecn
> +				 * 3 (11): ecn (marking) support
> +				 */
> +		scal_mask:2,	/* Mask on ecn bits to mark p (instead of p^2)
> +				 * 0 (00): no scalable marking
> +				 * 1 (01): scalable marking for ECT(1)
> +				 * 3 (11): scalable marking for ECT(0) and
> +				 *	  ECT(1) (DCTCP compatibility)
> +				 */
> +		et_packets:1,   /* ecn threshold in packets (1) or us (0) */
> +		drop_early:1,	/* Drop at enqueue */
> +		tspeed:16;      /* L4S FIFO time speed (in bit shifts) */
> +	u32	ecn_thresh;	/* sojourn queue size to mark LL packets */
> +	u32	l_drop;		/* L4S max probability where classic drop is
> +				 * applied to all traffic, if 0 then no drop
> +				 * applied at all (but taildrop) to ECT
> +				 * packets
> +				 */
> +	u32     dequeue_ratio;  /* WRR denominator for dequeues */
> +};
> +
> +/* variables used */
> +struct dualpi2_vars {
> +	psched_time_t   qdelay_c;               /* Classic Q delay */
> +	psched_time_t   qdelay_l;               /* L4S Q delay */
> +	u32		prob;			/* probability scaled as u32 */
> +	u32		alpha;			/* calculated alpha value */
> +	u32		beta;			/* calculated beta value */
> +	u32		deferred_drop_count;
> +	u32		deferred_drop_len;
> +	u32             dequeued_l;             /* Successive L4S dequeues */
> +};
> +
> +/* statistics gathering */
> +struct dualpi2_stats {
> +	u32	packets_in;	/* total number of packets enqueued */
> +	u32	dropped;	/* packets dropped due to dualpi2_action */
> +	u32	overlimit;	/* dropped due to lack of space in queue */
> +	u32	maxq;		/* maximum queue size */
> +	u32	ecn_mark;	/* packets marked with ECN */
> +};
> +
> +/* private data for the Qdisc */
> +struct dualpi2_sched_data {
> +	struct Qdisc *l_queue;
> +	struct Qdisc *sch;
> +	struct dualpi2_vars vars;
> +	struct dualpi2_stats stats;
> +	struct dualpi2_params params;
> +	struct timer_list adapt_timer;
> +};
> +
> +static inline void set_ts_cb(struct sk_buff *skb)
> +{
> +	u64 now = ktime_get_mono_fast_ns();
> 

Do not use this variant in a qdisc.

Networking code never needs the NMI-safe variant.

-> ktime_get_ns() is the preferred version.

More details in Documentation/core-api/timekeeping.rst

> +
> +	memcpy(qdisc_skb_cb(skb)->data, &now, sizeof(u64));
> +}
> +
> +static inline u64 get_ts_cb(struct sk_buff *skb)
> +{
> +	u64 ts = 0;
> +
> +	memcpy(&ts, qdisc_skb_cb(skb)->data, sizeof(u64));



> +	return ts;
> +}
> +
> +static inline u64 skb_sojourn_time(struct sk_buff *skb, u64 reference)
> +{
> +	return skb ? reference - get_ts_cb(skb) : 0;
> +}
> +
> +static inline u32 __dualpi2_vars_from_params(u32 param)
> +{
> +	return (param * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 8;

Can this overflow ?
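
A userspace sketch of the arithmetic, assuming PSCHED_SHIFT is 6 so that
PSCHED_TICKS_PER_SEC is 15625000 (worth checking against
include/net/pkt_sched.h). scale_u32() and scale_u64() are hypothetical
names; the first mirrors the expression above, the second widens the
multiply to 64 bits.

```c
#include <stdint.h>

#define MAX_PROB		0xffffffffu
#define PSCHED_TICKS_PER_SEC	(1000000000u >> 6)	/* assumed: 15625000 */

/* Same expression as __dualpi2_vars_from_params(), 32-bit arithmetic. */
static uint32_t scale_u32(uint32_t param)
{
	return (param * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 8;
}

/* Widened variant: the multiply cannot wrap, the result may exceed u32. */
static uint64_t scale_u64(uint32_t param)
{
	return ((uint64_t)param * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 8;
}
```

Under that assumption the scale factor is 274. The defaults (alpha 80,
beta 800) give 85 and 856, but any input from 15675064 upward wraps in the
32-bit version; 15675064 itself yields 0.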

> +}
> +
> +static void dualpi2_calculate_alpha_beta(struct dualpi2_sched_data *q)
> +{
> +	/* input alpha and beta should be in multiples of 1/256 */
> +	q->vars.alpha = __dualpi2_vars_from_params(q->params.alpha);
> +	q->vars.beta = __dualpi2_vars_from_params(q->params.beta);
> +}
> +
> +static void dualpi2_params_init(struct dualpi2_params *params)
> +{
> +	params->alpha = 80;
> +	params->beta = 800;
> +	params->tupdate = usecs_to_jiffies(32 * USEC_PER_MSEC);	/* 32 ms */
> +	params->limit = 10000;
> +	params->target = PSCHED_NS2TICKS(20 * NSEC_PER_MSEC);	/* 20 ms */
> +	params->k = 2;
> +	params->queue_mask = INET_ECN_ECT_1;
> +	params->mark_mask = INET_ECN_MASK;
> +	params->scal_mask = INET_ECN_ECT_1;
> +	params->et_packets = 0;
> +	params->ecn_thresh = 1000;
> +	params->tshift = 40 * NSEC_PER_MSEC;
> +	params->tspeed = 0;
> +	params->l_drop = 0;
> +	params->drop_early = false;
> +	params->dequeue_ratio = 16;
> +}
> +
> +static u32 get_ecn_field(struct sk_buff *skb)
> +{
> +	switch (tc_skb_protocol(skb)) {
> +	case htons(ETH_P_IP):
> +		return ipv4_get_dsfield(ip_hdr(skb)) & INET_ECN_MASK;
> +	case htons(ETH_P_IPV6):
> +		return ipv6_get_dsfield(ipv6_hdr(skb)) & INET_ECN_MASK;
> +	default:
> +		return 0;
> +	}
> +}
> +
> +static bool should_drop(struct Qdisc *sch, struct dualpi2_sched_data *q,
> +			u32 ecn, struct sk_buff *skb)
> +{
> +	u32 mtu = psched_mtu(qdisc_dev(sch));
> +	u64 local_l_prob;
> +	bool overload;
> +	u32 rnd;
> +
> +	/* If we have fewer than 2 mtu-sized packets, disable drop,
> +	 * similar to min_th in RED
> +	 */
> +	if (sch->qstats.backlog < 2 * mtu)
> +		return false;
> +
> +	local_l_prob = (u64)q->vars.prob * q->params.k;
> +	overload = q->params.l_drop && local_l_prob > (u64)q->params.l_drop;
> +
> +	rnd = prandom_u32();
> +	if (!overload && (ecn & q->params.scal_mask)) {
> +		/* do scalable marking */
> +		if (rnd < local_l_prob && INET_ECN_set_ce(skb))
> +			/* mark ecn without a square */
> +			q->stats.ecn_mark++;
> +	} else if (rnd < q->vars.prob) {
> +		/* think twice to drop, so roll again */
> +		rnd = prandom_u32();
> +		if (rnd < q->vars.prob) {
> +			if (!overload &&
> +			    (ecn & q->params.mark_mask) &&
> +			    INET_ECN_set_ce(skb))
> +				/* mark ecn with a square */
> +				q->stats.ecn_mark++;
> +			else
> +				return true;
> +		}
> +	}
> +
> +	return false;
> +}
> +
> +static int dualpi2_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch,
> +				 struct sk_buff **to_free)
> +{
> +	struct dualpi2_sched_data *q = qdisc_priv(sch);
> +	u32 ecn = get_ecn_field(skb);
> +	int err;
> +
> +	/* set to the time the HTQ packet is in the Q */
> +	set_ts_cb(skb);

Why set the time if the packet can still be dropped by the following checks ?

Maybe call set_ts_cb() a bit later.

> +
> +	if (unlikely(qdisc_qlen(sch) >= sch->limit)) {
> +		qdisc_qstats_overlimit(sch);
> +		err = NET_XMIT_DROP;
> +		goto drop;
> +	}
> +
> +	/* drop early if configured */
> +	if (q->params.drop_early && should_drop(sch, q, ecn, skb)) {
> +		err = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
> +		goto drop;
> +	}
> +
> +	q->stats.packets_in++;
> +	if (qdisc_qlen(sch) > q->stats.maxq)
> +		q->stats.maxq = qdisc_qlen(sch);
> +
> +	/* decide L4S queue or classic */
> +	if (ecn & q->params.queue_mask) {
> +		sch->q.qlen++; /* otherwise packets are not seen by parent Q */
> +		qdisc_qstats_backlog_inc(sch, skb);
> +		return qdisc_enqueue_tail(skb, q->l_queue);
> +	} else {
> +		return qdisc_enqueue_tail(skb, sch);
> +	}
> +
> +drop:
> +	q->stats.dropped++;
> +	q->vars.deferred_drop_count += 1;
> +	q->vars.deferred_drop_len += qdisc_pkt_len(skb);
> +
> +	qdisc_drop(skb, sch, to_free);
> +	return err;
> +}
> +
> +static const struct nla_policy dualpi2_policy[TCA_DUALPI2_MAX + 1] = {
> +	[TCA_DUALPI2_ALPHA] = {.type = NLA_U32},
> +	[TCA_DUALPI2_BETA] = {.type = NLA_U32},
> +	[TCA_DUALPI2_DUALQ] = {.type = NLA_U32},
> +	[TCA_DUALPI2_ECN] = {.type = NLA_U32},
> +	[TCA_DUALPI2_K] = {.type = NLA_U32},
> +	[TCA_DUALPI2_L_DROP] = {.type = NLA_U32},
> +	[TCA_DUALPI2_ET_PACKETS] = {.type = NLA_U32},
> +	[TCA_DUALPI2_L_THRESH] = {.type = NLA_U32},
> +	[TCA_DUALPI2_LIMIT] = {.type = NLA_U32},
> +	[TCA_DUALPI2_T_SHIFT] = {.type = NLA_U32},
> +	[TCA_DUALPI2_T_SPEED] = {.type = NLA_U16},
> +	[TCA_DUALPI2_TARGET] = {.type = NLA_U32},
> +	[TCA_DUALPI2_TUPDATE] = {.type = NLA_U32},
> +	[TCA_DUALPI2_DROP_EARLY] = {.type = NLA_U32},
> +	[TCA_DUALPI2_WRR_RATIO] = {.type = NLA_U32},
> +};
> +
> +static int dualpi2_change(struct Qdisc *sch, struct nlattr *opt,
> +			  struct netlink_ext_ack *extack)
> +{
> +	struct dualpi2_sched_data *q = qdisc_priv(sch);
> +	struct nlattr *tb[TCA_DUALPI2_MAX + 1];
> +	unsigned int qlen, dropped = 0;
> +	int err;
> +
> +	if (!opt)
> +		return -EINVAL;
> +	err = nla_parse_nested(tb, TCA_DUALPI2_MAX, opt, dualpi2_policy, NULL);
> +	if (err < 0)
> +		return err;
> +
> +	sch_tree_lock(sch);
> +	if (q->l_queue == &noop_qdisc) {
> +		struct Qdisc *child;
> +
> +		child = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
> +					  TC_H_MAKE(sch->handle, 1), extack);
> +		if (child)
> +			q->l_queue = child;
> +	}
> +
> +	if (tb[TCA_DUALPI2_TARGET]) {
> +		u32 target = nla_get_u32(tb[TCA_DUALPI2_TARGET]);
> +
> +		q->params.target = PSCHED_NS2TICKS((u64)target * NSEC_PER_USEC);



For a new qdisc, I would not bother using the legacy PSCHED_NS2TICKS() stuff.

PSCHED_NS2TICKS is deprecated now that we have nanosecond resolution almost for free
and more hosts are 64-bit.
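
A small userspace illustration of what the tick conversion costs, assuming
PSCHED_SHIFT is 6 (one tick is 64 ns). target_ticks() and target_ns() are
hypothetical names used only for this sketch.

```c
#include <stdint.h>

#define NSEC_PER_USEC	1000ull
#define PSCHED_SHIFT	6	/* assumed value */

/* What the patch stores: microseconds converted to psched ticks. */
static uint64_t target_ticks(uint32_t target_us)
{
	return ((uint64_t)target_us * NSEC_PER_USEC) >> PSCHED_SHIFT;
}

/* Plain nanoseconds in a u64: no conversion macro, no rounding. */
static uint64_t target_ns(uint32_t target_us)
{
	return (uint64_t)target_us * NSEC_PER_USEC;
}
```

A 1 us target becomes 15 ticks, which maps back to 960 ns, whereas the u64
keeps 1000 ns exactly.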

> +	}
> +
> +	if (tb[TCA_DUALPI2_TUPDATE]) {
> +		u32 tupdate_usecs = nla_get_u32(tb[TCA_DUALPI2_TUPDATE]);
> +
> +		q->params.tupdate = usecs_to_jiffies(tupdate_usecs);
> +	}
> +
> +	if (tb[TCA_DUALPI2_LIMIT]) {
> +		u32 limit = nla_get_u32(tb[TCA_DUALPI2_LIMIT]);
> +
> +		q->params.limit = limit;
> +		sch->limit = limit;
> +	}
> +
> +	if (tb[TCA_DUALPI2_ALPHA])
> +		q->params.alpha = nla_get_u32(tb[TCA_DUALPI2_ALPHA]);

No bound checking on this parameter ?

> +
> +	if (tb[TCA_DUALPI2_BETA])
> +		q->params.beta = nla_get_u32(tb[TCA_DUALPI2_BETA]);

No bound checking on this parameter ?
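
One possible shape for such a check, sketched in userspace C. gain_max()
and check_gain() are hypothetical helpers, and the ceiling used here is
only the largest input whose 32-bit product with a given scale factor does
not wrap; a tighter limit based on controller stability may be more
appropriate.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical: largest gain whose product with 'scale' fits in 32 bits. */
static uint32_t gain_max(uint32_t scale)
{
	return scale ? UINT32_MAX / scale : UINT32_MAX;
}

/* Hypothetical validator: 0 when acceptable, -EINVAL otherwise. */
static int check_gain(uint32_t gain, uint32_t scale)
{
	return gain > gain_max(scale) ? -EINVAL : 0;
}
```

A caller would reject the netlink request when check_gain() returns a
negative value, before storing alpha or beta.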


> +
> +	if (tb[TCA_DUALPI2_DUALQ])
> +		q->params.queue_mask = nla_get_u32(tb[TCA_DUALPI2_DUALQ]);
> +
> +	if (tb[TCA_DUALPI2_ECN]) {
> +		u32 masks = nla_get_u32(tb[TCA_DUALPI2_ECN]);
> +
> +		q->params.mark_mask = (masks >> 2) & INET_ECN_MASK;
> +		q->params.scal_mask = masks & INET_ECN_MASK;
> +	}
> +
> +	if (tb[TCA_DUALPI2_K])
> +		q->params.k = nla_get_u32(tb[TCA_DUALPI2_K]);
> +
> +	if (tb[TCA_DUALPI2_K])
> +		q->params.k = nla_get_u32(tb[TCA_DUALPI2_K]);
> +
> +	if (tb[TCA_DUALPI2_L_THRESH])
> +		/* l_thresh is in us */
> +		q->params.ecn_thresh = nla_get_u32(tb[TCA_DUALPI2_L_THRESH]);
> +
> +	if (tb[TCA_DUALPI2_T_SHIFT]) {
> +		u32 t_shift = nla_get_u32(tb[TCA_DUALPI2_T_SHIFT]);
> +
> +		q->params.tshift = (u64)t_shift * NSEC_PER_USEC;
> +	}
> +
> +	if (tb[TCA_DUALPI2_T_SPEED])
> +		q->params.tspeed = nla_get_u16(tb[TCA_DUALPI2_T_SPEED]);
> +
> +	if (tb[TCA_DUALPI2_L_DROP]) {
> +		u32 l_drop_percent = nla_get_u32(tb[TCA_DUALPI2_L_DROP]);
> +
> +		q->params.l_drop = l_drop_percent * (MAX_PROB / 100);
> +	}
> +
> +	if (tb[TCA_DUALPI2_DROP_EARLY])
> +		q->params.drop_early = nla_get_u32(tb[TCA_DUALPI2_DROP_EARLY]);
> +
> +	if (tb[TCA_DUALPI2_WRR_RATIO])
> +		q->params.dequeue_ratio =
> +			nla_get_u32(tb[TCA_DUALPI2_WRR_RATIO]);
> +
> +	/* Calculate new internal alpha and beta values in case their
> +	 * dependencies are changed
> +	 */
> +	dualpi2_calculate_alpha_beta(q);
> +
> +	/* Drop excess packets if new limit is lower */
> +	qlen = sch->q.qlen;
> +	while (sch->q.qlen > sch->limit) {
> +		struct sk_buff *skb = __qdisc_dequeue_head(&sch->q);
> +
> +		dropped += qdisc_pkt_len(skb);
> +		qdisc_qstats_backlog_dec(sch, skb);
> +		rtnl_qdisc_drop(skb, sch);
> +	}
> +	qdisc_tree_reduce_backlog(sch, qlen - sch->q.qlen, dropped);
> +
> +	sch_tree_unlock(sch);
> +	return 0;
> +}
> +
> +static inline psched_time_t qdelay_in_psched(struct Qdisc *q, u64 now)
> +{
> +	struct sk_buff *skb = qdisc_peek_head(q);
> +
> +	return PSCHED_NS2TICKS(skb_sojourn_time(skb, now));

Same as above : psched_time_t is deprecated.

I believe only CBQ and PIE are still using it.

(psched_get_time() is also deprecated)

> +}
> +
> +static void calculate_probability(struct Qdisc *sch)
> +{
> +	struct dualpi2_sched_data *q = qdisc_priv(sch);
> +	u64 now = ktime_get_mono_fast_ns();

ktime_get_ns()

> +	psched_time_t qdelay_old;
> +	psched_time_t qdelay;
> +	u32 oldprob;
> +	s64 delta;	/* determines the change in probability */
> +