lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Jul 2018 17:05:35 -0700
From:   Vinicius Costa Gomes <vinicius.gomes@...el.com>
To:     netdev@...r.kernel.org
Cc:     Vinicius Costa Gomes <vinicius.gomes@...el.com>,
        jesus.sanchez-palencia@...el.com, tglx@...utronix.de,
        jan.altenberg@...utronix.de, henrik@...tad.us,
        richardcochran@...il.com, levi.pearson@...man.com,
        jhs@...atatu.com, xiyou.wangcong@...il.com, jiri@...nulli.us
Subject: [RFC net-next v1 0/1] net/sched: Introduce the taprio scheduler


Hi,

This series provides a set of interfaces that can be used by
applications that require (time-based) Scheduled Transmission of
packets. It is comprised by 3 new components to the kernel:

  - etf: the per-queue TxTime-Based scheduling qdisc;
  - taprio: the per-port Time-Aware scheduler qdisc;
  - SO_TXTIME: a socket option + cmsg APIs.

ETF and SO_TXTIME are already applied[1] into the net-next tree. This
is the remaining piece.

Overview
========

The CBS qdisc proposal RFC [2] included some rough ideas about the
design and API of a "taprio" (Time Aware Priority) qdisc. The idea of
presenting the taprio ideas at that point (almost 10 months ago!) was
to show our vision of how things would fit together going forward.
>From that concept stage to this (almost) realised stage the main
differences are:

  - As of now, taprio is a software only implementation of a schedule
    executor;
  - Instead of taprio centralising all the time based decisions, taprio
    can work together with ETF (the Earliest TxTime First), a qdisc
    meant to use the LaunchTime (or similar) feature of various network
    controllers;

In a nutshell, taprio is a root qdisc that can execute a predefined
schedule, etf is a qdisc that provides time based admission control
and "earliest deadline first" dequeue mode, and SO_TXTIME is a socket
option that is used for enabling a socket to be used for time-based
transmission and configuring its parameters.

taprio
======

This scheduler allows the network administrator to configure schedules
for classes of traffic, the configuration interface is similar to what
IEEE 802.1Qbv-2015 defines.

Example configuration:

$ tc qdisc add dev enp2s0 parent root handle 100 taprio \
	    num_tc 3 \
	    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
	    queues 1@0 1@1 2@2 \
	    sched-file ~/gates.sched \
	    base-time 1528743495910289987 \
	    clockid CLOCK_TAI

This qdisc borrows a few concepts from mqprio and so, most the
parameters are similar to mqprio. The main difference is the
'sched-file' parameter, one example on a schedule file would be:

gates.sched
-----------
S 01 300000
S 02 300000
S 04 300000

The format of each line is:
<CMD> <GATE MASK> <INTERVAL>

The only supported <CMD> is "S", which means "SetGateStates",
following the IEEE 802.1Qbv-2015 definition (Table 8-6). <GATE MASK>
is a bitmask where each bit is a associated with a traffic class, so
bit 0 (the least significant bit) being "on" means that traffic class
0 is "active" for that schedule entry. <INTERVAL> is a time duration
in nanoseconds that specifies for how long that state defined by <CMD>
and <GATE MASK> should be held before moving to the next entry.

This schedule is circular, that is, after the last entry is executed
it starts from the first one, indefinitely.

The other parameters can be defined as follows:
  - base-time: allows that multiple systems can have synchronised
    schedules, it specifies the instant when the schedule starts;
  - clockid: specifies the reference clock to be used;

A more complete example can be found here, with instructions of how to
test it:

https://gist.github.com/jeez/bd3afeff081ba64a695008dd8215866f [3]

The basic design of the scheduler is simple, after we calculate the
first expiration of the hrtimer, we set the next expiration to be the
previous plus the current entry's interval. At each time the function
runs, we set the current_entry, which has a gate_mask (that controls
which traffic classes are allowed to "go out" during each interval),
and we reuse this callback to "kick" the qdisc (this is the reason
that the usual qdisc watchdog isn't used).


Known Issues
============

 - As taprio is a software only implementation, and there's another
   layer of queuing in the network controller, packets can still
   leave the controller outside their "correct" windows. This happens
   mostly for low-priority classes, and is more evident if they are
   'starved' by the higher priority ones;
 - There's no support for changing the schedule during runtime;

This series is also hosted on github and can be found at [4].
The companion iproute2 patches can be found at [5].


Cheers,
--
Vinicius


[1] https://patchwork.ozlabs.org/cover/938991/

[2] https://patchwork.ozlabs.org/cover/808504/

[3] github doesn't make it clear, but the gist can be cloned like this:
$ git clone https://gist.github.com/jeez/bd3afeff081ba64a695008dd8215866f taprio-test

[4] https://github.com/vcgomes/linux/tree/taprio-RFC-v1

[5] https://github.com/vcgomes/iproute2/tree/taprio-RFC-v1


Vinicius Costa Gomes (1):
  net/sched: Introduce the taprio scheduler

 include/uapi/linux/pkt_sched.h |  49 ++
 net/sched/Kconfig              |  11 +
 net/sched/Makefile             |   1 +
 net/sched/sch_taprio.c         | 952 +++++++++++++++++++++++++++++++++
 4 files changed, 1013 insertions(+)
 create mode 100644 net/sched/sch_taprio.c

-- 
2.18.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ