[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20221026075054.119069-1-pratyush@sipanda.io>
Date: Wed, 26 Oct 2022 13:20:50 +0530
From: Pratyush Kumar Khan <pratyush@...anda.io>
To: netdev@...r.kernel.org, bpf@...r.kernel.org
Cc: tom@...anda.io, abuduri@...tanamicro.com, chethan@...anda.io,
Pratyush Kumar Khan <pratyush@...anda.io>
Subject: [RFC PATCH net-next]net: Add new kParser KMOD in net, integrate kParser with XDP
kParser, or the "Kernel Parser" is a highly programmable, high
performance protocol parser in the Linux network stack. An instance
of kParser is programmed by "ip parser ..." commands in iproute2.
The kParser is dynamically programmable and script-able. For instance,
to add new protocol to parse in an existing instance of kparser, CLI
commands are executed -- there is no need to write user code or perform
a recompile of the parser.
A parser is programmed through the "ip parser" CLI, and common netlink
interfaces are used to instantiate a parser in the kernel. A parser is
defined by a parse graph which is implemented by kparser as a set of
parse node definitions and protocol tables that describe the linkages
between the nodes. Per its program, a kparser instance will report
metadata about the packet and various protocols layers of the packet.
Metadata is any information about the packet including header offsets
and value of fields of interest to the programmer; the later is
analogous to the flow information extracted by flow dissector (the
primary difference being that kParser can extract arbitrary protocol
fields and report on fields in multiple layers of encapsulation).
kParser is called in the kernel by the kparser_parse and
__kparser_parse functions. The output returned is a set metadata about
the parsed packet. Metadata is any information that the parser is
programmed to report about a packet (e.g. offsets of various headers,
extracted IP addresses and ports, etc.). A pointer to a metadata buffer
is input to kparser_parse, kparser will fill in the metadata as per
the programming of the parser. Note that the structure of the metadata
buffer is determined by the programmer, when the buffer is returned
the caller can cast the buffer to the data structure for that parser.
The iproute2 CLI for the kParser works in tandem with the KMOD kParser
to configure any number of parser instances. If kParser KMOD is not
statically compiled with Linux kernel, it needs to be
additionally enabled, compiled and loaded to use the iproute2 CLI for
kParser. Please note that the kParser CLI is scriptable meaning the
parser configuration in the kernel can by dynamically updated without
need to recompile any code or reload kernel module.
Building blocks of kParser are various objects from different
namespaces/object types. For example, parser, node, table etc. are all
different types of objects, also known as namespaces. All the namespaces
are described in the next section.
Each object is identified by a maximum 128 bytes long '\0' terminated
(128 bytes including the '\0' character) human readable ASCII name (only
character '/' is not allowed in the name, and names can not start with
'-'). Alternatively an unsigned 16 bit ID or both ID and name can be
used to identify objects.
NOTE: During CLI create operations of these objects, it is must to
specify either the name or ID. Both can also be specified.
Whichever is not specified during create will be auto generated by the
KMOD kParser and CLI will convey the identifiers to user for later use.
User should save these identifiers.
NOTE: During CLI create operations, unique name or ID must always be
specified. Those name/ID can later be used to identify the associated
object in further CLI operations.
Various objects are:
* parser:
A parser represents a parse tree. It defines the user
metadata and metametadata structure size, number of parsing node
and encapsulation limits, root node for the parse tree, success
and failure case exit nodes.
* node:
A node (a.k.a parse node) represents a specific protocol header.
Defining protocol handler involves multiple work, i.e.configure
the parser about the associated protocol's packet header, e.g.
minimum header length, where to look for the next protocol field
in the packet, etc.
Along with that, it also defines the rules/handlers to parse and
store the required metadata by associating a metalist.
The table to find the next protocol node is attached to node.
node can be 3 types:
* PLAIN:
PLAIN nodes are the basic protocol headers.
* TLVS:
TLVS nodes are the Type-Length-Value protocol
headers, such as TCP.
They also binds a tlvtable to a node.
* FLAGFIELDS:
FLAGFIELDS are indexed flag and associated flag
fields protocol headers, such as GRE headers.
It also binds a flagstable with a node.
* table:
A table is a protocol table, which associated a protocol
number with a node. e.g. Ethernet protocol type 0x8000 in
network order means the next node after Ethernet header is IPv4.
NOTE: table has key, key must be unique. Usually this key is
protocol number, such as Ethernet type, or IPv4 protocol number
etc.
* metadata-rule:
Defines the metadata structures that will be passed to
the kParser datapath parser API by the user. This basically
defines a specific metadata extraction rule. This must match
with the user passed metadata structure in the datapath API.
* metadata-ruleset:
A list of metadata(s) to associate it with packet
parsing action handlers, i.e. parse node.
* tlvnode:
A tlvnode defines a specific TLV parsing rule, e.g. to
parse TCP option MSS, a new tlvnode needs to be defined.
Each tlvnode can also associate a metalist with the TLV parsing
rule, i.e. tlvnode
* tlvtable:
This is a table of multiple tlvnode(s) where the key are
types of TLVs (e.g. tlvnode defined for TCP MSS should have the
type/kind value set to 2.
* flags:
It describes parsing rules to extract certain protocol's flags
in bitfields, such as flag value, mask and size.
* flagfields:
It defines flagfields in packet associated with flags in
bitfields of the same packet.
e.g. GRE flagfields such as checksum, key, sequence number etc.
* flagstable:
This defines a table of flagfields and associate them
with their respective flag values via their indexes. Here the
keys are usually indexes, because in typical flag based protocol
header, such as GRE, the flagfields appear in protocol packet in
the same order as the set flag bits. The flag is defined by the
flag value, mask, size and associated metalist.
* condexprs:
"Conditional expressions" used to define and configure
various complex conditional expressions in kParser.
They are used to validate certain conditions for
protocol packet field values.
* condexprslist:
"List of Conditional expressions" used to create
more complex and composite expressions involving more than one
conditional expression(s).
* condexprstable:
"A table of Conditional expressions" used to
associate one or more than one list of Conditional expressions
with a packet parsing action handlers, i.e. parse node.
* counter:
It is used to create and configure counter objects which can
be used for a wide range of usages such as count how many VLAN
headers were parsed, how many TCP options are encountered etc.
* countertable:
kParser has a global table of counters, which supports various
and unique counter configurations upto seven entries. Multiple
kParser parser instances can share this countertable.
Global header file include/net/kparser.h exports the kParser datapath
KMOD APIs. The APIs are:
/* kparser_parse(): Function to parse a skb using a parser instance key.
*
* skb: input packet skb
* kparser_key: key of the associated kParser parser object which must
* be already created via CLI.
* _metadata: User provided metadata buffer. It must be same as
*
* configured metadata objects in CLI.
* metadata_len: Total length of the user provided metadata buffer.
*
* return: kParser error code as defined in include/uapi/linux/kparser.h
*/
int kparser_parse(struct sk_buff *skb,
const struct kparser_hkey *kparser_key,
void *_metadata, size_t metadata_len);
/* __kparser_parse(): Function to parse a void * packet buffer using a
* parser instance key.
*
* parser: Non NULL kparser_get_parser() returned and cached opaque
* pointer referencing a valid parser instance.
* _hdr: input packet buffer
* parse_len: length of input packet buffer
* _metadata: User provided metadata buffer. It must be same as
* configured metadata objects in CLI.
* metadata_len: Total length of the user provided metadata buffer.
*
* return: kParser error code as defined in include/uapi/linux/kparser.h
*/
int __kparser_parse(const void *parser, void *_hdr,
size_t parse_len, void *_metadata,
size_t metadata_len);
/* kparser_get_parser(): Function to get an opaque reference of a parser
* instance and mark it immutable so that while actively using, it can
* not be deleted. The parser is identified by a key. It marks the
* associated parser and whole parse tree immutable so that when it is
* locked, it can not be deleted.
*
* kparser_key: key of the associated kParser parser object which must
* be already created via CLI.
*
* return: NULL if key not found, else an opaque parser instance pointer
* which can be used in the following APIs 3 and 4.
*
* NOTE: This call makes the whole parser tree immutable. If caller
* calls this more than once, later caller will need to release the same
* parser exactly that many times using the API kparser_put_parser().
*/
const void *kparser_get_parser(const struct kparser_hkey
*kparser_key);
/* kparser_put_parser(): Function to return and undo the read only
* operation done previously by kparser_get_parser(). The parser
* instance is identified by using a previously obtained opaque parser
* pointer via API kparser_get_parser(). This undo the immutable
* change so that any component of the whole parse tree can be deleted
* again.
*
* parser: void *, Non NULL opaque pointer which was previously returned
* by kparser_get_parser(). Caller can use cached opaque pointer as
* long as system does not restart and kparser.ko is not reloaded.
*
* return: boolean, true if put operation is success, else false.
*
* NOTE: This call makes the whole parser tree deletable for the very
* last call.
*/
bool kparser_put_parser(const void *parser);
Now we can refer to an example kParser configuration which can parse
simple IPv4 five tuples, i.e. IPv4 header offset, offset of IPv4
addresses, IPv4 protocol number, L4 header offset (i.e. TCP/UDP) and
L4 port numbers. The sample ip commands are:
ip parser create md-rule name md.iphdr_offset type offset md-off 0
ip parser create md-rule name md.ipaddrs src-hdr-off 12 length 8 \
md-off 4
ip parser create md-rule name md.l4_hdr.offset type offset md-off 2
ip parser create md-rule name md.ports src-hdr-off 0 length 4 \
md-off 12 isendianneeded true
ip parser create node name node.ports hdr.minlen 4 \
md-rule md.l4_hdr.offset md-rule md.ports
ip parser create node name node.ipv4 hdr.minlen 20 \
hdr.len-field-off 0 hdr.len-field-len 1 \
hdr.len-field-mask 0x0f hdr.len-field-multiplier 4 \
nxt.field-off 9 nxt.field-len 1 \
nxt.table-ent 6:node.ports nxt.table-ent 17:node.ports \
md-rule md.iphdr_offset \
md-rule md.ipaddrs
ip parser create node name node.ether hdr.minlen 14 nxt.offset 12 \
nxt.length 2 nxt.table-ent 0x800:node.ipv4
ip parser create parser name tuple_parser rootnode node.ether \
base-metametadata-size 14
This sample parser will parse Ethernet/IPv4 to UDP and TCP, report the
offsets of the innermost IP and TCP or UDP header, extract IPv4
addresses and UDP or TCP ports (into a frame).
About the XDP kParser integration changes:
xdp: Support for kParser as bpf helper function
bpf xdp helper function is defined for kernel
parser(kParser).kParser is configured via userspace
ip command.
kParser data path API's as mentioned in include/net/kparser.h
are called using registered callback hooks.
xdp frame buffer is passed on to kParser via the xdp helper
function and metadata is populated in the user
specified buffer.
xdp user space component, which loads xdp kernel component
into xdp hook . when xdp packet is received by the xdp prog
in kernel calls bpf helper function to pass kparser configuration
and buffer to collect the metadata mapped to bpf map.
so that bpftool can display the metadata in BTF format.
kParser user space component displays number of packets
transmitted/received per second.
Following are the steps to test/run the kParser with xdp,
- load the kParser module
- configure the kparser via ip command
The ip command can be used to load the xdp kernel
component of xdp kParser.
ip link set dev <interface-name> xdp obj \
xdp_kparser_kern.o verbose
below is the command to run kParser on network interface
./xdp_kparser -S <interface-name>
xdp: Support for flow dissector as bpf helper function
xdp program which is loaded into the xdp hook calls the xdp
helper function to get the metadata.
bpf xdp helper function is defined for flow dissector.
xdp frame is passed on to Flow dissector call and with
keys(either basic keys or big parser keys) .
bpf helper function calls __skb_flow_dissector() with xdp
buffer, keys and user specified buffer for metadata.
xdp frame is passed via xdp helper function and metadata is
populated in the user specified buffer.
xdp user space component, which loads xdp kernel component
into xdp hook and displays number of packets
transmitted/received per second.
below is the command to run flow dissector on network
interface
./xdp_flow_dissector -S <interface-name>
The ip command can be used to load the xdp kernel
component of flow_dissector.
ip link set dev <interface-name> xdp obj \
xdp_flow_dissector_kern.o verbose
---
Aravind Kumar Buduri (2):
xdp: Support for kParser as bpf helper function
xdp: Support for flow_dissector as bpf helper function
Pratyush (1):
kParser: Add new kParser KMOD
Pratyush Kumar Khan (1):
docs: networking: add doc entry for kParser
Documentation/networking/kParser.rst | 302 ++
.../networking/parse_graph_example.svg | 2039 ++++++++++
include/net/kparser.h | 90 +
include/uapi/linux/bpf.h | 20 +
include/uapi/linux/kparser.h | 678 +++
net/Kconfig | 9 +
net/Makefile | 1 +
net/core/filter.c | 244 ++
net/kparser/Makefile | 10 +
net/kparser/kparser.h | 391 ++
net/kparser/kparser_cmds.c | 898 ++++
net/kparser/kparser_cmds_dump_ops.c | 532 +++
net/kparser/kparser_cmds_ops.c | 3621 +++++++++++++++++
net/kparser/kparser_condexpr.h | 52 +
net/kparser/kparser_datapath.c | 1094 +++++
net/kparser/kparser_main.c | 325 ++
net/kparser/kparser_metaextract.h | 896 ++++
net/kparser/kparser_types.h | 586 +++
samples/bpf/Makefile | 6 +
samples/bpf/metadata_def.h | 21 +
samples/bpf/xdp_flow_dissector_kern.c | 91 +
samples/bpf/xdp_flow_dissector_user.c | 170 +
samples/bpf/xdp_kparser_kern.c | 94 +
samples/bpf/xdp_kparser_user.c | 171 +
tools/include/uapi/linux/bpf.h | 20 +
25 files changed, 12361 insertions(+)
create mode 100644 Documentation/networking/kParser.rst
create mode 100644 Documentation/networking/parse_graph_example.svg
create mode 100644 include/net/kparser.h
create mode 100644 include/uapi/linux/kparser.h
create mode 100644 net/kparser/Makefile
create mode 100644 net/kparser/kparser.h
create mode 100644 net/kparser/kparser_cmds.c
create mode 100644 net/kparser/kparser_cmds_dump_ops.c
create mode 100644 net/kparser/kparser_cmds_ops.c
create mode 100644 net/kparser/kparser_condexpr.h
create mode 100644 net/kparser/kparser_datapath.c
create mode 100644 net/kparser/kparser_main.c
create mode 100644 net/kparser/kparser_metaextract.h
create mode 100644 net/kparser/kparser_types.h
create mode 100644 samples/bpf/metadata_def.h
create mode 100644 samples/bpf/xdp_flow_dissector_kern.c
create mode 100644 samples/bpf/xdp_flow_dissector_user.c
create mode 100644 samples/bpf/xdp_kparser_kern.c
create mode 100644 samples/bpf/xdp_kparser_user.c
--
2.34.1
Powered by blists - more mailing lists