Date:   Fri, 18 Jan 2019 13:56:49 -0800
From:   Yonghong Song <yhs@...com>
To:     <ast@...com>, <daniel@...earbox.net>, <netdev@...r.kernel.org>,
        <ecree@...arflare.com>
CC:     <kernel-team@...com>
Subject: [PATCH bpf-next v2] bpf: btf: add btf documentation

This patch adds documentation for BTF (BPF Type Format).
The document is placed in the linux:Documentation/bpf directory.

Signed-off-by: Yonghong Song <yhs@...com>
---
 Documentation/bpf/btf.rst   | 870 ++++++++++++++++++++++++++++++++++++
 Documentation/bpf/index.rst |   7 +
 2 files changed, 877 insertions(+)
 create mode 100644 Documentation/bpf/btf.rst

Changelogs:
  v1 -> v2:
    address comments from Edward, mainly including:
    . more detailed explanation of BTF_INT_OFFSET().
    . more explanation about array dimensions.
    . better wording when referring to the "type" field in btf_type
      for typedef/const/volatile/restrict.
    . better explanation for BTF_KIND_FUNC.
    . explanation of what btf_id is.
    . more cross references inside the document.
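
Note for reviewers (not part of the patch): the BTF_INT_OFFSET()
explanation added in v2 can be checked with a minimal C sketch. The
three macros are the ones quoted in section 2.2.1 of btf.rst, with VAL
parenthesized; the demo_* helpers are hypothetical names made up for
illustration, not kernel or libbpf API.

```c
#include <assert.h>
#include <stdint.h>

/* Macros as quoted in section 2.2.1 of btf.rst. */
#define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
#define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
#define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)

/* Hypothetical helper: pack the u32 that follows a BTF_KIND_INT btf_type. */
static inline uint32_t demo_int_data(uint32_t encoding, uint32_t offset,
				     uint32_t bits)
{
	/* 4-bit encoding, 8-bit offset, and at most 128 bits per the doc. */
	assert(encoding <= 0xf && offset <= 0xff && bits <= 128);
	return (encoding << 24) | (offset << 16) | bits;
}

/* Hypothetical helper: first bit in the struct occupied by a member,
 * i.e. the member bit offset plus the int type's BTF_INT_OFFSET().
 */
static inline uint32_t demo_first_bit(uint32_t member_bit_offset,
				      uint32_t int_data)
{
	return member_bit_offset + BTF_INT_OFFSET(int_data);
}
```

With these, demo_first_bit(100, demo_int_data(0, 2, 4)) and
demo_first_bit(102, demo_int_data(0, 0, 4)) both give 102, matching the
two equivalent encodings described in the document.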

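Also not part of the patch: the annotations in the llvm assembly dump
in section 6 of btf.rst can be cross-checked with a short C sketch. The
line_col macros are the ones from section 3.3; the demo_info_* helpers
are hypothetical and simply follow the "info" bit layout quoted in
section 2.2 (bits 0-15 vlen, bits 24-27 kind, bit 31 kind_flag).

```c
#include <stdint.h>

/* Hypothetical helpers following the btf_type.info layout in btf.rst. */
static inline uint32_t demo_info_kind(uint32_t info)
{
	return (info >> 24) & 0x0f;	/* bits 24-27 */
}

static inline uint32_t demo_info_vlen(uint32_t info)
{
	return info & 0xffff;		/* bits 0-15 */
}

static inline uint32_t demo_info_kind_flag(uint32_t info)
{
	return info >> 31;		/* bit 31 */
}

/* Macros as quoted in section 3.3 of btf.rst. */
#define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
#define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
```

For example, 218103808 (0xd000000) decodes to kind 13
(BTF_KIND_FUNC_PROTO) with vlen 0, and 7182 decodes to line 7,
column 14, as annotated in the dump.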
diff --git a/Documentation/bpf/btf.rst b/Documentation/bpf/btf.rst
new file mode 100644
index 000000000000..1d434c3a268d
--- /dev/null
+++ b/Documentation/bpf/btf.rst
@@ -0,0 +1,870 @@
+=====================
+BPF Type Format (BTF)
+=====================
+
+1. Introduction
+***************
+
+BTF (BPF Type Format) is the metadata format that
+encodes the debug info related to BPF programs and maps.
+The name BTF was used initially to describe
+data types. BTF was later extended to include
+function info for defined subroutines, and line info
+for source/line information.
+
+The debug info is used for map pretty print, function
+signatures, etc. The function signature enables better
+kernel symbols for bpf programs/functions.
+The line info helps generate
+source-annotated translated byte code, jited code
+and verifier logs.
+
+The BTF specification contains two parts,
+  * BTF kernel API
+  * BTF ELF file format
+
+The kernel API is the contract between
+user space and kernel. The kernel verifies
+the BTF info before using it.
+The ELF file format is a user space contract
+between ELF file and libbpf loader.
+
+The type and string sections are part of the
+BTF kernel API, describing the debug info
+(mostly type related) referenced by the bpf program.
+These two sections are discussed in
+detail in :ref:`BTF_Type_String`.
+
+.. _BTF_Type_String:
+
+2. BTF Type and String Encoding
+*******************************
+
+The file ``include/uapi/linux/btf.h`` provides the high-level
+definition of how types/strings are encoded.
+
+The data blob must begin with::
+
+    struct btf_header {
+        __u16   magic;
+        __u8    version;
+        __u8    flags;
+        __u32   hdr_len;
+
+        /* All offsets are in bytes relative to the end of this header */
+        __u32   type_off;       /* offset of type section       */
+        __u32   type_len;       /* length of type section       */
+        __u32   str_off;        /* offset of string section     */
+        __u32   str_len;        /* length of string section     */
+    };
+
+The magic is ``0xeB9F``, which is encoded differently on big- and
+little-endian systems, and can be used to test whether the BTF was
+generated for a big- or little-endian target.
+The btf_header is designed to be extensible, with hdr_len equal to
+``sizeof(struct btf_header)`` when the data blob is generated.
+
+2.1 String Encoding
+===================
+
+The first string in the string section must be a null string.
+The rest of the string table is a concatenation of other null-terminated
+strings.
+
+2.2 Type Encoding
+=================
+
+The type id ``0`` is reserved for ``void`` type.
+The type section is parsed sequentially and the type id is assigned to
+each recognized type starting from id ``1``.
+Currently, the following types are supported::
+
+    #define BTF_KIND_INT            1       /* Integer      */
+    #define BTF_KIND_PTR            2       /* Pointer      */
+    #define BTF_KIND_ARRAY          3       /* Array        */
+    #define BTF_KIND_STRUCT         4       /* Struct       */
+    #define BTF_KIND_UNION          5       /* Union        */
+    #define BTF_KIND_ENUM           6       /* Enumeration  */
+    #define BTF_KIND_FWD            7       /* Forward      */
+    #define BTF_KIND_TYPEDEF        8       /* Typedef      */
+    #define BTF_KIND_VOLATILE       9       /* Volatile     */
+    #define BTF_KIND_CONST          10      /* Const        */
+    #define BTF_KIND_RESTRICT       11      /* Restrict     */
+    #define BTF_KIND_FUNC           12      /* Function     */
+    #define BTF_KIND_FUNC_PROTO     13      /* Function Proto       */
+
+Note that the type section encodes debug info, not just pure types.
+``BTF_KIND_FUNC`` is not a type; it represents a defined subprogram.
+
+Each type contains the following common data::
+
+    struct btf_type {
+        __u32 name_off;
+        /* "info" bits arrangement
+         * bits  0-15: vlen (e.g. # of struct's members)
+         * bits 16-23: unused
+         * bits 24-27: kind (e.g. int, ptr, array...etc)
+         * bits 28-30: unused
+         * bit     31: kind_flag, currently used by
+         *             struct, union and fwd
+         */
+        __u32 info;
+        /* "size" is used by INT, ENUM, STRUCT and UNION.
+         * "size" tells the size of the type it is describing.
+         *
+         * "type" is used by PTR, TYPEDEF, VOLATILE, CONST, RESTRICT,
+         * FUNC and FUNC_PROTO.
+         * "type" is a type_id referring to another type.
+         */
+        union {
+                __u32 size;
+                __u32 type;
+        };
+    };
+
+For certain kinds, the common data are followed by kind-specific data.
+The ``name_off`` in ``struct btf_type`` specifies the offset in the string table.
+The following sections detail the encoding of each kind.
+
+2.2.1 BTF_KIND_INT
+~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+ * ``name_off``: any valid offset
+ * ``info.kind_flag``: 0
+ * ``info.kind``: BTF_KIND_INT
+ * ``info.vlen``: 0
+ * ``size``: the size of the int type in bytes.
+
+``btf_type`` is followed by a ``u32`` with the following bit arrangement::
+
+  #define BTF_INT_ENCODING(VAL)   (((VAL) & 0x0f000000) >> 24)
+  #define BTF_INT_OFFSET(VAL)     (((VAL) & 0x00ff0000) >> 16)
+  #define BTF_INT_BITS(VAL)       ((VAL)  & 0x000000ff)
+
+The ``BTF_INT_ENCODING`` has the following attributes::
+
+  #define BTF_INT_SIGNED  (1 << 0)
+  #define BTF_INT_CHAR    (1 << 1)
+  #define BTF_INT_BOOL    (1 << 2)
+
+The ``BTF_INT_ENCODING()`` provides extra information (signedness,
+char, or bool) for the int type. The char and bool encodings
+are mostly useful for pretty print. At most one encoding can
+be specified for the int type.
+
+The ``BTF_INT_BITS()`` specifies the number of actual bits held by
+this int type. For example, a 4-bit bitfield is encoded with
+``BTF_INT_BITS()`` equal to 4. The ``btf_type.size * 8``
+must be equal to or greater than ``BTF_INT_BITS()`` for the type.
+The maximum value of ``BTF_INT_BITS()`` is 128.
+
+The ``BTF_INT_OFFSET()`` specifies the starting bit offset to
+calculate values for this int. For example, a bitfield struct
+member has
+
+ * btf member bit offset 100 from the start of the structure,
+ * btf member pointing to an int type,
+ * the int type has ``BTF_INT_OFFSET() = 2`` and ``BTF_INT_BITS() = 4``
+
+Then in the struct memory layout, this member will occupy
+``4`` bits starting from bit ``100 + 2 = 102``.
+
+Alternatively, the bitfield struct member can be the following to
+access the same bits as the above:
+
+ * btf member bit offset 102,
+ * btf member pointing to an int type,
+ * the int type has ``BTF_INT_OFFSET() = 0`` and ``BTF_INT_BITS() = 4``
+
+The original intention of ``BTF_INT_OFFSET()`` is to provide
+flexibility of bitfield encoding.
+Currently, both llvm and pahole generate ``BTF_INT_OFFSET() = 0``
+for all int types.
+
+2.2.2 BTF_KIND_PTR
+~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_PTR
+  * ``info.vlen``: 0
+  * ``type``: the pointee type of the pointer
+
+No additional type data follow ``btf_type``.
+
+2.2.3 BTF_KIND_ARRAY
+~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_ARRAY
+  * ``info.vlen``: 0
+  * ``size/type``: 0, not used
+
+btf_type is followed by one "struct btf_array"::
+
+    struct btf_array {
+        __u32   type;
+        __u32   index_type;
+        __u32   nelems;
+    };
+
+The ``struct btf_array`` encoding:
+  * ``type``: the element type
+  * ``index_type``: the index type
+  * ``nelems``: the number of elements for this array (``0`` is also allowed).
+
+The ``index_type`` can be any regular int type
+(u8, u16, u32, u64, unsigned __int128).
+The original design of including ``index_type`` follows DWARF,
+which has an ``index_type`` for its array type.
+Currently in BTF, beyond type verification, the ``index_type`` is not used.
+
+The ``struct btf_array`` allows chaining through element type to represent
+multidimensional arrays. For example, for ``int a[5][6]``, the following
+type information illustrates the chaining:
+
+  * [1]: int
+  * [2]: array, ``btf_array.type = [1]``, ``btf_array.nelems = 6``
+  * [3]: array, ``btf_array.type = [2]``, ``btf_array.nelems = 5``
+
+Currently, both pahole and llvm collapse multidimensional arrays
+into one-dimensional arrays, e.g., for ``a[5][6]``, the ``btf_array.nelems``
+is equal to ``30``. This is because the original use case is map pretty
+print, where the whole array is dumped out, so a one-dimensional array
+is enough. As more BTF usage is explored, pahole and llvm can be
+changed to generate a proper chained representation for
+multidimensional arrays.
+
+2.2.4 BTF_KIND_STRUCT
+~~~~~~~~~~~~~~~~~~~~~
+2.2.5 BTF_KIND_UNION
+~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0 or offset to a valid C identifier
+  * ``info.kind_flag``: 0 or 1
+  * ``info.kind``: BTF_KIND_STRUCT or BTF_KIND_UNION
+  * ``info.vlen``: the number of struct/union members
+  * ``size``: the size of the struct/union in bytes
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_member``.::
+
+    struct btf_member {
+        __u32   name_off;
+        __u32   type;
+        __u32   offset;
+    };
+
+``struct btf_member`` encoding:
+  * ``name_off``: offset to a valid C identifier
+  * ``type``: the member type
+  * ``offset``: <see below>
+
+If the type info ``kind_flag`` is not set, the offset contains
+only the bit offset of the member. Note that the base type of the
+bitfield can only be an int or enum type. If the bitfield size
+is 32, the base type can be either an int or enum type.
+If the bitfield size is not 32, the base type must be int,
+and the int type ``BTF_INT_BITS()`` encodes the bitfield size.
+
+If the ``kind_flag`` is set, the ``btf_member.offset``
+contains both the member bitfield size and the bit offset. The
+bitfield size and bit offset are calculated as below.::
+
+  #define BTF_MEMBER_BITFIELD_SIZE(val)   ((val) >> 24)
+  #define BTF_MEMBER_BIT_OFFSET(val)      ((val) & 0xffffff)
+
+In this case, if the base type is an int type, it must
+be a regular int type:
+
+  * ``BTF_INT_OFFSET()`` must be 0.
+  * ``BTF_INT_BITS()`` must be equal to ``{1,2,4,8,16} * 8``.
+
+The following kernel patch introduced ``kind_flag`` and
+explained why both modes exist:
+
+  https://github.com/torvalds/linux/commit/9d5f9f701b1891466fb3dbb1806ad97716f95cc3#diff-fa650a64fdd3968396883d2fe8215ff3
+
+2.2.6 BTF_KIND_ENUM
+~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0 or offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_ENUM
+  * ``info.vlen``: number of enum values
+  * ``size``: 4
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_enum``.::
+
+    struct btf_enum {
+        __u32   name_off;
+        __s32   val;
+    };
+
+The ``btf_enum`` encoding:
+  * ``name_off``: offset to a valid C identifier
+  * ``val``: any value
+
+2.2.7 BTF_KIND_FWD
+~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0 for struct, 1 for union
+  * ``info.kind``: BTF_KIND_FWD
+  * ``info.vlen``: 0
+  * ``type``: 0
+
+No additional type data follow ``btf_type``.
+
+2.2.8 BTF_KIND_TYPEDEF
+~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_TYPEDEF
+  * ``info.vlen``: 0
+  * ``type``: the type that can be referred to by the name at ``name_off``
+
+No additional type data follow ``btf_type``.
+
+2.2.9 BTF_KIND_VOLATILE
+~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_VOLATILE
+  * ``info.vlen``: 0
+  * ``type``: the type with ``volatile`` qualifier
+
+No additional type data follow ``btf_type``.
+
+2.2.10 BTF_KIND_CONST
+~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_CONST
+  * ``info.vlen``: 0
+  * ``type``: the type with ``const`` qualifier
+
+No additional type data follow ``btf_type``.
+
+2.2.11 BTF_KIND_RESTRICT
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_RESTRICT
+  * ``info.vlen``: 0
+  * ``type``: the type with ``restrict`` qualifier
+
+No additional type data follow ``btf_type``.
+
+2.2.12 BTF_KIND_FUNC
+~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: offset to a valid C identifier
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_FUNC
+  * ``info.vlen``: 0
+  * ``type``: a BTF_KIND_FUNC_PROTO type
+
+No additional type data follow ``btf_type``.
+
+A BTF_KIND_FUNC defines not a type but a subprogram (function) whose
+signature is defined by ``type``. The subprogram is thus an instance of
+that type. The BTF_KIND_FUNC may in turn be referenced by a func_info in
+the :ref:`BTF_Ext_Section` (ELF) or in the arguments to
+:ref:`BPF_Prog_Load` (ABI).
+
+2.2.13 BTF_KIND_FUNC_PROTO
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+``struct btf_type`` encoding requirement:
+  * ``name_off``: 0
+  * ``info.kind_flag``: 0
+  * ``info.kind``: BTF_KIND_FUNC_PROTO
+  * ``info.vlen``: # of parameters
+  * ``type``: the return type
+
+``btf_type`` is followed by ``info.vlen`` number of ``struct btf_param``.::
+
+    struct btf_param {
+        __u32   name_off;
+        __u32   type;
+    };
+
+If a BTF_KIND_FUNC_PROTO type is referred to by a BTF_KIND_FUNC type,
+then ``btf_param.name_off`` must point to a valid C identifier,
+except for a possible last argument representing the variable
+argument. The ``btf_param.type`` refers to the parameter type.
+
+If the function has variable arguments, the last parameter
+is encoded with ``name_off = 0`` and ``type = 0``.
+
+3. BTF Kernel API
+*****************
+
+The following bpf syscall commands involve BTF:
+   * BPF_BTF_LOAD: load a blob of BTF data into kernel
+   * BPF_MAP_CREATE: map creation with btf key and value type info.
+   * BPF_PROG_LOAD: prog load with btf function and line info.
+   * BPF_BTF_GET_FD_BY_ID: get a btf fd
+   * BPF_OBJ_GET_INFO_BY_FD: btf, func_info, line_info
+     and other btf related info are returned.
+
+The workflow typically looks like:
+::
+
+  Application:
+      BPF_BTF_LOAD
+          |
+          V
+      BPF_MAP_CREATE and BPF_PROG_LOAD
+          |
+          V
+      ......
+
+  Introspection tool:
+      ......
+      BPF_{PROG,MAP}_GET_NEXT_ID (get prog/map id's)
+          |
+          V
+      BPF_{PROG,MAP}_GET_FD_BY_ID (get a prog/map fd)
+          |
+          V
+      BPF_OBJ_GET_INFO_BY_FD (get bpf_prog_info/bpf_map_info with btf_id)
+          |                                     |
+          V                                     |
+      BPF_BTF_GET_FD_BY_ID (get btf_fd)         |
+          |                                     |
+          V                                     |
+      BPF_OBJ_GET_INFO_BY_FD (get btf)          |
+          |                                     |
+          V                                     V
+      pretty print types, dump func signatures and line info, etc.
+
+
+3.1 BPF_BTF_LOAD
+================
+
+Load a blob of BTF data into the kernel. A blob of data
+described in :ref:`BTF_Type_String`
+can be directly loaded into the kernel.
+A ``btf_fd`` is returned to userspace.
+
+3.2 BPF_MAP_CREATE
+==================
+
+A map can be created with ``btf_fd`` and the specified key/value type ids.::
+
+    __u32   btf_fd;         /* fd pointing to a BTF type data */
+    __u32   btf_key_type_id;        /* BTF type_id of the key */
+    __u32   btf_value_type_id;      /* BTF type_id of the value */
+
+In libbpf, the map can be defined with extra annotation like below:
+::
+
+    struct bpf_map_def SEC("maps") btf_map = {
+        .type = BPF_MAP_TYPE_ARRAY,
+        .key_size = sizeof(int),
+        .value_size = sizeof(struct ipv_counts),
+        .max_entries = 4,
+    };
+    BPF_ANNOTATE_KV_PAIR(btf_map, int, struct ipv_counts);
+
+Here, the parameters for the macro BPF_ANNOTATE_KV_PAIR are the map name
+and the key and value types for the map.
+During ELF parsing, libbpf is able to extract the key/value type_id's
+and assign them to BPF_MAP_CREATE attributes automatically.
+
+.. _BPF_Prog_Load:
+
+3.3 BPF_PROG_LOAD
+=================
+
+During prog_load, func_info and line_info can be passed to kernel with
+proper values for the following attributes:
+::
+
+    __u32           insn_cnt;
+    __aligned_u64   insns;
+    ......
+    __u32           prog_btf_fd;    /* fd pointing to BTF type data */
+    __u32           func_info_rec_size;     /* userspace bpf_func_info size */
+    __aligned_u64   func_info;      /* func info */
+    __u32           func_info_cnt;  /* number of bpf_func_info records */
+    __u32           line_info_rec_size;     /* userspace bpf_line_info size */
+    __aligned_u64   line_info;      /* line info */
+    __u32           line_info_cnt;  /* number of bpf_line_info records */
+
+The func_info and line_info are arrays of the structures below, respectively.::
+
+    struct bpf_func_info {
+        __u32   insn_off; /* [0, insn_cnt - 1] */
+        __u32   type_id;  /* pointing to a BTF_KIND_FUNC type */
+    };
+    struct bpf_line_info {
+        __u32   insn_off; /* [0, insn_cnt - 1] */
+        __u32   file_name_off; /* offset to string table for the filename */
+        __u32   line_off; /* offset to string table for the source line */
+        __u32   line_col; /* line number and column number */
+    };
+
+func_info_rec_size is the size of each func_info record, and line_info_rec_size
+is the size of each line_info record. Passing the record size to the kernel makes
+it possible to extend the record itself in the future.
+
+Below are requirements for func_info:
+  * func_info[0].insn_off must be 0.
+  * the func_info insn_off is in strictly increasing order and matches
+    bpf func boundaries.
+
+Below are requirements for line_info:
+  * the first insn in each func must point to a line_info record.
+  * the line_info insn_off is in strictly increasing order.
+
+For line_info, the line number and column number are defined as below:
+::
+
+    #define BPF_LINE_INFO_LINE_NUM(line_col)        ((line_col) >> 10)
+    #define BPF_LINE_INFO_LINE_COL(line_col)        ((line_col) & 0x3ff)
+
+3.4 BPF_{PROG,MAP}_GET_NEXT_ID
+==============================
+In the kernel, every loaded program, map or btf has a unique id.
+The id won't change during the lifetime of the program, map or btf.
+
+The bpf syscall command BPF_{PROG,MAP}_GET_NEXT_ID
+returns the ids of bpf programs or maps to user space,
+one id per call,
+so the inspection tool can inspect all programs and maps.
+
+3.5 BPF_{PROG,MAP}_GET_FD_BY_ID
+===============================
+The introspection tool cannot use an id to get details about a program or map.
+A file descriptor needs to be obtained first for reference counting purposes.
+
+3.6 BPF_OBJ_GET_INFO_BY_FD
+==========================
+
+Once a program/map fd is acquired, the introspection tool can
+get detailed information from the kernel about this fd,
+some of which is btf related. For example,
+``bpf_map_info`` returns ``btf_id`` and the key/value type ids.
+``bpf_prog_info`` returns ``btf_id``, func_info and line info
+for translated bpf byte codes, and jited_line_info.
+
+3.7 BPF_BTF_GET_FD_BY_ID
+========================
+
+With ``btf_id`` obtained in ``bpf_map_info`` and ``bpf_prog_info``,
+bpf syscall command BPF_BTF_GET_FD_BY_ID can retrieve a btf fd.
+Then, with command BPF_OBJ_GET_INFO_BY_FD, the btf blob, originally
+loaded into the kernel with BPF_BTF_LOAD, can be retrieved.
+
+With the btf blob, ``bpf_map_info`` and ``bpf_prog_info``, the introspection
+tool has full btf knowledge and is able to pretty print map key/values,
+dump func signatures, dump line info along with byte/jit codes.
+
+4. ELF File Format Interface
+****************************
+
+4.1 .BTF section
+================
+
+The .BTF section contains type and string data. The format of this section
+is the same as the one described in :ref:`BTF_Type_String`.
+
+.. _BTF_Ext_Section:
+
+4.2 .BTF.ext section
+====================
+
+The .BTF.ext section encodes func_info and line_info, which
+need loader manipulation before loading into the kernel.
+
+The specification for the .BTF.ext section is defined in
+``tools/lib/bpf/btf.h`` and ``tools/lib/bpf/btf.c``.
+
+The current header of .BTF.ext section::
+
+    struct btf_ext_header {
+        __u16   magic;
+        __u8    version;
+        __u8    flags;
+        __u32   hdr_len;
+
+        /* All offsets are in bytes relative to the end of this header */
+        __u32   func_info_off;
+        __u32   func_info_len;
+        __u32   line_info_off;
+        __u32   line_info_len;
+    };
+
+It is very similar to the .BTF section. Instead of type/string sections,
+it contains func_info and line_info sections. See :ref:`BPF_Prog_Load`
+for details about the func_info and line_info record formats.
+
+The func_info is organized as below.::
+
+     func_info_rec_size
+     btf_ext_info_sec for section #1 /* func_info for section #1 */
+     btf_ext_info_sec for section #2 /* func_info for section #2 */
+     ...
+
+``func_info_rec_size`` specifies the size of ``bpf_func_info`` structure
+when .BTF.ext is generated. btf_ext_info_sec, defined below, is
+the func_info for each specific ELF section.::
+
+     struct btf_ext_info_sec {
+        __u32   sec_name_off; /* offset to section name */
+        __u32   num_info;
+        /* Followed by num_info * record_size number of bytes */
+        __u8    data[0];
+     };
+
+Here, num_info must be greater than 0.
+
+The line_info is organized as below.::
+
+     line_info_rec_size
+     btf_ext_info_sec for section #1 /* line_info for section #1 */
+     btf_ext_info_sec for section #2 /* line_info for section #2 */
+     ...
+
+``line_info_rec_size`` specifies the size of ``bpf_line_info`` structure
+when .BTF.ext is generated.
+
+The interpretation of ``bpf_func_info->insn_off`` and
+``bpf_line_info->insn_off`` differs between the kernel API and the ELF API.
+For the kernel API, the ``insn_off`` is the instruction offset in units
+of ``struct bpf_insn``. For the ELF API, the ``insn_off`` is the byte offset
+from the beginning of the section (``btf_ext_info_sec->sec_name_off``).
+
+5. Using BTF
+************
+
+5.1 bpftool map pretty print
+============================
+
+With BTF, the map key/value can be printed based on fields rather than
+simply raw bytes. This is especially
+valuable for large structures or if your data structure
+has bitfields. For example, for the following map,::
+
+      enum A { A1, A2, A3, A4, A5 };
+      typedef enum A ___A;
+      struct tmp_t {
+           char a1:4;
+           int  a2:4;
+           int  :4;
+           __u32 a3:4;
+           int b;
+           ___A b1:4;
+           enum A b2:4;
+      };
+      struct bpf_map_def SEC("maps") tmpmap = {
+           .type = BPF_MAP_TYPE_ARRAY,
+           .key_size = sizeof(__u32),
+           .value_size = sizeof(struct tmp_t),
+           .max_entries = 1,
+      };
+      BPF_ANNOTATE_KV_PAIR(tmpmap, int, struct tmp_t);
+
+bpftool is able to pretty print like below:
+::
+
+      [{
+            "key": 0,
+            "value": {
+                "a1": 0x2,
+                "a2": 0x4,
+                "a3": 0x6,
+                "b": 7,
+                "b1": 0x8,
+                "b2": 0xa
+            }
+        }
+      ]
+
+5.2 bpftool prog dump
+=====================
+
+The following is an example showing how func_info and line_info
+can help prog dump with better kernel symbol names, function prototypes
+and line information.::
+
+    $ bpftool prog dump jited pinned /sys/fs/bpf/test_btf_haskv
+    [...]
+    int test_long_fname_2(struct dummy_tracepoint_args * arg):
+    bpf_prog_44a040bf25481309_test_long_fname_2:
+    ; static int test_long_fname_2(struct dummy_tracepoint_args *arg)
+       0:   push   %rbp
+       1:   mov    %rsp,%rbp
+       4:   sub    $0x30,%rsp
+       b:   sub    $0x28,%rbp
+       f:   mov    %rbx,0x0(%rbp)
+      13:   mov    %r13,0x8(%rbp)
+      17:   mov    %r14,0x10(%rbp)
+      1b:   mov    %r15,0x18(%rbp)
+      1f:   xor    %eax,%eax
+      21:   mov    %rax,0x20(%rbp)
+      25:   xor    %esi,%esi
+    ; int key = 0;
+      27:   mov    %esi,-0x4(%rbp)
+    ; if (!arg->sock)
+      2a:   mov    0x8(%rdi),%rdi
+    ; if (!arg->sock)
+      2e:   cmp    $0x0,%rdi
+      32:   je     0x0000000000000070
+      34:   mov    %rbp,%rsi
+    ; counts = bpf_map_lookup_elem(&btf_map, &key);
+    [...]
+
+5.3 verifier log
+================
+
+The following is an example of how line_info can help debug verifier failures.::
+
+       /* The code at tools/testing/selftests/bpf/test_xdp_noinline.c
+        * is modified as below.
+        */
+       data = (void *)(long)xdp->data;
+       data_end = (void *)(long)xdp->data_end;
+       /*
+       if (data + 4 > data_end)
+               return XDP_DROP;
+       */
+       *(u32 *)data = dst->dst;
+
+    $ bpftool prog load ./test_xdp_noinline.o /sys/fs/bpf/test_xdp_noinline type xdp
+        ; data = (void *)(long)xdp->data;
+        224: (79) r2 = *(u64 *)(r10 -112)
+        225: (61) r2 = *(u32 *)(r2 +0)
+        ; *(u32 *)data = dst->dst;
+        226: (63) *(u32 *)(r2 +0) = r1
+        invalid access to packet, off=0 size=4, R2(id=0,off=0,r=0)
+        R2 offset is outside of the packet
+
+6. BTF Generation
+*****************
+
+You need the latest pahole
+
+  https://git.kernel.org/pub/scm/devel/pahole/pahole.git/
+
+or llvm (8.0 or later). pahole acts as a dwarf2btf converter. It doesn't support .BTF.ext
+and the btf BTF_KIND_FUNC type yet. For example,::
+
+      -bash-4.4$ cat t.c
+      struct t {
+        int a:2;
+        int b:3;
+        int c:2;
+      } g;
+      -bash-4.4$ gcc -c -O2 -g t.c
+      -bash-4.4$ pahole -JV t.o
+      File t.o:
+      [1] STRUCT t kind_flag=1 size=4 vlen=3
+              a type_id=2 bitfield_size=2 bits_offset=0
+              b type_id=2 bitfield_size=3 bits_offset=2
+              c type_id=2 bitfield_size=2 bits_offset=5
+      [2] INT int size=4 bit_offset=0 nr_bits=32 encoding=SIGNED
+
+llvm is able to generate .BTF and .BTF.ext directly with -g, for the bpf target only.
+The assembly code (-S) is able to show the BTF encoding in assembly format.::
+
+    -bash-4.4$ cat t2.c
+    typedef int __int32;
+    struct t2 {
+      int a2;
+      int (*f2)(char q1, __int32 q2, ...);
+      int (*f3)();
+    } g2;
+    int main() { return 0; }
+    int test() { return 0; }
+    -bash-4.4$ clang -c -g -O2 -target bpf t2.c
+    -bash-4.4$ readelf -S t2.o
+      ......
+      [ 8] .BTF              PROGBITS         0000000000000000  00000247
+           000000000000016e  0000000000000000           0     0     1
+      [ 9] .BTF.ext          PROGBITS         0000000000000000  000003b5
+           0000000000000060  0000000000000000           0     0     1
+      [10] .rel.BTF.ext      REL              0000000000000000  000007e0
+           0000000000000040  0000000000000010          16     9     8
+      ......
+    -bash-4.4$ clang -S -g -O2 -target bpf t2.c
+    -bash-4.4$ cat t2.s
+      ......
+            .section        .BTF,"",@progbits
+            .short  60319                   # 0xeb9f
+            .byte   1
+            .byte   0
+            .long   24
+            .long   0
+            .long   220
+            .long   220
+            .long   122
+            .long   0                       # BTF_KIND_FUNC_PROTO(id = 1)
+            .long   218103808               # 0xd000000
+            .long   2
+            .long   83                      # BTF_KIND_INT(id = 2)
+            .long   16777216                # 0x1000000
+            .long   4
+            .long   16777248                # 0x1000020
+      ......
+            .byte   0                       # string offset=0
+            .ascii  ".text"                 # string offset=1
+            .byte   0
+            .ascii  "/home/yhs/tmp-pahole/t2.c" # string offset=7
+            .byte   0
+            .ascii  "int main() { return 0; }" # string offset=33
+            .byte   0
+            .ascii  "int test() { return 0; }" # string offset=58
+            .byte   0
+            .ascii  "int"                   # string offset=83
+      ......
+            .section        .BTF.ext,"",@progbits
+            .short  60319                   # 0xeb9f
+            .byte   1
+            .byte   0
+            .long   24
+            .long   0
+            .long   28
+            .long   28
+            .long   44
+            .long   8                       # FuncInfo
+            .long   1                       # FuncInfo section string offset=1
+            .long   2
+            .long   .Lfunc_begin0
+            .long   3
+            .long   .Lfunc_begin1
+            .long   5
+            .long   16                      # LineInfo
+            .long   1                       # LineInfo section string offset=1
+            .long   2
+            .long   .Ltmp0
+            .long   7
+            .long   33
+            .long   7182                    # Line 7 Col 14
+            .long   .Ltmp3
+            .long   7
+            .long   58
+            .long   8206                    # Line 8 Col 14
+
+7. Testing
+**********
+
+Kernel bpf selftest `test_btf.c` provides an extensive set of BTF-related tests.
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index 00a8450a602f..4e77932959cc 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -15,6 +15,13 @@ that goes into great technical depth about the BPF Architecture.
 The primary info for the bpf syscall is available in the `man-pages`_
 for `bpf(2)`_.
 
+BPF Type Format (BTF)
+=====================
+
+.. toctree::
+   :maxdepth: 1
+
+   btf
 
 
 Frequently asked questions (FAQ)
-- 
2.17.1
