Message-Id: <20190617180109.34950-9-sdf@google.com>
Date: Mon, 17 Jun 2019 11:01:08 -0700
From: Stanislav Fomichev <sdf@...gle.com>
To: netdev@...r.kernel.org, bpf@...r.kernel.org
Cc: davem@...emloft.net, ast@...nel.org, daniel@...earbox.net,
Stanislav Fomichev <sdf@...gle.com>, Martin Lau <kafai@...com>
Subject: [PATCH bpf-next v6 8/9] bpf: add sockopt documentation
Provide user documentation about sockopt prog type and cgroup hooks.
v6:
* describe cgroup chaining, add example
v2:
* use return code 2 for kernel bypass
Cc: Martin Lau <kafai@...com>
Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
---
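Note for readers: the chaining rule described under "Cgroup Inheritance"
in the new document can be modelled in plain userspace C. The sketch below
is only an illustration, not kernel code and not part of this patch; the
names ``sockopt_verdict``, ``fake_prog`` and ``run_chain`` are hypothetical.
It assumes exactly the behaviour the document states: programs run bottom-up
and the chain stops at the first program that returns something other than 1.

```c
#include <assert.h>
#include <stddef.h>

/* Return codes as listed in the "Return Type" section of the new document. */
enum sockopt_verdict {
	SOCKOPT_REJECT   = 0,	/* reject the syscall, EPERM to userspace */
	SOCKOPT_CONTINUE = 1,	/* success, kernel also handles the option */
	SOCKOPT_BYPASS   = 2,	/* success, kernel does not handle the option */
};

/* Hypothetical stand-in for one BPF program attached to a cgroup. */
struct fake_prog {
	const char *cgroup;	/* cgroup name, for example "C" */
	int verdict;		/* value this program would return */
	int ran;		/* set to 1 once the program was executed */
};

/*
 * Walk the chain in the order given (bottom-up, i.e. C, B, A) and stop
 * at the first program whose verdict is not SOCKOPT_CONTINUE.
 * Returns the verdict that decides the outcome of the syscall.
 */
static int run_chain(struct fake_prog *progs, size_t n)
{
	int verdict = SOCKOPT_CONTINUE;
	size_t i;

	assert(progs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		progs[i].ran = 1;
		verdict = progs[i].verdict;
		if (verdict != SOCKOPT_CONTINUE)
			break;	/* 0 or 2: remaining programs are skipped */
	}
	return verdict;
}
```

With a chain of C returning 1, B returning 2 and A returning 1, the model
reports a final verdict of 2 and leaves A unexecuted, which matches the
example at the end of the "Cgroup Inheritance" section.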
Documentation/bpf/index.rst | 1 +
Documentation/bpf/prog_cgroup_sockopt.rst | 72 +++++++++++++++++++++++
2 files changed, 73 insertions(+)
create mode 100644 Documentation/bpf/prog_cgroup_sockopt.rst
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index d3fe4cac0c90..801a6ed3f2e5 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -42,6 +42,7 @@ Program types
.. toctree::
:maxdepth: 1
+ prog_cgroup_sockopt
prog_cgroup_sysctl
prog_flow_dissector
diff --git a/Documentation/bpf/prog_cgroup_sockopt.rst b/Documentation/bpf/prog_cgroup_sockopt.rst
new file mode 100644
index 000000000000..8b9d55a3e655
--- /dev/null
+++ b/Documentation/bpf/prog_cgroup_sockopt.rst
@@ -0,0 +1,72 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================
+BPF_PROG_TYPE_CGROUP_SOCKOPT
+============================
+
+The ``BPF_PROG_TYPE_CGROUP_SOCKOPT`` program type can be attached to two
+cgroup hooks:
+
+* ``BPF_CGROUP_GETSOCKOPT`` - called every time a process executes the
+ ``getsockopt`` system call.
+* ``BPF_CGROUP_SETSOCKOPT`` - called every time a process executes the
+ ``setsockopt`` system call.
+
+The context (``struct bpf_sockopt``) holds the associated socket (``sk``)
+and all input arguments: ``level``, ``optname``, ``optval`` and ``optlen``.
+
+BPF_CGROUP_SETSOCKOPT
+=====================
+
+``BPF_CGROUP_SETSOCKOPT`` has a read-only context and this hook has
+access to cgroup and socket local storage.
+
+BPF_CGROUP_GETSOCKOPT
+=====================
+
+``BPF_CGROUP_GETSOCKOPT`` has to fill in ``optval`` and adjust
+``optlen`` accordingly. Input ``optlen`` contains the maximum length
+of data that can be returned to userspace. In other words, the BPF
+program can't increase ``optlen``; it can only decrease it.
+
+Return Type
+===========
+
+* ``0`` - reject the syscall, ``EPERM`` will be returned to the userspace.
+* ``1`` - success: after returning from the BPF hook, kernel will also
+ handle this socket option.
+* ``2`` - success: after returning from the BPF hook, kernel will *not*
+ handle this socket option; control will be returned to the userspace
+ instead.
+
+Cgroup Inheritance
+==================
+
+Suppose there is the following cgroup hierarchy, where each cgroup
+has a ``BPF_CGROUP_GETSOCKOPT`` program attached with the
+``BPF_F_ALLOW_MULTI`` flag::
+
+ A (root)
+ \
+ B
+ \
+ C
+
+When the application calls the ``getsockopt`` syscall from cgroup C,
+the programs are executed from the bottom up: C, B, A. As long as
+BPF programs in the chain return 1, the execution continues. If
+some program in the C, B, A chain returns 2 (bypass kernel) or
+0 (EPERM), control is immediately passed back to
+userspace. This is in contrast with any existing per-cgroup BPF
+hook where all programs are called, even if some of them return
+0 (EPERM).
+
+In the example above, if C returns 1 (continue) and then B returns
+0 (EPERM) or 2 (bypass kernel), the program attached to A will *not*
+be executed.
+
+Example
+=======
+
+See ``tools/testing/selftests/bpf/progs/sockopt_sk.c`` for an example
+of a BPF program that handles socket options.
--
2.22.0.410.gd8fdbe21b5-goog