Message-ID: <20250515211606.2697271-1-ameryhung@gmail.com>
Date: Thu, 15 May 2025 14:15:59 -0700
From: Amery Hung <ameryhung@...il.com>
To: bpf@...r.kernel.org
Cc: netdev@...r.kernel.org,
alexei.starovoitov@...il.com,
andrii@...nel.org,
daniel@...earbox.net,
tj@...nel.org,
memxor@...il.com,
martin.lau@...nel.org,
ameryhung@...il.com,
kernel-team@...a.com
Subject: [PATCH bpf-next v4 0/3] Task local data
* Overview *
Task local data defines an abstract storage type for storing data specific
to each task and provides user space and bpf libraries to access it. The
result is a fast and easy way to share per-task data between user space
and bpf programs. The intended use case is sched_ext, where user space
programs will pass hints to sched_ext bpf programs to affect task
scheduling.
Task local data is built on top of the task local storage map and UPTR[0]
to achieve fast per-task data sharing. UPTR is a special field type
supported in task local storage map values. A user page assigned to a UPTR
will be pinned by the kernel when the map is updated. Therefore, user
space programs can update data seen by bpf programs without syscalls.
Additionally, unlike most bpf maps, task local data does not require a
static map value definition. This design is driven by sched_ext, which
would like to allow multiple developers to share a storage without
having to explicitly agree on its layout. While a centralized layout
definition would have worked, the friction of synchronizing it across
different repos is undesirable. Avoiding it simplifies code base
management and makes experimenting easier.
In the rest of the cover letter, "task local data" is used to refer to
the abstract storage and TLD is used to denote a single data entry in
the storage.
* Design *
The task local data library provides simple APIs for user space and bpf
through two header files, task_local_data.h and task_local_data.bpf.h,
respectively. The usage is illustrated in the following diagram.
An entry of data in the task local data, a TLD, first needs to be created
in user space by calling tld_create_key() with the size of the data
and a name associated with the data. The function returns an opaque key
object of type tld_key_t, which can be used to locate a TLD. The same
key may be passed to tld_get_data() in different threads, and a pointer
to data specific to the calling thread will be returned. The pointer
remains valid until the process terminates, so there is no need to call
tld_get_data() again for subsequent accesses.
On the bpf side, programs also use keys to locate TLDs. For every new
task, a bpf program must first fetch the keys and save them for later
use. This is done by calling tld_fetch_key() with the names passed to
tld_create_key(). The keys are saved in a task local storage map,
tld_key_map. The map value type, struct tld_keys, __must__ be defined by
developers. It should contain the keys used in the compilation unit. Finally,
bpf programs can call tld_get_data() to get a pointer to a TLD that is
shared with user space.

┌─ Application ───────────────────────────────────────────────────────┐
│ tld_key_t kx = tld_create_key(fd, "X", sizeof(int));                │
│ ...                           ┌─ library A ────────────────────────┐│
│ int *x = tld_get_data(fd, kx);│ ky = tld_create_key(fd, "Y",       ││
│ if (x) *x = 123;              │                     sizeof(bool)); ││
│                               │ bool *y = tld_get_data(ky);        ││
│                         ┌─────┤ if (y) *y = true;                  ││
│                         │     └────────────────────────────────────┘│
└───────┬─────────────────│───────────────────────────────────────────┘
        V                 V
+ ─ Task local data ─ ─ ─ ─ ─ +  ┌─ sched_ext_ops::init_task ────────┐
| ┌─ tld_data_map ──────────┐ |  │ tld_init_object(task, &tld_obj);  │
| │ BPF Task local storage  │ |  │ tld_fetch_key(&tld_obj, "X", kx); │
| │                         │ |<─┤ tld_fetch_key(&tld_obj, "Y", ky); │
| │ data_page __uptr *data  │ |  └───────────────────────────────────┘
| │ metadata_page __uptr *metadata
| └─────────────────────────┘ |  ┌─ Other sched_ext_ops op ──────────┐
| ┌─ tld_key_map ───────────┐ |  │ tld_init_object(task, &tld_obj);  │
| │ BPF Task local storage  │ |  │ bool *y = tld_get_data(&tld_obj,  ├┐
| │                         │ |<─┤                        ky, 1);    ││
| │ tld_key_t kx;           │ |  │ if (y)                            ││
| │ tld_key_t ky;           │ |  │     /* do something */            ││
| └─────────────────────────┘ |  └┬──────────────────────────────────┘│
+ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ +   └───────────────────────────────────┘

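The bpf-side flow described above (resolve names once per task, cache the
keys, then use the cached keys in later callbacks) can be mimicked in plain
user-space C. This is only a simulation under stated assumptions, not the
library: every demo_* name is hypothetical, a fixed name-to-offset table
stands in for the metadata lookup, and a struct stands in for a task and
its tld_key_map entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096	/* assumed size of the per-task data page */

/* Stand-in for the opaque key: an offset into the data page, -1 if unset. */
typedef struct { int off; } demo_key_t;

/* Plays the role of struct tld_keys: one cached key per TLD used here. */
struct demo_keys {
	demo_key_t kx;
	demo_key_t ky;
};

/* Plays the role of one task: its data page plus its cached keys. */
struct demo_task {
	char data[DEMO_PAGE_SIZE];
	struct demo_keys keys;
};

/* Hypothetical name-to-offset table standing in for the metadata lookup. */
static const struct { const char *name; int off; } demo_meta[] = {
	{ "X", 0 },
	{ "Y", (int)sizeof(int) },
};

/* Resolve a name to a key; the key stays at -1 if the name is unknown. */
static demo_key_t demo_fetch_key(const char *name)
{
	demo_key_t key = { .off = -1 };

	for (size_t i = 0; i < sizeof(demo_meta) / sizeof(demo_meta[0]); i++)
		if (strcmp(demo_meta[i].name, name) == 0)
			key.off = demo_meta[i].off;
	return key;
}

/* Counterpart of init_task: resolve names once and cache the keys. */
static void demo_init_task(struct demo_task *t)
{
	memset(t->data, 0, sizeof(t->data));
	t->keys.kx = demo_fetch_key("X");
	t->keys.ky = demo_fetch_key("Y");
}

/* Counterpart of a later op: a cached key yields a pointer, or NULL. */
static void *demo_get_data(struct demo_task *t, demo_key_t key)
{
	if (key.off < 0 || key.off >= DEMO_PAGE_SIZE)
		return NULL;
	return t->data + key.off;
}

/* Usage: the same key resolves to separate storage in each task. */
static void demo_usage(void)
{
	static struct demo_task a, b;
	unsigned char *ya, *yb;

	demo_init_task(&a);
	demo_init_task(&b);
	ya = demo_get_data(&a, a.keys.ky);
	yb = demo_get_data(&b, b.keys.ky);
	assert(ya && yb && ya != yb);
	*ya = 1;		/* set the flag for task a only */
	assert(*yb == 0);	/* task b does not see it */
}
```

The point of the split is that the name comparison happens once per task;
later callbacks only pay for an offset check and a pointer addition.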
* Implementation *
Task local data defines the storage to be a task local storage map with
two UPTRs, data and metadata. Data points to a blob of memory for storing
TLDs individual to every task. Metadata, individual to each process and
shared by its threads, records the number of TLDs declared and the
metadata of each TLD. Metadata for a TLD contains the key name and the
size of the TLD in data.

struct data_page {
	char data[PAGE_SIZE];
};

struct metadata_page {
	u8 cnt;
	struct metadata data[TLD_DATA_CNT];
};

Both the user space and bpf APIs follow the same protocol when accessing
task local data. A pointer to a TLD is located by a key. tld_key_t
is effectively the offset of a TLD in data. To add a TLD, the user space
API, tld_create_key(), loops through metadata->data until an empty slot
is found and updates it. It also adds up the sizes of prior TLDs along the
way to derive the offset. To fetch a key, the bpf API, tld_fetch_key(),
loops through metadata->data until the key name is found, deriving the
offset by adding sizes in the same way. The details of task local data
operations can be found in patch 1.
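As a rough illustration of the protocol just described, the following
user-space C sketch walks a metadata array to create and to fetch keys. It
is not the code from patch 1: all sketch_* names, the name length and slot
count, the duplicate-name check, and the error convention (offset -1) are
assumptions made for the example, and it ignores alignment and concurrent
key creation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size */
#define SKETCH_NAME_LEN  32	/* assumed max key name length, incl. NUL */
#define SKETCH_DATA_CNT  8	/* assumed max number of TLDs */

/* One metadata slot: key name plus size of the TLD in the data page. */
struct sketch_metadata {
	char name[SKETCH_NAME_LEN];
	uint16_t size;
};

/* Modeled after struct metadata_page in the cover letter. */
struct sketch_metadata_page {
	uint8_t cnt;
	struct sketch_metadata data[SKETCH_DATA_CNT];
};

/* A key is modeled as an offset into the data page; -1 means failure. */
typedef struct { int off; } sketch_key_t;

/* User space side: sum sizes of used slots, then claim the first empty one. */
static sketch_key_t sketch_create_key(struct sketch_metadata_page *m,
				      const char *name, uint16_t size)
{
	sketch_key_t key = { .off = -1 };
	int off = 0;

	if (size == 0 || strlen(name) >= SKETCH_NAME_LEN)
		return key;

	for (int i = 0; i < SKETCH_DATA_CNT; i++) {
		if (i < m->cnt) {
			/* Used slot: reject duplicates (an assumption), else skip its data. */
			if (strcmp(m->data[i].name, name) == 0)
				return key;
			off += m->data[i].size;
			continue;
		}
		/* First empty slot: the new TLD must still fit in the page. */
		if (off + size > SKETCH_PAGE_SIZE)
			return key;
		strcpy(m->data[i].name, name);
		m->data[i].size = size;
		m->cnt++;
		key.off = off;
		return key;
	}
	return key;	/* every slot is in use */
}

/* bpf side: scan for the name, summing sizes of earlier TLDs as the offset. */
static sketch_key_t sketch_fetch_key(const struct sketch_metadata_page *m,
				     const char *name)
{
	sketch_key_t key = { .off = -1 };
	int off = 0;

	for (int i = 0; i < m->cnt && i < SKETCH_DATA_CNT; i++) {
		if (strcmp(m->data[i].name, name) == 0) {
			key.off = off;
			return key;
		}
		off += m->data[i].size;
	}
	return key;
}

/* Usage: user space creates "X" then "Y"; the bpf side resolves "Y" by name. */
static void sketch_usage(void)
{
	struct sketch_metadata_page m = { 0 };

	assert(sketch_create_key(&m, "X", sizeof(int)).off == 0);
	assert(sketch_create_key(&m, "Y", sizeof(char)).off == (int)sizeof(int));
	assert(sketch_fetch_key(&m, "Y").off == (int)sizeof(int));
}
```

Because both sides derive the offset from the same ordered list of sizes,
no layout needs to be agreed on ahead of time; only the key names must
match.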
[0] https://lore.kernel.org/bpf/20241023234759.860539-1-martin.lau@linux.dev/
v3 -> v4
  - API improvements
    - Simplify API
    - Drop string obfuscation
    - Use opaque type for key
    - Better documentation
  - Implementation
    - Switch to dynamic allocation for per-task data
    - Now offer as header-only libraries
    - No TLS map pinning; leave it to users
    - Drop pthread dependency
  - Add more invalid tld_create_key() tests
  - Add a race test for tld_create_key()
v3: https://lore.kernel.org/bpf/20250425214039.2919818-1-ameryhung@gmail.com/
Amery Hung (3):
selftests/bpf: Introduce task local data
selftests/bpf: Test basic task local data operations
selftests/bpf: Test concurrent task local data key creation
.../bpf/prog_tests/task_local_data.h | 263 ++++++++++++++++++
.../bpf/prog_tests/test_task_local_data.c | 254 +++++++++++++++++
.../selftests/bpf/progs/task_local_data.bpf.h | 220 +++++++++++++++
.../bpf/progs/test_task_local_data.c | 81 ++++++
4 files changed, 818 insertions(+)
create mode 100644 tools/testing/selftests/bpf/prog_tests/task_local_data.h
create mode 100644 tools/testing/selftests/bpf/prog_tests/test_task_local_data.c
create mode 100644 tools/testing/selftests/bpf/progs/task_local_data.bpf.h
create mode 100644 tools/testing/selftests/bpf/progs/test_task_local_data.c
--
2.47.1