lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 25 Mar 2022 12:16:27 -0700
From:   Yonghong Song <yhs@...com>
To:     Hao Luo <haoluo@...gle.com>, Alexei Starovoitov <ast@...nel.org>,
        Andrii Nakryiko <andrii@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>
Cc:     KP Singh <kpsingh@...nel.org>, Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, bpf@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC bpf-next 0/2] Mmapable task local storage.



On 3/24/22 4:41 PM, Hao Luo wrote:
> Some map types support mmap operation, which allows userspace to
> communicate with BPF programs directly. Currently only arraymap
> and ringbuf have mmap implemented.
> 
> However, in some use cases, when multiple program instances can
> run concurrently, global mmapable memory can cause race. In that
> case, userspace needs to provide necessary synchronizations to
> coordinate the usage of mapped global data. This can be a source
> of bottleneck.

I can see your use case here. Each calling process can get the
corresponding bpf program task local storage data through
mmap interface. As you mentioned, there is a tradeoff
between more memory vs. non-global synchronization.

I am thinking that another bpf_iter approach can retrieve
the similar result. We could implement a bpf_iter
for task local storage map, optionally it can provide
a tid to retrieve the data for that particular tid.
This way, user space needs an explicit syscall, but
does not need to allocate more memory than necessary.

WDYT?

> 
> It would be great to have a mmapable local storage in that case.
> This patch adds that.
> 
> Mmap isn't BPF syscall, so unpriv users can also use it to
> interact with maps.
> 
> Currently the only way of allocating mmapable map area is using
> vmalloc() and it's only used at map allocation time. Vmalloc()
> may sleep, therefore it's not suitable for maps that may allocate
> memory in an atomic context such as local storage. Local storage
> uses kmalloc() with GFP_ATOMIC, which doesn't sleep. This patch
> uses kmalloc() with GFP_ATOMIC as well for mmapable map area.
> 
> Allocating mmapable memory has requirment on page alignment. So we
> have to deliberately allocate more memory than necessary to obtain
> an address that has sdata->data aligned at page boundary. The
> calculations for mmapable allocation size, and the actual
> allocation/deallocation are packaged in three functions:
> 
>   - bpf_map_mmapable_alloc_size()
>   - bpf_map_mmapable_kzalloc()
>   - bpf_map_mmapable_kfree()
> 
> BPF local storage uses them to provide generic mmap API:
> 
>   - bpf_local_storage_mmap()
> 
> And task local storage adds the mmap callback:
> 
>   - task_storage_map_mmap()
> 
> When application calls mmap on a task local storage, it gets its
> own local storage.
> 
> Overall, mmapable local storage trades off memory with flexibility
> and efficiency. It brings memory fragmentation but can make programs
> stateless. Therefore useful in some cases.
> 
> Hao Luo (2):
>    bpf: Mmapable local storage.
>    selftests/bpf: Test mmapable task local storage.
> 
>   include/linux/bpf.h                           |  4 +
>   include/linux/bpf_local_storage.h             |  5 +-
>   kernel/bpf/bpf_local_storage.c                | 73 +++++++++++++++++--
>   kernel/bpf/bpf_task_storage.c                 | 40 ++++++++++
>   kernel/bpf/syscall.c                          | 67 +++++++++++++++++
>   .../bpf/prog_tests/task_local_storage.c       | 38 ++++++++++
>   .../bpf/progs/task_local_storage_mmapable.c   | 38 ++++++++++
>   7 files changed, 257 insertions(+), 8 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/progs/task_local_storage_mmapable.c
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ