Message-ID: <CAEf4BzbEht9srvg8kJowM1e-t=2WOE3GCHWWJWsYYwKfT06iSQ@mail.gmail.com>
Date:   Tue, 27 Jul 2021 14:30:24 -0700
From:   Andrii Nakryiko <andrii.nakryiko@...il.com>
To:     Stanislav Fomichev <sdf@...gle.com>
Cc:     Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>, Yonghong Song <yhs@...com>
Subject: Re: [PATCH bpf-next v3] bpf: increase supported cgroup storage value size

On Tue, Jul 27, 2021 at 1:47 PM <sdf@...gle.com> wrote:
>
> On 07/27, Andrii Nakryiko wrote:
> > On Mon, Jul 26, 2021 at 4:00 PM Stanislav Fomichev <sdf@...gle.com> wrote:
> > >
> > > Current max cgroup storage value size is 4k (PAGE_SIZE). The other local
> > > storages accept up to 64k (BPF_LOCAL_STORAGE_MAX_VALUE_SIZE). Let's align
> > > max cgroup value size with the other storages.
> > >
> > > For percpu, the max is 32k (PCPU_MIN_UNIT_SIZE) because percpu
> > > allocator is not happy about larger values.
> > >
> > > netcnt test is extended to exercise those maximum values
> > > (non-percpu max size is close to, but not real max).
> > >
> > > v3:
> > > * refine SIZEOF_BPF_LOCAL_STORAGE_ELEM comment (Yonghong Song)
> > > * anonymous struct in percpu_net_cnt & net_cnt (Yonghong Song)
> > > * reorder free (Yonghong Song)
> > >
> > > v2:
> > > * cap max_value_size instead of BUILD_BUG_ON (Martin KaFai Lau)
> > >
> > > Cc: Martin KaFai Lau <kafai@...com>
> > > Cc: Yonghong Song <yhs@...com>
> > > Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
> > > ---
> > >  kernel/bpf/local_storage.c                  | 11 +++++-
> > >  tools/testing/selftests/bpf/netcnt_common.h | 38 +++++++++++++++++----
> > >  tools/testing/selftests/bpf/test_netcnt.c   | 17 ++++++---
> > >  3 files changed, 53 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c
> > > index 7ed2a14dc0de..035e9e3a7132 100644
> > > --- a/kernel/bpf/local_storage.c
> > > +++ b/kernel/bpf/local_storage.c
> > > @@ -1,6 +1,7 @@
> > >  //SPDX-License-Identifier: GPL-2.0
> > >  #include <linux/bpf-cgroup.h>
> > >  #include <linux/bpf.h>
> > > +#include <linux/bpf_local_storage.h>
> > >  #include <linux/btf.h>
> > >  #include <linux/bug.h>
> > >  #include <linux/filter.h>
> > > @@ -283,9 +284,17 @@ static int cgroup_storage_get_next_key(struct bpf_map *_map, void *key,
> > >
> > >  static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
> > >  {
> > > +       __u32 max_value_size = BPF_LOCAL_STORAGE_MAX_VALUE_SIZE;
> > >         int numa_node = bpf_map_attr_numa_node(attr);
> > >         struct bpf_cgroup_storage_map *map;
> > >
> > > +       /* percpu is bound by PCPU_MIN_UNIT_SIZE, non-percpu
> > > +        * is the same as other local storages.
> > > +        */
> > > +       if (attr->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
> > > +               max_value_size = min_t(__u32, max_value_size,
> > > +                                      PCPU_MIN_UNIT_SIZE);
> > > +
> > >         if (attr->key_size != sizeof(struct bpf_cgroup_storage_key) &&
> > >             attr->key_size != sizeof(__u64))
> > >                 return ERR_PTR(-EINVAL);
> > > @@ -293,7 +302,7 @@ static struct bpf_map *cgroup_storage_map_alloc(union bpf_attr *attr)
> > >         if (attr->value_size == 0)
> > >                 return ERR_PTR(-EINVAL);
> > >
> > > -       if (attr->value_size > PAGE_SIZE)
> > > +       if (attr->value_size > max_value_size)
> > >                 return ERR_PTR(-E2BIG);
> > >
> > >         if (attr->map_flags & ~LOCAL_STORAGE_CREATE_FLAG_MASK ||
> > > diff --git a/tools/testing/selftests/bpf/netcnt_common.h b/tools/testing/selftests/bpf/netcnt_common.h
> > > index 81084c1c2c23..87f5b97e1932 100644
> > > --- a/tools/testing/selftests/bpf/netcnt_common.h
> > > +++ b/tools/testing/selftests/bpf/netcnt_common.h
> > > @@ -6,19 +6,43 @@
> > >
> > >  #define MAX_PERCPU_PACKETS 32
> > >
> > > +/* sizeof(struct bpf_local_storage_elem):
> > > + *
> > > + * It really is about 128 bytes on x86_64, but allocate more to account for
> > > + * possible layout changes, different architectures, etc.
> > > + * The kernel will wrap up to PAGE_SIZE internally anyway.
> > > + */
> > > +#define SIZEOF_BPF_LOCAL_STORAGE_ELEM          256
> > > +
> > > +/* Try to estimate kernel's BPF_LOCAL_STORAGE_MAX_VALUE_SIZE: */
> > > +#define BPF_LOCAL_STORAGE_MAX_VALUE_SIZE       (0xFFFF - \
> > > +                                                SIZEOF_BPF_LOCAL_STORAGE_ELEM)
> > > +
> > > +#define PCPU_MIN_UNIT_SIZE                     32768
> > > +
> > >  struct percpu_net_cnt {
> > > -       __u64 packets;
> > > -       __u64 bytes;
> > > +       union {
> >
> > so you have a struct with a single anonymous union inside, isn't that
> > right? Any problems with just making struct percpu_net_cnt into union
> > percpu_net_cnt?
>
> We'd have to s/struct/union/ everywhere in this case, not sure
> we want to add more churn? Seemed easier to do anonymous union+struct.

4 occurrences for net_cnt and another 4 for percpu_net_cnt, not much
churn (and all pretty localized). But I honestly don't care, just
wanted to note that you don't need this extra nesting.
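
For illustration only (this is not code from the thread or the patch), here is a minimal sketch of the two layouts being compared: the struct wrapping an anonymous union, as in the quoted diff, and the type declared directly as a union. The type names percpu_net_cnt_nested and percpu_net_cnt_flat and the helper account_packet() are hypothetical, and uint64_t/uint8_t stand in for the kernel's __u64/__u8. Under those assumptions, both layouts should have the same size and field offsets, so only the keyword at each use site changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value copied from the quoted patch's netcnt_common.h. */
#define PCPU_MIN_UNIT_SIZE 32768

/* Layout from the quoted patch: a struct whose only member is an
 * anonymous union (hypothetical name, for comparison only).
 */
struct percpu_net_cnt_nested {
	union {
		struct {
			uint64_t packets;
			uint64_t bytes;
			uint64_t prev_ts;
			uint64_t prev_packets;
			uint64_t prev_bytes;
		};
		uint8_t data[PCPU_MIN_UNIT_SIZE];
	};
};

/* Alternative without the extra nesting: the type itself is a union
 * (hypothetical name, for comparison only).
 */
union percpu_net_cnt_flat {
	struct {
		uint64_t packets;
		uint64_t bytes;
		uint64_t prev_ts;
		uint64_t prev_packets;
		uint64_t prev_bytes;
	};
	uint8_t data[PCPU_MIN_UNIT_SIZE];
};

/* Both variants pad the counters out to the same total size... */
static_assert(sizeof(struct percpu_net_cnt_nested) ==
	      sizeof(union percpu_net_cnt_flat), "sizes differ");
/* ...and keep the named fields at the same offsets. */
static_assert(offsetof(struct percpu_net_cnt_nested, prev_bytes) ==
	      offsetof(union percpu_net_cnt_flat, prev_bytes),
	      "offsets differ");

/* Hypothetical helper: field access reads the same with either layout. */
static inline void account_packet(union percpu_net_cnt_flat *cnt,
				  uint64_t len)
{
	cnt->packets += 1;
	cnt->bytes += len;
}
```

The trade-off is the one discussed above: the flat union drops one level of braces, at the cost of replacing "struct" with "union" wherever the type is named.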

>
> > > +               struct {
> > > +                       __u64 packets;
> > > +                       __u64 bytes;
> > >
> > > -       __u64 prev_ts;
> > > +                       __u64 prev_ts;
> > >
> > > -       __u64 prev_packets;
> > > -       __u64 prev_bytes;
> > > +                       __u64 prev_packets;
> > > +                       __u64 prev_bytes;
> > > +               };
> > > +               __u8 data[PCPU_MIN_UNIT_SIZE];
> > > +       };
> > >  };
> > >
> > >  struct net_cnt {
> > > -       __u64 packets;
> > > -       __u64 bytes;
> > > +       union {
> >
> > similarly here
> >
> > > +               struct {
> > > +                       __u64 packets;
> > > +                       __u64 bytes;
> > > +               };
> > > +               __u8 data[BPF_LOCAL_STORAGE_MAX_VALUE_SIZE];
> > > +       };
> > >  };
> > >
> > >  #endif
> > > diff --git a/tools/testing/selftests/bpf/test_netcnt.c b/tools/testing/selftests/bpf/test_netcnt.c
> > > index a7b9a69f4fd5..372afccf2d17 100644
> > > --- a/tools/testing/selftests/bpf/test_netcnt.c
> > > +++ b/tools/testing/selftests/bpf/test_netcnt.c
> > > @@ -33,11 +33,11 @@ static int bpf_find_map(const char *test, struct bpf_object *obj,
> > >
> > >  int main(int argc, char **argv)
> > >  {
> > > -       struct percpu_net_cnt *percpu_netcnt;
> > > +       struct percpu_net_cnt *percpu_netcnt = NULL;
> > >         struct bpf_cgroup_storage_key key;
> > > +       struct net_cnt *netcnt = NULL;
> > >         int map_fd, percpu_map_fd;
> > >         int error = EXIT_FAILURE;
> > > -       struct net_cnt netcnt;
> > >         struct bpf_object *obj;
> > >         int prog_fd, cgroup_fd;
> > >         unsigned long packets;
> > > @@ -52,6 +52,12 @@ int main(int argc, char **argv)
> > >                 goto err;
> > >         }
> > >
> > > +       netcnt = malloc(sizeof(*netcnt));
> >
> > curious, was it too big to be just allocated on the stack? Isn't the
> > thread stack size much bigger than 64KB (at least by default)?
>
> I haven't tried really, I just moved it to malloc because it crossed
> some unconscious boundary for the 'stuff I allocate on the stack'.
> I can try it out if you prefer to keep it on the stack, let me know.

Yeah, if it can stay on the stack. Less thinking about freeing memory.
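
For illustration only (not code from the thread), a minimal sketch of the stack-allocated variant. It assumes a POSIX system where getrlimit(RLIMIT_STACK) reports the main thread's stack limit; the helpers fits_in_stack_limit() and stack_netcnt_demo() and the one-quarter safety margin are hypothetical. The struct net_cnt layout and the two size macros are copied from the quoted netcnt_common.h, with uint64_t/uint8_t standing in for __u64/__u8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/resource.h>

/* Estimates copied from the quoted patch's netcnt_common.h. */
#define SIZEOF_BPF_LOCAL_STORAGE_ELEM		256
#define BPF_LOCAL_STORAGE_MAX_VALUE_SIZE	(0xFFFF - \
						 SIZEOF_BPF_LOCAL_STORAGE_ELEM)

struct net_cnt {
	union {
		struct {
			uint64_t packets;
			uint64_t bytes;
		};
		uint8_t data[BPF_LOCAL_STORAGE_MAX_VALUE_SIZE];
	};
};

/* Hypothetical helper: report whether an object of the given size is
 * comfortably below the current soft stack limit. The 1/4 margin is an
 * arbitrary choice for this sketch, not a kernel or libc rule.
 */
static inline bool fits_in_stack_limit(size_t size)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_STACK, &rl) != 0)
		return false;
	if (rl.rlim_cur == RLIM_INFINITY)
		return true;
	return size < rl.rlim_cur / 4;
}

/* Hypothetical demo: keep the just-under-64KB value on the stack, so
 * there is no malloc()/free() pair to track on the error paths.
 */
static inline uint64_t stack_netcnt_demo(void)
{
	struct net_cnt netcnt;

	memset(&netcnt, 0, sizeof(netcnt));
	netcnt.packets = 1;
	netcnt.bytes = 64;
	return netcnt.packets + netcnt.bytes;
}
```

A caller could check fits_in_stack_limit(sizeof(struct net_cnt)) once at startup; the result depends on the environment's stack limit, so this sketch does not assume any particular value for it.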
