Message-ID: <CAEf4BzYCfEjwBErFh7QrVUoHQrUfJFFZwnzeUvYsGOA_Bmwm9Q@mail.gmail.com>
Date: Sat, 1 Jun 2019 18:10:13 -0700
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Andrii Nakryiko <andriin@...com>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next] selftests/bpf: add real-world BPF verifier scale
test program
On Sat, Jun 1, 2019 at 3:05 PM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Fri, May 31, 2019 at 11:39:52PM -0700, Andrii Nakryiko wrote:
> > This patch adds a new test program, based on real-world production
> > application, for testing BPF verifier scalability w/ realistic
> > complexity.
>
> Thanks!
>
> > - const char *pyperf[] = {
> > + const char *tp_progs[] = {
>
> I had very similar change in my repo :)
>
> > +struct strobemeta_payload {
> > + /* req_id has valid request ID, if req_meta_valid == 1 */
> > + int64_t req_id;
> > + uint8_t req_meta_valid;
> > + /*
> > + * mask has Nth bit set to 1, if Nth metavar was present and
> > + * successfully read
> > + */
> > + uint64_t int_vals_set_mask;
> > + int64_t int_vals[STROBE_MAX_INTS];
> > + /* len is >0 for present values */
> > + uint16_t str_lens[STROBE_MAX_STRS];
> > + /* if map_descrs[i].cnt == -1, metavar is not present/set */
> > + struct strobe_map_descr map_descrs[STROBE_MAX_MAPS];
> > + /*
> > + * payload has compactly packed values of str and map variables in the
> > + * form: strval1\0strval2\0map1key1\0map1val1\0map2key1\0map2val1\0
> > + * (and so on); str_lens[i], key_lens[i] and val_lens[i] determines
> > + * value length
> > + */
> > + char payload[STROBE_MAX_PAYLOAD];
> > +};
> > +
> > +struct strobelight_bpf_sample {
> > + uint64_t ktime;
> > + char comm[TASK_COMM_LEN];
> > + pid_t pid;
> > + int user_stack_id;
> > + int kernel_stack_id;
> > + int has_meta;
> > + struct strobemeta_payload metadata;
> > + /*
> > + * makes it possible to pass (<real payload size> + 1) as data size to
> > + * perf_submit() to avoid perf_submit's paranoia about passing zero as
> > + * size, as it deduces that <real payload size> might be
> > + * **theoretically** zero
> > + */
> > + char dummy_safeguard;
> > +};
>
> > +struct bpf_map_def SEC("maps") sample_heap = {
> > + .type = BPF_MAP_TYPE_PERCPU_ARRAY,
> > + .key_size = sizeof(uint32_t),
> > + .value_size = sizeof(struct strobelight_bpf_sample),
> > + .max_entries = 1,
> > +};
>
> due to this design the stressfulness of the test is
> limited by bpf max map value limitation which comes from
> alloc_percpu limit.
> That makes it not as stressful as I was hoping for :)
What's the limit for per-cpu allocation?
You can reduce STROBE_MAX_STR_LEN to just 1 to save quite a lot of
space and push the other settings further.
>
> > +#define STROBE_MAX_INTS 25
> > +#define STROBE_MAX_STRS 25
> > +#define STROBE_MAX_MAPS 5
> > +#define STROBE_MAX_MAP_ENTRIES 20
>
> so I could bump STROBE_MAX_INTS to 300 and got:
> verification time 302401 usec // with kasan
> stack depth 464
> processed 40388 insns (limit 1000000) max_states_per_insn 6 total_states 8863 peak_states 8796 mark_read 4110
> test_scale:./strobemeta25.o:OK
>
> which is not that stressful comparing to some of the tests :)
INTS and STRS are the less complicated part; try playing with MAX_MAPS and
MAX_MAP_ENTRIES instead.
E.g., I can't seem to push further than STROBE_MAX_MAPS 15 and
STROBE_MAX_MAP_ENTRIES 30; I'm not sure whether that's due to the
allocation limit.
On the other hand, with STROBE_MAX_MAPS 30 and
STROBE_MAX_MAP_ENTRIES 15 (which should use a pretty similar amount of
space), I hit the stack size limit. So this combination (and higher
values, if possible) should be a good demo case for loops. I'm
curious for you to try and let me know if you could go higher with
loops support... :)
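A rough sketch of the size arithmetic behind those two settings. The field sizes are assumptions (an 8-byte id, 2-byte tag_len and cnt, 2-byte key and value lengths, padding ignored), and struct strobe_cfg and the helper names are hypothetical, not part of the patch. The payload estimate follows the layout described in the payload comment: one slot per string plus one key slot and one value slot per map entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical knobs mirroring the STROBE_MAX_* macros from the patch. */
struct strobe_cfg {
	size_t max_ints;
	size_t max_strs;
	size_t max_maps;
	size_t max_map_entries;
	size_t max_str_len;
};

/*
 * Rough size of one map descriptor.
 * ASSUMPTION: 8-byte id, 2-byte tag_len, 2-byte cnt, and 2-byte key and
 * value lengths per entry; struct padding is ignored.
 */
size_t map_descr_bytes(const struct strobe_cfg *c)
{
	assert(c != NULL);
	return 8 + 2 + 2 + 2 * 2 * c->max_map_entries;
}

/* One slot per string, plus a key slot and a value slot per map entry. */
size_t payload_bytes(const struct strobe_cfg *c)
{
	assert(c != NULL);
	size_t slots = c->max_strs + c->max_maps * 2 * c->max_map_entries;

	return slots * c->max_str_len;
}

/* Rough size of struct strobemeta_payload for a given configuration. */
size_t metadata_bytes(const struct strobe_cfg *c)
{
	assert(c != NULL);
	return 8 + 1 + 8		/* req_id, req_meta_valid, mask */
	     + 8 * c->max_ints		/* int_vals */
	     + 2 * c->max_strs		/* str_lens */
	     + c->max_maps * map_descr_bytes(c)
	     + payload_bytes(c);
}
```

Under these assumptions, 15 maps of 30 entries and 30 maps of 15 entries need the same payload space, and the totals differ only by the fixed per-map fields.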
To save some more space, try removing cnt, tag_len, and id from struct
strobe_map_descr, and try reducing val_lens and key_lens to
just uint8_t. A similar thing can be done to int_vals in struct
strobemeta_payload. I don't want to remove them, as they add to the
complexity of the program, but reducing their size should be ok.
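A minimal sketch of how much the descriptor shrinks with that change. Only the field names come from this thread; the field types in the "wide" variant are assumed, and both struct names and descr_bytes_saved() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STROBE_MAX_MAP_ENTRIES 20

/* ASSUMED original layout: field names from the thread, types guessed. */
struct strobe_map_descr_wide {
	uint64_t id;
	int16_t tag_len;
	int16_t cnt;
	uint16_t key_lens[STROBE_MAX_MAP_ENTRIES];
	uint16_t val_lens[STROBE_MAX_MAP_ENTRIES];
};

/* Slimmed variant: id, tag_len and cnt dropped, lengths narrowed. */
struct strobe_map_descr_slim {
	uint8_t key_lens[STROBE_MAX_MAP_ENTRIES];
	uint8_t val_lens[STROBE_MAX_MAP_ENTRIES];
};

/* Byte arrays need no padding, so the slim size is exactly two arrays. */
static_assert(sizeof(struct strobe_map_descr_slim) ==
	      2 * STROBE_MAX_MAP_ENTRIES, "unexpected padding");

/* Bytes saved across all descriptors for a given STROBE_MAX_MAPS value. */
size_t descr_bytes_saved(size_t max_maps)
{
	return max_maps * (sizeof(struct strobe_map_descr_wide) -
			   sizeof(struct strobe_map_descr_slim));
}
```

Note that uint8_t lengths cap each key and value at 255 bytes, so this only works while STROBE_MAX_STR_LEN stays within that range.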
BTW, it's kind of hard to understand why a verif_scale case fails; it would
be nice to get better log output (not just stats, which are missing on
failure). So consider that a feature request. ;)
>
> Without unroll:
> verification time 435963 usec // with kasan
> stack depth 488
> processed 52812 insns (limit 1000000) max_states_per_insn 26 total_states 6786 peak_states 1405 mark_read 777
> test_scale:./strobemeta25.o:OK
>
> So things are looking pretty good.
>
> I'll roll your test into my set with few tweaks. Thanks a lot!
sounds good!
>
> btw I consistently see better code and less insn_processed in alu32 mode.
> It's probably time to make it llvm default.
>
yep, I remember I had to explicitly cast a bunch of things to uint64_t
just to avoid those pesky <<= and >>= operations, where possible. :)
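For context, a sketch of the casting workaround with hypothetical function names. When compiling for the BPF target without alu32, a 32-bit value lives in a 64-bit register, so the compiler can emit a "<<= 32" / ">>= 32" pair to re-extend it before 64-bit use. Keeping the index and count 64-bit from the start leaves nothing to re-extend. Both functions below compute the same result; only the generated BPF code is expected to differ, and how much depends on the compiler version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit index: widening it for address math can emit the shift pair. */
uint64_t sum_lens_u32(const uint16_t *lens, uint32_t n)
{
	uint64_t total = 0;

	assert(lens != NULL || n == 0);
	for (uint32_t i = 0; i < n; i++)
		total += lens[i];
	return total;
}

/* 64-bit index and count: no narrowing, so no re-extension is needed. */
uint64_t sum_lens_u64(const uint16_t *lens, uint64_t n)
{
	uint64_t total = 0;

	assert(lens != NULL || n == 0);
	for (uint64_t i = 0; i < n; i++)
		total += lens[i];
	return total;
}
```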