[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LRH.2.20.1909281844480.5332@dhcp-10-175-218-65.vpn.oracle.com>
Date: Sat, 28 Sep 2019 18:46:22 +0100 (BST)
From: Alan Maguire <alan.maguire@...cle.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
cc: Alan Maguire <alan.maguire@...cle.com>,
Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf] libbpf: count present CPUs, not theoretically
possible
On Sat, 28 Sep 2019, Andrii Nakryiko wrote:
> On Sat, Sep 28, 2019 at 4:20 AM Alan Maguire <alan.maguire@...cle.com> wrote:
> >
> > On Fri, 27 Sep 2019, Andrii Nakryiko wrote:
> >
> > > This patch switches libbpf_num_possible_cpus() from using possible CPU
> > > set to present CPU set. This fixes issues with incorrect auto-sizing of
> > > PERF_EVENT_ARRAY map on HOTPLUG-enabled systems.
> > >
> > > On HOTPLUG enabled systems, /sys/devices/system/cpu/possible is going to
> > > be a set of any representable (i.e., potentially possible) CPU, which is
> > > normally way higher than real amount of CPUs (e.g., 0-127 on VM I've
> > > tested on, while there were just two CPU cores actually present).
> > > /sys/devices/system/cpu/present, on the other hand, will only contain
> > > CPUs that are physically present in the system (even if not online yet),
> > > which is what we really want, especially when creating per-CPU maps or
> > > perf events.
> > >
> > > On systems with HOTPLUG disabled, present and possible are identical, so
> > > there is no change of behavior there.
> > >
> >
> > Just curious - is there a reason for not adding a new libbpf_num_present_cpus()
> > function to cover this case, and switching to using that in various places?
>
> The reason is that libbpf_num_possible_cpus() is useless on HOTPLUG
> systems and never worked as intended. If you rely on this function to
> create perf_buffer and/or PERF_EVENT_ARRAY, it will simply fail due to
> specifying more CPUs than are present. I didn't want to keep adding
> new APIs for no good reason, while also leaving useless ones, so I
> fixed the existing API to behave as expected. It's unfortunate that
> name doesn't match sysfs file we are reading it from, of course, but
> having people to choose between libbpf_num_possible_cpus() vs
> libbpf_num_present_cpus() seems like even bigger problem, as
> differences are non-obvious.
>
> The good thing, it won't break all the non-HOTPLUG systems for sure,
> which seems to be the only cases that are used right now (otherwise
> someone would already complain about broken
> libbpf_num_possible_cpus()).
>
Understood, thanks for the explanation.
> >
> > Looking at the places libbpf_num_possible_cpus() is called in libbpf
> >
> > - __perf_buffer__new(): this could just change to use the number of
> > present CPUs, since perf_buffer__new_raw() with a cpu_cnt in struct
> > perf_buffer_raw_ops
> >
> > - bpf_object__create_maps(), which is called via bpf_oject__load_xattr().
> > In this case it seems like switching to num present makes sense, though
> > it might make sense to add a field to struct bpf_object_load_attr * to
> > allow users to explicitly set another max value.
>
> I believe more knobs is not always better for API. Plus, adding any
> field to those xxx_xattr structs is an ABI breakage and requires
> adding new APIs, so I don't think this is good enough reason to add
> new flag. See discussion in another thread about this whole API design
> w/ current attributes and ABI consequences of adding anything new to
> them.
>
> >
> > This would give the desired default behaviour, while still giving users
> > a way of specifying the possible number. What do you think? Thanks!
>
> BTW, if user wants to override the size of maps, they can do it easily
> either in map definition or programmatically after bpf_object__open,
> but before bpf_object__load, so there is no need for flags, it's all
> easily achievable with existing API.
>
Ah, I missed that. Thanks for clarifying!
Reviewed-by: Alan Maguire <alan.maguire@...cle.com>
> >
> > Alan
> >
> > > Signed-off-by: Andrii Nakryiko <andriin@...com>
> > > ---
> > > tools/lib/bpf/libbpf.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > index e0276520171b..45351c074e45 100644
> > > --- a/tools/lib/bpf/libbpf.c
> > > +++ b/tools/lib/bpf/libbpf.c
> > > @@ -5899,7 +5899,7 @@ void bpf_program__bpil_offs_to_addr(struct bpf_prog_info_linear *info_linear)
> > >
> > > int libbpf_num_possible_cpus(void)
> > > {
> > > - static const char *fcpu = "/sys/devices/system/cpu/possible";
> > > + static const char *fcpu = "/sys/devices/system/cpu/present";
> > > int len = 0, n = 0, il = 0, ir = 0;
> > > unsigned int start = 0, end = 0;
> > > int tmp_cpus = 0;
> > > --
> > > 2.17.1
> > >
> > >
>
Powered by blists - more mailing lists