Message-ID: <20180710042319.GA77322@rdna-mbp.dhcp.thefacebook.com>
Date:   Mon, 9 Jul 2018 21:23:20 -0700
From:   Andrey Ignatov <rdna@...com>
To:     Jakub Kicinski <jakub.kicinski@...ronome.com>
CC:     <alexei.starovoitov@...il.com>, <daniel@...earbox.net>,
        <oss-drivers@...ronome.com>, <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2 11/12] tools: libbpf: allow map reuse

Jakub Kicinski <jakub.kicinski@...ronome.com> [Mon, 2018-07-09 19:49 -0700]:
> On Mon, 9 Jul 2018 13:22:54 -0700, Andrey Ignatov wrote:
> > Jakub Kicinski <jakub.kicinski@...ronome.com> [Mon, 2018-07-09 11:01 -0700]:
> > > More advanced applications may want to only replace programs without
> > > destroying associated maps.  Allow libbpf users to achieve that.
> > > Instead of always creating all of the maps at load time, expose to
> > > users an API to reconstruct the map object from already existing
> > > map.
> > > 
> > > The map parameters are read from the kernel and replace the parameters
> > > of the ELF map.  libbpf does not restrict the map replacement, i.e.
> > > the reused map does not have to be compatible with the ELF map
> > > definition.  We rely on the verifier for checking the compatibility
> > > between maps and programs.  The ELF map definition is completely
> > > overwritten by the information read from the kernel, to make sure
> > > libbpf's view of map object corresponds to the actual map.  
> > 
> > Thanks for working on this Jakub! I encountered this shortcoming of
> > libbpf as well and was planning to fix it, but you beat me to it :)
> 
> Ah!  I wish I didn't! :)
> 
> > > Signed-off-by: Jakub Kicinski <jakub.kicinski@...ronome.com>
> > > Reviewed-by: Quentin Monnet <quentin.monnet@...ronome.com>
> > > ---
> > >  tools/lib/bpf/libbpf.c | 35 +++++++++++++++++++++++++++++++++++
> > >  tools/lib/bpf/libbpf.h |  1 +
> > >  2 files changed, 36 insertions(+)
> > > 
> > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > index b653dbb266c7..c80033fe66c3 100644
> > > --- a/tools/lib/bpf/libbpf.c
> > > +++ b/tools/lib/bpf/libbpf.c
> > > @@ -215,6 +215,7 @@ struct bpf_map {
> > >  	int fd;
> > >  	char *name;
> > >  	size_t offset;
> > > +	bool fd_preset;  
> > 
> > Any reason not to use map->fd itself to identify if fd is present?
> 
> Note: pre-set, not present.

Oh, sorry, I'm blind :)


> > fd of every map is set to -1 in bpf_object__init_maps() that, in turn, is
> > called from __bpf_object__open():
> > 
> > 	for (i = 0; i < nr_maps; i++)
> > 		obj->maps[i].fd = -1;
> > 
> > Later it will contain either a valid fd that is >= 0 or that same -1, which
> > should be enough to identify fd presence.
> 
> I thought it to be cleaner to indicate the fd has been pre-set, in case
> things get more complicated in the future and fd >= 0 becomes ambiguous.
> 
> But no strong preference, should I change?

My preference (not strong either) is to avoid a new field whenever it's
possible. Though if you have a use-case that can't be covered by
(fd >= 0) keeping the field is fine as well.


> > >  	int map_ifindex;
> > >  	struct bpf_map_def def;
> > >  	uint32_t btf_key_type_id;
> > > @@ -1082,6 +1083,34 @@ static int bpf_map_find_btf_info(struct bpf_map *map, const struct btf *btf)
> > >  	return 0;
> > >  }
> > >  
> > > +int bpf_map__reuse_fd(struct bpf_map *map, int fd)
> > > +{
> > > +	struct bpf_map_info info = {};
> > > +	__u32 len = sizeof(info);
> > > +	int err;
> > > +
> > > +	err = bpf_obj_get_info_by_fd(fd, &info, &len);
> > > +	if (err)
> > > +		return err;
> > > +  
> > 
> > Should there be a check that map->fd doesn't contain any valid fd (>= 0)
> > before rewriting it so that if it does (e.g. because the function is
> > called after bpf_object__load() by mistake), current map->fd won't be
> > leaked?
> 
> Hm.  In my first implementation libbpf just took the passed fd and
> didn't do a dup(), the lifetime of the fd remained with the caller.
> Having a check will prevent changing the descriptor unless we add some
> form of "un-reuse" as well.  Perhaps I should just add a close() in
> case fd >= 0?  Or do you prefer a hard error?

Agreed, close() in case fd >= 0 should be fine: the caller has already made
it explicit that they don't care about the current fd, so there should be no
reason to hard-fail.
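
To make the close-before-replace idea concrete, here is a minimal sketch. It
is not the libbpf code: struct demo_map and demo_map_reuse_fd() are
hypothetical stand-ins for struct bpf_map and bpf_map__reuse_fd(), and the
ordering (duplicate first, close the old fd only after that succeeds) is an
assumption about what is desirable, so that a failed dup() leaves the map
unchanged.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for libbpf's struct bpf_map; only the fd matters. */
struct demo_map {
	int fd;	/* -1 until a descriptor is attached */
};

/*
 * Attach a private duplicate of fd to the map. The duplicate is made first,
 * so a failing dup() leaves the previous descriptor untouched; only then is
 * the old descriptor closed.
 */
int demo_map_reuse_fd(struct demo_map *map, int fd)
{
	int new_fd;

	assert(map != NULL);

	new_fd = dup(fd);	/* plain dup() mirrors the patch as posted */
	if (new_fd < 0)
		return new_fd;

	if (map->fd >= 0)
		close(map->fd);	/* do not leak the descriptor being replaced */
	map->fd = new_fd;

	return 0;
}
```

With this shape, calling the function twice simply swaps the descriptor, and
calling it with a bad fd returns a negative value without touching map->fd.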


> > > +	map->fd = dup(fd);  
> > 
> > Unfortunately, new descriptor created by dup(2) will not have O_CLOEXEC set, in
> > contrast to original fd returned by kernel on map creation.
> > 
> > libbpf has other interface shortcomings where it comes up. E.g. struct
> > bpf_object owns all descriptors it contains (progs, maps) and closes them in
> > bpf_object__close(). If one wants to open/load ELF, then close it but
> > keep, say, prog fd to attach it to cgroup some time later, then fd
> > should be duplicated as well to get a new one not owned by bpf_object.
> > 
> > Currently I use this workaround to avoid a window in which the new fd
> > doesn't have O_CLOEXEC:
> > 
> > 	int new_prog_fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
> > 	if (new_prog_fd < 0 ||
> > 	    dup3(bpf_program__fd(prog), new_prog_fd, O_CLOEXEC) == -1) {
> > 		/* .. handle error .. */
> > 		close(new_prog_fd);
> > 	}
> > 	/* .. use new_prog_fd with O_CLOEXEC set */
> > 
> > Not sure how to simplify it. dup2() has same problem with regard to
> > O_CLOEXEC.
> > 
> > Use-case: standalone server application that uses libbpf and does
> > fork()/execve() a lot.
> 
> Good point!  I have no better ideas.  Although being slightly paranoid
> I would perhaps use "/" instead of "/dev/null"?  Shouldn't matter?

No strong preference; the important thing is to create the fd with O_CLOEXEC
set somehow.

Is it safer to use "/" than "/dev/null"? (trying to understand if I
should change my code as well)
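
For reference, one way to get a duplicate with O_CLOEXEC set without opening
a placeholder file at all may be fcntl(2) with F_DUPFD_CLOEXEC (POSIX.1-2008,
Linux 2.6.24 and later), which duplicates the descriptor and sets the
close-on-exec flag in a single call. The sketch below is only an
illustration: dup_cloexec() and fd_is_cloexec() are hypothetical helper
names, not libbpf API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>

/*
 * Hypothetical helper: duplicate fd onto the lowest free descriptor number
 * (>= 0) with FD_CLOEXEC already set on the copy, so there is no window in
 * which a concurrent fork()/execve() could inherit it.
 * Returns the new descriptor, or -1 with errno set.
 */
int dup_cloexec(int fd)
{
	assert(fd >= 0);	/* passing an invalid fd is a caller bug */
	return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}

/* Hypothetical helper: report whether close-on-exec is set on fd. */
int fd_is_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	return flags >= 0 && (flags & FD_CLOEXEC) != 0;
}
```

Usage would be along the lines of
new_prog_fd = dup_cloexec(bpf_program__fd(prog)), with the usual check for a
negative return value.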


> > > +	if (map->fd < 0)
> > > +		return map->fd;
> > > +	map->fd_preset = true;
> > > +
> > > +	free(map->name);
> > > +	map->name = strdup(info.name);
> > > +	map->def.type = info.type;
> > > +	map->def.key_size = info.key_size;
> > > +	map->def.value_size = info.value_size;
> > > +	map->def.max_entries = info.max_entries;
> > > +	map->def.map_flags = info.map_flags;
> > > +	map->btf_key_type_id = info.btf_key_type_id;
> > > +	map->btf_value_type_id = info.btf_value_type_id;
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >  static int
> > >  bpf_object__create_maps(struct bpf_object *obj)
> > >  {
> > > @@ -1094,6 +1123,12 @@ bpf_object__create_maps(struct bpf_object *obj)
> > >  		struct bpf_map_def *def = &map->def;
> > >  		int *pfd = &map->fd;
> > >  
> > > +		if (map->fd_preset) {
> > > +			pr_debug("skip map create (preset) %s: fd=%d\n",
> > > +				 map->name, map->fd);
> > > +			continue;
> > > +		}
> > > +
> > >  		create_attr.name = map->name;
> > >  		create_attr.map_ifindex = map->map_ifindex;
> > >  		create_attr.map_type = def->type;
> > > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> > > index 60593ac44700..8e709a74f47c 100644
> > > --- a/tools/lib/bpf/libbpf.h
> > > +++ b/tools/lib/bpf/libbpf.h
> > > @@ -261,6 +261,7 @@ typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
> > >  int bpf_map__set_priv(struct bpf_map *map, void *priv,
> > >  		      bpf_map_clear_priv_t clear_priv);
> > >  void *bpf_map__priv(struct bpf_map *map);
> > > +int bpf_map__reuse_fd(struct bpf_map *map, int fd);
> > >  bool bpf_map__is_offload_neutral(struct bpf_map *map);
> > >  void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
> > >  int bpf_map__pin(struct bpf_map *map, const char *path);
> > > -- 
> > > 2.17.1
> > >   
> > 
> 

-- 
Andrey Ignatov
