lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181016232006.rmvjgglbybf4ypwk@ast-mbp.dhcp.thefacebook.com>
Date:   Tue, 16 Oct 2018 16:20:07 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Mauricio Vasquez <mauricio.vasquez@...ito.it>
Cc:     Song Liu <liu.song.a23@...il.com>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Networking <netdev@...r.kernel.org>
Subject: Re: [PATCH bpf-next v2 3/7] bpf: add MAP_LOOKUP_AND_DELETE_ELEM
 syscall

On Tue, Oct 16, 2018 at 04:16:39PM -0500, Mauricio Vasquez wrote:
> 
> 
> On 10/11/2018 06:51 PM, Alexei Starovoitov wrote:
> > On Wed, Oct 10, 2018 at 05:50:01PM -0500, Mauricio Vasquez wrote:
> > > > > Does it make sense to you?
> > > > I reread the other patch, and found it does NOT use the following logic for
> > > > queue and stack:
> > > > 
> > > >                  rcu_read_lock();
> > > >                  ptr = map->ops->map_lookup_and_delete_elem(map, key);
> > > >                  if (ptr)
> > > >                          memcpy(value, ptr, value_size);
> > > > 
> > > > I guess this part is not used at all? Can we just remove it?
> > > > 
> > > > Thanks,
> > > > Song
> > > This is the base code for map_lookup_and_delete support, it is not used in
> > > queue/stack maps.
> > > 
> > > I think we can leave it there, so when somebody implements lookup_and_delete
> > > for other maps doesn't have to care about implementing also this.
> > The code looks useful to me, but I also agree with Song. And in the kernel
> > we don't typically add 'almost dead code'.
> > May be provide an implementation of the lookup_and_delete for hash map
> > so it's actually used ?
> 
> I haven't written any code but I think there is a potential problem here.
> Current lookup_and_delete returns a pointer to the element, hence deletion
> of the element should be done using call_rcu to guarantee this is valid
> after returning.
> In the hashtab, the deletion only uses call_rcu when there is not prealloc,
> otherwise the element is pushed on the list of free elements immediately.
> If we move the logic to push elements into the free list under a call_rcu
> invocation, it could happen that the free list is empty because the call_rcu
> is still pending (a similar issue we had with the queue/stack maps when they
> used a pass by reference API).
> 
> There is another way to implement it without this issue, in syscall.c:
> l = ops->lookup(key);
> memcpy(l, some_buffer)
> ops->delete(key)
> 
> The point here is that the lookup_and_delete operation is not being used at
> all, so, is lookup + delete = lookup_and_delete?, can it be generalized?
> If this is true, then what is the sense of having lookup_and_delete
> syscall?,

I though of lookup_and_delete command as atomic operation.
Only in such case it would make sense.
Otherwise there is no point in having additional cmd.
In case of hash map the implementation would be similar to delete:
  raw_spin_lock_irqsave(&b->lock, flags);
  l = lookup_elem_raw(head, hash, key, key_size);
  if (l) {
          hlist_nulls_del_rcu(&l->hash_node);
          bpf_long_memcpy(); // into temp kernel area
          free_htab_elem(htab, l);
          ret = 0;
  }
  raw_spin_unlock_irqrestore(&b->lock, flags);
  copy_to_user();

there is a chance that some other cpu is doing lookup in parallel
and may be modifying value, so bpf_long_mempcy() isn't fully atomic.
But bpf side is written together with user space side,
so above almost-atomic lookup_and_delete is usable instead
of lookup and then delete which races too much.

Having said that I think it's fine to defer this new ndo for now
and leave lookup_and_delete syscall cmd for stack/queue map only.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ