netdev - Re: [PATCH v5 bpf-next 04/11] bpf: Add map side support for bpf timers.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <CAADnVQKUEsW8kF4iJP_RF07wX2z06wz9yJ1h-CY9si70XzhsDA@mail.gmail.com>
Date:   Fri, 9 Jul 2021 00:00:16 -0700
From:   Alexei Starovoitov <alexei.starovoitov@...il.com>
To:     Martin KaFai Lau <kafai@...com>
Cc:     "David S. Miller" <davem@...emloft.net>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        bpf <bpf@...r.kernel.org>, Kernel Team <kernel-team@...com>
Subject: Re: [PATCH v5 bpf-next 04/11] bpf: Add map side support for bpf timers.

On Thu, Jul 8, 2021 at 11:04 PM Martin KaFai Lau <kafai@...com> wrote:
>
> On Thu, Jul 08, 2021 at 08:52:23PM -0700, Alexei Starovoitov wrote:
> > On Thu, Jul 08, 2021 at 06:51:19PM -0700, Martin KaFai Lau wrote:
> > > > +
> > > >  /* Called when map->refcnt goes to zero, either from workqueue or from syscall */
> > > >  static void array_map_free(struct bpf_map *map)
> > > >  {
> > > > @@ -382,6 +402,7 @@ static void array_map_free(struct bpf_map *map)
> > > >   if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
> > > >           bpf_array_free_percpu(array);
> > > >
> > > > + array_map_free_timers(map);
> > > array_map_free() is called when map->refcnt reached 0.
> > > By then, map->usercnt should have reached 0 before
> > > and array_map_free_timers() should have already been called,
> > > so no need to call it here again?  The same goes for hashtab.
> >
> > Not sure it's that simple.
> > Currently map->usercnt > 0 check is done for bpf_timer_set_callback only,
> > because prog refcnting is what matters to usercnt and map_release_uref scheme.
> > bpf_map_init doesn't have this check because there is no circular dependency
> > prog->map->timer->prog to worry about.
> > So after usercnt reached zero the prog can still do bpf_timer_init.
> Ah. right. missed the bpf_timer_init().
>
> > I guess we can add usercnt > 0 to bpf_timer_init as well.
> > Need to think whether it's enough and the race between atomic64_read(usercnt)
> > and atomic64_dec_and_test(usercnt) is addressed the same way as the race
> > in set_callback and cancel_and_free. So far looks like it. Hmm.
> hmm... right, checking usercnt > 0 seems ok.
> When usercnt is 0, it may be better to also error out instead of allocating
> a timer that cannot be used.
>
> I was mostly thinking avoiding changes in map_free could make future map
> support a little easier.

ok. let me try with usercnt>0 in bpf_timer_init.

> >
> > >
> > > > +static void htab_free_malloced_timers(struct bpf_htab *htab)
> > > > +{
> > > > + int i;
> > > > +
> > > > + rcu_read_lock();
> > > > + for (i = 0; i < htab->n_buckets; i++) {
> > > > +         struct hlist_nulls_head *head = select_bucket(htab, i);
> > > > +         struct hlist_nulls_node *n;
> > > > +         struct htab_elem *l;
> > > > +
> > > > +         hlist_nulls_for_each_entry(l, n, head, hash_node)
> > > May be put rcu_read_lock/unlock() in the loop and do a
> > > cond_resched() in case the hashtab is large.
> Just recalled cond_resched_rcu() may be cleaner, like:
>
> static void htab_free_malloced_timers(struct bpf_htab *htab)
> {
>         int i;
>
>         rcu_read_lock();
>         for (i = 0; i < htab->n_buckets; i++) {
>                 /* ... */
>                 hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
>                         check_and_free_timer(htab, l);
>                 cond_resched_rcu();

ahh. I didn't know about this flavor. Will give it a shot.
Thanks!

>         }
>         rcu_read_unlock();
> }
>