lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 14 Jan 2022 13:16:01 +0530
From:   Kumar Kartikeya Dwivedi <memxor@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Network Development <netdev@...r.kernel.org>,
        netfilter-devel <netfilter-devel@...r.kernel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        Maxim Mikityanskiy <maximmi@...dia.com>,
        Pablo Neira Ayuso <pablo@...filter.org>,
        Florian Westphal <fw@...len.de>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH bpf-next v7 02/10] bpf: Populate kfunc BTF ID sets in
 struct btf

On Fri, Jan 14, 2022 at 12:22:29PM IST, Alexei Starovoitov wrote:
> On Thu, Jan 13, 2022 at 9:22 PM Kumar Kartikeya Dwivedi
> <memxor@...il.com> wrote:
> >
> > On Fri, Jan 14, 2022 at 10:19:50AM IST, Kumar Kartikeya Dwivedi wrote:
> > > On Fri, Jan 14, 2022 at 04:02:11AM IST, Alexei Starovoitov wrote:
> > > > On Tue, Jan 11, 2022 at 11:34:20PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > > [...]
> > > > > + /* Make sure all updates are visible before we go to MODULE_STATE_LIVE,
> > > > > +  * pairs with smp_rmb in btf_try_get_module (for success case).
> > > > > +  *
> > > > > +  * btf_populate_kfunc_set(...)
> > > > > +  * smp_wmb()    <-----------.
> > > > > +  * mod->state = LIVE        |           if (mod->state == LIVE)
> > > > > +  *                          |             atomic_inc_nz(mod)
> > > > > +  *                          `--------->   smp_rmb()
> > > > > +  *                                        btf_kfunc_id_set_contains(...)
> > > > > +  */
> > > > > + smp_wmb();
> > > >
> > > > This comment somehow implies that mod->state = LIVE
> > > > and if (mod->state == LIVE && try_mod_get) can race.
> > > > That's not the case.
> > > > The patch 1 closed the race.
> > > > btf_kfunc_id_set_contains() will be called only on LIVE modules.
> > > > At that point all __init funcs of the module including register_btf_kfunc_id_set()
> > > > have completed.
> > > > This smp_wmb/rmb pair serves no purpose.
> > > > Unless I'm missing something?
> > > >
> > >
> > > Right, I'm no expert on memory ordering, but even if we closed the race, to me
> > > there seems to be no reason why the CPU cannot reorder the stores to tab (or its
> > > hook/type slot) with mod->state = LIVE store.
> > >
> > > Usually, the visibility is handled by whatever lock is used to register the
> > > module somewhere in some subsystem, as the next acquirer can see all updates
> > > from the previous registration.
> > >
> > > In this case, we're directly assigning a pointer without holding any locks etc.
> > > While it won't be concurrently accessed until module state is LIVE, it is
> > > necessary to make all updates visible in the right order (that is, once state is
> > > LIVE, everything stored previously in struct btf for module is also visible).
> > >
> > > Once mod->state = LIVE is visible, we will start accessing kfunc_set_tab, but if
> > > previous stores to it were not visible by then, we'll access a badly-formed
> > > kfunc_set_tab.
> > >
> > > For this particular case, you can think of mod->state = LIVE acting as a release
> > > store, and the read for mod->state == LIVE acting as an acquire load.
> > >
> >
> > Also, to be more precise, we're now synchronizing using btf_mod->flags, not
> > mod->state, so I should atleast update the comment, but the idea is the same.
>
> So the concern is that cpu can execute
> mod->state = MODULE_STATE_LIVE;
> from kernel/module.c
> earlier than stores inside __btf_populate_kfunc_set
> that are called from do_one_initcall()
> couple lines earlier in kernel/module.c ?
> Let's assume cpu is not Intel, since Intel never reorders stores.
> (as far as I remember the only weak store ordering architecture
> ever produced is Alpha).
> But it's not mod->state, it's btf_mod->flags (as you said)
> which is done under btf_module_mutex.
> and btf_kfunc_id_set_contains() can only do that after
> btf_try_get_module() succeeds which is under the same
> btf_module_mutex.
> So what is the race ?
> I think it's important to be concerned about race
> conditions, but they gotta be real.
> So please spell out a precise scenario if you still believe it's there.

Ah, indeed you're right, btf_module_mutex's unlock would prevent it now, so we
can drop it. I should have revisited whether the barrier was still required.

---

Just for the record, I was thinking about this case when adding it.

do_one_initcall
  register_btf_kfunc_id_set
    btf_get_module_btf
    btf->kfunc_set_tab = ...		// A
    tab->sets[hook][type] = ...		// B
mod->state = LIVE			// C

There was nothing preventing A and B to be visible after C (as per LKMM, maybe
it isn't a problem on real architectures after all), so there was need for some
ordering.

After the btf_mod->flags change, we would have:

do_one_initcall
  register_btf_kfunc_id_set
    btf_get_module_btf
    btf->kfunc_set_tab = ...		// A
    tab->sets[hook][type] = ...		// B
mod->state = LIVE
notifier_call
  btf_module_notify
  case MODULE_STATE_LIVE:
    mutex_lock(btf_module_mutex)
      btf_mod->flags |= LIVE		// C
    mutex_unlock(btf_module_mutex)	// D

Now we have A, B, and C that may be individually reordered, but when taking the
mutex all will be visible due to the release in mutex_unlock (D), even though in
the worst case A and B can seep into the critical section and reorder with C
(again, perhaps only theoretically, as per LKMM), but next critical section
should see everything.

--
Kartikeya

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ