lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Jul 2018 14:51:15 -0700
From:   Cong Wang <xiyou.wangcong@...il.com>
To:     Vlad Buslov <vladbu@...lanox.com>
Cc:     Linux Kernel Network Developers <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jamal Hadi Salim <jhs@...atatu.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Yevgeny Kliteynik <kliteyn@...lanox.com>,
        Jiri Pirko <jiri@...lanox.com>
Subject: Re: [PATCH net-next v6 01/11] net: sched: use rcu for action cookie update

On Fri, Jul 13, 2018 at 6:30 AM Vlad Buslov <vladbu@...lanox.com> wrote:
>
>
> On Fri 13 Jul 2018 at 03:52, Cong Wang <xiyou.wangcong@...il.com> wrote:
> > On Thu, Jul 5, 2018 at 7:24 AM Vlad Buslov <vladbu@...lanox.com> wrote:
> >>
> >> Implement functions to atomically update and free action cookie
> >> using rcu mechanism.
> >
> > Without stating any reason..... Is this even a changelog?
>
> Yes, it is.

What do you expect in a changelog generally? Repeating what
your code does? Thanks but we don't even want to read any code
unless the need of this code is reasonably justified.

Can we at least agree you have no justification for this change
in this changelog? Or you believe this patch is as trivial as
a white space change which doesn't need a justification?


>
> >
> >>
> >> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
> >
> > Dear Marcelo, how did it pass your review? See below:
> >
> >
> >> +static void tcf_set_action_cookie(struct tc_cookie __rcu **old_cookie,
> >> +                                 struct tc_cookie *new_cookie)
> >> +{
> >> +       struct tc_cookie *old;
> >> +
> >> +       old = xchg(old_cookie, new_cookie);
> >
> >
> > This is an incorrect use of RCU, obviously should be rcu_assign_pointer()
> > here.
>
> Could you please explain your concern in more details? Similar pattern
> is already widely used in kernel for re-assigning rcu pointers. For

My reasoning is too simple: search whatisRCU.txt for "xchg",
I find nothing. :)

Here is the link:
https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt

Of course, both xchg() and rcu_assign_pointer() are aimed to make
an assignment of a pointer. But even without looking into their
implementations, there must be a reason for rcu_assign_pointer() to
exist, right? Can we agree or you believe rcu_assign_pointer() can
be replaced by xchg() and removed finally?

This also means you need to justify your pick of xchg() in your
changelog where there is nothing literally.


> example, Eric Dumazet uses it in 1c0d32fde5bd ("net_sched:
> gen_estimator: complete rewrite of rate estimators"):
>
> void gen_kill_estimator(struct net_rate_estimator __rcu **rate_est)
> {
>         struct net_rate_estimator *est;
>
>         est = xchg((__force struct net_rate_estimator **)rate_est, NULL);
>         if (est) {
>                 del_timer_sync(&est->timer);
>                 kfree_rcu(est, rcu);
>         }
> }

In this case, *I think* the only reason is the burden of the API
gen_kill_estimator(). It aims to be a core API for both netfilter and
net_sched, therefore it _has to_ provide a wrapper for its users,
because otherwise each user has to repeat rcu_assign_pointer()
+ del_timer_sync() + kfree_rcu(), just a matter of duplication.

Apparently this rule does NOT apply to your case, where
tcf_set_action_cookie() is merely a static function called by two
users in the same C file, and without anything but a call_rcu().


>
> Tom Herbert uses same idiom in a8c5f90fb59a ("ip_tunnel: Ops
> registration for secondary encap (fou, gue)"):
>
> int ip_tunnel_encap_add_ops(const struct ip_tunnel_encap_ops *ops,
>                             unsigned int num)
> {
>         if (num >= MAX_IPTUN_ENCAP_OPS)
>                 return -ERANGE;
>
>         return !cmpxchg((const struct ip_tunnel_encap_ops **)
>                         &iptun_encaps[num],
>                         NULL, ops) ? 0 : -1;
> }


cmpxchg() is completely different with xchg(), first of all.

In this case, its caller expects if this cmpxchg() fails or not.
How this could be even related to your case given
tcf_set_action_cookie() returns void?


>
> Again, Eric uses xchg to re-assign rcu pointer in 45f6fad84cc3 ("ipv6:
> add complete rcu protection around np->opt"):
>
> struct ipv6_txoptions *ipv6_update_options(struct sock *sk,
>                                            struct ipv6_txoptions *opt)
> {
>         if (inet_sk(sk)->is_icsk) {
>                 if (opt &&
>                     !((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) &&
>                     inet_sk(sk)->inet_daddr != LOOPBACK4_IPV6) {
>                         struct inet_connection_sock *icsk = inet_csk(sk);
>                         icsk->icsk_ext_hdr_len = opt->opt_flen + opt->opt_nflen;
>                         icsk->icsk_sync_mss(sk, icsk->icsk_pmtu_cookie);
>                 }
>         }
>         opt = xchg((__force struct ipv6_txoptions **)&inet6_sk(sk)->opt,
>                    opt);
>         sk_dst_reset(sk);


In this case, it is the caller's requirement. The callers of
ipv6_update_options() want to get the old pointer and do
something about it:

                opt = ipv6_update_options(sk, opt);
                if (opt) {
                        atomic_sub(opt->tot_len, &sk->sk_omem_alloc);
                        txopt_put(opt);
                }

It should be functionally equivalent to saving the old pointer
before a rcu_assign_pointer().

So, this does NOT apply to your case either, you only call
call_rcu() and the callers require nothing.


>
>         return opt;
> }
>
> >
> >
> >> @@ -65,10 +83,7 @@ static void free_tcf(struct tc_action *p)
> >>         free_percpu(p->cpu_bstats);
> >>         free_percpu(p->cpu_qstats);
> >>
> >> -       if (p->act_cookie) {
> >> -               kfree(p->act_cookie->data);
> >> -               kfree(p->act_cookie);
> >> -       }
> >> +       tcf_set_action_cookie(&p->act_cookie, NULL);
> >
> > So, this is called in free_tcf(), where the action is already
> > invisible from readers so it is ready to be freed.
> >
> > The question is:
> >
> > If the action itself is already ready to be freed, why do you
> > need RCU here? What could still read 'act->act_cookie'
> > while 'act' is already invisible?
> >
> > Its last refcnt is already gone, the fast path RCU readers
> > are gone too given filters use rcu work already.
> >
> > Standalone action dump? Again, the last refcnt is already
> > gone.
>
> It is not necessary here, I just used tcf_set_action_cookie() that
> already implements cookie pointer cleanup to prevent code duplication.
> I'm open to changing it, if you concerned with performance impact of
> using atomic operation for re-assigning cookie pointer.

Yeah, totally understand. But >act_cookie very special here,
it requires no copying when update, unlike the normal cases.

This means from RCU we can remove the "C" here. I know
you still copy it when dumping it, but it is a part of Read, not
a part of Update, so it is safe to say you only need R and U
here.

Which in turn means two things:

1. You don't have to use RCU anymore.

2. You don't need a lock for writers given there is no copy
during update, if you still stick to RCU.

This is why I keep saying you need to justify it, it is not trivial
and it is not easy to understand either.

Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ