[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fd20899c60d96695060ecb782421133829f09bc2.camel@redhat.com>
Date: Tue, 16 Jun 2020 12:12:59 +0200
From: Davide Caratti <dcaratti@...hat.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Po Liu <Po.Liu@....com>, Cong Wang <xiyou.wangcong@...il.com>,
"David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net v2 2/2] net/sched: act_gate: fix configuration of
the periodic timer
hello Vladimir,
thanks a lot for reviewing this.
On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:
[...]
> > diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> > index 6775ccf355b0..3c529a4bcca5 100644
> > --- a/net/sched/act_gate.c
> > +++ b/net/sched/act_gate.c
> > @@ -272,6 +272,27 @@ static int parse_gate_list(struct nlattr *list_attr,
> > return err;
> > }
> >
> > +static void gate_setup_timer(struct tcf_gate *gact, u64 basetime,
> > + enum tk_offsets tko, s32 clockid,
> > + bool do_init)
> > +{
> > + if (!do_init) {
> > + if (basetime == gact->param.tcfg_basetime &&
> > + tko == gact->tk_offset &&
> > + clockid == gact->param.tcfg_clockid)
> > + return;
> > +
> > + spin_unlock_bh(&gact->tcf_lock);
> > + hrtimer_cancel(&gact->hitimer);
> > + spin_lock_bh(&gact->tcf_lock);
>
> I think it's horrible to do this just to get out of atomic context.
> What if you split the "replace" functionality of gate_setup_timer into
> a separate gate_cancel_timer function, which you could call earlier
> (before taking the spin lock)?
I think it would introduce the following 2 problems:
problem #1) a race condition, see below:
> That change would look like this:
> diff --git a/net/sched/act_gate.c b/net/sched/act_gate.c
> index 3c529a4bcca5..47c625a0e70c 100644
> --- a/net/sched/act_gate.c
> +++ b/net/sched/act_gate.c
> @@ -273,19 +273,8 @@ static int parse_gate_list(struct nlattr *list_attr,
> }
>
> static void gate_setup_timer(struct tcf_gate *gact, u64 basetime,
> - enum tk_offsets tko, s32 clockid,
> - bool do_init)
> + enum tk_offsets tko, s32 clockid)
> {
> - if (!do_init) {
> - if (basetime == gact->param.tcfg_basetime &&
> - tko == gact->tk_offset &&
> - clockid == gact->param.tcfg_clockid)
> - return;
> -
> - spin_unlock_bh(&gact->tcf_lock);
> - hrtimer_cancel(&gact->hitimer);
> - spin_lock_bh(&gact->tcf_lock);
> - }
> gact->param.tcfg_basetime = basetime;
> gact->param.tcfg_clockid = clockid;
> gact->tk_offset = tko;
> @@ -293,6 +282,17 @@ static void gate_setup_timer(struct tcf_gate
> *gact, u64 basetime,
> gact->hitimer.function = gate_timer_func;
> }
>
> +static void gate_cancel_timer(struct tcf_gate *gact, u64 basetime,
> + enum tk_offsets tko, s32 clockid)
> +{
> + if (basetime == gact->param.tcfg_basetime &&
> + tko == gact->tk_offset &&
> + clockid == gact->param.tcfg_clockid)
> + return;
> +
> + hrtimer_cancel(&gact->hitimer);
> +}
> +
the above function either cancels a timer, or does nothing: it depends on
the value of the 3-ple {tcfg_basetime, tk_offset, tcfg_clockid}. If we run
this function without holding tcf_lock, nobody will guarantee that
{tcfg_basetime, tk_offset, tcfg_clockid} is not being concurrently
rewritten by some other command like:
# tc action replace action gate <parameters> index <x>
> static int tcf_gate_init(struct net *net, struct nlattr *nla,
> struct nlattr *est, struct tc_action **a,
> int ovr, int bind, bool rtnl_held,
> @@ -381,6 +381,8 @@ static int tcf_gate_init(struct net *net, struct
> nlattr *nla,
> gact = to_gate(*a);
> if (ret == ACT_P_CREATED)
> INIT_LIST_HEAD(&gact->param.entries);
> + else
> + gate_cancel_timer(gact, basetime, tk_offset, clockid);
>
IOW, the above line is racy unless we do spin_lock()/spin_unlock() around
the
if (<expression depending on gact-> members>)
return;
statement before hrtimer_cancel(), which does not seem much different
than what I did in gate_setup_timer().
[...]
> @@ -433,6 +448,11 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> > if (goto_ch)
> > tcf_chain_put_by_act(goto_ch);
> > release_idr:
> > + /* action is not in: hitimer can be inited without taking tcf_lock */
> > + if (ret == ACT_P_CREATED)
> > + gate_setup_timer(gact, gact->param.tcfg_basetime,
> > + gact->tk_offset, gact->param.tcfg_clockid,
> > + true);
please note, here I felt the need to add a comment, because when ret ==
ACT_P_CREATED the action is not inserted in any list, so there is no
concurrent writer of gact-> members for that action.
> > tcf_idr_release(*a, bind);
> > return err;
> > }
problem #2) a functional issue that originates in how 'cycle_time' and
'entries' are validated (*). See below:
On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:
> static int tcf_gate_init(struct net *net, struct nlattr *nla,
> struct nlattr *est, struct tc_action **a,
> int ovr, int bind, bool rtnl_held,
> @@ -381,6 +381,8 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> gact = to_gate(*a);
> if (ret == ACT_P_CREATED)
> INIT_LIST_HEAD(&gact->param.entries);
> + else
> + gate_cancel_timer(gact, basetime, tk_offset, clockid);
here you propose to cancel the timer, but few lines after we have this:
385 err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack);
386 if (err < 0)
387 goto release_idr;
388
so, when users try the following commands:
# tc action add action gate <good parameters> index 2
# tc action replace action gate <other good parameters> goto chain 42 index 2
and chain 42 does not exist, the second command will fail. But the timer
is erroneously stopped, and never started again. So, the first rule is
correctly inserted but it becomes no more functional after users try to
replace it with another one having invalid control action.
Moving the call to gate_cancel_timer() after the validation of the control
action will not fix this problem, because 'cycle_time' and 'entries' are
validated together, and with the spinlock taken. Because of this, we need
to cancel that timer only when we know that we will not do
tcf_idr_release() and return some error to the user.
please let me know if you think my doubts are not well-founded.
--
davide
(*) now that I see parse_gate_list() again, I noticed another potential
issue with replace (that I need to verify first): apparently the list is
not replaced, it's just "updated" with new entries appended at the end. I
will try to write a fix for that (separate from this series).
Powered by blists - more mailing lists