Message-ID: <429bc64106ac69c8291f4466ddbaa2b48b8e16c4.camel@redhat.com>
Date: Tue, 16 Jun 2020 14:42:58 +0200
From: Davide Caratti <dcaratti@...hat.com>
To: Vladimir Oltean <olteanv@...il.com>
Cc: Po Liu <Po.Liu@....com>, Cong Wang <xiyou.wangcong@...il.com>,
"David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net v2 2/2] net/sched: act_gate: fix configuration of
the periodic timer
On Tue, 2020-06-16 at 13:38 +0300, Vladimir Oltean wrote:
> Hi Davide,
>
> On Tue, 16 Jun 2020 at 13:13, Davide Caratti <dcaratti@...hat.com> wrote:
> > hello Vladimir,
> >
> > thanks a lot for reviewing this.
> >
> > On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:
[...]
> > > What if you split the "replace" functionality of gate_setup_timer into
> > > a separate gate_cancel_timer function, which you could call earlier
> > > (before taking the spin lock)?
> >
> > I think it would introduce the following 2 problems:
> >
> > problem #1) a race condition, see below:
[...]
> > > @@ -433,6 +448,11 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> > > > if (goto_ch)
> > > > tcf_chain_put_by_act(goto_ch);
> > > > release_idr:
> > > > + /* action is not in: hitimer can be inited without taking tcf_lock */
> > > > + if (ret == ACT_P_CREATED)
> > > > + gate_setup_timer(gact, gact->param.tcfg_basetime,
> > > > + gact->tk_offset, gact->param.tcfg_clockid,
> > > > + true);
> >
> > please note, here I felt the need to add a comment, because when ret ==
> > ACT_P_CREATED the action is not inserted in any list, so there is no
> > concurrent writer of gact-> members for that action.
> >
>
> Then please rephrase the comment. I had read it and it still wasn't
> clear at all for me what you were talking about.
something like:

	/* action is not yet inserted in any list: it's safe to init hitimer
	 * without taking tcf_lock.
	 */

would be ok?
[...]
> I wonder, could you call tcf_gate_cleanup instead of just canceling the
> hrtimer?
not with the current tcf_gate_cleanup() [1] and parse_gate_list() [2],
because it would introduce another bug: 'p->entries' gets cleared on
action overwrite after being successfully created here:

	if (tb[TCA_GATE_ENTRY_LIST]) {
		err = parse_gate_list(tb[TCA_GATE_ENTRY_LIST], p, extack);
		if (err < 0)
			goto chain_put;
	}

As mentioned earlier, 'hitimer' cannot be canceled or re-initialized easily
while tcf_gate_init() still has a possible error path. And in my
understanding, 'p->entries' must be consistent when the timer is initialized.
IMO, the correct way to handle 'entries' is to:
- populate the list in a local variable, before taking the spinlock and
allocating the IDR
- assign it to p->entries once validation has succeeded (with the spinlock
taken). This is the same as what was done with 'cycletime' in patch 1/2,
but with the variable initialized (btw, thanks for catching this). In case
of action replace, free the old list
- release the newly allocated list in the error path of tcf_gate_init()
(but again, this would be a fix for 'entries', not for 'hitimer', so I
plan to work on it as a separate patch that fits 'net-next' better than
'net').
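
To make the three steps above concrete, here is a minimal userspace sketch,
not the actual patch. Everything in it is a hypothetical stand-in: 'struct
entry', 'struct gate_params', parse_entries() and gate_replace_entries() are
not names from act_gate.c, a C11 atomic_flag plays the role of tcf_lock, and
a zero interval is used as an arbitrary validation failure.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins: none of these names come from act_gate.c. */
struct entry {
	uint64_t interval_ns;
	struct entry *next;
};

struct gate_params {
	atomic_flag lock;	/* plays the role of tcf_lock */
	struct entry *entries;	/* published list, only changed under lock */
	size_t num_entries;
};

static void params_lock(struct gate_params *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void params_unlock(struct gate_params *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

static void free_entries(struct entry *head)
{
	while (head) {
		struct entry *next = head->next;

		free(head);
		head = next;
	}
}

/* Step 1: build the list on a local head; no shared state is touched. */
static int parse_entries(const uint64_t *intervals, size_t n,
			 struct entry **out)
{
	struct entry *head = NULL, **tail = &head;

	for (size_t i = 0; i < n; i++) {
		struct entry *e;

		if (intervals[i] == 0) {	/* assumed validation rule */
			free_entries(head);	/* step 3: drop the partial list */
			return -EINVAL;
		}
		e = calloc(1, sizeof(*e));
		if (!e) {
			free_entries(head);
			return -ENOMEM;
		}
		e->interval_ns = intervals[i];
		*tail = e;
		tail = &e->next;
	}
	*out = head;
	return 0;
}

static int gate_replace_entries(struct gate_params *p,
				const uint64_t *intervals, size_t n)
{
	struct entry *new_list = NULL, *old_list;
	int err;

	err = parse_entries(intervals, n, &new_list);
	if (err < 0)
		return err;	/* error path: p->entries was never modified */

	/* Step 2: publish the validated list with the lock held. */
	params_lock(p);
	old_list = p->entries;
	p->entries = new_list;
	p->num_entries = n;
	params_unlock(p);

	free_entries(old_list);	/* replace case: free the previous list */
	return 0;
}
```

The point of the ordering is that a failed parse leaves the previously
published list untouched, so anything reading it under the lock (the timer,
in the real action) never sees a half-built or cleared list.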
--
davide
[1] https://elixir.bootlin.com/linux/v5.8-rc1/source/net/sched/act_gate.c#L450
[2] https://elixir.bootlin.com/linux/v5.8-rc1/source/net/sched/act_gate.c#L235