Message-ID: <CA+h21hpL+7tuEX7_NCNo7NdgZ1OYqjQ03=DHuZ3aOOKh6Z4tsw@mail.gmail.com>
Date: Tue, 16 Jun 2020 17:23:26 +0300
From: Vladimir Oltean <olteanv@...il.com>
To: Davide Caratti <dcaratti@...hat.com>
Cc: Po Liu <Po.Liu@....com>, Cong Wang <xiyou.wangcong@...il.com>,
"David S . Miller" <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>
Subject: Re: [PATCH net v2 2/2] net/sched: act_gate: fix configuration of the
periodic timer
On Tue, 16 Jun 2020 at 15:43, Davide Caratti <dcaratti@...hat.com> wrote:
>
> On Tue, 2020-06-16 at 13:38 +0300, Vladimir Oltean wrote:
> > Hi Davide,
> >
> > On Tue, 16 Jun 2020 at 13:13, Davide Caratti <dcaratti@...hat.com> wrote:
> > > hello Vladimir,
> > >
> > > thanks a lot for reviewing this.
> > >
> > > On Tue, 2020-06-16 at 00:55 +0300, Vladimir Oltean wrote:
>
> [...]
>
> > > > What if you split the "replace" functionality of gate_setup_timer into
> > > > a separate gate_cancel_timer function, which you could call earlier
> > > > (before taking the spin lock)?
> > >
> > > I think it would introduce the following 2 problems:
> > >
> > > problem #1) a race condition, see below:
>
> [...]
>
> > > > @@ -433,6 +448,11 @@ static int tcf_gate_init(struct net *net, struct nlattr *nla,
> > > > > if (goto_ch)
> > > > > tcf_chain_put_by_act(goto_ch);
> > > > > release_idr:
> > > > > + /* action is not in: hitimer can be inited without taking tcf_lock */
> > > > > + if (ret == ACT_P_CREATED)
> > > > > + gate_setup_timer(gact, gact->param.tcfg_basetime,
> > > > > + gact->tk_offset, gact->param.tcfg_clockid,
> > > > > + true);
> > >
> > > please note, here I felt the need to add a comment, because when ret ==
> > > ACT_P_CREATED the action is not inserted in any list, so there is no
> > > concurrent writer of gact-> members for that action.
> > >
> >
> > Then please rephrase the comment. I had read it and it still wasn't
> > clear to me at all what you were talking about.
>
> something like:
>
> /* action is not yet inserted in any list: it's safe to init hitimer
> * without taking tcf_lock.
> */
>
> would be ok?
>
Yes, better.
> [...]
>
> > I wonder, could you call tcf_gate_cleanup instead of just canceling the
> > hrtimer?
>
> not with the current tcf_gate_cleanup() [1] and parse_gate_list() [2],
> because it would introduce another bug: 'p->entries' gets cleared on
> action overwrite after being successfully created here:
>
> 	if (tb[TCA_GATE_ENTRY_LIST]) {
> 		err = parse_gate_list(tb[TCA_GATE_ENTRY_LIST], p, extack);
> 		if (err < 0)
> 			goto chain_put;
> 	}
>
>
> As mentioned earlier, 'hitimer' cannot easily be canceled/re-initialized while
> tcf_gate_init() still has a possible error path. And in my understanding
> 'p->entries' must be consistent when the timer is initialized.
>
> IMO, the correct way to handle 'entries' is to:
>
> - populate the list on a local variable, before taking the spinlock and
> allocating the IDR
>
> - assign to p->entries after validation is successful (with the spinlock
> taken). Same as what was done with 'cycletime' in patch 1/2, but with the
> variable initialized (btw, thanks for catching this), and free the old
> list in case of action replace
>
> - release the newly allocated list in the error path of tcf_gate_init()
>
> (but again, this would be a fix for 'entries', not for 'hitimer', so I
> plan to work on it as a separate patch that fits 'net-next' better than
> 'net').
>
Targeting net-next would mean that the net tree would still keep
appending to p->entries upon action replacement, instead of just
replacing p->entries?
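
For illustration only, the three steps listed above (build the list in a
local variable, assign it under the lock once validation has succeeded,
free the old list on replace or the new one on error) can be sketched in
user space as below. This is not act_gate.c code: struct entry, struct
params, fake_lock()/fake_unlock(), parse_list() and replace_entries() are
all hypothetical stand-ins, and a C11 atomic_flag plays the role of
tcf_lock.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel's act_gate types. */
struct entry {
	int interval;
	struct entry *next;
};

struct params {
	struct entry *entries;	/* schedule currently in use */
	size_t num_entries;
};

/* Toy spinlock standing in for tcf_lock (assumption for the sketch). */
static atomic_flag fake_tcf_lock = ATOMIC_FLAG_INIT;

static void fake_lock(void)
{
	while (atomic_flag_test_and_set(&fake_tcf_lock))
		;
}

static void fake_unlock(void)
{
	atomic_flag_clear(&fake_tcf_lock);
}

static void free_list(struct entry *head)
{
	while (head) {
		struct entry *next = head->next;

		free(head);
		head = next;
	}
}

/*
 * Build the new list privately. On an invalid interval or an allocation
 * failure, release the partial list and return -1; *out is not written.
 */
static int parse_list(const int *intervals, size_t n, struct entry **out)
{
	struct entry *head = NULL, **tail = &head;

	for (size_t i = 0; i < n; i++) {
		struct entry *e;

		if (intervals[i] <= 0)
			goto err;
		e = malloc(sizeof(*e));
		if (!e)
			goto err;
		e->interval = intervals[i];
		e->next = NULL;
		*tail = e;
		tail = &e->next;
	}
	*out = head;
	return 0;
err:
	free_list(head);
	return -1;
}

/* Replace (not append to) p->entries, and only after validation passed. */
static int replace_entries(struct params *p, const int *intervals, size_t n)
{
	struct entry *new_list, *old_list;

	if (parse_list(intervals, n, &new_list) < 0)
		return -1;	/* error path: p is left untouched */

	fake_lock();
	old_list = p->entries;
	p->entries = new_list;
	p->num_entries = n;
	fake_unlock();

	free_list(old_list);	/* old schedule dropped on replace */
	return 0;
}
```

With this shape a failed replace leaves the previous schedule intact, and a
successful one swaps the whole list rather than appending to it.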
> --
> davide
>
> [1] https://elixir.bootlin.com/linux/v5.8-rc1/source/net/sched/act_gate.c#L450
> [2] https://elixir.bootlin.com/linux/v5.8-rc1/source/net/sched/act_gate.c#L235
>
Thanks,
-Vladimir