Message-ID: <CAM_iQpVA2SQ4eFWEQxjQwmoxwQnQiLgREETKdQLoADWt_xD4Bg@mail.gmail.com>
Date: Wed, 18 Oct 2017 09:43:12 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Chris Mi <chrism@...lanox.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
Jamal Hadi Salim <jhs@...atatu.com>,
Lucas Bates <lucasb@...atatu.com>,
Jiri Pirko <jiri@...nulli.us>,
David Miller <davem@...emloft.net>
Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and bindcnt
to atomic
On Tue, Oct 17, 2017 at 6:03 PM, Chris Mi <chrism@...lanox.com> wrote:
>> -----Original Message-----
>> From: Cong Wang [mailto:xiyou.wangcong@...il.com]
>> Sent: Tuesday, October 17, 2017 11:53 PM
>> To: Chris Mi <chrism@...lanox.com>
>> Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>; Jamal Hadi
>> Salim <jhs@...atatu.com>; Lucas Bates <lucasb@...atatu.com>; Jiri Pirko
>> <jiri@...nulli.us>; David Miller <davem@...emloft.net>
>> Subject: Re: [patch net v2 1/4] net/sched: Change tc_action refcnt and
>> bindcnt to atomic
>>
>> On Mon, Oct 16, 2017 at 6:14 PM, Chris Mi <chrism@...lanox.com> wrote:
>> > I don't think this bug was introduced by the above two commits only.
>> > Actually, this bug was introduced by several commits, at least the
>> following:
>> > 1. refcnt and bindcnt are not atomic
>>
>> Nope, non-atomic is perfectly okay as long as nothing runs in parallel, and
>> without RCU callbacks they are perfectly serialized by RTNL.
> Agree.
>>
>>
>> > 2. passing actions using list instead of arrays (I think initially we
>> > are using arrays)
>>
>> We are discussing patch 1/4, this is patch 2/4, so irrelevant.
> Agree.
>>
>>
>> > 3. using RCU callbacks
>>
>> This introduces problem 1.
> I think this patch set only fixes one problem, that's the race and the panic.
> What do you mean by problem 1?
You listed 3 problems and you treat them as 3 different ones. Here
I argue that problem 3 (using RCU callbacks) is the cause of problem 1
(refcnt not atomic). This is why I mentioned that I have been thinking
about removing RCU callbacks: it could probably fix all of them.
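
To make the dependency between problem 3 and problem 1 concrete, here is a minimal userspace sketch. It is not kernel code: `demo_plain`, `demo_atomic` and the function names are made up for illustration, and the "RTNL path" and "callback" are only labels on a hand-written interleaving. It shows why a plain counter is fine while every update is serialized, and what goes wrong once a second, unserialized path can decrement it.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for a tc action's refcnt; not kernel code. */
struct demo_plain  { int refcnt; };
struct demo_atomic { atomic_int refcnt; };

/*
 * Replays, step by step, one bad interleaving of two unserialized
 * "refcnt--" operations (say, one from an RTNL-protected path and one
 * from a deferred callback). Returns the final count.
 */
static int demo_plain_interleaved(void)
{
	struct demo_plain a = { .refcnt = 2 };
	int seen_by_rtnl_path = a.refcnt;   /* path 1 loads 2 */
	int seen_by_callback  = a.refcnt;   /* path 2 also loads 2 */

	a.refcnt = seen_by_rtnl_path - 1;   /* path 1 stores 1 */
	a.refcnt = seen_by_callback - 1;    /* path 2 stores 1: one put is lost */
	return a.refcnt;
}

/*
 * The same two puts done as atomic read-modify-write operations.
 * Returns how many callers observed that they dropped the last reference.
 */
static int demo_atomic_puts(void)
{
	struct demo_atomic a;
	int last_holders = 0;

	atomic_init(&a.refcnt, 2);
	last_holders += (atomic_fetch_sub(&a.refcnt, 1) == 1); /* previous value 2 */
	last_holders += (atomic_fetch_sub(&a.refcnt, 1) == 1); /* previous value 1 */
	return last_holders;
}

/* Small self-check of both scenarios. */
static void demo_selfcheck(void)
{
	assert(demo_plain_interleaved() == 1); /* count stuck at 1, never reaches 0 */
	assert(demo_atomic_puts() == 1);       /* exactly one caller frees the object */
}
```

In the plain version the count ends at 1 instead of 0, so the object would never be released; other interleavings can just as well lead to a double free. The atomic version only protects the counter itself. It says nothing about other state touched by the two paths, which is why making the counters atomic does not by itself prove the code race-free.
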
>>
>>
>> > So instead of blaming the latest commit, it is better to say it is a pre-git error.
>>
>> You are wrong.
> OK, you are right. But could I know what's your suggestion for this patch set?
> 1. reject it?
> 2. change the "Fixes" as you suggested?
> 3. something else?
In my opinion we need to think about removing RCU callbacks
rather than fixing every bug they introduce, because it is really hard
to prove that we have fixed all of them. Your patchset fixes 2 bugs.
Before that, we fixed 2 others (I already listed them in my other reply
to you). That makes 4 bugs in total... Are we totally race-free even
after your patches? It seems not, at least not without a lock, and as
I said, locking itself is hard...
I will start a new thread to discuss this and keep you Cc'ed. So
please hold your patches until we have a conclusion.
Thanks.