Message-ID: <87y2nugepz.fsf@mellanox.com>
Date: Wed, 08 Jul 2020 18:21:12 +0200
From: Petr Machata <petrm@...lanox.com>
To: Cong Wang <xiyou.wangcong@...il.com>
Cc: Linux Kernel Network Developers <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Eric Dumazet <eric.dumazet@...il.com>,
Jiri Pirko <jiri@...lanox.com>,
Ido Schimmel <idosch@...lanox.com>
Subject: Re: [PATCH net-next v1 2/5] net: sched: Introduce helpers for qevent blocks

Petr Machata <petrm@...lanox.com> writes:
> Cong Wang <xiyou.wangcong@...il.com> writes:
>
>> On Tue, Jul 7, 2020 at 8:22 AM Petr Machata <petrm@...lanox.com> wrote:
>>>
>>>
>>> Cong Wang <xiyou.wangcong@...il.com> writes:
>>>
>>> > On Fri, Jun 26, 2020 at 3:46 PM Petr Machata <petrm@...lanox.com> wrote:
>>> >> The function tcf_qevent_handle() should be invoked when the qdisc hits
>>> >> the "interesting event" corresponding to a block. This function releases
>>> >> the root lock for the duration of executing the attached filters, to
>>> >> allow packets generated through user actions (notably mirred) to be
>>> >> reinserted into the same qdisc tree.
>>> >
>>> > Are you sure releasing the root lock in the middle of an enqueue operation
>>> > is a good idea? It seems racy with the qdisc change and reset paths; for
>>> > example, __red_change() could update some RED parameters
>>> > immediately after you release the root lock.
>>>
>>> So I had mulled over this for a while. If packets are enqueued or
>>> dequeued while the lock is released, maybe the packet under
>>> consideration should have been hard_marked instead of prob_marked, or
>>> vice versa. (I can't really leave it unmarked altogether, because the
>>> fact that we ran the qevent is a very observable commitment to put the
>>> packet in the queue with marking.) I figured that is not such a big deal.
>>>
>>> Regarding a configuration change: for a brief period after the change, a
>>> few not-yet-pushed packets could have been enqueued with ECN marking
>>> even though I have, e.g., just disabled ECN. This does not seem like a
>>> big deal either; these are transient effects.
>>
>> Hmm, let's see:
>>
>> 1. red_enqueue() caches a pointer to child qdisc, child = q->qdisc
>> 2. root lock is released by tcf_qevent_handle().
>> 3. __red_change() acquires the root lock and then changes child
>> qdisc with a new one
>> 4. The old child qdisc is put by qdisc_put()
>> 5. tcf_qevent_handle() acquires the root lock again, and still uses
>> the cached but now freed old child qdisc.
>>
>> Isn't this a problem?
>
> I missed this. It is a problem: destroy gets called right away, and then
> enqueue would use invalid data.
>
> Also qdisc_graft() calls cops->graft, which locks the tree and swaps the
> qdisc pointers, and then qdisc_put()s the original one. So dropping the
> root lock can lead to the destruction of the very qdisc whose enqueue is
> currently executing.
>
> So that's no good. The lock has to be held throughout.
>
>> Even if it really is not, why do we make life so hard for the callers of
>> tcf_qevent_handle()? They have to be very careful with race conditions
>> like this.
>
> Do you have another solution in mind here? I think the deadlock (in both
> classification and qevents) is an issue, but I really don't know how to
> avoid it except by dropping the lock.

Actually I guess I could qdisc_refcount_inc() the current qdisc so that
it doesn't go away while the lock is released. Then when enqueuing I
could access the child directly, not relying on the now-stale pointer
cached at the beginning of the enqueue function. I suppose that a
similar approach could be used by other users of tcf_classify() as
well. What do you think?
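
Roughly like this (a completely untested sketch against sch_red; the
qevent member name and the tcf_qevent_handle() signature follow the
patch, but the refcount handling is hand-waved):

/* Fragment of net/sched/sch_red.c, sketch only. Pin the qdisc across
 * the unlocked section so that a concurrent qdisc_graft()/qdisc_put()
 * cannot free it from under us, and re-read the child afterwards
 * instead of trusting a pointer cached before the lock was dropped.
 */
static int red_enqueue(struct sk_buff *skb, struct Qdisc *sch,
		       struct sk_buff **to_free)
{
	struct red_sched_data *q = qdisc_priv(sch);
	struct Qdisc *child;
	int ret;

	/* ... RED early-drop / marking decision as before ... */

	/* Take a reference before tcf_qevent_handle() releases the root
	 * lock, so the qdisc stays alive even if it is grafted away in
	 * the meantime.
	 */
	qdisc_refcount_inc(sch);
	skb = tcf_qevent_handle(&q->qe_mark, sch, skb, to_free, &ret);
	/* Releasing the reference is the questionable part: qdisc_put()
	 * expects RTNL, so the final put would have to be deferred
	 * somehow.
	 */
	qdisc_put(sch);
	if (!skb)
		return ret;

	/* Re-read the child: __red_change() may have replaced it while
	 * the root lock was released.
	 */
	child = q->qdisc;
	return qdisc_enqueue(skb, child, to_free);
}

The reference release under the root lock is what I'm least sure about,
hence the question.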