[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190305005711.0dba1b69@redhat.com>
Date: Tue, 5 Mar 2019 00:57:11 +0100
From: Stefano Brivio <sbrivio@...hat.com>
To: Vlad Buslov <vladbu@...lanox.com>
Cc: netdev@...r.kernel.org, jhs@...atatu.com, xiyou.wangcong@...il.com,
jiri@...nulli.us, davem@...emloft.net
Subject: Re: [PATCH net-next v2 00/12] Refactor flower classifier to remove
dependency on rtnl lock
On Wed, 27 Feb 2019 12:12:14 +0200
Vlad Buslov <vladbu@...lanox.com> wrote:
> Currently, all netlink protocol handlers for updating rules, actions and
> qdiscs are protected with single global rtnl lock which removes any
> possibility for parallelism. This patch set is a third step to remove
> rtnl lock dependency from TC rules update path.
>
> Recently, new rtnl registration flag RTNL_FLAG_DOIT_UNLOCKED was added.
> TC rule update handlers (RTM_NEWTFILTER, RTM_DELTFILTER, etc.) are
> already registered with this flag and only take rtnl lock when qdisc or
> classifier requires it. Classifiers can indicate that their ops
> callbacks don't require caller to hold rtnl lock by setting the
> TCF_PROTO_OPS_DOIT_UNLOCKED flag. The goal of this change is to refactor
> flower classifier to support unlocked execution and register it with
> unlocked flag.
>
> This patch set implements following changes to make flower classifier
> concurrency-safe:
>
> - Implement reference counting for individual filters. Change fl_get to
> take reference to filter. Implement tp->ops->put callback that was
> introduced in cls API patch set to release reference to flower filter.
>
> - Use tp->lock spinlock to protect internal classifier data structures
> from concurrent modification.
>
> - Handle concurrent tcf proto deletion by returning EAGAIN, which will
> cause cls API to retry and create new proto instance or return error
> to the user (depending on message type).
>
> - Handle concurrent insertion of filter with same priority and handle by
> returning EAGAIN, which will cause cls API to lookup filter again and
> process it accordingly to netlink message flags.
>
> - Extend flower mask with reference counting and protect masks list with
> masks_lock spinlock.
>
> - Prevent concurrent mask insertion by inserting temporary value to
> masks hash table. This is necessary because mask initialization is a
> sleeping operation and cannot be done while holding tp->lock.
>
> Both chain level and classifier level conflicts are resolved by
> returning -EAGAIN to cls API that results restart of whole operation.
> This retry mechanism is a result of fine-grained locking approach used
> in this and previous changes in series and is necessary to allow
> concurrent updates on same chain instance. Alternative approach would be
> to lock the whole chain while updating filters on any of child tp's,
> adding and removing classifier instances from the chain. However, since
> most CPU-intensive parts of filter update code are specifically in
> classifier code and its dependencies (extensions and hw offloads), such
> approach would negate most of the gains introduced by this change and
> previous changes in the series when updating same chain instance.
>
> Tcf hw offloads API is not changed by this patch set and still requires
> caller to hold rtnl lock. Refactored flower classifier tracks rtnl lock
> state by means of 'rtnl_held' flag provided by cls API and obtains the
> lock before calling hw offloads. Following patch set will lift this
> restriction and refactor cls hw offloads API to support unlocked
> execution.
>
> With these changes flower classifier is safely registered with
> TCF_PROTO_OPS_DOIT_UNLOCKED flag in last patch.
>
> Changes from V1 to V2:
> - Extend cover letter with explanation about retry mechanism.
> - Rebase on current net-next.
> - Patch 1:
> - Use rcu_dereference_raw() for tp->root dereference.
> - Update comment in fl_head_dereference().
> - Patch 2:
> - Remove redundant check in fl_change error handling code.
> - Add empty line between error check and new handle assignment.
> - Patch 3:
> - Refactor loop in fl_get_next_filter() to improve readability.
> - Patch 4:
> - Refactor __fl_delete() to improve readability.
> - Patch 6:
> - Fix comment in fl_check_assign_mask().
> - Patch 9:
> - Extend commit message.
> - Fix error code in comment.
> - Patch 11:
> - Fix fl_hw_replace_filter() to always release rtnl lock in error
> handlers.
> - Patch 12:
> - Don't take rtnl lock before calling __fl_destroy_filter() in
> workqueue context.
> - Extend commit message with explanation why flower still takes rtnl
> lock before calling hardware offloads API.
FWIW,
Reviewed-by: Stefano Brivio <sbrivio@...hat.com>
--
Stefano
Powered by blists - more mailing lists