[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20121030162103.GA17261@Krystal>
Date: Tue, 30 Oct 2012 12:21:03 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
dipankar@...ibm.com, akpm@...ux-foundation.org,
josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
peterz@...radead.org, rostedt@...dmis.org, Valdis.Kletnieks@...edu,
dhowells@...hat.com, eric.dumazet@...il.com, darren@...art.com,
fweisbec@...il.com, sbw@....edu, patches@...aro.org
Subject: Re: [PATCH tip/core/rcu 4/4] rcu: Document alternative
RCU/reference-count algorithms
* Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
>
> The approach for mixing RCU and reference counting listed in the RCU
> documentation only describes one possible approach. This approach can
> result in failure on the read side, which is nice if you want fresh data,
> but not so good if you want simple code. This commit therefore adds
> two additional approaches that feature unconditional reference-count
> acquisition by RCU readers. These approaches are very similar to that
> used in the security code.
>
> Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> ---
> Documentation/RCU/rcuref.txt | 61 ++++++++++++++++++++++++++++++++++++++++-
> 1 files changed, 59 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/RCU/rcuref.txt b/Documentation/RCU/rcuref.txt
> index 4202ad0..99ca662 100644
> --- a/Documentation/RCU/rcuref.txt
> +++ b/Documentation/RCU/rcuref.txt
> @@ -20,7 +20,7 @@ release_referenced() delete()
> { {
> ... write_lock(&list_lock);
> atomic_dec(&el->rc, relfunc) ...
> - ... delete_element
> + ... remove_element
> } write_unlock(&list_lock);
> ...
> if (atomic_dec_and_test(&el->rc))
> @@ -52,7 +52,7 @@ release_referenced() delete()
> { {
> ... spin_lock(&list_lock);
> if (atomic_dec_and_test(&el->rc)) ...
> - call_rcu(&el->head, el_free); delete_element
> + call_rcu(&el->head, el_free); remove_element
> ... spin_unlock(&list_lock);
> } ...
> if (atomic_dec_and_test(&el->rc))
> @@ -64,3 +64,60 @@ Sometimes, a reference to the element needs to be obtained in the
> update (write) stream. In such cases, atomic_inc_not_zero() might be
> overkill, since we hold the update-side spinlock. One might instead
> use atomic_inc() in such cases.
> +
> +It is not always convenient to deal with "FAIL" in the
> +search_and_reference() code path. In such cases, the
> +atomic_dec_and_test() may be moved from delete() to el_free()
> +as follows:
> +
> +1. 2.
> +add() search_and_reference()
> +{ {
> + alloc_object rcu_read_lock();
> + ... search_for_element
> + atomic_set(&el->rc, 1); atomic_inc(&el->rc);
> + spin_lock(&list_lock); ...
> +
> + add_element rcu_read_unlock();
> + ... }
indentation looks wrong in my mail client for the two lines above (for
the 2. block).
Otherwise, it looks good to me,
Thanks,
Mathieu
> + spin_unlock(&list_lock); 4.
> +} delete()
> +3. {
> +release_referenced() spin_lock(&list_lock);
> +{ ...
> + ... remove_element
> + if (atomic_dec_and_test(&el->rc)) spin_unlock(&list_lock);
> + kfree(el); ...
> + ... call_rcu(&el->head, el_free);
> +} ...
> +5. }
> +void el_free(struct rcu_head *rhp)
> +{
> + release_referenced();
> +}
> +
> +The key point is that the initial reference added by add() is not removed
> +until after a grace period has elapsed following removal. This means that
> +search_and_reference() cannot find this element, which means that the value
> +of el->rc cannot increase. Thus, once it reaches zero, there are no
> +readers that can or ever will be able to reference the element. The
> +element can therefore safely be freed. This in turn guarantees that if
> +any reader finds the element, that reader may safely acquire a reference
> +without checking the value of the reference counter.
> +
> +In cases where delete() can sleep, synchronize_rcu() can be called from
> +delete(), so that el_free() can be subsumed into delete as follows:
> +
> +4.
> +delete()
> +{
> + spin_lock(&list_lock);
> + ...
> + remove_element
> + spin_unlock(&list_lock);
> + ...
> + synchronize_rcu();
> + if (atomic_dec_and_test(&el->rc))
> + kfree(el);
> + ...
> +}
> --
> 1.7.8
>
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists