Date:   Fri, 25 Nov 2016 11:25:12 -0200
From:   Mauro Carvalho Chehab <mchehab@...pensource.com>
To:     Silvio Fricke <silvio.fricke@...il.com>
Cc:     linux-doc@...r.kernel.org, Jonathan Corbet <corbet@....net>,
        Ming Lei <ming.lei@...onical.com>,
        "Luis R . Rodriguez" <mcgrof@...nel.org>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/4] Documentation/atomic_ops.txt: convert to ReST
 markup

On Fri, 25 Nov 2016 11:02:40 +0100
Silvio Fricke <silvio.fricke@...il.com> wrote:

> ... and move to core-api folder.
> 
> Signed-off-by: Silvio Fricke <silvio.fricke@...il.com>
> ---
>  Documentation/atomic_ops.txt => Documentation/core-api/atomic_ops.rst | 745 +++++++++++++++++++++++++++++++++++++-----------------------------------
>  Documentation/core-api/index.rst                                      |   1 +-
>  Documentation/process/volatile-considered-harmful.rst                 |   3 +-
>  3 files changed, 391 insertions(+), 358 deletions(-)
> 
> diff --git a/Documentation/atomic_ops.txt b/Documentation/core-api/atomic_ops.rst
> similarity index 47%
> rename from Documentation/atomic_ops.txt
> rename to Documentation/core-api/atomic_ops.rst
> index 6c5e8a9..3ac94ff 100644
> --- a/Documentation/atomic_ops.txt
> +++ b/Documentation/core-api/atomic_ops.rst
> @@ -1,204 +1,212 @@
> -		Semantics and Behavior of Atomic and
> -		         Bitmask Operations
> +=======================================================
> +Semantics and Behavior of Atomic and Bitmask Operations
> +=======================================================
>  
> -			  David S. Miller	 
> +:Author: David S. Miller
>  
> -	This document is intended to serve as a guide to Linux port
> +This document is intended to serve as a guide to Linux port
>  maintainers on how to implement atomic counter, bitops, and spinlock
>  interfaces properly.
>  
> -	The atomic_t type should be defined as a signed integer and
> -the atomic_long_t type as a signed long integer.  Also, they should
> +Atomic Type And Operations
> +==========================
> +
> +The ``atomic_t`` type should be defined as a signed integer and
> +the ``atomic_long_t`` type as a signed long integer.  Also, they should
>  be made opaque such that any kind of cast to a normal C integer type
> -will fail.  Something like the following should suffice:
> +will fail.  Something like the following should suffice::
>  
> -	typedef struct { int counter; } atomic_t;
> -	typedef struct { long counter; } atomic_long_t;
> +    typedef struct { int counter; } atomic_t;
> +    typedef struct { long counter; } atomic_long_t;
>  
>  Historically, counter has been declared volatile.  This is now discouraged.
> -See Documentation/process/volatile-considered-harmful.rst for the complete rationale.
> +See :ref:`Documentation/process/volatile-considered-harmful.rst
> +<volatile_considered_harmful>` for the complete rationale.
>  
> -local_t is very similar to atomic_t. If the counter is per CPU and only
> -updated by one CPU, local_t is probably more appropriate. Please see
> -Documentation/local_ops.txt for the semantics of local_t.
> +``local_t`` is very similar to ``atomic_t``. If the counter is per CPU and only
> +updated by one CPU, ``local_t`` is probably more appropriate. Please see
> +:ref:`Documentation/local_ops.txt <local_ops>` for the semantics of ``local_t``.
>  
> -The first operations to implement for atomic_t's are the initializers and
> -plain reads.
> +The first operations to implement for ``atomic_t``'s are the initializers and
> +plain reads. ::
>  
> -	#define ATOMIC_INIT(i)		{ (i) }
> -	#define atomic_set(v, i)	((v)->counter = (i))
> +    #define ATOMIC_INIT(i)      { (i) }
> +    #define atomic_set(v, i)    ((v)->counter = (i))
>  
> -The first macro is used in definitions, such as:
> +The first macro is used in definitions, such as::
>  
> -static atomic_t my_counter = ATOMIC_INIT(1);
> +    static atomic_t my_counter = ATOMIC_INIT(1);
>  
>  The initializer is atomic in that the return values of the atomic operations
>  are guaranteed to be correct reflecting the initialized value if the
>  initializer is used before runtime.  If the initializer is used at runtime, a
>  proper implicit or explicit read memory barrier is needed before reading the
> -value with atomic_read from another thread.
> +value with ``atomic_read`` from another thread.
>  
> -As with all of the atomic_ interfaces, replace the leading "atomic_"
> -with "atomic_long_" to operate on atomic_long_t.
> +As with all of the ``atomic_`` interfaces, replace the leading ``atomic_``
> +with ``atomic_long_`` to operate on ``atomic_long_t``.
>  
> -The second interface can be used at runtime, as in:
> +The second interface can be used at runtime, as in::
>  
> -	struct foo { atomic_t counter; };
> -	...
> +    struct foo { atomic_t counter; };
> +    ...
>  
> -	struct foo *k;
> +    struct foo *k;
>  
> -	k = kmalloc(sizeof(*k), GFP_KERNEL);
> -	if (!k)
> -		return -ENOMEM;
> -	atomic_set(&k->counter, 0);
> +    k = kmalloc(sizeof(*k), GFP_KERNEL);
> +    if (!k)
> +            return -ENOMEM;
> +    atomic_set(&k->counter, 0);
>  
>  The setting is atomic in that the return values of the atomic operations by
>  all threads are guaranteed to be correct reflecting either the value that has
>  been set with this operation or set with another operation.  A proper implicit
>  or explicit memory barrier is needed before the value set with the operation
> -is guaranteed to be readable with atomic_read from another thread.
> +is guaranteed to be readable with ``atomic_read`` from another thread.
>  
> -Next, we have:
> +Next, we have::
>  
> -	#define atomic_read(v)	((v)->counter)
> +    #define atomic_read(v)  ((v)->counter)
>  
>  which simply reads the counter value currently visible to the calling thread.
>  The read is atomic in that the return value is guaranteed to be one of the
>  values initialized or modified with the interface operations if a proper
>  implicit or explicit memory barrier is used after possible runtime
>  initialization by any other thread and the value is modified only with the
> -interface operations.  atomic_read does not guarantee that the runtime
> +interface operations.  ``atomic_read`` does not guarantee that the runtime
>  initialization by any other thread is visible yet, so the user of the
>  interface must take care of that with a proper implicit or explicit memory
>  barrier.
>  
> -*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! ***
> +.. warning::
>  
> -Some architectures may choose to use the volatile keyword, barriers, or inline
> -assembly to guarantee some degree of immediacy for atomic_read() and
> -atomic_set().  This is not uniformly guaranteed, and may change in the future,
> -so all users of atomic_t should treat atomic_read() and atomic_set() as simple
> -C statements that may be reordered or optimized away entirely by the compiler
> -or processor, and explicitly invoke the appropriate compiler and/or memory
> -barrier for each use case.  Failure to do so will result in code that may
> -suddenly break when used with different architectures or compiler
> -optimizations, or even changes in unrelated code which changes how the
> -compiler optimizes the section accessing atomic_t variables.
> +    ``atomic_read()`` and ``atomic_set()`` DO NOT IMPLY BARRIERS!
>  
> -*** YOU HAVE BEEN WARNED! ***
> +    Some architectures may choose to use the ``volatile`` keyword, barriers, or
> +    ``inline`` assembly to guarantee some degree of immediacy for ``atomic_read()``
> +    and ``atomic_set()``.  This is not uniformly guaranteed, and may change in the
> +    future, so all users of ``atomic_t`` should treat ``atomic_read()`` and
> +    ``atomic_set()`` as simple C statements that may be reordered or optimized away
> +    entirely by the compiler or processor, and explicitly invoke the appropriate
> +    compiler and/or memory barrier for each use case.  Failure to do so will result
> +    in code that may suddenly break when used with different architectures or
> +    compiler optimizations, or even changes in unrelated code which changes how the
> +    compiler optimizes the section accessing ``atomic_t`` variables.
>  
>  Properly aligned pointers, longs, ints, and chars (and unsigned
>  equivalents) may be atomically loaded from and stored to in the same
> -sense as described for atomic_read() and atomic_set().  The READ_ONCE()
> -and WRITE_ONCE() macros should be used to prevent the compiler from using
> -optimizations that might otherwise optimize accesses out of existence on
> -the one hand, or that might create unsolicited accesses on the other.
> +sense as described for ``atomic_read()`` and ``atomic_set()``.  The
> +``READ_ONCE()`` and ``WRITE_ONCE()`` macros should be used to prevent the
> +compiler from using optimizations that might otherwise optimize accesses out of
> +existence on the one hand, or that might create unsolicited accesses on the
> +other.
>  
> -For example consider the following code:
> +For example consider the following code::
>  
> -	while (a > 0)
> -		do_something();
> +    while (a > 0)
> +            do_something();
>  
> -If the compiler can prove that do_something() does not store to the
> +If the compiler can prove that ``do_something()`` does not store to the
>  variable a, then the compiler is within its rights transforming this to
> -the following:
> +the following::
>  
> -	tmp = a;
> -	if (a > 0)
> -		for (;;)
> -			do_something();
> +    tmp = a;
> +    if (a > 0)
> +            for (;;)
> +                    do_something();
>  
>  If you don't want the compiler to do this (and you probably don't), then
> -you should use something like the following:
> +you should use something like the following::
>  
> -	while (READ_ONCE(a) < 0)
> -		do_something();
> +    while (READ_ONCE(a) > 0)
> +            do_something();
>  
> -Alternatively, you could place a barrier() call in the loop.
> +Alternatively, you could place a ``barrier()`` call in the loop.
>  
> -For another example, consider the following code:
> +For another example, consider the following code::
>  
> -	tmp_a = a;
> -	do_something_with(tmp_a);
> -	do_something_else_with(tmp_a);
> +    tmp_a = a;
> +    do_something_with(tmp_a);
> +    do_something_else_with(tmp_a);
>  
> -If the compiler can prove that do_something_with() does not store to the
> -variable a, then the compiler is within its rights to manufacture an
> -additional load as follows:
> +If the compiler can prove that ``do_something_with()`` does not store to the
> +variable ``a``, then the compiler is within its rights to manufacture an
> +additional load as follows::
>  
> -	tmp_a = a;
> -	do_something_with(tmp_a);
> -	tmp_a = a;
> -	do_something_else_with(tmp_a);
> +    tmp_a = a;
> +    do_something_with(tmp_a);
> +    tmp_a = a;
> +    do_something_else_with(tmp_a);
>  
>  This could fatally confuse your code if it expected the same value
> -to be passed to do_something_with() and do_something_else_with().
> +to be passed to ``do_something_with()`` and ``do_something_else_with()``.
>  
>  The compiler would be likely to manufacture this additional load if
> -do_something_with() was an inline function that made very heavy use
> +``do_something_with()`` was an inline function that made very heavy use
>  of registers: reloading from variable a could save a flush to the
>  stack and later reload.  To prevent the compiler from attacking your
> -code in this manner, write the following:
> +code in this manner, write the following::
>  
> -	tmp_a = READ_ONCE(a);
> -	do_something_with(tmp_a);
> -	do_something_else_with(tmp_a);
> +    tmp_a = READ_ONCE(a);
> +    do_something_with(tmp_a);
> +    do_something_else_with(tmp_a);
>  
>  For a final example, consider the following code, assuming that the
> -variable a is set at boot time before the second CPU is brought online
> -and never changed later, so that memory barriers are not needed:
> +variable ``a`` is set at boot time before the second CPU is brought online
> +and never changed later, so that memory barriers are not needed::
>  
> -	if (a)
> -		b = 9;
> -	else
> -		b = 42;
> +    if (a)
> +            b = 9;
> +    else
> +            b = 42;
>  
>  The compiler is within its rights to manufacture an additional store
> -by transforming the above code into the following:
> +by transforming the above code into the following::
>  
> -	b = 42;
> -	if (a)
> -		b = 9;
> +    b = 42;
> +    if (a)
> +            b = 9;
>  
>  This could come as a fatal surprise to other code running concurrently
> -that expected b to never have the value 42 if a was zero.  To prevent
> -the compiler from doing this, write something like:
> +that expected ``b`` to never have the value ``42`` if ``a`` was zero.  To
> +prevent the compiler from doing this, write something like::
> +
> +    if (a)
> +            WRITE_ONCE(b, 9);
> +    else
> +            WRITE_ONCE(b, 42);
>  
> -	if (a)
> -		WRITE_ONCE(b, 9);
> -	else
> -		WRITE_ONCE(b, 42);
> +Don't even **think** about doing this without proper use of memory barriers,
> +locks, or atomic operations if variable ``a`` can change at runtime!
>  
> -Don't even -think- about doing this without proper use of memory barriers,
> -locks, or atomic operations if variable a can change at runtime!
> +.. warning::
>  
> -*** WARNING: READ_ONCE() OR WRITE_ONCE() DO NOT IMPLY A BARRIER! ***
> +    ``READ_ONCE()`` OR ``WRITE_ONCE()`` DO NOT IMPLY A BARRIER!
>  
>  Now, we move onto the atomic operation interfaces typically implemented with
> -the help of assembly code.
> +the help of assembly code. ::
>  
> -	void atomic_add(int i, atomic_t *v);
> -	void atomic_sub(int i, atomic_t *v);
> -	void atomic_inc(atomic_t *v);
> -	void atomic_dec(atomic_t *v);
> +    void atomic_add(int i, atomic_t *v);
> +    void atomic_sub(int i, atomic_t *v);
> +    void atomic_inc(atomic_t *v);
> +    void atomic_dec(atomic_t *v);
>  
>  These four routines add and subtract integral values to/from the given
> -atomic_t value.  The first two routines pass explicit integers by
> +``atomic_t`` value.  The first two routines pass explicit integers by
>  which to make the adjustment, whereas the latter two use an implicit
>  adjustment value of "1".
>  
> -One very important aspect of these two routines is that they DO NOT
> +One very important aspect of these two routines is that they **DO NOT**
>  require any explicit memory barriers.  They need only perform the
> -atomic_t counter update in an SMP safe manner.
> +``atomic_t`` counter update in an SMP safe manner.
>  
> -Next, we have:
> +Next, we have::
>  
> -	int atomic_inc_return(atomic_t *v);
> -	int atomic_dec_return(atomic_t *v);
> +    int atomic_inc_return(atomic_t *v);
> +    int atomic_dec_return(atomic_t *v);
>  
>  These routines add 1 and subtract 1, respectively, from the given
> -atomic_t and return the new counter value after the operation is
> +``atomic_t`` and return the new counter value after the operation is
>  performed.
>  
>  Unlike the above routines, it is required that these primitives
> @@ -207,317 +215,330 @@ the operation.  It must be done such that all memory operations before
>  and after the atomic operation calls are strongly ordered with respect
>  to the atomic operation itself.
>  
> -For example, it should behave as if a smp_mb() call existed both
> +For example, it should behave as if a ``smp_mb()`` call existed both
>  before and after the atomic operation.
>  
>  If the atomic instructions used in an implementation provide explicit
>  memory barrier semantics which satisfy the above requirements, that is
>  fine as well.
>  
> -Let's move on:
> +Let's move on::
>  
> -	int atomic_add_return(int i, atomic_t *v);
> -	int atomic_sub_return(int i, atomic_t *v);
> +    int atomic_add_return(int i, atomic_t *v);
> +    int atomic_sub_return(int i, atomic_t *v);
>  
> -These behave just like atomic_{inc,dec}_return() except that an
> +These behave just like ``atomic_{inc,dec}_return()`` except that an
>  explicit counter adjustment is given instead of the implicit "1".
> -This means that like atomic_{inc,dec}_return(), the memory barrier
> +This means that like ``atomic_{inc,dec}_return()``, the memory barrier
>  semantics are required.
>  
> -Next:
> +Next::
>  
> -	int atomic_inc_and_test(atomic_t *v);
> -	int atomic_dec_and_test(atomic_t *v);
> +    int atomic_inc_and_test(atomic_t *v);
> +    int atomic_dec_and_test(atomic_t *v);
>  
>  These two routines increment and decrement by 1, respectively, the
>  given atomic counter.  They return a boolean indicating whether the
>  resulting counter value was zero or not.
>  
>  Again, these primitives provide explicit memory barrier semantics around
> -the atomic operation.
> +the atomic operation. ::

Better to remove the dot here:

	the atomic operation::

And the same on other places.
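(For reference, the reason is how ReST treats the marker: a "::" preceded by
whitespace is stripped entirely from the rendered output, while a "::"
attached to the preceding word is rendered as a single colon. So the
suggested form:

	the atomic operation::

	    int atomic_sub_and_test(int i, atomic_t *v);

renders as "the atomic operation:" followed by the literal block, whereas
the ". ::" variant renders "the atomic operation." with no colon at all.)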


>  
> -	int atomic_sub_and_test(int i, atomic_t *v);
> +    int atomic_sub_and_test(int i, atomic_t *v);
>  
> -This is identical to atomic_dec_and_test() except that an explicit
> +This is identical to ``atomic_dec_and_test()`` except that an explicit
>  decrement is given instead of the implicit "1".  This primitive must
> -provide explicit memory barrier semantics around the operation.
> +provide explicit memory barrier semantics around the operation. ::
>  
> -	int atomic_add_negative(int i, atomic_t *v);
> +    int atomic_add_negative(int i, atomic_t *v);
>  
>  The given increment is added to the given atomic counter value.  A boolean
>  is returned which indicates whether the resulting counter value is negative.
>  This primitive must provide explicit memory barrier semantics around
>  the operation.
>  
> -Then:
> +Then::
>  
> -	int atomic_xchg(atomic_t *v, int new);
> +    int atomic_xchg(atomic_t *v, int new);
>  
>  This performs an atomic exchange operation on the atomic variable v, setting
>  the given new value.  It returns the old value that the atomic variable v had
>  just before the operation.
>  
> -atomic_xchg must provide explicit memory barriers around the operation.
> +``atomic_xchg`` must provide explicit memory barriers around the operation. ::
>  
> -	int atomic_cmpxchg(atomic_t *v, int old, int new);
> +    int atomic_cmpxchg(atomic_t *v, int old, int new);
>  
>  This performs an atomic compare exchange operation on the atomic value v,
> -with the given old and new values. Like all atomic_xxx operations,
> -atomic_cmpxchg will only satisfy its atomicity semantics as long as all
> -other accesses of *v are performed through atomic_xxx operations.
> +with the given old and new values. Like all ``atomic_*`` operations,
> +``atomic_cmpxchg`` will only satisfy its atomicity semantics as long as all
> +other accesses of ``*v`` are performed through ``atomic_*`` operations.
>  
> -atomic_cmpxchg must provide explicit memory barriers around the operation,
> +``atomic_cmpxchg`` must provide explicit memory barriers around the operation,
>  although if the comparison fails then no memory ordering guarantees are
>  required.
>  
> -The semantics for atomic_cmpxchg are the same as those defined for 'cas'
> +The semantics for ``atomic_cmpxchg`` are the same as those defined for 'cas'
>  below.
>  
> -Finally:
> +Finally::
>  
> -	int atomic_add_unless(atomic_t *v, int a, int u);
> +    int atomic_add_unless(atomic_t *v, int a, int u);
>  
>  If the atomic value v is not equal to u, this function adds a to v, and
>  returns non zero. If v is equal to u then it returns zero. This is done as
>  an atomic operation.
>  
> -atomic_add_unless must provide explicit memory barriers around the
> +``atomic_add_unless`` must provide explicit memory barriers around the
>  operation unless it fails (returns 0).
>  
> -atomic_inc_not_zero, equivalent to atomic_add_unless(v, 1, 0)
> +``atomic_inc_not_zero``, equivalent to ``atomic_add_unless(v, 1, 0)``
>  
>  
> -If a caller requires memory barrier semantics around an atomic_t
> +If a caller requires memory barrier semantics around an ``atomic_t``
>  operation which does not return a value, a set of interfaces are
> -defined which accomplish this:
> +defined which accomplish this::
>  
> -	void smp_mb__before_atomic(void);
> -	void smp_mb__after_atomic(void);
> +    void smp_mb__before_atomic(void);
> +    void smp_mb__after_atomic(void);
>  
> -For example, smp_mb__before_atomic() can be used like so:
> +For example, ``smp_mb__before_atomic()`` can be used like so::
>  
> -	obj->dead = 1;
> -	smp_mb__before_atomic();
> -	atomic_dec(&obj->ref_count);
> +    obj->dead = 1;
> +    smp_mb__before_atomic();
> +    atomic_dec(&obj->ref_count);
>  
> -It makes sure that all memory operations preceding the atomic_dec()
> +It makes sure that all memory operations preceding the ``atomic_dec()``
>  call are strongly ordered with respect to the atomic counter
>  operation.  In the above example, it guarantees that the assignment of
> -"1" to obj->dead will be globally visible to other cpus before the
> +"1" to ``obj->dead`` will be globally visible to other cpus before the
>  atomic counter decrement.
>  
> -Without the explicit smp_mb__before_atomic() call, the
> +Without the explicit ``smp_mb__before_atomic()`` call, the
>  implementation could legally allow the atomic counter update visible
> -to other cpus before the "obj->dead = 1;" assignment.
> +to other cpus before the ``obj->dead = 1;`` assignment.
>  
>  A missing memory barrier in the cases where they are required by the
> -atomic_t implementation above can have disastrous results.  Here is
> +``atomic_t`` implementation above can have disastrous results.  Here is
>  an example, which follows a pattern occurring frequently in the Linux
>  kernel.  It is the use of atomic counters to implement reference
>  counting, and it works such that once the counter falls to zero it can
>  be guaranteed that no other entity can be accessing the object:
>  
> -static void obj_list_add(struct obj *obj, struct list_head *head)
> -{
> -	obj->active = 1;
> -	list_add(&obj->list, head);
> -}
> -
> -static void obj_list_del(struct obj *obj)
> -{
> -	list_del(&obj->list);
> -	obj->active = 0;
> -}
> -
> -static void obj_destroy(struct obj *obj)
> -{
> -	BUG_ON(obj->active);
> -	kfree(obj);
> -}
> -
> -struct obj *obj_list_peek(struct list_head *head)
> -{
> -	if (!list_empty(head)) {
> -		struct obj *obj;
> -
> -		obj = list_entry(head->next, struct obj, list);
> -		atomic_inc(&obj->refcnt);
> -		return obj;
> -	}
> -	return NULL;
> -}
> -
> -void obj_poke(void)
> -{
> -	struct obj *obj;
> -
> -	spin_lock(&global_list_lock);
> -	obj = obj_list_peek(&global_list);
> -	spin_unlock(&global_list_lock);
> -
> -	if (obj) {
> -		obj->ops->poke(obj);
> -		if (atomic_dec_and_test(&obj->refcnt))
> -			obj_destroy(obj);
> -	}
> -}
> -
> -void obj_timeout(struct obj *obj)
> -{
> -	spin_lock(&global_list_lock);
> -	obj_list_del(obj);
> -	spin_unlock(&global_list_lock);
> -
> -	if (atomic_dec_and_test(&obj->refcnt))
> -		obj_destroy(obj);
> -}
> -
> -(This is a simplification of the ARP queue management in the
> - generic neighbour discover code of the networking.  Olaf Kirch
> - found a bug wrt. memory barriers in kfree_skb() that exposed
> - the atomic_t memory barrier requirements quite clearly.)
> -
> -Given the above scheme, it must be the case that the obj->active
> +.. code-block:: c
> +
> +        static void obj_list_add(struct obj *obj, struct list_head *head)
> +        {
> +                obj->active = 1;
> +                list_add(&obj->list, head);
> +        }
> +
> +        static void obj_list_del(struct obj *obj)
> +        {
> +                list_del(&obj->list);
> +                obj->active = 0;
> +        }
> +
> +        static void obj_destroy(struct obj *obj)
> +        {
> +                BUG_ON(obj->active);
> +                kfree(obj);
> +        }
> +
> +        struct obj *obj_list_peek(struct list_head *head)
> +        {
> +                if (!list_empty(head)) {
> +                        struct obj *obj;
> +
> +                        obj = list_entry(head->next, struct obj, list);
> +                        atomic_inc(&obj->refcnt);
> +                        return obj;
> +                }
> +                return NULL;
> +        }
> +
> +        void obj_poke(void)
> +        {
> +                struct obj *obj;
> +
> +                spin_lock(&global_list_lock);
> +                obj = obj_list_peek(&global_list);
> +                spin_unlock(&global_list_lock);
> +
> +                if (obj) {
> +                        obj->ops->poke(obj);
> +                        if (atomic_dec_and_test(&obj->refcnt))
> +                                obj_destroy(obj);
> +                }
> +        }
> +
> +        void obj_timeout(struct obj *obj)
> +        {
> +                spin_lock(&global_list_lock);
> +                obj_list_del(obj);
> +                spin_unlock(&global_list_lock);
> +
> +                if (atomic_dec_and_test(&obj->refcnt))
> +                        obj_destroy(obj);
> +        }
> +
> +.. note::
> +
> +    This is a simplification of the ARP queue management in the
> +    generic neighbour discovery code of the networking subsystem.  Olaf Kirch
> +    found a bug wrt. memory barriers in ``kfree_skb()`` that exposed
> +    the ``atomic_t`` memory barrier requirements quite clearly.
> +
> +Given the above scheme, it must be the case that the ``obj->active``
>  update done by the obj list deletion be visible to other processors
>  before the atomic counter decrement is performed.
>  
> -Otherwise, the counter could fall to zero, yet obj->active would still
> -be set, thus triggering the assertion in obj_destroy().  The error
> -sequence looks like this:
> -
> -	cpu 0				cpu 1
> -	obj_poke()			obj_timeout()
> -	obj = obj_list_peek();
> -	... gains ref to obj, refcnt=2
> -					obj_list_del(obj);
> -					obj->active = 0 ...
> -					... visibility delayed ...
> -					atomic_dec_and_test()
> -					... refcnt drops to 1 ...
> -	atomic_dec_and_test()
> -	... refcount drops to 0 ...
> -	obj_destroy()
> -	BUG() triggers since obj->active
> -	still seen as one
> -					obj->active update visibility occurs
> +Otherwise, the counter could fall to zero, yet ``obj->active`` would still
> +be set, thus triggering the assertion in ``obj_destroy()``.  The error
> +sequence looks like this::
> +
> +        cpu 0                           cpu 1
> +        obj_poke()                      obj_timeout()
> +        obj = obj_list_peek();
> +        ... gains ref to obj, refcnt=2
> +                                        obj_list_del(obj);
> +                                        obj->active = 0 ...
> +                                        ... visibility delayed ...
> +                                        atomic_dec_and_test()
> +                                        ... refcnt drops to 1 ...
> +        atomic_dec_and_test()
> +        ... refcount drops to 0 ...
> +        obj_destroy()
> +        BUG() triggers since obj->active
> +        still seen as one
> +                                        obj->active update visibility occurs
>  
>  With the memory barrier semantics required of the atomic_t operations
>  which return values, the above sequence of memory visibility can never
> -happen.  Specifically, in the above case the atomic_dec_and_test()
> +happen.  Specifically, in the above case the ``atomic_dec_and_test()``
>  counter decrement would not become globally visible until the
> -obj->active update does.
> +``obj->active`` update does.
>  
>  As a historical note, 32-bit Sparc used to only allow usage of
> -24-bits of its atomic_t type.  This was because it used 8 bits
> +24-bits of its ``atomic_t`` type.  This was because it used 8 bits
>  as a spinlock for SMP safety.  Sparc32 lacked a "compare and swap"
>  type instruction.  However, 32-bit Sparc has since been moved over
>  to a "hash table of spinlocks" scheme, that allows the full 32-bit
>  counter to be realized.  Essentially, an array of spinlocks are
> -indexed into based upon the address of the atomic_t being operated
> +indexed into based upon the address of the ``atomic_t`` being operated
>  on, and that lock protects the atomic operation.  Parisc uses the
>  same scheme.
>  
> -Another note is that the atomic_t operations returning values are
> +Another note is that the ``atomic_t`` operations returning values are
>  extremely slow on an old 386.
>  
> +Atomic Bitmask
> +==============
> +
>  We will now cover the atomic bitmask operations.  You will find that
>  their SMP and memory barrier semantics are similar in shape and scope
> -to the atomic_t ops above.
> +to the ``atomic_t`` ops above.
>  
>  Native atomic bit operations are defined to operate on objects aligned
> -to the size of an "unsigned long" C data type, and are least of that
> -size.  The endianness of the bits within each "unsigned long" are the
> -native endianness of the cpu.
> +to the size of an ``unsigned long`` C data type, and are at least of that
> +size.  The endianness of the bits within each ``unsigned long`` is the
> +native endianness of the cpu. ::
>  
> -	void set_bit(unsigned long nr, volatile unsigned long *addr);
> -	void clear_bit(unsigned long nr, volatile unsigned long *addr);
> -	void change_bit(unsigned long nr, volatile unsigned long *addr);
> +    void set_bit(unsigned long nr, volatile unsigned long *addr);
> +    void clear_bit(unsigned long nr, volatile unsigned long *addr);
> +    void change_bit(unsigned long nr, volatile unsigned long *addr);
>  
>  These routines set, clear, and change, respectively, the bit number
> -indicated by "nr" on the bit mask pointed to by "ADDR".
> +indicated by ``nr`` on the bit mask pointed to by ``addr``.
>  
>  They must execute atomically, yet there are no implicit memory barrier
> -semantics required of these interfaces.
> +semantics required of these interfaces. ::
>  
> -	int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
> -	int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
> -	int test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
> +    int test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
> +    int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
> +    int test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
>  
>  Like the above, except that these routines return a boolean which
>  indicates whether the changed bit was set _BEFORE_ the atomic bit
>  operation.
>  
> -WARNING! It is incredibly important that the value be a boolean,
> -ie. "0" or "1".  Do not try to be fancy and save a few instructions by
> -declaring the above to return "long" and just returning something like
> -"old_val & mask" because that will not work.
> +.. warning::
>  
> -For one thing, this return value gets truncated to int in many code
> -paths using these interfaces, so on 64-bit if the bit is set in the
> -upper 32-bits then testers will never see that.
> +    It is incredibly important that the value be a boolean,
> +    ie. "0" or "1".  Do not try to be fancy and save a few instructions by
> +    declaring the above to return "long" and just returning something like
> +    "old_val & mask" because that will not work.
>  
> -One great example of where this problem crops up are the thread_info
> -flag operations.  Routines such as test_and_set_ti_thread_flag() chop
> -the return value into an int.  There are other places where things
> -like this occur as well.
> +    For one thing, this return value gets truncated to int in many code
> +    paths using these interfaces, so on 64-bit if the bit is set in the
> +    upper 32-bits then testers will never see that.
>  
> -These routines, like the atomic_t counter operations returning values,
> +    One great example of where this problem crops up are the ``thread_info``
> +    flag operations.  Routines such as ``test_and_set_ti_thread_flag()`` chop
> +    the return value into an int.  There are other places where things
> +    like this occur as well.
> +
> +These routines, like the ``atomic_t`` counter operations returning values,
>  must provide explicit memory barrier semantics around their execution.
>  All memory operations before the atomic bit operation call must be
>  made visible globally before the atomic bit operation is made visible.
>  Likewise, the atomic bit operation must be visible globally before any
>  subsequent memory operation is made visible.  For example:
>  
> -	obj->dead = 1;
> -	if (test_and_set_bit(0, &obj->flags))
> -		/* ... */;
> -	obj->killed = 1;
> +.. code-block:: c

Instead of using code-block, better to do: 

	subsequent memory operation is made visible.  For example::

> +
> +    obj->dead = 1;
> +    if (test_and_set_bit(0, &obj->flags))
> +            /* ... */;
> +    obj->killed = 1;
>  
> -The implementation of test_and_set_bit() must guarantee that
> -"obj->dead = 1;" is visible to cpus before the atomic memory operation
> -done by test_and_set_bit() becomes visible.  Likewise, the atomic
> -memory operation done by test_and_set_bit() must become visible before
> -"obj->killed = 1;" is visible.
> +The implementation of ``test_and_set_bit()`` must guarantee that
> +``obj->dead = 1;`` is visible to cpus before the atomic memory operation
> +done by ``test_and_set_bit()`` becomes visible.  Likewise, the atomic
> +memory operation done by ``test_and_set_bit()`` must become visible before
> +``obj->killed = 1;`` is visible.
>  
> -Finally there is the basic operation:
> +Finally there is the basic operation::
>  
> -	int test_bit(unsigned long nr, __const__ volatile unsigned long *addr);
> +    int test_bit(unsigned long nr, __const__ volatile unsigned long *addr);
>  
> -Which returns a boolean indicating if bit "nr" is set in the bitmask
> -pointed to by "addr".
> +Which returns a boolean indicating if bit ``nr`` is set in the bitmask
> +pointed to by ``addr``.
>  
> -If explicit memory barriers are required around {set,clear}_bit() (which do
> +If explicit memory barriers are required around ``{set,clear}_bit()`` (which do
>  not return a value, and thus does not need to provide memory barrier
> -semantics), two interfaces are provided:
> +semantics), two interfaces are provided::
>  
> -	void smp_mb__before_atomic(void);
> -	void smp_mb__after_atomic(void);
> +    void smp_mb__before_atomic(void);
> +    void smp_mb__after_atomic(void);
>  
> -They are used as follows, and are akin to their atomic_t operation
> +They are used as follows, and are akin to their ``atomic_t`` operation
>  brothers:
>  
> -	/* All memory operations before this call will
> -	 * be globally visible before the clear_bit().
> -	 */
> -	smp_mb__before_atomic();
> -	clear_bit( ... );
> +.. code-block:: c
>  
> -	/* The clear_bit() will be visible before all
> -	 * subsequent memory operations.
> -	 */
> -	 smp_mb__after_atomic();
> +    /* All memory operations before this call will
> +     * be globally visible before the clear_bit().
> +     */
> +    smp_mb__before_atomic();
> +    clear_bit( ... );
> +
> +    /* The clear_bit() will be visible before all
> +     * subsequent memory operations.
> +     */
> +     smp_mb__after_atomic();
>  
>  There are two special bitops with lock barrier semantics (acquire/release,
>  same as spinlocks). These operate in the same way as their non-_lock/unlock
>  postfixed variants, except that they are to provide acquire/release semantics,
> -respectively. This means they can be used for bit_spin_trylock and
> -bit_spin_unlock type operations without specifying any more barriers.
> +respectively. This means they can be used for ``bit_spin_trylock`` and
> +``bit_spin_unlock`` type operations without specifying any more barriers. ::
>  
> -	int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
> -	void clear_bit_unlock(unsigned long nr, unsigned long *addr);
> -	void __clear_bit_unlock(unsigned long nr, unsigned long *addr);
> +    int test_and_set_bit_lock(unsigned long nr, unsigned long *addr);
> +    void clear_bit_unlock(unsigned long nr, unsigned long *addr);
> +    void __clear_bit_unlock(unsigned long nr, unsigned long *addr);
>  
> -The __clear_bit_unlock version is non-atomic, however it still implements
> +The ``__clear_bit_unlock`` version is non-atomic, however it still implements
>  unlock barrier semantics. This can be useful if the lock itself is protecting
>  the other bits in the word.
>  
> @@ -526,41 +547,43 @@ provided.  They are used in contexts where some other higher-level SMP
>  locking scheme is being used to protect the bitmask, and thus less
>  expensive non-atomic operations may be used in the implementation.
>  They have names similar to the above bitmask operation interfaces,
> -except that two underscores are prefixed to the interface name.
> +except that two underscores are prefixed to the interface name. ::
>  
> -	void __set_bit(unsigned long nr, volatile unsigned long *addr);
> -	void __clear_bit(unsigned long nr, volatile unsigned long *addr);
> -	void __change_bit(unsigned long nr, volatile unsigned long *addr);
> -	int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
> -	int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
> -	int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
> +    void __set_bit(unsigned long nr, volatile unsigned long *addr);
> +    void __clear_bit(unsigned long nr, volatile unsigned long *addr);
> +    void __change_bit(unsigned long nr, volatile unsigned long *addr);
> +    int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr);
> +    int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr);
> +    int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr);
>  
>  These non-atomic variants also do not require any special memory
>  barrier semantics.
>  
> -The routines xchg() and cmpxchg() must provide the same exact
> +The routines ``xchg()`` and ``cmpxchg()`` must provide the same exact
>  memory-barrier semantics as the atomic and bit operations returning
>  values.
>  
> -Note: If someone wants to use xchg(), cmpxchg() and their variants,
> -linux/atomic.h should be included rather than asm/cmpxchg.h, unless
> -the code is in arch/* and can take care of itself.
> +.. note::
> +
> +    If someone wants to use ``xchg()``, ``cmpxchg()`` and their variants,
> +    ``linux/atomic.h`` should be included rather than ``asm/cmpxchg.h``, unless
> +    the code is in arch/* and can take care of itself.
>  
>  Spinlocks and rwlocks have memory barrier expectations as well.
>  The rule to follow is simple:
>  
> -1) When acquiring a lock, the implementation must make it globally
> +1. When acquiring a lock, the implementation must make it globally
>     visible before any subsequent memory operation.
>  
> -2) When releasing a lock, the implementation must make it such that
> +2. When releasing a lock, the implementation must make it such that
>     all previous memory operations are globally visible before the
>     lock release.
>  
> -Which finally brings us to _atomic_dec_and_lock().  There is an
> -architecture-neutral version implemented in lib/dec_and_lock.c,
> -but most platforms will wish to optimize this in assembler.
> +Which finally brings us to ``_atomic_dec_and_lock()``.  There is an
> +architecture-neutral version implemented in ``lib/dec_and_lock.c``,
> +but most platforms will wish to optimize this in assembler. ::
>  
> -	int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);
> +    int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock);
>  
>  Atomically decrement the given counter, and if will drop to zero
>  atomically acquire the given spinlock and perform the decrement
> @@ -573,68 +596,74 @@ sure the spinlock operation is globally visible before any
>  subsequent memory operation.
>  
>  We can demonstrate this operation more clearly if we define
> -an abstract atomic operation:
> +an abstract atomic operation::
>  
> -	long cas(long *mem, long old, long new);
> +    long cas(long *mem, long old, long new);
>  
> -"cas" stands for "compare and swap".  It atomically:
> +"``cas``" stands for "compare and swap".  It atomically:
>  
> -1) Compares "old" with the value currently at "mem".
> -2) If they are equal, "new" is written to "mem".
> -3) Regardless, the current value at "mem" is returned.
> +1. Compares "old" with the value currently at "mem".
> +2. If they are equal, "new" is written to "mem".
> +3. Regardless, the current value at "mem" is returned.
>  
>  As an example usage, here is what an atomic counter update
>  might look like:
>  
> -void example_atomic_inc(long *counter)
> -{
> -	long old, new, ret;
> +.. code-block:: c

Same here. Use :: instead of code-block.

>  
> -	while (1) {
> -		old = *counter;
> -		new = old + 1;
> +    void example_atomic_inc(long *counter)
> +    {
> +            long old, new, ret;
>  
> -		ret = cas(counter, old, new);
> -		if (ret == old)
> -			break;
> -	}
> -}
> +            while (1) {
> +                    old = *counter;
> +                    new = old + 1;
> +
> +                    ret = cas(counter, old, new);
> +                    if (ret == old)
> +                            break;
> +            }
> +    }
>  
>  Let's use cas() in order to build a pseudo-C atomic_dec_and_lock():
>  
> -int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
> -{
> -	long old, new, ret;
> -	int went_to_zero;
> -
> -	went_to_zero = 0;
> -	while (1) {
> -		old = atomic_read(atomic);
> -		new = old - 1;
> -		if (new == 0) {
> -			went_to_zero = 1;
> -			spin_lock(lock);
> -		}
> -		ret = cas(atomic, old, new);
> -		if (ret == old)
> -			break;
> -		if (went_to_zero) {
> -			spin_unlock(lock);
> -			went_to_zero = 0;
> -		}
> -	}
> -
> -	return went_to_zero;
> -}
> -
> -Now, as far as memory barriers go, as long as spin_lock()
> +.. code-block:: c

And here

> +
> +    int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
> +    {
> +            long old, new, ret;
> +            int went_to_zero;
> +
> +            went_to_zero = 0;
> +            while (1) {
> +                    old = atomic_read(atomic);
> +                    new = old - 1;
> +                    if (new == 0) {
> +                            went_to_zero = 1;
> +                            spin_lock(lock);
> +                    }
> +                    ret = cas(atomic, old, new);
> +                    if (ret == old)
> +                            break;
> +                    if (went_to_zero) {
> +                            spin_unlock(lock);
> +                            went_to_zero = 0;
> +                    }
> +            }
> +
> +            return went_to_zero;
> +    }
> +
> +Now, as far as memory barriers go, as long as ``spin_lock()``
>  strictly orders all subsequent memory operations (including
> -the cas()) with respect to itself, things will be fine.
> +the ``cas()``) with respect to itself, things will be fine.
>  
> -Said another way, _atomic_dec_and_lock() must guarantee that
> +Said another way, ``_atomic_dec_and_lock()`` must guarantee that
>  a counter dropping to zero is never made visible before the
>  spinlock being acquired.
>  
> -Note that this also means that for the case where the counter
> -is not dropping to zero, there are no memory ordering
> -requirements.
> +.. note::
> +
> +    This also means that for the case where the counter
> +    is not dropping to zero, there are no memory ordering
> +    requirements.
> diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
> index 480d9a3..f3e5f5e 100644
> --- a/Documentation/core-api/index.rst
> +++ b/Documentation/core-api/index.rst
> @@ -8,6 +8,7 @@ Kernel and driver related documentation.
>     :maxdepth: 1
>  
>     assoc_array
> +   atomic_ops
>     workqueue
>  
>  .. only::  subproject
> diff --git a/Documentation/process/volatile-considered-harmful.rst b/Documentation/process/volatile-considered-harmful.rst
> index e0d042a..4934e65 100644
> --- a/Documentation/process/volatile-considered-harmful.rst
> +++ b/Documentation/process/volatile-considered-harmful.rst
> @@ -1,3 +1,6 @@
> +
> +.. _volatile_considered_harmful:
> +
>  Why the "volatile" type class should not be used
>  ------------------------------------------------
>  



Thanks,
Mauro
