linux-kernel - dataring_push() barriers Re: [RFC PATCH v4 1/9] printk-rb: add a new printk ringbuffer implementation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190820135004.7vatbrzphfsgsnw2@pathway.suse.cz>
Date:   Tue, 20 Aug 2019 15:50:04 +0200
From:   Petr Mladek <pmladek@...e.com>
To:     John Ogness <john.ogness@...utronix.de>
Cc:     linux-kernel@...r.kernel.org,
        Andrea Parri <andrea.parri@...rulasolutions.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Sergey Senozhatsky <sergey.senozhatsky.work@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Brendan Higgins <brendanhiggins@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: dataring_push() barriers Re: [RFC PATCH v4 1/9] printk-rb: add a new
 printk ringbuffer implementation

On Thu 2019-08-08 00:32:26, John Ogness wrote:
> +/**
> + * dataring_push() - Reserve a data block in the data array.
> + *
> + * @dr:   The data ringbuffer to reserve data in.
> + *
> + * @size: The size to reserve.
> + *
> + * @desc: A pointer to a descriptor to store the data block information.
> + *
> + * @id:   The ID of the descriptor to be associated.
> + *        The data block will not be set with @id, but rather initialized with
> + *        a value that is explicitly different than @id. This is to handle the
> + *        case when newly available garbage by chance matches the descriptor
> + *        ID.
> + *
> + * This function expects to move the head pointer forward. If this would
> + * result in overtaking the data array index of the tail, the tail data block
> + * will be invalidated.
> + *
> + * Return: A pointer to the reserved writer data, otherwise NULL.
> + *
> + * This will only fail if it was not possible to invalidate the tail data
> + * block.
> + */
> +char *dataring_push(struct dataring *dr, unsigned int size,
> +		    struct dr_desc *desc, unsigned long id)
> +{
> +	unsigned long begin_lpos;
> +	unsigned long next_lpos;
> +	struct dr_datablock *db;
> +	bool ret;
> +
> +	to_db_size(&size);
> +
> +	do {
> +		/* fA: */
> +		ret = get_new_lpos(dr, size, &begin_lpos, &next_lpos);
> +
> +		/*
> +		 * fB:
> +		 *
> +		 * The data ringbuffer tail may have been pushed (by this or
> +		 * any other task). The updated @tail_lpos must be visible to
> +		 * all observers before changes to @begin_lpos, @next_lpos, or
> +		 * @head_lpos by this task are visible in order to allow other
> +		 * tasks to recognize the invalidation of the data
> +		 * blocks.

This sounds strange. The write barrier should be done only on CPU
that really modified tail_lpos. I.e. it should be in _dataring_pop()
after successful dr->tail_lpos modification.

> +		 * This pairs with the smp_rmb() in _dataring_pop() as well as
> +		 * any reader task using smp_rmb() to post-validate data that
> +		 * has been read from a data block.
> +
> +		 * Memory barrier involvement:
> +		 *
> +		 * If dE reads from fE, then dI reads from fA->eA.
> +		 * If dC reads from fG, then dI reads from fA->eA.
> +		 * If dD reads from fH, then dI reads from fA->eA.
> +		 * If mC reads from fH, then mF reads from fA->eA.
> +		 *
> +		 * Relies on:
> +		 *
> +		 * FULL MB between fA->eA and fE
> +		 *    matching
> +		 * RMB between dE and dI
> +		 *
> +		 * FULL MB between fA->eA and fG
> +		 *    matching
> +		 * RMB between dC and dI
> +		 *
> +		 * FULL MB between fA->eA and fH
> +		 *    matching
> +		 * RMB between dD and dI
> +		 *
> +		 * FULL MB between fA->eA and fH
> +		 *    matching
> +		 * RMB between mC and mF
> +		 */
> +		smp_mb();

All these comments talk about sychronization against read barriers.
It means that we would need a write barrier here. But it does
not make much sense to do write barrier before actually
writing dr->head_lpos.

After all I think that we do not need any barrier here.
The write barrier for dr->tail_lpos should be in
_dataring_pop(). The read barrier is not needed because
we are not reading anything here.

Instead we should put a barrier after modyfying dr->head_lpos,
see below.

> +		if (!ret) {
> +			/*
> +			 * Force @desc permanently invalid to minimize risk
> +			 * of the descriptor later unexpectedly being
> +			 * determined as valid due to overflowing/wrapping of
> +			 * @head_lpos. An unaligned @begin_lpos can never
> +			 * point to a data block and having the same value
> +			 * for @begin_lpos and @next_lpos is also invalid.
> +			 */
> +
> +			/* fC: */
> +			WRITE_ONCE(desc->begin_lpos, 1);
> +
> +			/* fD: */
> +			WRITE_ONCE(desc->next_lpos, 1);
> +
> +			return NULL;
> +		}
> +	/* fE: */
> +	} while (atomic_long_cmpxchg_relaxed(&dr->head_lpos, begin_lpos,
> +					     next_lpos) != begin_lpos);
> +

We need a write barrier here to make sure that dr->head_lpos
is updated before we start updating other values, e.g.
db->id below.

Best Regards,
Petr

> +	db = to_datablock(dr, begin_lpos);
> +
> +	/*
> +	 * fF:
> +	 *
> +	 * @db->id is a garbage value and could possibly match the @id. This
> +	 * would be a problem because the data block would be considered
> +	 * valid before the writer has finished with it (i.e. before the
> +	 * writer has set @id). Force some other ID value.
> +	 */
> +	WRITE_ONCE(db->id, id - 1);
>
> +	/*
> +	 * fG:
> +	 *
> +	 * Ensure that @db->id is initialized to a wrong ID value before
> +	 * setting @begin_lpos so that there is no risk of accidentally
> +	 * matching a data block to a descriptor before the writer is finished
> +	 * with it (i.e. before the writer has set the correct @id). This
> +	 * pairs with the _acquire() in _dataring_pop().
> +	 *
> +	 * Memory barrier involvement:
> +	 *
> +	 * If dC reads from fG, then dF reads from fF.
> +	 *
> +	 * Relies on:
> +	 *
> +	 * RELEASE from fF to fG
> +	 *    matching
> +	 * ACQUIRE from dC to dF
> +	 */
> +	smp_store_release(&desc->begin_lpos, begin_lpos);
> +
> +	/* fH: */
> +	WRITE_ONCE(desc->next_lpos, next_lpos);
> +
> +	/* If this data block wraps, use @data from the content data block. */
> +	if (DATA_WRAPS(dr, begin_lpos) != DATA_WRAPS(dr, next_lpos))
> +		db = to_datablock(dr, 0);
> +
> +	return &db->data[0];
> +}