Date:	Fri, 26 Sep 2014 09:23:57 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	'Eric Dumazet' <eric.dumazet@...il.com>,
	Tom Herbert <therbert@...gle.com>
CC:	Jesper Dangaard Brouer <brouer@...hat.com>,
	Linux Netdev List <netdev@...r.kernel.org>,
	"David S. Miller" <davem@...emloft.net>,
	"Alexander Duyck" <alexander.h.duyck@...el.com>,
	Toke Høiland-Jørgensen <toke@...e.dk>,
	Florian Westphal <fw@...len.de>,
	Jamal Hadi Salim <jhs@...atatu.com>,
	Dave Taht <dave.taht@...il.com>,
	John Fastabend <john.r.fastabend@...el.com>,
	"Daniel Borkmann" <dborkman@...hat.com>,
	Hannes Frederic Sowa <hannes@...essinduktion.org>
Subject: RE: [net-next PATCH 1/1 V4] qdisc: bulk dequeue support for qdiscs
 with TCQ_F_ONETXQUEUE

From: Eric Dumazet
> On Wed, 2014-09-24 at 19:12 -0700, Eric Dumazet wrote:
...
> It turned out the problem I noticed was caused by the compiler trying to be
> smart, which resulted in a bad MESI transaction.
> 
>   0.05   mov    0xc0(%rax),%edi    // LOAD dql->num_queued
>   0.48   mov    %edx,0xc8(%rax)    // STORE dql->last_obj_cnt = count
>  58.23   add    %edx,%edi
>   0.58   cmp    %edi,0xc4(%rax)
>   0.76   mov    %edi,0xc0(%rax)    // STORE dql->num_queued += count
>   0.72   js     bd8
> 
> 
> I get an incredible 10% gain by making sure the cpu won't get the cache line
> in Shared mode.

That is a stunning difference between requesting 'exclusive' access
and upgrading 'shared' to exclusive.
Stinks of a cpu bug?

Or is the reported stall a side effect of waiting for the earlier
'cache line read' to complete before the 'upgrade to exclusive' can be issued?
If so, gcc's instruction scheduler probably needs to be taught
to schedule writes before reads.
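
Something like the below in dql_queued() would make the write-first ordering
explicit rather than relying on the compiler (untested sketch, just to
illustrate the idea - the barrier() is only there to stop the load of
num_queued being hoisted above the store):

	static inline void dql_queued(struct dql *dql, unsigned int count)
	{
		BUG_ON(count > DQL_MAX_OBJECT);

		/* Store first so the cache line is requested for ownership
		 * (Exclusive) instead of being read Shared and then upgraded. */
		dql->last_obj_cnt = count;

		/* Compiler barrier: keep the num_queued load below the store. */
		barrier();

		dql->num_queued += count;
	}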

> (I also tried a barrier() in netdev_tx_sent_queue() between the
> 
> 	dql_queued(&dev_queue->dql, bytes);
> -
> +	barrier();
>  	if (likely(dql_avail(&dev_queue->dql) >= 0))
> 
> But the following patch seems cleaner:
> 
> diff --git a/include/linux/dynamic_queue_limits.h b/include/linux/dynamic_queue_limits.h
> index 5621547d631b..978fbe332090 100644
> --- a/include/linux/dynamic_queue_limits.h
> +++ b/include/linux/dynamic_queue_limits.h
> @@ -80,7 +80,7 @@ static inline void dql_queued(struct dql *dql, unsigned int count)
>  /* Returns how many objects can be queued, < 0 indicates over limit. */
>  static inline int dql_avail(const struct dql *dql)
>  {
> -	return dql->adj_limit - dql->num_queued;
> +	return ACCESS_ONCE(dql->adj_limit) - ACCESS_ONCE(dql->num_queued);

Dunno, that could have an impact on other callers where the values
are already in registers - I suspect ACCESS_ONCE() forces a memory access each time.
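
(FWIW, if I remember include/linux/compiler.h correctly, ACCESS_ONCE() is
just the volatile cast:

	#define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))

so the compiler has to emit a fresh load each time, even when the value is
already live in a register.)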

	David
