[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1414610868.2420.52.camel@jtkirshe-mobl>
Date: Wed, 29 Oct 2014 12:27:48 -0700
From: Jeff Kirsher <jeffrey.t.kirsher@...el.com>
To: Alexander Duyck <alexander.h.duyck@...hat.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
netdev <netdev@...r.kernel.org>
Subject: Re: [RFC] use smp_load_acquire()/smp_store_release()
On Wed, 2014-10-29 at 09:16 -0700, Alexander Duyck wrote:
> On 10/29/2014 07:49 AM, Eric Dumazet wrote:
> > Hi Alexander
> >
> > The memory barriers added in commit
> > b37c0fbe3f6dfba1f8ad2aed47fb40578a254635
> > ("net: Add memory barriers to prevent possible race in byte queue
> > limits")
> >
> > have heavy cost.
> >
> > It seems we could use smp_load_acquire() and smp_store_release()
> > instead ?
> >
> > I'll post a patch later today. I would be interested if someone was able
> > to test it, as your commit apparently was tested and known to fix a
> > reproducible race.
> >
> > Thanks !
Eric- just CC me on the patch you post and I will see what I can do
about getting validation eyes on it.
>
> Unfortunately Stephen left Intel before I did, so we will need to find
> someone else in the validation team to test this if possible. I have
> added Jeff to the CC so that he can give the appropriate validation
> people a heads up that this patch might be coming.
>
> As I recall what was seen was random Tx hangs on systems with the
> original BQL code when interfaces were stressed. It has been a while so
> I don't recall the exact set-up for all of it. Also some less
> used/tested architectures such as PowerPC can be more susceptible to
> synchronization issues such as these as the memory model is more weakly
> ordered.
>
> I'm wondering where you are seeing the barrier show up? In
> netdev_tx_send_queue you should only hit the barrier if you actually are
> triggering the XOFF condition, and in netdev_tx_completed_queue the
> barrier should be coalesced in amongst a number of frames reducing the cost.
>
> My concern with this would be that we are actually syncronizing multiple
> things, the __QUEUE_STATE_STACK_XOFF flag, dql->adj_limit, and
> dql->num_queued, and we might be trading off reducing the cost on x86 to
> result in it being increased on other architectures as we may have to
> actually add additional synchronization as I suspect we would need to
> use acquire/release on both adj_limit and num_queued.
>
> Thanks,
>
> Alex
>
Download attachment "signature.asc" of type "application/pgp-signature" (820 bytes)
Powered by blists - more mailing lists