Message-Id: <1308009552.31900.80.camel@vi2.jf.intel.com>
Date:	Mon, 13 Jun 2011 16:59:12 -0700
From:	Vasu Dev <vasu.dev@...ux.intel.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Jeff Kirsher <jeffrey.t.kirsher@...el.com>, davem@...emloft.net,
	Vasu Dev <vasu.dev@...el.com>, netdev@...r.kernel.org,
	gospo@...hat.com
Subject: Re: [net-next 13/13] ixgbe: use per NUMA node lock for FCoE DDP

On Sat, 2011-06-11 at 07:42 +0200, Eric Dumazet wrote:
> On Saturday, 11 June 2011 at 07:18 +0200, Eric Dumazet wrote:
> > >  /**
> > > diff --git a/drivers/net/ixgbe/ixgbe_fcoe.h b/drivers/net/ixgbe/ixgbe_fcoe.h
> > > index d876e7a..8618892 100644
> > > --- a/drivers/net/ixgbe/ixgbe_fcoe.h
> > > +++ b/drivers/net/ixgbe/ixgbe_fcoe.h
> > > @@ -69,6 +69,7 @@ struct ixgbe_fcoe {
> > >  	struct pci_pool **pool;
> > >  	atomic_t refcnt;
> > >  	spinlock_t lock;
> > > +	struct spinlock **node_lock;
> > 
> > Won't this read_mostly pointer sit in an often-modified cache line?
> > 
> > >  	struct ixgbe_fcoe_ddp ddp[IXGBE_FCOE_DDP_MAX];
> > >  	unsigned char *extra_ddp_buffer;
> > >  	dma_addr_t extra_ddp_buffer_dma;
> > 
> 
> This patch seems overkill to me; have you tried the simpler approach I
> took in commit 79640a4ca6955e3ebdb7038508fa7a0cd7fa5527
> ("net: add additional lock to qdisc to increase throughput")?
> 

Conceptually this patch does the same as the patch you referred to. In your
case the added Qdisc busylock works fine because there are multiple tx
queues/Qdiscs, but FCoE uses a single ixgbe_fcoe instance, so this patch
has to set up a per-node lock similar to the Qdisc busylock. That setup is
the only additional code here; the rest is similar.
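
For illustration only, here is a minimal sketch of the per-node setup
described above; the helper names (ixgbe_fcoe_alloc_node_locks,
ixgbe_fcoe_node_lock) are hypothetical and the actual patch may differ:

static int ixgbe_fcoe_alloc_node_locks(struct ixgbe_fcoe *fcoe)
{
	int node;

	/* one lock pointer slot per possible node */
	fcoe->node_lock = kcalloc(nr_node_ids, sizeof(spinlock_t *),
				  GFP_KERNEL);
	if (!fcoe->node_lock)
		return -ENOMEM;

	/* allocate each lock from its own node's memory */
	for_each_online_node(node) {
		spinlock_t *lock;

		lock = kmalloc_node(sizeof(*lock), GFP_KERNEL, node);
		if (!lock)
			goto out_free;
		spin_lock_init(lock);
		fcoe->node_lock[node] = lock;
	}
	return 0;

out_free:
	/* kfree(NULL) is safe for nodes that never got a lock */
	for_each_online_node(node)
		kfree(fcoe->node_lock[node]);
	kfree(fcoe->node_lock);
	fcoe->node_lock = NULL;
	return -ENOMEM;
}

/* DDP fast path picks the lock that lives on the local node */
static inline spinlock_t *ixgbe_fcoe_node_lock(struct ixgbe_fcoe *fcoe)
{
	return fcoe->node_lock[numa_node_id()];
}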

> (remember you must place ->busylock in a separate cache line, so as not
> to slow down the two CPUs that access ->lock)
> 
> struct ixgbe_fcoe could probably be reordered more carefully to reduce
> false sharing
> 

I did optimize the ixgbe_fcoe layout for the added node_lock using pahole,
making sure it does not land on a frequently modified cache line and also
reducing holes, but moving it further down in the struct onto a new
cacheline boundary is even better, and I'll do that in an updated patch.
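
Purely as an illustrative layout (not the final ordering; the field names
come from the existing struct quoted above), the read-mostly pointer could
be pushed onto its own cacheline boundary so it never shares a line with
the contended lock and refcount:

struct ixgbe_fcoe {
	struct pci_pool **pool;
	atomic_t refcnt;
	spinlock_t lock;
	struct ixgbe_fcoe_ddp ddp[IXGBE_FCOE_DDP_MAX];
	unsigned char *extra_ddp_buffer;
	dma_addr_t extra_ddp_buffer_dma;

	/*
	 * read-mostly after setup: start a new cache line so CPUs
	 * bouncing ->lock do not invalidate the line holding the
	 * pointer used to find their per-node lock
	 */
	struct spinlock **node_lock ____cacheline_aligned_in_smp;
};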

> I kindly ask you guys to provide actual perf numbers for:
> 
> 1) before any patch
> 2) after your multilevel per-NUMA-node locks
> 3) a simpler approach (my suggestion of adding a single 'busylock')
> 

I'm retracting this patch for now; if I post an updated patch I'll get
numbers for just this change. Earlier I had numbers with and without the
three DDP-related improvement patches in this series, which showed a net
26% IOPS increase for 512-byte reads, but that comes mainly from the other
patch adding a per-CPU pci pool. The two-socket systems had about the same
IOPS with or without just this patch, so it may work better with more
sockets. I'll respin this patch only if I see a justifiable gain from this
patch alone.

Thanks
Vasu

> Thanks
> 
> 
