Message-ID: <4ADD41F5.5080707@candelatech.com>
Date: Mon, 19 Oct 2009 21:52:05 -0700
From: Ben Greear <greearb@...delatech.com>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: NetDev <netdev@...r.kernel.org>
Subject: Re: pktgen and spin_lock_bh in xmit path
Eric Dumazet wrote:
> Ben Greear wrote:
>
>> I'm having strange issues when running pktgen on 10G interfaces while
>> also running pktgen on mac-vlans on that interface, when the mac-vlan
>> pktgen threads are on a different CPU.
>>
>> First, lockdep gives up and says that things are not properly
>> annotated.  I believe this is because the macvlan tx path will lock
>> its txq and will also lock the lower-dev's txq.  To fix this, perhaps
>> we need some new lockdep-aware primitives for netdev txq locking?
>>
>> Second, is using _bh() locking really sufficient if we have pktgen
>> writing to a physical device and also have other pktgen threads
>> writing to that same device through mac-vlans?  I'm seeing deadlocks
>> spinning on the _bh() lock in pktgen, as well as strange corruptions,
>> so I think there must be *some* problem somewhere; I just don't know
>> quite what it is yet.
>>
>>
>
> Could you please give us a copy of your pktgen scripts?
>
I'm driving it with another program, and my pktgen is a bit hacked, but
the basic idea is:

- 1 pktgen connection on CPU 0 running as fast as it can (trying for
  10Gbps, but getting maybe 3-4), running between two 10G ports (Intel
  82599).  Multi-pkt is set to 10,000 on each side.
- 3 pairs of mac-vlans on each of the two physical 10G ports.
- 3 pktgen 'connections' between these; each runs at about 1Gbps.
  These 3 pktgen connections are on CPU 4.  Multi-pkt is set to 1,
  since multi-pkt is a very bad idea on virtual devices.
- 1514-byte pkts.  No IPs on the interfaces; ToS is set in pktgen, but
  nothing else is configured to care.
- The two physical ports are cabled together directly with a fibre
  cable.
- All pktgen connections are full duplex (both sides transmit to each
  other, and I have rx logic to gather stats on received pkts as well).
  With no kernel debugging this runs right at 10Gbps bi-directional;
  with lockdep it gets around 5-6Gbps in each direction.
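Since my driver program isn't easily shareable, here is a sketch of the
equivalent setup using the stock pktgen /proc interface.  Interface and
mac-vlan names, the dst MAC, and the ToS value are placeholders, and
note that mainline pktgen spells "multi-pkt" as clone_skb; my hacked
version differs in details.

```shell
#!/bin/sh
# Sketch only: approximate the test setup with stock pktgen.
# Requires root and 'modprobe pktgen'; eth4/eth4.mv1 and the MAC
# below are placeholder names, not the actual test configuration.

pgset() { echo "$2" > "/proc/net/pktgen/$1"; }

modprobe pktgen

# Fast flow on the pktgen thread pinned to CPU 0 (physical port).
pgset kpktgend_0 "rem_device_all"
pgset kpktgend_0 "add_device eth4"
pgset eth4 "count 0"               # run until stopped
pgset eth4 "clone_skb 10000"       # "multi-pkt 10,000" on the physical port
pgset eth4 "pkt_size 1514"
pgset eth4 "tos 4"                 # ToS set, nothing configured to care
pgset eth4 "dst_mac 00:11:22:33:44:55"

# One of the mac-vlan flows on the CPU 4 thread; clone_skb stays at 1
# because cloning skbs is a very bad idea on virtual devices.
pgset kpktgend_4 "rem_device_all"
pgset kpktgend_4 "add_device eth4.mv1"
pgset eth4.mv1 "count 0"
pgset eth4.mv1 "clone_skb 1"
pgset eth4.mv1 "pkt_size 1514"

echo start > /proc/net/pktgen/pgctrl   # start all pktgen threads
```

The remaining mac-vlan flows and the mirror-image config on the second
port follow the same pattern.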
The lockup often occurs near starting/stopping pktgen, but it also
happens while just running normally under load, usually within 10
minutes.  I tried and failed to reproduce this on a 1G network, but
maybe I'm just not getting (un)lucky; I didn't try for too long.
Among other things, it appears as if the mac-vlan interfaces sometimes
become locked against transmit by pktgen, while a raw socket in
user-space can still send on them fine.  I'm going to add some
debugging for this particular issue tomorrow to try to figure out why
that happens.
Please note I have the rest of my network patches applied (but am not
using any proprietary modules), so it could easily be something I've
caused.  I think fixing lockdep to work with the txq _bh locks would be
a good first step to fixing this.
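Lockdep's usual escape hatch for this kind of same-class nesting is a
subclass annotation via spin_lock_nested().  A minimal, untested sketch
of what I have in mind (the helper name is mine, not existing kernel
code; only the spin_lock_nested()/SINGLE_DEPTH_NESTING API and the
netdev_queue fields are real):

```c
/* Sketch only: take the lower device's txq lock as a nested
 * acquisition, so lockdep stops treating "macvlan txq held, now
 * locking the lower dev's txq" as recursion on one lock class.
 * Hypothetical helper; not actual macvlan/netdevice code. */
static inline void netif_tx_lock_lower(struct netdev_queue *txq)
{
	/* Same lock class as the mac-vlan txq already held above
	 * us, so tell lockdep this is one level deeper. */
	spin_lock_nested(&txq->_xmit_lock, SINGLE_DEPTH_NESTING);
	txq->xmit_lock_owner = smp_processor_id();
}
```

Whether a single subclass level is enough (stacked mac-vlans would need
more) is exactly the kind of thing a proper lockdep-aware txq primitive
would have to sort out.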
Thanks,
Ben
--
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc http://www.candelatech.com
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html