Date: Thu, 5 Feb 2009 08:33:41 -0500
From: Neil Horman <nhorman@...driver.com>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: Wesley Chow <wchow@...enacr.com>, netdev@...r.kernel.org,
	Kenny Chang <kchang@...enacr.com>
Subject: Re: Multicast packet loss

On Wed, Feb 04, 2009 at 07:11:36PM +0100, Eric Dumazet wrote:
> Wesley Chow wrote:
> >>>>> Are these quad core systems? Or dual core w/ hyperthreading? I
> >>>>> ask because in your working setup you have 1/2 the number of cpus
> >>>>> and was not sure if you removed an entire package or if you just
> >>>>> disabled hyperthreading.
> >>>>>
> >>>>> Neil
> >>>>>
> >>>> Yeah, these are quad core systems. The 8 cpu system is a
> >>>> dual-processor quad-core. The other is my desktop, single cpu quad
> >>>> core.
> >>>>
> >
> > Just to be clear: on the 2 x quad core system, we can run with a 2.6.15
> > kernel and see no packet drops. In fact, we can run with 2.6.19, 2.6.20,
> > and 2.6.21 just fine. 2.6.22 is the first kernel that shows problems.
> >
> > Kenny posted results from a working setup on a different machine.
> >
> > What I would really like to know is whether whatever changed between
> > 2.6.21 and 2.6.22 that broke things is confined just to bnx2. To make
> > this a rigorous test, we would need to use the same machine with a
> > different NIC, which we don't have quite yet. An Intel Pro 1000
> > ethernet card is in the mail as I type this.
> >
> > I also tried forward porting the bnx2 driver from 2.6.21 to 2.6.22
> > (unsuccessfully), and building the most recent driver from the Broadcom
> > site against Ubuntu Hardy's 2.6.24. The most recent driver with Hardy's
> > 2.6.24 showed similar packet dropping problems. Hm, perhaps I'll try to
> > build the most recent Broadcom driver against 2.6.21.
> >
>
> Try an oprofile session; you should see a scheduler effect (I don't want
> to call this a regression, no need for another flame war).
>
> Also give us "vmstat 1" results (number of context switches per second).
>
> On recent kernels, the scheduler might be faster than before: you get
> more wakeups per second and more work done by the softirq handler (it
> makes more calls into the scheduler, so fewer cpu cycles are available
> for draining the NIC RX queue in time).
>
> opcontrol --vmlinux=/path/vmlinux --start
> <run benchmark>
> opreport -l /path/vmlinux | head -n 50
>
> Recent schedulers tend to be tuned for lower latencies (and thus, at a
> high rate of wakeups, you get less bandwidth because softirq uses a
> whole CPU).
>
> For example, if you have one thread receiving data on 4 or 8 sockets,
> you'll probably notice better throughput (because it will sleep less
> often).
>
> Multicast receiving on N sockets, with one thread waiting on each
> socket, is basically a way to trigger a scheduler storm (N wakeups per
> packet). So it's more a benchmark stressing the scheduler than the
> network stack...
>
> Maybe it's time to change the user side, and not try to find an
> appropriate kernel :)
>
> If you know you have to receive N frames per 20us interval, then it's
> better to use non-blocking sockets and a loop like this:
>
> {
>     usleep(20); // or try to compensate if this thread is slowed too much by the following code
>     for (i = 0; i < N; i++) {
>         while (recvfrom(socket[i], ...) != -1)
>             receive_frame(...);
>     }
> }
>
> That way, you are pretty sure the network softirq handler won't have to
> spend time trying to wake up one thread 400,000 times per second. All
> cpu cycles can be spent in the NIC driver and the network stack.
>
> Your thread will do 50,000 calls to nanosleep() per second, which is not
> really expensive, then N recvfrom() calls per iteration. It should work
> on all past, current and future kernels.
>
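A minimal sketch of the receive loop Eric describes, assuming the sockets
have already been created, joined to their multicast groups, and switched
to non-blocking mode elsewhere, and that receive_frame() is the
application's own per-frame handler (rx_loop and the other names here are
placeholders, not code from the thread):

    /*
     * Single-threaded polling receiver: sleep ~20us, then drain every
     * socket completely, so the softirq handler never has to wake a
     * thread per packet.
     */
    #include <sys/socket.h>
    #include <sys/types.h>
    #include <unistd.h>

    extern void receive_frame(const char *buf, ssize_t len);

    void rx_loop(int *socks, int nsocks)
    {
        char buf[2048];
        ssize_t len;
        int i;

        for (;;) {
            /* Sleep one 20us interval; compensate here if the drain
             * below starts eating a significant share of that budget. */
            usleep(20);

            for (i = 0; i < nsocks; i++) {
                /* Pull frames until the socket is empty; -1 with
                 * EAGAIN/EWOULDBLOCK just means nothing is queued. */
                while ((len = recv(socks[i], buf, sizeof(buf), 0)) != -1)
                    receive_frame(buf, len);
            }
        }
    }

The point is the same as in the pseudocode above: roughly one wakeup per
20us interval instead of one wakeup per packet per socket.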
+1 to this idea.  Since the last oprofile traces showed significant
variance in the time spent in schedule(), it might be worthwhile to
investigate the effects of the application's behavior on this.

It might also be worth adding a systemtap probe to sys_recvmsg, to count
how many times we receive frames on a working and a non-working system.
If the app is behaving differently on different kernels, and that is
affecting the number of times you go to get a frame out of the stack,
that would affect your drop rates, and it would show up in sys_recvmsg.

Neil
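Neil's suggestion is a systemtap probe on sys_recvmsg; where systemtap is
not available on the affected machines, a cruder userspace stand-in
(a hypothetical helper, not code from the thread) is to count successful
receives per second inside the application and compare the numbers on a
working and a non-working kernel:

    /*
     * Hypothetical instrumentation sketch: call count_rx() after every
     * successful recv()/recvfrom() and it prints frames received per
     * second, a rough userspace approximation of counting sys_recvmsg
     * invocations.
     */
    #include <stdio.h>
    #include <time.h>

    static unsigned long rx_count;
    static time_t last_report;

    static void count_rx(void)
    {
        time_t now = time(NULL);

        rx_count++;
        if (now != last_report) {
            fprintf(stderr, "%lu frames/s\n", rx_count);
            rx_count = 0;
            last_report = now;
        }
    }

If the two kernels show very different per-second receive rates for the
same traffic, that is the kind of application-behavior difference Neil is
describing.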