Tested using pktgen.

All tests were run on the same hardware. The CPU clock was changed in the BIOS
and the machine was rebooted before each iteration.

Results are in packets per second (pps). Each iteration sends 4000000 60-byte
packets.
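
For reference, the pps, bps and Mb/sec figures in each Result line below can
be reproduced from the packet count and the total elapsed time in usec. The
following is a minimal sketch of that arithmetic, not pktgen's own code; the
struct and function names are hypothetical, and the truncating integer
division is an assumption that happens to match the reported numbers.

```c
#include <stdint.h>

/* Throughput figures derived from one run. Hypothetical helper type. */
struct rate {
	uint64_t pps;	/* packets per second, truncated */
	uint64_t bps;	/* bits per second, computed from the truncated pps */
	uint64_t mbps;	/* bps / 1000000, truncated */
};

/*
 * packets:   number of packets sent (4000000 in these tests)
 * usec:      total elapsed time reported in the Result line
 * pkt_bytes: packet size in bytes (60 in these tests)
 */
static struct rate rate_from_run(uint64_t packets, uint64_t usec,
				 uint64_t pkt_bytes)
{
	struct rate r = { 0, 0, 0 };

	if (usec == 0)
		return r;	/* avoid dividing by zero on an empty run */

	r.pps = packets * 1000000ULL / usec;
	r.bps = r.pps * pkt_bytes * 8;
	r.mbps = r.bps / 1000000ULL;
	return r;
}
```

For iteration 0, rate_from_run(4000000, 28910148, 60) gives 138359 pps,
66412320 bps and 66 Mb/sec, which agrees with the Result line.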

Iteration 0 (under-clocked 1052.476 MHz):
Cpu(s):  0.3%us, 13.6%sy,  0.0%ni,  0.0%id,  0.0%wa, 31.2%hi, 54.8%si,  0.0%st
Result: OK: 28910148(c28791584+d118564) usec, 4000000 (60byte,0frags)
  138359pps 66Mb/sec (66412320bps) errors: 0
Interrupts: 3234740

Iteration 1 (normal 1397.657 MHz):
Cpu(s):  0.3%us, 20.9%sy,  0.0%ni,  0.0%id,  0.0%wa, 29.9%hi, 48.8%si,  0.0%st
Result: OK: 26947273(c22637342+d4309931) usec, 4000000 (60byte,0frags)
  148438pps 71Mb/sec (71250240bps) errors: 0
Interrupts: 3998176

Iteration 2 (over-clocked 1575.819 MHz):
Cpu(s):  0.3%us, 33.0%sy,  0.0%ni,  0.0%id,  0.0%wa, 27.3%hi, 39.3%si,  0.0%st
Result: OK: 26937148(c21656005+d5281143) usec, 4000000 (60byte,0frags)
  148493pps 71Mb/sec (71276640bps) errors: 0
Interrupts: 3999634

The next few iterations are with a change to the driver. finish_xmit was
modified to wake the transmit queue only when there are at least 16 free slots
in the tx ring. Previously, the driver would wake the transmit queue as soon as
there was at least 1 free slot in the tx ring. This should add some hysteresis.
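
The wake threshold described above can be illustrated with a small standalone
model. This is only a sketch and not the driver's code: the ring size of 64,
the struct layout and all sketch_* names are assumptions made for the example.
In a real Linux driver the stop and wake steps would typically be calls to
netif_stop_queue() and netif_wake_queue().

```c
#include <stdbool.h>

#define TX_RING_SIZE	64	/* assumed ring size, not taken from the driver */
#define TX_WAKE_THRESH	16	/* free slots required before waking the queue */

/* Simplified model of a tx ring and its queue state (hypothetical). */
struct tx_ring {
	unsigned int free_slots;
	bool queue_stopped;
	unsigned int wakeups;	/* number of stopped -> running transitions */
};

static void sketch_ring_init(struct tx_ring *r)
{
	r->free_slots = TX_RING_SIZE;
	r->queue_stopped = false;
	r->wakeups = 0;
}

/* Queue one packet. Stops the queue once the ring is full. */
static bool sketch_xmit(struct tx_ring *r)
{
	if (r->queue_stopped || r->free_slots == 0)
		return false;
	if (--r->free_slots == 0)
		r->queue_stopped = true;
	return true;
}

/*
 * Reclaim 'done' completed descriptors. The queue is woken only when at
 * least 'wake_thresh' slots are free, so a threshold of 1 models the old
 * behaviour and TX_WAKE_THRESH models the new one.
 */
static void sketch_finish_xmit(struct tx_ring *r, unsigned int done,
			       unsigned int wake_thresh)
{
	r->free_slots += done;
	if (r->free_slots > TX_RING_SIZE)
		r->free_slots = TX_RING_SIZE;
	if (r->queue_stopped && r->free_slots >= wake_thresh) {
		r->queue_stopped = false;
		r->wakeups++;
	}
}
```

With a threshold of 1, a full ring is woken after every single completion and
can fill again after one packet. With a threshold of 16, the model stays
stopped for the first 15 completions and wakes once on the 16th.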

Iteration 3 (under-clocked 1052.476 MHz):
Cpu(s):  0.3%us, 16.3%sy,  0.0%ni,  0.0%id,  0.0%wa, 30.0%hi, 53.3%si,  0.0%st
Result: OK: 28246751(c28169436+d77315) usec, 4000000 (60byte,0frags)
  141609pps 67Mb/sec (67972320bps) errors: 0
Interrupts: 3227925

Iteration 4 (normal 1397.657 MHz):
Cpu(s):  0.3%us, 23.7%sy,  0.0%ni,  0.0%id,  0.0%wa, 30.0%hi, 46.0%si,  0.0%st
Result: OK: 26935554(c25058872+d1876682) usec, 4000000 (60byte,0frags)
  148502pps 71Mb/sec (71280960bps) errors: 0
Interrupts: 3994491

Iteration 5 (over-clocked 1575.819 MHz):
Cpu(s):  0.3%us, 30.8%sy,  0.0%ni,  0.0%id,  0.0%wa, 27.2%hi, 41.7%si,  0.0%st
Result: OK: 26933751(c23148154+d3785597) usec, 4000000 (60byte,0frags)
  148512pps 71Mb/sec (71285760bps) errors: 0
Interrupts: 3999595