[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <20070214224318.f9495001.akpm@linux-foundation.org>
Date: Wed, 14 Feb 2007 22:43:18 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: schidambaram@...ric7.com
Cc: Stephen Hemminger <shemminger@...ux-foundation.org>,
Jeff Garzik <jeff@...zik.org>,
Driver Support <driver-support@...ric7.com>,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/1] Fabric7 VIOC driver source code
On Wed, 07 Feb 2007 13:07:40 -0800 Sriram Chidambaram <schidambaram@...ric7.com> wrote:
> This patch provides the Fabric7 VIOC driver source code.
> This git mbox patch is built against
> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
>
> The patch can be pulled from
>ftp://ftp.fabric7.com/VIOC/Fabric7-VIOC-driver-patch.FEB-07-2007
For people wondering what this is, the documentation file is below.
I'll pull this driver into my queue so that it doesn't get lost and to give
people an opportunity to review it more easily. From a quick peek, I'd
expect some changes to be needed: stylistic things, plus some suspicious
looking PCI-poking in vioc_irq.c. But I didn't look at it at all closely.
The driver needed a bit of help to make it compile on ia64 (I haven't tried
any other architectures). If it's simply not possible that this device
will ever be present on any non-x86 machines then perhaps we should
restrict it to those architectures at kernel configuration time.
But then, all the changes I made were good ones..
Overview
========
A Virtual Input-Output Controller (VIOC) is a PCI device that provides
10Gbps of I/O bandwidth that can be shared by up to 16 virtual network
interfaces (VNICs). VIOC hardware supports several features such as
large frames, checksum offload, gathered send, MSI/MSI-X, bandwidth
control, interrupt mitigation, etc.
VNICs are provisioned to a host partition via an out-of-band interface
from the System Controller -- typically before the partition boots,
although they can be dynamically added or removed from a running
partition as well.
Each provisioned VNIC appears as an Ethernet netdevice to the host OS,
and maintains its own transmit ring in DMA memory. VNICs are
configured to share up to 4 of total 16 receive rings and 1 of total
16 receive-completion rings in DMA memory. VIOC hardware classifies
packets into receive rings based on size, allowing more efficient use
of DMA buffer memory. The default, and recommended, configuration
uses groups of 'receive sets' (rxsets), each with 3 receive rings, a
receive completion ring, and a VIOC Rx interrupt. The driver gives
each rxset a NAPI poll handler associated with a phantom (invisible)
netdevice, for concurrency. VNICs are assigned to rxsets using a
simple modulus.
VIOC provides 4 interrupts in INTx mode: 2 for Rx, 1 for Tx, and 1 for
out-of-band messages from the System Controller and errors. VIOC also
provides 19 MSI-X interrupts: 16 for Rx, 1 for Tx, 1 for out-of-band
messages from the System Controller, and 1 for error signalling from
the hardware. The VIOC driver makes a determination whether MSI-X
functionality is supported and initializes interrupts accordingly.
[Note: The Linux kernel disables MSI-X for VIOCs on modules with AMD
8131, even if the device is on the HT link.]
Module loadable parameters
==========================
- poll_weight (default 8) - the number of received packets will be
processed during one call into the NAPI poll handler.
- rx_intr_timeout (default 1) - hardware rx interrupt mitigation
timer, in units of 5us.
- rx_intr_pkt_cnt (default 64) - hardware rx interrupt mitigation
counter, in units of packets.
- tx_pkts_per_irq (default 64) - hardware tx interrupt mitigation
counter, in units of packets.
- tx_pkts_per_bell (default 1) - the number of packets to enqueue on a
transmit ring before issuing a doorbell to hardware.
Performance Tuning
==================
You may want to use the following sysctl settings to improve
performance. [NOTE: To be re-checked]
# set in /etc/sysctl.conf
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_rmem = 10000000 10000000 10000000
net.ipv4.tcp_wmem = 10000000 10000000 10000000
net.ipv4.tcp_mem = 10000000 10000000 10000000
net.core.rmem_max = 5242879
net.core.wmem_max = 5242879
net.core.rmem_default = 5242879
net.core.wmem_default = 5242879
net.core.optmem_max = 5242879
net.core.netdev_max_backlog = 100000
Out-of-band Communications with System Controller
=================================================
System operators can use the out-of-band facility to allow for remote
shutdown or reboot of the host partition. Upon receiving such a
command, the VIOC driver executes "/sbin/reboot" or "/sbin/shutdown"
via the usermodehelper() call.
This same communications facility is used for dynamic VNIC
provisioning (plug in and out).
The VIOC driver also registers a callback with
register_reboot_notifier(). When the callback is executed, the driver
records the shutdown event and reason in a VIOC register to notify the
System Controller.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists