Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
This commit is contained in:
56
Documentation/infiniband/ipoib.txt
Normal file
56
Documentation/infiniband/ipoib.txt
Normal file
@@ -0,0 +1,56 @@
|
||||
IP OVER INFINIBAND
|
||||
|
||||
The ib_ipoib driver is an implementation of the IP over InfiniBand
|
||||
protocol as specified by the latest Internet-Drafts issued by the
|
||||
IETF ipoib working group. It is a "native" implementation in the
|
||||
sense of setting the interface type to ARPHRD_INFINIBAND and the
|
||||
hardware address length to 20 (earlier proprietary implementations
|
||||
masqueraded to the kernel as ethernet interfaces).
|
||||
|
||||
Partitions and P_Keys
|
||||
|
||||
When the IPoIB driver is loaded, it creates one interface for each
|
||||
port using the P_Key at index 0. To create an interface with a
|
||||
different P_Key, write the desired P_Key into the main interface's
|
||||
/sys/class/net/<intf name>/create_child file. For example:
|
||||
|
||||
echo 0x8001 > /sys/class/net/ib0/create_child
|
||||
|
||||
This will create an interface named ib0.8001 with P_Key 0x8001. To
|
||||
remove a subinterface, use the "delete_child" file:
|
||||
|
||||
echo 0x8001 > /sys/class/net/ib0/delete_child
|
||||
|
||||
The P_Key for any interface is given by the "pkey" file, and the
|
||||
main interface for a subinterface is in "parent."
|
||||
|
||||
Debugging Information
|
||||
|
||||
By compiling the IPoIB driver with CONFIG_INFINIBAND_IPOIB_DEBUG set
|
||||
to 'y', tracing messages are compiled into the driver. They are
|
||||
turned on by setting the module parameters debug_level and
|
||||
mcast_debug_level to 1. These parameters can be controlled at
|
||||
runtime through files in /sys/module/ib_ipoib/.
|
||||
|
||||
CONFIG_INFINIBAND_IPOIB_DEBUG also enables the "ipoib_debugfs"
|
||||
virtual filesystem. By mounting this filesystem, for example with
|
||||
|
||||
mkdir -p /ipoib_debugfs
|
||||
mount -t ipoib_debugfs none /ipoib_debufs
|
||||
|
||||
it is possible to get statistics about multicast groups from the
|
||||
files /ipoib_debugfs/ib0_mcg and so on.
|
||||
|
||||
The performance impact of this option is negligible, so it
|
||||
is safe to enable this option with debug_level set to 0 for normal
|
||||
operation.
|
||||
|
||||
CONFIG_INFINIBAND_IPOIB_DEBUG_DATA enables even more debug output in
|
||||
the data path when data_debug_level is set to 1. However, even with
|
||||
the output disabled, enabling this configuration option will affect
|
||||
performance, because it adds tests to the fast path.
|
||||
|
||||
References
|
||||
|
||||
IETF IP over InfiniBand (ipoib) Working Group
|
||||
http://ietf.org/html.charters/ipoib-charter.html
|
66
Documentation/infiniband/sysfs.txt
Normal file
66
Documentation/infiniband/sysfs.txt
Normal file
@@ -0,0 +1,66 @@
|
||||
SYSFS FILES
|
||||
|
||||
For each InfiniBand device, the InfiniBand drivers create the
|
||||
following files under /sys/class/infiniband/<device name>:
|
||||
|
||||
node_type - Node type (CA, switch or router)
|
||||
node_guid - Node GUID
|
||||
sys_image_guid - System image GUID
|
||||
|
||||
In addition, there is a "ports" subdirectory, with one subdirectory
|
||||
for each port. For example, if mthca0 is a 2-port HCA, there will
|
||||
be two directories:
|
||||
|
||||
/sys/class/infiniband/mthca0/ports/1
|
||||
/sys/class/infiniband/mthca0/ports/2
|
||||
|
||||
(A switch will only have a single "0" subdirectory for switch port
|
||||
0; no subdirectory is created for normal switch ports)
|
||||
|
||||
In each port subdirectory, the following files are created:
|
||||
|
||||
cap_mask - Port capability mask
|
||||
lid - Port LID
|
||||
lid_mask_count - Port LID mask count
|
||||
rate - Port data rate (active width * active speed)
|
||||
sm_lid - Subnet manager LID for port's subnet
|
||||
sm_sl - Subnet manager SL for port's subnet
|
||||
state - Port state (DOWN, INIT, ARMED, ACTIVE or ACTIVE_DEFER)
|
||||
phys_state - Port physical state (Sleep, Polling, LinkUp, etc)
|
||||
|
||||
There is also a "counters" subdirectory, with files
|
||||
|
||||
VL15_dropped
|
||||
excessive_buffer_overrun_errors
|
||||
link_downed
|
||||
link_error_recovery
|
||||
local_link_integrity_errors
|
||||
port_rcv_constraint_errors
|
||||
port_rcv_data
|
||||
port_rcv_errors
|
||||
port_rcv_packets
|
||||
port_rcv_remote_physical_errors
|
||||
port_rcv_switch_relay_errors
|
||||
port_xmit_constraint_errors
|
||||
port_xmit_data
|
||||
port_xmit_discards
|
||||
port_xmit_packets
|
||||
symbol_error
|
||||
|
||||
Each of these files contains the corresponding value from the port's
|
||||
Performance Management PortCounters attribute, as described in
|
||||
section 16.1.3.5 of the InfiniBand Architecture Specification.
|
||||
|
||||
The "pkeys" and "gids" subdirectories contain one file for each
|
||||
entry in the port's P_Key or GID table respectively. For example,
|
||||
ports/1/pkeys/10 contains the value at index 10 in port 1's P_Key
|
||||
table.
|
||||
|
||||
MTHCA
|
||||
|
||||
The Mellanox HCA driver also creates the files:
|
||||
|
||||
hw_rev - Hardware revision number
|
||||
fw_ver - Firmware version
|
||||
hca_type - HCA type: "MT23108", "MT25208 (MT23108 compat mode)",
|
||||
or "MT25208"
|
99
Documentation/infiniband/user_mad.txt
Normal file
99
Documentation/infiniband/user_mad.txt
Normal file
@@ -0,0 +1,99 @@
|
||||
USERSPACE MAD ACCESS
|
||||
|
||||
Device files
|
||||
|
||||
Each port of each InfiniBand device has a "umad" device and an
|
||||
"issm" device attached. For example, a two-port HCA will have two
|
||||
umad devices and two issm devices, while a switch will have one
|
||||
device of each type (for switch port 0).
|
||||
|
||||
Creating MAD agents
|
||||
|
||||
A MAD agent can be created by filling in a struct ib_user_mad_reg_req
|
||||
and then calling the IB_USER_MAD_REGISTER_AGENT ioctl on a file
|
||||
descriptor for the appropriate device file. If the registration
|
||||
request succeeds, a 32-bit id will be returned in the structure.
|
||||
For example:
|
||||
|
||||
struct ib_user_mad_reg_req req = { /* ... */ };
|
||||
ret = ioctl(fd, IB_USER_MAD_REGISTER_AGENT, (char *) &req);
|
||||
if (!ret)
|
||||
my_agent = req.id;
|
||||
else
|
||||
perror("agent register");
|
||||
|
||||
Agents can be unregistered with the IB_USER_MAD_UNREGISTER_AGENT
|
||||
ioctl. Also, all agents registered through a file descriptor will
|
||||
be unregistered when the descriptor is closed.
|
||||
|
||||
Receiving MADs
|
||||
|
||||
MADs are received using read(). The buffer passed to read() must be
|
||||
large enough to hold at least one struct ib_user_mad. For example:
|
||||
|
||||
struct ib_user_mad mad;
|
||||
ret = read(fd, &mad, sizeof mad);
|
||||
if (ret != sizeof mad)
|
||||
perror("read");
|
||||
|
||||
In addition to the actual MAD contents, the other struct ib_user_mad
|
||||
fields will be filled in with information on the received MAD. For
|
||||
example, the remote LID will be in mad.lid.
|
||||
|
||||
If a send times out, a receive will be generated with mad.status set
|
||||
to ETIMEDOUT. Otherwise when a MAD has been successfully received,
|
||||
mad.status will be 0.
|
||||
|
||||
poll()/select() may be used to wait until a MAD can be read.
|
||||
|
||||
Sending MADs
|
||||
|
||||
MADs are sent using write(). The agent ID for sending should be
|
||||
filled into the id field of the MAD, the destination LID should be
|
||||
filled into the lid field, and so on. For example:
|
||||
|
||||
struct ib_user_mad mad;
|
||||
|
||||
/* fill in mad.data */
|
||||
|
||||
mad.id = my_agent; /* req.id from agent registration */
|
||||
mad.lid = my_dest; /* in network byte order... */
|
||||
/* etc. */
|
||||
|
||||
ret = write(fd, &mad, sizeof mad);
|
||||
if (ret != sizeof mad)
|
||||
perror("write");
|
||||
|
||||
Setting IsSM Capability Bit
|
||||
|
||||
To set the IsSM capability bit for a port, simply open the
|
||||
corresponding issm device file. If the IsSM bit is already set,
|
||||
then the open call will block until the bit is cleared (or return
|
||||
immediately with errno set to EAGAIN if the O_NONBLOCK flag is
|
||||
passed to open()). The IsSM bit will be cleared when the issm file
|
||||
is closed. No read, write or other operations can be performed on
|
||||
the issm file.
|
||||
|
||||
/dev files
|
||||
|
||||
To create the appropriate character device files automatically with
|
||||
udev, a rule like
|
||||
|
||||
KERNEL="umad*", NAME="infiniband/%k"
|
||||
KERNEL="issm*", NAME="infiniband/%k"
|
||||
|
||||
can be used. This will create device nodes named
|
||||
|
||||
/dev/infiniband/umad0
|
||||
/dev/infiniband/issm0
|
||||
|
||||
for the first port, and so on. The InfiniBand device and port
|
||||
associated with these devices can be determined from the files
|
||||
|
||||
/sys/class/infiniband_mad/umad0/ibdev
|
||||
/sys/class/infiniband_mad/umad0/port
|
||||
|
||||
and
|
||||
|
||||
/sys/class/infiniband_mad/issm0/ibdev
|
||||
/sys/class/infiniband_mad/issm0/port
|
Reference in New Issue
Block a user