
Commit cb50429

nrockershousen authored and GitHub Enterprise committed Mar 5, 2025
TECHPUBS-4735: Updated Soft-RoCE and Lustre information (#138)
* TECHPUBS-4735: updated soft-roce and lustre instructions
* TECHPUBS-4735: review feedback
* TECHPUBS-4735: updated prereq wording
* TECHPUBS-4735: updated title and intro text
* TECHPUBS-4735: fixed errors in values
* TECHPUBS-4735: updated language around lustre
1 parent 179b1cb commit cb50429


3 files changed, +26 -7 lines changed


‎.spelling

+3
@@ -80,6 +80,7 @@ chroot
 CLI
 cli
 client-server
+ClusterStor
 cm-cli
 COCN
 conditionalization
@@ -387,6 +388,7 @@ NUMA
 Nvidia
 NVIDIA
 ntp
+o2ib
 OData
 ogopogod
 onloaded
@@ -604,6 +606,7 @@ S-9009
 S-9010
 S-9011
 S-9012
+S-9100
 S-9929
 

‎docs/portal/developer-portal/install/lustre_network_driver_lnd_ko2iblnd_configuration.md

+14 -5
@@ -1,6 +1,15 @@
-# Lustre Network Driver (LND) ko2iblnd configuration
+# Configure ko2iblnd Lustre Network Driver (LND) for Soft-RoCE performance
 
-The ko2iblnd.ko changes are needed for better Soft-RoCE performance on LNDs.
+The ko2iblnd.ko module requires modifications to optimize Soft-RoCE performance.
+If your setup does not involve Soft-RoCE connections, this section does not apply.
+
+## Prerequisites
+
+Ensure that Lustre is installed with the ko2iblnd module built for the Soft-RoCE driver (RXE) by specifying `--with-o2ib=yes` for `./configure` or `rpmbuild`.
+If this option is not specified, the build process will attempt to automatically detect external OFED installations or internal o2ib support.
+If neither is detected, the ko2iblnd module will not be built.
+
+For detailed instructions, see the _Cray ClusterStor Lustre Client Build Configuration Guide S-9100_.
 
 ## Compute Node tuning for Soft-RoCE
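The new Prerequisites text says the ko2iblnd module is not built when o2ib support is neither requested nor detected. One way to check for that outcome before tuning is to look for the module file. The following is a minimal sketch, assuming the module is installed somewhere under `/lib/modules/<kernel release>`; the function name `ko2iblnd_built` and its optional directory argument are illustrative and do not come from the procedure.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a ko2iblnd kernel module file exists under a modules tree.
# The optional first argument (hypothetical) overrides the directory searched;
# by default the tree for the running kernel is used.
ko2iblnd_built() {
    local root="${1:-/lib/modules/$(uname -r)}"
    local hit
    # Module files may be compressed (.ko, .ko.xz, .ko.zst), hence the trailing glob.
    hit="$(find "$root" -type f -name 'ko2iblnd.ko*' 2>/dev/null | head -n 1)"
    [ -n "$hit" ]
}
```

For example, `ko2iblnd_built || echo "ko2iblnd module not found"` could run on a compute node before the tuning steps. A missing file suggests revisiting the build options described above.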

@@ -49,7 +58,7 @@ Tuning on compute node can be achieved in two ways. Follow the steps that work b
 /sys/module/ko2iblnd/parameters/cksum:0
 /sys/module/ko2iblnd/parameters/concurrent_sends:84
 /sys/module/ko2iblnd/parameters/conns_per_peer:4
-/sys/module/ko2iblnd/parameters/credits:84
+/sys/module/ko2iblnd/parameters/credits:1024
 /sys/module/ko2iblnd/parameters/dev_failover:1
 /sys/module/ko2iblnd/parameters/fmr_cache:1
 /sys/module/ko2iblnd/parameters/fmr_flush_trigger:384
@@ -112,7 +121,7 @@ NOTE: These changes will not persist on file system upgrade and should be reappl
 ```console
 [root@hpelus1n01 ~]# vim /mnt/nfsdata/images/$(nodeattr -UV ver)/appliance.x86_64/etc/modprobe.d/ko2iblnd.conf
 
-options ko2iblnd conns_per_peer=4 ntx=2048 peer_credits=42 peer_credits_hiw=64 concurrent_sends=256 credits=1024 map_on_demand=1
+options ko2iblnd conns_per_peer=4 ntx=2048 peer_credits=42 peer_credits_hiw=64 concurrent_sends=84 credits=1024 map_on_demand=1
 ```
 
 7. Recreate the SquashFS image after updating the `ko2iblnd.conf` file in the image.
@@ -163,7 +172,7 @@ NOTE: These changes will not persist on file system upgrade and should be reappl
 /sys/module/ko2iblnd/parameters/cksum:0
 /sys/module/ko2iblnd/parameters/concurrent_sends:84
 /sys/module/ko2iblnd/parameters/conns_per_peer:4
-/sys/module/ko2iblnd/parameters/credits:84
+/sys/module/ko2iblnd/parameters/credits:1024
 /sys/module/ko2iblnd/parameters/dev_failover:1
 /sys/module/ko2iblnd/parameters/fmr_cache:1
 /sys/module/ko2iblnd/parameters/fmr_flush_trigger:384
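This commit corrects values so that the `options ko2iblnd ...` line and the sample `/sys/module/ko2iblnd/parameters` listing agree (`concurrent_sends` 84, `credits` 1024). That kind of drift can be caught with a comparison like the one below. It is a minimal sketch, not part of the documented procedure: the function name `check_ko2iblnd_params` and the default configuration path `/etc/modprobe.d/ko2iblnd.conf` are assumptions, and both paths can be overridden.

```shell
#!/usr/bin/env bash
# Sketch: compare the values requested in a modprobe.d "options ko2iblnd ..." line
# with the values the loaded module reports under sysfs.
# Arguments (defaults are assumptions, not taken from the procedure):
#   $1  path to the modprobe configuration file
#   $2  directory holding one file per module parameter
check_ko2iblnd_params() {
    local conf="${1:-/etc/modprobe.d/ko2iblnd.conf}"
    local params_dir="${2:-/sys/module/ko2iblnd/parameters}"
    local status=0 line pair name want have

    while IFS= read -r line || [ -n "$line" ]; do
        # Only "options ko2iblnd ..." lines carry parameter settings.
        case "$line" in
            "options ko2iblnd "*) ;;
            *) continue ;;
        esac
        # Word-split the remainder of the line into name=value pairs.
        for pair in ${line#options ko2iblnd }; do
            name="${pair%%=*}"
            want="${pair#*=}"
            if [ ! -r "$params_dir/$name" ]; then
                echo "MISSING  $name (wanted $want)"
                status=1
                continue
            fi
            have="$(cat "$params_dir/$name")"
            if [ "$have" = "$want" ]; then
                echo "OK       $name=$have"
            else
                echo "MISMATCH $name: wanted $want, loaded $have"
                status=1
            fi
        done
    done < "$conf"
    return "$status"
}
```

Running `check_ko2iblnd_params` on a node after the module is loaded prints one line per configured parameter and returns non-zero if any value differs. The module may adjust some tunables at load time, so treat a mismatch as a prompt to investigate rather than proof of a misconfiguration.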

‎docs/portal/developer-portal/install/softroce_on_HPE_Slingshot_200Gbps.md

+9 -2
@@ -5,6 +5,10 @@
 1. `cray-cxi-driver` RPM package must be installed.
 2. `cray-rxe-driver` RPM package must be installed.
 3. HPE Slingshot 200Gbps NIC Ethernet must be configured and active.
+4. Lustre is being used.
+
+**NOTE:** This procedure contains configuration settings specific to Lustre.
+If you are using a different filesystem, you will need to determine and apply the appropriate configuration settings for that filesystem. Proceed with caution.
 
 ## Configuration

@@ -18,9 +22,8 @@ Follow the relevant procedures to achieve the needed configuration. Contact a sy
 cxi-eth.large_pkts_buf_count=10000
 ```
 
-2. If Lustre is being used, modify the client and server parameters.
+2. Modify the client and server parameters.
 
-Skip this step if Lustre is not being used.
 See the [Lustre configuration](lustre_network_driver_lnd_ko2iblnd_configuration.md#lustre-network-driver-lnd-ko2iblnd-configuration) procedure for more details.
 
 Update the parameters on the client.
@@ -40,6 +43,10 @@ Follow the relevant procedures to achieve the needed configuration. Contact a sy
 
 4. Create an RXE (Soft-RoCE) device by running `rxe_init.sh [devices list]` as root.
 
+The `rxe_init.sh` script is provided in the DKMS package.
+It is in the installed source directory's `scripts` subdirectory.
+If not done already, copy the `rxe_init.sh` script to the binary RPM's install location of `/usr/bin` or run the script from the `/usr/src/cray-rxe-driver-<version>/scripts` directory of the DKMS package.
+
 NOTE: At this time, HPE Slingshot 200Gbps NICs do not automatically create RXE devices, so it must be done manually.
 
 `[devices list]` is a list of interfaces to create RXE devices for.
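The added text names two places where `rxe_init.sh` may live: `/usr/bin` and the DKMS source tree under `/usr/src/cray-rxe-driver-<version>/scripts`. The sketch below shows how a wrapper might pick whichever copy is present. The function name `find_rxe_init` and its optional prefix argument are hypothetical; the prefix exists only so the lookup can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# Sketch: print the first rxe_init.sh found in the two locations the text names.
# The optional prefix argument is hypothetical; leave it empty on a real system.
find_rxe_init() {
    local prefix="${1:-}"
    local candidate
    for candidate in "$prefix/usr/bin/rxe_init.sh" \
                     "$prefix"/usr/src/cray-rxe-driver-*/scripts/rxe_init.sh; do
        # An unmatched glob stays literal, so test each candidate explicitly.
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

As root, the result can then be run with the interface list the procedure calls for, for example `bash "$(find_rxe_init)" [devices list]`. A non-zero return means neither location holds the script, which points back to the package prerequisites.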
