Hardware packet-pacing could be useful for the switch test to get a deterministic transmit rate without software variation, and could potentially allow the software to queue a number of packets in one go which the hardware then outputs at the requested rate.
Mellanox documentation:
- Raw Ethernet Programming: Packet Pacing - Code Example
- HowTo Configure Packet Pacing on ConnectX-4. States that it applies to ConnectX-4 and ConnectX-4 Lx.
- Supported Non-Volatile Configurations from the release notes for the 14.32.1010 firmware in use. Nothing in the list appears related to packet pacing.
- Rate Limit in the MLNX_OFED v5.4-3.1.0.0 documentation contains:
Rate limit defines a maximum bandwidth allowed for a TC. Please note that 10% deviation from the requested values is considered acceptable.
This investigation uses a ConnectX-4 Lx dual-port 10GbE MCX4121A-XCAT with 14.32.1010 firmware, running AlmaLinux 8.5.
Before attempting any configuration changes the ConnectX-4 Lx wasn't reported as supporting packet pacing in the ibv_devinfo output:
$ ibv_devinfo -v
<snip>
packet_pacing_caps:
qp_rate_limit_min: 0kbps
qp_rate_limit_max: 0kbps
Saving ibv_devinfo output:
[mr_halfword@haswell-alma connectx_4_lx_config]$ ibv_devinfo -v > before_ibv_devinfo.log
Saving MFT configuration:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mst start
[sudo] password for mr_halfword:
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
Unloading MST PCI module (unused) - Success
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mst status
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
MST devices:
------------
/dev/mst/mt4117_pciconf0 - PCI configuration cycles access.
domain:bus:dev.fn=0000:01:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
Chip revision is: 00
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 q > before_mlxconfig_query.log
Attempting to take a backup reports that no TLVs were found, and no backup file is created:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 -f backup.conf backup
Collecting...
No TLVs were found.
The following generates a list of all TLVs. It doesn't require a device, so it isn't clear which TLVs are supported by which devices:
[mr_halfword@haswell-alma connectx_4_lx_config]$ mlxconfig gen_tlvs_file all_tlvs.txt
Saving output...
Done!
The generated file contains:
nv_packet_pacing 0
The Firmware Activation section of HowTo Configure Packet Pacing on ConnectX-4 gives commands to create a raw TLV file to activate packet pacing in the firmware, but doesn't define the significance of the hex numbers. Create a text file as per Activate Packet Pacing in the Firmware:
[mr_halfword@haswell-alma connectx_4_lx_config]$ echo "MLNX_RAW_TLV_FILE" > mlxconfig_raw_pacing.txt
[mr_halfword@haswell-alma connectx_4_lx_config]$ echo "0x00000004 0x0000010c 0x00000000 0x00000001" >> mlxconfig_raw_pacing.txt
Convert the raw text TLV to XML:
[mr_halfword@haswell-alma connectx_4_lx_config]$ mlxconfig raw2xml mlxconfig_raw_pacing.txt mlxconfig_raw_pacing.xml
Saving output...
Done!
Which has the contents:
<?xml version="1.0" encoding="UTF-8"?>
<config xmlns="http://www.mellanox.com/config">
<nv_packet_pacing ovr_en='0' rd_en='0' writer_id='0'>
<!-- Legal Values: False/True -->
<packet_pacing>True</packet_pacing>
<!-- Legal Values: False/True -->
<lag_disable>False</lag_disable>
</nv_packet_pacing>
I.e. this does enable packet_pacing, and also has a <lag_disable> field.
Set the raw configuration file created above, which reports information about the one raw TLV being set:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 -f mlxconfig_raw_pacing.txt set_raw
Raw TLV #1 Info:
Length: 0x4
Version: 0
OverrideEn: 0
Type: 0x0000010c
Data: 0x00000001
Operation intended for advanced users.
Are you sure you want to apply raw TLV file? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
Rebooted the PC after the above set_raw operation. ibv_devinfo is now reporting packet pacing as supported for both devices (one device per port):
[mr_halfword@haswell-alma connectx_4_lx_config]$ ibv_devinfo -v > after_ibv_devinfo.log
[mr_halfword@haswell-alma connectx_4_lx_config]$ diff before_ibv_devinfo.log after_ibv_devinfo.log
99,100c99,102
< qp_rate_limit_min: 0kbps
< qp_rate_limit_max: 0kbps
---
> qp_rate_limit_min: 1kbps
> qp_rate_limit_max: 10000000kbps
> supported_qp:
> SUPPORT_RAW_PACKET
231,232c233,236
< qp_rate_limit_min: 0kbps
< qp_rate_limit_max: 0kbps
---
> qp_rate_limit_min: 1kbps
> qp_rate_limit_max: 10000000kbps
> supported_qp:
> SUPPORT_RAW_PACKET
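The before and after check can also be scripted. The following is a minimal sketch, not part of the original test programs, which extracts the qp_rate_limit_min and qp_rate_limit_max values from saved ibv_devinfo -v text. The function names are hypothetical, and treating a non-zero maximum as "packet pacing supported" is an assumption based on the values seen in these notes.

```python
import re

# Matches lines such as "qp_rate_limit_max: 10000000kbps" from `ibv_devinfo -v`.
_LIMIT_RE = re.compile(r"qp_rate_limit_(min|max):\s*(\d+)kbps")


def pacing_limits(devinfo_text):
    """Return one (min_kbps, max_kbps) tuple per device found in the text."""
    limits = []
    current = {}
    for kind, value in _LIMIT_RE.findall(devinfo_text):
        current[kind] = int(value)
        if "min" in current and "max" in current:
            limits.append((current["min"], current["max"]))
            current = {}
    return limits


def pacing_supported(devinfo_text):
    """Assumption: a non-zero maximum rate means packet pacing is usable."""
    limits = pacing_limits(devinfo_text)
    return bool(limits) and all(max_kbps > 0 for _, max_kbps in limits)


# Values copied from the diff above (one device shown for brevity).
before = "qp_rate_limit_min: 0kbps\nqp_rate_limit_max: 0kbps\n"
after = "qp_rate_limit_min: 1kbps\nqp_rate_limit_max: 10000000kbps\n"
print(pacing_supported(before), pacing_supported(after))
```

In practice the text would be read from before_ibv_devinfo.log and after_ibv_devinfo.log, which contain one pair of limits per port.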
There was no change to the configuration reported by a mlxconfig query:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mst start
[sudo] password for mr_halfword:
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
Unloading MST PCI module (unused) - Success
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 q > after_mlxconfig_query.log
[mr_halfword@haswell-alma connectx_4_lx_config]$ diff before_mlxconfig_query.log after_mlxconfig_query.log
And a mlxconfig backup now obtains the TLVs which were set:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 -f backup.conf backup
Collecting...
Saving output...
Done!
[mr_halfword@haswell-alma connectx_4_lx_config]$ cat backup.conf
MLNX_RAW_TLV_FILE
% TLV Type: 0x0000010c, Writer ID: ICMD MLXCONFIG SET RAW(0x0c), Writer Host ID: 0x00
0x000c0004 0x0000010c 0x00000000 0x00000001
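Comparing the line written to mlxconfig_raw_pacing.txt with the line read back in backup.conf gives some idea of what the hex words mean. The following sketch decodes both lines under an assumed layout: the positions of the length and writer ID fields are inferred from the set_raw and backup output in these notes, not taken from vendor documentation, and the function name is hypothetical.

```python
def decode_raw_tlv(line):
    """Split one data line of an MLNX_RAW_TLV_FILE into fields.

    Assumed layout (inferred from the mlxconfig output in these notes):
    word 0 holds the data length in bytes in its low 9 bits and the
    writer ID in bits 16-20, word 1 is the TLV type, word 2 is reserved,
    and the remaining words are the data.
    """
    words = [int(w, 16) for w in line.split()]
    header, tlv_type, _reserved = words[:3]
    length = header & 0x1FF
    return {
        "length": length,
        "writer_id": (header >> 16) & 0x1F,
        "type": tlv_type,
        "data": words[3:3 + length // 4],
    }


# Line written by hand, and the line read back after set_raw.
written = decode_raw_tlv("0x00000004 0x0000010c 0x00000000 0x00000001")
backup = decode_raw_tlv("0x000c0004 0x0000010c 0x00000000 0x00000001")
print(written)
print(backup)
```

Under that assumption the only difference between the two lines is the writer ID, which is zero in the hand-written file and 0x0c in the backup, matching the "Writer ID: ICMD MLXCONFIG SET RAW(0x0c)" comment in backup.conf.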
The ibv_raw_packet_tx program was used to test the effect of packet pacing.
To capture packets in Wireshark and look at the timestamps, set a mirror port in the T1700G-28TQ switch and redirected rx-only packets to the mirror port.
For this configuration:
- packet_pacing enabled in the ConnectX-4 Lx by use of TLV type 0x0000010c.
- mlxconfig reported the default of ACCURATE_TX_SCHEDULER False(0)
- ibv_raw_packet_tx only using IBV_QP_RATE_LIMIT.
Ran the test at a low rate to see the inter-packet gap:
[mr_halfword@haswell-alma release]$ ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1000
Send 1536 frames over 15.692908 secs, average 97.9 Hz
The Wireshark capture shows bursts of packets:
- 43 packets back-to-back in ~160 microseconds
- Gap between bursts of ~537254 microseconds
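To put the burst figures in context, the sketch below compares the mean spacing of packets inside a burst with the spacing that perfectly even pacing would give at the average rate reported by the test. It only uses the numbers quoted above, and the helper names are hypothetical.

```python
def mean_spacing_us(packets, span_us):
    """Mean gap between consecutive packets spread over span_us."""
    return span_us / (packets - 1)


def ideal_interval_us(rate_hz):
    """Interval between frames if they were spaced evenly at rate_hz."""
    return 1e6 / rate_hz


in_burst = mean_spacing_us(43, 160.0)  # 43 packets in ~160 microseconds
ideal = ideal_interval_us(97.9)        # average rate reported by the test
print(f"in-burst spacing ~{in_burst:.1f} us, evenly paced ~{ideal:.0f} us")
```

The in-burst spacing is a few microseconds whereas even pacing would give roughly ten milliseconds between frames, i.e. with this configuration the frames are released in back-to-back groups rather than individually paced.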
The ibv_raw_packet_tx program was modified to have an option to call ibv_modify_qp_rate_limit to try and control the burst rate, but the call failed with EINVAL. The error was from the following in mlx5_modify_qp_rate_limit:
if ((attr->max_burst_sz ||
attr->typical_pkt_sz) &&
(!attr->rate_limit ||
!(mctx->packet_pacing_caps.cap_flags &
MLX5_IB_PP_SUPPORT_BURST)))
return EINVAL;
The contents of the packet_pacing_caps shown in the debugger were:
packet_pacing_caps struct mlx5_packet_pacing_caps {...}
qp_rate_limit_min __u32 1
qp_rate_limit_max __u32 10000000
supported_qpts __u32 256
cap_flags __u8 0 '\0'
I.e. the cap_flags doesn't have MLX5_IB_PP_SUPPORT_BURST (1).
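The condition quoted from mlx5_modify_qp_rate_limit can be restated as a small truth function, which makes it easier to see which attribute combinations are rejected. This is a Python model of that one check only, not of the whole library function. The function name and the example attribute values are hypothetical.

```python
import errno

MLX5_IB_PP_SUPPORT_BURST = 1  # value noted in the text above


def check_rate_limit_attr(rate_limit, max_burst_sz, typical_pkt_sz, cap_flags):
    """Return 0 if the attributes pass the quoted check, else errno.EINVAL."""
    wants_burst_control = bool(max_burst_sz or typical_pkt_sz)
    burst_supported = bool(cap_flags & MLX5_IB_PP_SUPPORT_BURST)
    if wants_burst_control and (not rate_limit or not burst_supported):
        return errno.EINVAL
    return 0


# cap_flags is 0 on this ConnectX-4 Lx, as shown in the debugger output.
print(check_rate_limit_attr(1000, 0, 0, cap_flags=0))        # rate only
print(check_rate_limit_attr(1000, 1500, 1500, cap_flags=0))  # burst requested
```

With cap_flags of zero, setting only rate_limit passes the check, while any non-zero max_burst_sz or typical_pkt_sz is rejected regardless of the rate requested.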
It is the mlx5_ib_query_device function in the AlmaLinux 8.5 linux-4.18.0-348.20.1.el8.x86_64/drivers/infiniband/hw/mlx5/main.c kernel file which sets MLX5_IB_PP_SUPPORT_BURST:
if (MLX5_CAP_QOS(mdev, packet_pacing) &&
MLX5_CAP_GEN(mdev, qos)) {
resp.packet_pacing_caps.qp_rate_limit_max =
MLX5_CAP_QOS(mdev, packet_pacing_max_rate);
resp.packet_pacing_caps.qp_rate_limit_min =
MLX5_CAP_QOS(mdev, packet_pacing_min_rate);
resp.packet_pacing_caps.supported_qpts |=
1 << IB_QPT_RAW_PACKET;
if (MLX5_CAP_QOS(mdev, packet_pacing_burst_bound) &&
MLX5_CAP_QOS(mdev, packet_pacing_typical_size))
resp.packet_pacing_caps.cap_flags |=
MLX5_IB_PP_SUPPORT_BURST;
Where MLX5_IB_PP_SUPPORT_BURST is set when both the packet_pacing_burst_bound and packet_pacing_typical_size fields read from the device are non-zero.
Generated a description of the available configuration parameters for the ConnectX-4 Lx device:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 show_confs | cat > show_confs.txt
Ones which look like they could impact burst support:
ACCURATE_TX_SCHEDULER=<False|True> When TRUE, the device will optimize the transmit scheduler for high accuracy. may hurt performance When False, the device defaults will apply for the scheduler.
TX_SCHEDULER_BURST=<NUM> Log (base2) of the transmission scheduler default burst size, given in bytes, Value 0x0 indicates using device defaults.
Attempted to enable the accurate tx scheduler with:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 set ACCURATE_TX_SCHEDULER=1
Device #1:
----------
Device type: ConnectX4LX
Name: MCX4121A-XCA_Ax
Description: ConnectX-4 Lx EN network interface card; 10GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
Device: /dev/mst/mt4117_pciconf0
Configurations: Next Boot New
ACCURATE_TX_SCHEDULER False(0) True(1)
Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
After a reset there was no change in the ibv_devinfo output:
[mr_halfword@haswell-alma connectx_4_lx_config]$ ibv_devinfo -v> after_tx_accurate_scheduler_devinfo.log
[mr_halfword@haswell-alma connectx_4_lx_config]$ diff after_ibv_devinfo.log after_tx_accurate_scheduler_devinfo.log
And a mlxconfig query reported the expected difference:
[mr_halfword@haswell-alma connectx_4_lx_config]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 q|cat > after_tx_accurate_scheduler_mlxconfig.log
[mr_halfword@haswell-alma connectx_4_lx_config]$ diff after_mlxconfig_query.log after_tx_accurate_scheduler_mlxconfig.log
31c31
< ACCURATE_TX_SCHEDULER False(0)
---
> ACCURATE_TX_SCHEDULER True(1)
Testing the effect of the change:
- The ibv_modify_qp_rate_limit call still failed with EINVAL as mctx->packet_pacing_caps.cap_flags was still zero.
- Initially thought there was no difference in the gap between packets, but on further experiments the gap seems sensitive to the MTU set on the Ethernet device.
With the in-box driver in AlmaLinux 8.5 the following is reported for each ConnectX-4 Lx device in the dmesg output:
[ 4.029819] mlx5_core 0000:01:00.0: Rate limit: 13 rates are supported, range: 0Mbps to 9765Mbps
Where that message is from the mlx5_init_rl_table function in the drivers/net/ethernet/mellanox/mlx5/core/rl.c source file, which reads the device capabilities and shifts right by 10 (i.e. divides by 1024) to change the packet pacing rates from Kbps to approximately Mbps:
/* First entry is reserved for unlimited rate */
table->max_size = MLX5_CAP_QOS(dev, packet_pacing_rate_table_size) - 1;
table->max_rate = MLX5_CAP_QOS(dev, packet_pacing_max_rate);
table->min_rate = MLX5_CAP_QOS(dev, packet_pacing_min_rate);
mlx5_core_info(dev, "Rate limit: %u rates are supported, range: %uMbps to %uMbps\n",
table->max_size,
table->min_rate >> 10,
table->max_rate >> 10);
In order to load a mlx5_core module instrumented with the raw packet pacing fields from the device:
- Had already performed the steps in 18. Install Kernel source, which prepared the Kernel source for AlmaLinux 8.5
- Took a copy of the prepared Kernel source:
[mr_halfword@haswell-alma ~]$ cd ~
[mr_halfword@haswell-alma ~]$ cp -pr ~/rpmbuild/BUILD/kernel-4.18.0-348.20.1.el8_5 ~/mlx5_packet_pacing_diags
- Using How to recompile just a single kernel module? as a guide, used the copy of the Kernel source tree to build just the mlx5_core module. Change to the root of the source tree:
[mr_halfword@haswell-alma ~]$ cd ~/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/
- Import the configuration from the running Kernel:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ cp /boot/config-`uname -r` .config
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ make oldconfig
scripts/kconfig/conf --oldconfig Kconfig
#
# configuration written to .config
#
- Make scripts, prepare and modules_prepare:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ make scripts
scripts/kconfig/conf --syncconfig Kconfig
HOSTCC scripts/basic/bin2c
WRAP arch/x86/include/generated/uapi/asm/bpf_perf_event.h
WRAP arch/x86/include/generated/uapi/asm/poll.h
WRAP arch/x86/include/generated/uapi/asm/socket.h
WRAP arch/x86/include/generated/asm/dma-contiguous.h
WRAP arch/x86/include/generated/asm/early_ioremap.h
WRAP arch/x86/include/generated/asm/mcs_spinlock.h
WRAP arch/x86/include/generated/asm/mm-arch-hooks.h
WRAP arch/x86/include/generated/asm/mmiowb.h
HOSTCC scripts/genksyms/genksyms.o
YACC scripts/genksyms/parse.tab.c
HOSTCC scripts/genksyms/parse.tab.o
LEX scripts/genksyms/lex.lex.c
YACC scripts/genksyms/parse.tab.h
HOSTCC scripts/genksyms/lex.lex.o
HOSTLD scripts/genksyms/genksyms
UPD include/generated/uapi/linux/version.h
CC scripts/mod/empty.o
HOSTCC scripts/mod/mk_elfconfig
MKELF scripts/mod/elfconfig.h
HOSTCC scripts/mod/modpost.o
CC scripts/mod/devicetable-offsets.s
UPD scripts/mod/devicetable-offsets.h
HOSTCC scripts/mod/file2alias.o
HOSTCC scripts/mod/sumversion.o
HOSTLD scripts/mod/modpost
HOSTCC scripts/selinux/genheaders/genheaders
HOSTCC scripts/selinux/mdp/mdp
HOSTCC scripts/kallsyms
HOSTCC scripts/pnmtologo
HOSTCC scripts/conmakehash
HOSTCC scripts/recordmcount
HOSTCC scripts/sortextable
HOSTCC scripts/asn1_compiler
HOSTCC scripts/sign-file
HOSTCC scripts/extract-cert
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ make prepare
SYSTBL arch/x86/include/generated/asm/syscalls_32.h
SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h
SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h
SYSTBL arch/x86/include/generated/asm/syscalls_64.h
HYPERCALLS arch/x86/include/generated/asm/xen-hypercalls.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_32.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_64.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_x32.h
HOSTCC arch/x86/tools/relocs_32.o
HOSTCC arch/x86/tools/relocs_64.o
HOSTCC arch/x86/tools/relocs_common.o
HOSTLD arch/x86/tools/relocs
UPD include/config/kernel.release
UPD include/generated/utsrelease.h
CC kernel/bounds.s
UPD include/generated/bounds.h
UPD include/generated/timeconst.h
CC arch/x86/kernel/asm-offsets.s
UPD include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
DESCEND objtool
HOSTCC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/fixdep.o
HOSTLD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/fixdep-in.o
LINK /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/fixdep
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/exec-cmd.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/help.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/pager.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/parse-options.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/run-command.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/sigchain.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/subcmd-config.o
LD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/libsubcmd-in.o
AR /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/libsubcmd.a
MKDIR /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/arch/x86/lib/
GEN /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/arch/x86/lib/inat-tables.c
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/arch/x86/decode.o
LD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/arch/x86/objtool-in.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/builtin-check.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/builtin-orc.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/check.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/orc_gen.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/orc_dump.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/elf.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/special.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/objtool.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/libstring.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/libctype.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/str_error_r.o
LD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/objtool-in.o
Warning: synced file at 'tools/objtool/arch/x86/include/asm/inat.h' differs from latest kernel version at 'arch/x86/include/asm/inat.h'
Warning: synced file at 'tools/objtool/arch/x86/include/asm/insn.h' differs from latest kernel version at 'arch/x86/include/asm/insn.h'
Warning: synced file at 'tools/objtool/arch/x86/lib/inat.c' differs from latest kernel version at 'arch/x86/lib/inat.c'
Warning: synced file at 'tools/objtool/arch/x86/lib/insn.c' differs from latest kernel version at 'arch/x86/lib/insn.c'
LINK /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/objtool/objtool
DESCEND bpf/resolve_btfids
Auto-detecting system features:
... libelf: [ on ]
... zlib: [ on ]
... bpf: [ on ]
GEN /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/bpf_helper_defs.h
MKDIR /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/libbpf.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/bpf.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/nlattr.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/btf.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/libbpf_errno.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/str_error.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/netlink.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/bpf_prog_linfo.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/libbpf_probes.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/xsk.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/hashmap.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/btf_dump.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/ringbuf.o
LD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/staticobjs/libbpf-in.o
LINK /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/libbpf.a
HOSTCC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/fixdep.o
HOSTLD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/fixdep-in.o
LINK /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/fixdep
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/exec-cmd.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/help.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/pager.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/parse-options.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/run-command.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/sigchain.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/subcmd-config.o
LD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/libsubcmd-in.o
AR /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/libsubcmd.a
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/main.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/rbtree.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/zalloc.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/string.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/ctype.o
CC /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/str_error_r.o
LD /home/mr_halfword/mlx5_packet_pacing_diags/linux-4.18.0-348.20.1.el8.x86_64/tools/bpf/resolve_btfids/resolve_btfids-in.o
LINK resolve_btfids
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ make modules_prepare
CALL scripts/checksyscalls.sh
DESCEND objtool
DESCEND bpf/resolve_btfids
- Copy the Module.symvers which matches the running kernel:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ cp /usr/src/kernels/`uname -r`/Module.symvers .
- Compile the mlx5_core module, with no changes to check that it compiles without error:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ make modules SUBDIRS=drivers/net/ethernet/mellanox/mlx5/core
CC [M] drivers/net/ethernet/mellanox/mlx5/core/main.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/cmd.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/debugfs.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fw.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/eq.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/uar.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/pagealloc.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/health.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/mcg.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/cq.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/alloc.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/port.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/mr.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/pd.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/transobj.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/vport.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sriov.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fs_cmd.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fs_core.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/pci_irq.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fs_counters.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/rl.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lag.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/dev.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/events.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/wq.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/gid.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/devcom.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/pci_vsc.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/dm.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/diag/fs_tracepoint.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/diag/crdump.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/devlink.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/diag/rsc_dump.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fw_reset.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/qos.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_main.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_common.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_fs.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_tx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_rx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_dim.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_txrx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/xdp.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_stats.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_selftest.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/port.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/monitor_stats.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/health.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/params.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/xsk/setup.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/xsk/tx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/devlink.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/ptp.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/qos.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/trap.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_arfs.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/port_buffer.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lag_mp.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/geneve.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/port_tun.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_rep.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/rep/bond.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/mapping.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_tc.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/rep/neigh.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/fs_chains.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/indir_table.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_encap.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_gre.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_geneve.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun_mplsoudp.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/diag/en_tc_tracepoint.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/eswitch.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads_termtbl.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/ecpf.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/rdma.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/legacy.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/acl/helper.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_lgcy.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/acl/egress_ofld.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ingress_lgcy.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/acl/ingress_ofld.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/vporttbl.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/esw/sample.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/mpfs.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/clock.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib_vlan.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/accel/ipsec_offload.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fpga/ipsec.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/crypto.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/accel/tls.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/accel/ipsec.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fpga/cmd.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fpga/core.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fpga/conn.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_rxtx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_stats.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_stats.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/fs_tcp.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_txrx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_table.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_matcher.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_rule.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_icm_pool.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_buddy.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v0.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste_v1.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_cmd.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_fw.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/steering/fs_dr.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sf/vhca_event.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sf/dev/dev.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sf/cmd.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sf/hw_table.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/sf/devlink.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/en/hv_vhca_stats.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/vxlan.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/hv.o
CC [M] drivers/net/ethernet/mellanox/mlx5/core/lib/hv_vhca.o
LD [M] drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.o
Building modules, stage 2.
MODPOST 1 modules
CC drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.mod.o
LD [M] drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
- In the drivers/net/ethernet/mellanox/mlx5/core/rl.c file change from:
    mlx5_core_info(dev, "Rate limit: %u rates are supported, range: %uMbps to %uMbps\n",
                   table->max_size,
                   table->min_rate >> 10,
                   table->max_rate >> 10);
  To:
    mlx5_core_info(dev, "Rate limit: %u rates are supported, range: %uKbps to %uKbps packet_pacing_burst_bound=%u packet_pacing_typical_size=%u\n",
                   table->max_size,
                   table->min_rate,
                   table->max_rate,
                   MLX5_CAP_QOS(dev, packet_pacing_burst_bound),
                   MLX5_CAP_QOS(dev, packet_pacing_typical_size));
- Rebuild the module with the changes:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ make modules SUBDIRS=drivers/net/ethernet/mellanox/mlx5/core
CC [M] drivers/net/ethernet/mellanox/mlx5/core/rl.o
LD [M] drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.o
Building modules, stage 2.
MODPOST 1 modules
CC drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.mod.o
LD [M] drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
- Load the modified module:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ sudo rmmod mlx5_ib mlx5_core;sudo insmod drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.ko
9. Impact of ConnectX-4 Lx firmware configuration on packet pacing capabilities seen by Kernel module
For all these tests the TLV change to enable packet-pacing was left enabled.
With the configuration as left from above, i.e. ACCURATE_TX_SCHEDULER=True and TX_SCHEDULER_BURST=0 then the mlx5_core module with the diagnostics reports:
[11098.480121] mlx5_core 0000:01:00.0: Rate limit: 13 rates are supported, range: 1Kbps to 10000000Kbps packet_pacing_burst_bound=1 packet_pacing_typical_size=0
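As a side note on why the diagnostic was changed to print Kbps: the original message converted the rates with a right shift by 10, which divides by 1024 and truncates. A minimal sketch of that conversion (the function name is hypothetical, only the `>> 10` comes from rl.c) shows what the unmodified message would print for the range reported above:

```c
#include <stdint.h>

/* Conversion used by the original message: a right shift by 10 bits,
 * i.e. an integer divide by 1024 that discards the remainder.
 * The function name is hypothetical, for illustration only. */
static uint32_t kbps_to_mbps_shift(uint32_t rate_kbps)
{
    return rate_kbps >> 10;
}
```

With the reported range, `kbps_to_mbps_shift(1)` gives 0 and `kbps_to_mbps_shift(10000000)` gives 9765, so the minimum rate would be shown as 0Mbps and the 10GbE maximum as 9765Mbps.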
Changed ACCURATE_TX_SCHEDULER back to its default of False:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 set ACCURATE_TX_SCHEDULER=0
Device #1:
----------
Device type: ConnectX4LX
Name: MCX4121A-XCA_Ax
Description: ConnectX-4 Lx EN network interface card; 10GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
Device: /dev/mst/mt4117_pciconf0
Configurations: Next Boot New
ACCURATE_TX_SCHEDULER True(1) False(0)
Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
After a reboot there was no change to the reported packet pacing capabilities:
[ 199.073230] mlx5_core 0000:01:00.0: Rate limit: 13 rates are supported, range: 1Kbps to 10000000Kbps packet_pacing_burst_bound=1 packet_pacing_typical_size=0
Re-enabled ACCURATE_TX_SCHEDULER and set TX_SCHEDULER_BURST, which is a base-2 logarithm, to 11 (2048 bytes, to allow for the switch test packet size):
mr_halfword@haswell-alma ~]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 set ACCURATE_TX_SCHEDULER=1 TX_SCHEDULER_BURST=11
Device #1:
----------
Device type: ConnectX4LX
Name: MCX4121A-XCA_Ax
Description: ConnectX-4 Lx EN network interface card; 10GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
Device: /dev/mst/mt4117_pciconf0
Configurations: Next Boot New
ACCURATE_TX_SCHEDULER False(0) True(1)
TX_SCHEDULER_BURST 0 11
Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
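The choice of 11 can be checked with a small sketch. It assumes, as the text above does, that TX_SCHEDULER_BURST is the base-2 logarithm of a burst size in bytes; the helper names are hypothetical and not part of any Mellanox tool.

```c
#include <stdint.h>

/* Bytes represented by a base-2 logarithm burst setting, e.g. 11 -> 2048.
 * Assumes burst_log2 is below 32. */
static uint32_t burst_bytes_from_log2(unsigned int burst_log2)
{
    return (uint32_t)1 << burst_log2;
}

/* Smallest base-2 logarithm whose burst size is at least packet_bytes. */
static unsigned int burst_log2_for_packet(uint32_t packet_bytes)
{
    unsigned int burst_log2 = 0;

    while (burst_log2 < 31 && burst_bytes_from_log2(burst_log2) < packet_bytes)
        burst_log2++;
    return burst_log2;
}
```

For example `burst_bytes_from_log2(11)` is 2048, and `burst_log2_for_packet(1500)` also returns 11 since a setting of 10 would only cover 1024 bytes.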
After a reboot there was no change to the reported packet pacing capabilities:
[ 146.267158] mlx5_core 0000:01:00.0: Rate limit: 13 rates are supported, range: 1Kbps to 10000000Kbps packet_pacing_burst_bound=1 packet_pacing_typical_size=0
With these changes there was no noticeable difference in the packet burst size compared to 7.1 Burst of packets:
- With the MTU on the Ethernet interface set to 9216 Wireshark indicates bursts of 43 packets.
[mr_halfword@haswell-alma release]$ ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1000
Send 1280 frames over 15.876766 secs, average 80.6 Hz
- With the MTU set to 1500 Wireshark indicates bursts of 3 packets.
[mr_halfword@haswell-alma release]$ ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1000
Send 1280 frames over 14.426067 secs, average 88.7 Hz
- With the MTU set to 2048 Wireshark indicates bursts of 2 packets.
[mr_halfword@haswell-alma release]$ ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1000
Send 1536 frames over 12.896631 secs, average 119.1 Hz
- With the MTU set to 2048 Wireshark indicates bursts of 2 packets.
[mr_halfword@haswell-alma release]$ ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 500
Send 1024 frames over 17.178638 secs, average 59.6 Hz
As per .2. ibv_modify_qp_rate_limit failing due to lack of burst support the user space mlx5_modify_qp_rate_limit in the rdma-core mlx5 provider doesn't allow max_burst_sz or typical_pkt_sz to be modified in a Queue-Pair unless both:
- rate_limit is non-zero in the request.
- The MLX5_IB_PP_SUPPORT_BURST flag is set in the packet_pacing_caps.cap_flags reported by the kernel for the device. As per the above analysis this flag is only set when the reported device QOS capabilities support both packet_pacing_burst_bound and packet_pacing_typical_size.
The __mlx5_ib_modify_qp function in the kernel drivers/infiniband/hw/mlx5/qp.c has independent tests for whether modifications to max_burst_sz or typical_pkt_sz are allowed:
    if (attr_mask & IB_QP_RATE_LIMIT) {
        raw_qp_param.rl.rate = attr->rate_limit;
        if (ucmd->burst_info.max_burst_sz) {
            if (attr->rate_limit &&
                MLX5_CAP_QOS(dev->mdev, packet_pacing_burst_bound)) {
                raw_qp_param.rl.max_burst_sz =
                    ucmd->burst_info.max_burst_sz;
            } else {
                err = -EINVAL;
                goto out;
            }
        }
        if (ucmd->burst_info.typical_pkt_sz) {
            if (attr->rate_limit &&
                MLX5_CAP_QOS(dev->mdev, packet_pacing_typical_size)) {
                raw_qp_param.rl.typical_pkt_sz =
                    ucmd->burst_info.typical_pkt_sz;
            } else {
                err = -EINVAL;
                goto out;
            }
        }
        raw_qp_param.set_mask |= MLX5_RAW_QP_RATE_LIMIT;
    }
Based upon the kernel code, changed the user space code:
- In verbs.c the test in mlx5_modify_qp_rate_limit on the MLX5_IB_PP_SUPPORT_BURST flag was commented out.
- In ibv_raw_packet_tx.c modified the call to ibv_modify_qp_rate_limit to only set rate_limit and max_burst_sz.
With both of the user space code changes it was then possible to successfully call mlx5_modify_qp_rate_limit with non-zero values for rate_limit and max_burst_sz.
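The difference between the two checks can be illustrated with a simplified model. This is a sketch only, not the rdma-core or kernel source: the struct and function names are hypothetical, and the rules are reduced to what is described above (user space needs the burst flag, which requires both capabilities, while the kernel checks each field against its own capability).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the QOS capabilities reported for the device. */
struct pacing_caps {
    uint32_t burst_bound;   /* packet_pacing_burst_bound */
    uint32_t typical_size;  /* packet_pacing_typical_size */
};

/* Hypothetical model of a rate limit request on a Queue-Pair. */
struct pacing_request {
    uint32_t rate_limit;
    uint32_t max_burst_sz;
    uint32_t typical_pkt_sz;
};

/* User space rule: either burst field needs a non-zero rate_limit and the
 * burst support flag, which is only set when both capabilities are present. */
static bool user_space_allows(const struct pacing_caps *caps,
                              const struct pacing_request *req)
{
    bool support_burst = caps->burst_bound && caps->typical_size;

    if (req->max_burst_sz || req->typical_pkt_sz)
        return req->rate_limit && support_burst;
    return true;
}

/* Kernel rule: each burst field is tested only against its own capability. */
static bool kernel_allows(const struct pacing_caps *caps,
                          const struct pacing_request *req)
{
    if (req->max_burst_sz && !(req->rate_limit && caps->burst_bound))
        return false;
    if (req->typical_pkt_sz && !(req->rate_limit && caps->typical_size))
        return false;
    return true;
}
```

With the capabilities reported for this device (packet_pacing_burst_bound=1, packet_pacing_typical_size=0), a request that sets only rate_limit and max_burst_sz is rejected by the user space rule but accepted by the kernel rule, which matches the reason for the two user space changes. A request that also sets typical_pkt_sz is rejected by both.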
Left the ConnectX-4 Lx configuration unchanged from the previous tests.
Output from running the tests:
[mr_halfword@haswell-alma ibv_message_passing]$ ./run_raw_packet_pacing_tests.sh ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l200000.csv
HCA core clock 156250 KHz
Send 181525 frames over 10.027515 secs (CLOCK_MONOTONIC), average 18102.7 Hz
Average bit rate for untagged frames = 222.7 Mbps
HCA elapsed time = 10.027382 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l200000-b.csv -b
HCA core clock 156250 KHz
Send 133897 frames over 10.038814 secs (CLOCK_MONOTONIC), average 13337.9 Hz
Average bit rate for untagged frames = 164.1 Mbps
HCA elapsed time = 10.038680 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l1200000.csv
HCA core clock 156250 KHz
Send 1067564 frames over 10.004821 secs (CLOCK_MONOTONIC), average 106705.0 Hz
Average bit rate for untagged frames = 1312.9 Mbps
HCA elapsed time = 10.004688 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l1200000-b.csv -b
HCA core clock 156250 KHz
Send 1067564 frames over 10.004799 secs (CLOCK_MONOTONIC), average 106705.2 Hz
Average bit rate for untagged frames = 1312.9 Mbps
HCA elapsed time = 10.004666 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l2400000.csv
HCA core clock 156250 KHz
Send 2134630 frames over 10.002422 secs (CLOCK_MONOTONIC), average 213411.3 Hz
Average bit rate for untagged frames = 2625.8 Mbps
HCA elapsed time = 10.002288 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l2400000-b.csv -b
HCA core clock 156250 KHz
Send 2134623 frames over 10.002406 secs (CLOCK_MONOTONIC), average 213411.0 Hz
Average bit rate for untagged frames = 2625.8 Mbps
HCA elapsed time = 10.002273 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l4800000.csv
HCA core clock 156250 KHz
Send 3963866 frames over 10.001304 secs (CLOCK_MONOTONIC), average 396334.9 Hz
Average bit rate for untagged frames = 4876.5 Mbps
HCA elapsed time = 10.001172 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m9216_l4800000-b.csv -b
HCA core clock 156250 KHz
Send 4268741 frames over 10.001207 secs (CLOCK_MONOTONIC), average 426822.6 Hz
Average bit rate for untagged frames = 5251.6 Mbps
HCA elapsed time = 10.001074 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l200000.csv
HCA core clock 156250 KHz
Send 191057 frames over 10.026802 secs (CLOCK_MONOTONIC), average 19054.6 Hz
Average bit rate for untagged frames = 234.4 Mbps
HCA elapsed time = 10.026668 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l200000-b.csv -b
HCA core clock 156250 KHz
Send 305386 frames over 10.016788 secs (CLOCK_MONOTONIC), average 30487.4 Hz
Average bit rate for untagged frames = 375.1 Mbps
HCA elapsed time = 10.016654 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l1200000.csv
HCA core clock 156250 KHz
Send 1067571 frames over 10.004830 secs (CLOCK_MONOTONIC), average 106705.6 Hz
Average bit rate for untagged frames = 1312.9 Mbps
HCA elapsed time = 10.004697 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l1200000-b.csv -b
HCA core clock 156250 KHz
Send 1220006 frames over 10.004190 secs (CLOCK_MONOTONIC), average 121949.5 Hz
Average bit rate for untagged frames = 1500.5 Mbps
HCA elapsed time = 10.004057 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l2400000.csv
HCA core clock 156250 KHz
Send 2134623 frames over 10.002397 secs (CLOCK_MONOTONIC), average 213411.1 Hz
Average bit rate for untagged frames = 2625.8 Mbps
HCA elapsed time = 10.002264 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l2400000-b.csv -b
HCA core clock 156250 KHz
Send 1220006 frames over 10.004191 secs (CLOCK_MONOTONIC), average 121949.5 Hz
Average bit rate for untagged frames = 1500.5 Mbps
HCA elapsed time = 10.004058 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l4800000.csv
HCA core clock 156250 KHz
Send 4268734 frames over 10.001201 secs (CLOCK_MONOTONIC), average 426822.2 Hz
Average bit rate for untagged frames = 5251.6 Mbps
HCA elapsed time = 10.001067 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m2048_l4800000-b.csv -b
HCA core clock 156250 KHz
Send 1220004 frames over 10.004185 secs (CLOCK_MONOTONIC), average 121949.4 Hz
Average bit rate for untagged frames = 1500.5 Mbps
HCA elapsed time = 10.004052 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l200000.csv
HCA core clock 156250 KHz
Send 152948 frames over 10.033561 secs (CLOCK_MONOTONIC), average 15243.6 Hz
Average bit rate for untagged frames = 187.6 Mbps
HCA elapsed time = 10.033427 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l200000-b.csv -b
HCA core clock 156250 KHz
Send 152949 frames over 10.033574 secs (CLOCK_MONOTONIC), average 15243.7 Hz
Average bit rate for untagged frames = 187.6 Mbps
HCA elapsed time = 10.033441 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l1200000.csv
HCA core clock 156250 KHz
Send 915133 frames over 10.005597 secs (CLOCK_MONOTONIC), average 91462.1 Hz
Average bit rate for untagged frames = 1125.3 Mbps
HCA elapsed time = 10.005463 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l1200000-b.csv -b
HCA core clock 156250 KHz
Send 610259 frames over 10.008387 secs (CLOCK_MONOTONIC), average 60974.8 Hz
Average bit rate for untagged frames = 750.2 Mbps
HCA elapsed time = 10.008253 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l2400000.csv
HCA core clock 156250 KHz
Send 1829750 frames over 10.002804 secs (CLOCK_MONOTONIC), average 182923.7 Hz
Average bit rate for untagged frames = 2250.7 Mbps
HCA elapsed time = 10.002671 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l2400000-b.csv -b
HCA core clock 156250 KHz
Send 610259 frames over 10.008390 secs (CLOCK_MONOTONIC), average 60974.7 Hz
Average bit rate for untagged frames = 750.2 Mbps
HCA elapsed time = 10.008256 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l4800000.csv
HCA core clock 156250 KHz
Send 4268715 frames over 10.001197 secs (CLOCK_MONOTONIC), average 426820.4 Hz
Average bit rate for untagged frames = 5251.6 Mbps
HCA elapsed time = 10.001063 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/accurate_tx_scheduler_tx_scheduler_burst11_m1500_l4800000-b.csv -b
HCA core clock 156250 KHz
Send 610256 frames over 10.008383 secs (CLOCK_MONOTONIC), average 60974.5 Hz
Average bit rate for untagged frames = 750.2 Mbps
HCA elapsed time = 10.008249 secs
Reverted to the original scheduler related firmware configuration options and then rebooted:
[mr_halfword@haswell-alma linux-4.18.0-348.20.1.el8.x86_64]$ sudo mlxconfig -d /dev/mst/mt4117_pciconf0 set ACCURATE_TX_SCHEDULER=0 TX_SCHEDULER_BURST=0
[sudo] password for mr_halfword:
Device #1:
----------
Device type: ConnectX4LX
Name: MCX4121A-XCA_Ax
Description: ConnectX-4 Lx EN network interface card; 10GbE dual-port SFP28; PCIe3.0 x8; ROHS R6
Device: /dev/mst/mt4117_pciconf0
Configurations: Next Boot New
ACCURATE_TX_SCHEDULER True(1) False(0)
TX_SCHEDULER_BURST 11 0
Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.
Output from running the tests:
[mr_halfword@haswell-alma ibv_message_passing]$ ./run_raw_packet_pacing_tests.sh ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l200000.csv
HCA core clock 156250 KHz
Send 181525 frames over 10.027740 secs (CLOCK_MONOTONIC), average 18102.3 Hz
Average bit rate for untagged frames = 222.7 Mbps
HCA elapsed time = 10.027610 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l200000-b.csv -b
HCA core clock 156250 KHz
Send 133890 frames over 10.038414 secs (CLOCK_MONOTONIC), average 13337.8 Hz
Average bit rate for untagged frames = 164.1 Mbps
HCA elapsed time = 10.038285 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l1200000.csv
HCA core clock 156250 KHz
Send 1067564 frames over 10.004826 secs (CLOCK_MONOTONIC), average 106704.9 Hz
Average bit rate for untagged frames = 1312.9 Mbps
HCA elapsed time = 10.004696 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l1200000-b.csv -b
HCA core clock 156250 KHz
Send 1067571 frames over 10.004809 secs (CLOCK_MONOTONIC), average 106705.8 Hz
Average bit rate for untagged frames = 1312.9 Mbps
HCA elapsed time = 10.004679 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l2400000.csv
HCA core clock 156250 KHz
Send 2134616 frames over 10.002410 secs (CLOCK_MONOTONIC), average 213410.2 Hz
Average bit rate for untagged frames = 2625.8 Mbps
HCA elapsed time = 10.002290 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l2400000-b.csv -b
HCA core clock 156250 KHz
Send 2134628 frames over 10.002393 secs (CLOCK_MONOTONIC), average 213411.7 Hz
Average bit rate for untagged frames = 2625.8 Mbps
HCA elapsed time = 10.002285 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l4800000.csv
HCA core clock 156250 KHz
Send 3963861 frames over 10.001304 secs (CLOCK_MONOTONIC), average 396334.4 Hz
Average bit rate for untagged frames = 4876.5 Mbps
HCA elapsed time = 10.001187 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m9216_l4800000-b.csv -b
HCA core clock 156250 KHz
Send 4268742 frames over 10.001197 secs (CLOCK_MONOTONIC), average 426823.1 Hz
Average bit rate for untagged frames = 5251.6 Mbps
HCA elapsed time = 10.001079 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l200000.csv
HCA core clock 156250 KHz
Send 191057 frames over 10.026776 secs (CLOCK_MONOTONIC), average 19054.7 Hz
Average bit rate for untagged frames = 234.4 Mbps
HCA elapsed time = 10.026658 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l200000-b.csv -b
HCA core clock 156250 KHz
Send 305384 frames over 10.016781 secs (CLOCK_MONOTONIC), average 30487.2 Hz
Average bit rate for untagged frames = 375.1 Mbps
HCA elapsed time = 10.016663 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l1200000.csv
HCA core clock 156250 KHz
Send 1067564 frames over 10.004839 secs (CLOCK_MONOTONIC), average 106704.8 Hz
Average bit rate for untagged frames = 1312.9 Mbps
HCA elapsed time = 10.004690 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l1200000-b.csv -b
HCA core clock 156250 KHz
Send 1219998 frames over 10.004198 secs (CLOCK_MONOTONIC), average 121948.6 Hz
Average bit rate for untagged frames = 1500.5 Mbps
HCA elapsed time = 10.004031 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l2400000.csv
HCA core clock 156250 KHz
Send 2134609 frames over 10.002404 secs (CLOCK_MONOTONIC), average 213409.6 Hz
Average bit rate for untagged frames = 2625.8 Mbps
HCA elapsed time = 10.002236 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l2400000-b.csv -b
HCA core clock 156250 KHz
Send 1219996 frames over 10.004190 secs (CLOCK_MONOTONIC), average 121948.5 Hz
Average bit rate for untagged frames = 1500.5 Mbps
HCA elapsed time = 10.004022 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l4800000.csv
HCA core clock 156250 KHz
Send 4268699 frames over 10.001198 secs (CLOCK_MONOTONIC), average 426818.7 Hz
Average bit rate for untagged frames = 5251.6 Mbps
HCA elapsed time = 10.001031 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m2048_l4800000-b.csv -b
HCA core clock 156250 KHz
Send 1220002 frames over 10.004191 secs (CLOCK_MONOTONIC), average 121949.1 Hz
Average bit rate for untagged frames = 1500.5 Mbps
HCA elapsed time = 10.004025 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l200000.csv
HCA core clock 156250 KHz
Send 152948 frames over 10.033536 secs (CLOCK_MONOTONIC), average 15243.7 Hz
Average bit rate for untagged frames = 187.6 Mbps
HCA elapsed time = 10.033409 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l200000-b.csv -b
HCA core clock 156250 KHz
Send 152949 frames over 10.033538 secs (CLOCK_MONOTONIC), average 15243.8 Hz
Average bit rate for untagged frames = 187.6 Mbps
HCA elapsed time = 10.033414 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l1200000.csv
HCA core clock 156250 KHz
Send 915134 frames over 10.005606 secs (CLOCK_MONOTONIC), average 91462.1 Hz
Average bit rate for untagged frames = 1125.4 Mbps
HCA elapsed time = 10.005483 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 1200000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l1200000-b.csv -b
HCA core clock 156250 KHz
Send 610259 frames over 10.008389 secs (CLOCK_MONOTONIC), average 60974.7 Hz
Average bit rate for untagged frames = 750.2 Mbps
HCA elapsed time = 10.008266 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l2400000.csv
HCA core clock 156250 KHz
Send 1829744 frames over 10.002803 secs (CLOCK_MONOTONIC), average 182923.1 Hz
Average bit rate for untagged frames = 2250.7 Mbps
HCA elapsed time = 10.002680 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 2400000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l2400000-b.csv -b
HCA core clock 156250 KHz
Send 610259 frames over 10.008387 secs (CLOCK_MONOTONIC), average 60974.8 Hz
Average bit rate for untagged frames = 750.2 Mbps
HCA elapsed time = 10.008263 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l4800000.csv
HCA core clock 156250 KHz
Send 4268741 frames over 10.001204 secs (CLOCK_MONOTONIC), average 426822.7 Hz
Average bit rate for untagged frames = 5251.6 Mbps
HCA elapsed time = 10.001079 secs
ibv_message_passing_c_project/bin/release/ibv_switch_test/ibv_raw_packet_tx -i rocep1s0f0 -n 1 -p 25-26 -l 4800000 -c ibv_message_passing_c_project/bin/packet_pacing/scheduler_defaults_m1500_l4800000-b.csv -b
HCA core clock 156250 KHz
Send 610259 frames over 10.008387 secs (CLOCK_MONOTONIC), average 60974.8 Hz
Average bit rate for untagged frames = 750.2 Mbps
HCA elapsed time = 10.008256 secs
Changing the value of the ACCURATE_TX_SCHEDULER and TX_SCHEDULER_BURST firmware parameters didn't affect the average bit rate generated.
Setting a non-zero max_burst_sz can cause the average bit rate to fall off dramatically at the higher requested rate limits, e.g. be only 16% of that requested.
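The 16% figure can be reproduced from the logs with a short calculation. This is a sketch under two assumptions that are not stated in this section: that the -l argument of ibv_raw_packet_tx is the requested rate limit in Kbps, and that the runs with -b are the ones which set max_burst_sz. The function name is hypothetical.

```c
/* Achieved rate as a percentage of the requested rate limit.
 * Assumes the achieved rate is in Mbps (as reported in the test output)
 * and the requested limit is in Kbps (assumed unit of the -l argument). */
static double achieved_percent(double achieved_mbps, double requested_kbps)
{
    return 100.0 * (achieved_mbps * 1000.0) / requested_kbps;
}
```

For the MTU 1500 runs with -l 4800000, `achieved_percent(750.2, 4800000)` is about 15.6 for the -b run, while `achieved_percent(5251.6, 4800000)` is about 109.4 for the run without -b.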
Trying some temporary changes to increase rate_limit_attr.max_burst_sz showed that it needs to be around 6 times the packet size to achieve the higher requested rates, but:
- The transmit completion timestamps then showed packets being sent in bursts.
- There was still some interaction with the MTU set on the device, in terms of the difference between the requested and actual transmit rate in bits per second.
I.e. it is still not clear how to get a deterministic send interval between packets.