I'm conducting network performance testing on VMware ESXi 6.7 using an Intel X710 quad-port 10G NIC and seeing some unusual behavior. It started while testing our application, but it also happens when testing with a CentOS 7.3 VM running an Ethernet bridge. The X710 on the ESXi host has the latest i40en 1.7.11 driver and 6.01 firmware installed.
We've set up a simple topology where the VM is connected to two of the X710 ports on the host through two standard vSwitches.
When forwarding TCP traffic across the Ethernet bridge, after a few hundred HTTP transactions, we start to see more and more ARP requests and replies from a server IP/MAC address to the broadcast MAC, trying to resolve a client IP/MAC address. This ARP traffic increases while the HTTP traffic decreases, until eventually only ARP requests/replies are seen on the wire. Once the HTTP traffic test is stopped, the ARP traffic continues for another minute or two. If we restart the test without touching anything else, the same pattern of HTTP traffic devolving into nothing but ARP requests/replies happens all over again.
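To put numbers on that drift from HTTP to ARP, a small script along these lines could bucket a capture by time and report how many frames in each bucket are ARP. This is only a rough sketch: it assumes a classic (non-pcapng) capture with microsecond timestamps and an Ethernet link type, for example one written by `tcpdump -w` inside the CentOS VM, and the function name `arp_share_per_bucket` is made up for illustration.

```python
import struct
from collections import Counter

ETHERTYPE_ARP = 0x0806
ETHERTYPE_VLAN = 0x8100  # single 802.1Q tag; stacked tags are not handled


def arp_share_per_bucket(pcap_bytes, bucket_seconds=10):
    """Return {bucket_index: (arp_frames, total_frames)} for a classic pcap.

    Assumes Ethernet frames; the link type in the file header is not checked.
    """
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # file written on a little-endian host
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # file written on a big-endian host
    else:
        raise ValueError("not a classic microsecond pcap file")

    offset = 24  # skip the 24-byte global header
    arp, total = Counter(), Counter()
    first_ts = None
    while offset + 16 <= len(pcap_bytes):
        # Per-record header: ts_sec, ts_usec, captured length, original length
        ts_sec, _ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", pcap_bytes, offset
        )
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        ethertype = struct.unpack_from("!H", frame, 12)[0]
        if ethertype == ETHERTYPE_VLAN and len(frame) >= 18:
            ethertype = struct.unpack_from("!H", frame, 16)[0]
        if first_ts is None:
            first_ts = ts_sec
        bucket = (ts_sec - first_ts) // bucket_seconds
        total[bucket] += 1
        if ethertype == ETHERTYPE_ARP:
            arp[bucket] += 1
    return {b: (arp[b], total[b]) for b in sorted(total)}
```

Calling it with the bytes of a capture file, e.g. `arp_share_per_bucket(open("bridge.pcap", "rb").read())` (the file name is a placeholder), gives one `(arp, total)` pair per 10-second window, which makes it easy to see when ARP starts to dominate.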
We've tried to narrow down the specifics of this test and we've come up with the following observations:
- this only happens with the Fortville X710 cards; if we swap in a Niantic 82599, the problem doesn't occur
- this only happens with more than 100 or so src/dst IP address pairs. With fewer than 100, the ARP requests/replies are not seen.
- this problem does not seem to happen when using distributed vSwitches on the ESXi host; only the standard vSwitches exhibit this behavior.
- on the Force10 switch connecting the traffic generator to the ESXi host, the outgoing interface claims to have sent only a few hundred broadcast packets, whereas the vmnic stats on the ESXi host claim to have received millions of broadcast packets. This makes it seem as though the broadcasts are originating in the i40en driver or the standard vSwitch code.
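
For the last observation, the counter mismatch can be pinned to a single test run by snapshotting the vmnic stats before and after and diffing them. The sketch below assumes the stats are saved as plain `Name: value` text, which is how `esxcli network nic stats get -n vmnic4` prints them as far as I can tell (`vmnic4` is just a placeholder uplink name); the helper names are made up for illustration.

```python
def parse_counters(text):
    """Parse 'Name: 123' lines into a dict, skipping anything non-numeric."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def counter_deltas(before_text, after_text):
    """Return how much each counter present in both snapshots increased."""
    before = parse_counters(before_text)
    after = parse_counters(after_text)
    return {name: after[name] - before[name] for name in after if name in before}
```

Comparing the delta of the host's received-broadcast counter against the delta of the switch port's sent-broadcast counter over the same interval shows how many broadcasts appeared without ever crossing the physical link.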
At this point, it's difficult to tell where this problem is originating, either in the i40en driver or in the VMware standard vSwitch. Has anyone else observed anything similar to this behavior or might have a suggestion as to where to look further?