VMware

NIC Flapping and Instability

Hoping someone might have some insight or a similar experience that could assist.

Running a small lab/prod environment with 2 x E200-8D. These are Xeon D processors with 2 x 1GB NICs and 2 x 10GB NICs.

I am only running a small number of VMs currently, with very little load overall, and am testing out the Starwind vSphere Linux appliance, which I am really impressed with so far.

Having a bit of an issue with NIC flapping that I noted some weeks ago in two instances, which later disappeared after switching from the SW Windows Free version to the Linux appliance. I assumed it was something amiss in the Windows boxes related to SW, as it appeared to happen during the initial sync, but it has shown up again.

The vSwitches are set up with the 2 x 1GB NICs (Intel Corporation I350 Gigabit Network Connection) dedicated to VM traffic through a switch, one of the 10GB NICs (Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T) dedicated to Starwind Sync traffic over a crossover cable to the other ESXi host, and the other 10GB NIC dedicated to Starwind iSCSI and vMotion.

I don’t actually think this is specifically related to Starwind, as it has been fairly stable the last few weeks. Today, after running the latest batch of ESXi updates, I had an alert about a NIC drop. I figured this was expected, as the host rebooted and one end of the link had temporarily gone down in the reboot cycle, but with both boxes up, the link issue remained. As the hosts are plugged directly into each other via crossover cable, I’m struggling to work out whether one of them specifically is the cause. Only one of the two 10GB NICs connected between hosts is currently down (the vMotion one). I have tried two sets of cables also.

Through SSH I forced the NIC down and up again with

esxcli network nic down -n <nic>
esxcli network nic up -n <nic>

The NICs appear to come back up but are now stuck at 1GB actual speed, although I have fixed them at 10GB full duplex. I have tried setting them to auto-negotiate with no luck. MTU on the 10GB NICs is set to 9000.

I’m at a loss at the moment, so any insight would be amazing.

Edit: Adding NIC detail from each host:

[root@melb-esxi01:~] esxcli network nic list
Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ---------------------------------------------------
vmnic0  0000:05:00.0  igbn    Up            Up           1000   Full    00:25:90:bb:55:1a  1500  Intel Corporation I350 Gigabit Network Connection
vmnic1  0000:05:00.1  igbn    Up            Up           1000   Full    00:25:90:bb:55:1b  1500  Intel Corporation I350 Gigabit Network Connection
vmnic2  0000:03:00.0  ixgben  Up            Up           10000  Full    00:25:90:bb:56:a8  9000  Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T
vmnic3  0000:03:00.1  ixgben  Up            Up           1000   Full    00:25:90:bb:56:a9  9000  Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T
[root@melb-esxi01:~] esxcli network nic get -n vmnic3
Advertised Auto Negotiation: true
Advertised Link Modes: Auto, 1000BaseT/Full, 10000BaseT/Full
Auto Negotiation: true
Cable Type: Twisted Pair
Current Message Level: -1
Driver Info:
      Bus Info: 0000:03:00:1
      Driver: ixgben
      Firmware Version: 0x800006b7
      Version: 1.7.1.16
Link Detected: true
Link Status: Up by explicit linkSet
Name: vmnic3
PHYAddress: 0
Pause Autonegotiate: true
Pause RX: true
Pause TX: true
Supported Ports: TP
Supports Auto Negotiation: true
Supports Pause: true
Supports Wakeon: false
Transceiver:
Virtual Address: 00:50:56:54:ef:d0
Wakeon: None
[root@melb-esxi01:~] esxcli software vib list | grep gb
igbn 0.1.1.0-5vmw.670.3.73.14320388 VMW VMwareCertified 2019-09-20
ixgben 1.7.1.16-1vmw.670.3.73.14320388 VMW VMwareCertified 2019-09-20
net-igb 5.0.5.1.1-5vmw.670.0.0.8169922 VMW VMwareCertified 2019-08-13
net-ixgbe 3.7.13.7.14iov-20vmw.670.0.0.8169922 VMW VMwareCertified 2019-08-13

[root@melb-esxi02:~] esxcli network nic list
Name    PCI Device    Driver  Admin Status  Link Status  Speed  Duplex  MAC Address        MTU   Description
------  ------------  ------  ------------  -----------  -----  ------  -----------------  ----  ---------------------------------------------------
vmnic0  0000:05:00.0  igbn    Up            Up           1000   Full    00:25:90:bb:55:3e  1500  Intel Corporation I350 Gigabit Network Connection
vmnic1  0000:05:00.1  igbn    Up            Up           1000   Full    00:25:90:bb:55:3f  1500  Intel Corporation I350 Gigabit Network Connection
vmnic2  0000:03:00.0  ixgben  Up            Up           10000  Full    00:25:90:bb:56:e6  9000  Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T
vmnic3  0000:03:00.1  ixgben  Up            Up           1000   Full    00:25:90:bb:56:e7  9000  Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T
[root@melb-esxi02:~] esxcli network nic get -n vmnic3
Advertised Auto Negotiation: true
Advertised Link Modes: Auto, 1000BaseT/Full, 10000BaseT/Full
Auto Negotiation: true
Cable Type: Twisted Pair
Current Message Level: -1
Driver Info:
      Bus Info: 0000:03:00:1
      Driver: ixgben
      Firmware Version: 0x800006b7
      Version: 1.7.1.16
Link Detected: true
Link Status: Up by explicit linkSet
Name: vmnic3
PHYAddress: 0
Pause Autonegotiate: true
Pause RX: true
Pause TX: true
Supported Ports: TP
Supports Auto Negotiation: true
Supports Pause: true
Supports Wakeon: false
Transceiver:
Virtual Address: 00:50:56:5d:ae:11
Wakeon: None
[root@melb-esxi02:~] esxcli software vib list | grep gb
igbn 0.1.1.0-5vmw.670.3.73.14320388 VMW VMwareCertified 2019-09-20
ixgben 1.7.1.16-1vmw.670.3.73.14320388 VMW VMwareCertified 2019-09-20
net-igb 5.0.5.1.1-5vmw.670.0.0.8169922 VMW VMwareCertified 2019-08-13
net-ixgbe 3.7.13.7.14iov-20vmw.670.0.0.8169922 VMW VMwareCertified 2019-08-13
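
For anyone comparing against their own hosts, here is a minimal Python sketch (not part of the original post) that parses `esxcli network nic list` output and flags any 10GBASE-T port that has linked below 10000 Mbps, which is how vmnic3 appears on both hosts in the output above. The function names, the `Nic` tuple, and the idea of matching on "10GBASE-T" in the description are assumptions made for illustration; the column order is taken from the listing above.

```python
from typing import List, NamedTuple


class Nic(NamedTuple):
    name: str
    driver: str
    link_status: str
    speed_mbps: int
    mtu: int
    description: str


def parse_nic_list(output: str) -> List[Nic]:
    """Parse the rows of `esxcli network nic list` into Nic tuples."""
    nics = []
    for line in output.splitlines():
        # Nine whitespace-separated fields, then the free-text description.
        fields = line.split(None, 9)
        # Data rows start with "vmnic"; header, separator and prompt lines do not.
        if len(fields) < 10 or not fields[0].startswith("vmnic"):
            continue
        name, _pci, driver, _admin, link, speed, _duplex, _mac, mtu, desc = fields
        nics.append(Nic(name, driver, link, int(speed), int(mtu), desc.strip()))
    return nics


def below_rated_speed(nics: List[Nic], marker: str = "10GBASE-T",
                      rated_mbps: int = 10000) -> List[str]:
    """Return names of NICs whose description contains `marker`
    but whose link is up at less than `rated_mbps`."""
    return [
        n.name
        for n in nics
        if marker in n.description
        and n.link_status == "Up"
        and n.speed_mbps < rated_mbps
    ]


# Three rows copied from the melb-esxi01 listing in the post.
SAMPLE = """\
vmnic1 0000:05:00.1 igbn Up Up 1000 Full 00:25:90:bb:55:1b 1500 Intel Corporation I350 Gigabit Network Connection
vmnic2 0000:03:00.0 ixgben Up Up 10000 Full 00:25:90:bb:56:a8 9000 Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T
vmnic3 0000:03:00.1 ixgben Up Up 1000 Full 00:25:90:bb:56:a9 9000 Intel(R) Ethernet Connection X552/X557-AT 10GBASE-T
"""

if __name__ == "__main__":
    print(below_rated_speed(parse_nic_list(SAMPLE)))
```

Running the same check against the listing from each end of a back-to-back link makes it easy to see whether both ports fell back to 1000 Mbps together, as they did here.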

Host 1                                        Host 2

vNIC0 ---------| Server/Workstation |--------- vNIC0
               |      Switches      |
vNIC1 ---------|                    |--------- vNIC1

vNIC2 ---------------------------------------- vNIC2
vNIC3 ---------------------------------------- vNIC3

vNIC0 and vNIC1 are VM workload (1GB)
vNIC2 and vNIC3 are Starwind Sync and iSCSI and vMotion (10GB)

vNIC2 and vNIC3 are the two that are flapping intermittently. Right now it is only vNIC3, but I have seen it with vNIC2 also.



Posted on Reddit by frankyyy02

 

5 Comments

  1. I think I had a similar issue with some Windows VMs. For me it ended up being the NIC adapter type in VMware. They were set to E1000e. I switched them to VMXNET3 and the problem went away after a reboot.
