VMware

vCenter disaster

Ok so I realize I made a few dumb moves here, but let me explain what happened. I needed to restart my ESXi 7 hosts for brief maintenance, including the host running vCenter. vCenter is registered on that host but is stored on my NAS via iSCSI. So prior to shutting down the host, I shut down all of the VMs except for vCenter, which I suspended (probably dumb, but it takes so long to start up). I then shut down the ESXi host, performed the maintenance, and started it back up.

Before doing anything else, I had to reconfigure the host’s networking settings, because an odd issue with that host is that whenever it gets restarted, one particular physical NIC always gets removed as an uplink from its assigned vSwitch, so I have to delete the vSwitch, create a new one with that NIC as its uplink, and re-add the portgroups and vmknics. This also involves re-connecting the software iSCSI adapter to its target, because that physical NIC is one of the two used to connect to the iSCSI target. So I reconfigured the networking, re-added the iSCSI target, and as I expected, the VMs re-appeared in the VM list of the host, including the suspended vCenter machine.

The disaster is that I’m not able to power on, or un-suspend, any of the VMs that were on this host. VMs on the other hosts are working fine, so I assume this has something to do with the VMs on the vCenter host being stored on the iSCSI target. The error I get when attempting to start up VMs on the problem host, or un-suspend vCenter, is “Failed to power on virtual machine LIGERSERV-PRIME. The operation is not allowed in the current state.” Additionally, the “edit settings” option is greyed out on these VMs.

At the moment, I’m exporting the other critical VM that was on this host, with the hope that I’ll be able to import it as a new VM and start it up that way. I’m hesitant to try this with vCenter because exporting a 400 GB+ VM will take hours and I don’t want to do that if it’s not going to work, so I guess I’ll wait and see if it works with the smaller VM being exported right now.

Is there anything I can do to recover this vCenter VM without something like exporting it and re-adding it to the ESXi host? Any idea what caused this? And I assume if I want to avoid something like this in the future, I need to not suspend vCenter, and possibly also move it to local storage that doesn’t have to be reconfigured every reboot?


Posted on Reddit by daishi55

2 Comments

  1. So I have questions.
    1. If vCenter is on an iSCSI disk, why not just vMotion it to another host and put this host in maintenance mode to reboot?
    2. What the heck is wrong with this host that it ejects the NIC from the vSwitch? Is this a USB NIC or something?
    3. What is the current state of vCenter? Suspended? Was it suspended from the host or from vCenter?

  2. Can you unregister and register the VMs from the host?

    If you suspended the VM, I think you can just remove the memory state file? Please, please do not just take my word on that. I could very well be wrong.

    What version of vCenter? I have vCenters as large or larger that only take a couple of minutes for all the services to start.

    Also curious why you didn’t just vMotion vCenter.
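
For anyone who wants to try the unregister/re-register suggestion from the ESXi shell rather than the host client, here is a rough sketch of the command sequence. It is only a dry-run helper: it prints the `vim-cmd` calls in order so they can be reviewed before being run by hand over SSH. The function name `reregister_plan`, the Vmid `12`, and the datastore path are made up for illustration; the real Vmid and `.vmx` path come from `vim-cmd vmsvc/getallvms` on the host. Whether re-registering actually clears the "not allowed in the current state" error in this case is not confirmed anywhere in the thread.

```python
import shlex


def reregister_plan(vmid: int, vmx_path: str) -> list[str]:
    """Return the ESXi shell commands, in order, to unregister and
    re-register a single VM. Nothing is executed here."""
    quoted = shlex.quote(vmx_path)  # protects datastore paths containing spaces
    return [
        # Check what power state the host currently reports for the VM.
        f"vim-cmd vmsvc/power.getstate {vmid}",
        # Remove the inventory entry only; the files stay on the datastore.
        f"vim-cmd vmsvc/unregister {vmid}",
        # Register the VM again from its .vmx file; this prints the new Vmid.
        f"vim-cmd solo/registervm {quoted}",
    ]


if __name__ == "__main__":
    # Hypothetical Vmid and path, for illustration only.
    example_path = "/vmfs/volumes/iscsi-datastore/LIGERSERV-PRIME/LIGERSERV-PRIME.vmx"
    for command in reregister_plan(12, example_path):
        print(command)
```

Unregistering does not delete anything on the datastore, so it is a much cheaper thing to try first than a multi-hour export. The Vmid will most likely be different after re-registering, so look it up again before issuing any power commands.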
