VMware Environment Design

Please excuse my ignorance of networking and VMware. I have been tasked with upgrading our VMware environment. Before I go on, I’ll describe our existing environment. We have three HP ProLiant DL380 Gen8 servers (dual Xeon E5-2640, 64GB memory) as hosts. Each server has eight 1Gb lines coming out of it. I have no clue how those lines are configured. We also have an EMC VNXe3150 with 4TB of storage. There are five network lines coming out of the back. The system was purchased and installed in 2013, it is out of warranty, and no one here really knows much about it. Our VMware ESXi does have a maintenance contract. The two switches are old HP switches that I don’t have access to. I went to the CEO and told her that I was concerned that the system had no maintenance contract and that we were running low on memory and disk space. I felt it was urgent to replace the infrastructure soon. She agreed.

For the new infrastructure, I am looking at keeping two of the old HP ProLiant DL380 Gen8 servers. I am planning on replacing one of the old servers with two single-processor servers (HP ProLiant DL325 Gen10, AMD EPYC 7402, 2 x 10G NIC, 256GB). I believe I can transfer the license from the old dual-processor server over to the two single-processor servers. I am also replacing the switches with Netgear 16-port 10G switches (model XS716T). Last, I am going to have two Synology RS3617xs units (with dual 10G NICs added) for the storage.
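
The license transfer can be sanity-checked with simple arithmetic. The following is a minimal sketch that assumes licensing is counted per populated CPU socket (an assumption; check your own contract), with the host lists taken from the description above. The `Host` class and `licenses_needed` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    model: str
    sockets: int  # number of populated CPU sockets


def licenses_needed(hosts):
    """Count licenses, assuming one license per populated CPU socket."""
    return sum(h.sockets for h in hosts)


# Existing environment: three dual-socket DL380 Gen8 hosts.
current = [Host("DL380 Gen8", 2)] * 3

# Planned environment: two DL380 Gen8 hosts kept, one replaced
# by two single-socket DL325 Gen10 hosts.
planned = [Host("DL380 Gen8", 2)] * 2 + [Host("DL325 Gen10", 1)] * 2

print(licenses_needed(current))  # 6
print(licenses_needed(planned))  # 6
```

Under that assumption the socket count stays the same, so no extra licenses would be needed for the fourth host.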

My question is how best to configure everything. For each of the new servers, should I have one 10G Ethernet link going to each switch, or should I have them in a LAG (effective 20G) on one switch? Can you even LAG across switches? Would LAG provide any redundancy if one of the connections was lost? Honestly, I am lagging it because I am more worried about the switch going out. Same question for the Synology boxes.

I am open to all suggestions.

Posted on Reddit by mn530

  1. for your existing config, i assume you have vcenter, or host access, in which case you can look at your networking config on the hosts to see how they are configured.

    first, i would never put storage across netgear switches. sure, 10Gb.. but look at things like port queue depth, backplane, and throughput. everything in your environment is reliant on your switching, dont cheap out with consumer grade there of all places. dont need to go cisco, plenty of other viable enterprise options, but get something that does stacking and cross-switch lacp. (i’m personally fond of Ruckus ICX for a good budget option)

    yes your vmware will be licensed by physical cpu socket, so going from one dual-socket host to two single-socket hosts will work. I suppose i might do the same on a very tight budget to give some hardware redundancy.

    what are the “Main Switches” in your diagram (green)? i dont see any reason why you couldn’t go from firewall to core switches. (again, providing they are enterprise, L3, managed, and vlaned..)

    Synology i dont have personal experience with, but i’m assuming it runs iscsi. why two storage arrays, or are you just trying to indicate separate controllers in the diagram? your storage data layer you will want on a separate vlan and with jumbo frames if possible (although not required, and maybe somewhat of a religious type opinion on that one..).

    for the network ports into your new hosts, just make them trunk ports, pass all vlans on the trunks, and configure your interfaces in esxi on whichever vlans you need (management, vmotion, iscsi, and vm traffic will generally all be on separate vlans)
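
The trunk-plus-VLANs layout described in this comment can be written down as a small plan and checked before touching any switch. This is only a sketch: the VLAN IDs, the `PortGroup` class, and both helper functions are hypothetical, and the check covers just three things (valid VLAN range, no accidental VLAN sharing, and an MTU the switch can carry).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortGroup:
    name: str
    vlan_id: int
    mtu: int = 1500  # standard Ethernet MTU unless jumbo frames are wanted


# Hypothetical VLAN IDs, chosen only for illustration.
PLAN = [
    PortGroup("Management", 10),
    PortGroup("vMotion", 20),
    PortGroup("iSCSI", 30, mtu=9000),  # jumbo frames on the storage VLAN
    PortGroup("VM Traffic", 40),
]


def trunk_allowed_vlans(plan):
    """VLAN IDs the switch trunk ports facing the hosts need to carry."""
    return sorted({pg.vlan_id for pg in plan})


def check_plan(plan, switch_mtu):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    seen = {}
    for pg in plan:
        if not 1 <= pg.vlan_id <= 4094:
            problems.append(f"{pg.name}: VLAN {pg.vlan_id} is outside 1-4094")
        if pg.vlan_id in seen:
            problems.append(
                f"{pg.name}: shares VLAN {pg.vlan_id} with {seen[pg.vlan_id]}"
            )
        else:
            seen[pg.vlan_id] = pg.name
        if pg.mtu > switch_mtu:
            problems.append(
                f"{pg.name}: MTU {pg.mtu} exceeds switch MTU {switch_mtu}"
            )
    return problems


print(trunk_allowed_vlans(PLAN))
print(check_plan(PLAN, switch_mtu=9216))
print(check_plan(PLAN, switch_mtu=1500))
```

The last call illustrates why jumbo frames need to be enabled end to end: a storage VLAN planned at MTU 9000 is flagged when the switch is left at 1500.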

  2. Firstly, log into your ESXi hosts and look at their vSwitch and/or dvSwitch configuration. Figure out where each of those connections go. Do the same with the EMC box, if you can.

    Since you plan to keep your Intel-based DL380s, you should know that you will never be able to vMotion (live migrate) VMs between those servers and the new AMD DL325s. You might want to reconsider the DL325s in favor of new DL380s.

    Dual 10G sounds okay with regard to the servers and Synology. Have you thought about whether you’ll use NFS or iSCSI as a transport protocol? If you’re thinking iSCSI, you shouldn’t team (bond) those two ports; instead, each iSCSI VMkernel port should be bound to a separate NIC (physical port) as best practice. This ensures dual paths.

    NFS may be simpler in regard to configuration.

    Either way, I would get a performance guarantee from your reseller before you pay for the Synology units. They are SMB at best. In my opinion, not enterprise-class devices. But that might be fine in your situation.
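
The iSCSI binding rule in this comment (one VMkernel port per physical NIC, no teaming) can be expressed as a simple check. This is a sketch under assumptions: the mapping would have to be filled in by hand from what you see on the hosts, the names `vmk1`, `vmnic0`, and so on are illustrative, and `check_iscsi_binding` is a hypothetical helper, not part of any VMware tool.

```python
def check_iscsi_binding(vmk_to_uplinks):
    """Check that each iSCSI VMkernel port has exactly one active uplink
    and that no two VMkernel ports share the same physical NIC.

    vmk_to_uplinks maps a VMkernel port name to its list of active uplinks.
    Returns a list of problems; an empty list means the rule is satisfied.
    """
    problems = []
    nic_owner = {}
    for vmk, uplinks in vmk_to_uplinks.items():
        if len(uplinks) != 1:
            problems.append(
                f"{vmk}: expected exactly 1 active uplink, found {len(uplinks)}"
            )
            continue
        nic = uplinks[0]
        if nic in nic_owner:
            problems.append(f"{vmk}: shares {nic} with {nic_owner[nic]}")
        else:
            nic_owner[nic] = vmk
    return problems


# Two paths, each pinned to its own physical port.
dual_path = {"vmk1": ["vmnic0"], "vmk2": ["vmnic1"]}

# One VMkernel port on top of a team of both NICs.
teamed = {"vmk1": ["vmnic0", "vmnic1"]}

print(check_iscsi_binding(dual_path))
print(check_iscsi_binding(teamed))
```

The first layout passes; the second is flagged because a teamed port gives the storage stack a single path instead of two.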

  3. Keep in mind you have to power off VMs to move them, since you’re moving from Intel to AMD.

    Network-wise I’d go for one 10G port on each switch, no LAG; it’s not really worth it imo.
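
The difference between the two cabling options can be made concrete with a small model. This is a minimal sketch: the host and switch names (`esx1`, `sw-a`, and so on) are hypothetical, and it only checks whether a host keeps at least one uplink when a whole switch fails. It says nothing about bandwidth or about how teaming is configured in ESXi.

```python
def surviving_hosts(host_links, failed_switch):
    """Return the hosts that still have at least one uplink to a working switch.

    host_links maps a host name to the list of switches its uplinks plug into.
    """
    return {
        host
        for host, switches in host_links.items()
        if any(sw != failed_switch for sw in switches)
    }


# Option A: one 10G port to each switch, no LAG.
split = {"esx1": ["sw-a", "sw-b"], "esx2": ["sw-a", "sw-b"]}

# Option B: both ports of a host in a LAG on a single switch.
lag_one_switch = {"esx1": ["sw-a", "sw-a"], "esx2": ["sw-b", "sw-b"]}

for sw in ("sw-a", "sw-b"):
    print(
        f"{sw} fails:",
        "split ->", sorted(surviving_hosts(split, sw)),
        "| LAG ->", sorted(surviving_hosts(lag_one_switch, sw)),
    )
```

With one port per switch, every host survives either switch failing. With a LAG on a single switch, each host goes down with its switch, which is the failure the original question was most worried about.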

  4. It looks like quite a task. A few things to note… start with your HW. The Gen8s will not support anything beyond 6.5 U3, so just know that you will not be able to upgrade them to 6.7. In this post I am going to focus only on the HW, and I agree with the switch recommendation that [jgudnas](https://www.reddit.com/user/jgudnas/) mentioned. Networking is critical and you need to get it right. If you want to do LACP, make sure you select a switch that supports such a configuration. Do not mix AMD and Intel in the same cluster. You can have two clusters in vCenter, but please do not mix the different chipsets. Will post more tomorrow.
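
The "two clusters, no mixing" advice amounts to grouping hosts by CPU vendor. The sketch below uses hypothetical host names and treats a matching vendor as a necessary condition only. Real vMotion compatibility also depends on CPU generation and EVC settings, which this does not model.

```python
from collections import defaultdict


def clusters_by_vendor(hosts):
    """Group (host_name, cpu_vendor) pairs into one cluster per CPU vendor."""
    clusters = defaultdict(list)
    for name, vendor in hosts:
        clusters[vendor].append(name)
    return dict(clusters)


def same_vendor(src, dst, vendor_of):
    """True if both hosts use the same CPU vendor (necessary, not sufficient,
    for live migration between them)."""
    return vendor_of[src] == vendor_of[dst]


# Hypothetical host names for the planned environment.
hosts = [
    ("dl380-a", "Intel"),
    ("dl380-b", "Intel"),
    ("dl325-a", "AMD"),
    ("dl325-b", "AMD"),
]
vendor_of = dict(hosts)

print(clusters_by_vendor(hosts))
print(same_vendor("dl380-a", "dl380-b", vendor_of))
print(same_vendor("dl380-a", "dl325-a", vendor_of))
```

With the planned hardware this yields two clusters of two hosts each, so each cluster can only tolerate one host being down for maintenance.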

  5. Please, do not mistake a Synology Box for an enterprise grade backend storage.

    The units are fine and serve very well as a secondary target for backups, storage for test environments, and so on, but not as your primary VMware storage.
    I know some of their boxes are certified and I know they have somewhat of a support level, but don’t go that path.

    Other than that I would cut the AMD server and rely solely on intel for now.
    If your goal is higher availability, for example during maintenance, live migration is key, and with a mix of AMD and Intel this isn’t going to work.
    If you were going to replace the whole compute stuff, looking at AMD is a different story, but as you only replace a single node, I wouldn’t advise it.

    Moreover, do not put production storage traffic on Netgear switches. They may have their place in the access layer or test environments, but not for high throughput, low latency production storage traffic.
    I can almost hear u/lost_signal screaming about port buffers 🙂

    In conclusion I would advise that you first outline your business goals and see afterwards how different options fit these goals.
    Once you have mapped your goals to (hopefully VMware) solutions, start thinking about your physical layout / hardware.
    Otherwise think about having a consultant come in to support you during the process.
    From what it sounds like, you’re working at a fairly small enterprise, at least judging by the infrastructure scale I saw in your post, which in my experience means that money is tight and mistakes made during sizing / installation won’t be fixed until the next investment comes around (probably around 5-7 years from now).

  6. >My question is how to best configure everything.

    Think about this before you spec equipment. Design first, then find equipment that fits your design model.

    1. I don’t love the VNXe, but I still think you’re taking a big step backwards in availability going from the dual controller VNXe to two single controller Synologys (unless you actually have a single controller VNXe – then shame on someone). The only way you get HA from that is to run duplicates of everything at the application layer.
    2. There’s no reason for you to use NAS or SAN, unless you plan to move from 2 to 10 hosts. Shared DAS will improve availability and reduce complexity. You’re worried about your switches going out – eliminate them.

  7. > We also have an EMC VNXe3150 with 4TB of storage

    Careful patching the VNXe3150 firmware. Those damn things can take 2 minutes to fail over between controllers and will crash everything. It’s about as useful as 2 row boats in a hurricane when it comes to “redundancy”. Also, you are not in West Virginia by any chance?

    > Can you even LAG across switches?

    Switches yes, those Netgears? No. You’d need switches that stack and support cross chassis LAG.

    > Would LAG provide any redundancy if one of the connection was lost?

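    vSphere can manage path failover without LAG (it does so by default, using the “Route based on originating virtual port ID” teaming policy, which pins each virtual port to one uplink and moves it to another uplink if that one fails).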

    > model XS716T

    Rather anemic buffers. The newer models are better, but still low-end Marvell. I’ve seen them work for small clusters, but in 2019 it’s not what I’d choose (I’d go with Maverick ASIC switches instead). That is, if you need switches. 2 node vSAN can direct connect. Many low-end modular storage arrays support FC-AL, SAS direct connect, etc.

    > Our VMWare ESXi does have a maintenance contract

    What version of ESXi you got?

  8. You are focusing on hardware instead of business requirements and a proper plan. If you don’t have a plan or path and just whack one into existence as you go along, you will fail. Or you will succeed in little bits for the next 3 years. What are your test methods and parameters? Have you planned how to test HA or DRS failover?

    I’m migrating our installation from 6.0 to 6.7. I unplugged an HA with VCSA running to test failover. I changed the SAN LUN of VCSA while it was operating to determine recovery methods. I added test VMs to 6.7 from 6.0 and did unnatural operations to those VMs. I know my upgrade will work when I am finished. How about you?
