VMware – vSAN Deploy and Manage course – Day 2

This week I am attending the VMware vSAN Deploy and Manage course with Paul McSharry as our instructor. I’m still learning and preparing for my VCP6-DCV, which I want to pass before New Year’s Eve, and there is a hell of a lot of stuff to cram into my brain. This course is not aimed specifically at the VCP exam, but it will help me answer at least some questions about vSAN, which is part of vSphere, which in turn is part of the VCP. So it doesn’t hurt to get some insights.

Day 2

We started day 2 with a quick review of day 1: what we did and what we discussed. We repeated what vSAN is and what you can do with it (and what you can’t; see Paul’s review question list further down). Today we worked a lot in the labs to get familiar with some functions, including some things you probably wouldn’t do just like that in production. We also got a short preview of vSAN 6.5 and some of its features in comparison with vSAN 6.2.

Review Question List

After Paul’s questions we talked about some basic networking topics: load balancing, features of distributed virtual switches and so on. vSAN is set up in just a few clicks, but you have to take care of the networking. vSAN is a storage topology that depends on properly configured and well-performing network connections. So it’s a good idea to make the network admins your friends.

Module 5 Lesson 1 – vSAN policies and VMs

A policy is a state configuration and a specification; it basically defines the SLA. It can be configured at VM level or even at VMDK level. The FTT (failures to tolerate) value describes how many host failures can be tolerated. FTT determines the number of replicas of your data (how many copies are stored). Stripes are about performance: they define the number of physical disks across which each replica of a storage object is striped. Adding more stripes can increase performance, but resource usage increases as well, and you will probably need more disks.
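To make the relationship between FTT, stripes and resource usage a bit more tangible, here is a minimal Python sketch. It assumes plain mirroring (RAID-1), where tolerating n failures needs n + 1 replicas and 2n + 1 hosts, and it assumes every stripe of every replica lands on its own capacity disk. The class and method names are made up for illustration and are not part of any VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorPolicy:
    """Hypothetical model of a mirrored (RAID-1) vSAN storage policy."""

    ftt: int           # failures to tolerate
    stripe_width: int  # number of disk stripes per replica

    def replicas(self) -> int:
        # Tolerating n failures needs n + 1 full copies of the data.
        return self.ftt + 1

    def min_hosts(self) -> int:
        # n + 1 replicas plus n witnesses: 2n + 1 hosts keep a majority.
        return 2 * self.ftt + 1

    def min_capacity_disks(self) -> int:
        # Assumption: each stripe of each replica sits on its own disk.
        return self.replicas() * self.stripe_width

    def raw_capacity_gb(self, vmdk_gb: float) -> float:
        # Every replica stores the full VMDK, so raw usage multiplies.
        return vmdk_gb * self.replicas()


default_policy = MirrorPolicy(ftt=1, stripe_width=1)
striped_policy = MirrorPolicy(ftt=1, stripe_width=2)

print("replicas:", default_policy.replicas())
print("minimum hosts:", default_policy.min_hosts())
print("raw GB for a 100 GB VMDK:", default_policy.raw_capacity_gb(100))
print("capacity disks with 2 stripes:", striped_policy.min_capacity_disks())
```

Raising the stripe width does not change the number of replicas, only the number of disks each replica is spread across, which is why more stripes tend to mean more disks.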

Another component in vSAN is the witness. It is the tiebreaker for objects. The cluster always needs a quorum to decide what to do in case of an outage (absent or degraded state). By default, if a host is absent (the cluster does not know what happened to that host), your data will be re-replicated after a wait time of 60 minutes. If the cluster is degraded (the cluster knows what happened to a downed host), the data will be re-replicated immediately. You can see that the default vSAN policy with FTT=1 is always your safety net. It is recommended not to edit the default vSAN policy but to create new ones and apply those to your vSAN storage / VM / VMDK.
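The difference between the absent and the degraded state can be written down as a tiny decision function. This is only a sketch of the behaviour described above, with the 60 minute default as a plain constant; the names are hypothetical and nothing here talks to a real cluster.

```python
from enum import Enum


class FailureState(Enum):
    ABSENT = "absent"      # the cluster does not know what happened
    DEGRADED = "degraded"  # the cluster knows the component has failed


# Default wait time mentioned in the text, modelled as a constant.
ABSENT_REPAIR_DELAY_MINUTES = 60


def rebuild_delay_minutes(state: FailureState) -> int:
    """Minutes the cluster waits before it starts re-replicating data."""
    if state is FailureState.DEGRADED:
        # A known failure will not fix itself, so rebuild right away.
        return 0
    # An absent host might come back (reboot, network glitch), so wait.
    return ABSENT_REPAIR_DELAY_MINUTES


for state in FailureState:
    print(state.value, "->", rebuild_delay_minutes(state), "minutes")
```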

Module 5 Lesson 2 – vsanSparse Snapshots

Is a snapshot a backup? Most people would freak out at this question. No, it’s not a backup. If you want to back up your VMs (and that’s a damn good idea…) you should use vSphere Data Protection (or a third-party product). But VMware made some changes especially for Virtual SAN snapshots: the vsanSparse snapshot. A traditional snapshot is created, but with this new VMDK type. The delta file is mounted with a virtual SCSI driver, all read requests are served through the in-memory cache (physical memory of the host) and all writes go directly to disk. It should not have any performance impact, and you can keep up to 32 snapshots for as long as you want. But don’t do that. Really.

Module 6 – Management (HA & Update)

At the beginning of this module we talked about maintenance mode and how it differs in a vSAN cluster. Maintenance mode enables you to take a host out of rotation. That is the normal vSphere (HA / DRS) maintenance mode. The vSAN maintenance mode is slightly different.

When you put a host in a vSAN cluster into maintenance mode, you can choose between three modes:

  • Ensure accessibility => move objects to active vSAN resources as needed to ensure access.
  • Full data migration => move all objects to active vSAN resources, regardless of whether the move is needed.
  • Do nothing => move no objects. Some objects might become unavailable.

We discovered in a class discussion that, depending on the amount of data residing on the hosts, it can be painful to put a host into maintenance mode, even if you don’t do a full data migration but just ensure accessibility. It can take anywhere from a few minutes to a few hours until the host is in maintenance mode. But you can reduce the time needed by adding more hosts and by increasing FTT and the number of stripes.

High Availability

A few words about HA (High Availability). If your cluster already has HA configured, then you cannot enable vSAN. You have to disable HA, enable vSAN, and then enable HA again. When HA is turned on, the FDM agent (HA) traffic uses the Virtual SAN network. The datastore heartbeat is disabled when there are only vSAN datastores in the cluster. And HA will never use a vSAN datastore for a heartbeat, because the vSAN network is already used for the network heartbeat.

What happens with physical disk failures? In traditional server environments or with a normal SAN you create a RAID array, probably with a hot spare, to ensure immediate disk replacement if a disk fails. With vSAN the redundancy is built logically within vSAN itself (FTT, stripes, witness). That’s the reason you shouldn’t create a RAID array but configure your disk controller in pass-through mode, so vSAN is aware of each physical disk and its state.

Upgrade Process

The upgrade process for vSAN in a few words…

  • it’s non-disruptive
  • but it’s I/O intensive
  • you can’t downgrade a disk group once the upgrade is completed
  • it needs more than 3 hosts (otherwise you have to run the upgrade with the allow-reduced-redundancy option => potential risk)

vSAN Upgrade Process

Before you upgrade, check the hardware for vSAN 6 support (HCL…). The rest of the upgrade process is straightforward:

  1. First upgrade your vCenter
  2. Then upgrade the vSphere Update Manager (VUM)
  3. After that upgrade your ESXi hosts to version 6
  4. Confirm that the Ruby vSphere Console (RVC) is accessible
  5. Log in to RVC and execute the upgrade script at cluster level
    1. vsan.v2_ondisk_upgrade /<vcenter>/<Datacenter>/computers/<cluster>
  6. Maintenance mode is not required at host level

Now the upgrade utility runs some checks and begins with upgrading the on-disk format.

You can upgrade the disk groups as follows:

  • Evacuate all data from the disk group you want to upgrade
  • Destroy this disk group
  • Rebuild the disk group with the latest on-disk format
  • Repeat the steps above for all remaining disk groups

Module 7 – Monitoring vSAN

We only touched on the monitoring part. There are a lot of vSphere built-in and also community-driven tools for monitoring vSAN.

Built-in:

  • vCenter
  • vSphere Web Client
  • DCUI
  • SSH / vCLI / PowerCLI / esxcli
  • esxtop
  • vROps
  • Ruby / vSAN Observer

Community driven:

One cool built-in tool we tried out today in our course is the Ruby vSphere Console (RVC), with which the vSAN Observer can be started. The process starts a web server which you can then access via https://vCenterServer_hostname_or_IP_Address:8010. The result looks like this:

vSAN Observer

The initial configuration is not that easy, but it’s not a big deal either. Enter some commands and you’re good to go. The web server stops by itself after a runtime of an hour, or when you stop it manually with Ctrl + C in the CLI console.

Module 8 – Stretched Cluster

Everyone knows a cluster: a group of servers that act like a single system. A stretched cluster is very similar to a normal cluster, with the difference that you cover two sites with the same cluster (or possibly multiple racks in one datacenter), including vMotion, Storage vMotion and all other cluster-enabled features.

A stretched cluster helps you to…

  • do maintenance of a complete site with no downtime
  • lower RPO for unplanned failures

Setting up fault domains enables you to set…

  • Rack Awareness (1st is primary site, 2nd is failover, 3rd is witness)
  • Site Awareness (across sites)

A stretched cluster has some specific requirements (some are also required to set up vSAN itself):

  • L2 stretched network for vSAN (multicast)
  • L3 routed network between witness and vSAN hosts (unicast)
  • Less than 5 ms network latency for data
  • 200 ms latency for the witness
  • 500 ms latency for ROBO (the two-host vSAN in your remote office / branch office)
  • 10 Gbit/s links are recommended
  • If you have fewer than 10 VMs in your ROBO then you’re fine with 1 Gbit/s links
  • Consistent MTU size from end to end
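The latency limits from the list above are easy to turn into a quick sanity check for measured values. The following Python sketch only encodes the numbers from the list; the link names and the function names are made up for illustration, and the measurements would have to come from your own tooling.

```python
# Latency limits in milliseconds, taken from the requirements list.
LATENCY_LIMITS_MS = {
    "data": 5,       # between the data sites (must stay below this)
    "witness": 200,  # between the data sites and the witness
    "robo": 500,     # two-host ROBO cluster to its witness
}


def latency_ok(link: str, measured_ms: float) -> bool:
    """Check one measured latency against the limit for that link type."""
    limit = LATENCY_LIMITS_MS[link]
    if link == "data":
        # The data network has to be strictly below its limit.
        return measured_ms < limit
    return measured_ms <= limit


def links_out_of_spec(measurements: dict[str, float]) -> list[str]:
    """Return the link types whose measured latency breaks the limit."""
    return [link for link, ms in measurements.items() if not latency_ok(link, ms)]


# Hypothetical measurements for a planned stretched cluster.
print(links_out_of_spec({"data": 3.2, "witness": 250.0}))
```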

You can imagine the following scenarios when there are outages in your environment:

  • Failed site => site failover
  • Failed host in the same site => if there are enough resources to meet the SLA then restart in the same site, otherwise DR in the other site
  • Failed witness => everyone carries on working because no tiebreaker is needed
  • Failed network between sites => restart in the preferred site
  • Failed site with vCenter => the witness is used to restart in the FD2 site

Conclusion

Today we learned a lot about the technical details of vSAN. With an all-flash solution you get lots of IOPS and performance. With a stretched cluster you can even tolerate a complete site failure. Think about that! VMware Virtual SAN is a really cool storage topology which is easy to set up if everything is prepared correctly (networking!).

Here you can find the other blog posts about the vSAN deploy and manage course:

VMware – vSAN Deploy and Manage course – Day 1

It has been some time since I published my last blog article. I wasn’t really in the mood for it because I am learning and preparing for my VCP6-DCV, and there is a hell of a lot of stuff to cram into my brain. This week I’m publishing some articles, beginning with this one. Not because I’m not studying for the VCP, but because I’m learning right now and because I’m attending the VMware Virtual SAN Deploy and Manage course. Let’s call this a recap. My brain is still collecting data and sorting it onto the right shelf, and this recap helps me with that. But let’s start now. You probably don’t want to know what’s going on in my brain…

Day 1

Module 1 – Introduction to the course

To break the ice, our instructor Paul McSharry started with a short introduction round for all attendees. Paul introduced himself (and his cat too…). I didn’t know him personally. I just knew that he is an expert in his area and an instructor. I had heard from other people I know that Paul does his stuff very well. So I expected a good start, and in the end the whole day was great.

First of all, it’s now official that it’s called vSAN (with a small v in front) and not VSAN. VMware recently changed the name of this particular product. It should show that this product (vSAN) is integrated directly into the ESXi hypervisor. vSAN is a policy-driven software-defined storage tier. There are no dependencies on a VM; it runs directly in the hypervisor. We all knew that. You don’t need special software or a plugin to use vSAN, it’s just a matter of licensing. But now the name of the product makes that clear too.

A customer wants software-defined storage to be flexible, easy to install and use, quick and scalable. He doesn’t want to compromise on performance. And it should run in his private cloud and in his public cloud too.

Because of the size of my customers I don’t often deal with scalability. I feel somewhat ashamed of that. We always calculate some reserves into the systems, because we know our customers and always clarify their needs. Now I’m 100% sure what scale-up and scale-out mean. You scale up when you add disks to your hosts to increase the capacity (or caching) tier in Virtual SAN. You scale out when you add one or more hosts to your (vSAN) cluster to increase overall performance and capacity.

Module 2 – Storage fundamentals

In the second module we talked about some basic storage stuff like spinning disks (rotating rust) and SSDs, about IOPS and so on. We discovered some good points about latency, and why it’s good to have at least a flash cache in a hybrid vSAN, or better to go all-flash. That doesn’t mean that spinning disks are old school and that you shouldn’t use them. They are great when you compare price and capacity. Think about an archive system, or a huge backup storage. For these use cases spinning disks still deliver a fair amount of IOPS.

We also took a short look at RAID levels. It’s always good to know them, and to hear about them from time to time. When you’re using vSAN you don’t have to create a RAID array on your built-in storage controller. Just make the controller pass the disks through to ESXi / vSAN and it’s all good. Also worth a discussion were some storage protocols like Fibre Channel (FC), iSCSI and NFS. And last but not least the VMware HCL, the hardware compatibility list. Always check it if you build your own vSAN ready nodes, or even if you upgrade firmware (especially disk / controller firmware) on a certified vSAN ready node.

Module 3 – What is vSAN and use cases

Creating vSAN is easy. You can tune it, set limits etc., very similar to VMs. Every single I/O goes through the hypervisor. There is no HBA and no fabric between host and storage. That’s a big plus regarding storage latency, which decreases massively. With vSAN you use local resources (CPU, memory and storage). If you have to expand capacity you can easily add more disks (scale-up) or add a host (scale-out). Let’s take a look at the ESXi hypervisor. It already comes with HA, DRS, VDP and vSphere Replication. VMware Virtual SAN is compatible with these common features. It’s just a matter of licensing. Are you using different storage tiers? You don’t have to with vSAN. It’s policy based. Limit IOPS for noisy neighbors or to guarantee an SLA.

You can build your own vSAN ready nodes (brownfield / specific pods), in which case you always have to check the HCL. Or you choose from preconfigured vSAN ready nodes from your favorite hardware vendor. HPE, DELL etc. provide approved solutions. And last but not least there are the DELL EMC VxRail or even VxRack systems, preinstalled and preconfigured.

Use cases

VMware Virtual SAN has been ready for production since version 6.0. One of the most common use cases is virtual desktop infrastructure (VDI). Customers also run their Exchange servers, transactional databases and so on with vSAN. There is no right or wrong. If you just want to free up space in your racks and replace old hardware, then you’re good to go. With two rack units you can replace four servers and a shared storage system which together demand at least 10 units. Converged systems are a space saver. And don’t forget about energy savings.

Module 4 – Virtual SAN concepts, requirements, install checklist

A vSAN datastore is accessible to all hosts in the vSAN cluster, whether or not they have their own local storage. You can have capacity hosts and compute hosts if you want. Other storage topologies can easily coexist with vSAN, there is no limitation. A vSAN datastore is built from disk groups. Every disk group is a single capacity unit from a host and provides cache and data. You must have one flash disk per disk group and one or more capacity disks. There is a limit of five disk groups per host / node.
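The disk group rules from this paragraph (one flash device per disk group, at least one capacity disk, at most five disk groups per host) can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the function names are hypothetical, and it assumes that only the capacity disks count towards datastore capacity, since the flash device serves as cache.

```python
MAX_DISK_GROUPS_PER_HOST = 5  # limit quoted in the text


def validate_host(disk_groups: list[dict]) -> list[str]:
    """Return a list of problems found in a host's disk group layout.

    Each disk group is a dict with the hypothetical keys
    "flash_devices" (int) and "capacity_gb" (list of disk sizes).
    """
    problems = []
    if len(disk_groups) > MAX_DISK_GROUPS_PER_HOST:
        problems.append("too many disk groups")
    for index, group in enumerate(disk_groups):
        if group["flash_devices"] != 1:
            problems.append(f"disk group {index}: needs exactly one flash device")
        if len(group["capacity_gb"]) < 1:
            problems.append(f"disk group {index}: needs at least one capacity disk")
    return problems


def raw_capacity_gb(disk_groups: list[dict]) -> int:
    """Sum the capacity disks only; the flash device is cache."""
    return sum(sum(group["capacity_gb"]) for group in disk_groups)


# Hypothetical host with two disk groups of two 1 TB disks each.
host = [
    {"flash_devices": 1, "capacity_gb": [1000, 1000]},
    {"flash_devices": 1, "capacity_gb": [1000, 1000]},
]
print(validate_host(host), raw_capacity_gb(host))
```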

You need at least three vSAN hosts / nodes for production environments. Your data (for example a VM) is stored across the hosts in the cluster. Three components are stored in total: two replicas and a witness. If a host failure occurs, the cluster needs a quorum to decide what to do with your data. That’s the reason why there are three components.
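Why three components are enough for a tiebreak can be shown with a small majority check. The sketch below assumes that every component has exactly one vote and that an object stays accessible as long as more than half of the votes and at least one replica are reachable. The component names are made up for illustration.

```python
def object_accessible(components_up: dict[str, bool]) -> bool:
    """Majority check for one object.

    Keys are component names (replicas start with "replica"),
    values say whether the component is currently reachable.
    """
    votes_total = len(components_up)
    votes_up = sum(1 for up in components_up.values() if up)
    has_replica = any(
        up for name, up in components_up.items() if name.startswith("replica")
    )
    # More than half of the votes and at least one full copy of the data.
    return has_replica and votes_up * 2 > votes_total


# FTT=1: two replicas and a witness on three different hosts.
one_host_lost = {"replica_a": True, "replica_b": False, "witness": True}
two_hosts_lost = {"replica_a": True, "replica_b": False, "witness": False}

print(object_accessible(one_host_lost))
print(object_accessible(two_hosts_lost))
```

With only two components (two replicas, no witness), a single failure would leave one vote out of two, which is not a majority. The witness is what breaks that tie.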

Install checklist

  • HCL checked?
  • disk controller in pass through?
  • host cache and capacity disks?
  • VMkernel marked as vSAN?
  • Multicast on network level?
  • Uplink or VLAN considerations?
  • 1GB or 10GB network connection?
  • Cluster of three nodes?
  • Standard switches or distributed virtual switches?

Conclusion

Setting up vSAN is easy as pie. Meet the requirements and turn it on:

  • set up vSAN networking
  • enable vSAN on the cluster
  • select automatic or manual disk claiming
  • create disk groups if you chose manual disk claiming

And because it’s that easy, the official movie about setting up VMware Virtual SAN is only about three minutes long. There you go.

https://www.youtube.com/watch?v=1EDWKE93ivw

Please read the other blog post about the vSAN deploy and manage course too:

Veeam – Automatic backup tests with SureBackup

A few days ago one of my co-workers asked me whether, and if so how, I test the backups of my customers. I answered that I don’t do that personally, but let them be tested automatically. He asked me how I do that, and so I showed him one of the coolest features in Veeam Backup & Replication. It’s called SureBackup, and it makes your life as a sysadmin a lot easier. Believe me.

SureBackup in Veeam Backup & Replication (available in the Enterprise and Enterprise Plus editions) is a great feature to test your Veeam backups automatically. That way you can make sure that a VM is working if you really need to restore a whole VM and not only single files or folders. But it’s not only about automatic backup tests. With Veeam SureBackup you can also easily create lab environments. Leverage your backups for patch testing, development and other stuff. You don’t have to deploy new virtual servers for patch testing, or only because of a new software release. Test all that stuff in your SureBackup lab to see if everything works as it should. Today’s blog post is all about SureBackup. I’ll show you how to set it up and try to give you some hints about it.

The challenge

Your daily backups are running fine. You receive the success mails after backup and / or replication and you know it’s all good. But how can you be sure that a restore of a whole virtual machine will work when you really need it? Most companies test their backups once a month. It is like an insurance. It’s good to have one, but it’s better if you don’t need it. Your backup job checks the health of the backup file. And SureBackup checks whether your VMs can really boot and all the necessary services come online. Veeam SureBackup is like your insurance.

The solution

With just three simple steps you can be assured that your backups really work when the time comes to restore a complete VM or business-critical applications.

VMware – Configure vSphere Auto Deploy

For the last few days and weeks I have been preparing for my VCP6-DCV exam. Well, I’m still preparing for it; there is a hell of a lot of stuff to learn and understand. One of those things is vSphere Auto Deploy.

vSphere Auto Deploy is a cool feature for large infrastructures. Imagine you just have to mount your ESXi hardware hosts in the racks and start them, and they get their software, setup and configuration via the network. Without the need for any USB, CD or remote mounting of ISO files (like with HPE iLO or DELL iDRAC), and without any local storage if you boot your ESXi hosts from a shared datastore. Your host is online in just a few minutes and ready for use in your cluster, or whatever scenario you need it for.

Today I did some Auto Deploy stuff. And it is not as easy as I thought. You can’t do much via the vSphere Web Client (I’m absolutely the GUI type of sysadmin). You have to do some PowerCLI stuff, but not as much as I was afraid of. Let me show you how I did it. And please drop a comment if there is anything wrong, or if there is anything to improve. I’m happy to update this post if necessary.

Stage 1 – Preparation for Auto Deploy

What do you need for using Auto Deploy? There is not much:

  • vCenter (on Windows Server or the Appliance)
  • PowerCLI
  • a TFTP server (I used Open TFTP Server, which worked just fine)
  • ESXi Offline Bundle
  • Probably some hosts you want to set up with Auto Deploy

Let me give you some tips about the configuration of vCenter and the Open TFTP Server. With this piece of software I had to try and fail a few times until I got it up and running.

vCenter configuration – Enable Auto Deploy

  1. Log in to your vCenter with the Web Client.
  2. Click on “Administration”.
  3. Click on “System Configuration” and then “Services” on the next page.
  4. Click on “Auto Deploy”.
  5. In the toolbar on top, click “Actions” and then “Start”.
  6. Under “Actions” and “Edit Startup Type” you can configure Auto Deploy for an automatic or manual start.

vCenter configuration – download TFTP boot zip file

  1. Log in to your vCenter with the Web Client.
  2. Click “vCenter Inventory Lists” and then click “vCenter Servers”.
  3. In this overview click your vCenter Server.
  4. Click on “Manage” then “Settings” and then “Auto Deploy”.
  5. Click on the link “Download TFTP Boot Zip” to download the file. You’ll need it later for the TFTP server.

Open TFTP Server – setup and configuration

  1. Download and install the Open TFTP Server (I’ll use this software in my configuration).
  2. Use the standard settings for installation.
  3. Navigate to the setup folder (e.g. C:\OpenTFTPServer) and open the “OpenTFTPServerMT.ini” with a text editor.
  4. You’ll need to configure the [HOME] parameters. This is the folder where you have to save the TFTP Boot Zip from above.
  5. Locate the [HOME] parameter, ignore all the text there and add just “C:\TFTP-Root” (or any other folder you’d like) after the last line of text in this part of the INI file. Add the path to the folder without quotation marks.
  6. Restart the Open TFTP Server service.
  7. Copy your “TFTP Boot Zip” file from above to the folder you added in the INI file and unpack it directly there. You should now have about 11 files, including the zip file.
  8. Restart the Open TFTP Server service again.

Configure DHCP server with options

You need to configure your DHCP server with two options so that your ESXi hosts can boot via network / PXE, get an IP address and configuration file.

  • Add option 66, which is frequently called next-server. Add the IP address of your TFTP server as value.
  • Add option 67, which is frequently called boot-file. Add undionly.kpxe.vmw-hardwired as value.
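What these two settings look like depends on your DHCP server. As a rough sketch, the Python function below renders them the way they are commonly written for an ISC dhcpd-style server, as next-server and filename statements. The function name and the IP address are assumptions for illustration; the boot file name is the one from the list above.

```python
def render_pxe_options(
    tftp_server_ip: str,
    boot_file: str = "undionly.kpxe.vmw-hardwired",
) -> str:
    """Render the two PXE settings as ISC dhcpd-style statements.

    Assumption: the target is an ISC dhcpd-like configuration; other
    DHCP servers expose options 66 and 67 in their own way.
    """
    return (
        f"next-server {tftp_server_ip};\n"  # where to find the TFTP server
        f'filename "{boot_file}";\n'        # which boot file to request
    )


# 192.0.2.10 is a documentation address, replace it with your TFTP server.
print(render_pxe_options("192.0.2.10"))
```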

Stage 2 – Create depot, profiles and rules, and deployment

  1. Download the ESXi Offline Bundle from VMware and save it in a folder on the machine where you’re doing this stuff.
  2. Open PowerCLI and connect to your vCenter (Connect-VIServer).
  3. Add-EsxSoftwareDepot c:\tmp\update-from-esxi6.0-6.0_update02.zip
  4. Add-EsxSoftwareDepot http://<vcenter server>/vSphere-HA-depot
  5. Find out which profiles are in this offline bundle with "Get-EsxImageProfile | fl * | Out-File C:\tmp\profiles.txt"
  6. New-EsxImageProfile -CloneProfile "ESXi-6.0.0-20160302001-standard" -Name "ESXiStatelessImage"
  7. Add-EsxSoftwarePackage -ImageProfile "ESXiStatelessImage" -SoftwarePackage vmware-fdm
  8. New-DeployRule -Name "FirstBoot" -Item "ESXiStatelessImage" -AllHosts
  9. Add-DeployRule -DeployRule "FirstBoot"
  10. Now boot one of your hosts. If everything is configured correctly up to this point you should see the ESXi image booting.
  11. Log in to your vCenter with the Web Client. You should see the new auto-deployed host in your inventory; in my lab this was the case.
  12. Configure this host (networking, storage etc.) through the Web Client.
  13. In the Web Client, create a new host profile named “ESXiAutoDeploy” based on this newly booted host.
  14. New-DeployRule -Name "ProductionBoot" -Item "ESXiStatelessImage", "ESXiAutoDeploy", <target_cluster> -Pattern "vendor=<unique hw identifier>"
  15. Add-DeployRule -DeployRule "ProductionBoot"
  16. Remove-DeployRule -DeployRule "FirstBoot" -Delete
  17. Boot all of your auto deploy hosts.
  18. Assign the created host profile to these hosts.
  19. Reboot these hosts => aaaand you’re done.
  20. If you want to save the newly created image profile as a software depot, so you can make changes at a later time if needed, just do this:
    1. Export-EsxImageProfile -ImageProfile "ESXiStatelessImage" -ExportToBundle -FilePath c:\tmp\ESXiStatelessImage.zip

Conclusion

As I wrote above, it is not that easy, but it was not as hard as I was afraid of. There are some things to consider, like configuring ESXi with the correct networking, storage etc., so that you can later create a suitable host profile which fits all of your hosts. In this first try I didn’t create a big configuration, just some basic stuff to understand Auto Deploy and to write this blog post.

I have to investigate the password policy, or rather how I can set a password policy, because my test ESXi host did not have a root password after I assigned the host profile. I know I configured the password in step 12 above along with the rest of the configuration, but the password didn’t come with the host profile. Anyway, the configuration of Auto Deploy worked. Now I’ve got some more tasks, for example to find out about this password issue.

Special thanks to Duncan Epping for his cheat sheet (no, I did not read his article, just his cheat sheet, but yes, I saw the link to his article). So I had the commands I needed and a common thread for orientation.

Also thanks to Vladan Seget for his article about some new features in vSphere 6.5 including Auto Deploy (which now has a GUI! How cool is that?)

VMware – Read before upgrade to vSphere 6.5

Yesterday VMware announced the general availability of the brand new vSphere 6.5. They announced the new version at this year’s VMworld in Barcelona, and now you can download and install the bits. But there is a catch. Please make sure you read and understand all the important information before upgrading to vSphere 6.5, because there might be some limitations at the moment. Let me shed some light on this.

Compatibility considerations

You should not upgrade to vSphere 6.5 if you are running one (or some / all) of these software components in your environment:

  • VMware NSX
  • VMware Integrated OpenStack
  • vCloud Director for Service Providers
  • vRealize Infrastructure Navigator
  • App Volumes
  • Horizon Air Hybrid-Mode
  • vCloud Networking and Security
  • vRealize Business for Cloud
  • vRealize Configuration Manager
  • vRealize Hyperic
  • vRealize Network Insight

These components are not yet compatible with vSphere 6.5. But as we know VMware, they are already working on updates. Please check the VMware Product Interoperability Matrix for further information about updates to the products above.

  • If you have to revert a migration, please check VMware KB2146453 for reverting a vCenter Server to Appliance migration.
  • To roll back a vCenter Server Instance on Windows, please check the vSphere Upgrade Guide.

Upgrade Considerations

Before upgrading your environment, review these critical KB articles to make sure the upgrade will be successful.

vCenter Server

vCenter Server to vCenter Server Appliance

PSC High Availability

ESXi

NSX

vRealize Operations

vSphere Web Client

Known Issues

vCenter Server

vRealize Operations Manager

Security Considerations

TLS protocols

Encryption considerations

  • Running an encrypted KMS virtual machine can cause a loss of data in the event of a host failure.

More details in the VMware Knowledgebase (KB2147548):

https://kb.vmware.com/kb/2147548

*** Update ***

Backup Considerations

There is one thing I forgot to mention. If you are using Veeam Availability Suite v9.5 then you can’t do backups with vSphere 6.5 at the moment, because Veeam does not support this vSphere version yet. But the guys at Veeam are also working on an update, which (historically) is released about two months after general availability of a new vSphere version.

So stay tuned!