VMware – Clone a VM with snapshots (and consolidate it)

Recently we had a weird issue in our office. We had one virtual machine with a snapshot. That by itself isn’t an issue, a snapshot is nothing special. But someone created that snapshot before a software upgrade and forgot to delete it, so it kept growing and growing. We only noticed the snapshot when the VM’s service owner requested some additional disk space, which we couldn’t add because of that snapshot. So we scheduled a maintenance window to delete it. Easier said than done.

The VM went offline because of disk consolidation. That can happen, depending on snapshot size and storage system. But the VM didn’t just go offline for a short time, it unexpectedly stayed offline for hours. Together with VMware support we were able to stop the snapshot deletion. The VM came back online, but with the well-known “Disk Consolidation Needed” status. We found out that this snapshot was about 400 GB in size. What a bummer! So we scheduled another maintenance window to consolidate that snapshot. Unfortunately that didn’t work well. Consolidation timed out at around 96%, not sure why. “Error communicating with the host” isn’t very helpful in that moment.

Some research and another chat with VMware support led us towards cloning the disk files. During the cloning of a disk file the snapshot is consolidated. And as you’re doing the disk clone locally on the ESXi host with “vmkfstools” and not within vCenter, there shouldn’t be a timeout either. So we had our action plan, and we scheduled another maintenance window with the service owner.
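As a rough sketch of that plan, assuming the VM is powered off: the helper below only builds and prints the clone command so it can be checked before anything runs. The function name clone_cmd, the datastore paths and the VM names are hypothetical and not taken from our environment. “vmkfstools -i” clones a virtual disk and “-d thin” selects a thin-provisioned target.

```shell
# Build the vmkfstools clone command for one virtual disk (dry run, prints only).
# SRC should be the descriptor of the newest snapshot delta (for example
# myvm-000001.vmdk) so that the clone contains the consolidated data.
# The target directory has to exist before the command is run.
clone_cmd() {
    src="$1"
    dst="$2"
    printf 'vmkfstools -i "%s" "%s" -d thin\n' "$src" "$dst"
}

# Hypothetical paths, for illustration only.
clone_cmd /vmfs/volumes/datastore1/myvm/myvm-000001.vmdk \
          /vmfs/volumes/datastore1/myvm-clone/myvm.vmdk
```

Once the printed command looks right, it can be run in the ESXi shell. Printing first is a deliberate choice here, because a clone of a 400 GB snapshot chain is not something you want to start with a typo in the path.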


VMware vSAN cache disk failed and how to recover from it

Recently I rebuilt my VMware homelab from scratch with the most recent vSphere version available at that point. I had planned to rebuild my lab a long time ago, but because of my job and other things I really didn’t have the time to do that. But recently I had to rebuild my lab because I screwed up my vCenter. Yes, I screwed it up. So what did that mean? Reinstalling everything completely from scratch: all my physical ESXi hosts, domain controller, vCenter, jump host and backup server. All these services are running on a standalone ESXi server with some local disks. This server is called my homebase. I’ve got some more servers which are running my VMware vSAN environment. I reinstalled these too and reconfigured everything that was needed, like networking and storage.

This week one of my vSAN cluster nodes went into degraded mode because one of the cache disks failed. I thought, easy, just replace the cache disk and that’s it. But no, the struggle became real…

What happened?

I’ve got three DELL servers for my vSAN cluster. All servers are equipped with one SSD as cache tier and three SSDs for the capacity tier. Now one cache disk failed because of reasons (I really don’t know why). That caused vSAN to go into degraded mode, as “failures to tolerate” was set to 1, so one failure (the failed cache disk) was compensated. Just for your information, in case you didn’t know: if the cache disk of a disk group fails, the whole disk group becomes unavailable. In my case that meant that one third of the whole vSAN capacity was gone.

What did I do to resolve this?

My first idea was to replace the failed cache disk, as I’ve got some identical disks available as spare drives. Well, not directly as spare drives, but installed and configured as RAID 5 in my homebase ESXi host. So I did a Storage vMotion of all my homebase VMs mentioned above to another local RAID 5 datastore, deleted the SSD RAID datastore and removed the disks. The physical replacement of the disk was easy. But telling my degraded vSAN node to accept this disk was a different story.

Checking the disks

After I installed the “new” disk into the vSAN node I did a rescan on all storage adapters. And there was nothing. Only the already existing capacity disks but no cache disk. So I tried the second and the third identical disk, with the same result: only the capacity disks were visible in vCenter on the host, but not the cache disk. What’s wrong here? I knew that an ESXi server only shows empty disks without any volumes, file systems or data on them. But how should I wipe this disk when it isn’t even visible with esxcli?

As I’m using an HPE Smart HBA H240 as the storage controller in the DELL servers, I had already installed the HPE Smart Storage Administrator CLI tool on all the vSAN nodes. So I was able to look at the storage controller to see what’s happening there (or probably not).

The following command showed me that all disks are here and are fine:

./ssacli ctrl slot=2 pd all show status

But I was still struggling. Why is vCenter still showing only the capacity disks?

Clearing the disk(s)

An article by Cormac Hogan showed me how to reclaim disks for other uses. So I deleted all the partitions on the existing capacity disks, hoping that the cache disk would then also come back online. I had read on another blog that wiping all vSAN disks can bring back undetected disks. But that didn’t help.

First I removed the vSAN node from the vSAN cluster:

esxcli vsan cluster leave

Next I checked with partedUtil how many and what kind of partitions are on the disks:

partedUtil get /vmfs/devices/disks/mpx.vmhba1:C2:T2:L0

Each capacity disk showed two partitions, so I wiped them all:

partedUtil delete /vmfs/devices/disks/mpx.vmhba1:C2:T2:L0 1

partedUtil delete /vmfs/devices/disks/mpx.vmhba1:C2:T2:L0 2
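With several capacity disks, typing one delete per partition gets tedious. Here is a minimal sketch that turns “partedUtil get” output into the matching delete commands. It assumes that the first output line is the disk geometry and that every following line starts with a partition number. The function name delete_cmds and the sample output are made up for illustration and were not captured from my host.

```shell
# Read "partedUtil get" output on stdin and print one delete command per partition.
# Assumption: line 1 is the disk geometry, each later line starts with a partition number.
delete_cmds() {
    disk="$1"
    awk -v d="$disk" 'NR > 1 && $1 ~ /^[0-9]+$/ { printf "partedUtil delete %s %s\n", d, $1 }'
}

# Made-up sample output for a disk with two partitions.
sample='60801 255 63 976773168
1 2048 6143 0 0
2 6144 976773134 0 0'

DISK=/vmfs/devices/disks/mpx.vmhba1:C2:T2:L0
printf '%s\n' "$sample" | delete_cmds "$DISK"
```

On the host the input would come from partedUtil get "$DISK" instead of the sample. The sketch only prints the commands, so you can read them before running anything destructive.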

Another look into the HPE Smart Storage Administrator CLI tool showed me that all physical disks were still there. A rescan of all HBAs in vCenter on this particular host didn’t help; only the capacity disks were shown.

I looked a little deeper into the storage controller with the command:

./ssacli ctrl slot=2 pd all show detail

That showed something not completely unexpected:

physicaldrive 2I:0:1

Masked from HBA: The drive contains controller configuration data and has been disabled
in order to protect the configuration data. Please run the "modify clearconfigdata"
command on the drive to re-enable it.

This physical drive above was the cache disk I was missing in vCenter. OK, so let’s clear the “configdata” and let’s see what happens then:

./ssacli ctrl slot=2 pd 2I:0:1 modify clearconfigdata

I checked again with “all show detail”, and the “modify clearconfigdata” message was gone.
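If you don’t want to scroll through the detail output of every drive, a small filter can point out the masked ones. This is only a sketch: it assumes that each drive section starts with a “physicaldrive <id>” line, as in the output above, and that the mask message appears somewhere below that line. The function name masked_drives is made up, and the second drive in the sample is invented for contrast.

```shell
# Print the ID of every physical drive whose detail section contains "Masked from HBA".
# Reads "ssacli ... pd all show detail" output on stdin.
masked_drives() {
    awk '
        $1 == "physicaldrive" { id = $2 }            # remember which drive section we are in
        /Masked from HBA/ && id != "" { print id }   # report the drive when the mask message shows up
    '
}

# Sample built from the message above; drive 2I:0:2 is made up.
sample='physicaldrive 2I:0:1
Masked from HBA: The drive contains controller configuration data and has been disabled
physicaldrive 2I:0:2
Status: OK'

printf '%s\n' "$sample" | masked_drives
```

On the vSAN node the same function could be fed with ./ssacli ctrl slot=2 pd all show detail | masked_drives, and an empty result would mean that no drive is masked anymore.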

Now I was able to rescan all storage adapters in vCenter on this host, and that brought back my missing cache disk.

But that was too easy…

After having my cache disk back I went into vSAN configuration in vCenter and claimed the disks. The small one for the cache tier, the bigger ones for the capacity tier. And boom! This particular disk group went into another network partition group. Well done, thank you for nothing!

When you search around the internet for vSAN network partition you will find many forum and blog posts mentioning that this happens if something with the network configuration wasn’t as good as it should be. In my case I checked everything and I changed nothing on the network. So this partitioning issue had another reason. But to be honest I didn’t try to solve that. I wasn’t in the mood for that. I only wanted to bring back my vSAN into a good and healthy state.

I removed this vSAN node from the cluster by just dragging and dropping it out of the cluster. Then I tried to remove it from the inventory. And another boom!

The resource 'eagle.lan.driftar.ch' is in use.

That was the error message in vCenter when I tried to remove the host from inventory. But why? The host is in maintenance mode! Dang it! Let me remove it!

After doing some research on the interwebs I also checked the tasks in vCenter to see if there was a bit more information. And I found something:

Cannot remove the host eagle.lan.driftar.ch because it's part of VDS vMotion-DSwitch vSAN-DSwitch.

Well, that’s true. And that was also the obvious reason why I couldn’t remove the host from the inventory. So I had to reconfigure the host networking, moving the VMkernel ports for vMotion and vSAN back to their original local virtual switches. After that I was finally able to remove the host from the inventory.

Now rebuilding vSAN…

The next steps were easy. I added the host back to the vSAN cluster and configured the VDS for vMotion and vSAN as they were before. Then I went into vSAN configuration and checked the disk group. Lucky me the disk group configured before was still there and healthy, and vSAN claimed it automatically. And no network partitioning this time! All hosts and disk groups in the same network partition group!

After retesting the health of the vSAN cluster, it showed that one component needed a resync. One of my templates was partially on this disk group before the failure and is now waiting until the resync completes.

But at least vSAN is working fine again!

Closing words

In most cases, or probably in all cases, replacing a disk in vSAN should be easy. Usually you will replace a failed disk with a new, empty one, unlike me. But that doesn’t mean you can’t reuse a used disk, as long as you know what to do. I’d be glad if this blog post helped you solve the issue.

If you follow the steps described in the VMware Knowledge Base then you should be fine.

Synology now with backup for Office 365

Long time no hear, and I’m really sorry for that. It was a busy time, with a new job, a huge project and also military duty in between. But now things are calming down, and so am I. And I’ve got some time for a new blog post.

Recently I stumbled across a newsletter from Synology. They now have a backup tool for Office 365 available which is free of charge for 10 users. Extra license packs can be purchased for adding and renewing additional licenses. That doesn’t sound so bad. But wait. Office 365 is in the cloud, doesn’t Microsoft back it up so that I don’t have to worry about it? Well, long story short, NO. There is some retention like deleted items and stuff, and you can modify specific settings. But backing up Office 365 data is entirely your own responsibility. There are various backup solutions like Veeam Backup for Office 365 which work absolutely great, and also the recently announced solution from Synology which I’m writing about today. Let’s look at it a little closer.

Unfortunately not every Synology NAS system is supported, so please have a look at the list below to see if your device is on it. Lucky me, I bought a new NAS for my vSphere homelab some months ago which fits perfectly for this test setup.

Supported NAS systems

  • 18 series: FS1018, RS3618xs, RS818RP+, RS818+, RS2818RP+, RS2418RP+, RS2418+, DS3018xs, DS418play, DS918+, DS718+, DS218+, DS1618+
  • 17 series: FS3017, FS2017, RS3617xs, RS3617RPxs, RS4017xs+, RS3617xs+, RS18017xs+, DS3617xs, DS1817+, DS1517+
  • 16 series: RS2416RP+, RS2416+, RS18016xs+, DS416play, DS916+, DS716+II, DS716+, DS216+II, DS216+
  • 15 series: RS815RP+, RS815+, RC18015xs+, DS3615xs, DS415+, DS2415+, DS1815+, DS1515+
  • 14 series: RS3614xs, RS3614RPxs, RS814RP+, RS814+, RS3614xs+, RS2414RP+, RS2414+
  • 13 series: RS3413xs+, RS10613xs+, DS713+, DS2413+, DS1813+, DS1513+
  • 12 series: RS3412xs, RS3412RPxs, RS812RP+, RS812+, RS2212RP+, RS2212+, DS3612xs, DS712+, DS412+, DS1812+, DS1512+
  • 11 series: RS3411xs, RS3411RPxs, RS2211RP+, RS2211+, DS3611xs, DS411+II, DS411+, DS2411+, DS1511+

More information about Active Backup for Office 365

Synology has plenty of information about Active Backup for Office 365 on their website.

Some of the features:

  • Protection of mail, calendar, contacts, OneDrive
  • With Active Backup for Office 365 Portal enabled, both employees and admins can easily locate items for restoration and restore/export them with simple clicks
  • Mail/calendar attachments (if stored in Btrfs volumes) and OneDrive files that contain identical content will only be stored to Synology NAS once, which saves storage space
  • Files stored in Btrfs volumes on Synology NAS can be deduplicated with previous versions, minimizing the storage space required

But now let’s talk tech and let’s dive into the setup and configuration of Active Backup for Office 365.


Recap of the latest VMware vSphere 6.7 releases


Oh boy, what a week! Some say that winter is now finally gone, nice and warm weather, not wearing winter jackets anymore. But hey, I’m not a weatherman. When you’re sitting in the office I think it doesn’t matter if it’s raining or snowing outside. Just kidding… Let’s get back to business.

There were some rumors about the next upcoming version. Will it be version 7? Or something just above 6.5? VMware released several new product versions, and they all carry version number 6.7. What a list! It’s one of those email notifications that I usually like to scroll down, a little more, and more and more, to get all the news soaked up like a sponge. I’d like to dive in right now and provide you a recap of this week’s VMware releases. And as I said, it’s quite a list. I’ll pick out just some new key features. You can find the full release news on the VMware Blogs (links provided here).

New product versions

vSphere 6.7

  • several new APIs that improve the efficiency and experience to deploy vCenter, to deploy multiple vCenters based on a template, to make management of vCenter Server Appliance significantly easier, as well as for backup and restore
  • significantly simplifies the vCenter Server topology through vCenter with embedded platform services controller in enhanced linked mode
  • 2X faster performance in vCenter operations per second
  • 3X reduction in memory usage
  • 3X faster DRS-related operations (e.g. power-on virtual machine)
  • vSphere 6.7 improves efficiency when updating ESXi hosts, significantly reducing maintenance time by eliminating one of two reboots normally required for major version upgrades (Single Reboot). In addition to that, vSphere Quick Boot is a new innovation that restarts the ESXi hypervisor without rebooting the physical host, skipping time-consuming hardware initialization
  • The HTML5-based vSphere Client provides a modern user interface experience that is both responsive and easy to use, and it’s now including other key functionality like managing NSX, vSAN, VUM as well as third-party components.
  • enabling encrypted vMotion across different vCenter instances
  • enhancements to Nvidia GRID vGPU
  • vSphere 6.7 introduces vCenter Server Hybrid Linked Mode, which makes it easy and simple for customers to have unified visibility and manageability across an on-premises vSphere environment running on one version and a vSphere-based public cloud environment, such as VMware Cloud on AWS, running on a different version of vSphere.
  • vSphere 6.7 also introduces Cross-Cloud Cold and Hot Migration
  • Delivers a new capability that is key for the hybrid cloud, called Per-VM EVC

More information here: Introducing VMware vSphere 6.7 / VMware Blogs

vSAN 6.7

  • vSAN 6.7 provides intuitive operations that align with other VMware products from a UI and workflow perspective to provide a “one team, one tool” experience
  • Introduces a new HTML5 UI based on the “Clarity” framework as seen in other VMware products (all products in the VMware portfolio are moving toward this UI framework)
  • A new feature known as “vRealize Operations within vCenter” provides an easy way for customers to see vRealize intelligence directly in the vSphere Client
  • vSAN 6.7 now expands the flexibility of the vSAN iSCSI service to support Windows Server Failover Clusters (WSFC)
  • vSAN 6.7 introduces an all-new Adaptive Resync feature to ensure a fair-share of resources are available for VM I/Os and Resync I/Os during dynamic changes in load on the system
  • Optimizes the de-staging mechanism, resulting in data that “drains” more quickly from the write buffer to the capacity tier.  The ability to de-stage this data quickly allows the cache tier to accept new I/O, which reduces or eliminates periods of congestion
  • New health checks include:
    • Maintenance mode verification ensures proper decommission state
    • Consistent configuration verification for advanced settings
    • vSAN and vMotion network connectivity checks improved
    • Improved vSAN Health service installation check
    • Improved physical disk health check combines multiple checks (software, physical, metadata) into a single notification
    • Firmware check is independent from driver check

More information here: What’s New with VMware vSAN 6.7 / VMware Blogs and also here: Extending Hybrid Cloud Leadership with vSAN 6.7

vCenter Server 6.7

  • The vSphere Client (HTML5) is full of new workflows and closer to feature parity
  • built-in file-based vCenter Server backup now includes a scheduler

Installation

  • No load balancer required for high availability and fully supports native vCenter Server High Availability.
  • SSO Site boundary removal provides flexibility of placement.
  • Supports vSphere scale maximums.
  • Allows for 15 deployments in a vSphere Single Sign-On Domain.
  • Reduces the number of nodes to manage and maintain.

Migration

  • vSphere 6.7 is also the last release to include vCenter Server for Windows, which has been deprecated.
  • migrate to the vCenter Server Appliance with the built-in Migration Tool
  • Deploy & import all data
  • Deploy & import data in the background
  • Customers will also get an estimated time of how long each option will take when migrating

Upgrading

  • vSphere 6.7 will support upgrades and migrations only from vSphere 6.0 or 6.5
  • vSphere 5.5 does not have a direct upgrade path to vSphere 6.7
  • Upgrade path: vSphere 5.5 to vSphere 6.0 or 6.5, and then to vSphere 6.7
  • vCenter Server 6.0 or 6.5 managing ESXi 5.5 hosts cannot be upgraded or migrated until the hosts have been upgraded to at least ESXi 6.0
  • Reminder: end of general support for vSphere 5.5 is September 19, 2018.
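The upgrade rules from this list can be written down as a tiny lookup. This is just a sketch of the bullets above, not a VMware tool; the function name upgrade_path is made up, and anything not mentioned in the list is reported as not covered.

```shell
# Return the upgrade path to vSphere 6.7 for a given source version,
# using only the rules from the list above.
upgrade_path() {
    case "$1" in
        6.0|6.5) echo "$1 -> 6.7" ;;                     # direct upgrade is supported
        5.5)     echo "5.5 -> 6.0 or 6.5 -> 6.7" ;;      # no direct path, one intermediate step
        *)       echo "not covered by the list above" ;;
    esac
}

upgrade_path 5.5
upgrade_path 6.5
```

Keep in mind that the list adds one more condition this lookup does not model: a vCenter Server 6.0 or 6.5 that still manages ESXi 5.5 hosts has to wait until those hosts are on at least ESXi 6.0.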

Monitoring and Management

  • vSphere Appliance Management Interface (VAMI) on port 5480 has received an update to the Clarity UI
  • There is now a tab dedicated to monitoring. Here you can see CPU, memory, network, database and disk utilization.
  • Another new tab called Services is also within the VAMI, giving the option to start, stop, and restart vCenter Server services if needed
  • vSphere 6.7 also marks the final release of the vSphere Web Client (Flash). Some of the newer workflows in the updated vSphere HTML5 Client release include:
    • vSphere Update Manager
    • Content Library
    • vSAN
    • Storage Policies
    • Host Profiles
    • vDS Topology Diagram
    • Licensing

More information here: Introducing vCenter Server 6.7 / VMware Blogs

vSphere with Operations Management 6.7

  • new plugin for the vSphere Client. This plugin is available out-of-the-box and provides some great new functionality
  • When interacting with this plugin, you will be greeted with 6 vRealize Operations Manager (vROps) dashboards directly in the vSphere client
  • overview, cluster view, and alerts for both vCenter and vSAN views
  • The new Quick Start page makes it easier to get directly to the data you need
  • four use cases: Optimize Performance, Optimize Capacity, Troubleshoot, and Manage Configuration
  • The Workload Optimization dashboard was updated. Workload Optimization takes predictive analytics and uses them in conjunction with vSphere Distributed Resource Scheduler (DRS) to move workloads between clusters. New with vROps 6.7, you can now fine tune the configuration for workload optimization
  • vROps 6.7 introduced a completely new capacity engine that is smarter and much faster

More information here: vSphere with Operations Management 6.7 / VMware Blogs

vSphere 6.7 Security

  • TPM 2.0 support for ESXi
  • Virtual TPM 2.0 for VMs
  • Support for Microsoft Virtualization Based Security
  • UI updates (combined all encryption functions (VM Encryption, vMotion Encryption) into one panel in VM Options)
  • Multiple SYSLOG targets
  • FIPS 140-2 validated cryptographic modules – by default!

More information here: vSphere 6.7 Security / VMware Blogs

Developer and Automation Interfaces for vSphere 6.7

  • Added functionality to existing APIs in vSphere 6.7
  • Coverage of new areas
  • Appliance API updates: from prechecks to staging to installation and validation, it’s all available by API now
  • vCenter API updates: new APIs have been added to interact with the VM’s guest operating system (OS), viewing Storage Policy Based Management (SPBM) policies, and managing vCenter server services
  • also a handful of new APIs to handle the deployment and lifecycle of the vCenter server
  • a handful of updates to the vSphere Web Services (SOAP) APIs as well

More information here: Developer and Automation Interfaces for vSphere 6.7 / VMware Blogs

Faster Lifecycle Management Operations in VMware vSphere 6.7

  • brand-new Update Manager interface which is now part of the HTML5 Client
  • Update Manager in vSphere 6.7 keeps VMware ESXi 6.0 to 6.7 hosts reliable and secure
  • the new UI provides a much more streamlined remediation process, requiring just a few clicks to begin the procedure. It’s not just a port from the old Flash client
  • Hosts that are currently on ESXi 6.5 will be upgraded to 6.7 significantly faster than ever before
  • Several optimizations have been made for that upgrade path, including eliminating one of two reboots traditionally required for a host upgrade
  • Quick Boot eliminates the time-consuming hardware initialization phase by shutting down ESXi in an orderly manner and then immediately re-starting it

More information here: Faster Lifecycle Management Operations in VMware vSphere 6.7 / VMware Blogs

vSphere 6.7 for Enterprise Applications

  • includes support for Persistent Memory (PMEM) and enhanced support for Remote Direct Memory Access (RDMA)
  • PMEM is a new layer called Non-Volatile Memory (NVM) and sits between NAND flash and DRAM, providing faster performance relative to NAND flash but also providing the non-volatility not typically found in traditional memory offerings
  • new protocol support for Remote Direct Memory Access (RDMA) over Converged Ethernet, or RoCE (pronounced “rocky”) v2, a new software Fibre Channel over Ethernet (FCoE) adapter, and iSCSI Extension for RDMA (iSER)

More information here: vSphere 6.7 for Enterprise Applications / VMware Blogs

driftar.ch is now serverless


Some of you, my fellow readers, have probably noticed some downtime on my website yesterday. I migrated my blog to another hosting provider. That’s the only reason. You’re asking why? Well, I like a certain level of consistency in some areas of my interests. But in technology, especially in IT, the only consistency is constant change. So I decided to move my website (again).

How it all started…

It all began on Twitter when I joined a discussion about where to place the search bar or search field on a website.

That led us to the conclusion that a website should have no clutter and no word / tag clouds. I agree with that, since tag clouds aren’t up to date anymore. But hey, that’s just my personal opinion.

And the discussion finally ended at the performance of a website.

I did some testing on my “old” website / provider and I wasn’t very sad about the results, but also not very happy. There is always room for improvement. So I did some research on using WordPress as the content management system while serving static websites. Delivering static content like HTML files and images is way faster than delivering dynamic content, even if you’re working with caching plugins and all that stuff, because at least some of the content is still dynamic and thus generated when you’re accessing the website. Please don’t blame me if I’m not 100% correct. I’m not a professional web developer, but at least I know some basics here.

When you look at these tweets, you will find at least two solutions which generate static content out of dynamic content, gohugo.io and Jekyll. I think both solutions are great, if you know the basics about frameworks, programming languages and some more stuff. I tried it, I really tried it. But I failed. In my eyes both solutions are complex to set up and maintain. As I said, I’m not a pro web dev. And if you’re used to a certain content management system like WordPress, then it’s hard to switch.

I moved on with my research. As my employer is using various cloud solutions, like Amazon, I thought, why not go (back) to Amazon?
