Platform9 Multi-node Highly Available Glance Services
With Platform9 Managed OpenStack, administrators can authorize multiple hosts within a region to act as OpenStack Glance image library servers. Adding multiple image libraries has the following benefits:
- Creates a highly available Glance image service deployment that balances load for data-heavy block storage volume (Cinder) and virtual machine instance (Nova) provisioning operations.
- Eliminates downtime when performing maintenance on or migrating an image library role to a new host.
This post describes Platform9’s support for multi-node Glance image services and provides tips to help simplify OpenStack Glance image management. For general information on the Platform9 Image Library infrastructure, see Handling OpenStack Cloud Images in Platform9.
What is the Platform9 Image Library Role?
In an OpenStack cloud, the Glance collection of services manages virtual machine images and makes them available for provisioning operations. In Platform9 Managed OpenStack, Glance services are installed and managed on an on-premises customer host when the administrator applies the Image Library role to the host. This role installs and configures the glance-api service as well as a supporting pf9-imagelibrary service. The pf9-imagelibrary service implements image discovery and performs maintenance on the Glance image repository.
The glance-api service is a simple HTTP web service that stores image metadata and manages an image file repository attached to the local filesystem. When provisioning an instance in Nova, nova-compute reads image metadata from Glance and downloads the virtual machine image from Glance’s image store. Conversely, when Nova is asked to create an image from a VM disk, Nova writes image metadata to Glance and uploads an image file to the Glance image store. Cinder volume provisioning and image creation follow similar workflows.
Deploying Multiple Image Libraries
All Nova VM and Cinder volume provisioning operations involve Glance, so it’s important that Glance be reliable and able to scale. Deploying multiple Glance services addresses both requirements and brings the following benefits:
- When one of the image library hosts is down, Nova and Cinder can choose a working image library host to complete an operation. You now have a highly available Glance image catalog.
- When servicing a host or migrating an image library to a new host, having an additional image library host available eliminates provisioning downtime.
- When many provisioning operations are happening at the same time, the load on the glance-api service can be distributed among all the image library hosts, reducing the load on each.
How Does it Work?
When authorizing a new Glance/Image Library host, Platform9 adds the address of the host to the Keystone service catalog. When asked to perform a deployment operation involving images, Nova and Cinder query the service catalog for a list of glance-api service addresses. The list is randomized, and an attempt is made to perform the operation on each glance-api service. Connection errors or missing image file errors are caught in the deployment logic in each service, and the request is retried on the next service address. The deployment operation fails only when all the available services have failed to do their job.
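The retry behavior described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the actual Nova or Cinder code: the function name try_each_endpoint, the ImageUnavailable exception, and the endpoint URLs are hypothetical, and the operation callback stands in for whatever image download or upload the caller needs.

```python
import random
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

T = TypeVar("T")


class ImageUnavailable(Exception):
    """Hypothetical error: the endpoint answered, but the image file is missing."""


def try_each_endpoint(
    endpoints: Iterable[str],
    operation: Callable[[str], T],
    rng: Optional[random.Random] = None,
) -> T:
    """Run `operation` against glance-api endpoints in random order.

    Connection errors and missing-image errors move on to the next
    endpoint; the call fails only when every endpoint has failed.
    """
    candidates = list(endpoints)
    # Randomize the order so load spreads across image library hosts.
    (rng or random).shuffle(candidates)

    errors: List[Tuple[str, Exception]] = []
    for url in candidates:
        try:
            return operation(url)
        except (ConnectionError, ImageUnavailable) as exc:
            # Remember the failure and retry on the next service address.
            errors.append((url, exc))

    raise RuntimeError(
        f"all {len(candidates)} glance-api endpoints failed: {errors}"
    )
```

A caller would pass the list of addresses obtained from the service catalog along with a function that performs the image operation against a single address.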
Image File Storage
When authorizing the image library/Glance role on a host, the administrator must choose a location on disk to act as an image file repository. The default is /var/opt/pf9/imagelibrary/data, but this can be any location available on the host.
There are three ways to add an image file to the Glance image repository in Platform9 Managed OpenStack:
- Copy an image file into the repository directory. The supporting pf9-imagelibrary service will detect the new file, calculate Glance metadata for it, and add it to Glance.
- Upload an image using the Glance command line client (or through some other REST client). See Tutorial: Manage Images with the OpenStack Glance Client.
- Create an image as a snapshot of a virtual machine disk or Cinder volume.
In each case, when a new image is created, it will only be uploaded to one of the authorized image library hosts, and the image file will be stored in the chosen image repository location on that host. Requests to download the file from another image library host will fail unless the image file is made available on that host. To maximize image availability and fault tolerance, each image library host must have access to all the image files catalogued by Glance. There are two ways to ensure this:
- Connect all of the authorized image library hosts to a shared storage backend (for example an NFS share), so an image file uploaded to one of the image library hosts is immediately available for provisioning from any of them. See Configure NFS Shared Storage for OpenStack Glance Image Catalog or VM Storage.
- If shared storage is not available and each image library host uses independent storage for image files, then the image file repositories should be periodically synchronized. There are numerous tools available to do this, for example Rsync and Unison.
In either case the image repository directory must be the same on each image library host. This means that if one image library host mounts an NFS image repository filesystem at /images, all others must use the same mount point.
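When hosts use independent storage, it can help to check which image files each host lacks before running a synchronization tool. The following is a minimal sketch under stated assumptions: the helper names list_images and missing_per_host are hypothetical, and the per-host inventories are assumed to have been gathered already (for example by running list_images on each host against its repository directory).

```python
from pathlib import Path
from typing import Dict, Set


def list_images(repo_dir: str) -> Set[str]:
    """Return the relative paths of all regular files under a repository directory."""
    root = Path(repo_dir)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def missing_per_host(inventories: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each host, report the files that exist on some other host but not on it."""
    # The union of all inventories approximates the full set of image files.
    catalogue: Set[str] = set().union(*inventories.values())
    return {host: catalogue - files for host, files in inventories.items()}


if __name__ == "__main__":
    # Hypothetical inventories for two hosts that do not share storage.
    inventories = {
        "host-a": {"snap2", "cirros5.img"},
        "host-b": {"snap1"},
    }
    for host, missing in sorted(missing_per_host(inventories).items()):
        print(host, "is missing", sorted(missing))
```

An empty set for every host means the repositories hold the same files; anything else is a candidate for the next synchronization run.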
The Platform9 user interface’s Images view can provide clues about the availability of your images. The “Host Status” description for each image shows whether or not the image is available on each image library host.
Consider a scenario in which bob200-10-4-253-162.platform9.sys and platform9-support-host are two different authorized image library hosts. These hosts do not share storage, and the image repositories have not been properly synchronized. As a result, bob200-10-4-253-162.platform9.sys can only provide snap2 and cirros5.img, while platform9-support-host can only provide snap1. All three images are currently available for provisioning, but a failure of either host will make some images unavailable.
The image view may show other values for the Host Status. Here is the complete list:
- ok: the image is available for provisioning from that host.
- missing: the image file doesn’t exist on the host.
- no-access: the file is there, but permissions won’t allow the glance-api service to read it. Both the glance-api and pf9-imagelibrary services run as the pf9 user, and to use an image file for provisioning, the pf9 user must be able to read it.
- cannot-delete: the image is available, but the file can’t be deleted from the file system by the image library services. If the image is deleted in Glance, it must be removed from the image repository filesystem manually. To avoid this, ensure that the pf9 user has write permission on the image file’s immediate parent directory, and execute permission on each directory in the image path.
- offline: The host itself is not responding, or there was a problem with one of the services associated with the image library role.
To maximize fault tolerance in your image library cluster, you would ideally have a Host Status of ‘ok’ for every image on every image library host.
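The file-level statuses above map onto ordinary filesystem permission checks. The sketch below shows one way an administrator might approximate them locally. It is not how pf9-imagelibrary computes the status: the function name local_host_status is hypothetical, the checks apply to whichever user runs the script (so it would need to run as the pf9 user to be meaningful), and the offline status cannot be detected from a local file check at all.

```python
import os
from pathlib import Path


def local_host_status(image_path: str) -> str:
    """Approximate the Host Status of one image file for the current user."""
    path = Path(image_path)
    try:
        if not path.is_file():
            return "missing"
    except PermissionError:
        # A directory in the path could not be traversed.
        return "no-access"

    # The service user must be able to read the file to provision from it.
    if not os.access(path, os.R_OK):
        return "no-access"

    # Deleting a file needs write and execute permission on its parent directory.
    if not os.access(path.parent, os.W_OK | os.X_OK):
        return "cannot-delete"

    return "ok"
```

Note that os.access reports success for almost everything when run as root, so a check run as root will not reveal permission problems that affect the pf9 user.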
With these features for image management incorporated into Platform9, it is easier than ever to manage OpenStack Glance services and images for your cloud. The ability to install multiple image library roles improves reliability and reduces downtime, and improved health monitoring provides confidence that users have access to the images they need.