# Generalize virtualization stack to different libvirt hypervisor-drivers #259

Open · wants to merge 6 commits into `main`
`design-proposals/generalize-virt-stack/gen-virt-stack.md` (130 additions, 0 deletions)
# Overview

This KubeVirt design proposal discusses how KubeVirt can be used to create `libvirt` virtual machines that are backed by diverse hypervisor drivers, such as QEMU/KVM, Xen, and VirtualBox. The aim of this proposal is to enumerate the design and implementation choices for enabling this multi-hypervisor support in KubeVirt.

## Motivation

Although KubeVirt relies on libvirt to create and manage virtual machine instances (VMIs), it currently supports only the QEMU/KVM virtualization stack (VMM and hypervisor) to host the VMI. This prevents KubeVirt from being used in settings where a different VMM or hypervisor is required.

In fact, libvirt itself is flexible enough to support a diverse set of VMMs and hypervisors. The libvirt API delegates its implementation to one or more internal drivers, depending on the [connection URI](https://libvirt.org/uri.html) passed when initializing the library. The hypervisor drivers currently supported by libvirt are listed below.

> **Member:** Correct that libvirt supports many hypervisors. However, the feature set supported across all hypervisors (i.e., the common subset of features) is actually much smaller. This is why KubeVirt intentionally focused on KVM only, so as not to have to consider the special cases of different hypervisors.

> **Author:** I am in touch with the cloud-hypervisor community, and they are actively working on achieving parity with qemu-kvm in terms of the VMI features offered by KubeVirt.

- [LXC - Linux Containers](https://libvirt.org/drvlxc.html)
- [OpenVZ](https://libvirt.org/drvopenvz.html)
- [QEMU/KVM/HVF](https://libvirt.org/drvqemu.html)
- [VirtualBox](https://libvirt.org/drvvbox.html)
- [VMware ESX](https://libvirt.org/drvesx.html)
- [VMware Workstation/Player](https://libvirt.org/drvvmware.html)
- [Xen](https://libvirt.org/drvxen.html)
- [Microsoft Hyper-V](https://libvirt.org/drvhyperv.html)
- [Virtuozzo](https://libvirt.org/drvvirtuozzo.html)
- [Bhyve - The BSD Hypervisor](https://libvirt.org/drvbhyve.html)
- [Cloud Hypervisor](https://libvirt.org/drvch.html)
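
For illustration, the snippet below shows the driver being selected purely by the connection URI, using the official Go bindings (`libvirt.org/go/libvirt`); the URIs follow the driver pages listed above, and this is a minimal sketch rather than KubeVirt code:

```go
// Minimal illustration: the hypervisor driver is chosen purely by the
// connection URI handed to libvirt; nothing else in the client changes.
package main

import (
	"fmt"

	libvirt "libvirt.org/go/libvirt"
)

func main() {
	// "qemu:///system" selects the QEMU/KVM driver; swapping the URI to
	// e.g. "xen:///system" selects the Xen driver instead.
	conn, err := libvirt.NewConnect("qemu:///system")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// GetType reports the hypervisor driver behind the connection.
	hv, err := conn.GetType()
	if err != nil {
		panic(err)
	}
	fmt.Println("connected to driver:", hv)
}
```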

There are several places in the KubeVirt source code that hard-code the use of the QEMU/KVM hypervisor driver, which prevents the creation of VMIs using any other hypervisor driver. KubeVirt therefore needs to be updated to let users choose the backend libvirt hypervisor driver used to create a given VMI. This would expand the set of scenarios in which KubeVirt can be used.

## Goals

KubeVirt should be able to offer a choice to its users over which libvirt hypervisor-driver they want to use to create their VMI.
> **Member:** Why? This section is important: please provide a justification of how this will help KubeVirt users, or KubeVirt.

> **Author:** Microsoft is a vendor of KubeVirt, as it leverages KubeVirt in its Azure Operator Nexus product as a VM orchestrator. The hypervisor currently used in the Nexus product is qemu-kvm; however, in the future MSFT is looking at alternative hypervisors such as cloud-hypervisor. To continue using KubeVirt for this product, it would make sense to make it hypervisor-agnostic.

> **Author:** There is also another project called [Virtink](https://github.com/smartxworks/virtink), which was created to add K8s-based orchestration support for cloud-hypervisor-based VMs. This shows that there is a need for K8s-based orchestration of cloud-hypervisor VMs, and KubeVirt already interfaces with libvirt, which supports a cloud-hypervisor driver.


## Non Goals

Support all features available in KubeVirt in all libvirt hypervisor drivers. As libvirt makes progress in bringing feature parity among its hypervisor drivers, KubeVirt will also enable more features in the evolving hypervisor drivers.

## Definition of Users

This proposal is aimed at serving users who intend to use KubeVirt on a cluster where at least one node has a virtualization stack different from the default QEMU-KVM stack.

## User Stories

- A user trying to use KubeVirt on a cluster of machines whose hypervisor-VMM pair is not necessarily QEMU/KVM.
- A regular user of libvirt with any of its supported hypervisor drivers who now wants to leverage the orchestration capability provided by KubeVirt, without abandoning their hypervisor driver of choice.
- A regular user of a hypervisor-VMM-specific orchestration tool who wants to expand to a cluster with a diverse set of hypervisor-VMM pairs.

## Repos

- KubeVirt
- Libvirt

# Design

## API Changes

Addition of a `vmi.hypervisor` field. Example of how a user could request a specific hypervisor as the underlying libvirt hypervisor-driver through the VMI spec (a sketch of the corresponding Go API change follows the list below):

```yaml
spec:
  hypervisor: cloud-hypervisor
```

> **Member:** VMI seems to be quite fine-granular. If so, shouldn't it be a cluster-level setting?

> **Author:** I am considering a scenario in which different cluster nodes could have a different virtualization stack. In KubeVirt, virt-handlers on different cluster nodes are independent, so IMO there is no reason not to set the hypervisor at this fine granularity.

> **Member:** What is the reason for having different hypervisors in a single cluster? There is also a cluster-level impact, i.e. the overhead calculation.

> **Author:** No specific reason. Based on my understanding of the KubeVirt code, the overhead calculation is for the virt-launcher pod alone, so one could (in theory) have different virt-launcher pods with different hypervisors running on the same cluster. However, I could be wrong, so please correct me. I don't have a specific scenario in mind as of now that would require multiple hypervisors on the same cluster, but IMO it is a more flexible design choice to limit the hypervisor-specific logic to the components that are tied to specific nodes (i.e., virt-handler and virt-launcher).

> **Member:** IMO, it makes sense to allow multiple hypervisors in the same cluster. Different hypervisors could fit different use cases, and we could have unified management.

> **Member:** I think something similar to the Kubernetes runtime classes is a good fit here. If we need additional configuration specific to the hypervisor, it could go into a new CRD.

- Introduction of a new field `hypervisor` in `VirtualMachineInstanceSpec`
- If no `hypervisor` is provided, it defaults to `qemu-kvm`.
- The set of acceptable values for the `hypervisor` field would be the set of all hypervisor-drivers that libvirt supports.
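
A minimal sketch of how this could look in the Go API types; the type name, constant set, and field placement are illustrative assumptions, not part of the agreed design:

```go
// Sketch only: a possible shape for the new API field in KubeVirt's
// core v1 types. Names here are illustrative, not final.
package v1

// Hypervisor identifies the libvirt hypervisor driver backing a VMI.
// The set of valid values would mirror libvirt's driver list.
type Hypervisor string

const (
	HypervisorQemuKvm         Hypervisor = "qemu-kvm"
	HypervisorCloudHypervisor Hypervisor = "cloud-hypervisor"
	HypervisorXen             Hypervisor = "xen"
)

type VirtualMachineInstanceSpec struct {
	// ... existing fields elided ...

	// Hypervisor selects the libvirt hypervisor driver used to run this
	// VMI. Optional; defaults to "qemu-kvm" when empty.
	// +optional
	Hypervisor Hypervisor `json:"hypervisor,omitempty"`
}
```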

### Supported feature check

Maintain a list of which VMI features are supported by each hypervisor-driver. If the KubeVirt API user requests a feature for a VMI that is not supported by the requested hypervisor-driver, the request to create the VMI should be rejected.
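
A hedged sketch of such a check, assuming a hand-maintained feature matrix consulted at admission time; the feature names, matrix entries, and function are all illustrative (the ISO limitation for `cloud-hypervisor` is taken from the Virt-Launcher section below):

```go
// Sketch of a per-driver feature matrix consulted when validating a
// VMI; names and entries are hypothetical.
package admission

import "fmt"

type Feature string

const (
	FeatureISODisks Feature = "iso-disks"
	FeatureSRIOV    Feature = "sriov" // illustrative entry only
)

// supportedFeatures maps each hypervisor driver to the VMI features it
// supports; qemu-kvm is treated as supporting everything (current behavior).
var supportedFeatures = map[string]map[Feature]bool{
	"cloud-hypervisor": {
		FeatureISODisks: false, // cloud-hypervisor cannot attach ISOs
		FeatureSRIOV:    true,  // illustrative value
	},
}

// validateFeatures rejects a VMI that requests a feature the chosen
// driver does not support.
func validateFeatures(hypervisor string, requested []Feature) error {
	if hypervisor == "qemu-kvm" {
		return nil
	}
	matrix := supportedFeatures[hypervisor]
	for _, f := range requested {
		if !matrix[f] {
			return fmt.Errorf("hypervisor %q does not support feature %q", hypervisor, f)
		}
	}
	return nil
}
```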


## Generalization of KubeVirt components

### VMI spec to virt-launcher pod spec by Virt-Controller

Conversion of the VMI spec to the `virt-launcher` pod spec needs to take into account the `vmi.hypervisor` field. The value of this field would affect the following (a code sketch follows the discussion below):

- `virt-launcher` pod image should be specific to the `vmi.hypervisor`.

- Hypervisor resource needed by the `virt-launcher` pod. For instance, for a VMI with `hypervisor=qemu-kvm`, the corresponding virt-launcher pod requires the resource `devices.kubevirt.io/kvm: 1`.
> **Member:** Please also take emulation into account.

> **Author:** Can you please expand on your comment?


- The resource requirements of the `virt-launcher` pod should be adjusted (w.r.t. the resource spec of the VMI) to take into account the resources consumed by the requested VMM daemon running in the `virt-launcher` pod. Currently, the field `VirtqemudOverhead` holds the memory overhead of the `virtqemud` process.
> **Member:** Could this go in the new Hypervisor Runtime CRD?

> **Author:** That is a good idea.
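
A sketch of the hypervisor-specific knobs virt-controller could consult when rendering the pod spec; all image names, resource names, and overhead quantities below are placeholders, not decided values:

```go
// Sketch of per-hypervisor choices made while rendering the
// virt-launcher pod spec. Every concrete value is a placeholder.
package controller

import (
	k8sv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

type launcherConfig struct {
	image          string             // hypervisor-specific virt-launcher image
	deviceResource k8sv1.ResourceName // node device resource exposed by virt-handler
	daemonOverhead resource.Quantity  // memory overhead of the libvirt daemon + VMM
}

var launcherConfigs = map[string]launcherConfig{
	"qemu-kvm": {
		image:          "quay.io/kubevirt/virt-launcher:latest", // placeholder tag
		deviceResource: "devices.kubevirt.io/kvm",
		daemonOverhead: resource.MustParse("224Mi"), // illustrative figure
	},
	"cloud-hypervisor": {
		image:          "quay.io/kubevirt/virt-launcher-ch:latest", // hypothetical image
		deviceResource: "devices.kubevirt.io/mshv",                 // or kvm, per node stack
		daemonOverhead: resource.MustParse("96Mi"), // illustrative figure
	},
}
```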


### Virt-Handler

- Label each node based on which hypervisor devices (e.g., `/dev/kvm` or `/dev/mshv`) are available, e.g., `devices.kubevirt.io/kvm: 1000`.
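
A minimal sketch of the detection step, assuming the existing device-plugin mechanism is generalized; the device-to-resource table and helper are illustrative:

```go
// Sketch of how virt-handler could discover which hypervisor devices a
// node offers before advertising the matching resources.
package nodelabeller

import "os"

// hypervisorDevices maps a host device node to the resource name to
// advertise on the node when that device exists.
var hypervisorDevices = map[string]string{
	"/dev/kvm":  "devices.kubevirt.io/kvm",
	"/dev/mshv": "devices.kubevirt.io/mshv",
}

// detectHypervisorResources returns the resource names to expose for
// this node, based on which device nodes are present.
func detectHypervisorResources() []string {
	var found []string
	for dev, res := range hypervisorDevices {
		if _, err := os.Stat(dev); err == nil {
			found = append(found, res)
		}
	}
	return found
}
```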

### Virt-Launcher

- Launch the libvirt modular daemon (e.g., `virtqemud`, `virtxend`) if one is available for the `vmi.hypervisor` choice; otherwise launch the monolithic `libvirtd` daemon.

- The libvirt connection URI is hypervisor-driver specific (see the sketch after this list).

- Creation of auxiliary resources for the libvirt domain, such as the cloud-init disk, needs to be done in a hypervisor-specific way. For instance, the `cloud-hypervisor` VMM does not support ISOs, in which case cloud-init needs to be provided as a `raw` disk.

- Conversion of the VMI spec to libvirt domain XML needs to be hypervisor-driver specific, since each driver accepts a different subset of domain XML elements and features.
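
A sketch of per-driver daemon and URI selection in virt-launcher; `virtqemud` and `virtxend` are real libvirt modular daemons, while the cloud-hypervisor entries follow the libvirt driver page but should be treated as assumptions:

```go
// Sketch: choose the libvirt daemon to launch and the connection URI
// based on vmi.hypervisor. The table is illustrative.
package launcher

import "fmt"

type driverRuntime struct {
	modularDaemon string // modular daemon to exec; empty means fall back to libvirtd
	connectionURI string // URI passed to libvirt when connecting
}

var driverRuntimes = map[string]driverRuntime{
	"qemu-kvm":         {modularDaemon: "virtqemud", connectionURI: "qemu:///system"},
	"xen":              {modularDaemon: "virtxend", connectionURI: "xen:///system"},
	"cloud-hypervisor": {modularDaemon: "virtchd", connectionURI: "ch:///system"},
}

// runtimeFor resolves the runtime configuration for a hypervisor choice.
func runtimeFor(hypervisor string) (driverRuntime, error) {
	rt, ok := driverRuntimes[hypervisor]
	if !ok {
		return driverRuntime{}, fmt.Errorf("unknown hypervisor %q", hypervisor)
	}
	return rt, nil
}
```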

### VMI flow

The flow of a VMI lifecycle would remain the same as before, with the addition of hypervisor-specific logic at virt-controller, virt-handler and virt-launcher.

![image info](./kubvirt-vmi-flow.drawio.png)

## Functional Testing Approach

For each supported value of `vmi.hypervisor`, do the following (a test sketch follows this list):

- Set up KubeVirt on a cluster with the given hypervisor.

- Ensure that each node is labeled with the correct hypervisor, such that each node has the resource `devices.kubevirt.io/<hypervisor>: 1000`.

- Create VMIs with all features supported by the given hypervisor.
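
A sketch of how this could be driven as a table-driven test; the helpers below are stubs standing in for the existing KubeVirt e2e utilities, and the device mapping is a placeholder:

```go
// Sketch of a table-driven functional test over supported hypervisors.
package tests

import "testing"

// deviceFor returns the device-plugin suffix for a hypervisor
// (placeholder mapping; see the Virt-Handler section).
func deviceFor(hv string) string {
	if hv == "cloud-hypervisor" {
		return "mshv"
	}
	return "kvm"
}

// nodesAdvertise and createAndWaitRunning are stubs for cluster helpers.
func nodesAdvertise(resource string) bool { return true /* stub */ }

func createAndWaitRunning(hv string) error { return nil /* stub */ }

func TestHypervisorDrivers(t *testing.T) {
	for _, hv := range []string{"qemu-kvm", "cloud-hypervisor"} {
		t.Run(hv, func(t *testing.T) {
			// 1. Nodes must advertise the matching device resource.
			if !nodesAdvertise("devices.kubevirt.io/" + deviceFor(hv)) {
				t.Fatalf("no node advertises the %s device resource", hv)
			}
			// 2. A VMI restricted to supported features must reach Running.
			if err := createAndWaitRunning(hv); err != nil {
				t.Fatalf("VMI with hypervisor %s failed to run: %v", hv, err)
			}
		})
	}
}
```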


# Implementation Phases

- Extension of KubeVirt API to include `vmi.hypervisor` field.

- Creation of virt-launcher images for the supported libvirt hypervisor-drivers.
- Test these images independently by spinning up pods and launching libvirt domains, similar to what would be done in a typical KubeVirt VMI creation flow.

- Refactoring of virt-controller code to create the `virt-launcher` pod spec based on the `vmi.hypervisor` field.

- Refactoring virt-handler code to remove hardcoded references to `qemu/kvm` and generalize the code to handle multiple hypervisor-drivers.

- Refactoring of virt-launcher code to convert a very simple VMI spec to a libvirt domain for all supported hypervisor-drivers.

- Progressively test KubeVirt VMI features against different libvirt hypervisor-drivers and resolve limitations in virt-launcher code-base.


# References

1. [Cloud Hypervisor integration - Google Groups](https://groups.google.com/g/kubevirt-dev/c/Pt9CDYJOR2A)
2. [[RFC] Cloud Hypervisor integration POC](https://github.com/kubevirt/kubevirt/pull/8056)
3. [design-proposals: Cloud Hypervisor integration](https://github.com/kubevirt/community/pull/184)