# Generalize virtualization stack to different libvirt hypervisor-drivers (#259)
# Overview

This KubeVirt design proposal discusses how KubeVirt can be used to create `libvirt` virtual machines that are backed by diverse hypervisor drivers, such as QEMU/KVM, Xen, VirtualBox, etc. The aim of this proposal is to enumerate the design and implementation choices for enabling this multi-hypervisor support in KubeVirt.

## Motivation

Although KubeVirt currently relies on libvirt to create and manage virtual machine instances (VMIs), it relies specifically on the QEMU/KVM virtualization stack (VMM and hypervisor) to host the VMI. This prevents KubeVirt from being used in settings where a different VMM or hypervisor is used.

In fact, libvirt itself is flexible enough to support a diverse set of VMMs and hypervisors. The libvirt API delegates its implementation to one or more internal drivers, depending on the [connection URI](https://libvirt.org/uri.html) passed when initializing the library (a short connection sketch follows the list). The hypervisor drivers currently supported by libvirt are:

- [LXC - Linux Containers](https://libvirt.org/drvlxc.html)
- [OpenVZ](https://libvirt.org/drvopenvz.html)
- [QEMU/KVM/HVF](https://libvirt.org/drvqemu.html)
- [VirtualBox](https://libvirt.org/drvvbox.html)
- [VMware ESX](https://libvirt.org/drvesx.html)
- [VMware Workstation/Player](https://libvirt.org/drvvmware.html)
- [Xen](https://libvirt.org/drvxen.html)
- [Microsoft Hyper-V](https://libvirt.org/drvhyperv.html)
- [Virtuozzo](https://libvirt.org/drvvirtuozzo.html)
- [Bhyve - The BSD Hypervisor](https://libvirt.org/drvbhyve.html)
- [Cloud Hypervisor](https://libvirt.org/drvch.html)
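
Since the driver is selected purely by the connection URI, switching hypervisors is largely a matter of opening the right connection. Below is a minimal Go sketch of driver selection via the official libvirt Go bindings; the mapping itself and the `ch:///system` URI for Cloud Hypervisor are illustrative assumptions, not a complete list:

```go
package hypervisor

import (
	"fmt"

	libvirt "libvirt.org/go/libvirt" // official libvirt Go bindings
)

// connectionURIs maps a KubeVirt-level hypervisor name to a libvirt
// connection URI. The qemu and xen URIs are standard; the cloud-hypervisor
// URI is an assumption for illustration (see https://libvirt.org/drvch.html).
var connectionURIs = map[string]string{
	"qemu-kvm":         "qemu:///system",
	"xen":              "xen:///system",
	"cloud-hypervisor": "ch:///system",
}

// ConnectFor opens a libvirt connection backed by the driver that matches
// the requested hypervisor. The URI alone selects the driver.
func ConnectFor(hypervisor string) (*libvirt.Connect, error) {
	uri, ok := connectionURIs[hypervisor]
	if !ok {
		return nil, fmt.Errorf("unsupported hypervisor %q", hypervisor)
	}
	return libvirt.NewConnect(uri)
}
```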

There are several parts of the KubeVirt source code that hard-code the use of the QEMU/KVM hypervisor driver, which prevents the creation of VMIs using another hypervisor driver. KubeVirt therefore needs to be updated to introduce a choice of the backend libvirt hypervisor driver used to create a given VMI. This would expand the set of scenarios in which KubeVirt can be used.
## Goals

KubeVirt should be able to offer its users a choice of which libvirt hypervisor-driver they want to use to create their VMI.

> **Review discussion:**
> - Why? This section is important: please provide a justification of how this will help KubeVirt users, or KubeVirt itself.
> - Microsoft is a vendor of KubeVirt, as it leverages KubeVirt in its Azure Operator Nexus product as a VM orchestrator. The hypervisor currently used in the Nexus product is qemu-kvm; however, in the future MSFT is looking at alternative hypervisors such as cloud-hypervisor.
> - There is also another project, called Virtink, which was created to add K8s-based orchestration support for cloud-hypervisor based VMs.
## Non Goals

Supporting all features available in KubeVirt in all libvirt hypervisor drivers. As libvirt makes progress toward feature parity among its hypervisor drivers, KubeVirt will also enable more features in the evolving hypervisor drivers.

## Definition of Users

This proposal is aimed at serving users who intend to use KubeVirt on a cluster where at least one node has a virtualization stack different from the default QEMU-KVM stack.
## User Stories

- A user trying to use KubeVirt on a cluster of machines that have a hypervisor-VMM pair that is not necessarily QEMU/KVM.
- A regular user of libvirt with any of its supported hypervisor drivers who now wants to leverage the orchestration capability provided by KubeVirt, but does not want to abandon their hypervisor driver of choice.
- A regular user of a hypervisor-VMM-specific orchestration tool who wants to expand their use to a cluster with a diverse set of hypervisor-VMM pairs.

## Repos

- KubeVirt
- Libvirt
# Design

## API Changes

Addition of a `vmi.hypervisor` field. Example of how a user could request a specific hypervisor as the underlying libvirt hypervisor-driver through the VMI spec:
```yaml
spec:
  hypervisor: cloud-hypervisor
```

> **Review discussion:**
> - The VMI seems quite fine-granular; shouldn't this be a cluster-level setting?
> - I am considering a scenario in which different cluster nodes could have a different virtualization stack.
> - What is the reason for having different hypervisors in a single cluster? There is also a cluster-level impact, i.e., the overhead calculation.
> - No specific reason. Based on my understanding of the KubeVirt code, the overhead calculation is for the virt-launcher pod alone, so one could (in theory) have different virt-launcher pods with different hypervisors running on the same cluster. However, I could be wrong, so please correct me. I don't have a specific scenario in mind as of now that would require multiple hypervisors on the same cluster, but IMO it is a more flexible design choice to limit the hypervisor-specific logic to the components that are tied to specific nodes (i.e., virt-handler and virt-launcher).
> - IMO, it makes sense to allow multiple hypervisors in the same cluster. Different hypervisors could fit different use cases, and we could have unified management.
> - I think this is a good fit for something similar to the Kubernetes runtime classes. If we need additional configuration specific to the hypervisor, it could go into a new CRD.
- Introduction of a new field `hypervisor` in `VirtualMachineInstanceSpec` (sketched below).
- If no `hypervisor` is provided, it defaults to `qemu-kvm`.
- The set of acceptable values for the `hypervisor` field is the set of all hypervisor-drivers that libvirt supports.
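
To make the shape concrete, here is a minimal sketch of how the field could be declared in the Go API types; the type name `HypervisorName` and the constants are illustrative assumptions, not final API:

```go
package v1

// HypervisorName identifies the libvirt hypervisor-driver backing a VMI.
// The type name and constants below are illustrative, not final API.
type HypervisorName string

const (
	HypervisorQemuKVM         HypervisorName = "qemu-kvm" // current default
	HypervisorXen             HypervisorName = "xen"
	HypervisorCloudHypervisor HypervisorName = "cloud-hypervisor"
)

type VirtualMachineInstanceSpec struct {
	// ... existing fields elided ...

	// Hypervisor selects the libvirt hypervisor-driver used to run this
	// VMI. When empty, it defaults to "qemu-kvm".
	// +optional
	Hypervisor HypervisorName `json:"hypervisor,omitempty"`
}
```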
### Supported feature check

Maintain a list of which VMI features are supported by each hypervisor-driver. If a KubeVirt API user requests a feature for a VMI that is not supported by the requested hypervisor-driver, then the VMI creation request should be rejected. A sketch of such a check follows.
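One plausible realization is a static capability matrix consulted during admission of the VMI. The sketch below assumes features can be keyed by simple names; the matrix entries are placeholders, not statements about what each driver actually supports:

```go
package admission

import "fmt"

// featureSupport records which VMI features each hypervisor-driver offers.
// All entries are illustrative placeholders.
var featureSupport = map[string]map[string]bool{
	"qemu-kvm":         {"cdromDisk": true, "cpuHotplug": true},
	"cloud-hypervisor": {"cdromDisk": false, "cpuHotplug": true},
}

// ValidateFeatures rejects a VMI that requests a feature its chosen
// hypervisor-driver does not support.
func ValidateFeatures(hypervisor string, requested []string) error {
	supported, ok := featureSupport[hypervisor]
	if !ok {
		return fmt.Errorf("unknown hypervisor %q", hypervisor)
	}
	for _, feature := range requested {
		if !supported[feature] {
			return fmt.Errorf("feature %q is not supported by hypervisor %q",
				feature, hypervisor)
		}
	}
	return nil
}
```
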
## Generalization of KubeVirt components

### VMI spec to virt-launcher pod spec by Virt-Controller

Conversion of the VMI spec to the `virt-launcher` pod spec needs to take the `vmi.hypervisor` field into account. The value of this field affects the following (see the sketch after this list):

- The `virt-launcher` pod image, which should be specific to the `vmi.hypervisor`.

- The hypervisor resource needed by the `virt-launcher` pod. For instance, for a VMI with `hypervisor=qemu-kvm`, the corresponding virt-launcher pod requires the resource `devices.kubevirt.io/kvm: 1`.

- The resource requirements of the `virt-launcher` pod, which should be adjusted (w.r.t. the resource spec of the VMI) to account for the resources consumed by the requested VMM daemon running in the `virt-launcher` pod. Currently, the field `VirtqemudOverhead` holds the memory overhead of the `virtqemud` process.

> **Review discussion:**
> - Please take emulation into account as well. (Follow-up: can you please expand on this comment?)
> - Could the daemon overhead go into the new Hypervisor Runtime CRD? — That is a good idea.
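
As a sketch of this conversion logic, virt-controller could consult a per-hypervisor runtime table when rendering the pod spec. The image names, the `devices.kubevirt.io/mshv` resource, and all overhead values below are illustrative assumptions:

```go
package services

import (
	k8sv1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// hypervisorRuntime bundles the per-hypervisor knobs that the rendered
// virt-launcher pod spec depends on. Concrete values are illustrative.
type hypervisorRuntime struct {
	LauncherImage  string             // image shipping the matching VMM stack
	DeviceResource k8sv1.ResourceName // device-plugin resource gating scheduling
	DaemonOverhead resource.Quantity  // memory consumed by the libvirt/VMM daemons
}

var hypervisorRuntimes = map[string]hypervisorRuntime{
	"qemu-kvm": {
		LauncherImage:  "quay.io/kubevirt/virt-launcher:latest",
		DeviceResource: "devices.kubevirt.io/kvm",
		DaemonOverhead: resource.MustParse("50Mi"), // cf. VirtqemudOverhead
	},
	"cloud-hypervisor": {
		LauncherImage:  "quay.io/kubevirt/virt-launcher-ch:latest", // hypothetical image
		DeviceResource: "devices.kubevirt.io/mshv",                 // hypothetical resource
		DaemonOverhead: resource.MustParse("30Mi"),                 // placeholder value
	},
}
```

A per-hypervisor Runtime CRD, as suggested in the review discussion above, could replace such a hard-coded table.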

### Virt-Handler

- Label each node based on which hypervisor devices (e.g., `/dev/kvm` or `/dev/mshv`) are available, e.g., `devices.kubevirt.io/kvm: 1000` (a minimal detection sketch follows).
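
The sketch below assumes the presence of a device node is a sufficient signal; the `/dev/mshv` entry and its resource name are assumptions:

```go
package nodelabeller

import "os"

// hypervisorDevices maps host device nodes to the device-plugin resource
// that should be advertised for them. The /dev/mshv entry is an assumption.
var hypervisorDevices = map[string]string{
	"/dev/kvm":  "devices.kubevirt.io/kvm",
	"/dev/mshv": "devices.kubevirt.io/mshv",
}

// AvailableHypervisorResources reports which hypervisor device resources
// this node can expose, based on the device nodes present on the host.
func AvailableHypervisorResources() []string {
	var resources []string
	for device, res := range hypervisorDevices {
		if _, err := os.Stat(device); err == nil {
			resources = append(resources, res)
		}
	}
	return resources
}
```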

### Virt-Launcher

- Launch the libvirt modular daemon (e.g., `virtqemud`, `virtxend`) if one is available for the `vmi.hypervisor` choice; otherwise launch the monolithic `libvirtd` daemon (see the sketch after this list).

- The libvirt connection URI is hypervisor-driver specific.

- Auxiliary resources for the libvirt domain, such as the cloud-init disk, need to be created in a hypervisor-specific way. For instance, the `cloud-hypervisor` VMM does not support ISOs, in which case cloud-init needs to be provided as a `raw` disk.

- Conversion of the VMI spec to libvirt domain XML needs to be hypervisor-driver specific, for the same kind of reason.
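
A sketch of daemon selection inside virt-launcher; `virtqemud` and `virtxend` are real libvirt modular daemons, while the mapping itself and the `virtchd` entry are assumptions for illustration:

```go
package virtlauncher

import "os/exec"

// modularDaemons maps a hypervisor choice to its libvirt modular daemon.
// The cloud-hypervisor entry is an assumption for illustration.
var modularDaemons = map[string]string{
	"qemu-kvm":         "virtqemud",
	"xen":              "virtxend",
	"cloud-hypervisor": "virtchd",
}

// StartLibvirt launches the modular daemon for the requested hypervisor if
// it is installed in the launcher image, falling back to monolithic libvirtd.
func StartLibvirt(hypervisor string) (*exec.Cmd, error) {
	daemon := "libvirtd" // monolithic fallback
	if d, ok := modularDaemons[hypervisor]; ok {
		if _, err := exec.LookPath(d); err == nil {
			daemon = d
		}
	}
	cmd := exec.Command(daemon)
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}
```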

### VMI flow

The flow of a VMI lifecycle would remain the same as before, with the addition of hypervisor-specific logic at virt-controller, virt-handler, and virt-launcher.

![image info](./kubvirt-vmi-flow.drawio.png)
## Functional Testing Approach

For each supported value of `vmi.hypervisor`, do the following:

- Set up KubeVirt on a cluster with the given hypervisor.

- Ensure that each node is labeled with the correct hypervisor, such that each node has the resource `devices.kubevirt.io/<hypervisor>: 1000`.

- Create VMIs with all features supported by the given hypervisor.
# Implementation Phases

- Extension of the KubeVirt API to include the `vmi.hypervisor` field.

- Creation of virt-launcher images for the supported libvirt hypervisor-drivers.
  - Test these images independently by spinning up pods and launching libvirt domains, similar to what happens in a typical KubeVirt VMI creation flow.

- Refactoring of virt-controller code to create the `virt-launcher` pod spec according to the `vmi.hypervisor` field.

- Refactoring of virt-handler code to remove hardcoded references to `qemu/kvm` and generalize the code to handle multiple hypervisor-drivers.

- Refactoring of virt-launcher code to convert a very simple VMI spec to a libvirt domain for all supported hypervisor-drivers.

- Progressively test KubeVirt VMI features against different libvirt hypervisor-drivers and resolve limitations in the virt-launcher code-base.
# References

1. [Cloud Hypervisor integration - Google Groups](https://groups.google.com/g/kubevirt-dev/c/Pt9CDYJOR2A)
2. [[RFC] Cloud Hypervisor integration POC](https://github.com/kubevirt/kubevirt/pull/8056)
3. [design-proposals: Cloud Hypervisor integration](https://github.com/kubevirt/community/pull/184)

> **Review discussion:**
> - It is correct that libvirt supports many hypervisors. However, the feature set supported across all hypervisors (i.e., the common subset of features) is actually much smaller. This is why KubeVirt intentionally focused on KVM only: to avoid having to consider the special cases of different hypervisors.
> - I am in touch with the cloud-hypervisor community, and they are actively working on achieving parity with qemu-kvm in terms of the VMI features offered by KubeVirt.