Kueue: Streamlining Kubernetes Job Queueing and Resource Management
Kueue is a powerful tool designed to enhance job queueing and resource management within Kubernetes clusters. By offering a set of APIs and a controller, Kueue empowers you to efficiently manage job execution, resource allocation, and overall cluster performance.
Kueue is a cloud-native job queueing system for batch, HPC, AI/ML, and similar applications in a Kubernetes cluster.
Use Kueue to build a multi-tenant batch service with quotas and a hierarchy for sharing resources among teams in your organization. Based on the available quotas, Kueue decides when jobs should wait and when and where they should run.
Kueue works in combination with standard kube-scheduler, cluster-autoscaler, and the rest of the kubernetes ecosystem. This combination allows Kueue to run both on-prem and in the cloud, where resources can be heterogeneous, fungible, and dynamically provisioned.
Key Features for Optimized Kubernetes Workloads
- Job Management: Kueue enables prioritized job queueing with flexible strategies like
StrictFIFO
andBestEffortFIFO
. - Resource Management: Achieve fair resource sharing and implement preemption policies between different tenants.
- Dynamic Resource Reclaim: Automatically release quota as job pods complete, maximizing resource utilization.
- Resource Flavor Fungibility: Enable quota borrowing or preemption in ClusterQueues and Cohorts for enhanced flexibility.
- Wide Range of Integrations: Seamlessly integrate with popular job types, including BatchJob, Kubeflow training jobs, RayJob, RayCluster, JobSet, and plain Pods.
- Enhanced System Insights: Gain valuable insights into your system's state through built-in Prometheus metrics and Conditions.
Production Readiness and Robust Support
Kueue boasts a mature v1beta1 API version, comprehensive documentation, extensive test coverage, and scalability verification. With regular releases, robust security, and a growing list of adopters in production environments, Kueue is a reliable solution for managing your Kubernetes workloads.
Installation and Usage Made Easy
Kueue is compatible with Kubernetes 1.22 and above. Install the latest release effortlessly with a single command:
# heck version if needed!
kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.7.0/manifests.yaml
Explore the Kueue documentation for detailed installation guides and usage examples.
Roadmap
Kueue's development is actively driven by community feedback and contributions. In 2023, the project's roadmap prioritizes:
- Cooperative preemption for workloads with checkpointing.
- Flavor assignment strategies for optimizing resource allocation.
- Integration with the cluster-autoscaler for guaranteed resource provisioning.
- Broader integration with various custom workloads like Kubeflow, Spark, Ray, and workflow tools.
Long-term goals include budget support, a user-friendly dashboard, and multi-cluster support.
Community and Support
Join the vibrant Kueue community to stay informed, participate in discussions, and contribute to the project's ongoing development. Connect with maintainers through Slack and the mailing list for support and collaboration.
Get Started with Kueue
Kueue offers a simplified and efficient approach to job queueing and resource management in Kubernetes. By leveraging its powerful features, you can optimize your cluster's performance, ensure fair resource allocation, and streamline your workload management processes. Install Kueue today and experience the benefits firsthand.
- Register with Email
- Login with LinkedIn
- Login with GitHub