时间:2021-07-15 | 标签: | 作者:Q8 | 来源:小小小小杜网络
小提示:您能找到这篇{腾讯云带你一篇读懂Kubernetes Scheduler扩展功能(一}绝对不是偶然,我们能帮您找到潜在客户,解决您的困扰。如果您对本页介绍的腾讯云带你一篇读懂Kubernetes Scheduler扩展功能(一内容感兴趣,有相关需求意向欢迎拨打我们的服务热线,或留言咨询,我们将第一时间联系您! |
< font-size: 16px;">< ">前言 < font-size: 16px;">Scheduler是Kubernetes组件中功能&逻辑相对单一&简单的模块,它主要的作用是:watch kube-apiserver,监听PodSpec.NodeName为空的pod,并利用预选和优选算法为该pod选择一个最佳的调度节点,最终将pod与该节点进行绑定,使pod调度在该节点上运行。 < font-size: 16px;">展开上述调用流程中的scheduler部分,内部细节调用(参考Kubernetes Scheduler)如图所示: < font-size: 16px;">scheduler内部预置了很多预选和优选算法(参考scheduler_algorithm),比如预选:NoDiskConflict,PodFitsResources,MatchNodeSelector,CheckNodeMemoryPressure等;优选:LeastRequestedPriority,BalancedResourceAllocation,CalculateAntiAffinityPriority,NodeAffinityPriority等。但是在实际生产环境中我们常常会需要一些特殊的调度策略,比如批量调度(aka coscheduling or gang scheduling),这是kubernetes默认调度策略所无法满足的,这个时候就需要我们对scheduler进行扩展来实现这个功能了。 scheduler扩展方案 < font-size: 16px;">目前Kubernetes支持四种方式实现客户自定义的调度算法(预选&优选),如下: < font-size: 16px;">default-scheduler recoding:直接在Kubernetes默认scheduler基础上进行添加,然后重新编译kube-scheduler < font-size: 16px;">standalone:实现一个与kube-scheduler平行的custom scheduler,单独或者和默认kube-scheduler一起运行在集群中 < font-size: 16px;">scheduler extender:实现一个"scheduler extender",kube-scheduler会调用它(http/https)作为默认调度算法(预选&优选&bind)的补充 < font-size: 16px;">scheduler framework:实现scheduler framework plugins,重新编译kube-scheduler,类似于第一种方案,但是更加标准化,插件化 < font-size: 16px;">下面分别展开介绍这几种方式的原理和开发指引。 default-scheduler recoding < font-size: 16px;">这里我们先分析一下kube-scheduler调度相关入口: < font-size: 16px;">设置默认预选&优选策略 < font-size: 16px;">见defaultPredicates以及defaultPriorities(k8s.io/kubernetes/pkg/scheduler/algorithmprovider/defaults/defaults.go): func init() { registerAlgorithmProvider(defaultPredicates(), defaultPriorities()) } func defaultPredicates() sets.String { return sets.NewString( predicates.NoVolumeZoneConflictPred, predicates.MaxEBSVolumeCountPred, predicates.MaxGCEPDVolumeCountPred, predicates.MaxAzureDiskVolumeCountPred, predicates.MaxCSIVolumeCountPred, predicates.MatchInterPodAffinityPred, predicates.NoDiskConflictPred, predicates.GeneralPred, predicates.PodToleratesNodeTaintsPred, predicates.CheckVolumeBindingPred, predicates.CheckNodeUnschedulablePred, ) } func defaultPriorities() sets.String { return sets.NewString( priorities.SelectorSpreadPriority, priorities.InterPodAffinityPriority, priorities.LeastRequestedPriority, priorities.BalancedResourceAllocation, priorities.NodePreferAvoidPodsPriority, priorities.NodeAffinityPriority, priorities.TaintTolerationPriority, priorities.ImageLocalityPriority, ) } func registerAlgorithmProvider(predSet, priSet sets.String) { // Registers algorithm providers. By default we use 'DefaultProvider', but user can specify one to be used // by specifying flag. scheduler.RegisterAlgorithmProvider(scheduler.DefaultProvider, predSet, priSet) // Cluster autoscaler friendly scheduling algorithm. scheduler.RegisterAlgorithmProvider(ClusterAutoscalerProvider, predSet, copyAndReplace(priSet, priorities.LeastRequestedPriority, priorities.MostRequestedPriority)) } const ( // DefaultProvider defines the default algorithm provider name. DefaultProvider = "DefaultProvider" ) < font-size: 16px;">注册预选和优选相关处理函数 < font-size: 16px;">注册预选函数(k8s.io/kubernetes/pkg/scheduler/algorithmprovider/defaults/register_predicates.go): func init() { ... // Fit is determined by resource availability. // This predicate is actually a default predicate, because it is invoked from // predicates.GeneralPredicates() scheduler.RegisterFitPredicate(predicates.PodFitsResourcesPred, predicates.PodFitsResources) } < font-size: 16px;">注册优选函数(k8s.io/kubernetes/pkg/scheduler/algorithmprovider/defaults/register_priorities公关有哪些危机.go): func init() { ... // Prioritizes nodes that have labels matching NodeAffinity scheduler.RegisterPriorityMapReduceFunction(priorities.NodeAffinityPriority, priorities.CalculateNodeAffinityPriorityMap, priorities.CalculateNodeAffinityPriorityReduce, 1) } < font-size: 16px;">编写预选和优选处理函数 < font-size: 16px;">PodFitsResourcesPred对应的预选函数如下(k8s.io/kubernetes/pkg/scheduler/algorithm/predicates/predicates.go): // PodFitsResources checks if a node has sufficient resources, such as cpu, memory, gpu, opaque int resources etc to run a pod. // First return value indicates whether a node has sufficient resources to run a pod while the second return value indicates the // predicate failure reasons if the node has insufficient resources to run the pod. func PodFitsResources(pod *v1.Pod, meta Metadata, nodeInfo *schedulernodeinfo.NodeInfo) (bool, []PredicateFailureReason, error) { node := nodeInfo.Node() if node == nil { return false, nil, 央视广告投放流程fmt.Errorf("node not found") } var predicateFails []PredicateFailureReason allowedPodNumber := nodeInfo.AllowedPodNumber() if len(nodeInfo.Pods())+1 > allowedPodNumber { predicateFails = append(predicateFails, NewInsufficientResourceError(v1.ResourcePods, 1, int64(len(nodeInfo.Pods())), int64(allowedPodNumber))) } // No extended resources should be ignored by default. ignoredExtendedResources := sets.NewString() var podRequest *schedulernodeinfo.Resource if predicateMeta, ok := meta.(*predicateMetadata); ok && predicateMeta.podFitsResourcesMetadata != nil { podRequest = predicateMeta.podFitsResourcesMetadata.podRequest if predicateMeta.podFitsResourcesMetadata.ignoredExtendedResources != nil { ignoredExtendedResources = predicateMeta.podFitsResourcesMetadata.ignoredExtendedResources } } else { // We couldn't parse metadata - fallback to computing it. podRequest = GetResourceRequest(pod) } if podRequest.MilliCPU == 0 && podRequest.Memory == 0 && podRequest.EphemeralStorage == 0 && len(podRequest.ScalarResources) == 0 { return len(predicateFails) == 0, predicateFails, nil } allocatable := nodeInfo.AllocatableResource() if allocatable.MilliCPU < podRequest.MilliCPU+nodeInfo.RequestedResource().MilliCPU { predicateFails = append(predicateFails, NewInsufficientResourceError(v1.ResourceCPU, podRequest.MilliCPU, nodeInfo.RequestedResource().MilliCPU, allocatable.MilliCPU)) } if allocatable.Memory < podRequest.Memory+nodeInfo.RequestedResource().Memory { predicateFails = append(predicateFails, NewInsufficientResourceError(v1.ResourceMemory, podRequest.Memory, nodeInfo.RequestedResource().Memory, allocatable.Memory)) } if allocatable.EphemeralStorage < podRequest.EphemeralStorage+nodeInfo.RequestedResource().EphemeralStorage { predicateFails = append(predicateFails, NewInsufficientResourceError(v1.ResourceEphemeralStorage, podRequest.EphemeralStorage, nodeInfo.RequestedResource().EphemeralStorage, allocatable.EphemeralStorage)) } for rName, rQuant := range podRequest.ScalarResources { if v1helper.IsExtendedResourceName(rName) { // If this resource is one of the extended resources that should be // ignored, we will skip checking it. if ignoredExtendedResources.Has(string(rName)) { continue } } if allocatable.ScalarResources[rName] < rQuant+nodeInfo.RequestedResource().ScalarResources[rName] { predicateFails = append(predicateFails, NewInsufficientResourceError(rName, podRequest.ScalarResources[rName], nodeInfo.RequestedResource().ScalarResources[rName], allocatable.ScalarResources[rName])) } } if klog.V(10) { if len(predicateFails) == 0 { // We explicitly don't do klog.V(10).Infof() to avoid computing all the parameters if this is // not logged. There is visible performance gain from it. klog.Infof("Schedule Pod %+v on Node %+v is allowed, Node is running only %v out of %v Pods.", podName(pod), node.Name, len(nodeInfo.Pods()), allowedPodNumber) } } return len(predicateFails) == 0, predicateFails, nil } 优选NodeAffinityPriority对应的Map与Reduce函数(k8s.io/kubernetes/pkg/scheduler/algorithm/priorities/node_affinity.go)如下: // CalculateNodeAffinityPriorityMap prioritizes nodes according to node affinity scheduling preferences // indicated in PreferredDuringSchedulingIgnoredDuringExecution. Each time a node matches a preferredSchedulingTerm, // it will get an add of preferredSchedulingTerm.Weight. Thus, the more preferredSchedulingTerms // the node satisfies and the more the preferredSchedulingTerm that is satisfied weights, the higher // score the node gets. func CalculateNodeAffinityPriorityMap(pod *v1.Pod, meta interface{}, nodeInfo *schedulernodeinfo.NodeInfo) (framework.NodeScore, error) { node := nodeInfo.Node() if node == nil { return framework.NodeScore{}, fmt.Errorf("node not found") } // default is the podspec. affinity := pod.Spec.Affinity if priorityMeta, ok := meta.(*priorityMetadata); ok { // We were able to parse metadata, use affinity from there. affinity = priorityMeta.affinity } var count int32 // A nil element of PreferredDuringSchedulingIgnoredDuringExecution matches no objects. // An element of PreferredDuringSchedulingIgnoredDuringExecution that refers to an // empty PreferredSchedulingTerm matches all objects. if affinity != nil && affinity.NodeAffinity != nil && affinity.NodeAffinity.PreferredDuringSchedulingIgnoredDuringExecution != nil { // Match PreferredDuringSchedulingIgnoredDuringExecution term by term. for i := range affinity.NodeAffinity.PreferredDuringSchedulingIgnoredDuringExecution { preferredSchedulingTerm := &affinity.NodeAffinity.PreferredDuringSchedulingIgnoredDuringExecution[i] if preferredSchedulingTerm.Weight == 0 { continue } // TODO: Avoid computing it for all nodes if this becomes a performance problem. nodeSelector, err := v1helper.NodeSelectorRequirementsAsSelector(preferredSchedulingTerm.Preference.MatchExpressions) if err != nil { return framework.NodeScore{}, err } if nodeSelector.Matches(labels.Set(node.Labels)) { count += preferredSchedulingTerm.Weight } } } return framework.NodeScore{ Name: node.Name, Score: int64(count), }, nil } // CalculateNodeAffinityPriorityReduce is a reduce function for node affinity priority calculation. var CalculateNodeAffinityPri舆情回复orityReduce = NormalizeReduce(framework.MaxNodeScore, false) |
上一篇:美妆品牌如何在TikTok营销:定制专属歌曲,视频
下一篇:Facebook跨境营销四大方案来了!
基于对传统行业渠道的理解,对互联网行业的渠道我们可以下这样一个定义:一切...
小米应用商店的后台操作和苹果是比较相似的,因为都能填写100字符关键词,允许...
小米的规则目前是在变更中的,但是根据经验小米的搜索排名评分的高低是个很重...
为了恰饭,有时候是要接入一些广告的,所以FB也专门有一个广告的SDK,这就是A...
在 2018 年于旧金山举行的游戏开发者大会上,Amazon Web Services (AWS) 曾宣布,目前世...
关于Facebook Audience Network如何收款的问题,其实官方已经给了详细的步骤。本文主要...
本文介绍了Audience Network对广告载体的质量检查,以及它重点广告形式需要注意的问...
随着iOS开发,作为开发者或公司需要针对iOS App开发涉及的方方面面作出对应的信息...
Facebook和谷歌对出海企业广告渠道都很熟悉,但事实上,在国外还有一些渠道也很...
卖家从做号的第1分钟开始,就一定要想好变现路径是什么?一定要以变现为目的去...
小提示:您应该对本页介绍的“腾讯云带你一篇读懂Kubernetes Scheduler扩展功能(一”相关内容感兴趣,若您有相关需求欢迎拨打我们的服务热线或留言咨询,我们尽快与您联系沟通腾讯云带你一篇读懂Kubernetes Scheduler扩展功能(一的相关事宜。
关键词:腾讯云带你一篇读懂Kube