
Increasing Task Consolidation Efficiency By Using More Accurate Resource Estimation

Authors: 
Publication Type: 
Refereed Original Article
Abstract: 
Cloud providers aim to provide computing services for a wide range of applications, such as web applications, email, web search, and map-reduce jobs. These applications are commonly scheduled to run on multi-site clusters that are becoming larger and more heterogeneous. A major challenge is to utilize the cluster's available resources efficiently, in particular to maximize overall machine utilization while minimizing application waiting time. We propose a methodology for achieving efficient utilization of the cluster's resources while providing users with fast and reliable computing services. The methodology consists of three main modules: i) a prediction module that forecasts the maximum resource requirement of a task (three prediction techniques are proposed); ii) a scheduling module that efficiently allocates tasks to machines; and iii) a monitoring module that tracks the utilization levels of the machines and tasks, and can evict one or more tasks from the machines for rescheduling if required. There are multiple ways of predicting task requirements, scheduling tasks on machines, and evicting tasks from machines. The decisions made in each module can have a significant impact not only on the objective function but also on the efficiency of the decisions made in the other modules. We therefore study these different combinations and analyse their interactions in order to determine a configuration that meets the objective of the problem. To test our methodology, we have developed a simulator and provide a detailed analysis of the interactions between the modules using a publicly available trace from a large Google cluster of 12,000 machines.
Our results show that using more accurate resource estimations when scheduling tasks, combined with evicting lower-priority tasks in case of over-utilization, can lead to an increase in the average utilization of the cluster, a reduction in the number of tasks being evicted, and a reduction in task waiting time.
Digital Object Identifier (DOI): 
null
Publication Status: 
In Press
Date Accepted for Publication: 
Monday, 31 August, 2015
Publication Date: 
31/08/2015
Journal: 
Future Generation Computer Systems
Institution: 
National University of Ireland, Cork (UCC)
Open access repository: 
No
Publication document: