By Walfredo Cirne, Narayan Desai, Eitan Frachtenberg, Uwe Schwiegelshohn

This book constitutes the thoroughly refereed proceedings of the 16th International Workshop on Job Scheduling Strategies for Parallel Processing, JSSPP 2012, which was held in Shanghai, China, in May 2012. The 14 revised papers presented were carefully reviewed and selected from 24 submissions. The papers cover the following topics: parallel batch scheduling; workload analysis and modeling; resource management system software studies; and Web scheduling.

When a query is submitted, the front-end scheduler converts the center of the query to its corresponding one-dimensional point on the Hilbert curve, determines which sub-range includes the point, and assigns the query to the back-end server that owns that sub-range. Assigning nearby queries to the same one-dimensional sub-range takes advantage of the locality-preserving properties of the Hilbert curve. As shown in Figure 4(a), the one-dimensional boundaries on the Hilbert curve cluster two-dimensional queries so that they have good spatial locality.
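The mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the query space is assumed to be a 2^order by 2^order integer grid, the sub-ranges are assumed to be given as a sorted list of starting curve indices (one per back-end server), and the names `hilbert_index`, `assign_server`, and `range_starts` are hypothetical. The index computation is the standard iterative conversion from grid coordinates to Hilbert curve distance.

```python
from bisect import bisect_right


def hilbert_index(order, x, y):
    """Map grid cell (x, y) to its distance along a Hilbert curve.

    The grid is assumed to be 2**order cells on each side.
    """
    n = 1 << order
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def assign_server(order, center, range_starts):
    """Return the index of the server whose sub-range contains the query center.

    range_starts is a sorted list of the first curve index owned by each
    server (hypothetical layout; range_starts[0] must be 0).
    """
    d = hilbert_index(order, center[0], center[1])
    return bisect_right(range_starts, d) - 1


# Example: a 4x4 grid (order 2) split evenly across four servers.
range_starts = [0, 4, 8, 12]
server = assign_server(2, (1, 1), range_starts)  # cell (1, 1) has index 2, so server 0
```

Because consecutive curve indices always belong to adjacent grid cells, each contiguous sub-range covers a compact region of the two-dimensional space, which is why queries with nearby centers tend to land on the same server.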

Comprehensive evaluation with trace-driven simulation. We conducted experiments using job traces collected from four production systems, including Intrepid, currently ranked 15th on the Top500 list. Our results indicate that, compared with the classical FCFS-based backfill algorithm, the checkpoint-based approach is capable of producing significant improvements (by up to 40%) to key scheduling performance metrics, such as average job wait time, slowdown, and mean queue length. In our evaluation, we also estimated the overhead incurred by checkpoint/restart operations, based on system information from the Intrepid system.
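For readers unfamiliar with the metrics named above, the following sketch shows how average wait time and average slowdown are commonly computed from a job trace. It assumes the usual definitions (wait time is start time minus submit time; slowdown is wait time plus run time, divided by run time); the excerpt does not state which exact variant the authors used, and the `Job` record and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float   # time the job entered the queue (seconds)
    start: float    # time the job began running (seconds)
    runtime: float  # actual execution time (seconds), assumed > 0


def average_wait(jobs):
    """Mean time jobs spent queued before starting."""
    return sum(j.start - j.submit for j in jobs) / len(jobs)


def average_slowdown(jobs):
    """Mean ratio of response time (wait + run) to run time."""
    return sum((j.start - j.submit + j.runtime) / j.runtime for j in jobs) / len(jobs)


# Two illustrative jobs (made-up numbers, not taken from any trace).
trace = [Job(submit=0, start=0, runtime=100), Job(submit=10, start=100, runtime=50)]
```

With these definitions, a short job that waits a long time contributes a large slowdown, so the metric is sensitive to how well the scheduler treats small jobs.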

1 range of the estimated time, across the four systems. As discussed in more detail in the next section, such dramatic errors in job wall time estimates lead to significant problems in backfilling effectiveness. To improve scheduling performance, many previous studies have targeted improving the accuracy of job execution time estimates [13–15], with only limited success. In fact, several factors lead to inaccurate estimates, most of which are overestimates. Firstly, users may not have enough knowledge of the expected execution time of their jobs (especially for short testing jobs and jobs with new inputs, parameters, or algorithms), and so choose to err on the safe side.

