Parallel task scheduling

Parallel task scheduling (also called parallel job scheduling[1][2] or parallel processing scheduling[3]) is an optimization problem in computer science and operations research.

In a general job scheduling problem, we are given n jobs J1, J2, ..., Jn of varying processing times, which need to be scheduled on m machines while trying to minimize the makespan - the total length of the schedule (that is, when all the jobs have finished processing).

In the specific variant known as parallel-task scheduling, all machines are identical.

Each job j has a length parameter pj and a size parameter qj, and it must run for exactly pj time-steps on exactly qj machines in parallel.

Veltman et al.[4] and Drozdowski[3] denote this problem by P|sizej|Cmax in the three-field notation introduced by Graham et al.[5] P means that there are several identical machines running in parallel; sizej means that each job has a size parameter; Cmax means that the goal is to minimize the maximum completion time.

The origins of this problem formulation can be traced back to 1960.[6] For this problem, there exists no polynomial-time approximation algorithm with a ratio smaller than 3/2, unless P = NP.

In the problem definition, there is a set of n jobs and m identical machines. Each job j has a processing time pj (also called the length of j), and requires the simultaneous use of qj machines during its execution (also called the size or the width of j).

A schedule is feasible if each processor executes at most one job at any given time.
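The feasibility condition can be checked directly. The sketch below assumes a simple hypothetical representation in which each job is a triple of its start time, length pj, and width qj, with integer times; a schedule is feasible exactly when, at every time step, the concurrently running jobs use at most m machines in total.

```python
# Sketch: feasibility check for a parallel-task schedule, assuming each job j
# is given as (start, p_j, q_j) with integer times. A schedule is feasible if,
# at every time step, the jobs running concurrently use at most m machines.

def is_feasible(jobs, m):
    """jobs: list of (start, p, q) triples; m: number of machines."""
    if not jobs:
        return True
    horizon = max(start + p for start, p, q in jobs)
    for t in range(horizon):
        used = sum(q for start, p, q in jobs if start <= t < start + p)
        if used > m:
            return False
    return True

# Two jobs of width 2 overlap at time 1: they would need 4 machines at once.
print(is_feasible([(0, 2, 2), (1, 2, 2)], 3))  # False
print(is_feasible([(0, 2, 2), (2, 2, 2)], 3))  # True
```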

Since each job in an optimal schedule can be assumed to start either at time 0 or at the completion time of another job, there are only polynomially many candidate starting times; therefore, it is possible to obtain a compact way of encoding the output with polynomial size.

This problem is NP-hard even when there are only two machines and the sizes of all jobs are qj = 1 (i.e., each job needs to run on only a single machine); this special case is equivalent to the partition problem.

When the number of machines is at most 3, that is, m ≤ 3, there exists a pseudo-polynomial time algorithm which solves the problem exactly.
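The flavor of such pseudo-polynomial algorithms can be illustrated in the simplest hard case above, two machines with qj = 1: a subset-sum dynamic program over the achievable load of one machine finds an optimal split. This is a sketch of that special case only, not of the general m ≤ 3 algorithm.

```python
# Sketch: pseudo-polynomial exact algorithm for the two-machine special case
# with q_j = 1. A subset-sum dynamic program finds the achievable load of one
# machine that is closest to (but not above) half the total load, which
# minimizes the makespan. Running time is O(n * sum(p)): polynomial in the
# numeric value of the input, not in its encoding length.

def two_machine_makespan(p):
    """p: list of integer processing times. Returns the optimal makespan."""
    total = sum(p)
    reachable = {0}                       # loads achievable on machine 1
    for pj in p:
        reachable |= {load + pj for load in reachable}
    best = max(load for load in reachable if load <= total // 2)
    return total - best                   # makespan = max(best, total - best)

print(two_machine_makespan([3, 1, 4, 2, 2]))  # 6
```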

When the number of machines is at least 4, that is, m ≥ 4, the problem is also strongly NP-hard[8] (this result improved a previous result[7] showing strong NP-hardness for m ≥ 5).

When the number of machines is not bounded by a constant, the problem is strongly NP-hard. This holds even for the special case in which the processing time of all jobs is pj = 1, since this special case is equivalent to the bin packing problem: each time-step corresponds to a bin, m is the bin size, each job corresponds to an item of size qj, and minimizing the makespan corresponds to minimizing the number of bins.
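The correspondence to bin packing can be made concrete with any bin packing procedure; the sketch below uses first-fit decreasing, a simple heuristic that is not optimal in general, purely to illustrate the mapping.

```python
# Sketch of the equivalence for unit-time jobs (p_j = 1): each time step is a
# bin of capacity m, each job is an item of size q_j, and the makespan equals
# the number of bins used. First-fit decreasing serves as a simple (generally
# non-optimal) bin packing heuristic here.

def makespan_unit_jobs_ffd(sizes, m):
    """sizes: list of job widths q_j; m: number of machines (bin capacity)."""
    bins = []                             # remaining capacity per time step
    for q in sorted(sizes, reverse=True):
        for i, cap in enumerate(bins):
            if q <= cap:
                bins[i] -= q              # place the job in this time step
                break
        else:
            bins.append(m - q)            # open a new time step
    return len(bins)

# Jobs of widths 3, 3, 2, 2, 2 on m = 5 machines fit into 3 time steps.
print(makespan_unit_jobs_ffd([3, 3, 2, 2, 2], 5))  # 3
```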

Contiguous jobs: In this variant, the machines have a fixed order m1, ..., mm. Instead of being assigned to an arbitrary subset of the machines, the jobs have to be assigned to a contiguous interval of machines. This variant corresponds to the strip packing problem.

Moldable jobs: In this variant, each job can be executed on several possible numbers of machines, and its processing time pj(d) depends on the number d of machines allotted to it. A usual assumption for this kind of problem is that the total workload of a job, which is defined as d · pj(d), is non-increasing in d.

The list scheduling algorithm by Garey and Graham[9] has an absolute ratio of 2, as pointed out by Turek et al.[10] and Ludwig and Tiwari.[11] Feldmann, Sgall and Teng[12] observed that the length of a non-preemptive schedule produced by the list scheduling algorithm is actually at most (2 − 1/m) times the optimum preemptive makespan.
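A greedy scheduler in the spirit of list scheduling can be sketched as follows: walk forward in time and, at each decision point, start the first not-yet-scheduled jobs in list order that fit on the currently idle machines. This is an illustrative sketch assuming integer processing times and qj ≤ m for all jobs, not a faithful reproduction of the algorithm in [9].

```python
# Sketch of list scheduling for parallel tasks: at each completion event,
# greedily start the first unscheduled jobs (in list order) whose width fits
# on the currently idle machines. Assumes integer times and q_j <= m.

def list_schedule(jobs, m):
    """jobs: list of (p, q) pairs in list order. Returns (makespan, starts)."""
    starts = [None] * len(jobs)
    running = []                          # (finish_time, q) of started jobs
    t = 0
    while None in starts:
        running = [(f, q) for f, q in running if f > t]
        free = m - sum(q for _, q in running)
        for j, (p, q) in enumerate(jobs):
            if starts[j] is None and q <= free:
                starts[j] = t
                running.append((t + p, q))
                free -= q
        t = min(f for f, _ in running)    # jump to the next completion event
    makespan = max(starts[j] + jobs[j][0] for j in range(len(jobs)))
    return makespan, starts

mk, st = list_schedule([(2, 2), (3, 1), (1, 3)], 3)
print(mk, st)  # prints: 4 [0, 0, 3]
```

The width-3 job cannot start until both earlier jobs finish, so it waits until time 3 even though a machine is idle before then; such forced idleness is exactly what the ratio-2 analysis accounts for.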

A polynomial-time approximation scheme (PTAS) for the case when the number m of processors is constant, denoted by Pm|sizej|Cmax, was presented by Amoura et al.[13] and Jansen et al.[14] Later, Jansen and Thöle[2] found a PTAS for the case where the number of processors is polynomially bounded in the number of jobs.

In this algorithm, the number of machines appears polynomially in the running time. Since, in general, the number of machines appears only logarithmically in the size of the instance, this algorithm is a pseudo-polynomial time approximation scheme as well.

Eventually, a (3/2 + ε)-approximation was given by Jansen,[15] which closes the gap to the lower bound of 3/2 up to an arbitrarily small ε.

Given an instance of the parallel task scheduling problem, the optimal makespan can differ depending on whether the jobs are required to run on contiguous machines.

The difference between contiguous and non-contiguous schedules was first demonstrated in 1992[16] on a small example instance.

Błądek et al.[17] studied these so-called c/nc-differences and proved several structural properties of them. Furthermore, they proposed two conjectures, which remain unproven.

There are related scheduling problems in which each job consists of several operations, which must be executed in sequence (rather than in parallel); these include open shop, flow shop and job shop scheduling.