Gang scheduling

If communicating threads or processes were not gang-scheduled, one could wait to send or receive a message to another while the other is sleeping, and vice versa.

Gang scheduling is based on a data structure called the Ousterhout matrix.

In this matrix each row represents a time slice, and each column a processor.

The threads or processes of each job are packed into a single row of the matrix.
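The packing rule above can be sketched in a few lines (the helper below is illustrative, not from the sources): rows are time slices, columns are processors, and all threads of a job land in a single row so they run simultaneously.

```python
def pack_job(matrix, job_id, n_threads):
    """Place n_threads copies of job_id into the first row of the
    Ousterhout matrix with enough free (None) columns; return the row
    index, or None if no row fits."""
    for r, row in enumerate(matrix):
        free = [c for c, slot in enumerate(row) if slot is None]
        if len(free) >= n_threads:
            for c in free[:n_threads]:
                row[c] = job_id
            return r
    return None

# 3 time slices x 4 processors, all slots initially free.
matrix = [[None] * 4 for _ in range(3)]
pack_job(matrix, "A", 3)  # fills three slots of row 0
pack_job(matrix, "B", 2)  # row 0 has only one free slot, so row 1 is used
pack_job(matrix, "C", 1)  # fits in row 0's remaining slot
```

Because each job occupies exactly one row, all of its threads are dispatched in the same time slice.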

In this case, when a priority job arrives, the sub-gang currently executing on the system is stopped, and all the progress that has been made is lost and must be redone.

This interruption of the job will further delay the total response time of the BoG.

As per the first-come, first-served scheme, whichever job comes first is forwarded for execution.

[3][4] In this execution scheme, jobs are queued by gang size, with the tasks belonging to the largest gang scheduled first. This method tends to starve smaller jobs of resources and is therefore unsuitable for systems where the number of processors is comparatively low.
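The two queue orderings can be contrasted in a short sketch (the job names and sizes are made up for illustration): first-come, first-served keeps arrival order, while largest-gang-first sorts by decreasing gang size, which is what lets large jobs starve small ones.

```python
# Hypothetical jobs as (name, gang_size) pairs, listed in arrival order.
jobs = [("small", 2), ("huge", 16), ("medium", 6)]

# First-come, first-served: dispatch in arrival order.
fcfs_order = list(jobs)

# Largest gang first: sort by decreasing gang size.
lgf_order = sorted(jobs, key=lambda job: -job[1])
```

Under the second ordering, the two-processor job runs only after all larger gangs have been served, no matter how early it arrived.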

There are two scenarios which emerge from the above issue:[5] gang scheduling I/O-bound processes keeps the CPUs idle while awaiting responses from the other processors, even though the idle processors could instead be executing other tasks.

If the gang consists of a mix of CPU-bound and I/O-bound processes, these processes interfere little in each other's operation, so algorithms can be defined to keep both the CPU and the I/O busy at the same time and exploit parallelism.

This method would present the idea of matching pairs of gangs, one I/O based and one CPU bound.
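A minimal sketch of that matching idea follows (the gang names and the greedy pairing rule are assumptions for illustration): each CPU-bound gang is matched with an I/O-bound gang so that while one waits on I/O the other can use the processors.

```python
def pair_gangs(cpu_bound, io_bound):
    """Greedily pair CPU-bound gangs with I/O-bound gangs; any gangs
    without a partner are returned as leftovers."""
    pairs = list(zip(cpu_bound, io_bound))
    n = len(pairs)
    leftovers = cpu_bound[n:] + io_bound[n:]
    return pairs, leftovers

pairs, leftovers = pair_gangs(["matmul", "render"], ["backup"])
```

A real scheduler would pick partners based on measured CPU and I/O behavior rather than simple list order; the sketch only shows the pairing structure.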

[6] Concurrent gang scheduling (CGS) is a highly scalable and versatile algorithm that assumes the existence of a synchronizer utilizing the internal clock of each node.

[7] We assume the existence of a synchronizer that sends a signal to all the nodes in a cluster at a constant interval.

CGS utilizes the fact that the most common events occurring in a PC are timer interrupts, so it uses the timer interrupt as its internal clock.

This flavor of scheduling is implemented by collecting jobs with the same resource requirements into a group and executing that group for a pre-defined time slice.

[8] The local memory of the node is utilized as the swap space for pre-empted jobs.
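The grouping-and-time-slicing idea can be sketched as follows (the function names and round-robin rule are assumptions): jobs with the same resource requirements are collected into groups, and each synchronizer tick selects the next group to run for one time slice.

```python
from collections import defaultdict

def group_by_requirements(jobs):
    """jobs: list of (name, requirement) pairs -> {requirement: [names]}."""
    groups = defaultdict(list)
    for name, req in jobs:
        groups[req].append(name)
    return dict(groups)

def schedule_on_tick(group_keys, tick):
    """On each synchronizer tick, pick the next group round-robin."""
    return group_keys[tick % len(group_keys)]

groups = group_by_requirements([("a", "cpu"), ("b", "io"), ("c", "cpu")])
keys = sorted(groups)               # deterministic order: ["cpu", "io"]
first = schedule_on_tick(keys, 0)   # group running in slice 0
second = schedule_on_tick(keys, 1)  # group running in slice 1
```

The constant-interval synchronizer signal plays the role of `tick` here; preempted groups would be swapped to the node's local memory, as the text notes.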

Synchronization: Each gang of processes utilizing the same resources is mapped to a different processor.

Unlike first fit, the used slots are sorted based on capacity rather than in sequential order.

[9] Neither the capacity-based nor the left-right-based algorithm accounts for the load on individual PEs.

Load-based algorithms take into account the load on the individual PE while tracking the overlap between sets of PEs assigned to different jobs.

Unlike the previous scheme, in which slots were chosen based on the minimal maximum load on any used PE, in this scheme slots are selected based on the average load on the PEs used by the job.
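The two load-based placement rules can be contrasted in a short sketch (the function names and load values are illustrative): each candidate slot is described by the current loads of its PEs, and one rule minimizes the maximum load while the other minimizes the average.

```python
def min_max_load_slot(slots):
    """Index of the slot whose most-loaded PE carries the least load."""
    return min(range(len(slots)), key=lambda i: max(slots[i]))

def min_avg_load_slot(slots):
    """Index of the slot with the lowest average PE load."""
    return min(range(len(slots)), key=lambda i: sum(slots[i]) / len(slots[i]))

# Each inner list holds the current loads of the PEs in one candidate slot.
slots = [[9, 1, 1], [5, 5, 5], [8, 2, 2]]
# max loads: 9, 5, 8  -> the minimal-maximum rule picks slot 1
# averages: 3.67, 5, 4 -> the minimal-average rule picks slot 0
```

The example shows how the two rules can disagree: one heavily loaded PE disqualifies a slot under the first rule but is averaged away under the second.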

Each member of the group will be assigned a controller, and when a job of size n arrives, it is assigned to a controller of size 2^⌈lg n⌉ (the smallest power of 2 that is larger than or equal to n).

The controller is assigned by first sorting all the used slots, and then identifying groups of 2^⌈lg n⌉ contiguous free processors.
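The buddy-style sizing and the search for contiguous free processors can be sketched as follows (the helper names are assumptions): a job of size n gets a controller of size 2^⌈lg n⌉, and we then scan for a run of that many free processors.

```python
def controller_size(n):
    """Smallest power of 2 that is >= n, i.e. 2**ceil(log2(n))."""
    size = 1
    while size < n:
        size *= 2
    return size

def find_contiguous_free(free_map, length):
    """Return the start index of `length` contiguous free (True)
    processors, or None if no such run exists."""
    run = 0
    for i, free in enumerate(free_map):
        run = run + 1 if free else 0
        if run == length:
            return i - length + 1
    return None

size = controller_size(5)   # a job of size 5 needs a controller of size 8
free_map = [False] + [True] * 8   # processor 0 busy, 1..8 free
start = find_contiguous_free(free_map, size)
```

Rounding up to a power of 2 wastes some processors (a size-5 job holds 8) but keeps the buddy structure simple, since controllers always split and merge in halves.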

[9] In all the above-mentioned algorithms, the initial placement policy is fixed and jobs are allocated to the PEs based on that.