Many collective routines and directive-based parallel languages impose implicit barriers.
For example, a parallel do loop in Fortran with OpenMP will not be allowed to continue on any thread until the last iteration is completed. This ensures correctness when the program relies on the result of the loop immediately after its completion.
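To make the implicit barrier concrete, here is a minimal sketch of the analogous construct in C with OpenMP (the array, its size, and the values computed are illustrative assumptions): no thread proceeds past the loop until every iteration has finished, so the statement after the loop can safely read the results.

```c
#include <stdio.h>

int main(void)
{
    double a[1000];

    /* Implicit barrier: no thread is allowed past the end of the
     * loop until every iteration has been completed. */
    #pragma omp parallel for
    for (int i = 0; i < 1000; i++)
        a[i] = i * 0.5;

    /* Safe only because of that implicit barrier: every element of
     * a[] is guaranteed to be written before this line executes. */
    printf("a[999] = %f\n", a[999]);
    return 0;
}
```

Compiled with OpenMP support enabled (for example, with -fopenmp), the final print is guaranteed to see the fully written array.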
In message passing, any global communication (such as reduction or scatter) may imply a barrier.
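The following small MPI sketch in C illustrates the point (the values being reduced are just the process ranks, chosen for illustration): MPI_Allreduce cannot return on any process until every process has contributed its value, so it behaves as a synchronization point, and an explicit MPI_Barrier is shown alongside for comparison.

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, local, global;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local = rank;  /* each process contributes its own value */

    /* A global reduction: no process can obtain the result until
     * every process has contributed, so the call acts much like a
     * barrier for the participating processes. */
    MPI_Allreduce(&local, &global, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    /* An explicit barrier, for comparison. */
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of ranks = %d\n", global);

    MPI_Finalize();
    return 0;
}
```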
The potential problem with the Centralized Barrier, in which every thread waits on a single shared counter and pass/stop flag, is that all threads repeatedly access the same global variable; the resulting communication traffic is rather high, which decreases scalability.
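A minimal sketch of such a centralized barrier using C11 atomics (the thread count, variable names, and the generation-based release trick are assumptions for illustration) makes the problem visible: every waiting thread spins on the same global flag.

```c
#include <stdatomic.h>

#define NTHREADS 8          /* assumed number of participating threads */

static atomic_int count = 0;   /* arrival counter        */
static atomic_int pass  = 0;   /* global pass/stop flag  */

static void central_barrier(void)
{
    int old_pass = atomic_load(&pass);

    if (atomic_fetch_add(&count, 1) == NTHREADS - 1) {
        /* Last thread to arrive: reset the counter, flip the flag. */
        atomic_store(&count, 0);
        atomic_fetch_add(&pass, 1);
    } else {
        /* All other threads busy-wait on the shared flag. */
        while (atomic_load(&pass) == old_pass)
            ;  /* every spin iteration reads the same global variable */
    }
}
```

Each spin iteration is a read of a single shared location, so as the number of threads grows, the traffic concentrated on that one variable grows with it.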
A Combining Tree Barrier avoids this by synchronizing threads in small groups over several levels. After the final-level synchronization, the release signal is propagated back through the earlier levels and all threads get past the barrier.
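A compact C11 sketch of the idea (the fan-in K, node layout, and field names are assumptions, not taken from the source): threads synchronize within their group's node, the last arrival climbs to the parent node, and on its way back it bumps the node's release generation, so the release travels back level by level.

```c
#include <stdatomic.h>

#define K 4  /* assumed fan-in (group size) of each tree node */

typedef struct node {
    atomic_int   count;        /* arrivals at this node in this episode */
    atomic_int   release_gen;  /* generation counter used as release    */
    struct node *parent;       /* NULL at the final (topmost) level     */
} node_t;

/* Arrival: the last of K threads to reach a node continues to its
 * parent. Release: once that thread has been released one level up,
 * it bumps this node's generation, freeing the K-1 threads spinning
 * here, so the signal propagates back through the earlier levels. */
static void tree_barrier(node_t *n)
{
    int gen = atomic_load(&n->release_gen);

    if (atomic_fetch_add(&n->count, 1) == K - 1) {
        atomic_store(&n->count, 0);            /* reset for reuse        */
        if (n->parent)
            tree_barrier(n->parent);           /* synchronize one level up */
        atomic_fetch_add(&n->release_gen, 1);  /* release this group     */
    } else {
        while (atomic_load(&n->release_gen) == gen)
            ;  /* spin on this group's flag, not on one global variable */
    }
}
```

Each thread calls tree_barrier on the leaf node it was assigned to; because threads spin on their own group's flag rather than a single shared variable, the traffic per location stays bounded by the fan-in.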
A hardware barrier can instead use a dedicated wire that performs an OR/AND operation to act as the pass/block flag and thread counter.
For small systems, such a model works and communication speed is not a major concern.
In large multiprocessor systems, however, this hardware design can give the barrier implementation high latency. Connecting the processors through a network, in a manner analogous to the Combining Tree Barrier, is one implementation that lowers the latency.