Advantages of focusing on hardware may include speedup, reduced power consumption,[1] lower latency, increased parallelism[2] and bandwidth, and better utilization of the area and functional components available on an integrated circuit. These come at the cost of a reduced ability to update designs once they are etched onto silicon, higher functional verification costs, longer times to market, and the need for more parts.
Hardware description languages (HDLs) such as Verilog and VHDL can model the same semantics as software, and a design written in them can be synthesized into a netlist that is then programmed onto an FPGA or composed into the logic gates of an ASIC.
If needed calculations are specified in a register transfer level (RTL) hardware design, the time and circuit area costs that would be incurred by instruction fetch and decoding stages can be reclaimed and put to other uses.
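The saving described above can be illustrated with a toy cost model. This is only a sketch: it counts abstract "steps" rather than real clock cycles or circuit area, and the two-instruction program, the function names, and the one-step-per-stage costs are assumptions made for illustration.

```python
# Toy cost model (illustrative only): count the steps a programmable core
# spends on fetch and decode versus a fixed-function datapath.

def run_interpreted(program, x):
    """Run a tiny instruction list; every instruction pays fetch + decode + execute."""
    steps = 0
    acc = x
    for instr in program:
        steps += 1                 # fetch the next instruction
        op, operand = instr
        steps += 1                 # decode it into an operation and operand
        if op == "mul":
            acc *= operand
        elif op == "add":
            acc += operand
        else:
            raise ValueError(f"unknown op {op!r}")
        steps += 1                 # execute
    return acc, steps


def run_fixed_function(x):
    """The same calculation, 3*x + 5, 'wired in': only the execute steps remain."""
    steps = 0
    acc = x * 3
    steps += 1                     # multiplier stage
    acc = acc + 5
    steps += 1                     # adder stage
    return acc, steps


program = [("mul", 3), ("add", 5)]
interp_result, interp_steps = run_interpreted(program, 7)
fixed_result, fixed_steps = run_fixed_function(7)
print(interp_result, interp_steps)
print(fixed_result, fixed_steps)
```

Both versions compute the same value; in this model the fixed-function version simply has no fetch or decode steps to pay for, which is the overhead an RTL design can reclaim.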
Greater RTL customization of hardware designs allows emerging architectures such as in-memory computing, transport triggered architectures (TTA) and networks-on-chip (NoC) to further benefit from increased locality of data to execution context, thereby reducing computing and communication latency between modules and functional units.
Custom hardware is limited in parallel processing capability only by the area and logic blocks available on the integrated circuit die.
It is common to build multicore and manycore processing units out of microprocessor IP core schematics on a single FPGA or ASIC.[11][12][13][14][15]
Similarly, specialized functional units can be composed in parallel, as in digital signal processing, without being embedded in a processor IP core.
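The relationship between available area and parallelism can be sketched with a simple back-of-the-envelope model. It assumes identical, fully independent functional units that each complete one operation per cycle, and it ignores routing, memory bandwidth, and control overhead; the function names and the example figures are hypothetical.

```python
import math


def units_that_fit(die_area, unit_area):
    """How many identical functional units fit in the available area (idealized)."""
    if die_area < 0 or unit_area <= 0:
        raise ValueError("die_area must be non-negative and unit_area positive")
    return die_area // unit_area


def cycles_needed(num_operations, num_units):
    """Cycles to finish independent operations with num_units working in parallel."""
    if num_units < 1:
        raise ValueError("at least one functional unit is required")
    return math.ceil(num_operations / num_units)


# Hypothetical example: 100 area units available, each functional unit costs 8.
units = units_that_fit(100, 8)
serial_cycles = cycles_needed(1000, 1)
parallel_cycles = cycles_needed(1000, units)
print(units, serial_cycles, parallel_cycles)
```

In this idealized model the cycle count falls in proportion to the number of units until the area budget is exhausted, which is the limit the text describes.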
Therefore, hardware acceleration is often employed for repetitive, fixed tasks involving little conditional branching, especially on large amounts of data.
As device mobility has increased, new metrics have been developed that measure the relative performance of specific acceleration protocols, considering characteristics such as physical hardware dimensions, power consumption, and operations throughput.
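The text does not name specific metrics. As an assumption, two commonly used ratios are throughput per watt and throughput per unit of die area. The sketch below shows how such figures might be computed and compared; the design names and all numbers are placeholders, not measurements.

```python
def ops_per_watt(ops_per_second, watts):
    """Throughput normalized by power consumption."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return ops_per_second / watts


def ops_per_mm2(ops_per_second, area_mm2):
    """Throughput normalized by physical die area."""
    if area_mm2 <= 0:
        raise ValueError("area_mm2 must be positive")
    return ops_per_second / area_mm2


# Placeholder figures for two hypothetical accelerator designs.
candidates = {
    "design_a": {"ops_per_second": 2.0e9, "watts": 4.0, "area_mm2": 50.0},
    "design_b": {"ops_per_second": 1.0e9, "watts": 1.0, "area_mm2": 20.0},
}

# Pick the design with the highest throughput per watt.
most_efficient = max(
    candidates,
    key=lambda name: ops_per_watt(
        candidates[name]["ops_per_second"], candidates[name]["watts"]
    ),
)
print(most_efficient)
```

Here the design with lower raw throughput ranks higher once power is taken into account, which is why normalized metrics matter for mobile devices.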