Transport triggered architecture

However, the low-level programming model also means that a binary compiled for one TTA processor will not run on another without recompilation if the two architectures differ even slightly.

The binary incompatibility problem, together with the complexity of implementing a full context switch, makes TTAs more suitable for embedded systems than for general-purpose computing.

The low-level programming model offers several benefits over a standard VLIW.

Because the programmer controls the timing of the operand and result data transports, the complexity of the register file (RF), that is, its number of input and output ports, need not be scaled for the worst-case issue/completion scenario of multiple parallel instructions.
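The port-count argument can be illustrated with a small sketch. It assumes, purely for illustration, that every operation reads two operands and writes one result, and that register names look like r1, r2 and so on; the port names alu.o, alu.t and alu.r are hypothetical. The first function gives the worst-case port count a VLIW register file would have to provide, the second counts the ports a given TTA move schedule actually uses.

```python
def is_register(name):
    """Treat names like 'r1' or 'r12' as register file entries (assumed naming)."""
    return name.startswith("r") and name[1:].isdigit()


def vliw_worst_case_rf_ports(num_function_units, reads_per_op=2, writes_per_op=1):
    """Ports needed if every unit issues and completes in the same cycle.

    The 2-read/1-write shape per operation is an assumption of this sketch.
    """
    return (num_function_units * reads_per_op,
            num_function_units * writes_per_op)


def tta_required_rf_ports(schedule):
    """Maximum RF reads and writes used in any single cycle of a move schedule.

    `schedule` is a list of cycles; each cycle is a list of (source, destination)
    moves. Moves between function unit ports do not touch the register file.
    """
    max_reads = max_writes = 0
    for cycle in schedule:
        reads = sum(1 for src, _ in cycle if is_register(src))
        writes = sum(1 for _, dst in cycle if is_register(dst))
        max_reads = max(max_reads, reads)
        max_writes = max(max_writes, writes)
    return max_reads, max_writes


# Hypothetical schedule computing r4 = (r1 + r2) + r3. The intermediate sum is
# moved straight from the result port to the operand port, bypassing the RF.
schedule = [
    [("r1", "alu.o"), ("r2", "alu.t")],
    [("alu.r", "alu.o"), ("r3", "alu.t")],
    [("alu.r", "r4")],
]

print(vliw_worst_case_rf_ports(4))      # (8, 4)
print(tta_required_rf_ports(schedule))  # (2, 1)
```

In this sketch the register file only needs as many ports as the busiest cycle of the schedule uses, which is the quantity the programmer or compiler controls.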

The reduced register pressure, in addition to simplifying the RF hardware, can lead to significant CPU energy savings, an important benefit especially in mobile embedded systems.[1][2]

TTA processors are built of independent function units and register files, which are connected with transport buses and sockets.

Data memory access and communication with the outside of the processor are handled by special function units.

The interconnect architecture consists of transport buses, which are connected to function unit ports by means of sockets.

Thus, data transports taking place in a clock cycle can be programmed by defining the source and destination socket/port connection to be enabled for each bus.
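As a rough sketch of this idea, an instruction can be modelled as one slot per bus, where each slot names the source and destination to connect in that cycle. Everything below is hypothetical: the connectivity table, the port names (rf.r1, alu.o, alu.t, alu.r) and the helper check_instruction are invented for illustration and do not describe any particular TTA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    """One data transport: connect `source` to `destination` on a bus."""
    source: str
    destination: str


# Hypothetical connectivity: the ports each bus can reach through its sockets.
BUS_CONNECTIONS = {
    0: {"sources": {"rf.r1", "rf.r2", "alu.r"},
        "destinations": {"alu.o", "alu.t", "rf.r3"}},
    1: {"sources": {"rf.r2", "alu.r"},
        "destinations": {"alu.t", "rf.r3"}},
}


def check_instruction(slots):
    """Return True if every bus slot names a connection that exists.

    `slots[i]` is the Move programmed on bus i, or None if the bus is idle.
    """
    for bus, move in enumerate(slots):
        if move is None:
            continue  # this bus carries no transport in this cycle
        connections = BUS_CONNECTIONS[bus]
        if move.source not in connections["sources"]:
            return False
        if move.destination not in connections["destinations"]:
            return False
    return True


# Two transports in the same cycle, one per bus.
ok = check_instruction([Move("rf.r1", "alu.o"), Move("rf.r2", "alu.t")])
# rf.r1 has no socket on bus 1 in this made-up machine, so this is rejected.
bad = check_instruction([None, Move("rf.r1", "alu.t")])
```

The check mirrors a constraint a TTA compiler has to respect: a move can only be scheduled on a bus whose sockets reach both endpoints.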

The assembly language for TTA processors typically includes control flow instructions such as unconditional branches (JUMP), conditional relative branches (BNZ), subroutine calls (CALL), and conditional returns (RETNZ).

[3][4] TTA implementations that only support unconditional data transports, such as the Maxim Integrated MAXQ,[5] typically have a special function unit tightly connected to the program counter that responds to a variety of destination addresses.
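A minimal sketch of such a unit follows. It is not the MAXQ register map: the destination names (jump, jump_nz, call, ret), the zero flag and the return stack are assumptions made for illustration. The point it shows is that the move itself is always executed, and the function unit decides what the write does based on which destination address was used.

```python
class ProgramCounterUnit:
    """Hypothetical function unit wrapped around the program counter."""

    def __init__(self):
        self.pc = 0              # assumed to already point at the next instruction
        self.zero_flag = True    # assumed status flag set by earlier operations
        self.return_stack = []

    def write(self, destination, value):
        """Handle an unconditional move of `value` to one of the unit's addresses."""
        if destination == "jump":
            self.pc = value
        elif destination == "jump_nz":
            # The transport always happens; the unit ignores it if the flag is set.
            if not self.zero_flag:
                self.pc = value
        elif destination == "call":
            self.return_stack.append(self.pc)
            self.pc = value
        elif destination == "ret":
            # The moved value is ignored; the return address comes from the stack.
            self.pc = self.return_stack.pop()
        else:
            raise ValueError(f"unknown destination: {destination}")


unit = ProgramCounterUnit()
unit.pc = 5
unit.write("call", 100)      # pc becomes 100, return address 5 is saved
unit.write("jump_nz", 200)   # zero flag is set, so pc stays at 100
unit.zero_flag = False
unit.write("jump_nz", 200)   # now the branch is taken
unit.write("ret", 0)         # pc returns to 5
```

With this arrangement an assembler can still offer mnemonics such as JUMP or CALL, each expanding to a single move to the corresponding destination address.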

Finally, a control signal selects and triggers the addition operation in the ALU, whose result is transferred back to register r3.

TTA programs do not define the operations, but only the data transports needed to write and read the operand values.

Therefore, executing an addition operation in TTA requires three data transport definitions, also called moves.

If the target processor has multiple buses, each bus can be used in parallel in the same clock cycle.
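The three moves and the effect of a second bus can be sketched with a toy simulator. The source registers r1 and r2, the port names alu.o (operand), alu.t (trigger) and alu.r (result), and the one-cycle latency are assumptions of this sketch, not properties of a specific processor. Writing to the trigger port is what starts the addition.

```python
def run(program, registers):
    """Execute a list of instructions; each instruction holds one move per bus.

    All sources are read before any destination is written, so moves placed
    in the same instruction behave as parallel transports in one clock cycle.
    """
    state = dict(registers)
    state.update({"alu.o": 0, "alu.t": 0, "alu.r": 0})
    for instruction in program:  # one instruction per clock cycle
        transfers = [(dst, state[src]) for src, dst in instruction]
        for dst, value in transfers:
            state[dst] = value
        # A write to the trigger port starts the addition; the result is
        # readable from alu.r starting with the next instruction.
        if any(dst == "alu.t" for dst, _ in transfers):
            state["alu.r"] = state["alu.o"] + state["alu.t"]
    return state


# One bus: the three moves take three cycles.
one_bus = [
    [("r1", "alu.o")],
    [("r2", "alu.t")],
    [("alu.r", "r3")],
]

# Two buses: both operand moves share a cycle, so two cycles suffice.
two_bus = [
    [("r1", "alu.o"), ("r2", "alu.t")],
    [("alu.r", "r3")],
]

initial = {"r1": 2, "r2": 3, "r3": 0}
print(run(one_bus, initial)["r3"])  # 5
print(run(two_bus, initial)["r3"])  # 5
```

Both programs contain the same three moves; only their distribution over cycles changes with the number of buses.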

The ports associated with the ALU may act as an accumulator, allowing creation of macro instructions that abstract away the underlying TTA.

The leading philosophy of TTAs is to move complexity from hardware to software.

TTA processors expose an abundance of programmer-visible context: in addition to register file contents, it practically includes function unit pipeline register contents and/or function unit input and output ports. As a result, the context saves required for external interrupt support can become complex and expensive to implement in a TTA processor.

Figure: Parts of a transport triggered architecture.