Branch Instruction in Computer Organization
A branch is an instruction in a computer program that can cause a computer to begin executing a different instruction sequence and thus deviate from its default behaviour of executing instructions in order. Branch may also refer to the act of switching execution to a different instruction sequence as a result of executing a branch instruction. Branch instructions are used to implement control flow in program loops and conditionals (i.e., executing a particular sequence of instructions only if certain conditions are satisfied).
A branch instruction can be either an unconditional branch, which always results in branching or a conditional branch, which may or may not cause branching depending on some condition. Also, depending on how it specifies the address of the new instruction sequence (the target address).
A branch instruction is generally classified as direct, indirect or relative. It means the instruction contains the target address, specifies where the target address is to be found (e.g., a register or memory location), or specifies the difference between the current and target addresses. A branch instruction computes the target address in one of four ways:
- A target address is the sum of a constant and the address of the branch instruction itself.
- The target address is the absolute address given as an operand to the instruction.
- The target address is the address found in the Link Register.
- The target address is the address found in Count Register.
The target address can be computed sufficiently ahead of the branch to pre-fetch instructions along the target path using the first two methods.
Using the third and fourth methods, pre-fetching instructions along the branch path is also possible provided the Link Register or Count Register is loaded sufficiently ahead of the branch instruction.
Types of Brach Instructions
There are three types of branching instructions in computer organization:
1. Jump Instructions
The jump instruction transfers the program sequence to the memory address given in the operand based on the specified flag. Jump instructions are further divided into two parts, Unconditional Jump Instructions and Conditional Jump Instructions.
- Unconditional Jump Instructions:Transfers the program sequence to the described memory address.
- Conditional Jump Instructions: Transfers the program sequences to the described memory address only if the condition is satisfied.
2. Call Instructions
The call instruction transfers the program sequence to the memory address given in the operand. Before transferring, the address of the next instruction after CALL is pushed onto the stack. Call instructions are also two types: Unconditional Call Instructions and Conditional Call Instructions.
- Unconditional Call Instructions:It transfers the program sequence to the memory address given in the operand.
- Conditional Call Instructions:Only if the condition is satisfied, the instructions execute.
3. Return Instructions
The return instruction transfers the program sequence from the subroutine to the calling program. Return instructions are two types: Unconditional Jump Instructions and Conditional Jump Instructions.
- Unconditional Return Instruction:The program sequence is transferred unconditionally from the subroutine to the calling program.
- Conditional Return Instruction: The program sequence is transferred unconditionally from the subroutine to the calling program only if the condition is satisfied.
Implementation of Branch Instruction
Mechanically, a branch instruction can change the program counter of a CPU. The program counter stores the memory address of the next instruction to be executed. Therefore, a branch can cause the CPU to begin fetching its instructions from a different sequence of memory cells. Machine level branch instructions are sometimes called jump instructions.
- Machine level jump instructions typically have unconditional and conditional forms where the latter may be takenor not taken depending on some condition. Usually, there are distinct forms for one-way jump, called jump and subroutine invocations are known as a call, which automatically saves the originating address as a return address on the stack, allowing a single subroutine to be invoked from multiple locations in code.
- When a branch is taken, the CPU’s program counter is set to the argumentof the jump instruction. So, the next instruction becomes the instruction at that address in memory. Therefore, the flow of control changes.
- When a branch is not taken, the CPU’s program counter is unchanged. Therefore, the next instruction executed is the instruction after the branch instruction. Therefore, the flow of control is unchanged.
- The term branch can refer to programs in high-level languages and written in machine codeor assembly language.
- In high-level programming languages, branches usually take the form of conditional statementsof various forms that encapsulate the instruction sequence that will be executed if the conditions are satisfied.
- Unconditional branch instructions such as GOTOare used to unconditionally “jump” to (begin execution of) a different instruction sequence.
- In CPUs with flag registers, an earlier instruction sets a condition in the flag register. The earlier instruction may be arithmetic or logic It is often close to the branch, though not necessarily the instruction immediatelybefore the branch.
- The stored condition is then used in a branch such as a jump if overflow-flag set.
- This temporary information is often stored in a flag register but may also be located elsewhere.
- A flag register design is simple in slower, simple computers. A flag register can place a bottleneck on speed in fast computers because instructions that could operate in parallel need to set the flag bits in a particular sequence.
- There are also machines or particular instructions where the condition may be checked by the jump instruction itself, such as branch <label> if register X negative. In simple computer designs, comparison branches execute more arithmetic and use more power than flag register branches.
- Computer design comparison branches can run faster than flag register branches because comparison branches can access the registers with more parallelism, using the same CPU mechanisms as a calculation.
- Some early and simple CPU architectures, still found in microcontrollers, may not implement a conditional jump but rather a conditional “skip the next instruction” operation. A conditional jump or call is thus implemented as a conditional skip of an unconditional jump or call instruction.
Handling of Branch Instructions
Branch instructions can handle in several ways to reduce their negative impact on the rate of execution of instructions.
1. Branch delay slot
The processor fetches the next instructions before it determines whether the current instruction is a branch instruction. When execution of current instruction is completed, and a branch is to be made, the processor must discard remaining instructions and fetch the new branched instruction at the branch target. The location following a branch instruction is called a branch delay slot. Depending on the time to execute a branch instruction, there may be more than one branch delay slot.
A technique called delayed branching can minimize the penalty incurred as a result of conditional branch instructions. The instructions in the delay slots are always fetched. Therefore, we would like to arrange for them to be fully executed whether the branch is taken or not taken. The objective is to be able to place useful instructions in these slots. If no useful instructions can be placed in the delay slots, these slots must be filled with NOP instructions.
2. Branch prediction
Branch prediction took statistics and used the result to optimize code. A programmer would compile a test version of a program and run it with test data.
- The test code counted how the branches were taken.
- The compiler then used the statistics from the test code to optimize the branches of the released code. The optimization would arrange that the fastest branch direction (taken or not) would always be the most frequently taken control flow path.
- To permit this, CPUs must be designed with predictable branch timing. Some CPUs have instruction sets designed with “branch hints” so that a compiler can tell a CPU how each branch is to be taken.
The problem with software branch prediction is that it requires a complex software development process.
3. Branch-free code
Some logic can be written without branches or with fewer branches. It is often possible to use bitwise operations, conditional moves or other predication instead of branches. Branch-free code is a must for cryptography due to timing attacks.
4. Hardware branch predictors
To run any software, hardware branch predictors moved the statistics into the electronics. Branch predictors are parts of a processor that guess the outcome of a conditional branch. Then the processor’s logic gambles on the guess by beginning to execute the expected instruction flow.
An example of a simple hardware branch prediction scheme is to assume that all backward branches (to a smaller program counter) are taken (because they are part of a loop), and all forward branches (to a larger program counter) are not taken (because they leave a loop).
Better branch predictors are developed and validated statistically by running them in simulation on various test programs. Good predictors usually count the outcomes of previous executions of a branch.