A NOVEL HIGH SPEED ADDER ARCHITECTURE IMPLEMENTATION USING HDL

ABHILASH J.E.N*
AJAY D**
SURENDRA T***
TANUJA T****
ASHOK K*****

Abstract

This paper presents an efficient high-speed parallel single-rail self-timed adder. It is based on a recursive formulation for performing multi-bit binary addition. The operation is parallel for those bits that do not need any carry chain propagation. Thus, the design attains logarithmic performance over random operand conditions without any special speedup circuitry or look-ahead schema. A practical implementation is provided along with a completion detection unit. The design is implemented and verified in Xilinx ISE 14.5 using the ISE Simulator, and the results are compared with a ripple carry adder (RCA). In this implementation, the delay is about 63.3% lower than that of the RCA (1.639 ns against 4.467 ns).

Keywords:
Parallel self-timed adder (PASTA);
Xilinx;
Mentor Graphics;
Carry chain propagation;

* B.Tech Program, Electronics And Communication Engineering.
** Andhra Pradesh, India
*** Associate Professor, Dept Electronics And Communication Engineering.
**** Swarnandhra College Of Engineering And Technology, Narsapur, West Godavari Dst, Andhra Pradesh
1. Introduction
Binary addition is one of the most important operations a processor performs. Most adders have been designed for synchronous circuits, even though there is strong interest in clockless (asynchronous) circuits. Asynchronous circuits do not assume any quantization of time. They therefore hold great potential for logic design, since they are free from several problems of clocked circuits. Logic flow in asynchronous circuits is controlled by a request/acknowledgment handshake protocol, which establishes a pipeline in the absence of clock pulses. Explicit handshaking blocks for small elements such as single-bit adders are expensive. Consequently, handshaking is managed implicitly and efficiently using dual-rail carry propagation in adders. A valid dual-rail carry output also provides an acknowledgment from the single-bit adder block. Thus, asynchronous adders are based either on full dual-rail encoding of all signals, or on pipelined operation using single-rail data encoding with a dual-rail carry representation for acknowledgments. Although these constructs increase the robustness of circuit designs, they add overhead that erodes the speed advantage of asynchronous adders. A more efficient alternative that addresses these problems is therefore worth considering.

This paper presents an asynchronous parallel self-timed adder (PASTA) based on a recursive algorithm. The design of the parallel adder is regular and uses half adders (HAs) along with multiplexers, requiring minimal interconnections. Thus, it is suitable for very-large-scale integration (VLSI). The design works in parallel for independent carry chains. The implementation in this paper uses feedback through XOR gates to form a single-rail cyclic asynchronous adder. Cyclic circuits can be more resource-efficient than their acyclic counterparts. Wave pipelining is a technique in which new inputs are applied before the outputs have stabilized. The proposed circuit manages automatic single-rail pipelining of the carry inputs, separated by the propagation and inertial delays of the gates in the circuit path. It is thus a single-rail wave-pipelined approach, quite different from conventional pipelined adders that use dual-rail encoding.

2. Research Method
2.1 Background

There are many designs of binary adders; we focus here on asynchronous self-timed adders. Self-timed circuits rely on timing assumptions, rather than a global clock, for correct operation. Self-timed adders can run faster on average for dynamic data, because early completion sensing avoids the worst-case delay that synchronous circuits must allow for. They can be classified as follows.
2.1.1 Pipelined adders using single-rail data encoding

The asynchronous request/acknowledge (Req/Ack) handshake can be used to enable the adder block as well as to establish the flow of carry signals. In most cases, a dual-rail carry convention is used for the internal bitwise flow of carry outputs. The dual-rail signals can represent more than two logic values (invalid, 0, 1), and can therefore be used to generate a bit-level acknowledgment when a single-bit operation is completed. The completion detection unit senses final completion when all acknowledge bits are high. The carry-completion sensing adder is an example of a pipelined adder; it uses full adder (FA) functional blocks adapted for dual-rail carry. A speculative completion adder, in contrast, uses so-called abort logic and early completion to select the proper completion response from a number of delay lines. However, the abort logic implementation is expensive due to high fan-in requirements.
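
The bit-level acknowledgment described above can be illustrated with a minimal Python sketch. The rail assignment used here, (rail1, rail0) with (0, 0) as the invalid value, is a common convention assumed for illustration and is not specified in the text; the function names `decode`, `bit_ack`, and `all_done` are likewise hypothetical.

```python
# Assumed encoding (a common convention, not taken from the text):
#   (rail1, rail0) = (0, 0) -> invalid (no data yet)
#                    (0, 1) -> logic 0
#                    (1, 0) -> logic 1
def decode(rail1: int, rail0: int):
    """Return 0 or 1 for a valid code word, None while the signal is invalid."""
    if (rail1, rail0) == (1, 0):
        return 1
    if (rail1, rail0) == (0, 1):
        return 0
    return None  # (0, 0) is invalid; (1, 1) is unused in this encoding


def bit_ack(rail1: int, rail0: int) -> int:
    """Bit-level acknowledgment: high once exactly one rail is asserted."""
    return rail1 ^ rail0


def all_done(carries) -> bool:
    """Completion is sensed when every bit-level acknowledgment is high."""
    return all(bit_ack(r1, r0) for r1, r0 in carries)
```

With this model, `all_done([(1, 0), (0, 1)])` is true, while a single carry still at (0, 0) keeps completion low.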

2.1.2 Delay insensitive adders using dual-rail encoding

Delay-insensitive (DI) adders are asynchronous adders that assert bundling constraints or DI operations. They can therefore operate correctly in the presence of bounded but unknown gate and wire delays. There are many variants of DI adders, such as the DI ripple carry adder (DIRCA) and the DI carry look-ahead adder (DICLA). DI adders use dual-rail encoding, which is assumed to increase the area. Although dual-rail encoding doubles the wire complexity, it can still be used to produce circuits nearly as efficient as their single-rail variants using dynamic logic or nMOS-only designs. A DIRCA uses 40 transistors, whereas an RCA uses only 28. Similar to the CLA, the DICLA defines carry propagate, generate, and kill equations in terms of dual-rail encoding. The carry signals are not connected in a chain but are organized in a hierarchical tree, so the adder can operate faster when there is a long carry chain. A further optimization follows from the observation that dual-rail logic benefits from the settling of either the 0 or the 1 path; it need not wait for both paths to be evaluated. Thus, the carry look-ahead circuitry can be sped up further by sending carry-generate or carry-kill signals to any level of the tree. This variant is called the DICLA with speedup circuitry (DICLASP).

2.2 Design of PASTA

This section presents the architecture and theory behind the PASTA. The adder first accepts two input operands and performs a half-addition for each bit. It then iterates, using the previously generated carries and sums to perform half-additions repeatedly until all carry bits are consumed and settle at zero.
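
The recursion described above can be modeled in a few lines of Python. This is a behavioral sketch on integers, not the HDL implementation reported in this paper; the function name `pasta_add`, the `width` parameter, and the returned iteration count are introduced here for illustration only.

```python
def pasta_add(a: int, b: int, width: int = 8):
    """Add two unsigned `width`-bit integers by repeated half-addition.

    Returns (sum including the carry-out bit, number of iterations
    performed after the initial half-addition).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    # Initial phase: one half-addition per bit position.
    s = a ^ b            # sum bits
    c = (a & b) << 1     # carry bits, shifted to the position they feed
    iterations = 0
    # Iterative phase: half-add sums and carries until no carry remains.
    while c:
        s, c = s ^ c, (s & c) << 1
        iterations += 1
    return s, iterations
```

For example, `pasta_add(10, 5, 4)` finishes without any iteration because no bit position generates a carry, whereas `pasta_add(15, 1, 4)` needs four iterations because the carry travels through every position.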

2.3. Architecture of PASTA

The general architecture of the adder is shown in Fig. 1. The SEL input of the two-input multiplexers initially selects the actual operands (SEL = 0) and then switches to the feedback and carry paths for the subsequent iterations (SEL = 1). The feedback path from the half adders lets the iterations continue until completion, when all carry signals have assumed zero values.

Fig 1: General block diagram of PASTA
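
The multiplexer and half-adder arrangement of Fig. 1 can be sketched at bit level as follows. This is an illustrative software model under stated assumptions: bit lists are LSB-first, both operands have the same length, and the operands are padded with one zero bit so that the final carry ends up in the sum. The names `mux2`, `evaluate_slices`, and `add_bits` are hypothetical and do not come from the reported design.

```python
def mux2(sel, d0, d1):
    """2-input multiplexer: d0 when sel == 0, d1 when sel == 1."""
    return d1 if sel else d0


def half_adder(x, y):
    """Return (sum, carry) for two single bits."""
    return x ^ y, x & y


def evaluate_slices(sel, a, b, s, c):
    """Evaluate every bit slice once; all lists are LSB-first, equal length.

    c[i] is the carry arriving at slice i (c[0] is always 0).
    """
    n = len(a)
    s_next, c_next = [0] * n, [0] * n
    for i in range(n):
        x = mux2(sel, a[i], s[i])  # operand bit, or fed-back sum
        y = mux2(sel, b[i], c[i])  # operand bit, or carry from slice i-1
        s_next[i], carry_out = half_adder(x, y)
        if i + 1 < n:
            c_next[i + 1] = carry_out
        # The carry out of the top slice is dropped; add_bits zero-pads
        # the operands, so that carry is always 0.
    return s_next, c_next


def add_bits(a, b):
    """Add two LSB-first bit lists by iterating the slices until carries clear."""
    a, b = a + [0], b + [0]  # zero-pad so the final carry lands in the sum
    zeros = [0] * len(a)
    s, c = evaluate_slices(0, a, b, zeros, zeros)  # SEL = 0: load operands
    while any(c):                                   # SEL = 1: feedback paths
        s, c = evaluate_slices(1, a, b, s, c)
    return s
```

Calling `add_bits([1, 0, 1, 0], [1, 1, 0, 0])`, that is 5 + 3 with the least significant bit first, returns `[0, 0, 0, 1, 0]`, which is 8.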

2.4 State Diagrams

Fig. 2 gives state diagrams for the initial phase and the iterative phase of the proposed design. Each state is represented by the pair \((C_{i+1}, S_i)\), where \(C_{i+1}\) and \(S_i\) are the carry-out and sum values, respectively, from the \(i\)th bit block. During the initial phase, the circuit works as a combinational half adder operating in fundamental mode. Because half adders are used instead of full adders, the state (1, 1) cannot appear.
Fig 2: State diagram of PASTA
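
That a half adder can never produce a carry of 1 together with a sum of 1 is easy to confirm by enumeration. The short Python check below is only an illustration; `half_adder` and `reachable` are names chosen here.

```python
from itertools import product


def half_adder(x: int, y: int):
    """Return (carry, sum) of two single bits."""
    return x & y, x ^ y


# Collect every (carry, sum) pair a half adder can produce.
reachable = {half_adder(x, y) for x, y in product((0, 1), repeat=2)}
print(sorted(reachable))  # (1, 1) is absent from the output
```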

During the iterative phase (SEL = 1), the feedback path through the multiplexer block is activated. The carry transitions (\(C_i\)) are allowed as many times as needed to complete the recursion. By the definition of fundamental-mode circuits, the present design cannot be considered a fundamental-mode circuit, as the inputs and outputs go through several transitions before the final output is produced. Several transitions take place internally, as shown in the state diagram. This is analogous to cyclic sequential circuits, where gate delays are used to separate individual states.

2.5. Implementation

The PASTA architecture is implemented using Xilinx ISE 14.5; Mentor Graphics tools are used for synthesis and simulation. The resulting design is shown in the figure. In the ratioed (pseudo-nMOS style) design, the pMOS transistor connected to VDD acts as a load resistor, resulting in static current drain when some of the nMOS transistors are on simultaneously. In addition to the \(C_i\)s, the SEL signal is also included in the TERM signal to ensure that completion cannot be accidentally asserted during the initial selection phase of the actual inputs. It also prevents the pMOS pull-up transistor from being always on. Thus, static current flows only for the duration of the actual computation.
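
The logical behavior of the TERM signal described above can be summarized in a small Python model. It captures only the Boolean function, under the assumption that TERM is active high and is held low while SEL = 0; it says nothing about the transistor-level circuit, and the function name `term` is chosen here for illustration.

```python
def term(sel: int, carries) -> int:
    """Completion signal: high only in the iterative phase (sel == 1)
    and only when every carry bit is zero."""
    return int(bool(sel) and not any(carries))


# Initial selection phase: carries may all be zero, but TERM must stay low.
print(term(0, [0, 0, 0, 0]))  # prints 0
# Iterative phase with a carry still pending.
print(term(1, [0, 1, 0, 0]))  # prints 0
# Iterative phase with all carries consumed.
print(term(1, [0, 0, 0, 0]))  # prints 1
```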
3. Results and Analysis

Fig 3: Internal RTL schematic of PASTA

Fig 4: Synthesis report of PASTA
4. Conclusion

<table>
<thead>
<tr>
<th>PARAMETER</th>
<th>PASTA</th>
<th>RCA</th>
</tr>
</thead>
<tbody>
<tr>
<td>Number of slice LUTs</td>
<td>72</td>
<td>24</td>
</tr>
<tr>
<td>Number of fully used LUT-FF pairs</td>
<td>69</td>
<td>0</td>
</tr>
<tr>
<td>Number of bonded IOBs</td>
<td>56</td>
<td>50</td>
</tr>
<tr>
<td>Number of slice registers</td>
<td>70</td>
<td>Null</td>
</tr>
<tr>
<td>Number of BUFG/BUFGCTRL/BUFHCE</td>
<td>1</td>
<td>Null</td>
</tr>
<tr>
<td>Time delay</td>
<td>1.639 ns</td>
<td>4.467 ns</td>
</tr>
</tbody>
</table>

Table 1: Comparison of the parallel self-timed adder (PASTA) and the ripple carry adder (RCA)
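
As a check on the improvement quoted in the abstract, the relative delay reduction can be computed from the time-delay row of Table 1. The symbols \(t_{\mathrm{RCA}}\) and \(t_{\mathrm{PASTA}}\) are introduced here for convenience and denote the two reported delays:

\[
\frac{t_{\mathrm{RCA}} - t_{\mathrm{PASTA}}}{t_{\mathrm{RCA}}}
= \frac{4.467 - 1.639}{4.467}
= \frac{2.828}{4.467}
\approx 0.633
\]

That is a reduction of about 63.3%. Equivalently, the RCA delay is about 2.7 times the PASTA delay.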

Adder speed has become an important constraint because adder circuits are central to a wide variety of microelectronic systems. In this paper we compared the parallel self-timed adder (PASTA) with the ripple carry adder (RCA), taking speed as the main criterion. The PASTA proved faster, with a delay of 1.639 ns against 4.467 ns for the RCA, at the cost of more slice LUTs (72 against 24).