
CHAPTER 3

Circuit Techniques for Low Power, High Speed Pipelined A/D

3.1 Overview

With the optimal pipelined A/D architecture determined, this chapter describes its implementation. The low power techniques used in this design are dynamic comparators and capacitor scaling, both of which are made possible by this architectural selection. To be compatible with digital integrated circuits now running from a 3.3V power supply, some techniques for low supply voltage are introduced, including a 3.3V OpAmp and low voltage SC circuits. One of the modifications made to the previous design [3] is the inclusion of the digital correction circuit on chip, which provides real-time digital correction. Lastly, the bias circuits and clock generator are briefly described, followed by a summary.

Figure 3.1 shows the pipelined A/D architecture chosen for this design. It consists of 9 stages, of which the last is a flash A/D only. The analog signal is first sampled by the S/H and quantized by the flash A/D in the first stage. The D/A conversion, subtraction and amplification are accomplished by the SC circuit. The amplified output is then sampled by the second stage, where identical operations are performed. The stages are interleaved, so the data is processed concurrently and the pipelined architecture achieves high throughput. The output digital bits from each stage are then collected and digitally corrected to achieve the 10 bit resolution.

The per-stage resolution is chosen to be 1.5 bits, mainly for two reasons: the tolerance on the comparator offset can be as much as ±Vref/4 (described in the previous chapter), and the low closed loop gain of the SC circuit is essential for high speed operation. The capacitors in each stage are scaled appropriately for the noise requirement. The interstage amplifier has a gain of 2 and is implemented with an SC circuit that shares one of the sampling capacitors in the feedback path to achieve this gain. (Note: for simplicity, only a single-ended circuit is shown in the block diagram. In the practical implementation, a differential signal path is employed throughout the converter.) Lastly, the subtraction is accomplished by connecting an appropriate reference voltage to one of the sampling capacitors.
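The behavior of one 1.5 bit stage can be illustrated with a short behavioral sketch. It assumes the standard 1.5 bit transfer function (comparator thresholds at ±Vref/4, a three-level D/A and a gain of 2), a normalized reference of 1V, and hypothetical function and parameter names; it is not taken from the design itself.

```python
def stage_1p5bit(vin, vref=1.0, offset_hi=0.0, offset_lo=0.0):
    """Behavioral model of one 1.5 bit pipeline stage (assumed ideal SC circuit).

    Returns (code, residue) where code is 0, 1 or 2 (binary 00, 01, 10).
    offset_hi and offset_lo model the offsets of the two comparators.
    """
    # Two comparators with nominal thresholds at +vref/4 and -vref/4.
    above_hi = vin > (vref / 4 + offset_hi)
    above_lo = vin > (-vref / 4 + offset_lo)
    code = int(above_hi) + int(above_lo)
    # The D/A level subtracted is -vref, 0 or +vref for code 0, 1 or 2,
    # and the difference is amplified by the interstage gain of 2.
    residue = 2 * vin - (code - 1) * vref
    return code, residue
```

With comparator offsets smaller than Vref/4 in magnitude, the residue of this model stays within ±Vref for any input within ±Vref, which is why the offset requirement can be relaxed.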

3.2 Low Power Techniques

With the power-optimized pipelined A/D architecture chosen in the previous chapter, the circuit is also implemented with low power in mind, pushing the limit of this architecture.

3.2.1 Dynamic Comparators

Because digital correction is employed in the pipelined architecture (described in Chapter 2), the comparator offset requirement can be relaxed to as much as ±Vref/4. With this relaxed requirement, dynamic comparators can be used to eliminate the static power of the conventional comparators normally used in A/D converters. The thresholds of the comparators are designed at ±Vref/4, determined by the digital correction and modified coding (see Figure 2.13). Two dynamic comparators are used to implement the two threshold voltages.

The circuit schematic for the dynamic comparator is shown in Figure 3.2. It consists of two cross-coupled inverters controlled by a latch signal. The threshold of the comparator is set by two variable resistors formed with four NMOS transistors. In each resistor, two parallel-connected triode-region NMOS devices, whose gates are driven by the input and reference voltages, determine the resistance. Shown on the right of the figure is a conceptual diagram of the dynamic comparator with the two variable resistors. The conductance of each resistor is set by the widths and gate voltages of its two triode devices, and the comparator threshold is the input at which the two conductances are equal.

Therefore, the ratio between the widths of the input and reference devices determines the comparator threshold. To achieve the threshold required by this architecture, this width ratio is chosen accordingly for the NMOS device sizes. By interchanging the positions of the two reference voltages, the comparator with the threshold of opposite sign is achieved.
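The dependence of the threshold on the width ratio can be checked with a small numerical sketch. It assumes a first-order triode conductance proportional to W(Vgate - Vt), equal device thresholds, and a 4:1 width ratio chosen only for illustration; the function names, common mode voltage and device parameters are hypothetical and not taken from the design.

```python
def triode_conductance(k, width, vgate, vt):
    """First-order conductance of a triode-region NMOS at small drain-source voltage."""
    return k * width * (vgate - vt)


def conductance_difference(vin_diff, vref_diff, w_in, w_ref, vcm=1.5, vt=0.7, k=1.0):
    """Difference between the two variable conductances of the comparator model.

    Each conductance is one input device in parallel with one reference device;
    the reference polarity is crossed between the two sides (an assumption).
    """
    vin_p, vin_n = vcm + vin_diff / 2, vcm - vin_diff / 2
    vref_p, vref_n = vcm + vref_diff / 2, vcm - vref_diff / 2
    g1 = triode_conductance(k, w_in, vin_p, vt) + triode_conductance(k, w_ref, vref_n, vt)
    g2 = triode_conductance(k, w_in, vin_n, vt) + triode_conductance(k, w_ref, vref_p, vt)
    return g1 - g2


def threshold(vref_diff, w_in, w_ref):
    """Differential input at which the two conductances are equal in this model."""
    return (w_ref / w_in) * vref_diff
```

In this model, a reference device one quarter the width of the input device gives a threshold of one quarter of the differential reference, and swapping the reference polarity flips the sign of the threshold.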

3.2.2 Capacitor Scaling

In an A/D converter having resolution higher than 10 bits, the fundamental noise limitation is the thermal noise due to sampling [Appendix]. The noise power is inversely proportional to the sampling capacitor size. Therefore, for a specific thermal noise requirement, the minimum capacitor size can be determined. Assuming the A/D converter is ideal except for the noise (power is dissipated only in charging and discharging capacitors), the minimum power dissipation is set by the minimum sampling capacitor size that satisfies the noise requirement, that is, the size needed to achieve the desired signal-to-noise ratio (SNR) without considering quantization error.
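As a rough illustration of how a noise requirement translates into a minimum capacitor, the following sketch uses the standard kT/C expression for sampled thermal noise. It assumes a single-ended sampler, a full scale sine wave input and a temperature of 300K; the function names are hypothetical, and a differential circuit or additional noise sources would change the numerical factors.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(cap_farads, temp_kelvin=300.0):
    """RMS thermal noise voltage sampled onto a capacitor (kT/C noise)."""
    return math.sqrt(BOLTZMANN * temp_kelvin / cap_farads)


def min_sampling_cap(full_scale_vpp, snr_db, temp_kelvin=300.0):
    """Smallest capacitor whose kT/C noise alone meets the given SNR.

    The signal is assumed to be a sine wave spanning the full scale range,
    so its power is (full_scale_vpp / 2) ** 2 / 2.
    """
    signal_power = full_scale_vpp ** 2 / 8
    noise_power = signal_power / 10 ** (snr_db / 10)
    return BOLTZMANN * temp_kelvin / noise_power
```

Because the noise power scales as 1/C, every additional 3dB of SNR doubles the required capacitor in this model, and with it the load that the driving OpAmp has to charge.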

In practice, the SC power is dominated by the power dissipated in the OpAmps which drive the sampling and charging capacitors. Therefore, to minimize the power dissipation, one needs to minimize the power dissipation in the active circuits, which in turn need to drive the capacitors whose noise limits the SNR. To minimize the active circuit power, the sampling capacitors must be at the minimum value which satisfies the noise requirement at each point of the pipeline. The sampling capacitors present themselves as load capacitance to the previous stage, and hence determine the size of the previous stage for a given sampling rate. Considering the source, load and feedback capacitors, the feedback factor, and the speed and noise requirements, a suitable OpAmp can be designed for each location in the pipeline to minimize power.

As described in Chapter 2, the noise requirement relaxes greatly later in the pipeline, and the effect of parasitic capacitance becomes more and more apparent. Therefore, for the later stages in the pipeline, the size of the OpAmp is mainly determined by the sampling rate rather than by the noise constraint. Figure 3.3 [3] shows the power dissipation of each pipeline stage normalized to the first stage. The front end is limited by the noise and the later stages are limited by the parasitic capacitance.
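The relaxation of the noise requirement down the pipeline can be sketched by referring the sampled noise of each stage back to the input. The model below assumes that each stage contributes only kT/C noise from its own sampling capacitor and that every preceding stage provides an ideal gain of 2; the function name and capacitor values are hypothetical.

```python
BOLTZMANN = 1.380649e-23  # J/K


def input_referred_noise_power(stage_caps, interstage_gain=2.0, temp_kelvin=300.0):
    """Total kT/C noise power of all stages, referred to the converter input.

    stage_caps[0] is the first stage sampling capacitor. Noise sampled by
    stage i is divided by the squared gain of the i stages in front of it.
    """
    total = 0.0
    for index, cap in enumerate(stage_caps):
        stage_noise = BOLTZMANN * temp_kelvin / cap
        total += stage_noise / interstage_gain ** (2 * index)
    return total
```

In this model, halving the capacitor at every stage still leaves the first stage as the largest contributor, which is the motivation for scaling the later stages down until parasitics take over.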

3.3 Low Voltage Techniques

To achieve high integration in today's integrated circuit design, the A/D converter is becoming a circuit block in a large digital signal processing chip and shares the same supply voltage with the digital circuits. With the drive toward low power and low voltage in digital circuits, the supply voltage for the A/D converter has dropped to 3.3V. For analog circuits, a lower supply voltage inherently means a loss of headroom and dynamic range, both of which are critical to any active circuit design. In addition, for A/D converters implemented with SC circuits, a low supply voltage also affects the on-resistance of the MOS switches. Special techniques, described in this section, need to be employed to compensate for the loss of supply voltage.

3.3.1 2-stage 3.3V OpAmp

In the high speed pipelined architecture, the two most severe requirements on the OpAmp are the DC gain and the settling time in the first stage. To achieve 10 bit resolution, a DC gain of over 60dB and settling to 0.1% within the available settling time (~11ns) are required. With minimum power dissipation in mind, it is difficult to design an OpAmp which meets all of the requirements above with just a few mW and a 3.3V supply voltage.
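To put these numbers in perspective, the following sketch estimates the closed loop time constant and the static gain error. It assumes ideal single-pole linear settling (no slewing) and a feedback factor of 0.5, which is an illustrative value rather than one taken from the design; the function names are hypothetical.

```python
import math


def time_constants_needed(settling_error):
    """Number of closed loop time constants needed for single-pole settling."""
    return math.log(1.0 / settling_error)


def max_time_constant(settling_time, settling_error):
    """Largest closed loop time constant that settles within the given time."""
    return settling_time / time_constants_needed(settling_error)


def static_gain_error(dc_gain_db, feedback_factor):
    """Relative closed loop error caused by finite OpAmp DC gain."""
    dc_gain = 10 ** (dc_gain_db / 20)
    return 1.0 / (1.0 + dc_gain * feedback_factor)
```

Under these assumptions, settling to 0.1% takes about 6.9 time constants, so an 11ns window leaves roughly 1.6ns per time constant, and a 60dB DC gain with a feedback factor of 0.5 gives a static error of about 0.2%.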

The easiest way to achieve high gain and high speed is with a telescopic OpAmp. The signal path consists only of NMOS transistors, and the power dissipation is small to the first order because there is only one current branch. However, with multiple devices stacked on top of each other, the output swing may be small. Although the folded cascode OpAmp can achieve the necessary output swing at a 3.3V supply voltage, the slow PMOS devices in the signal path degrade the speed of the OpAmp. It may also be difficult to achieve the DC gain because of the lower output resistance at the folding node.

To achieve high DC gain, a multi-stage OpAmp with only NMOS devices in the signal path can be used along with pole-splitting compensation. However, in an SC circuit configuration, the load capacitance at the output contributes the non-dominant pole and the OpAmp must also drive the compensation capacitor, so it is difficult to achieve the required bandwidth and settling time with a minimum amount of power dissipation. Therefore, an OpAmp with the output resistance of a cascode stage and no compensation capacitor is desirable to meet the DC gain and settling time requirements.

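One possible design is to use a cascode stage with a wide-bandwidth pre-amplifier, shown in Figure 3.4. The core of the amplifier consists of 2 NMOS and 2 PMOS cascodes, and the output resistance is, to the first order, determined by the parallel combination of the resistances looking into the NMOS cascode and the PMOS cascode.

When one of these resistances is much smaller than the other, it dominates the output resistance. Hence, the gain of the cascode stage is, to the first order, the input device transconductance multiplied by this output resistance.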

The wide-bandwidth pre-amplifier is designed to have a gain of about 2, which increases the safety margin for the OpAmp DC gain. The pre-amplifier is implemented with a differential pair input stage loaded with low impedance diode-connected NMOS devices to retain the wide bandwidth.
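A first-order estimate of the overall DC gain can be sketched as the pre-amplifier gain multiplied by the gain of the cascode stage. The model below uses the textbook approximation that a cascode raises the output resistance to about gm times the product of the two device output resistances; all names and numerical values are hypothetical and are not taken from the design.

```python
import math


def cascode_resistance(gm_cascode, ro_cascode, ro_source):
    """First-order resistance looking into a cascode (dominant term only)."""
    return gm_cascode * ro_cascode * ro_source


def parallel(r_a, r_b):
    """Parallel combination of two resistances."""
    return r_a * r_b / (r_a + r_b)


def opamp_dc_gain_db(preamp_gain, gm_input, r_nmos_cascode, r_pmos_cascode):
    """DC gain of pre-amplifier plus cascode stage, in dB."""
    r_out = parallel(r_nmos_cascode, r_pmos_cascode)
    return 20 * math.log10(preamp_gain * gm_input * r_out)
```

In this model, a pre-amplifier gain of 2 adds about 6dB to the gain of the cascode stage alone.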

This OpAmp has only NMOS devices in the signal path and a cascode output stage to achieve high output impedance. Furthermore, the pre-amplifier increases the effective transconductance and provides the required level shift into the input of the cascode stage. The dominant pole is at the output, so the output load capacitance also serves as the compensation capacitor. However, a non-dominant pole is introduced at the input of the core amplifier (transconductance stage). Therefore, to obtain the widest possible bandwidth and optimum settling time, the selection of the pre-amplifier gain is critical. If the gain is too small, the transconductance boost is negligible. If the gain is too large, the non-dominant pole is brought down and limits the bandwidth.

Common mode feedback of the amplifier uses two capacitors connected between the output nodes and the gate of the NMOS load in the pre-amplifier. This provides negative feedback around the transconductance stage to stabilize the output common mode while the OpAmp is active (also shown in Figure 3.4).

In the prototype, the pre-amplifier gain is chosen to be around 1.75. The gain boost amplifier of the previous design [3] is eliminated, mainly because the TSMC 0.6µm CMOS process used in this design has a reasonably high PMOS output resistance. The output resistance was verified by DC measurement of actual devices in the lab. Eliminating the gain boost amplifier further reduces the power dissipation of a given stage.

Running from a 3.3V supply voltage, the first stage is implemented as described above. The SC circuit, simulated with the OpAmp, settles to 0.1% in less than 11ns. The sampling capacitor for the first stage is about 350fF, which is determined by the noise requirement. The load capacitance contributed by the second stage sampling capacitors is about 150fF each, and that from the flash A/D and wiring parasitics is on the order of 100fF. The power dissipation is estimated to be about 2.9mW for the first stage, and a DC gain greater than 60dB is achieved.

3.3.2 Low Voltage SC Circuits

In switched capacitor circuits, the use of transmission gates to determine the timing of the sampling and charge transfer edges is essential. In a conventional CMOS process, the threshold voltage is about 0.7 - 0.8V and does not scale with the supply voltage. When the supply voltage is dropped to 3.3V, the threshold voltage becomes a larger portion of the supply voltage (the effective gate drive is reduced), hence the on-resistance of a CMOS switch increases. This is demonstrated in Figure 3.6.
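The effect of gate drive on the switch can be illustrated with a first-order triode model of an NMOS switch. The process transconductance, W/L ratio and threshold voltage used below are assumed round numbers, body effect is ignored, and the function name is hypothetical.

```python
def nmos_on_resistance(vgate, vsignal, vt=0.75, k_prime=100e-6, w_over_l=10.0):
    """First-order on-resistance of an NMOS switch passing the voltage vsignal.

    k_prime is the assumed process transconductance (A/V^2). The switch is
    treated as off (infinite resistance) when there is no gate overdrive.
    """
    overdrive = vgate - vsignal - vt
    if overdrive <= 0:
        return float("inf")
    return 1.0 / (k_prime * w_over_l * overdrive)
```

For a signal near 1.5V, this model gives an on-resistance with a 3.3V gate drive that is more than twice the value obtained with a 5V gate drive, and the switch stops conducting entirely once the signal approaches 3.3V minus the threshold voltage.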

One way to reduce the on-resistance is to use multiple NMOS devices (effectively increasing the width of the transmission gate); however, this also increases the parasitic capacitance associated with the switch, and the RC product stays about constant. The same result, with increased parasitic capacitance, is obtained when both NMOS and PMOS transistors are used. Another possible solution is to reduce the threshold voltage of the switches. However, this requires a few extra mask steps which are not available in a typical CMOS process. The only option left is to increase the gate drive to the traditional 5V, either with a DC-DC converter that creates a global 5V supply or with a clock booster that generates a 5V clock.

The approach chosen here is to create an individual clock booster circuit locally for each transmission gate rather than a global 5V supply for the clocks. This approach eliminates the possibility of cross-talk between different clock lines, which might be coupled through the supply in the global 5V supply case, and minimizes the number of circuits running from a 5V supply to reduce power dissipation. The approach is implemented with a dynamic charge pump circuit, shown in Figure 3.7. When a 3.3V clock input is applied, the two pump capacitors are self-charged to 3.3V through the cross-coupled NMOS transistors. When the input clock is low, the inverted clock pumps the voltage on the output-side capacitor above 3.3V, and the PMOS transistor turns on to transfer this voltage to the NMOS switch. The boosted voltage is set by charge sharing between the pump capacitor and the parasitic and switch gate capacitances it drives.

Transistor M2 is responsible for discharging the high voltage node to ground when the input clock is high. Hence, the charge pump circuit is inherently an inverting stage, and the 3.3V clock signal is converted to a 5V clock signal. Because the high voltage generated is close to 2Vdd and the analog common mode voltage is around 1-2 volts, only NMOS switches are necessary for the SC circuits in the pipeline.
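The boosted clock level can be estimated with a simple charge conservation sketch. It assumes that the pump capacitor and the parasitic capacitance on its top plate are precharged to Vdd, that the switch gate load starts at ground, and that the bottom plate is then stepped by Vdd; leakage and device drops are ignored. The capacitor values and the function name are hypothetical, and this simplified expression is not necessarily the one used in the original derivation.

```python
def boosted_clock_voltage(vdd, c_pump, c_parasitic=0.0, c_gate=0.0):
    """High level of the boosted clock from charge conservation.

    Initial charge on the boosted node: c_pump * vdd + c_parasitic * vdd.
    Final charge: c_pump * (v - vdd) + (c_parasitic + c_gate) * v.
    Equating the two and solving for v gives the expression below.
    """
    return vdd * (2 * c_pump + c_parasitic) / (c_pump + c_parasitic + c_gate)
```

With no parasitic or gate load the model returns exactly 2Vdd; with a load that is a modest fraction of the pump capacitor, the result falls to roughly 5V for a 3.3V supply, which is why the pump capacitor must be sized relative to the switch it drives.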

To avoid latch-up in the circuit, the well potential of the PMOS transistor needs to be at least as high as the boosted clock voltage. A high voltage generator, shown in Figure 3.8, is used for this purpose. By removing the discharging transistor from the circuit of Figure 3.7 and increasing the sizes of the charging capacitors, the output sustains a relatively high voltage as long as the input clock is applied. Lastly, because this is a 5 volt CMOS process, the reliability of charge pumping to ~5 volts is not an issue. However, when a smaller feature size is used, charge pumping should be used with caution.

3.4 Digital Correction Circuit

In the previous design [3], the raw digital bits from each stage were collected and the digital correction was performed in the PC after the output samples were taken. In order to observe the correct 10 bit output directly from the A/D converter, the digital correction circuit needs to be integrated on chip. This also allows real-time feedback for the gain control in the RF receiver application.

Since the stages in the pipeline are interleaved (when the odd stages are sampling, the even stages are evaluating), the outputs for a given input sample become available at half clock cycle intervals as the sample progresses down the pipeline. When the output from the first stage is ready, the output from the second stage will be ready half a clock cycle later, and so on. Therefore, the sampled input signal cannot be corrected until the last stage has finished the conversion.

Because of the modified coding, only addition is required in the digital correction. Figure 3.9 shows a conceptual diagram of the digital correction. The output from stage N is delayed by the appropriate number of clock cycles, so that it is aligned with the outputs of the later stages, before it is corrected. The correction is done by adding the output of stage N+1 to the output of stage N with one bit of overlap at the LSB.

The carry propagates in the direction of the MSB. The maximum code each stage can output is binary 10 after modified coding; with a full scale input applied, the A/D conversion after digital correction therefore gives 1023 codes (1 code short of the ideal 1024 codes for 10 bits). This generally does not create a problem in normal operation.
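The overlapped addition can be sketched in a few lines. The sketch assumes that the stage outputs have already been time-aligned by the delay blocks and that every one of the 9 stages, including the last, outputs a code of 0, 1 or 2 (binary 00, 01, 10), as stated above; the function name is hypothetical.

```python
def digital_correction(stage_codes):
    """Combine time-aligned stage codes into one output word.

    stage_codes lists the code of each stage, first stage first. Each later
    stage is added with one bit of overlap, which is the same as shifting the
    running sum left by one bit before adding the next code. The carry from
    the addition propagates toward the MSB.
    """
    result = 0
    for code in stage_codes:
        if code not in (0, 1, 2):
            raise ValueError("each stage code must be 0, 1 or 2")
        result = (result << 1) + code
    return result
```

For 9 stages the largest result is 2 x 511 = 1022, so the output spans codes 0 to 1022, which is the 1023 codes mentioned above.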

The delay block is implemented with a P/N transmission gate and a static inverter; the adder is a full adder from a standard digital library, which minimizes the design time. The silicon area of the digital correction is 340µm x 170µm and the power consumption is less than 3mW at 40MS/s from a 3.3V power supply.

3.5 Bias Circuits and Clock Generation

The final part of the A/D implementation is the bias circuits and clock generation. Because the OpAmps are scaled along the pipeline, the bias circuits need to be designed carefully to ensure that the OpAmp in each stage operates within the proper range. Clock generation is also critical, because the rise/fall edges and delays need to be carefully controlled to guarantee precise SC circuit operation. As the sampling rate increases (shorter clock period), the rise/fall time and jitter become a larger portion of the clock period.

Figure 3.10 shows the circuit used to bias the OpAmp. From the OpAmp schematic shown in Figure 3.4, two PMOS and one NMOS bias voltages are required for the cascode stage. The pre-amplifier needs an NMOS current source for the differential pair tail current, as well as the gate voltage for its load transistors. Because each OpAmp has a different size, a unit bias condition is chosen in the bias circuit, and it is scaled appropriately in each OpAmp by scaling the current mirrors. The DC characteristic of an ideal square law device is examined:

$$I_D = \frac{1}{2}\,\mu C_{ox}\,\frac{W}{L}\,(V_{GS}-V_t)^2 \quad\Longrightarrow\quad V_{GS}-V_t = \sqrt{\frac{2L}{\mu C_{ox}}\cdot\frac{I_D}{W}}$$

If minimum channel length is used (L fixed), $V_{GS}-V_t$ is only a function of $I_D/W$, which is the current density of a device. In other words, if the current density is fixed in the OpAmp, the bias condition for the OpAmp will be fixed. Therefore, 1µA/µm is chosen as the unit bias condition in the bias circuit, and the current and width are scaled together to keep the same current density for all stages.
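The consequence of a fixed current density can be checked with the ideal square law. The sketch below assumes a process transconductance of 100µA/V² and a 0.6µm channel length purely for illustration; the function name is hypothetical.

```python
import math


def overdrive_voltage(drain_current, width, length=0.6e-6, k_prime=100e-6):
    """Gate overdrive (Vgs - Vt) of an ideal square law device.

    From Id = 0.5 * k_prime * (W / L) * (Vgs - Vt) ** 2, the overdrive depends
    on the current and width only through the current density Id / W.
    """
    current_density = drain_current / width
    return math.sqrt(2 * length * current_density / k_prime)
```

In this model, a 20µm device carrying 20µA and a 100µm device carrying 100µA have the same overdrive, so one set of bias voltages can serve OpAmps of different sizes.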

The circuit is biased with a 20µA current source driving a chain of triode devices to generate the necessary bias voltage for the bottom current source. The diode-connected NMOS provides the bias voltage for the gate of the NMOS cascode device and tracks the process to the first order. A similar idea is replicated for the PMOS bias voltages. The input common mode voltage, used to reset the OpAmp, is generated in a similar fashion, whereas the output common mode voltage is controlled externally. Two identical bias circuits are used, each providing bias voltages to four stages in the pipeline, to reduce the noise coupling through the bias lines. Each bias line is heavily bypassed with large on-chip capacitors to reject any AC signal that might be present on the line.

Clock inputs are necessary to operate the SC circuits. Because multiple tasks take place during one clock cycle, the triggering edges of the clock signals need to be carefully controlled. Because the pipelined A/D performs conversion in an interleaved fashion, two sets of clocks are required, one for the even stages and one for the odd stages. Figure 3.11 shows the block diagram of the clock generator.

When a reference clock (CLK) is provided, the two output clock phases are 180 degrees out of phase. Each phase also has a delayed version, denoted by a prime ('), which is separated from it by six inverter delays; the delayed clock is used to trigger the digital logic, and the undelayed clock is the sampling clock. The small delay between the two ensures a quiet sampling edge when the input sample is taken. Four stages of buffers are added at each output to drive the wiring capacitance as well as the input capacitance of the charge pump circuits.

The rise and fall times were simulated to be less than 1ns for a 40MHz clock input. The delay between the sampling clock and its delayed version is about 1ns; hence, the sampling is finished before the clock edge triggers the logic circuit. The power consumption of the clock generator, including the buffers, is estimated to be less than 3mW.

3.6 Summary

This chapter describes the implementation of the power-optimized pipelined A/D architecture presented in Chapter 2. Taking advantage of the large offset tolerance on the comparators, dynamic comparators, which dissipate no DC power, are used. To avoid the overdesign that results from duplicating identical pipeline stages, the OpAmps and capacitors are scaled down the pipeline toward the noise limit. Since the goal of high integration causes the supply voltage for the A/D converter to decrease to 3.3V, a high gain, low voltage OpAmp is described. The OpAmp achieves the settling time and gain required for 10 bit, 40MS/s operation. Lastly, a charge pump circuit is shown which boosts the 3.3V clock pulses to 5V to reduce the on-resistance of the MOS switches.

The circuit is designed, laid out and fabricated in a 0.6µm CMOS DPTM process, and the experimental results will be shown in the next section.



This FrameMaker Document was converted to HTML by maker2html v1.0.
(This file was created: Thu May 30 17:22:38 PDT 1996 )