Pipelined Analog-to-Digital Architecture
In this chapter, after a brief introduction to the evolution of the pipelined A/D converter architecture, the power-optimized pipelined A/D converter architecture is described. The focus of this chapter is on power optimization techniques at the architectural level, such as the choice of stage resolution, capacitor scaling and digital correction. The design of high-speed, low-power A/D converter architectures has been investigated in detail in several publications [3][12][14] and is not covered in this thesis.
2.2 Evolution of Pipelined A/D Architecture
Since the advent of digital signal processing, A/D converters have played a very important role as the interface between the analog and digital worlds. They digitize analog signals at a fixed time interval; the corresponding sampling frequency sets the speed of the A/D converter. The sampling interval is generally specified by the application. One typical constraint is the Nyquist Sampling Theorem, which states:
A bandlimited signal having no spectral components above $f_m$ Hz can be determined uniquely by values sampled at uniform intervals of $T_s$ seconds, where $T_s \le \frac{1}{2 f_m}$.
This condition must hold in order to reconstruct the original analog signal completely. Algorithms can be implemented very inexpensively in the digital domain, and if the acquired samples satisfy the Nyquist Sampling Theorem, signals can be reconstructed perfectly after digital signal processing. Hence, the A/D converter acts as a bridge between the two domains, and its accuracy is critical to the performance of the system.
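The sampling condition can be illustrated with a short Python sketch. The function names are hypothetical, and the aliasing formula assumes an ideal uniform sampler observing a single real tone.

```python
def max_sampling_interval(f_max_hz: float) -> float:
    """Largest uniform sampling interval Ts (seconds) allowed by the Nyquist criterion."""
    return 1.0 / (2.0 * f_max_hz)


def alias_frequency(f_hz: float, fs_hz: float) -> float:
    """Apparent frequency (Hz) of a real tone at f_hz after sampling at fs_hz."""
    # Fold the tone back into the band [0, fs/2].
    return abs(f_hz - fs_hz * round(f_hz / fs_hz))


# A 3 kHz tone sampled at 10 kS/s is preserved; a 7 kHz tone violates the
# condition and becomes indistinguishable from a 3 kHz tone.
print(alias_frequency(3e3, 10e3), alias_frequency(7e3, 10e3))
```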
The most straightforward way to perform A/D conversion is to compare the sampled analog signal with a set of different reference levels. Figure 2.1 shows this approach, the flash architecture.
The advantage of this architecture is its fast conversion rate. For low-resolution applications, one can achieve a conversion rate above 100 MS/s with the flash architecture. The latency through the converter is only one clock cycle, so for applications that require data immediately, e.g. in a feedback/feedforward loop, a flash converter is generally the architecture of choice. On the other hand, its low tolerance to process-induced comparator offset (hence low resolution) and its high power dissipation, due to the large number of precision circuits, drive designers to look for alternatives to flash converters.
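A behavioral sketch of a flash converter in Python may make the comparator count and the offset sensitivity concrete. It assumes a unipolar input range of 0 to `v_ref`, uniformly spaced thresholds, and a thermometer-to-binary encoder that simply counts the comparators that trip; the function name and the `offsets` parameter are illustrative only.

```python
def flash_adc(v_in, n_bits, v_ref=1.0, offsets=None):
    """Behavioral N-bit flash ADC for v_in in [0, v_ref)."""
    n_comp = 2 ** n_bits - 1            # one comparator per reference level
    lsb = v_ref / 2 ** n_bits
    if offsets is None:
        offsets = [0.0] * n_comp        # ideal comparators
    # Reference levels at 1 LSB, 2 LSB, ..., each shifted by its comparator offset.
    thresholds = [(k + 1) * lsb + offsets[k] for k in range(n_comp)]
    # Thermometer code: the output is the number of comparators that trip.
    return sum(1 for t in thresholds if v_in >= t)


print(flash_adc(0.3, 3))   # 3-bit conversion uses 7 comparators
```

Every comparator offset here moves a code transition directly, which is why each of the 2^N - 1 comparators must be accurate to the full resolution.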
2.2.2 2-step Flash Architecture
The large number of low-offset comparators required is the major problem in flash converters. Figure 2.2 shows the 2-step flash architecture, which reduces this number. As in flash converters, the analog input is first sampled by an S/H circuit; during the hold period, the first flash ADC performs a coarse quantization of the held signal. The DAC output is then subtracted from the held signal, and the residue of the subtraction is passed on for fine quantization to the full resolution of the converter. Although this architecture still requires low-offset comparators accurate to the full resolution of the converter, the number of low-offset comparators required is reduced significantly. Since only coarse comparators are required in the first half of the converter, the total number of comparators is also reduced. By using concurrent processing, this architecture can sustain the same throughput as a flash A/D converter. However, the converted outputs have a latency of two clock cycles because of the extra stage used to reduce the number of precision comparators.
The advantage of this architecture is its low count of precision comparators, and hence lower power. The throughput is the same as that of flash converters because of concurrent processing of signals; however, an extra clock cycle of latency is incurred because two steps are needed to complete the conversion. If the system can tolerate latency in the converted signal, the 2-step flash is a lower-power, smaller-area alternative. The disadvantage is that both the subtraction and the precision comparators still need to be accurate to the full resolution of the A/D converter. As mentioned earlier, it is very difficult to achieve resolution above 8 bits in CMOS without special techniques to compensate for the offset. The subtraction accuracy can be relaxed by using a wider range of precision comparators in the second stage, i.e. digital correction (described later in this chapter). Interstage gain can also be used to tolerate larger comparator offsets in the second-stage precision comparators.
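The coarse/fine operation described above can be sketched in Python as follows. This is an idealized model under stated assumptions: a unipolar input range of 0 to `v_ref`, ideal sub-ADCs and DAC, and no interstage gain (the fine quantizer works directly on the small residue). The helper names are hypothetical.

```python
def quantize(v, n_bits, v_fs):
    """Ideal n_bits quantizer for v in [0, v_fs); the code is clipped to range."""
    code = int(v / v_fs * 2 ** n_bits)
    return max(0, min(2 ** n_bits - 1, code))


def two_step_flash(v_in, n_coarse, n_fine, v_ref=1.0):
    """Coarse flash -> DAC -> subtract -> fine flash on the residue."""
    coarse = quantize(v_in, n_coarse, v_ref)
    dac_out = coarse * v_ref / 2 ** n_coarse
    residue = v_in - dac_out                     # must be accurate to full resolution
    fine = quantize(residue, n_fine, v_ref / 2 ** n_coarse)
    return (coarse << n_fine) | fine


def comparator_counts(n_coarse, n_fine):
    """Comparators needed by a full flash versus a 2-step flash of equal resolution."""
    full_flash = 2 ** (n_coarse + n_fine) - 1
    two_step = (2 ** n_coarse - 1) + (2 ** n_fine - 1)
    return full_flash, two_step


print(two_step_flash(0.3, 4, 4), comparator_counts(4, 4))
```

For an 8-bit converter split as 4 + 4 bits, the count drops from 255 comparators to 30 in this model.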
2.2.3 Conventional Pipelined A/D Architecture
In the 2-step flash converter, an interstage gain amplifier can be used to relax the comparator offset requirement in the second stage: if the subtracted residue signal from the first stage is amplified to full scale, the offset requirement of the second-stage comparators is relaxed. Figure 2.3 shows a 2-step flash converter with an interstage gain, A. However, the interstage gain needs to be carefully designed according to the first-stage resolution and the overall A/D resolution. For example, consider a 10-bit 2-step flash A/D converter that uses a switched-capacitor (SC) interstage amplifier. The amplifier is required to settle to 10-bit accuracy in roughly half the clock period while providing the large gain set by the first-stage resolution. Meeting this requirement in an SC circuit at high speed is very difficult and may take a lot of power, mainly because of the small feedback factor.
In order to reduce the power even more, one can reduce the per-stage resolution and cascade more stages to get the full resolution. This particular architecture is called the Pipelined architecture, mainly because the analog input signal is passed through a pipeline of flash A/D (sub-ADC) and interstage gain blocks.
The advantage of this architecture is its reduced complexity. With a given per-stage resolution, an A/D converter of a given resolution can be achieved by cascading an appropriate number of identical pipelined stages. Therefore, the hardware cost is a linear function of resolution, given that all the requirements are met. Some capacitor trimming techniques may be required to correct for the SC circuit gain and non-ideal subtraction (capacitor mismatch). With concurrent processing (interleaving between stages), the throughput achieved is the same as in the flash case: one set of output bits per clock cycle. The major disadvantage of this architecture is the latency of the converter. Generally, if concurrent/interleaved processing is used, the delay through the converter, in clock cycles, grows in proportion to the number of stages.
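The cascade of identical stages can be modeled behaviorally in a few lines of Python. The sketch below assumes ideal sub-ADCs, sub-DACs and amplifiers, a unipolar input range of 0 to `v_ref`, and an interstage gain of 2^B for a B-bit stage; it models only the arithmetic of the conversion, not the clocking or the latency.

```python
def pipeline_adc(v_in, n_stages, bits_per_stage, v_ref=1.0):
    """Ideal pipeline of identical stages: quantize, subtract, amplify, repeat."""
    levels = 2 ** bits_per_stage
    residue = v_in
    code = 0
    for _ in range(n_stages):
        # Sub-ADC: coarse decision on the current residue (clipped to range).
        d = min(levels - 1, max(0, int(residue / v_ref * levels)))
        code = code * levels + d
        # Sub-DAC subtraction, then interstage gain back up to full scale.
        residue = (residue - d * v_ref / levels) * levels
    return code


# 10-bit converter built from five 2-bit stages.
print(pipeline_adc(0.3, n_stages=5, bits_per_stage=2))
```

Because every stage is the same block, changing the resolution only changes `n_stages`, which reflects the linear hardware cost noted above.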
Of the architectures discussed so far, the conventional pipelined architecture offers the most flexibility and requires the fewest precision components that must be accurate to the full resolution of the converter. In the next section, an attempt to optimize the power and eliminate precision circuit components is introduced.
2.3 Power-Optimized Pipelined A/D Architecture
To reduce the power, the trade-off between per-stage resolution and number of stages is investigated. A capacitor scaling method is then described that removes the power wasted by overdesign in the later stages of a conventional pipelined architecture. Lastly, digital correction is introduced to replace precision comparators with inexpensive, low-power digital circuits.
2.3.1 Power-Optimized Per-Stage Resolution
In the conventional pipelined A/D architecture, the trade-off between the per-stage resolution and power is not clear. For a given sampling rate, when increasing the per-stage resolution, the required number of stages is reduced; however each stage will require more power because of multiple bits. When decreasing the per-stage resolution, the required number of stages is increased; however each stage will require less power. Below is an attempt to estimate the power for the conventional pipelined A/D architecture with different per-stage resolution.
In the conventional pipelined A/D converter, each stage is identical and performs the same function. The total power can therefore be found by taking the power per stage and multiplying it by the number of stages. The majority of the power in a stage is dissipated in the SC S/H/gain circuit, and the SC power can be estimated as follows.
where
is the input capacitance of the flash ADC of the next stage. The total number of stages is roughly the full resolution of the converter divided by B, the per-stage resolution. The number of comparators required in each stage is about
.
Knowing the load capacitance and the interstage gain of each stage, the power dissipation per stage can be estimated as a function of B, the per-stage resolution. The result of this power estimation was shown previously in [3]. It was concluded that for 8 and 10-bit applications, as the sampling frequency increases, the interstage amplifier bandwidth must be increased to meet the faster settling time requirement. The power difference between the 8 and 10-bit curves is much smaller than the $kT/C$ prediction for a given sampling frequency, because noise is not limiting the performance at these resolutions. In the higher-resolution cases (12 or 14-bit), by contrast, the power increase follows the $kT/C$ prediction of a 16x increase in power for a 2-bit increase in resolution.
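The 16x figure can be checked numerically with a small Python sketch. It assumes that the prediction referred to is the kT/C sampling-noise limit, and it uses one common sizing criterion (an assumption here, not taken from the text): the sampling capacitor is chosen so that the kT/C noise power equals the quantization noise power LSB^2/12. The full-scale voltage and temperature are illustrative.

```python
BOLTZMANN = 1.380649e-23  # J/K


def min_sampling_cap(n_bits, v_fs=1.0, temp_k=300.0):
    """Capacitance (F) at which kT/C noise power equals LSB^2 / 12."""
    lsb = v_fs / 2 ** n_bits
    return 12.0 * BOLTZMANN * temp_k / lsb ** 2


# Each extra bit halves the LSB and therefore quadruples the capacitance;
# two extra bits give a factor of 16.
ratio = min_sampling_cap(14) / min_sampling_cap(12)
print(ratio)
```

If the amplifier power is taken to scale with the capacitance it must drive at a fixed settling time, the same factor of 16 carries over to power in the noise-limited regime.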
For higher resolution (12 or 14-bit), the curves for B=1 and B=2 coincide with each other, meaning the power dissipation is roughly equal for both cases. Inherently, the B=1 case is more suitable for high conversion rates because of its large feedback factor; the B=2 case is preferred for low speed since its larger interstage gain attenuates the noise from later stages when referred to the input. Therefore, in the B=1 case the power savings from the large feedback factor are roughly cancelled by the larger capacitance required for noise performance, whereas in the B=2 case greater OpAmp power is required to compensate for the smaller feedback factor.
From the above discussion, a simple conclusion can be drawn: at low sampling frequencies, the minimum-size OpAmp is sufficient to meet both the noise and settling requirements. Therefore, a larger per-stage resolution, hence fewer stages, is preferred for low power. On the other hand, at high sampling frequencies near the limit of the technology, a lower per-stage resolution (hence low closed-loop gain) with a smaller load capacitance is more suitable for low power.
2.3.2 Power-Optimized Capacitor Scaling
If we re-examine the conventional pipelined A/D architecture more closely, the performance of the first stage is found to be the most critical. Not only do the comparators, interstage gain and subtraction need to be accurate to the full resolution of the converter, but the $kT/C$ noise from sampling (Appendix C) also deserves attention in high-resolution applications. Since the equivalent input-referred noise contribution from subsequent stages is attenuated by the interstage gain of all previous stages, the noise contribution from the first stage (or first S/H circuit) determines the A/D performance in a noise-limited pipelined architecture. And because the input-referred noise of later stages is attenuated by the interstage gain, the effective resolution requirement for each stage decreases as the sampled analog signal travels down the pipeline.
Since the noise contribution comes mostly from capacitor sampling in the first stage, the capacitors can be scaled down for later stages in the pipeline. For example, in a 10-bit pipelined A/D converter with 5 stages (2 bits/stage), the noise performance of the first stage needs to be at the 10-bit level, whereas the second stage only needs to meet a noise performance of roughly 8 bits. The reduced noise requirement translates into reduced capacitor sizes, and hence a smaller OpAmp for a power-optimized solution.
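The effect of scaling can be sketched numerically in Python. The model below is a simplification under stated assumptions: each stage contributes only kT/C sampling noise, every stage has the same interstage gain (4 is used for the 2 bits/stage example, i.e. no gain reduction for digital correction), and the capacitors are scaled down by the interstage gain from one stage to the next. The scaling factor and the 1 pF first-stage capacitor are illustrative choices, not values from the text.

```python
BOLTZMANN = 1.380649e-23  # J/K


def input_referred_noise_power(caps, gain, temp_k=300.0):
    """Total kT/C noise power (V^2) of all stages, referred to the converter input."""
    total = 0.0
    gain_before = 1.0                      # cumulative gain ahead of this stage
    for c in caps:
        total += BOLTZMANN * temp_k / c / gain_before ** 2
        gain_before *= gain
    return total


GAIN, STAGES, C1 = 4.0, 5, 1e-12
uniform = [C1] * STAGES                            # conventional: identical stages
scaled = [C1 / GAIN ** i for i in range(STAGES)]   # capacitor scaling

noise_ratio = (input_referred_noise_power(scaled, GAIN)
               / input_referred_noise_power(uniform, GAIN))
cap_ratio = sum(uniform) / sum(scaled)
print(noise_ratio, cap_ratio)
```

In this model the total capacitance drops by nearly a factor of four while the input-referred noise power rises by about 25%, which illustrates why the later stages of the conventional design are overdesigned.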
In conclusion, a pipelined architecture employing capacitor scaling consumes much less power than the conventional pipelined architecture. It is again found that, for high sampling rates, a low per-stage resolution (B=1) is more desirable than a high per-stage resolution (B=2). However, the disadvantage of this architecture is the increase in design time. To achieve the minimum power dissipation, each stage needs to be optimized carefully for a given noise limitation, whereas in the conventional case the stages can be duplicated to shorten the design cycle.
In the last two sections, methods to reduce power through the proper choice of per-stage resolution and capacitor scaling were shown. However, a closer examination of the first stage shows that a precision flash ADC, subtraction and interstage amplifier accurate to the full resolution of the converter are still required. As demonstrated earlier, in CMOS it is extremely difficult to design an A/D converter to 10-bit resolution without calibration; therefore, some correction algorithm is required for a robust design. In this section, digital correction is introduced in an attempt to relax the comparator offset requirement.
In order to illustrate the algorithm behind digital correction, a 2-bit pipeline stage is presented here as an example. When a sub-ADC threshold is in error, the output of the first stage can exceed the input range of the second stage, as shown in Figure 2.10. This will saturate the second stage and cause information to be lost. To eliminate this problem, one can increase the range of the second-stage sub-ADC or, equivalently, reduce the interstage gain of the first stage to tolerate sub-ADC error.
When the interstage gain is reduced to 2, the transfer function becomes that of Figure 2.11, and the output remains within the input range of the following stage. However, when a sub-ADC error is present and digital correction is not used, the error appears in the final digital output. In other words, without digital correction the first-stage sub-ADC must still be as linear as the entire converter, whereas for later stages the requirements are relaxed because of the interstage gain. Now assume the first stage is ideal: with a full-scale input to the first stage, the output spans only the middle half of the following stage's input range, leaving an extra bit of range at the top and bottom beyond the per-stage resolution. Digital correction simply uses this extra range to correct overranging from the previous stage.
For example, when one of the sub-ADC thresholds has an offset, the output of the first stage will exceed the upper end of its nominal range. The second stage, sensing the overranging, will increase its output by one LSB, which causes the first-stage output code to be incremented by one LSB during the digital correction cycle. In the same way, when the output of the first stage drops below the lower end of its nominal range, the second stage will sense the overranging and one LSB is subtracted during the digital correction cycle. With this method, a sub-ADC error in a stage can be corrected by the following stage, provided the resulting residue stays within the input range of that following stage.
With the above digital correction algorithm, both addition and subtraction must be present in the digital correction circuit, which complicates the code assignment for the pipeline stage. Subtraction can be eliminated by intentionally adding a fixed offset to both the sub-ADC and the output of the sub-DAC. A conceptual block diagram and transfer function are shown in Figure 2.12. With this offset, sub-ADC errors can still be tolerated, and the digital correction circuit is modified to contain adders only.
Since overranging in the transfer function can be detected by the next stage, the design can be simplified even further by eliminating one of the comparators. The final block diagram and transfer function are shown in Figure 2.13. The sub-ADC uses two comparator thresholds, and the sub-DAC has three levels, one of which is 0. The codes are shown on top of the transfer function, and the overranging part of the transfer function is digitally corrected by the next stage, except in the last stage of the pipeline. The 1.5-bit ADC and DAC here represent the effective bits per stage after digital correction [5].
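The adder-only correction can be demonstrated with a behavioral Python sketch of a 1.5-bit-per-stage pipeline. The numerical values are assumptions, since the levels in the figures are not reproduced here: an input range of -`v_ref` to +`v_ref`, comparator thresholds at ±`v_ref`/4, an interstage gain of 2, sub-DAC levels corresponding to -`v_ref`, 0 and +`v_ref` after the gain, and a final 2-bit flash stage. These are the values commonly used for such stages; the function names are hypothetical.

```python
def stage_1p5bit(v, v_ref=1.0, off_lo=0.0, off_hi=0.0):
    """One 1.5-bit stage: returns (digit in {0, 1, 2}, amplified residue)."""
    t_lo = -v_ref / 4 + off_lo            # comparator thresholds with offsets
    t_hi = +v_ref / 4 + off_hi
    d = int(v >= t_lo) + int(v >= t_hi)   # two comparators, three codes
    residue = 2.0 * v - (d - 1) * v_ref   # gain of 2, sub-DAC subtraction
    return d, residue


def pipeline_1p5bit(v_in, n_stages=8, v_ref=1.0, offsets=None):
    """n_stages 1.5-bit stages plus a 2-bit flash; returns an (n_stages + 2)-bit code."""
    if offsets is None:
        offsets = [(0.0, 0.0)] * n_stages
    code, v = 0, v_in
    for i in range(n_stages):
        d, v = stage_1p5bit(v, v_ref, *offsets[i])
        code = 2 * code + d               # overlapped addition: adders only
    d_last = sum(int(v >= t) for t in (-v_ref / 2, 0.0, v_ref / 2))
    return 2 * code + d_last


# Comparator offsets of up to 0.2 * v_ref in every stage leave the codes unchanged.
bad = [(0.2, -0.2), (-0.15, 0.1)] * 4
lsb = 2.0 / 1024
codes = [pipeline_1p5bit(-1.0 + (k + 0.5) * lsb, offsets=bad) for k in range(1024)]
print(codes == list(range(1024)))
```

In this model any comparator offset smaller than `v_ref`/4 keeps the residue inside the next stage's input range, so the overlapped addition recovers the ideal code without precision comparators.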
In this section, the evolution of the pipelined A/D converter was presented, and a power-optimized pipelined A/D converter architecture was described in detail. The low-power techniques include the choice of per-stage resolution, capacitor scaling and digital correction. It was found that for high conversion rates, a low per-stage resolution (hence low closed-loop gain) is more desirable. With capacitor scaling and digital correction, each stage in the pipeline can be designed according to its noise limitation, thereby reducing power dissipation.
With the above techniques, a CMOS implementation of such a power-optimized pipelined A/D converter will be presented. Some practical issues on circuit design will also be discussed.