# Bio-inspired 0.35µm CMOS Time-to-Digital Converter with 29.3ps LSB

András Mozsáry, Faculty of Information Technology, Péter Pázmány Catholic University, Budapest, Hungary, mozsary@itk.ppke.hu

Jen-Feng Chung, Electrical and Control Engineering Department, National Chiao Tung University, Hsinchu, Taiwan

Angel Rodríguez-Vázquez, Anafocus, Sevilla, Spain

Tamás Roska, Faculty of Information Technology, Péter Pázmány Catholic University, Budapest, Hungary

Abstract—A Time-to-Digital Converter (TDC) integrated circuit is introduced in this paper. It is based on a chain of delay elements forming a regular, scalable structure. The scheme is analogous to the neural system for sound direction sensitivity found in the Barn Owl. The circuit occupies a small silicon area, and its direct mapping from time to position code makes conversion rates of up to 500Msps possible. A distinctive feature of the circuit is its structural and functional symmetry: the roles of the START and STOP signals are interchangeable. In other words, negative delay is acceptable, and the circuit has no dead-time problem. These benefits follow from the biological model of auditory scene representation in the bird's brain. The prototype chip is implemented in $0.35\mu$m CMOS and shows less than 30ps single-shot resolution in measurements.

## I. INTRODUCTION

Timing measurements at the picosecond level are required in many nuclear physics experiments. Time-to-Digital Converters (TDCs) are employed for elementary particle tracking and lifetime measurements (drift chambers, etc.). Another popular application for TDCs is 3D ranging and laser distance measurement, where 1 cm of distance corresponds to about 66ps of round-trip flight time.

Present-day TDCs use various configurations of delay chains, usually incorporated into Phase-Locked Loops (PLL) or Delay-Locked Loops (DLL) [1][2]. These are surrounded by sophisticated control circuitry and need additional chip inputs such as RESET and CLOCK. In contrast, the alternative proposed here employs delay elements in a flash-like conversion architecture that operates fully autonomously, with no pins other than the timing inputs. The idea was taken from biological models of auditory processing in the Barn Owl [3]. There are direction-sensitive neurons in the brain, which code the time difference of the sound reaching the two ears. The chip layout is simple and regular, like the anatomy of the Barn Owl. The layout designer's task was to replicate a 5-micron module 64 times, composing the layout of a 6-bit TDC. This property makes the architecture easy to migrate and scale. The symmetry of the neural pathway also gives this design functional symmetry: like the two identical ears, there are two identical chip input pins, 'IN1' and 'IN2', with no 'START' and 'STOP' labels distinguishing them. The circuit measures the delay between them and therefore has a symmetric input range from –950ps to +950ps. The problem of dead time does not exist here.

Prototypes were manufactured in a 0.35µm CMOS technology. A time resolution of 29.3ps was measured at a 100Msps conversion rate. However, this speed limit is imposed only by the weakness of the output pad drivers. According to simulations, the digital output code is generated inside the chip at a higher speed. This allows us to quote the highest rate that is meaningful for the 1.9ns input full-scale range, which is an unprecedented 500Msps.

Beyond its scalability, symmetry and speed advantages, a comparison with other TDC chips shows that the design uses area and power more efficiently to achieve accuracy and speed.
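The quoted figures can be cross-checked with a short calculation. The reading used here, that a conversion cannot meaningfully repeat faster than once per full-scale interval, is an interpretation of the text and not a measured result. The symbols $N$, $T_{LSB}$, $T_{FS}$ and $f_{max}$ are introduced only for this illustration.

$$
T_{FS} = N \cdot T_{LSB} = 64 \times 29.3\,\mathrm{ps} \approx 1.88\,\mathrm{ns} \approx 1.9\,\mathrm{ns},
\qquad
f_{max} \lesssim \frac{1}{T_{FS}} \approx \frac{1}{1.9\,\mathrm{ns}} \approx 526\,\mathrm{MHz}
$$

Rounded down, this gives the 500Msps figure, and half of $T_{FS}$ on either side of zero gives the symmetric input range of roughly ±950ps.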

## II. ARCHITECTURE

Neurobiology research addresses direction sensitivity as a submodality of auditory scene analysis. The direction (azimuth angle) of a sound is detected partly by calculating the difference of the path lengths from the single sound source to the two ears. There is a place in the brain stem of the Barn Owl, the Nucleus Laminaris (NL) [4], where signals coming from the two ears are compared (see figure 1). Magnocellular axons from both sides deliver the action potential to NL neurons, which act as coincidence detectors. NL neurons are positioned regularly in a single row. As the axon fiber has a finite propagation speed, it introduces an increasing delay to successive NL neurons. Therefore each NL cell represents a unit time step, and the activity of a certain neuron corresponds to a given Interaural Time Difference (ITD). In this way the brain forms a topographic code of the direction of the sound.

This work has been funded by the OTKA (Hungarian National Research Foundation) Doctoral School Framework, grant no. TS40858.

Figure 1. Jeffress model for encoding Interaural Time Differences. Nucleus Laminaris (NL) in owls maps time differences into place coding.
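The place coding in the Jeffress model of figure 1 can be written as a coincidence condition. This is a sketch of the idealized textbook model, not a description of the chip: it assumes $N$ coincidence cells indexed $k = 0, \dots, N-1$, two counter-propagating axons with the same delay $\tau$ per segment, and arrival times $t_1$ and $t_2$ at the two ends. All of these symbols are introduced here for illustration.

$$
t_1 + k\tau = t_2 + (N-1-k)\,\tau
\;\;\Longrightarrow\;\;
k^{*} = \frac{N-1}{2} + \frac{t_2 - t_1}{2\tau}
$$

Under these assumptions the active cell sits in the middle of the row for zero ITD, neighbouring cells differ by $2\tau$ in ITD, and the representable range is $|t_2 - t_1| \le (N-1)\,\tau$.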

In the integrated circuit, a delay chain implements the Magnocellular axon, and the NL neurons are replaced by logical AND gates. D-latches record the timing and store the result of the time-to-digital conversion. The system block diagram is given in figure 2. There are 64 identical channels, of which only three are drawn in the diagram for demonstration purposes. The circuit implementation of the elements follows the Source-Coupled Logic (SCL) style, aiming for higher accuracy (because of the differential structure) and speed (because of the bias current) as well as for better noise immunity. The operation principle is detailed in figure 3. The chip receives two rising-edge pulses on inputs IN1 and IN2. These propagate through their delay paths, and after coincidence the rising edge appears on the outputs of the AND gates as well. The middle point is reached earliest, and the gates on either side receive the signal progressively later. In terms of timing, the AND gate outputs form a triangular wavefront. This is the outcome of the spatio-temporal convolution operation that this architecture implements. The peak of this triangle shifts to the left or to the right as the IN1 input becomes delayed with respect to IN2, or vice versa. At the bottom, a thermometer code is formed, in which the logical high-low transition marks the position of the peak. This 64-bit level code, or thermometer code, would usually be transformed into a 6-bit digital output. However, for full testability, the prototype chip includes a readout multiplexer instead. A microphotograph is shown in figure 4, and Table 1 summarizes the chip data.

Figure 2. Block diagram of the implemented Time-to-Digital Converter architecture. Delay chains replace Magnocellular axons, AND gates act like coincidence detector NL neurons, D-latches formulate and store the output thermometer code.

Figure 3. Mapping from time to space via spatial convolution of input pulses. Transition in thermometer code corresponds to the peak of timing waveform.
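The time-to-position mapping of figures 2 and 3 can be illustrated with a small behavioural model. This is a minimal sketch, not the authors' circuit: it assumes two ideal counter-propagating delay chains, a per-tap delay of half an LSB (so that one code step equals 29.3ps), and a hypothetical latch rule in which bit k records whether the IN1 edge reached tap k before the IN2 edge. The function names are invented for this example.

```python
# Idealized model of the coincidence array: two counter-propagating delay
# chains tapped by N AND gates. All names and the latch rule are assumptions.
N_TAPS = 64            # number of channels (from the paper)
LSB_PS = 29.3          # measured conversion unit in ps (from the paper)
TAU_PS = LSB_PS / 2.0  # ASSUMPTION: per-tap delay; one code step is 2 * tau here


def and_gate_times(t1_ps, t2_ps, n=N_TAPS, tau=TAU_PS):
    """Rise time of each AND gate output: the later of the two edge arrivals."""
    return [max(t1_ps + k * tau, t2_ps + (n - 1 - k) * tau) for k in range(n)]


def thermometer_code(t1_ps, t2_ps, n=N_TAPS, tau=TAU_PS):
    """ASSUMED latch rule: bit k is 1 if the IN1 edge reaches tap k first."""
    return [1 if t1_ps + k * tau < t2_ps + (n - 1 - k) * tau else 0
            for k in range(n)]


def output_code(bits):
    """Count of ones, i.e. the transition position in a clean thermometer code."""
    return sum(bits)


if __name__ == "__main__":
    for delay_lsb in (-5, 0, 5):  # IN1 delayed relative to IN2, in LSB
        bits = thermometer_code(delay_lsb * LSB_PS, 0.0)
        print(delay_lsb, output_code(bits), "".join(map(str, bits)))
```

Under these assumptions, simultaneous edges give code 32, and delaying IN1 by 5 LSB moves the transition to code 27. The list returned by `and_gate_times` has its minimum at the same position, which is the peak of the triangular wavefront described above.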



Figure 4. Microphotograph of the Hypthree prototype chip.

| Hypthree Chip       |                          |
|---------------------|--------------------------|
| Function:           | TDC (time measurement)   |
| Sensitivity:        | LSB=29.3ps (single shot) |
| Resolution:         | 6 bit                    |
| Full Scale:         | 1.9ns                    |
| Accuracy:           | INL=77.9ps=2.7LSB (RMS)  |
| Conversion speed:   | < 500Msps                |
| Conversion Latency: | < 9ns                    |
| Technology:         | 0.35µm CMOS              |
| Manufactured:       | TSMC, year 2005          |
| Yield:              | 32.6%                    |
| Missing codes:      | 67.4%                    |
| Chip dimensions:    | 1139 µm x 1230 µm        |
| Active core size:   | 0.09 mm²                 |
| Power Consumption:  | 675mW @ 3Vdd             |
| Transistor count:   | 7678                     |

TABLE I. CHIP DATA

## III. MEASUREMENTS

The DATA and CLOCK output signals of a Tektronix PacketBERT PB200 generator were fed into the prototype chip.



Figure 5. Time-to-Digital Conversion characteristics measurement



Figure 6. Conversion Error, deviation from the ideal Best-fit straight line.

The transition points of each of the 64 channels are shown in figure 5. Results from eight prototype chips are collected in that single diagram. The slope of the best-fit regression line is 29.3 picoseconds per code, which is the conversion unit. The impact of fabrication process variation on this 29.3ps LSB was less than $\pm 5\%$, and 2.5% on average. The 6-bit word length covers a full scale of 1.9 ns.

Integral Nonlinearity (INL) characterizes the spread of the measured points around the ideal straight line. INL is shown in figure 6. The root-mean-square (RMS) average of the INL is 77.9ps, which is 2.7 times the LSB. Repeated measurements were 97% correlated, which corresponds to 18ps RMS jitter for the overall measurement setup. The measurement results are therefore well above the noise floor, and the observed 2.7LSB error is attributed solely to a matching problem. The design fault that caused this severe mismatch most likely originates from the AND gate or D-latch stages, as the spread appears to be uniformly distributed along the regression line. In addition, two-thirds of the channel outputs are stuck at logical high or low, which probably stems from the same problem.

The TDC core keeps functioning up to at least 100Msps. Beyond this speed the output pad buffer circuitry fails to drive the digital signal off chip. A latency of 9ns was measured from IC package input pin to IC package output pin. Most of this latency probably results from pad buffer delay as well. Therefore 100Msps is a worst-case estimate, and the typical value should be much better. The circuit is manufactured in a $0.35\mu m$ CMOS process and operates at a 3.0V supply. Power consumption is 675mW, which is due to the static bias of the SCL circuitry. There was no noticeable change in the supply current drawn up to the maximum 200Msps rate of the pulse generator.
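The two figures quoted above, the LSB as the slope of the regression line and the INL as the RMS deviation from it, can be computed as sketched below. This is an illustration of the standard least-squares procedure and not the authors' analysis script. The function names are invented here, and the demo data are hypothetical, not the measured transition points.

```python
import math


def fit_line(codes, times_ps):
    """Ordinary least-squares line through (code, transition time) points."""
    n = len(codes)
    mean_x = sum(codes) / n
    mean_y = sum(times_ps) / n
    sxx = sum((x - mean_x) ** 2 for x in codes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(codes, times_ps))
    slope = sxy / sxx  # ps per code, i.e. the LSB estimate
    return slope, mean_y - slope * mean_x


def inl_rms(codes, times_ps):
    """RMS deviation from the best-fit line, returned in ps and in LSB."""
    slope, intercept = fit_line(codes, times_ps)
    residuals = [y - (slope * x + intercept) for x, y in zip(codes, times_ps)]
    rms_ps = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    return rms_ps, rms_ps / slope


if __name__ == "__main__":
    # HYPOTHETICAL data: an ideal 29.3 ps/code line with an alternating
    # +/-20 ps error added to each transition point.
    codes = list(range(64))
    times = [29.3 * k - 937.6 + (20.0 if k % 2 else -20.0) for k in codes]
    slope, _ = fit_line(codes, times)
    rms_ps, rms_lsb = inl_rms(codes, times)
    print(f"LSB ~ {slope:.2f} ps, INL ~ {rms_ps:.1f} ps RMS = {rms_lsb:.2f} LSB")
```

Applied to the reported values, the same ratio gives 77.9ps / 29.3ps ≈ 2.7LSB.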

## IV. CONCLUSIONS

Table 2 summarizes key features of test chips representing different TDC architectures and techniques. Power consumption is higher than that of competitors, but this is the price paid for high speed and high accuracy: Figure-of-Merit-1 (FOM1) shows that the presented chip uses area and power efficiently to achieve accuracy and speed. The comparison is made on a fair basis: $\sigma$ is used instead of the LSB wherever it was reported to be larger than the LSB. A customary method for shrinking the LSB is to repeat measurements and average the results. The FOM expressions cancel this effect and treat single-shot data and repeated-measurement techniques on an equal footing. Scalability is incorporated in FOM2, where only architectures based on cyclic pulse-shrinking outperform the one presented here. However, as Table 2 shows, these architectures are thousands of times slower. The other high-speed competitor chips have much larger area.

Speed is a major advantage of the present bio-inspired architecture. However, not all applications require such a high conversion rate. The authors see a challenge in reducing supply power when moderate speed is the design target.
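The figures of merit can be reproduced from the Table 2 column headers, FOM1 = (P · Area · LSB²) / F_clk and FOM2 = FOM1 / 2^Nbit. The sketch below assumes the units that make the tabulated numbers come out (mW, mm², ps, Msps) and takes the larger of LSB and σ as the resolution, as described in the text. The function names are invented for this example.

```python
def fom1(power_mw, area_mm2, lsb_ps, rate_msps, sigma_ps=None):
    """FOM1 = P * Area * resolution^2 / rate, in mW * mm^2 * ps^2 / Msps.

    The resolution is sigma when it is reported and larger than the LSB.
    """
    resolution_ps = lsb_ps if sigma_ps is None else max(lsb_ps, sigma_ps)
    return power_mw * area_mm2 * resolution_ps ** 2 / rate_msps


def fom2(fom1_value, interpolator_length):
    """FOM2 = FOM1 divided by the interpolator length (2**Nbit)."""
    return fom1_value / interpolator_length


if __name__ == "__main__":
    # Inputs taken from Table 2: the present chip and reference [5].
    present = fom1(675, 0.09, 30, 500, sigma_ps=78)
    nested_dll = fom1(100, 3.1 * 2.2, 92, 85)
    print(round(present), round(fom2(present, 2 ** 6), 1))
    print(round(nested_dll))
```

With these inputs the sketch gives 739 and 11.6 for the present chip and 67 911 for the nested-DLL design of [5], matching the tabulated entries.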

## ACKNOWLEDGMENT

The first author thanks Prof. Chin-Teng Lin at National Chiao Tung University for hosting him for one academic year and for providing the chip implementation framework. The chip was manufactured via the Chip Implementation Center (CIC), Taiwan. The measurement laboratory facilities provided by Integration Hungary Ltd were indispensable for this project.

## REFERENCES

- [1] J. Christiansen, "An integrated CMOS 0.15 ns digital timing generator for TDCs and clock distribution systems," IEEE Transactions on Nuclear Science, vol. 42, issue 4, pp. 753-757, Aug. 1995
- [2] M. S. Gorbics, J. Kelly, K. M. Roberts, R. L. Sumner, "A high resolution multihit time to digital converter integrated circuit," IEEE Nuclear Science Symposium, 1996. Conference Record. vol. 1, pp. 421-425. 02-09 Nov. 1996
- [3] C. E. Carr and M. Konishi, "A circuit for detection of interaural time differences in the brain stem of the barn owl," J. Neuroscience vol. 10, pp. 3227-3246, 1990
- [4] K. Lotz, L. Bölöni, T. Roska and J. Hámori, "Hyperacuity in Time: A CNN Model of a Time-Coding Pathway of Sound Localization," IEEE Transactions on Circuits and Systems - I: Fundamental Theory and Applications, vol. 46, no. 8, pp. 994-1002, Aug. 1999
- [5] A. Mantyniemi, T. Rahkonen, J. Kostamovaara, "A High Resolution Digital CMOS Time-to-Digital Converter Based on Nested Delay Locked Loops," IEEE International Symposium on Circuits and Systems, ISCAS'99 proceedings vol. 2, pp. 537-540, 1999
- [6] E. Raisanen-Ruotsalainen, T. Rahkonen, J. Kostamovaara, "A Low Power CMOS Time-to-Digital Converter," IEEE Journal of Solid-State Circuits, vol. 30. no 9, pp. 984-990, 1995
- [7] P. Chen, S. Liu, J. Wu, "A CMOS Pulse-Shrinking Delay element For Time Interval Measurement," IEEE Trans. on Circuits and Systems-II vol. 47, no 9, 2000.
- [8] S. Tisa, A. Lotito, A. Giudice and F. Zappa, "Monolithic Time-to-Digital Converter with 20ps resolution," European Solid-State Circuits Conference ESSCIRC'03 pp. 465-468, 2003
- [9] I. Nissinen, A. Mantyniemi, J. Kostamovaara, "A CMOS Time-to-Digital Converter based on a Ring Oscillator for a Laser Radar," European Solid-State Circuits Conference ESSCIRC'03 pp. 469-470, 2003
- [10] P. Dudek, S. Szczepanski, J.V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," IEEE Journal of Solid-State Circuits, vol. 35, pp. 240-247, Feb. 2000

TABLE II. COMPARISON TO OTHER REPORTED TIME-TO-DIGITAL CONVERTER CHIPS

| Ref   | Architecture                      | Accuracy<br>LSB  | Conversion<br>speed | Interpolator<br>length,<br>resolution | Supply<br>Power     | CMOS Technology,<br>silicon area                      | Figure of Merit-1<br>(P Area LSB <sup>2</sup> )/ F <sub>clk</sub> | Figure of Merit-2<br>FOM1 / 2 <sup>Nbit</sup> |
|-------|-----------------------------------|------------------|---------------------|---------------------------------------|---------------------|-------------------------------------------------------|-------------------------------------------------------------------|-----------------------------------------------|
| [5]   | Nested DLLs                       | 92ps             | 85Msps              | 128                                   | 100mW,<br>5Vdd      | 0.8um,<br>3.1 x 2.2 mm                                | 67 911                                                            | 530                                           |
| [6]   | Linear pulse<br>shrinking         | 780ps            | 20Msps              | 64                                    | 15mW,<br>5Vdd       | 1.2um,<br>2.9 x 2.5 mm                                | 3 308 175                                                         | 51 690                                        |
| [7]   | Cyclic pulse<br>shrinking         | 68ps             | 100ksps             | 200                                   | 1.2mW,<br>3.3Vdd    | 0.35um (L=1um),<br>core: 0.35 x 0.09 mm               | 1 747                                                             | 8.7                                           |
| [8]   | Cyclic pulse<br>shrinking         | 20ps<br>(σ=76ps) | 50ksps              | 10 bit                                | estimated!<br>1mW   | 0.8um, estimated<br>core size 0.08mm <sup>2</sup>     | 9 241                                                             | 10.4                                          |
| [9]   | Cyclic delay line                 | 156ps            | 400Msps             | 16                                    | 72mW,<br>3Vdd       | 0.35, 1.81x1.81 mm,<br>TDC core: 0.238mm <sup>2</sup> | 1 042                                                             | 65                                            |
| [10]  | differential delay<br>lines + DLL | 30ps             | 260Msps             | 128                                   | estimated!<br>100mW | 0.7um<br>3.2 x 3.1 mm                                 | 3 433                                                             | 27                                            |
| pres. | Nucleus Laminaris                 | 30ps<br>(σ=78ps) | <500Msps            | 6 bit                                 | 675mW<br>3Vdd       | 0.35um<br>core: 0.09mm <sup>2</sup>                   | 739                                                               | 11.6                                          |

The lower the Figure-of-Merit, the better the design.