# A 28-GHz 7.3-mW/Element Beamforming Receiver with On-Chip LO Synthesis

Sean Wang and Behzad Razavi

Electrical and Computer Engineering Department, University of California, Los Angeles, CA 90095, USA seanwang22@ucla.edu

Abstract—An 8-element receiver employs per-element LO synthesis along with new stacking and phase shift techniques. Realized in 28-nm CMOS technology, the 8-element receiver draws 58 mW, including LO synthesis and distribution. Each receiver element achieves 35-dB conversion gain, 4.8-dB minimum noise figure, and an average phase resolution of 11.5 to 15.6 degrees.

Keywords—5G RX, Adler's equation, beamforming, injection pulling

## I. INTRODUCTION

5G millimeter-wave radios promise high data rates [1]–[3] while facing competition from Wi-Fi, at least over short distances. Beamforming gives these radios a significant advantage at longer range, but such a scenario becomes attractive only if the *total* power consumption and cost fall below those of Wi-Fi. This paper describes an 8-element beamforming receiver that benefits from the lowest power and smallest area reported to date. This performance derives from several new architecture and circuit techniques.

## II. ARCHITECTURE

Shown in Fig. 1 is the high-level view of the 8 elements. Our first approach to reducing power is based on "per-element" LO synthesis, i.e., one dedicated PLL in each element. Such a method obviates the need for distributing high-frequency waveforms, thus saving power-hungry buffers, splitters, etc. This proposition proves viable because even in a modest 28-nm process, we can realize a 28-GHz PLL drawing only 0.6 mW, excluding the VCO. As explained below, the VCO power is reused.

As depicted in Fig. 1, each element incorporates a direct-conversion RX with a 28-GHz PLL. Our second approach is to avoid a 56-GHz VCO and ÷2 circuit for I/Q generation; rather, a DLL-based I/Q splitter performs the task while consuming 3 mW. Noted in the architecture of Fig. 1 is also a new phase control method for beamforming that can achieve a wide range and a high resolution with negligible power (Section IV). The red connection between the LNA and the PLL signifies another new scheme for power reduction and is described below.

## III. NEW STACKING TECHNIQUE

The LNA and the 28-GHz VCO are among the most power-hungry blocks in the receiver. We naturally ask whether they can share the same bias current. As proposed in Fig. 2(a), we stack the two circuits, with $C_0$ establishing an ac ground.



Fig. 1. High-level and per-element receiver architecture.

This topology requires a minimum supply equal to two gate-source voltages, lending itself to any environment that supports CMOS inverters. The dc level at X is about $V_{DD}/2$, a value that maximizes the tuning range of the varactors. Despite the ac ground at X, the LO does leak to the LNA input: as illustrated by the floorplan of Fig. 2(b), the symmetric placement of $L_{VCO}$ with respect to $L_S$ and $L_G$ introduces a leakage of about -30 dBm to the antenna, yielding a dc offset of 185 mV at the output of the mixers. We reduce the leakage by nearly 20 dB through the use of $R_1$, $R_2$, and $C_1$ (Fig. 2(b)), which attenuate and shift the phase of the LO to cancel the original LO coupling through the inductors and the substrate. The cost is a 0.1-dB degradation in the RX noise figure. The absence of a cascode device in the LNA raises instability concerns. We propose a new technique: by allowing mutual coupling between $L_G$ and $L_D$, with proper polarity, we guarantee a stability factor of 1.8. The VCO depicted in Fig. 2(a) provides two control inputs; their roles are described next.
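The leakage cancellation described above can be pictured as the sum of two phasors: the original leakage and an injected replica of nearly equal amplitude and opposite phase. The sketch below is an idealized illustration, not a model of $R_1$, $R_2$, and $C_1$; it shows how amplitude and phase errors in the replica limit the achievable suppression. The function name and the error values are assumptions made for this example.

```python
import cmath
import math

def suppression_db(amp_error_db, phase_error_deg):
    """Leakage reduction (dB) when the cancelling phasor deviates from an
    ideal replica by amp_error_db (dB) and phase_error_deg (degrees).
    Both errors must not be zero at once, or the residual vanishes."""
    ratio = 10 ** (amp_error_db / 20.0)
    # Residual = original leakage (normalized to 1) minus imperfect replica.
    residual = 1.0 - ratio * cmath.exp(1j * math.radians(phase_error_deg))
    return -20.0 * math.log10(abs(residual))

# Hypothetical error combinations, chosen only for illustration.
for amp_db, ph_deg in [(0.5, 0.0), (0.0, 5.0), (0.5, 5.0)]:
    print(f"{amp_db} dB, {ph_deg} deg -> {suppression_db(amp_db, ph_deg):.1f} dB")
```

Under this simple model, a suppression near 20 dB corresponds to matching the replica to within roughly 0.5 dB and 5°.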

## IV. BEAMFORMING

Fig. 2. (a) Proposed stacked LNA and VCO topology, and (b) LO leakage cancellation.

We propose a beamforming method that avoids phase-shift networks in the RF or LO paths, and provides a high resolution at virtually no cost. Consider the PLL depicted in Fig. 3, where control voltages $V_{cont}$ and $V'_{cont}$ experience gains equal to $K_{VCO}$ and $K'_{VCO}$, respectively. If $V'_{cont}$ increases by $\Delta V$, the loop remains locked by reducing $V_{cont}$ by an amount equal to $\Delta V \cdot K'_{VCO}/K_{VCO}$. This change is divided by the phase detector (PD) gain, causing the phase difference between $V_{in}$ and $V_F$ to change by $\Delta t = \Delta V \cdot K'_{VCO}/(K_{VCO} \cdot K_{PD}) = \Delta V \cdot K'_{VCO} \cdot t_1/(K_{VCO} \cdot V_{DD})$, where $t_1$ denotes the transition time of $V_{in}$. Expressed in seconds, this departure also appears at the VCO output. We wish $\Delta t$ to reach $1/(28~\text{GHz}) = 36$ ps in steps of 1.5 ps ($\equiv 15^{\circ}$). With $K'_{VCO}/K_{VCO} = 0.5$, $t_1 = 180$ ps, and $V_{DD} = 1$ V, $V'_{cont}$ must vary by 400 mV in steps of 17 mV, a task readily accomplished by a resistor-ladder DAC (RDAC). The DAC draws less than 50 µW and its thermal noise is suppressed by a capacitor tied between $V'_{cont}$ and ground. Further increasing the phase resolution simply requires more resistor segments in the RDAC.
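The DAC range and step quoted above follow directly from the expression for $\Delta t$. As a quick check, the sketch below solves that expression for $\Delta V$ using the values given in the text; the function and variable names are illustrative and not part of the design.

```python
# Sanity check of the DAC range and step quoted in the text.
# Uses dt = dV * (K'_VCO / K_VCO) * t1 / V_DD, solved for dV.

def dac_voltage_for_delay(dt, gain_ratio, t1, vdd):
    """Change in V'_cont (volts) needed for a time shift dt (seconds)."""
    return dt * vdd / (gain_ratio * t1)

GAIN_RATIO = 0.5   # K'_VCO / K_VCO
T1 = 180e-12       # transition time of V_in, in seconds
VDD = 1.0          # supply voltage, in volts

full_range = dac_voltage_for_delay(36e-12, GAIN_RATIO, T1, VDD)  # one LO period
one_step = dac_voltage_for_delay(1.5e-12, GAIN_RATIO, T1, VDD)   # one 15-degree step
steps = round(full_range / one_step)

print(f"range = {full_range * 1e3:.0f} mV")
print(f"step  = {one_step * 1e3:.1f} mV")
print(f"steps per LO period = {steps}")
```

The computed step is about 16.7 mV, which the text rounds to 17 mV. Under these assumptions, about 24 steps span one LO period.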



Fig. 3. PLL for beamforming.

## V. I/Q GENERATION

To generate the LO quadrature phases from the 28-GHz PLL output, we rely on a DLL rather than on passive phase splitters (Fig. 4). The proposed approach naturally yields large LO swings and consumes less power than the amplifiers that would necessarily follow such splitters. For a final phase difference, $\Delta \phi$, of 90°, the DLL must employ an XOR gate as its phase detector. The topology in Fig. 4 consists of two paths generating $LO_I$ and $LO_Q$, and XOR and XNOR gates providing outputs that form a differential voltage that goes to zero if $\Delta \phi = 90^{\circ}$. The $G_m$ stage amplifies this voltage and adjusts the delay of the lower path with a high loop gain. The residual phase error thus arises only from the device mismatches within the PD. The cascade of the 28-GHz VCO and the DLL proves superior to conventional techniques. For example, a 56-GHz VCO would suffer from a low Q due to its varactors and programmable capacitors, and a robust 56-GHz ÷2 circuit would draw high power.
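To see why an XOR-based detector settles at 90°, the following behavioral sketch models $LO_I$ and $LO_Q$ as ideal 50%-duty square waves and averages the difference between the XOR and XNOR outputs. It is an idealized illustration, not a model of the circuit in Fig. 4; the function name and the sampling resolution are assumptions.

```python
import numpy as np

def xor_pd_differential(phase_deg, samples=3600):
    """Average (XOR - XNOR) output, normalized to the logic swing, for two
    ideal 50%-duty square waves separated by phase_deg degrees."""
    # Sample one period at half-step offsets to avoid the exact edges.
    theta = (np.arange(samples) + 0.5) * 2 * np.pi / samples
    lo_i = np.sin(theta) > 0
    lo_q = np.sin(theta - np.deg2rad(phase_deg)) > 0
    xor = np.logical_xor(lo_i, lo_q).astype(float)
    xnor = 1.0 - xor
    return float(np.mean(xor - xnor))

for phase in (60, 90, 120):
    print(phase, round(xor_pd_differential(phase), 3))
```

In this model the differential output is zero at 90° and changes sign on either side, which gives the loop a well-defined lock point in quadrature.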



Fig. 4. Proposed I/Q generator architecture.

## VI. PULLING BETWEEN PLLS

An interesting consequence of our per-element LO-synthesis architecture is the occurrence of injection pulling between the VCOs contained in adjacent elements. Since the PLLs guarantee operation at exactly equal frequencies for the VCOs, we surmise that no corruption appears but each element's phase shift characteristic is slightly offset. To quantify this effect, consider the case where two mutually-coupled oscillators are locked to the same frequency,  $\omega_0$  (Fig. 5).



Fig. 5. Two mutually-coupled VCOs operating at exactly equal frequencies.

Extending the analysis in [4], we assume the PLLs are nominally identical, and approximate the (sampling) PDs with a single pole, $\omega_{PD}$, to obtain the following second-order differential equations for the PLL control voltages, $V_{c1}$ and $V_{c2}$:

$$\begin{split} \frac{d^2V_{c1}}{dt^2} + \frac{1}{\tau_{\text{PD}}}\frac{dV_{c1}}{dt} - \frac{K_{\text{VCO}}K_{\text{PD}}}{\tau_{\text{PD}}}V_{c1} &= \frac{\alpha\omega_0K_{\text{PD}}}{2Q\tau_{\text{PD}}}\sin(\Delta\phi) \\ \frac{d^2V_{c2}}{dt^2} + \frac{1}{\tau_{\text{PD}}}\frac{dV_{c2}}{dt} - \frac{K_{\text{VCO}}K_{\text{PD}}}{\tau_{\text{PD}}}V_{c2} &= -\frac{\alpha\omega_0K_{\text{PD}}}{2Q\tau_{\text{PD}}}\sin(\Delta\phi), \end{split}$$

where  $\Delta\phi$  denotes the phase difference between the two PLLs' feedback signals, and  $\tau_{PD}=1/\omega_{PD}$ . We observe the homogeneous solution describes a type-I PLL without coupling, while the particular solution encapsulates the effect of mutual pulling. Rewriting in terms of the VCOs' output phases we have:

$$\begin{split} \phi_{\text{out1}} &= \phi_0 - \frac{1}{K_{\text{VCO}} K_{\text{PD}}} \cdot \frac{\alpha \omega_0}{2Q} \sin(\Delta \phi) \\ \phi_{\text{out2}} &= \phi_0 + \frac{1}{K_{\text{VCO}} K_{\text{PD}}} \cdot \frac{\alpha \omega_0}{2Q} \sin(\Delta \phi), \end{split}$$

where  $\phi_0$  is the static phase error of a type-I PLL. Thus, in the presence of mutual pulling, the PLLs' phase shift characteristics are offset in equal and opposite directions by an amount proportional to  $\sin(\Delta\phi)$ . Depicted in Fig. 6, this theory agrees well with Spectre-RF simulations over the range of expected coupling factors, revealing that the maximum offset occurs if the two elements bear a nominal phase difference of  $90^\circ$ .
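The closed-form result above can be explored numerically. The sketch below evaluates the two output-phase expressions over the nominal phase difference. All parameter values ($\alpha$, $Q$, $K_{VCO}$, $K_{PD}$) are placeholders chosen only for illustration and are not taken from the design; the function name is likewise hypothetical.

```python
import numpy as np

def pulled_phases(delta_phi, phi0, alpha, omega0, q, k_vco, k_pd):
    """Output phases (rad) of two mutually coupled, locked PLLs."""
    offset = alpha * omega0 / (2.0 * q) * np.sin(delta_phi) / (k_vco * k_pd)
    return phi0 - offset, phi0 + offset

# Placeholder values, assumed for illustration only.
ALPHA = 0.015               # coupling factor
OMEGA0 = 2 * np.pi * 28e9   # oscillation frequency, rad/s
Q = 10.0                    # tank quality factor
K_VCO = 2 * np.pi * 1e9     # VCO gain, rad/s/V
K_PD = 5.0                  # phase detector gain, V/rad

delta_phi = np.deg2rad(np.arange(0, 181, 5))
phi1, phi2 = pulled_phases(delta_phi, 0.0, ALPHA, OMEGA0, Q, K_VCO, K_PD)
offset_deg = np.rad2deg(phi2 - phi1) / 2
worst = np.rad2deg(delta_phi[np.argmax(offset_deg)])
print(f"largest offset {offset_deg.max():.2f} deg at {worst:.0f} deg")
```

Whatever placeholder values are used, the two offsets remain equal and opposite, and the largest offset falls at a nominal phase difference of 90°, since both follow from the $\sin(\Delta\phi)$ dependence.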



Fig. 6. Simulated phase offset for mutual coupling factors $k = 0.010$, $0.015$, and $0.020$.

## VII. EXPERIMENTAL RESULTS

The 8-element RX has been fabricated in TSMC's 28-nm CMOS technology and tested with a 1-V supply. Fig. 7 shows the die photograph. Fig. 8 plots the NF of one element with $f_{LO}=28~{\rm GHz}$. Shown in Fig. 9 is the measured RX gain with $f_{LO}=28~{\rm GHz}$ while the input frequency varies from 27.5 GHz to 28.5 GHz. Fig. 10 depicts the measured compression characteristic, and Fig. 11 shows the $S_{11}$. The quality of the I/Q LO generation is assessed by measuring 24 chips, revealing a mean I/Q phase difference of $95^{\circ}$ and a standard deviation of $11.4^{\circ}$ (Fig. 12).



Fig. 7. Die photograph.



Fig. 8. Measured single-element NF ( $f_{LO}=28~\mathrm{GHz}$ ).



Fig. 9. Measured single-element RX gain ( $f_{LO}=28~\mathrm{GHz}$ ).



Fig. 10. Measured compression  $(P_{1dB})$ .



Fig. 11. Measured S<sub>11</sub>.



Fig. 12. Histogram of measured I/Q mismatch.

The gray trace in Fig. 13 shows the phase-shift characteristic of one element, indicating a maximum range exceeding 600°. This is obtained by examining the baseband outputs of two receivers driven by the same RF input. As discussed in Section VI, the phase-shift characteristics change when the adjacent elements' PLLs are enabled. As shown in Fig. 13, the total range decreases from $600^{\circ}$ to $450^{\circ}$ and the average phase resolution changes from 15.6° to 11.5°. To measure the PLL phase noise, we apply a 28-GHz tone to the RX input and monitor the baseband output. Fig. 14 shows the measured phase noise of both the RF generator and our prototype. Our integrated jitter from 10 kHz to 100 MHz amounts to 640 fs. Table I compares the performance to that of the prior art.
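For context, the 640-fs integrated jitter quoted above can be converted to an rms phase error at the LO frequency. This is a standard conversion added here as a worked example; the resulting number is not a reported measurement, and the function name is illustrative.

```python
def jitter_to_phase_deg(jitter_s, f_lo_hz):
    """Convert rms jitter (seconds) to rms phase (degrees) at f_lo_hz."""
    return 360.0 * f_lo_hz * jitter_s

rms_phase = jitter_to_phase_deg(640e-15, 28e9)
print(f"{rms_phase:.2f} deg rms")
```

At 28 GHz this corresponds to roughly 6.5° rms, smaller than the 11.5° to 15.6° average phase step.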

#### ACKNOWLEDGMENT

We thank the TSMC University Shuttle program for chip fabrication. This research was supported by Realtek Semiconductor.

#### REFERENCES

- [1] P. K. Khanna, Y. Zhao, M. Forghani and B. Razavi, "A Low-Power 28-GHz Beamforming Receiver with On-Chip LO Synthesis," in ESSCIRC 2023 - IEEE 49th European Solid State Circuits Conference (ESSCIRC), Sep. 2023, pp. 501-504.
- [2] H.-T. Kim et al., "A 28-GHz CMOS Direct Conversion Transceiver With Packaged 2×4 Antenna Array for 5G Cellular System," IEEE Journal of Solid-State Circuits, vol. 53, no. 5, pp. 1245-1259, May 2018.
- [3] S. Mondal, R. Singh, A. I. Hussein and J. Paramesh, "A 25-30 GHz Fully-Connected Hybrid Beamforming Receiver for MIMO Communication," IEEE Journal of Solid-State Circuits, vol. 53, no. 5, pp. 1275-1287, May 2018.
- [4] B. Razavi, "Mutual Injection Pulling Between Oscillators," in IEEE Custom Integrated Circuits Conference 2006, Sep. 2006, pp. 675-678.



Fig. 13. Measured phase shift characteristics.



Fig. 14. Synthesizer phase noise, measured at baseband.

#### TABLE I PERFORMANCE SUMMARY

|                                   | [1]                  | [2]                                     | [3]                                 | This work            |
|-----------------------------------|----------------------|-----------------------------------------|-------------------------------------|----------------------|
| Input Frequency<br>(GHz)          | 27.3 - 29            | 25.8 - 28                               | 25 - 30                             | 27.8 - 28.2          |
| No. of Elements                   | 8                    | 8                                       | 8                                   | 8                    |
| NF<sub>min</sub> (dB)             | 4.2<br>(1 element)   | 6.7<br>(1 element)                      | 7.3<br>(1 element)                  | 4.8<br>(1 element)   |
| Gain (dB)                         | 31.3<br>(1 element)  | 69<br>(8 elements)                      | 34<br>(1 element)                   | 35<br>(1 element)    |
| IP<sub>1dB</sub> (dBm)            | -39<br>(1 element)   | -68.9<sup>a</sup> / -34.8<sup>b</sup>   | -29<sup>a</sup> / -21<sup>b</sup>   | -30<br>(1 element)   |
| Total Power<br>(per element) (mW) | 19.5*                | 50                                      | 27.9                                | 7.3*                 |
| Phase Resolution                  | 11.7°                | 45°                                     | N.A.                                | 11.5° - 15.6°        |
| Technology                        | 28-nm CMOS           | 28-nm CMOS                              | 65-nm CMOS                          | 28-nm CMOS           |
| Area (mm²)                        | 2.36<br>(8 elements) | 7.28**<br>(8 elements)                  | 6.16<br>(8 elements)                | 2.01<br>(8 elements) |

<sup>\*</sup> Including LO synthesis. <sup>\*\*</sup> Includes TX area. <sup>a</sup> High-gain mode. <sup>b</sup> Low-gain mode.