# imall

Chipsmall Limited consists of a professional team with an average of over 10 year of expertise in the distribution of electronic components. Based in Hongkong, we have already established firm and mutual-benefit business relationships with customers from, Europe, America and south Asia, supplying obsolete and hard-to-find components to meet their specific needs.

With the principle of "Quality Parts, Customers Priority, Honest Operation, and Considerate Service", our business mainly focus on the distribution of electronic components. Line cards we deal with include Microchip, ALPS, ROHM, Xilinx, Pulse, ON, Everlight and Freescale. Main products comprise IC, Modules, Potentiometer, IC Socket, Relay, Connector. Our parts cover such applications as commercial, industrial, and automotives areas.

We are looking forward to setting up business relationship with you and hope to provide you with the best service and solution. Let us make a better world for our industry!



## Contact us

Tel: +86-755-8981 8866 Fax: +86-755-8427 6832 Email & Skype: info@chipsmall.com Web: www.chipsmall.com Address: A1208, Overseas Decoration Building, #122 Zhenhua RD., Futian, Shenzhen, China





## LatticeECP2/M Family Data Sheet

DS1006 Version 04.1, September 2013



## LatticeECP2/M Family Data Sheet Introduction

#### July 2012

#### Features

- High Logic Density for System Integration
   6K to 95K LUTs
  - 90 to 583 I/Os
- Embedded SERDES (LatticeECP2M Only)
  - Data Rates 250 Mbps to 3.125 Gbps
    Up to 16 channels per device PCI Express, Ethernet (1GbE, SGMII), OBSAI, CPRI and Serial RapidIO.

#### ■ sysDSP<sup>™</sup> Block

- 3 to 42 blocks for high performance multiply and accumulate
- Each block supports
  - One 36x36, four 18x18 or eight 9x9 multipliers

#### ■ Flexible Memory Resources

- 55Kbits to 5308Kbits sysMEM<sup>™</sup> Embedded Block RAM (EBR)
  - 18Kbit block
  - Single, pseudo dual and true dual port
  - Byte Enable Mode support
- 12K to 202Kbits distributed RAM
  - Single port and pseudo dual port

#### ■ sysCLOCK Analog PLLs and DLLs

- Two GPLLs and up to six SPLLs per device
  - Clock multiply, divide, phase & delay adjustDynamic PLL adjustment
- Two general purpose DLLs per device

- Pre-Engineered Source Synchronous I/O
  - DDR registers in I/O cells
  - Dedicated gearing logic
  - Source synchronous standards support – SPI4.2, SFI4 (DDR Mode), XGMII
    - High Speed ADC/DAC devices
  - Dedicated DDR and DDR2 memory support
     DDR1: 400 (200MHz) / DDR2: 533 (266MHz)
  - Dedicated DQS support
- Programmable sysl/O<sup>™</sup> Buffer Supports Wide Range Of Interfaces
  - LVTTL and LVCMOS 33/25/18/15/12
  - SSTL 3/2/18 I, II
  - HSTL15 I and HSTL18 I, II
  - PCI and Differential HSTL, SSTL
  - LVDS, RSDS, Bus-LVDS, MLVDS, LVPECL
- Flexible Device Configuration
  - 1149.1 Boundary Scan compliant
  - Dedicated bank for configuration I/Os
  - SPI boot flash interface
  - Dual boot images supported
  - TransFR™ I/O for simple field updates
  - Soft Error Detect macro embedded
- Optional Bitstream Encryption (LatticeECP2/M "S" Versions Only)

#### System Level Support

- ispTRACY<sup>™</sup> internal logic analyzer capability
- On-chip oscillator for initialization & general use
- 1.2V power supply

#### Table 1-1. LatticeECP2 (Including "S-Series") Family Selection

| Device                        | ECP2-6 | ECP2-12 | ECP2-20 | ECP2-35 | ECP2-50 | ECP2-70 |
|-------------------------------|--------|---------|---------|---------|---------|---------|
| LUTs (K)                      | 6      | 12      | 21      | 32      | 48      | 68      |
| Distributed RAM (Kbits)       | 12     | 24      | 42      | 64      | 96      | 136     |
| EBR SRAM (Kbits)              | 55     | 221     | 276     | 332     | 387     | 1032    |
| EBR SRAM Blocks               | 3      | 12      | 15      | 18      | 21      | 60      |
| sysDSP Blocks                 | 3      | 6       | 7       | 8       | 18      | 22      |
| 18x18 Multipliers             | 12     | 24      | 28      | 32      | 72      | 88      |
| GPLL + SPLL + DLL             | 2+0+2  | 2+0+2   | 2+0+2   | 2+0+2   | 2+2+2   | 2+4+2   |
| Maximum Available I/O         | 190    | 297     | 402     | 450     | 500     | 583     |
| Packages and I/O Combinations | ·      |         | •       |         | •       |         |
| 144-pin TQFP (20 x 20 mm)     | 90     | 93      |         |         |         |         |
| 208-pin PQFP (28 x 28 mm)     |        | 131     | 131     |         |         |         |
| 256-ball fpBGA (17 x 17 mm)   | 190    | 193     | 193     |         |         |         |
| 484-ball fpBGA (23 x 23 mm)   |        | 297     | 331     | 331     | 339     |         |
| 672-ball fpBGA (27 x 27 mm)   |        |         | 402     | 450     | 500     | 500     |
| 900-ball fpBGA (31 x 31 mm)   |        |         |         |         |         | 583     |

© 2012 Lattice Semiconductor Corp. All Lattice trademarks, registered trademarks, patents, and disclaimers are as listed at www.latticesemi.com/legal. All other brand or product names are trademarks or registered trademarks of their respective holders. The specifications and information herein are subject to change without notice.

Data Sheet DS1006



| Device                       | ECP2M20     | ECP2M35 | ECP2M50 | ECP2M70  | ECP2M100 |
|------------------------------|-------------|---------|---------|----------|----------|
| LUTs (K)                     | 19          | 34      | 48      | 67       | 95       |
| sysMEM Blocks (18kb)         | 66          | 114     | 225     | 246      | 288      |
| Embedded Memory (Kbits)      | 1217        | 2101    | 4147    | 4534     | 5308     |
| Distributed Memory (Kbits)   | 41          | 71      | 101     | 145      | 202      |
| sysDSP Blocks                | 6           | 8       | 22      | 24       | 42       |
| 18x18 Multipliers            | 24          | 32      | 88      | 96       | 168      |
| GPLL+SPLL+DLL                | 2+6+2       | 2+6+2   | 2+6+2   | 2+6+2    | 2+6+2    |
| Maximum Available I/O        | 304         | 410     | 410     | 436      | 520      |
| Packages and SERDES / I/O C  | ombinations |         |         | •        |          |
| 256-ball fpBGA (17 x 17 mm)  | 4 / 140     | 4 / 140 |         |          |          |
| 484-ball fpBGA (23 x 23 mm)  | 4 / 304     | 4 / 303 | 4 / 270 |          |          |
| 672-ball fpBGA (27 x 27 mm)  |             | 4 / 410 | 8 / 372 |          |          |
| 900-ball fpBGA (31 x 31 mm)  |             |         | 8 / 410 | 16 / 416 | 16 / 416 |
| 1152-ball fpBGA (35 x 35 mm) |             |         |         | 16 / 436 | 16 / 520 |

#### Table 1-2. LatticeECP2M (Including "S-Series") Family Selection

## Introduction

The LatticeECP2/M family of FPGA devices is optimized to deliver high performance features such as advanced DSP blocks, high speed SERDES (LatticeECP2M family only) and high speed source synchronous interfaces in an economical FPGA fabric. This combination was achieved through advances in device architecture and the use of 90nm technology.

The LatticeECP2/M FPGA fabric is optimized with high performance and low cost in mind. The LatticeECP2/M devices include LUT-based logic, distributed and embedded memory, Phase Locked Loops (PLLs), Delay Locked Loops (DLLs), pre-engineered source synchronous I/O support, enhanced sysDSP blocks and advanced configuration support, including encryption ("S" versions only) and dual boot capabilities.

The LatticeECP2M device family features high speed SERDES with PCS. These high jitter tolerance and low transmission jitter SERDES with PCS blocks can be configured to support an array of popular data protocols including PCI Express, Ethernet (1GbE and SGMII), OBSAI and CPRI. Transmit Pre-emphasis and Receive Equalization settings make SERDES suitable for chip to chip and small form factor backplane applications.

Lattice Diamond<sup>®</sup> design software allows large complex designs to be efficiently implemented using the LatticeECP2/M FPGA family. Synthesis library support for LatticeECP2/M is available for popular logic synthesis tools. The Diamond software uses the synthesis tool output along with the constraints from its floor planning tools to place and route the design in the LatticeECP2/M device. The Diamond design tool extracts the timing from the routing and back-annotates it into the design for timing verification.

Lattice provides many pre-engineered IP (Intellectual Property) modules for the LatticeECP2/M family. By using these IP cores as standardized blocks, designers are free to concentrate on the unique aspects of their design, increasing their productivity.



## LatticeECP2/M Family Data Sheet Architecture

#### September 2013

Data Sheet DS1006

#### **Architecture Overview**

Each LatticeECP2/M device contains an array of logic blocks surrounded by Programmable I/O Cells (PIC). Interspersed between the rows of logic blocks are rows of sysMEM<sup>™</sup> Embedded Block RAM (EBR) and rows of sys-DSP<sup>™</sup> Digital Signal Processing blocks, as shown in Figure 2-1. In addition, the LatticeECP2M family contains SERDES Quads in one or more of the corners. Figure 2-2 shows the block diagram of ECP2M20 with one quad.

There are two kinds of logic blocks, the Programmable Functional Unit (PFU) and Programmable Functional Unit without RAM (PFF). The PFU contains the building blocks for logic, arithmetic, RAM and ROM functions. The PFF block contains building blocks for logic, arithmetic and ROM functions. Both PFU and PFF blocks are optimized for flexibility, allowing complex designs to be implemented quickly and efficiently. Logic Blocks are arranged in a two-dimensional array. Only one type of block is used per row.

The LatticeECP2/M devices contain one or more rows of sysMEM EBR blocks. sysMEM EBRs are large dedicated 18K fast memory blocks. Each sysMEM block can be configured in a variety of depths and widths of RAM or ROM. In addition, LatticeECP2/M devices contain up to two rows of DSP Blocks. Each DSP block has multipliers and adder/accumulators, which are the building blocks for complex signal processing capabilities.

The LatticeECP2M devices feature up to 16 embedded 3.125Gbps SERDES (Serializer / Deserializer) channels. Each SERDES channel contains independent 8b/10b encoding / decoding, polarity adjust and elastic buffer logic. Each group of four SERDES channels along with its Physical Coding Sub-layer (PCS) block, creates a quad. The functionality of the SERDES/PCS Quads can be controlled by memory cells set during device configuration or by registers that are addressable during device operation. The registers in every quad can be programmed by a soft IP interface, referred to as the SERDES Client Interface (SCI). These quads (up to four) are located at the corners of the devices.

Each PIC block encompasses two PIOs (PIO pairs) with their respective sysl/O buffers. The sysl/O buffers of the LatticeECP2/M devices are arranged in eight banks, allowing the implementation of a wide variety of I/O standards. In addition, a separate I/O bank is provided for the programming interfaces. PIO pairs on the left and right edges of the device can be configured as LVDS transmit/receive pairs. The PIC logic also includes pre-engineered support to aid in the implementation of high speed source synchronous standards such as SPI4.2, along with memory interfaces including DDR2.

The LatticeECP2/M registers in PFU and sysl/O can be configured to be SET or RESET. After power up and the device is configured, it enters into user mode with these registers SET/RESET according to the configuration setting, allowing the device entering to a known state for predictable system function.

Other blocks provided include PLLs, DLLs and configuration functions. The LatticeECP2/M architecture provides two General PLLs (GPLL) and up to six Standard PLLs (SPLL) per device. In addition, each LatticeECP2/M family member provides two DLLs per device. The GPLLs and DLLs blocks are located in pairs at the end of the bottommost EBR row; the DLL block is located towards the edge of the device. The SPLL blocks are located at the end of the other EBR/DSP rows.

The configuration block that supports features such as configuration bit-stream decryption, transparent updates and dual boot support is located toward the center of this EBR row. The Ball Grid Array (BGA) package devices in the LatticeECP2/M family supports a sysCONFIG<sup>™</sup> port located in the corner between banks four and five, which allows for serial or parallel device configuration.

In addition, every device in the family has a JTAG port. This family also provides an on-chip oscillator. The LatticeECP2/M devices use 1.2V as their core voltage.

<sup>© 2013</sup> Lattice Semiconductor Corp. All Lattice trademarks, registered trademarks, patents, and disclaimers are as listed at www.latticesemi.com/legal. All other brand or product names are trademarks or registered trademarks of their respective holders. The specifications and information herein are subject to change without notice.













## **PFU Blocks**

The core of the LatticeECP2/M device consists of PFU blocks, which are provided in two forms, the PFU and PFF. The PFUs can be programmed to perform Logic, Arithmetic, Distributed RAM and Distributed ROM functions. PFF blocks can be programmed to perform Logic, Arithmetic and ROM functions. Except where necessary, the remainder of this data sheet will use the term PFU to refer to both PFU and PFF blocks.

Each PFU block consists of four interconnected slices, numbered 0-3 as shown in Figure 2-3. All the interconnections to and from PFU blocks are from routing. There are 50 inputs and 23 outputs associated with each PFU block.

#### Figure 2-3. PFU Diagram



#### Slice

Slice 0 through Slice 2 contain two LUT4s feeding two registers, whereas Slice 3 contains two LUT4s only. For PFUs, Slice 0 and Slice 2 can also be configured as distributed memory, a capability not available in the PFF. Table 2-1 shows the capability of the slices in both PFF and PFU blocks along with the operation modes they enable. In addition, each PFU contains some logic that allows the LUTs to be combined to perform functions such as LUT5, LUT6, LUT7 and LUT8. There is control logic to perform set/reset functions (programmable as synchronous/asynchronous), clock select, chip-select and wider RAM/ROM functions. Figure 2-4 shows an overview of the internal logic of the slice. The registers in the slice can be configured for positive/negative and edge triggered or level sensitive clocks.

Table 2-1. Resources and Modes Available per Slice

|         | PFU E                   | BLock                   | PFF Block               |                    |  |
|---------|-------------------------|-------------------------|-------------------------|--------------------|--|
| Slice   | Resources               | Modes                   | Resources               | Modes              |  |
| Slice 0 | 2 LUT4s and 2 Registers | Logic, Ripple, RAM, ROM | 2 LUT4s and 2 Registers | Logic, Ripple, ROM |  |
| Slice 1 | 2 LUT4s and 2 Registers | Logic, Ripple, ROM      | 2 LUT4s and 2 Registers | Logic, Ripple, ROM |  |
| Slice 2 | 2 LUT4s and 2 Registers | Logic, Ripple, RAM, ROM | 2 LUT4s and 2 Registers | Logic, Ripple, ROM |  |
| Slice 3 | 2 LUT4s                 | Logic, ROM              | 2 LUT4s                 | Logic, ROM         |  |

Slices 0, 1 and 2 have 14 input signals: 13 signals from routing and one from the carry-chain (from the adjacent slice or PFU). There are seven outputs: six to routing and one to carry-chain (to the adjacent PFU). Slice 3 has 13 input signals from routing and four signals to routing. Table 2-2 lists the signals associated with Slice 0 to Slice 2.



#### Figure 2-4. Slice Diagram



WCK is CLK

DI[3:2] for Slice 2 and DI[1:0] for Slice 0 data

WAD [A:D] is a 4bit address from slice 1 LUT input

Table 2-2. Slice Signal Descriptions

| Function | Туре               | Signal Names   | Description                                                          |
|----------|--------------------|----------------|----------------------------------------------------------------------|
| Input    | Data signal        | A0, B0, C0, D0 | Inputs to LUT4                                                       |
| Input    | Data signal        | A1, B1, C1, D1 | Inputs to LUT4                                                       |
| Input    | Multi-purpose      | MO             | Multipurpose Input                                                   |
| Input    | Multi-purpose      | M1             | Multipurpose Input                                                   |
| Input    | Control signal     | CE             | Clock Enable                                                         |
| Input    | Control signal     | LSR            | Local Set/Reset                                                      |
| Input    | Control signal     | CLK            | System Clock                                                         |
| Input    | Inter-PFU signal   | FC             | Fast Carry-in <sup>1</sup>                                           |
| Input    | Inter-slice signal | FXA            | Intermediate signal to generate LUT6 and LUT7                        |
| Input    | Inter-slice signal | FXB            | Intermediate signal to generate LUT6 and LUT7                        |
| Output   | Data signals       | F0, F1         | LUT4 output register bypass signals                                  |
| Output   | Data signals       | Q0, Q1         | Register outputs                                                     |
| Output   | Data signals       | OFX0           | Output of a LUT5 MUX                                                 |
| Output   | Data signals       | OFX1           | Output of a LUT6, LUT7, LUT8 <sup>2</sup> MUX depending on the slice |
| Output   | Inter-PFU signal   | FCO            | Slice 2 of each PFU is the fast carry chain output <sup>1</sup>      |

1. See Figure 2-4 for connection details.

2. Requires two PFUs.

WRE is from LSR



#### Modes of Operation

Each slice has up to four potential modes of operation: Logic, Ripple, RAM and ROM.

#### Logic Mode

In this mode, the LUTs in each slice are configured as 4-input combinatorial lookup tables. A LUT4 can have 16 possible input combinations. Any four input logic functions can be generated by programming this lookup table. Since there are two LUT4s per slice, a LUT5 can be constructed within one slice. Larger look-up tables such as LUT6, LUT7 and LUT8 can be constructed by concatenating other slices. Note LUT8 requires more than four slices.

#### **Ripple Mode**

Ripple mode supports the efficient implementation of small arithmetic functions. In ripple mode, the following functions can be implemented by each slice:

- Addition 2-bit
- Subtraction 2-bit
- Add/Subtract 2-bit using dynamic control
- Up counter 2-bit
- Down counter 2-bit
- Up/Down counter with Async clear
- Up/Down counter with preload (sync)
- Ripple mode multiplier building block
- Multiplier support
- Comparator functions of A and B inputs
  - A greater-than-or-equal-to B
  - A not-equal-to B
  - A less-than-or-equal-to B

Ripple Mode includes an optional configuration that performs arithmetic using fast carry chain methods. In this configuration (also referred to as CCU2 mode) two additional signals, Carry Generate and Carry Propagate, are generated on a per slice basis to allow fast arithmetic functions to be constructed by concatenating Slices.

#### **RAM Mode**

In this mode, a 16x4-bit distributed single port RAM (SPR) can be constructed using each LUT block in Slice 0 and Slice 2 as a 16x1-bit memory. Slice 1 is used to provide memory address and control signals. A 16x2-bit pseudo dual port RAM (PDPR) memory is created by using one Slice as the read-write port and the other companion slice as the read-only port.

The Lattice design tools support the creation of a variety of different size memories. Where appropriate, the software will construct these using distributed memory primitives that represent the capabilities of the PFU. Table 2-3 shows the number of slices required to implement different distributed RAM primitives. For more information about using RAM in LatticeECP2/M devices, please see the list of additional technical documentation at the end of this data sheet.

#### Table 2-3. Number of Slices Required to Implement Distributed RAM

|                  | SPR 16X4 | PDPR 16X4 |
|------------------|----------|-----------|
| Number of slices | 3        | 3         |

Note: SPR = Single Port RAM, PDPR = Pseudo Dual Port RAM



#### **ROM Mode**

ROM mode uses the LUT logic; hence, Slices 0 through 3 can be used in ROM mode. Preloading is accomplished through the programming interface during PFU configuration.

## Routing

There are many resources provided in the LatticeECP2/M devices to route signals individually or as buses with related control signals. The routing resources consist of switching circuitry, buffers and metal interconnect (routing) segments.

The inter-PFU connections are made with x1 (spans two PFU), x2 (spans three PFU) and x6 (spans seven PFU). The x1 and x2 connections provide fast and efficient connections in horizontal and vertical directions. The x2 and x6 resources are buffered, allowing the routing of both short and long connections between PFUs.

The LatticeECP2/M family has an enhanced routing architecture that produces a compact design. The Diamond design software takes the output of the synthesis tool and places and routes the design. Generally, the place and route tool is completely automatic, although an interactive routing editor is available to optimize the design.

## sysCLOCK Phase Locked Loops (GPLL/SPLL)

The sysCLOCK PLLs provide the ability to synthesize clock frequencies. All the devices in the LatticeECP2/M family support two General Purpose PLLs (GPLLs) which are full-featured PLLs. In addition, some of the larger devices have two to six Standard PLLs (SPLLs) that have a subset of GPLL functionality.

#### General Purpose PLL (GPLL)

The architecture of the GPLL is shown in Figure 2-5. A description of the GPLL functionality follows.

CLKI is the reference frequency (generated either from the pin or from routing) for the PLL. CLKI feeds into the Input Clock Divider block. The CLKFB is the feedback signal (generated from CLKOP or from a user clock PIN/ logic). This signal feeds into the Feedback Divider. The Feedback Divider is used to multiply the reference frequency.

The Delay Adjust Block adjusts either the delays of the reference or feedback signals. The Delay Adjust Block can either be programmed during configuration or can be adjusted dynamically. The setup, hold or clock-to-out times of the device can be improved by programming a delay in the feedback or input path of the PLL, which will advance or delay the output clock with reference to the input clock.

Following the Delay Adjust Block, both the input path and feedback signals enter the Voltage Controlled Oscillator (VCO) block. In this block the difference between the input path and feedback signals is used to control the frequency and phase of the oscillator. A LOCK signal is generated by the VCO to indicate that the VCO has locked onto the input clock signal. In dynamic mode, the PLL may lose lock after a dynamic delay adjustment and not relock until the t<sub>LOCK</sub> parameter has been satisfied. LatticeECP2/M devices have two dedicated pins on the left and right edges of the device for connecting optional external capacitors to the VCO. This allows the PLLs to operate at a lower frequency. This is a shared resource that can only be used by one PLL (GPLL or SPLL) per side.

The output of the VCO then enters the post-scalar divider. The post-scalar divider allows the VCO to operate at higher frequencies than the clock output (CLKOP), thereby increasing the frequency range. A secondary divider takes the CLKOP signal and uses it to derive lower frequency outputs (CLKOK). The Phase/Duty Select block adjusts the phase and duty cycle of the CLKOP signal and generates the CLKOS signal. The phase/duty cycle setting can be pre-programmed or dynamically adjusted.

The primary output from the post scalar divider CLKOP along with the outputs from the secondary divider (CLKOK) and Phase/Duty select (CLKOS) are fed to the clock distribution network.



#### Figure 2-5. General Purpose PLL (GPLL) Diagram



#### Standard PLL (SPLL)

Some of the larger devices have two to six Standard PLLs (SPLLs). SPLLs have the same features as GPLLs but without delay adjustment capability. SPLLs also provide different parametric specifications. For more information, please see the list of additional technical documentation at the end of this data sheet.

Table 2-4 provides a description of the signals in the GPLL and SPLL blocks.

Table 2-4. GPLL and SPLL Blocks Signal Descriptions

| Signal               | I/O | Description                                                                                              |
|----------------------|-----|----------------------------------------------------------------------------------------------------------|
| CLKI                 | I   | Clock input from external pin or routing                                                                 |
| CLKFB                | I   | PLL feedback input from CLKOP (PLL internal), from clock net (CLKOP) or from a user clock (PIN or logic) |
| RST                  | I   | "1" to reset PLL counters, VCO, charge pumps and M-dividers                                              |
| RSTK                 | I   | "1" to reset K-divider                                                                                   |
| CLKOS                | 0   | PLL output clock to clock tree (phase shifted/duty cycle changed)                                        |
| CLKOP                | 0   | PLL output clock to clock tree (no phase shift)                                                          |
| CLKOK                | 0   | PLL output to clock tree through secondary clock divider                                                 |
| LOCK                 | 0   | "1" indicates PLL LOCK to CLKI                                                                           |
| DDAMODE <sup>1</sup> | I   | Dynamic Delay Enable. "1": Pin control (dynamic), "0": Fuse Control (static)                             |
| DDAIZR <sup>1</sup>  | I   | Dynamic Delay Zero. "1": delay = 0, "0": delay = on                                                      |
| DDAILAG <sup>1</sup> | I   | Dynamic Delay Lag/Lead. "1": Lead, "0": Lag                                                              |
| DDAIDEL[2:0]1        | I   | Dynamic Delay Input                                                                                      |
| DPA MODES            | I   | DPA (Dynamic Phase Adjust/Duty Cycle Select) mode                                                        |
| DPHASE [3:0]         | I   | DPA Phase Adjust inputs                                                                                  |
| DDDUTY [3:0]         | —   | DPA Duty Cycle Select inputs                                                                             |

1. These signals are not available in SPLL.



## Delay Locked Loops (DLL)

In addition to PLLs, the LatticeECP2/M family of devices has two DLLs per device.

CLKI is the input frequency (generated either from the pin or routing) for the DLL. CLKI feeds into the output muxes block to bypass the DLL, directly to the DELAY CHAIN block and (directly or through divider circuit) to the reference input of the Phase Frequency Detector (PFD) input mux. The reference signal for the PFD can also be generated from the Delay Chain and CLKFB signals. The feedback input to the PFD is generated from the CLKFB pin, CLKI or from tapped signal from the Delay chain.

The PFD produces a binary number proportional to the phase and frequency difference between the reference and feedback signals. This binary output of the PFD is fed into a Arithmetic Logic Unit (ALU). Based on these inputs, the ALU determines the correct digital control codes to send to the delay chain in order to better match the reference and feedback signals. This digital code from the ALU is also transmitted via the Digital Control bus (DCNTL) bus to its associated DLLDELA delay block. The ALUHOLD input allows the user to suspend the ALU output at its current value. The UDDCNTL signal allows the user to latch the current value on the DCNTL bus.

The DLL has two independent clock outputs, CLKOP and CLKOS. These outputs can individually select one of the outputs from the tapped delay line. The CLKOS has optional fine phase shift and divider blocks to allow this output to be further modified, if required. The fine phase shift block allows the CLKOS output to phase shifted a further 45, 22.5 or 11.25 degrees relative to its normal position. Both the CLKOS and CLKOP outputs are available with optional duty cycle correction. Divide by two and divide by four frequencies are available at CLKOS. The LOCK output signal is asserted when the DLL is locked. Figure 2-6 shows the DLL block diagram and Table 2-5 provides a description of the DLL inputs and outputs.

The user can configure the DLL for many common functions such as time reference delay mode and clock injection removal mode. Lattice provides primitives in its design tools for these functions. For more information about the DLL, please see the list of additional technical documentation at the end of this data sheet.



Figure 2-6. Delay Locked Loop Diagram (DLL)



#### Table 2-5. DLL Signals

| Signal     | I/O | Description                                                                   |  |
|------------|-----|-------------------------------------------------------------------------------|--|
| CLKI       | I   | Clock input from external pin or routing                                      |  |
| CLKFB      | I   | DLL feed input from DLL output, clock net, routing or external pin            |  |
| RSTN       | I   | Active low synchronous reset                                                  |  |
| ALUHOLD    | I   | Active high freezes the ALU                                                   |  |
| UDDCNTL    | I   | ynchronous enable signal (hold high for two cycles) from routing              |  |
| DCNTL[8:0] | 0   | Encoded digital control signals for PIC INDEL and slave delay calibration     |  |
| CLKOP      | 0   | The primary clock output                                                      |  |
| CLKOS      | 0   | The secondary clock output with fine phase shift and/or division by 2 or by 4 |  |
| LOCK       | 0   | Active high phase lock indicator                                              |  |

#### **DLLDELA Delay Block**

Closely associated with each DLL is a DLLDELA block. This is a delay block consisting of a delay line with taps and a selection scheme that selects one of the taps. The DCNTL[8:0] bus controls the delay of the CLKO signal. Typically this is the delay setting that the DLL uses to achieve phase alignment. This results in the delay providing a calibrated 90° phase shift that is useful in centering a clock in the middle of a data cycle for source synchronous data. The CLKO signal feeds the edge clock network. Figure 2-7 shows the connections between the DLL block and the DLLDELA delay block. For more information, please see the list of additional technical documentation at the end of this data sheet.

#### Figure 2-7. DLLDELA Delay Block



#### PLL/DLL Cascading

LatticeECP2/M devices have been designed to allow certain combinations of PLL (GPLL and SPLL) and DLL cascading. The allowable combinations are:

- PLL to PLL supported
- PLL to DLL supported



The DLLs in the LatticeECP2/M are used to shift the clock in relation to the data for source synchronous inputs. PLLs are used for frequency synthesis and clock generation for source synchronous interfaces. Cascading PLL and DLL blocks allows applications to utilize the unique benefits of both DLLs and PLLs.

For further information about the DLL, please see the list of additional technical documentation at the end of this data sheet.

## GPLL/SPLL/GDLL PIO Input Pin Connections (LatticeECP2M Family Only)

All LatticeECP2M devices contain two GDLLs, two GPLLs and six SPLLs, arranged in quadrants as shown in Figure 2-8. In the LatticeECP2M devices GPLLs, SPLLs and GDLLs share their input pins. Figure 2-8 shows the sharing of SPLLs input pin connections in the upper two quadrants and the sharing of GDLL, GPLL and SPLL input pin connections in the lower two quadrants.





## **Clock Dividers**

LatticeECP2/M devices have two clock dividers, one on the left side and one on the right side of the device. These are intended to generate a slower-speed system clock from a high-speed edge clock. The block operates in a ÷2, ÷4 or ÷8 mode and maintains a known phase relationship between the divided down clock and the high-speed clock based on the release of its reset signal. The clock dividers can be fed from selected PLL/DLL outputs, DLL-DELA delay blocks, routing or from an external clock input. The clock divider outputs serve as primary clock sources and feed into the clock distribution network. The Reset (RST) control signal resets input and synchronously forces all outputs to low. The RELEASE signal releases outputs synchronously to the input clock. For further information about clock dividers, please see the list of additional technical documentation at the end of this data sheet. Figure 2-9 shows the clock divider connections.



#### Figure 2-9. Clock Divider Connections



## **Clock Distribution Network**

LatticeECP2/M devices have eight quadrant-based primary clocks and eight flexible region-based secondary clocks/control signals. Two high performance edge clocks are available on each edge of the device to support high speed interfaces. These clock inputs are selected from external I/Os, the sysCLOCK PLLs, DLLs or routing. These clock inputs are fed throughout the chip via a clock distribution system.

#### **Primary Clock Sources**

LatticeECP2/M devices derive clocks from five primary sources: PLL (GPLL and SPLL) outputs, DLL outputs, CLK-DIV outputs, dedicated clock inputs and routing. LatticeECP2/M devices have two to eight sysCLOCK PLLs and two DLLs, located on the left and right sides of the device. There are eight dedicated clock inputs, two on each side of the device, with the exception of the LatticeECP2M 256-fpBGA package devices which have six dedicated clock inputs on the device. Figure 2-10 shows the primary clock sources.





Note: This diagram shows sources for the ECP2-50 device. Smaller LatticeECP2 devices have fewer SPLLs. All LatticeECP2M devices have six SPLLs.



#### Secondary Clock/Control Sources

LatticeECP2/M devices derive secondary clocks (SC0 through SC7) from eight dedicated clock input pads and the rest from routing. Figure 2-11 shows the secondary clock sources.







#### Edge Clock Sources

Edge clock resources can be driven from a variety of sources at the same edge. Edge clock resources can be driven from adjacent edge clock PIOs, primary clock PIOs, PLLs/DLLs and clock dividers as shown in Figure 2-12.

#### Figure 2-12. Edge Clock Sources





#### Primary Clock Routing

The clock routing structure in LatticeECP2/M devices consists of a network of eight primary clock lines (CLK0 through CLK7) per quadrant. The primary clocks of each quadrant are generated from muxes located in the center of the device. All the clock sources are connected to these muxes. Figure 2-13 shows the clock routing for one quadrant. Each quadrant mux is identical. If desired, any clock can be routed globally





#### **Dynamic Clock Select (DCS)**

The DCS is a smart multiplexer function available in the primary clock routing. It switches between two independent input clock sources without any glitches or runt pulses. This is achieved regardless of when the select signal is toggled. There are two DCS blocks per quadrant; in total, there are eight DCS blocks per device. The inputs to the DCS block come from the center muxes. The output of the DCS is connected to primary clocks CLK6 and CLK7 (see Figure 2-13).

Figure 2-14 shows the timing waveforms of the default DCS operating mode. The DCS block can be programmed to other modes. For more information about the DCS, please see the list of additional technical documentation at the end of this data sheet.

#### Figure 2-14. DCS Waveforms



#### Secondary Clock/Control Routing

Secondary clocks in the LatticeECP2 devices are region-based resources. The benefit of region-based resources is the relatively low injection delay and skew within the region, as compared to primary clocks. EBR/DSP rows and a special vertical routing channel bound the secondary clock regions. This special vertical routing channel aligns with either the left edge of the center DSP block in the DSP row or the center of the DSP row. Figure 2-15 shows



this special vertical routing channel and the eight secondary clock regions for the ECP2-50. LatticeECP2 devices have four secondary clocks (SC0 to SC3) which are distrubed to every region.

The secondary clock muxes are located in the center of the device. Figure 2-16 shows the mux structure of the secondary clock routing. Secondary clocks SC0 to SC3 are used for clock and control and SC4 to SC7 are used for high fan-out signals.







#### Figure 2-16. Secondary Clock Selection



#### Slice Clock Selection

Figure 2-17 shows the clock selections and Figure 2-18 shows the control selections for Slice0 through Slice2. All the primary clocks and the four secondary clocks are routed to this clock selection mux. Other signals can be used as a clock input to the slices via routing. Slice controls are generated from the secondary clocks or other signals connected via routing.

If none of the signals are selected for both clock and control then the default value of the mux output is 1. Slice 3 does not have any registers; therefore it does not have the clock or control muxes.

#### Figure 2-17. Slice0 through Slice2 Clock Selection





#### Figure 2-18. Slice0 through Slice2 Control Selection



#### **Edge Clock Routing**

LatticeECP2/M devices have a number of high-speed edge clocks that are intended for use with the PIOs in the implementation of high-speed interfaces. There are eight edge clocks per device: two edge clocks per edge. Different PLL and DLL outputs are routed to the two muxes on the left and right sides of the device. In addition, the CLKO signal (generated from the DLLDELA block) is routed to all the edge clock muxes on the left and right sides of the device. Figure 2-19 shows the selection muxes for these clocks.

#### Figure 2-19. Edge Clock Mux Connections





#### sysMEM Memory

LatticeECP2/M devices contains a number of sysMEM Embedded Block RAM (EBR). The EBR consists of an 18-Kbit RAM with dedicated input and output registers.

#### sysMEM Memory Block

The sysMEM block can implement single port, dual port or pseudo dual port memories. Each block can be used in a variety of depths and widths as shown in Table 2-6. FIFOs can be implemented in sysMEM EBR blocks by implementing support logic with PFUs. The EBR block facilitates parity checking by supporting an optional parity bit for each data byte. EBR blocks provide byte-enable support for configurations with18-bit and 36-bit data widths.

#### Table 2-6. sysMEM Block Configurations

| Memory Mode      | Configurations                                                              |
|------------------|-----------------------------------------------------------------------------|
| Single Port      | 16,384 x 1<br>8,192 x 2<br>4,096 x 4<br>2,048 x 9<br>1,024 x 18<br>512 x 36 |
| True Dual Port   | 16,384 x 1<br>8,192 x 2<br>4,096 x 4<br>2,048 x 9<br>1,024 x 18             |
| Pseudo Dual Port | 16,384 x 1<br>8,192 x 2<br>4,096 x 4<br>2,048 x 9<br>1,024 x 18<br>512 x 36 |

#### Bus Size Matching

All of the multi-port memory modes support different widths on each of the ports. The RAM bits are mapped LSB word 0 to MSB word 0, LSB word 1 to MSB word 1, and so on. Although the word size and number of words for each port varies, this mapping scheme applies to each port.

#### RAM Initialization and ROM Operation

If desired, the contents of the RAM can be pre-loaded during device configuration. By preloading the RAM block during the chip configuration cycle and disabling the write controls, the sysMEM block can also be utilized as a ROM.

#### Memory Cascading

Larger and deeper blocks of RAM can be created using EBR sysMEM Blocks. Typically, the Lattice design tools cascade memory transparently, based on specific design inputs.

#### Single, Dual and Pseudo-Dual Port Modes

In all the sysMEM RAM modes the input data and address for the ports are registered at the input of the memory array. The output data of the memory is optionally registered at the output.

EBR memory supports two forms of write behavior for single port or dual port operation:

1. Normal – Data on the output appears only during a read cycle. During a write cycle, the data (at the current address) does not appear on the output. This mode is supported for all data widths.



2. Write Through – A copy of the input data appears at the output of the same port during a write cycle. This mode is supported for all data widths.

#### **Memory Core Reset**

The memory array in the EBR utilizes latches at the A and B output ports. These latches can be reset asynchronously or synchronously. RSTA and RSTB are local signals, which reset the output latches associated with Port A and Port B, respectively. The Global Reset (GSRN) signal resets both ports. The output data latches and associated resets for both ports are as shown in Figure 2-20.

#### Figure 2-20. Memory Core Reset



For further information about the sysMEM EBR block, please see the the list of additional technical documentation at the end of this data sheet.

#### EBR Asynchronous Reset

EBR asynchronous reset or GSR (if used) can only be applied if all clock enables are low for a clock cycle before the reset is applied and released a clock cycle after the reset is released, as shown in Figure 2-21. The GSR input to the EBR is always asynchronous.

#### Figure 2-21. EBR Asynchronous Reset (Including GSR) Timing Diagram

| Reset |  |
|-------|--|
| Clock |  |
| Clock |  |

If all clock enables remain enabled, the EBR asynchronous reset or GSR may only be applied and released after the EBR read and write clock inputs are in a steady state condition for a minimum of  $1/f_{MAX}$  (EBR clock). The reset release must adhere to the EBR synchronous reset setup time before the next active read or write clock edge.



If an EBR is pre-loaded during configuration, the GSR input must be disabled or the release of the GSR during device Wake Up must occur before the release of the device I/Os becomes active.

These instructions apply to all EBR RAM and ROM implementations.

Note that there are no reset restrictions if the EBR synchronous reset is used and the EBR GSR input is disabled.

### sysDSP™ Block

The LatticeECP2/M family provides a sysDSP block, making it ideally suited for low cost, high performance Digital Signal Processing (DSP) applications. Typical functions used in these applications are Finite Impulse Response (FIR) filters, Fast Fourier Transforms (FFT) functions, Correlators, Reed-Solomon/Turbo/Convolution encoders and decoders. These complex signal processing functions use similar building blocks such as multiply-adders and multiply-accumulators.

#### sysDSP Block Approach Compared to General DSP

Conventional general-purpose DSP chips typically contain one to four (Multiply and Accumulate) MAC units with fixed data-width multipliers; this leads to limited parallelism and limited throughput. Their throughput is increased by higher clock speeds. The LatticeECP2/M, on the other hand, has many DSP blocks that support different data-widths. This allows the designer to use highly parallel implementations of DSP functions. The designer can optimize the DSP performance vs. area by choosing an appropriate level of parallelism. Figure 2-22 compares the fully serial and the mixed parallel and serial implementations.

#### Figure 2-22. Comparison of General DSP and LatticeECP2/M Approaches



#### sysDSP Block Capabilities

The sysDSP block in the LatticeECP2/M family supports four functional elements in three 9, 18 and 36 data path widths. The user selects a function element for a DSP block and then selects the width and type (signed/unsigned) of its operands. The operands in the LatticeECP2/M family sysDSP Blocks can be either signed or unsigned but not mixed within a function element. Similarly, the operand widths cannot be mixed within a block. In the LatticeECP2/M family the DSP elements can be concatenated.

The resources in each sysDSP block can be configured to support the following elements:



- MULT (Multiply)
- MAC (Multiply, Accumulate)
- MULTADDSUB (Multiply, Addition/Subtraction)
- MULTADDSUBSUM (Multiply, Addition/Subtraction, Accumulate)

The number of elements available on each block depends in the width selected from the three available options x9, x18, and x36. A number of these elements are concatenated for highly parallel implementations of DSP functions. Table 2-7 shows the capabilities of the block.

#### Table 2-7. Maximum Number of Elements in a Block

| Width of Multiply | x9 | x18 | x36 |
|-------------------|----|-----|-----|
| MULT              | 8  | 4   | 1   |
| MAC               | 2  | 2   | —   |
| MULTADDSUB        | 4  | 2   | —   |
| MULTADDSUBSUM     | 2  | 1   | —   |

Some options are available in four elements. The input register in all the elements can be directly loaded or can be loaded as a shift register from previous operand registers. By selecting "dynamic operation" the following operations are possible:

- In the 'Signed/Unsigned' options the operands can be switched between signed and unsigned on every cycle.
- In the 'Add/Sub' option the Accumulator can be switched between addition and subtraction on every cycle.
- The loading of operands can switch between parallel and serial operations.