The Short Circuit

Mike Morgan's technical journal

 

The circuit in figure 1 is useful for passing pulses across asynchronous clock domains. Its purpose is to convert a pulse synchronized to one clock into a pulse synchronized to another clock, while mitigating the metastability inherent in clock domain crossing (CDC).(1) VHDL source code can be found below.

For our purposes, a pulse is defined as a synchronous signal one clock period in duration. Pulses may be used, for example, to load registers or to indicate status such as "output valid". When passing buses from one clock domain to another, a resynchronized pulse can be used to ensure that all bits of a data bus are stable in the destination domain before they are used as valid data.

Figure 1: Circuit for passing pulse across clock domains

Theory of circuit

A pulse is converted to a state change with a T flip flop, TFF0. The state change is synchronized to the destination clock domain via two cascaded D flip flops (DFFs), DFF0 and DFF1. The output of DFF1 is cascaded to DFF2, and the input and output of DFF2 are exclusive-ORed to create a pulse in the destination clock domain.

Detailed explanation of circuit 

The flip flop TFF0 in figure 1 is a T flip flop, whose output toggles when the input is '1'. When the input is '0', the output retains its previous value. 

T flip flops (TFFs) can be easily inferred with synthesizers based on RTL descriptions similar to figure 2. If a TFF is not part of the target library, synthesizers will often infer the logic in figure 3, using a D flip flop (DFF) and a two-input "exclusive or" (XOR) gate to construct a TFF. 

-- VHDL snippet to infer a T flip flop (TFF)
TFF_PROCESS : process (iClk,iRSTn)
begin
  if(iRSTn = '0') then
    t <= '0';
  elsif(rising_edge(iClk)) then
    t <= t XOR iPulseA;
  end if;
end process TFF_PROCESS;

Figure 2: VHDL snippet to infer T flip flop.

Figure 3: T Flip Flop using D Flip Flop and XOR gate

 

 

Referring back to figure 1, the purpose of the T flip flop is to turn a pulse into a change of output, i.e. '0' to '1' or '1' to '0'. With each '1' input clocked in, the TFF output changes state. 

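The purpose of DFF0 and DFF1, excerpted in figure 4, is to bound metastability. DFF0 may go metastable when the TFF0 output changes too close to an iClkB edge; DFF1 gives it a full iClkB period to settle before the value is used. The pair functions even if iClkA is at a much higher frequency than iClkB, so long as consecutive iPulseA pulses are spaced no less than two destination clock periods apart.(2)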
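To make the role of DFF0 and DFF1 concrete, here is a minimal sketch of the two-flop stage on its own. The entity name sync2 and the signal names iAsync, oSync, meta and sync are made up for illustration and do not appear in the full source below, which folds these two flops into a three-bit shift register instead.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical stand-alone version of DFF0/DFF1 from figure 4.
entity sync2 is
port(
  iClkB  : IN   std_logic; -- destination clock domain
  iRSTBn : IN   std_logic; -- active low reset
  iAsync : IN   std_logic; -- level from another domain (e.g. TFF0 output)
  oSync  : OUT  std_logic  -- same level, resynchronized to iClkB
);
end sync2;

architecture rtl of sync2 is
  signal meta : std_logic; -- DFF0: may go metastable
  signal sync : std_logic; -- DFF1: samples DFF0 one iClkB period later
begin

SYNC_PROCESS : process (iClkB,iRSTBn)
begin
  if(iRSTBn = '0') then
    meta <= '0';
    sync <= '0';
  elsif(rising_edge(iClkB)) then
    meta <= iAsync; -- only this flop sees the asynchronous input
    sync <= meta;   -- downstream logic should use only this output
  end if;
end process SYNC_PROCESS;

oSync <= sync;

end rtl;
```

Note that only a level (the TFF0 output) is passed through this stage, never the raw one-clock pulse, which a slower destination clock could miss entirely.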

Figure 4: Metastability mitigating cascaded DFFs

 

The last DFF, DFF2, is used to convert the change in state back into a pulse.  The input and the output of DFF2 are XORed, resulting in a pulse whenever they differ. 

The VHDL snippet to infer the three cascaded DFFs and the exclusive or gate follows: 

D_PROCESS : process (iClkB,iRSTBn)
begin
  if(iRSTBn = '0') then
    d <= (others => '0');
  elsif(rising_edge(iClkB)) then
    d <= d(d'high -1 downto 0) & t;
  end if;
end process D_PROCESS;

-- create pulse for every state change
-- XORing last D flip flop input/output.
oPulseB <= d(d'high) XOR d(d'high -1);

Figure 5: VHDL snippet to infer three cascaded DFFs and pulse generating XOR gate

Circuit operation is further illustrated by the waveform in figure 6.

Figure 6: Waveform of circuit shown in figure 1
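A waveform like figure 6 can be reproduced in simulation. The following is a minimal testbench sketch, not part of the original post: it assumes the pacd entity from the source listing below has been compiled into the work library, and the clock periods (7 ns and 10 ns), the pulse count and the reset timing are arbitrary choices for illustration. It sends pulses spaced five ClkA periods (35 ns) apart, which satisfies the two-ClkB-period spacing in footnote 2, and checks that the same number of pulses appears in the ClkB domain.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

entity pacd_tb is
end pacd_tb;

architecture sim of pacd_tb is
  constant CLKA_PERIOD : time    := 7 ns;  -- assumed source clock period
  constant CLKB_PERIOD : time    := 10 ns; -- assumed destination clock period
  constant NUM_PULSES  : natural := 20;    -- arbitrary number of test pulses

  signal clkA, clkB   : std_logic := '0';
  signal rstAn, rstBn : std_logic := '0';
  signal pulseA       : std_logic := '0';
  signal pulseB       : std_logic;
  signal countB       : natural   := 0;
  signal done         : boolean   := false;
begin

-- free running, unrelated clocks; stopped when the test is done
clkA <= not clkA after CLKA_PERIOD / 2 when not done else '0';
clkB <= not clkB after CLKB_PERIOD / 2 when not done else '0';

DUT : entity work.pacd
port map(
  iPulseA => pulseA,
  iClkA   => clkA,
  iRSTAn  => rstAn,
  iClkB   => clkB,
  iRSTBn  => rstBn,
  oPulseB => pulseB
);

-- count one-clock pulses seen in the destination domain
COUNT_PROCESS : process (clkB)
begin
  if(rising_edge(clkB)) then
    if(pulseB = '1') then
      countB <= countB + 1;
    end if;
  end if;
end process COUNT_PROCESS;

STIM_PROCESS : process
begin
  wait for 22 ns;
  rstAn <= '1';
  rstBn <= '1';
  for i in 1 to NUM_PULSES loop
    wait until rising_edge(clkA);
    pulseA <= '1';               -- one ClkA period wide
    wait until rising_edge(clkA);
    pulseA <= '0';
    for j in 1 to 3 loop         -- idle time between pulses
      wait until rising_edge(clkA);
    end loop;
  end loop;
  wait for 5 * CLKB_PERIOD;      -- let the last pulse propagate
  assert countB = NUM_PULSES
    report "pulse count mismatch" severity failure;
  report "all pulses crossed to ClkB domain";
  done <= true;
  wait;
end process STIM_PROCESS;

end sim;
```

An RTL simulation like this only checks the functional behavior. It does not model metastability, which still has to be addressed by the synchronizer structure and by timing constraints for the target library.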

 

Example Block Diagram

Figure 7 shows an example partial block diagram of a DSP system that acquires data in one clock domain but does the majority of its processing in another clock domain. The diagram contains the pulse across clock domain (PACD) module, whose source code can be found below.

Figure 7: PACD used in signal processing application

 

The PACD module converts the data valid pulse from the interpolating CIC filter into a pulse synchronized to the ClkB domain. The resynchronized pulse is then fed to the IIR bandpass filter, indicating that the input data is ready to be loaded. This ensures that by the time the IIR bandpass filter latches the data, the data has been stable for at least three ClkB clocks (creating a multi-cycle path), as illustrated in figure 8.

Figure 8: Waveform of DSP block shown in figure 7
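The data capture on the ClkB side might look like the following sketch. It is an illustration only: the entity bus_capture, its WIDTH generic and its port names are hypothetical and not taken from the DSP system in figure 7. It assumes the source domain holds iDataA stable from the time the data valid pulse is issued until after the resynchronized pulse arrives.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical destination-side register loaded by the PACD output.
entity bus_capture is
generic(
  WIDTH : positive := 16 -- assumed bus width
);
port(
  iClkB   : IN   std_logic; -- destination clock domain
  iRSTBn  : IN   std_logic; -- active low reset
  iDataA  : IN   std_logic_vector(WIDTH-1 downto 0); -- bus from ClkA domain, held stable
  iPulseB : IN   std_logic; -- oPulseB from pacd
  oDataB  : OUT  std_logic_vector(WIDTH-1 downto 0); -- bus registered in ClkB domain
  oValidB : OUT  std_logic  -- 1 clock width "data valid" in ClkB domain
);
end bus_capture;

architecture rtl of bus_capture is
begin

CAPTURE_PROCESS : process (iClkB,iRSTBn)
begin
  if(iRSTBn = '0') then
    oDataB  <= (others => '0');
    oValidB <= '0';
  elsif(rising_edge(iClkB)) then
    oValidB <= iPulseB;
    if(iPulseB = '1') then
      -- bus bits are not individually synchronized; they are
      -- sampled only after they have had time to settle
      oDataB <= iDataA;
    end if;
  end if;
end process CAPTURE_PROCESS;

end rtl;
```

The bus itself passes through no synchronizer flops here. The design relies on the resynchronized pulse arriving well after the bus has stopped changing, which is why the pulse spacing in the source domain matters.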

 

Footnotes 

  • 1 Very brief background on metastability: Metastability is a topic worthy of its own post, and it is impossible to do the topic justice here. For now, suffice it to say that if a signal changes state, '0' to '1' or vice versa, too close to the clock edge used to clock the signal into a register, the physical output of the register can go into an indeterminate state. This indeterminate state can take any of a variety of physical manifestations, which can lead to sundry problems: glitches, invalid data, spurious state machine transitions, and increased power consumption. However, when a design utilizes only synchronous signals and a single clock domain, you can ensure that none of your registers will go metastable using a static timing analyzer (STA). Static timing analysis is yet another topic worthy of its own post. Briefly, STAs are automated engineering software tools that examine a netlist and associated timing data to ensure combinatorial paths between registers are sufficiently long, and sufficiently short, that setup and hold times are not violated. STAs are very simple in theory and practice. Their power derives from their ability to perform a multitude of simple calculations quickly, without error. STAs and metastability will be covered in future posts. For now, it is enough to know that STAs do not validate signals crossing asynchronous clock domains. If you cannot wait, here is an old but good whitepaper on metastability by TI: Metastable Response in 5-V Logic Circuits. The "5-V logic" in the title is a giveaway that it's dated, but the fundamentals of metastability haven't changed with lower operating logic voltages.
  • 2 Minimum pulse spacing: To be precise, the input pulses, though synchronized to iClkA, must be spaced at least one iClkB period plus one setup time (tsu) and one hold time (thd) apart. However, it is often easier to just make sure the pulses in the iClkA domain are spaced at least two iClkB periods apart, which is a safe assumption so long as register timing specifications are such that tsu + thd ≤ Period_iClkB, the period of the iClkB clock. When the pulses are used to synchronize data buses, additional pulse spacing may be required. For example, the block diagram in figure 7 requires that pulses be spaced at least roughly four ClkB periods apart.
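The spacing rule in footnote 2 can be checked in simulation with a small monitor such as the sketch below. This is a simulation-only helper that is not part of the original design: the entity pulse_spacing_check and its CLKB_PERIOD generic are hypothetical, and the generic must be set by hand to the actual iClkB period. It uses the simpler two-period rule rather than the exact tsu/thd expression.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.all;

-- Hypothetical simulation-only monitor; not synthesizable.
entity pulse_spacing_check is
generic(
  CLKB_PERIOD : time := 10 ns -- assumed iClkB period, set to match design
);
port(
  iClkA   : IN   std_logic; -- source clock domain
  iPulseA : IN   std_logic  -- 1 clock width pulse being monitored
);
end pulse_spacing_check;

architecture sim of pulse_spacing_check is
begin

CHECK_PROCESS : process (iClkA)
  variable last_pulse : time    := 0 ns;  -- time of previous pulse
  variable seen_pulse : boolean := false; -- no spacing to check on first pulse
begin
  if(rising_edge(iClkA)) then
    if(iPulseA = '1') then
      if(seen_pulse) then
        assert (now - last_pulse) >= 2 * CLKB_PERIOD
          report "iPulseA pulses closer than two iClkB periods"
          severity error;
      end if;
      last_pulse := now;
      seen_pulse := true;
    end if;
  end if;
end process CHECK_PROCESS;

end sim;
```

Instantiating a monitor like this next to the pacd module in a testbench flags stimulus that violates the spacing assumption, which is the situation discussed in the comments below.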

Pulse Across Clock Domain VHDL source 

library IEEE;
use IEEE.STD_LOGIC_1164.all;
-- Author:  Mike Morgan
--
-- Theory of operation:  This is a simple circuit to pass a
-- pulse synchronized to a source clock to a destination clock
-- domain using 1 T flip flop (TFF) and 3 D flip flops (DFFs).
--
-- The source pulse must be one source clock in duration and
-- the output pulse will be one destination clock in duration.
--
-- This circuit is useful for passing 1 clock duration data
-- enables across clock domains.
--
-- Disclaimer:  This file is released to the public domain for
-- illustration purposes only.  Any use is at risk of the person
-- or company reusing the code.  Validation via simulation and
-- timing analysis based on target library and design parameters
-- is responsibility of company or person using this code.
--
-- Please see footnote 2 at  http://sc.morganisms.net/?p=226
-- regarding constraints on iPulseA timing
--
entity pacd is -- Pulse Across clock domain
port(
  iPulseA  : IN   std_logic; -- 1 clock width pulse from source domain
  iClkA    : IN   std_logic; -- source clock domain
  iRSTAn   : IN   std_logic; -- active low reset
  iClkB    : IN   std_logic; -- destination clock domain
  iRSTBn   : IN   std_logic; -- active low reset
  oPulseB  : OUT  std_logic  -- 1 clock width pulse in destination domain
);
end pacd;

architecture rtl of pacd is
  signal t : std_logic;
  signal d : std_logic_vector(2 downto 0);
begin -- architecture

-- infer a T flip flop in source domain
T_PROCESS : process (iClkA,iRSTAn)
begin
  if(iRSTAn = '0') then
    t <= '0';
  elsif(rising_edge(iClkA)) then
    t <= t XOR iPulseA;
  end if;
end process T_PROCESS;

-- Feed T output to three D flip flops in destination clock
-- domain.  First two flip flops are to filter metastability
-- inherent in clock domain crossing of T, and the final D
-- flip flop is used to turn the output of the T flip flop
-- (now synchronized to the destination domain) into a pulse
-- in the destination clock domain.
D_PROCESS : process (iClkB,iRSTBn)
begin
  if(iRSTBn = '0') then
    d <= (others => '0');
  elsif(rising_edge(iClkB)) then
    d <= d(d'high -1 downto 0) & t;
  end if;
end process D_PROCESS;

-- create pulse every toggle using input and output
-- of last D flip flop.
oPulseB <= d(d'high) XOR d(d'high -1);

end rtl;

Credits

  • I learned this circuit from Joe Meyer, though I think it exists in the toolbelts of many engineers.

10 Comments »

  1. Thanks for this Mike, I have used this in a design of mine (and referenced of course) and appreciate you sharing it.

  2. Hello,

    there is unfortunately a bug in this synchronizer. It may fail to pass a pulse from the clkA domain to the clkB domain if the clkA clock is slower than clkB. (And I wouldn't trust it unless clkB was significantly faster than clkA.)

  3. Hi Andreas,

    Make sure you read and understand footnote 2; make sure your test bench runs long enough for the pulse to make its way through the circuit; and make sure your ClkA synchronized input pulses are spaced at least 2 ClkB clock periods apart.

    Good luck.

  4. Ah, sorry, I missed footnote 2 as I only looked at the circuit diagram and the source code. Sorry about that.

  5. No problem.

  6. Definitely gonna recommend this post to a few colleagues

  7. Hello Mike,
    Thanks a lot for your post. It's a great circuit. Also it's really nice of you to share with the community.

  8. Hi Mike,

    Could you please explain footnote 2, how you arrived at the Minimum pulse spacing ?

  9. Hi Mike,
    Used this to remote trigger a custom digital camera using standard 3.3V LVCMOS over 8 feet of wire.
    Was flawless over millions of runs at 400hz. Thanks for posting this solution.

  10. Thanks for the feedback! It is always good to read nice comments like that.

    Keep up the good work--sounds like you have an interesting project.

© 2017 The Short Circuit