Wishbone templates for ZPUino


Jack Gassett


A while back I asked Alvie if he had any easy-to-understand templates for a Wishbone peripheral that could be plugged into the ZPUino. He quickly threw together four examples and sent them to me, but I never got a chance to do anything with them. I was just looking them over today when I realized I should post them to the forums so everyone can enjoy them! So here goes:

 

Example 1

-- This example uses asynchronous outputs.

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;

entity example1 is
  port (
    wb_clk_i:   in  std_logic;                     -- Wishbone clock
    wb_rst_i:   in  std_logic;                     -- Wishbone reset (synchronous)
    wb_dat_o:   out std_logic_vector(31 downto 0); -- Wishbone data output (32 bits)
    wb_dat_i:   in  std_logic_vector(31 downto 0); -- Wishbone data input  (32 bits)
    wb_adr_i:   in  std_logic_vector(31 downto 2); -- Wishbone address input  (30 bits)
    wb_we_i:    in  std_logic;                     -- Wishbone write enable signal
    wb_cyc_i:   in  std_logic;                     -- Wishbone cycle signal
    wb_stb_i:   in  std_logic;                     -- Wishbone strobe signal
    wb_ack_o:   out std_logic                      -- Wishbone acknowledge out signal
  );
end entity example1;

architecture rtl of example1 is

  signal register1: std_logic_vector(31 downto 0); -- Register 1 (32 bits)
  signal register2: std_logic_vector(31 downto 0); -- Register 2 (32 bits)
  signal register3: std_logic_vector(7 downto 0);  -- Register 3 (8 bits)

begin

  -- Asynchronous acknowledge
  wb_ack_o <= '1' when wb_cyc_i='1' and wb_stb_i='1' else '0';

  -- Multiplex the data output (asynchronous)
  process(register1,register2,register3, wb_adr_i)
  begin
    -- Multiplex the read depending on the address. Use only the 2 lowest bits of addr
    case wb_adr_i(3 downto 2) is
      when "00" =>
        wb_dat_o <= register1;  -- Output register1
      when "01" =>
        wb_dat_o <= register2;  -- Output register2
      when "10" =>
        wb_dat_o(31 downto 8) <= (others => '0'); -- We put all upper 24 bits to zero
        wb_dat_o(7 downto 0) <= register3;        -- since register3 only has 8 bits
      when others =>
        wb_dat_o <= (others => 'X'); -- Return undefined for all other addresses
    end case;
  end process;

  process(wb_clk_i)
  begin
    if rising_edge(wb_clk_i) then  -- Synchronous to the rising edge of the clock
      if wb_rst_i='1' then
        -- Reset request, put register1 and register2 with zeroes,
        -- put register 3 with binary 10101010b
        register1 <= (others => '0');
        register2 <= (others => '0');
        register3 <= "10101010";
      else -- Not reset
        -- Check if someone is writing
        if wb_cyc_i='1' and wb_stb_i='1' and wb_we_i='1' then
          -- Yes, it's a write. See for which register based on address
          case wb_adr_i(3 downto 2) is
            when "00" =>
              register1 <= wb_dat_i;  -- Set register1
            when "01" =>
              register2 <= wb_dat_i;  -- Set register2
            when "10" =>
              register3 <= wb_dat_i(7 downto 0); -- Only lower 8 bits for register3
            when others =>
              null; -- Nothing to do for other addresses
          end case;
        end if;
      end if;
    end if;
  end process;

end rtl;

Example 2

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;

entity example2 is
  port (
    wb_clk_i:   in  std_logic;                     -- Wishbone clock
    wb_rst_i:   in  std_logic;                     -- Wishbone reset (synchronous)
    wb_dat_o:   out std_logic_vector(31 downto 0); -- Wishbone data output (32 bits)
    wb_dat_i:   in  std_logic_vector(31 downto 0); -- Wishbone data input  (32 bits)
    wb_adr_i:   in  std_logic_vector(31 downto 2); -- Wishbone address input  (30 bits)
    wb_we_i:    in  std_logic;                     -- Wishbone write enable signal
    wb_cyc_i:   in  std_logic;                     -- Wishbone cycle signal
    wb_stb_i:   in  std_logic;                     -- Wishbone strobe signal
    wb_ack_o:   out std_logic                      -- Wishbone acknowledge out signal
  );
end entity example2;

architecture rtl of example2 is

  signal register1: std_logic_vector(31 downto 0); -- Register 1 (32 bits)
  signal register2: std_logic_vector(31 downto 0); -- Register 2 (32 bits)
  signal register3: std_logic_vector(7 downto 0);  -- Register 3 (8 bits)
  signal ack_i:     std_logic;  -- Internal ACK signal (flip flop)

begin

  -- This example uses fully synchronous outputs.
  wb_ack_o <= ack_i; -- Tie ACK output to our flip flop

  process(wb_clk_i)
  begin
    if rising_edge(wb_clk_i) then  -- Synchronous to the rising edge of the clock

      -- Always set output data on rising edge, even if reset is set.
      case wb_adr_i(3 downto 2) is
        when "00" =>
          wb_dat_o <= register1;  -- Output register1
        when "01" =>
          wb_dat_o <= register2;  -- Output register2
        when "10" =>
          wb_dat_o(31 downto 8) <= (others => '0'); -- We put all upper 24 bits to zero
          wb_dat_o(7 downto 0) <= register3;        -- since register3 only has 8 bits
        when others =>
          wb_dat_o <= (others => 'X'); -- Return undefined for all other addresses
      end case;

      ack_i <= '0'; -- Reset ACK value by default

      if wb_rst_i='1' then
        -- Reset request, put register1 and register2 with zeroes,
        -- put register 3 with binary 10101010b
        register1 <= (others => '0');
        register2 <= (others => '0');
        register3 <= "10101010";
      else -- Not reset
        -- See if we have not already acknowledged a cycle, otherwise we need to
        -- ignore the apparent request, because wishbone signals are still set
        if ack_i='0' then
          -- Check if someone is accessing
          if wb_cyc_i='1' and wb_stb_i='1' then
            ack_i<='1'; -- Acknowledge the read/write. Actual read data was set above.
            if wb_we_i='1' then
              -- It's a write. See for which register based on address
              case wb_adr_i(3 downto 2) is
                when "00" =>
                  register1 <= wb_dat_i;  -- Set register1
                when "01" =>
                  register2 <= wb_dat_i;  -- Set register2
                when "10" =>
                  register3 <= wb_dat_i(7 downto 0); -- Only lower 8 bits for register3
                when others =>
                  null; -- Nothing to do for other addresses
              end case;
            end if;  -- if wb_we_i='1'
          end if; -- if wb_cyc_i='1' and wb_stb_i='1'
        end if; -- if ack_i='0'
      end if; -- if wb_rst_i='1'
    end if; -- if rising_edge(wb_clk_i)
  end process;

end rtl;

Example 3

-- This example uses fully synchronous outputs and record-based registers

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;

entity example3 is
  port (
    wb_clk_i:   in  std_logic;                     -- Wishbone clock
    wb_rst_i:   in  std_logic;                     -- Wishbone reset (synchronous)
    wb_dat_o:   out std_logic_vector(31 downto 0); -- Wishbone data output (32 bits)
    wb_dat_i:   in  std_logic_vector(31 downto 0); -- Wishbone data input  (32 bits)
    wb_adr_i:   in  std_logic_vector(31 downto 2); -- Wishbone address input  (30 bits)
    wb_we_i:    in  std_logic;                     -- Wishbone write enable signal
    wb_cyc_i:   in  std_logic;                     -- Wishbone cycle signal
    wb_stb_i:   in  std_logic;                     -- Wishbone strobe signal
    wb_ack_o:   out std_logic                      -- Wishbone acknowledge out signal
  );
end entity example3;

architecture rtl of example3 is

  type regstype is record
    register1: std_logic_vector(31 downto 0); -- Register 1 (32 bits)
    register2: std_logic_vector(31 downto 0); -- Register 2 (32 bits)
    register3: std_logic_vector(7 downto 0);  -- Register 3 (8 bits)
    ack:       std_logic; -- Ack signal to output (register/ff)
    dat:       std_logic_vector(31 downto 0); -- Data out register/ff
  end record;

  signal r: regstype; -- Main registers

begin

  wb_ack_o <= r.ack; -- Tie ACK output to our flip flop
  wb_dat_o <= r.dat; -- And data out also

  -- This is a single process, with mixed asynchronous and synchronous parts
  process(wb_adr_i,wb_dat_i,wb_clk_i,wb_cyc_i,wb_stb_i,wb_we_i,wb_rst_i,r)
    variable v: regstype; -- Local variable with register values
  begin

    v := r; -- Set v with our saved regs. We use 'v' to write, and 'r' to read

    -- Always set output data on rising edge, even if reset is set.
    case wb_adr_i(3 downto 2) is
      when "00" =>
        v.dat := r.register1;  -- Output register1
      when "01" =>
        v.dat := r.register2;  -- Output register2
      when "10" =>
        v.dat(31 downto 8) := (others => '0'); -- We put all upper 24 bits to zero
        v.dat(7 downto 0) := r.register3;        -- since register3 only has 8 bits
      when others =>
        v.dat := (others => 'X'); -- Return undefined for all other addresses
    end case;

    -- Reset ACK value by default, as in example 2. Without this, v.ack keeps
    -- the '1' copied from r.ack and the acknowledge never drops again.
    v.ack := '0';

    if wb_rst_i='1' then
      -- Reset request, put register1 and register2 with zeroes,
      -- put register 3 with binary 10101010b
      v.register1 := (others => '0');
      v.register2 := (others => '0');
      v.register3 := "10101010";
    else -- Not reset
      -- See if we have not already acknowledged a cycle, otherwise we need to
      -- ignore the apparent request, because wishbone signals are still set
      if r.ack='0' then
        -- Check if someone is accessing
        if wb_cyc_i='1' and wb_stb_i='1' then
          v.ack := '1'; -- Acknowledge the read/write. Actual read data was set above.
          if wb_we_i='1' then
            -- It's a write. See for which register based on address
            case wb_adr_i(3 downto 2) is
              when "00" =>
                v.register1 := wb_dat_i;  -- Set register1
              when "01" =>
                v.register2 := wb_dat_i;  -- Set register2
              when "10" =>
                v.register3 := wb_dat_i(7 downto 0); -- Only lower 8 bits for register3
              when others =>
                null; -- Nothing to do for other addresses
            end case;
          end if;  -- if wb_we_i='1'
        end if; -- if wb_cyc_i='1' and wb_stb_i='1'
      end if; -- if r.ack='0'
    end if; -- if wb_rst_i='1'

    if rising_edge(wb_clk_i) then  -- Synchronous to the rising edge of the clock
      r <= v; -- Update registers on clock change
    end if;

  end process;

end rtl;

Example 4

-- This example uses asynchronous outputs and record-based registers

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;
use ieee.numeric_std.all;

entity example4 is
  port (
    wb_clk_i:   in  std_logic;                     -- Wishbone clock
    wb_rst_i:   in  std_logic;                     -- Wishbone reset (synchronous)
    wb_dat_o:   out std_logic_vector(31 downto 0); -- Wishbone data output (32 bits)
    wb_dat_i:   in  std_logic_vector(31 downto 0); -- Wishbone data input  (32 bits)
    wb_adr_i:   in  std_logic_vector(31 downto 2); -- Wishbone address input  (30 bits)
    wb_we_i:    in  std_logic;                     -- Wishbone write enable signal
    wb_cyc_i:   in  std_logic;                     -- Wishbone cycle signal
    wb_stb_i:   in  std_logic;                     -- Wishbone strobe signal
    wb_ack_o:   out std_logic                      -- Wishbone acknowledge out signal
  );
end entity example4;

architecture rtl of example4 is

  type regstype is record
    register1: std_logic_vector(31 downto 0); -- Register 1 (32 bits)
    register2: std_logic_vector(31 downto 0); -- Register 2 (32 bits)
    register3: std_logic_vector(7 downto 0);  -- Register 3 (8 bits)
  end record;

  signal r: regstype; -- Main registers

begin

  -- This is a single process, with mixed asynchronous and synchronous parts
  process(wb_adr_i,wb_dat_i,wb_clk_i,wb_cyc_i,wb_stb_i,wb_we_i,wb_rst_i,r)
    variable v: regstype; -- Local variable with register values
  begin

    v := r; -- Set v with our saved regs. We use 'v' to write, and 'r' to read

    -- Always set output asynchronously
    case wb_adr_i(3 downto 2) is
      when "00" =>
        wb_dat_o <= r.register1;  -- Output register1
      when "01" =>
        wb_dat_o <= r.register2;  -- Output register2
      when "10" =>
        wb_dat_o(31 downto 8) <= (others => '0'); -- We put all upper 24 bits to zero
        wb_dat_o(7 downto 0) <= r.register3;        -- since register3 only has 8 bits
      when others =>
        wb_dat_o <= (others => 'X'); -- Return undefined for all other addresses
    end case;

    -- Default: no acknowledge. Without this assignment wb_ack_o would infer a
    -- latch and never return to '0' after the first access.
    wb_ack_o <= '0';

    if wb_rst_i='1' then
      -- Reset request, put register1 and register2 with zeroes,
      -- put register 3 with binary 10101010b
      v.register1 := (others => '0');
      v.register2 := (others => '0');
      v.register3 := "10101010";
    else -- Not reset
      if wb_cyc_i='1' and wb_stb_i='1' then
        wb_ack_o <= '1'; -- Acknowledge the read/write asynchronously. Actual read data was set above.
        if wb_we_i='1' then
          -- It's a write. See for which register based on address
          case wb_adr_i(3 downto 2) is
            when "00" =>
              v.register1 := wb_dat_i;  -- Set register1
            when "01" =>
              v.register2 := wb_dat_i;  -- Set register2
            when "10" =>
              v.register3 := wb_dat_i(7 downto 0); -- Only lower 8 bits for register3
            when others =>
              null; -- Nothing to do for other addresses
          end case;
        end if;  -- if wb_we_i='1'
      end if; -- if wb_cyc_i='1' and wb_stb_i='1'
    end if; -- if wb_rst_i='1'

    if rising_edge(wb_clk_i) then  -- Synchronous to the rising edge of the clock
      r <= v; -- Update registers on clock change
    end if;

  end process;

end rtl;

Alvies_Wishbone_Examples.zip


Without having dug in and studied the code yet, I can see that the versions differ along two axes: synchronous vs. asynchronous outputs, and records vs. no records. My next task is to better understand the implications of using the synchronous vs. asynchronous versions...

 

Jack.


Ok, so I just looked at the Wishbone Datasheet for help on when to use Async vs Sync. It looks like Chapter 4 covers the information we need, and if I understand correctly, using Async is simpler, can have greater bandwidth, and can complete in one cycle. But it looks like it can impact the timing of your overall design pretty drastically. So it is probably best to go with the Sync examples...

 

Anyone care to comment to further clarify/explore?

 

Thanks,

Jack.


  • 3 weeks later...

Hi

 

As far as I know, synchronous means using a clocked flip-flop for the output signals, in contrast to latches. That in turn means an output signal can only change at predefined points in time, e.g. on the rising clock edge.

Asynchronous signals can change intermittently before reaching a steady state. This usually happens because the input signals reach a gate with different delays. As an example, take an XOR gate with two inputs A and B.

Let's assume the XOR output is 0 and the inputs change from A=0, B=0 to A=1, B=1, but B's 1 arrives at the XOR gate a little later than A's 1. For a short moment the XOR gate will then show a 1 on its output, until B's 1 has arrived and propagated through the gate. I think this is called a hazard or race condition.
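The XOR case can be reproduced in a simulator. The following is only a sketch: the entity name, the signal names, and the 1 ns / 3 ns path delays are made up for illustration, and the gate itself is modelled with zero delay.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only demo of the XOR glitch described above.
entity xor_glitch_demo is
end entity xor_glitch_demo;

architecture sim of xor_glitch_demo is
  signal a_src, b_src : std_logic := '0';  -- inputs as driven by the source
  signal a, b         : std_logic := '0';  -- inputs as they arrive at the gate
  signal y            : std_logic;         -- XOR output
  signal saw_glitch   : boolean := false;  -- set once y has ever been '1'
begin

  -- Assumed path delays: B needs 2 ns longer than A to reach the gate.
  a <= transport a_src after 1 ns;
  b <= transport b_src after 3 ns;

  y <= a xor b;  -- ideal gate, no delay of its own

  -- Remember whether the output ever showed a '1'
  watch: process(y)
  begin
    if y = '1' then
      saw_glitch <= true;
    end if;
  end process;

  stimulus: process
  begin
    wait for 10 ns;
    a_src <= '1';  -- both inputs change at the same instant at the source
    b_src <= '1';
    wait for 10 ns;
    -- The steady state is 0 again, but a transient 1 was visible in between
    assert y = '0' report "final XOR output should be 0" severity failure;
    assert saw_glitch report "expected a transient 1 on y" severity failure;
    report "y showed a transient 1 before settling back to 0";
    wait;
  end process;

end architecture sim;
```

A flip-flop that samples y only on a clock edge well after both inputs have settled would never see this pulse, which is the point of the synchronous style.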

 

This kind of asynchronous design can get you into deep trouble with regard to state machines.

Some time ago we had to rebuild an old asynchronous ASIC design from the mid-80s in a current FPGA. It was hell on earth :)

 

Maybe that all is understood already, but I thought it could help some readers. Next upcoming question is metastability

 

Cheers

Peter


"Next upcoming question is metastability"

 

After taking ages to make a design for trapping a metastability event in the wild, I think that 90% of the faults attributed to metastability are something completely different... If you don't have a single common flip-flop between the outside world and all logic that uses the signal, then your issue is not metastability. It is a difference in routing delays from the input pin to the flip-flops that are acting on that signal.


Note that metastability is the effect where a flip-flop or latch gets stuck half on, in the metastable state between 1 and 0. It will then decay to either 1 or 0 based on whatever perturbation occurs next, and since that is impossible to predict, you cannot determine which state it will decay to. This is bad because it makes the design unreliable, and also because CMOS logic draws a very high current if both transistors are stuck half on, since you then have a path from VCC to ground.

For an SR latch, metastability occurs when the signals into the S and R inputs form very short pulses, such that there is not enough energy in them to push the latch into the opposite state and it sticks at the halfway point.

For flip-flops this can happen when the setup and hold times are not observed, i.e. the signals change between the setup time and the clock transition, or between the clock transition and the hold time. I think this is rather unlikely in an FPGA unless you are directly using an asynchronous outside input, or have two completely independent clocks driving different parts of the design.


Hi again,

 

thank you both for the clarification. Some time ago I built a circuit out of "discrete" CMOS ICs to observe a metastable situation on a scope. It was indeed a very rare event.

I mentioned it in my post simply because even if you use flip-flops you can run into funny behaviour in a circuit.

Current chip technology inherently has enormously high amplification, and therefore there is only a small window in time in which changes of clock and inputs may lead to metastability. As I read, typical delays are in the range of 10 to 100 ps. The probability of a metastable state is less than 10e-10.

Maybe in extremely high-speed applications you have to cascade the flip-flops (JK, master/slave) to ensure proper functioning.

I have no clue about the metastability issue when you connect two synchronous systems with different clock domains. Is metastability possible?

Now, after digging into the datasheet Jack linked to, I see my first post is totally useless regarding the original question, but anyway, I've got some interesting answers :) I like to expand the image in my brain

 

Many regards

Peter

 

P.S. sorry for my German-accented English :(


When you connect two different synchronous systems that run from entirely independent clocks (if they are derived from the same oscillator they should have the same phase, with only varying delay), an output from one to an input of the other is similar to an asynchronous input to a system: it could change at any time with respect to the clock, instead of only at times defined by the clock. This makes it possible for such signals to violate the setup or hold margins of the other system and cause metastability.
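A common way to handle a single-bit signal that crosses into another clock domain is to pass it through two cascaded flip-flops before any other logic sees it. The sketch below is one way to write that; the entity and signal names are hypothetical, and the small testbench only checks the two-edge delay, since an RTL simulation cannot model metastability itself.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-flip-flop synchronizer for a single-bit signal.
entity sync_2ff is
  port (
    clk      : in  std_logic;  -- clock of the receiving domain
    async_in : in  std_logic;  -- may change at any time relative to clk
    sync_out : out std_logic   -- the only copy downstream logic should use
  );
end entity sync_2ff;

architecture rtl of sync_2ff is
  signal meta   : std_logic := '0';  -- first stage, may go metastable in hardware
  signal stable : std_logic := '0';  -- second stage, gets a full clock period to settle
begin
  process(clk)
  begin
    if rising_edge(clk) then
      meta   <= async_in;
      stable <= meta;
    end if;
  end process;

  sync_out <= stable;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only check: the input becomes visible after two clock edges.
entity tb_sync_2ff is
end entity tb_sync_2ff;

architecture sim of tb_sync_2ff is
  signal clk      : std_logic := '0';
  signal async_in : std_logic := '0';
  signal sync_out : std_logic;
  signal done     : boolean := false;
begin
  clk <= not clk after 5 ns when not done else '0';  -- rising edges at 5, 15, 25 ns

  dut: entity work.sync_2ff
    port map (clk => clk, async_in => async_in, sync_out => sync_out);

  stimulus: process
  begin
    wait for 12 ns;
    async_in <= '1';   -- changes between two clock edges
    wait for 8 ns;     -- t = 20 ns: only one edge (15 ns) has sampled the new value
    assert sync_out = '0' report "output changed too early" severity failure;
    wait for 10 ns;    -- t = 30 ns: the second edge (25 ns) has passed
    assert sync_out = '1' report "output did not follow input" severity failure;
    report "input reached sync_out after two clock edges";
    done <= true;
    wait;
  end process;
end architecture sim;
```

The price is the extra latency, and this only covers single bits; multi-bit buses crossing clock domains need a different scheme.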

 

In these examples I believe the main difference between the synchronous and asynchronous outputs is the latency the output has from the core. If it's asynchronous, it may have 0-cycle latency and the result can be read in the next cycle; otherwise the flip-flops on the outputs add latency and the result will be available later.
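To see the latency difference in simulation, a small Wishbone master testbench can write a register, read it back, and count the clock edges until the acknowledge is sampled. This is a sketch, not part of Alvie's zip: the testbench name and internal signal names are invented, and it assumes example2 from the first post has been compiled into library work. Pointing the instantiation at example1 instead should show the shorter asynchronous handshake.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench; assumes entity example2 is available in "work".
entity tb_wb_latency is
end entity tb_wb_latency;

architecture sim of tb_wb_latency is
  -- "01" on wb_adr_i(3 downto 2) selects register2
  constant ADR_REG2 : std_logic_vector(31 downto 2) := (2 => '1', others => '0');

  signal clk   : std_logic := '0';
  signal rst   : std_logic := '1';
  signal dat_r : std_logic_vector(31 downto 0);                     -- slave to master
  signal dat_w : std_logic_vector(31 downto 0) := (others => '0');  -- master to slave
  signal adr   : std_logic_vector(31 downto 2) := (others => '0');
  signal we    : std_logic := '0';
  signal cyc   : std_logic := '0';
  signal stb   : std_logic := '0';
  signal ack   : std_logic;
  signal done  : boolean := false;
begin
  clk <= not clk after 5 ns when not done else '0';

  dut: entity work.example2
    port map (
      wb_clk_i => clk,   wb_rst_i => rst,
      wb_dat_o => dat_r, wb_dat_i => dat_w, wb_adr_i => adr,
      wb_we_i  => we,    wb_cyc_i => cyc,   wb_stb_i => stb,
      wb_ack_o => ack
    );

  master: process
    variable cycles : natural;
  begin
    -- Hold the synchronous reset over two rising edges
    wait until rising_edge(clk);
    wait until rising_edge(clk);
    rst <= '0';

    -- Write cycle: keep the request asserted until ack is sampled high
    adr <= ADR_REG2;
    dat_w <= x"CAFE0001";
    we <= '1'; cyc <= '1'; stb <= '1';
    loop
      wait until rising_edge(clk);
      exit when ack = '1';
    end loop;
    we <= '0'; cyc <= '0'; stb <= '0';
    wait until rising_edge(clk);  -- one idle cycle between accesses

    -- Read cycle: count clock edges until ack is sampled high
    cycles := 0;
    cyc <= '1'; stb <= '1';
    loop
      wait until rising_edge(clk);
      cycles := cycles + 1;
      exit when ack = '1';
    end loop;
    assert dat_r = x"CAFE0001" report "readback mismatch" severity failure;
    cyc <= '0'; stb <= '0';

    report "read acknowledged after " & integer'image(cycles) & " clock edge(s)";
    done <= true;
    wait;
  end process;
end architecture sim;
```

Because the master simply waits for ack, the same testbench works with either handshake style; only the reported edge count changes.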
