flag26838 Posted November 24, 2015

As a learning exercise, I'm building my own 8-bit CPU (while following another "popular" online tutorial, truth be told...). I wrote the decode block, a simple ALU, and a register file, but when plumbing all the pieces together, I noticed that the timing of the ALU doesn't behave as I expected. Here is a trimmed-down example showing my problem:

```vhdl
entity decode is
    Port ( clk    : in  STD_LOGIC;
           en     : in  STD_LOGIC;
           input  : in  STD_LOGIC_VECTOR (7 downto 0);
           output : out STD_LOGIC_VECTOR (3 downto 0));
end decode;

architecture Behavioral of decode is
begin
    process(clk)
    begin
        if rising_edge(clk) then
            if (en = '1') then
                output <= input(3 downto 0);
            end if;
        end if;
    end process;
end Behavioral;
```

The decode block is super simple: it takes a byte from memory and chops it into pieces; bits 3-0 are the opcode and are passed down to the ALU.

```vhdl
entity alu is
    Port ( clk    : in  STD_LOGIC;
           en     : in  STD_LOGIC;
           opcode : in  STD_LOGIC_VECTOR (3 downto 0);
           output : out STD_LOGIC_VECTOR (3 downto 0));
end alu;

architecture Behavioral of alu is
begin
    process(clk)
    begin
        if rising_edge(clk) then
            if (en = '1') then
                case opcode(3 downto 0) is
                    when "0001" => output <= "1111"; -- f
                    when others => output <= "1010"; -- a
                end case;
            end if;
        end if;
    end process;
end Behavioral;
```

Again, another self-explanatory module.
And here's the testbench:

```vhdl
ENTITY testbench IS
END testbench;

ARCHITECTURE behavior OF testbench IS

    -- Component declarations for the Units Under Test (UUT)
    COMPONENT decode
        PORT( clk    : IN  std_logic;
              en     : IN  std_logic;
              input  : IN  std_logic_vector(7 downto 0);
              output : OUT std_logic_vector(3 downto 0));
    END COMPONENT;

    COMPONENT alu
        PORT( clk    : IN  std_logic;
              en     : IN  std_logic;
              opcode : IN  std_logic_vector(3 downto 0);
              output : OUT std_logic_vector(3 downto 0));
    END COMPONENT;

    -- Inputs
    signal clk   : std_logic := '0';
    signal en    : std_logic := '0';
    signal input : std_logic_vector(7 downto 0) := (others => '0');

    signal internal_bus : std_logic_vector(3 downto 0);
    signal output       : std_logic_vector(3 downto 0);

    -- Clock period definition
    constant clk_period : time := 10 ns;

BEGIN

    -- Instantiate the Units Under Test (UUT)
    uut: decode PORT MAP (
        clk    => clk,
        en     => en,
        input  => input,
        output => internal_bus
    );

    Inst_alu: alu PORT MAP (
        clk    => clk,
        en     => en,
        opcode => internal_bus,
        output => output
    );

    -- Clock process
    clk_process: process
    begin
        clk <= '0';
        wait for clk_period/2;
        clk <= '1';
        wait for clk_period/2;
    end process;

    -- Stimulus process
    stim_proc: process
    begin
        -- hold reset state for 100 ns
        wait for 100 ns;
        en    <= '1';
        input <= "00000001";
        wait;
    end process;

END;
```

And attached is the timing simulation:

- after 100 ns the simulation starts, and the decode block's input goes to "00000001"
- at the first clk rising edge (105 ns) the decode block reads the input, chops it down, and passes the opcode "0001" to the ALU via the internal_bus signal
- at the same rising edge (105 ns), the ALU outputs the default value "1010" instead of "1111"

And here's my reasoning and my doubts: since all the blocks work in parallel, at the first rising edge, if the ALU gets "0001" (as it does, looking at the timing simulation), it should output "1111" instead of the catch-all "1010" value.
After a second thought: yes, all the blocks work in parallel, but there might be a slight delay, and as such the ALU doesn't get the correct value exactly at 105 ns; it might get it at 105 ns + x, and by then it has already processed its input.

After a third thought, I realized I was fooled by the timing diagrams in the simulation, since at 105 ns the input value for the ALU isn't stable (indeed there's that crossing of lines to indicate the passage from an undefined to a fixed value); it can't rely on its input, and as such it has latches in place to avoid working on an unstable input signal.

My guess is that the third is probably the correct hypothesis, but I would like to hear from you. In the end, what really puzzles me is that this behaviour introduces a buffer-like mechanism, and there's no way to avoid it (and probably it's good that you can't avoid it, I don't know...), so the more blocks I add to my CPU, the longer it takes to break-down-execute-write-back an instruction. I know, that's how a pipeline is supposed to behave, but coming from a software world, it's not immediately obvious at first...
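For reference, here's one variation I've been sketching out (my own idea, not from the tutorial): making the decode block purely combinational, with no clock, so the opcode reaches the ALU's input within the same cycle instead of one cycle later. I haven't dropped this into my real design yet, so treat it as a sketch:

```vhdl
-- Hypothetical combinational variant of my decode block: no clock, so
-- `output` follows `input` after a delta delay instead of waiting for
-- the next rising edge. The ALU's clocked process would then sample the
-- new opcode "0001" at the 105 ns edge and output "1111" right away.
entity decode_comb is
    Port ( en     : in  STD_LOGIC;
           input  : in  STD_LOGIC_VECTOR (7 downto 0);
           output : out STD_LOGIC_VECTOR (3 downto 0));
end decode_comb;

architecture Behavioral of decode_comb is
begin
    process(en, input)
    begin
        if (en = '1') then
            output <= input(3 downto 0);
        else
            -- assign in every branch so synthesis doesn't infer a latch
            output <= (others => '0');
        end if;
    end process;
end Behavioral;
```

As far as I understand, the else branch matters here: without it, output would hold its old value when en = '0', and synthesis would infer a latch, which is exactly the kind of implicit storage I'm trying to reason about.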