alvieboy

Members
  • Content count

    865
  • Days Won

    18

Everything posted by alvieboy

  1. alvieboy

    ZPUuino Down?

    AtomSoft: you seem to be rejecting personal messages, so I am not able to reply to you.
  2. alvieboy

    Multi-Boot loader app for new Spartan 6 board.

    The BSCAN_SPARTAN6 is a bit different from the one present in the S3E, although more powerful. But it looks like it is also easy to put to work.
  3. alvieboy

    ZPUuino Down?

    AtomSoft: my hosting plan seems to have expired, and they did not notify me. Should be up again in a couple of hours (I hope). Álvaro
  4. Hi, I was kinda bored this morning, so I decided to design and implement an indirect JTAG programmer for Papilio One. The objective is to present regular JTAG signals (TDI/TDO/TCK/TMS) at some preconfigured wing, and allow "butterfly loader" or any other FTDI-based JTAG programmer to address this device. The primary reason for this is to resume my work on C/WING and other JTAG devices without having to use my old S3E eval board (which allows connecting other devices to the JTAG chain through the Hirose connector).

    The principle here is quite simple, and it operates at a very high speed. We use the BSCAN component on the S3E, which exports 2 user chains. The first chain is used to select one of these three modes: CONFIG, TMS and TXRX. To access these modes one must shift the appropriate mode number into the USER1 chain. After selecting the mode, the USER2 chain behaves as follows:

      • If mode is CONFIG, a 33-bit register/sub-chain can be accessed. This register uses 32 bits for the TXRX size, plus one bit to signal end of shift (TMS up).
      • If mode is TXRX, the TDI is mapped into device TDO, while holding TMS low. However, after we shift N bits, and N matches the 32-bit TXRX size we set on CONFIG, TMS will be set according to the 1-bit end-of-shift signal. This mimics the operation of the JTAG/IO interface in the butterfly loader code.
      • If mode is TMS, the TDI is mapped into device TMS, which allows us to change the TAP state.

    In TXRX/TMS modes the TCK input clock is also mapped as the output clock.

    Right now I'm updating the software to use this "indirect jtag" approach, but looking at the waveforms in simulation it's quite clear that this should work (and up to high frequencies - the design can withstand 200MHz according to synthesis).

    Do you think this might be useful? Want me to publish this?

    Best, Álvaro
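The mode logic described in this post can be sketched as a small behavioral model. This is only an illustration in Python, not the actual VHDL design: the mode numbers, the class and method names, and the value driven on the data pin in TMS mode are all assumptions that do not appear in the post.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical encoding; the post does not give the mode numbers.
    CONFIG = 0
    TMS = 1
    TXRX = 2


class IndirectJtag:
    """Behavioral sketch of the USER1/USER2 scheme (names are illustrative)."""

    def __init__(self):
        self.mode = Mode.CONFIG
        self.txrx_size = 0   # 32-bit shift length set in CONFIG mode
        self.end_tms = 0     # 1-bit end-of-shift TMS value set in CONFIG mode
        self.count = 0       # bits shifted since the mode was selected

    def select_mode(self, mode):
        """USER1 chain: select the mode and restart the bit counter."""
        self.mode = mode
        self.count = 0

    def configure(self, txrx_size, end_tms):
        """USER2 chain in CONFIG mode: load the 33-bit register."""
        if self.mode is not Mode.CONFIG:
            raise RuntimeError("configure() requires CONFIG mode")
        self.txrx_size = txrx_size & 0xFFFFFFFF
        self.end_tms = end_tms & 1

    def shift_bit(self, tdi):
        """USER2 chain, one TCK cycle: return (data_out, tms_out) pin values."""
        if self.mode is Mode.TMS:
            # Host TDI drives the target TMS; data_out=0 is an assumption.
            return (0, tdi & 1)
        if self.mode is Mode.TXRX:
            self.count += 1
            last = self.count == self.txrx_size
            # TMS stays low until the configured number of bits is reached.
            return (tdi & 1, self.end_tms if last else 0)
        raise RuntimeError("no target pins are driven in CONFIG mode")
```

For example, configuring a 3-bit transfer with the end-of-shift bit set raises TMS only together with the third data bit, which is how a shift state is normally left.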
  5. alvieboy

    Indirect JTAG programming using Papilio

    Thanks for your tips, you were right. The timings were not correct, even though they follow the XSVF spec. I had to add additional delays. The XC9500 programming datasheet says RUNTEST "n" should wait "n" milliseconds (no actual need to pulse TCK). However, if I follow these timings I get errors. Something like 3 times that value seems to work (I wonder if this is also due to FTDI latencies).

    So it's working right now, but it's not very fast either. Erasing + programming + verifying takes 1m15s, mostly due to those delays on erase, plus having to check all transactions (and additional delays):

    $ time ./xc9500prog file.xsvf
    JTAG chainpos: 0 Device IDCODE = 0x11c1a093 Desc: XC3S250E
    JTAG chainpos: 0 Device IDCODE = 0x29504093 Desc: XC9572
    real 1m15.331s

    Can you program it faster?
  6. alvieboy

    Indirect JTAG programming using Papilio

    ben: sorry to be such a PITA, but I wonder if you could try your C/RAM wing with your player, but change it to print the IR shift out, at least at the beginning. I suspect something is wrong (read/write protection on my device). If I generate a plain SVF file, I can see this there:

    //Check for Read/Write Protect.
    SIR 8 TDI (ff) TDO (01) MASK (e3) ;

    This is not in the XSVF file (not sure why, but it appears in both SVF and STAPL). Anyway, I am getting "0x81" and not "0x01" when I shift any IR. The "write protect override" only comes after this check, which is also weird. I have no Xilinx cable, so it's not easy for me to use this part with Impact.

    Álvaro
  7. alvieboy

    Indirect JTAG programming using Papilio

    Everything kinda looks ok, but (note this is my own debug code, not yours) it fails on erase (if I don't erase it, programming seems to actually be done, but fails somewhere at the end). Note that binary values in the TDI/TDO debug output are depicted L->R (first bits on the left, for both TDO and TDI):

    JTAG chainpos: 0 Device IDCODE = 0x29504093 Desc: XC9572
    XREPEAT
    TX TMS: 11111
    TX TMS: 0
    XRUNTEST
    XSIR 0xfe
    TX TMS: 1100
    TX TDI: 01111111 (+TMS1)
    XSDRSIZE 32
    XTDOMASK ff ff ff ff
    XSDRTDO 00 00 00 00 expect 29 50 40 93
    TX TMS: 10100
    TX TDI: 00000000000000000000000000000000 (+TMS1)
    RX TDO: 11001001000000100000101010010100 HEX: 93 40 50 29
    0 : 29 29 [actual 29 expected 29 mask ff]
    1 : 50 50 [actual 50 expected 50 mask ff]
    2 : 40 40 [actual 40 expected 40 mask ff]
    3 : 93 93 [actual 93 expected 93 mask ff]
    Read: ff
    XSIR 0xff
    TX TMS: 101100
    TX TDI: 11111111 (+TMS1)
    XSIR 0xe8
    TX TMS: 101100
    TX TDI: 00010111 (+TMS1)
    XSIR 0xec
    TX TMS: 101100
    TX TDI: 00110111 (+TMS1)
    XRUNTEST
    XSDRSIZE 27
    XTDOMASK 00 00 00 00
    XSDRTDO 06 a9 57 fe expect 00 00 00 00
    TX TMS: 10100
    TX TDI: 011111111110101010010101011 (+TMS1)
    RX TDO: 001111111111111111111111111 HEX: fc ff ff 07
    0 : 00 00 [actual 07 expected 00 mask 00]
    1 : 00 00 [actual ff expected 00 mask 00]
    2 : 00 00 [actual ff expected 00 mask 00]
    3 : 00 00 [actual fc expected 00 mask 00]
    XRUNTEST
    XSIR 0xed
    TX TMS: 101100
    TX TDI: 10110111 (+TMS1)
    XRUNTEST
    XSDRTDO 00 3f ff fe expect 00 00 00 00
    TX TMS: 10100
    TX TDI: 011111111111111111111100000 (+TMS1)
    RX TDO: 101111111110101010010101011 HEX: fd 57 a9 06
    0 : 00 00 [actual 06 expected 00 mask 00]
    1 : 00 00 [actual a9 expected 00 mask 00]
    2 : 00 00 [actual 57 expected 00 mask 00]
    3 : 00 00 [actual fd expected 00 mask 00]
    XTDOMASK 00 00 00 03
    XSDRTDO 00 3f ff fe expect 00 00 00 03
    TX TMS: 10100
    TX TDI: 011111111111111111111100000 (+TMS1)
    RX TDO: 101111111111111111111100000 HEX: fd ff 3f 00
    0 : 00 00 [actual 00 expected 00 mask 00]
    1 : 00 00 [actual 3f expected 00 mask 00]
    2 : 00 00 [actual ff expected 00 mask 00]
    3 : 03 01 [actual fd expected 03 mask 03]
    SDR failed -- trying again (31 left)

    And it stays here: this last SDR always fails with the same value (0x1 instead of 0x3). That's why I'd like to see a debug dump from yours. This is the ISC_ERASE phase.
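To make the bit ordering in this log easier to follow, here is a minimal Python sketch of how the L->R bit strings appear to map to the HEX bytes, and how the masked comparison ends up reporting 0x1 instead of 0x3. The function names are hypothetical and this is not the debug code that produced the log; it only reproduces the arithmetic visible in it.

```python
def bits_to_bytes(bits):
    """Pack a bit string written in shift order (first shifted bit on the
    left) into bytes, where the first shifted bit is bit 0 of the first byte."""
    out = []
    for start in range(0, len(bits), 8):
        value = 0
        for pos, bit in enumerate(bits[start:start + 8]):
            if bit == "1":
                value |= 1 << pos
        out.append(value)
    return out


def masked_mismatches(actual, expected, mask):
    """Return (index, actual & mask, expected & mask) for each failing byte."""
    return [
        (i, a & m, e & m)
        for i, (a, e, m) in enumerate(zip(actual, expected, mask))
        if (a & m) != (e & m)
    ]


# IDCODE read from the log: the bytes come out least significant first.
idcode = bits_to_bytes("11001001000000100000101010010100")

# Failing SDR from the log; XSVF lists bytes most significant first,
# so the received bytes are reversed before the comparison.
rx = list(reversed(bits_to_bytes("101111111111111111111100000")))
failures = masked_mismatches(rx, [0x00, 0x00, 0x00, 0x03], [0x00, 0x00, 0x00, 0x03])
```

Here `idcode` is `[0x93, 0x40, 0x50, 0x29]`, matching the HEX column, and `failures` holds the single entry `(3, 0x01, 0x03)`: only bit 0 of the two masked status bits comes back set.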
  8. alvieboy

    Indirect JTAG programming using Papilio

    ben: I'm still trying to do that. After some struggles with byte order I think I managed most of it, but I'm still having some trouble programming the thing. Do you happen to have a successful debug log for the XC9572 (aka C/RAM CPLD)? I'm wondering if my part is somehow broken... or I might be missing something. Alvaro
  9. alvieboy

    Peripheral in RAM Space

    Are you connecting this port "dbus_out" to external pins? You cannot have high-Z signals inside the FPGA, I wonder if you know that. In case you just meant "I don't care about dbus_out value when matrix_sel /= 1", use (others => 'X') instead of 'Z'.
  10. alvieboy

    Indirect JTAG programming using Papilio

    Hi, Yes, the JTAG interface seems to be working, but I don't have a means to replay XSVF into the CPLD. Perhaps ben can help me on this. This was actually trickier than I first thought. There's a lot of resynchronization going on inside the BSCAN component which prevented proper TDI/TDO interaction. Right now it seems to be working OK, and I can shift in/out in proper order. So expect a JTAG debugger for ZPUino soon.

    Ben: if you read this - I saw a lot of "progalgxcf" implementations for the current papilio loader (xc3sprog), but none seems to work. All I have is a means to change TAP state (shift out TMS) and shift out+in the TDI/TDO thing in bursts. Your XSVF replayer looks very nice, but I cannot use the serial port because I want reprogramming with live systems - otherwise your tool (which is a very useful one) would suffice for me. Think debugging an internal processor with an already implemented TAP+CHAINS.

    I am not sure who maintains xc3sprog. Jack: do you have any idea? Álvaro
  11. See the section about "Metastability/Synchronizers" in the following document: http://www.ul.ie/~rinne/ee6471/ee6471%20wk12.pdf Álvaro
  12. alvieboy

    Why is this getting trimmed

    fullt <= fullt(15 downto 0) & dout;

    You cannot do this outside a synchronous process. Outside synchronous processes you only have combinatorial logic. So basically you are saying here that:

    fullt(16) = fullt(15), fullt(15) = fullt(14), ... fullt(0) = dout;

    This means either all fullt bits are set to dout, or you're generating something weird. Note that VHDL is just a description language - everything will boil down to flip flops, combinatorial logic and dedicated entities after synthesis. So what you want is:

    process(clk)
    begin
      if rising_edge(clk) then
        fullt <= fullt(15 downto 0) & dout;
      end if;
    end process;

    This will create a 17-bit shift register (fullt(16 downto 0)), shifting dout in at bit 0 on every rising clock edge.
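As a quick illustration of what the clocked version does, here is a small Python model of the register, one call per rising clock edge. It assumes fullt is declared 17 bits wide (16 downto 0), which is what the width of `fullt(15 downto 0) & dout` requires; the function name is made up for this sketch.

```python
WIDTH = 17  # assumption: fullt(16 downto 0), implied by fullt(15 downto 0) & dout


def clock_edge(fullt, dout):
    """Model of one rising clock edge: drop bit 16, shift left, insert dout at bit 0."""
    return ((fullt << 1) | (dout & 1)) & ((1 << WIDTH) - 1)


# Shift the bits 1, 0, 1 into an empty register, one bit per clock.
reg = 0
for bit in (1, 0, 1):
    reg = clock_edge(reg, bit)
```

After these three clocks `reg` is `0b101`: each bit moves up one position per clock, which is the behavior the combinatorial (unclocked) assignment cannot express.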
  13. alvieboy

    Why is this getting trimmed

    Care to share the whole process ?
  14. alvieboy

    COE Editor

    Yes, it is. ISE/XST will place output FF's on IOB by default, but you can also instantiate OBUF/OBUFT if needed. Same for IBUF. Álvaro
  15. alvieboy

    COE Editor

    Now, this depends on whether you require perfect sync between clk/data on the output/input pins, whether you need to sync with other clocks internally, and whether you require a 50% duty cycle and 100% accuracy. If I were you, I'd change 32MHz to something higher that is a multiple of 9MHz, like 36MHz (32 * (9/8)) or 72MHz (32 * (9/4)). Then I'd use regular flip flops to clock out the signal/data. If they are placed on IOB, then everything will be clocked out at the same time. Using an enable signal will allow you to get the desired 9MHz, with 50% duty cycle.
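The enable-based division suggested here can be sketched with a small cycle-by-cycle model. This is a Python illustration only, with a hypothetical function name; in the FPGA it would be a counter generating an enable pulse that toggles an output flip-flop.

```python
def divided_output(fast_hz, target_hz, n_cycles):
    """Return the output level seen in each fast-clock cycle when a counter
    toggles an output flip-flop every half period of the target clock."""
    ratio, remainder = divmod(fast_hz, target_hz)
    if remainder or ratio % 2:
        # 32 MHz -> 9 MHz lands here: the ratio is not an even integer.
        raise ValueError("need an even integer ratio for a 50% duty cycle")
    half = ratio // 2
    levels, level, count = [], 0, 0
    for _ in range(n_cycles):
        levels.append(level)
        count += 1
        if count == half:  # enable pulse: toggle the output flip-flop
            level ^= 1
            count = 0
    return levels


wave_36 = divided_output(36_000_000, 9_000_000, 8)
```

With a 36MHz clock the output spends two fast cycles low and two high (`[0, 0, 1, 1, 0, 0, 1, 1]`), giving 9MHz at 50% duty cycle; asking for 9MHz from 32MHz raises an error because 32/9 is not an integer.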
  16. alvieboy

    COE Editor

    AtomSoft: care to explain in more detail? What do you mean by "identical"? Álvaro
  17. alvieboy

    TV output wing

    Ben: you're not attending ISEL, are you? Just asking because it's rare to see links to Portuguese universities. I expect to try your core soon, let's see if I find some free time. Álvaro
  18. alvieboy

    COE Editor

    Not sure which of the boards (more than one exists) you're referring to. I tried the C/RAM wing recently (the one with a CPLD + SRAM) and found it very hard to work with at high speeds, due to Z->non-Z->Z latencies on the CPLD. Still, it might be usable at low speeds (ZPUino uses 96MHz as IO clock, which was too high for the CPLD, although the data itself moved along very well). However, new boards are coming with integrated SRAM and I think I'll manage to build a full VGA interface, full color, while still allowing RAM to be shared among VGA and CPU, and eventually other peripherals. Stay tuned.
  19. alvieboy

    COE Editor

    Yes, you do need the 500 because of the amount of RAM required by the VGA core. I'll take a look at your COE editor soon. Álvaro
  20. alvieboy

    First Projects

    Hi, See the AVR bit manipulation instructions for this (SBI/CBI). For IO you can also toggle individual bits. From the ATMEGA48/168/328 datasheet:

    13.2.2 Toggling the Pin
    Writing a logic one to PINxn toggles the value of PORTxn, independent on the value of DDRxn. Note that the SBI instruction can be used to toggle one single bit in a port.

    Álvaro
  21. alvieboy

    TV output wing

    Can't you use another approach for that table? I have not had a chance to look at your code in detail. What's the formula for the color? Alvie
  22. alvieboy

    ZPUino Beta 2 is out

    ZPUino Beta 2 is out. Mostly bug fixing, but the most interesting addition is native support for the upcoming Arcade Wing, with VGA (and audio). Downloads here: http://www.alvie.com/zpuino/download.html Alvie
  23. alvieboy

    Papillio with framebuffer SRAM

    Hi bkraz, I've a few VGA cores that can be used with any wishbone-compatible CPU/sequencer. As I understand it, you only need a monochrome output. For 640x480 that means 38Kbytes of RAM, if you actually need to use all those bits. Are you building something like an oscilloscope? Álvaro
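The 38Kbyte figure follows directly from the resolution at one bit per pixel. A tiny Python check, with a helper name invented for this sketch:

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Bytes needed for a packed framebuffer, rounded up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8


mono_vga = framebuffer_bytes(640, 480, 1)
```

`mono_vga` is 38400 bytes, i.e. 37.5 KiB, which is where "38Kbytes" comes from.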
  24. Hi Let me see if I can shed some light on this clock thing. Four scenarios can exist when we're speaking about clocks:

    a) You have one and only one clock which drives all synchronous elements.
    b) You have only one clock but use enable signals/dividers to obtain a clock which is an integer division of the original clock, or use an enable signal instead of a dedicated clock (the latter is more usual).
    c) You have two clocks where the frequency and phase difference are known - this way you can always figure out the worst case scenario and impose constraints on your design based on that.
    d) You have two completely unrelated clocks, and here you need to resynchronize data from one clock domain to another.

    You seem to have a d) scenario, so I'd suggest the following: Use the external clock (note that it will be delayed) to synchronize your inputs/outputs, then use a FIFO with different clocks to transfer data from one clock domain to another. Block RAMs are good for this.

    If your internal clock is higher compared to the SPI clock you're expecting, you can resynchronize everything to the internal clock. Just place a latch+ff on each SPI input, and use your clock as output clock. You'll have to use the SPI clock as "level" triggered, so you won't attach this clock to your synchronous elements, but rather use a scheme like this:

    signal spi_clk_in_samples: std_logic_vector(1 downto 0);

    process(clk)
    begin
      if rising_edge(clk) then
        spi_clk_in_samples(0) <= spi_clock_in;
        spi_clk_in_samples(1) <= spi_clk_in_samples(0);
      end if;
    end process;

    Then:

    signal spi_clock_rising: std_logic;

    spi_clock_rising <= '1' when spi_clk_in_samples(0)='1' and spi_clk_in_samples(1)='0' else '0';

    And then:

    process(clk)
    begin
      if rising_edge(clk) then
        if spi_clock_rising = '1' then
          -- Process your data here
        end if;
      end if;
    end process;

    Another thing is that FPGAs usually do not allow that many clocks per design, because they have what they call "clock regions". These clock regions have very fast clock paths, but only one clock can be used. So I'd go for this approach if your internal clock is some 4x faster than the SPI clock.
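To see how the two-sample scheme in this post turns the SPI clock into a one-cycle pulse, here is a small Python model. It is a sketch only: the function name is made up, and each list element stands for the SPI clock level captured at one rising edge of the internal clock.

```python
def rising_edge_pulses(spi_levels):
    """Model of the two-sample edge detector: returns, for each internal
    clock edge, the value of spi_clock_rising seen by the clocked process."""
    sample0 = sample1 = 0  # spi_clk_in_samples(0) and (1), assumed reset to '0'
    pulses = []
    for level in spi_levels:
        # The consuming process sees the register values from before this edge.
        pulses.append(1 if (sample0 == 1 and sample1 == 0) else 0)
        # Then both registers update, as in the clocked VHDL process.
        sample1, sample0 = sample0, level
    return pulses


pulses = rising_edge_pulses([0, 0, 1, 1, 1, 0, 0, 1, 1])
```

Each rising edge of the SPI clock yields exactly one pulse, one internal clock cycle after the new level was first sampled, so `pulses` contains two ones for the two rising edges in the input.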
  25. Hi Van, Thanks for your interest in Papilio. Unfortunately I have to say that you cannot easily port gameduino to the P500, because it does not have enough internal RAM to handle the VGA output (400x300x9 - that's roughly 35Kbyte, and an S3E500 has 46Kbyte of RAM, so not much spare RAM for CPU and acceleration). However, Jack is working on a new external RAM module, which might allow that and even more. I've some guys porting old chip devices like pokey, ym2149 and sn76489 to ZPUino; if you're interested please let me know. Alvie