After releasing ZPUino 1.0, Alvaro Lopes managed to add a VGA adaptor to his processor and wanted to make something special to test the adaptor and display some nice graphics. He then got the idea to remake one of the favourite games he had enjoyed so much back in the mid 80s.
“Then I remembered JSW, and googled for implementations. I found a commented disassembly of the original game and I thought I can remake this in “C”, and use the original game art to recreate the original game.”
The game principle is simple: it consists of several rooms filled with enemies and objects. The objective is to collect all the items in the game so that Maria will allow you to go to bed.
The game is a remake for the ZPUino running on the Papilio One with an Arcade MegaWing and can be played with the buttons on the board. Alvaro will make it playable with a joystick soon, and he will try to improve it and make it even better.
“The game implementation is not complete – some stuff is missing, like ropes, arrows, entry/exit screens, lifes, Maria and the Toilet. Also no sound implementation was done.”
Here is a demonstration video of the game.

The full source code is available for download on Github.
Click here to check the game wiki page.
The main reasons why I settled on Bomb Jack (BJ) were that no one had implemented it before in an FPGA, the schematics were of great quality, and it used standard hardware of that era, such as discrete TTL logic, a Z80 CPU and AY-3-8910 sound generators. I knew these cores were already available, so I could stand on the shoulders of giants and, instead of re-implementing the CPU and audio chips, focus on translating the schematic to VHDL.

Project Source Code Direct Download
Google Code SVN Repository
Original Project Page

In my eagerness to begin, I threw caution to the wind and didn't go through a resource planning stage. I simply went page by page through the schematic and translated all the chips to VHDL code, connecting everything together. This was a fairly tedious process for the most part, but also challenging, as I had to figure out the proper VHDL constructs and solve various implementation issues. This can be seen in the project source code, where each page of the schematic is implemented in its own corresponding VHDL file. I've also kept the chip notation inside the VHDL code as per the schematic, namely a number letter pair identifying the board row/column where the chip is located. Where a specific gate within a chip needs to be identified, an optional number follows, specifying the pin where the output connects. For example, in 1T8 the 1T part refers to the chip located at coordinates 1,T on the board, which is an LS08 quad AND gate as seen in the schematic. The trailing 8 then picks out one of the four gates in the chip, the gate whose output exits at pin 8.
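The chip notation is regular enough to be split mechanically. The following Python sketch is only an illustration of the naming scheme; the function name `parse_chip_ref` and the tuple it returns are made up for this example and do not come from the project source.

```python
import re

# Board row (digits), board column (one or more letters), optional output pin.
_CHIP_REF = re.compile(r"^(\d+)([A-Z]+)(\d+)?$")

def parse_chip_ref(ref):
    """Split a schematic reference such as '1T8' into (row, column, pin)."""
    match = _CHIP_REF.match(ref)
    if match is None:
        raise ValueError("not a chip reference: %r" % ref)
    row, column, pin = match.groups()
    return int(row), column, int(pin) if pin is not None else None

print(parse_chip_ref("1T8"))  # chip at row 1, column T, gate with output on pin 8
print(parse_chip_ref("1T"))   # the whole chip, no specific gate
```

A reference without a trailing pin number simply yields `None` for the pin.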
Once the translation from schematic to VHDL was complete, I came across what seemed like an insurmountable problem. The game uses a total of 16 ROM chips adding up to 112KB of memory, but my FPGA, an S3E500, only has internal space for 40KB. This issue would have come up earlier had I bothered to go through a planning stage. In hindsight it was probably serendipitous: had I figured this out early, I might not have started the project at all. Now that I had spent all this time and effort, I was invested. Even so, I had to put the project on ice for the rest of 2011 until, through sheer luck, Jack Gasset donated a beta Papilio Plus board. The P+ uses an LX9 FPGA, but the board also has a 512KB static RAM chip. This was perfect for this project, so early this year I picked it up again and attempted to progress it.
More Roadblocks
The next problem I encountered was that the video circuit uses a total of 10 ROM chips. Even if you consider that some ROM chips share a data bus, that still leaves one ROM chip with an 8 bit data width and three ROMs with a 24 bit width. That's without even counting the main CPU program ROMs and the audio CPU program ROM. The problem is that all these ROMs are constantly accessed simultaneously and in different clock domains: the CPU ROMs are accessed synchronously to the 4MHz CPU clock, the video ROMs are synchronous with the 6MHz video clock, and the audio CPU ROM runs at 3MHz, which at least is in sync with the 6MHz clock.
But I only have a single SRAM chip to store them all in. How can I fake having a bunch of separate ROM chips using only one SRAM chip? After a fair amount of simulation and examining the circuit diagram, it turns out the answer is time division multiplexing. By running the SRAM on a 48MHz clock, it is possible to fake the appearance of multiple ROM chips by storing them in different areas of the SRAM, quickly reading each ROM address in turn and presenting the data to the target just in time.
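To make the time division idea concrete, here is a rough Python model of one 6MHz video clock period split into eight 48MHz SRAM slots. The ROM names, base addresses and slot assignment are invented for illustration; the real design uses its own schedule worked out in the simulator.

```python
# Hypothetical layout: three "virtual ROMs" stored back to back in one SRAM.
SRAM = bytearray(range(256)) * 3          # 768 bytes of dummy contents
ROM_BASE = {"chars": 0x000, "background": 0x100, "sprites": 0x200}

SRAM_CLOCK_HZ = 48_000_000
VIDEO_CLOCK_HZ = 6_000_000
SLOTS_PER_PIXEL = SRAM_CLOCK_HZ // VIDEO_CLOCK_HZ   # SRAM accesses per video clock

# Which virtual ROM owns which time slot (None = unused); made up for this sketch.
SLOT_OWNER = ["chars", None, "background", None, "sprites", None, None, None]

def video_clock_cycle(addresses):
    """Serve every virtual ROM once within one 6MHz period, one SRAM read per slot."""
    latched = {}
    for slot in range(SLOTS_PER_PIXEL):
        owner = SLOT_OWNER[slot]
        if owner is not None:
            latched[owner] = SRAM[ROM_BASE[owner] + addresses[owner]]
    return latched   # each consumer sees its data as if it had a private ROM

print(SLOTS_PER_PIXEL)
print(video_clock_cycle({"chars": 0x10, "background": 0x20, "sprites": 0x30}))
```

The point of the model is only that a memory running eight times faster than its consumers can answer several independent address streams within a single consumer clock.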
By also taking advantage of the FPGA's built in BRAM blocks, it was possible to store the main and audio CPU ROMs inside the FPGA (a total of 48KB = 24 BRAMs) and only have to retrieve the video ROMs, which all sit in the same clock domain, from external SRAM.
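The storage budget can be checked with a few lines of arithmetic. This sketch uses only figures quoted in this write-up (112KB of ROM in total, 48KB of CPU ROM in 24 BRAMs, 32 BRAM blocks on the LX9); the per-BRAM capacity is inferred from the 48KB = 24 BRAMs figure rather than taken from a datasheet.

```python
KB = 1024
BRAM_BYTES = 48 * KB // 24        # "48KB = 24 BRAMs" implies 2KB of ROM data per BRAM
TOTAL_ROM = 112 * KB              # all 16 ROM chips
CPU_ROM = 48 * KB                 # main and audio CPU ROMs, kept in BRAM
LX9_BRAMS = 32                    # BRAM blocks available on the LX9

brams_for_everything = TOTAL_ROM // BRAM_BYTES   # what an all-internal design would need
video_rom_in_sram = TOTAL_ROM - CPU_ROM          # what has to live in external SRAM

print(BRAM_BYTES, brams_for_everything, video_rom_in_sram // KB)
```

Holding every ROM internally would need far more BRAMs than the LX9 provides, which is why the video ROMs go to the external SRAM.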
Final stumble
So now we have a bunch of ROMs that need to be available inside the SRAM chip at power on, but the SRAM is a volatile storage medium. This is where the SRAM bootstrap comes in. The SRAM bootstrap project, which I published on the Papilio Code Playground back in February, is a direct result of this BJ project. You can read the details at the link above, but briefly: on power on or reset, the bootstrap takes over the SRAM chip buses and copies the contents of the serial FLASH chip into the SRAM, then releases the SRAM buses to the user and signals that it is done. The user portion, in this case the actual BJ circuit, uses that signal as its reset, so when the bootstrap is done, the BJ circuit comes out of reset and is free to run and access the SRAM.
By far the lengthiest part of this project was debugging the game so it runs correctly (or at all). After writing some test jigs and testing the easy schematic pages, such as the input switches on page 2 and the video and timing signal generator on page 3, I started the debug process from the video output and moved backwards.
This meant starting with page 8, the color palette circuit. This was fairly simple to debug but it needed something to initialise the palette RAM and drive it. I decided to build a simple state machine test jig that would replace the CPU, initialize the palette RAM by writing the values to the address space based at 0x9c00, then drive one of the priority encoder inputs such as BC/BV. The priority encoders 5F,H,J,K on page 8 are arranged such that, when supplied with simultaneous video signals on inputs BC/BV, SC/SV and OC/OV, they prioritise them as follows: BC/BV has the lowest priority (this is the background picture), SC/SV has the next highest priority (this is the character generator) so it always appears "on top" of the background, and OC/OV has the highest priority (these are the sprites) so they are displayed "on top" of both characters and background.
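The layer priority can be summarised in a few lines of Python. This is a behavioural sketch, not a translation of the encoder chips: it assumes each layer supplies a colour index plus a flag saying whether the pixel is opaque, and the fallback to colour 0 when nothing is opaque is also an assumption.

```python
def select_pixel(background, characters, sprites):
    """Each argument is (colour_index, opaque). Returns the winning colour index."""
    # Highest priority first: sprites (OC/OV), characters (SC/SV), background (BC/BV).
    for colour, opaque in (sprites, characters, background):
        if opaque:
            return colour
    return 0  # assumed fallback when no layer has an opaque pixel

print(select_pixel((5, True), (9, True), (12, False)))   # character wins over background
print(select_pixel((5, True), (9, True), (12, True)))    # sprite wins over everything
```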
The next step was to drive the palette circuitry, and by examining the schematic I decided the easiest and best part to get going next was the background generator on page 7. This generates the in game backgrounds and it doesn't even depend much on the CPU driving it: the CPU simply writes a value to address 0x9e00 to select a background and then leaves it alone while the circuit continuously generates the background and shifts it out to the video output. The test jig I'd built earlier fit the bill perfectly. I added this new schematic page to it and before long I could see my first real game pictures. I could cycle through all the game background pictures, though the colors were off, as it seems each background uses a different palette. Nevertheless this was a great step forward, as it was the first real video from this whole project that looked like part of the original game.
The next part to get going would logically be the character generator on page 6, as it is not much different from the background generator circuit on page 7, except that the index ROM 4P is replaced with an SRAM chip 6LM. This meant it would be more difficult to drive with my test jig, so I'd actually have to implement the main CPU. I quickly decided to move all the main CPU ROMs to external SRAM, since that didn't require any multiplexing, and to shift as many video ROMs as possible to internal BRAMs, which are truly independent and can easily emulate multiple independent ROMs. After confirming in the simulator that the main CPU executes instructions from its program ROMs, I tried running the whole game on the FPGA. This didn't work straight away, but after some more simulator action, tweaking the timing of some signals and fixing some minor bugs, the video screen showed the game booting up through its power on self test routine, with all ROMs passing the test and most of the RAMs too, and then the initial high score table was displayed. As I let the game run through its demo mode, I couldn't of course see any sprites, as they hadn't been tested and were also commented out, but I could see the background and the in game graphics that consist of characters only, such as the platforms and bombs, and their colors matched what I expected to see from MAME.
One thing that puzzled me for a while was that the Bomb Jack logo on the startup screen was missing its top part completely. It was not until much later, when I implemented the sprites, that the mystery was solved: the missing top part of the logo is built with sprites, not characters!
The final part of the circuitry was pages 4 and 5, which proved to be the most complex to debug. These are the sprite generator (page 4) and sprite positioning (page 5). The problem I had was that simulating this was hard: simply running the game in the simulator was not an option, because the sprites don't appear on screen until about a minute after power on, and it takes several hours to simulate one single second of circuit action. This is where I had to break out IDA and dig through the disassembly of the game ROMs, then find suitable patch locations to force the game to skip the portions I wasn't interested in and jump to where I needed it to be. At this point there wasn't enough memory space inside the FPGA to keep all the video ROMs, so since the background generator had already been tested, I commented out its ROMs and brought in the character generator and sprite generator ROMs. I'm not exaggerating when I say this portion of the debug took a couple of months, on and off, working into the evening simulating and examining the traces. Initially I got page 4 working, which finally displayed some sprites on the screen, but they only moved left-right as page 5 had not been implemented. There was also something that bothered me immensely: while sprites appeared to work, the death sequence animation of Bomb Jack showed corrupt graphics across the entire row that Bomb Jack occupied. I put this aside for the time being and continued with page 5.
The sprite positioning on page 5 was a bit of a head scratcher: as soon as I brought it in, the sprites would disappear completely. Much time spent in the simulator showed this to be a timing issue. The RAMs on page 5 have the 6MHz clock connected to their R/W line, which means the RAMs are read when the clock is high and written to when the clock is low. Essentially they are accessed twice inside a single 6MHz clock cycle. Once that was apparent, I brought in a 12MHz clock line to the chip in order to get it working double time, but that didn't prove to be as simple as I thought. More simulator action showed a very subtle timing issue with the clocks: the 6MHz and 12MHz clocks cannot have coincident edges, so I inverted the 12MHz clock in order to shift its edge to the middle of each 6MHz clock half cycle. Presto! The sprites finally appeared in all their glory. Yet the death animation corruption issue remained...
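The effect of inverting the 12MHz clock is easy to see when the edge positions are written out. The sketch below is an idealised model, assuming both clocks have a 50% duty cycle and start in phase; time is counted in arbitrary ticks of 1/48 microsecond, and the function names are made up for this example.

```python
TICKS_PER_6MHZ = 8    # one 6MHz period, measured in 48MHz ticks
TICKS_PER_12MHZ = 4   # one 12MHz period in the same unit

def rising_edges(period, inverted, span):
    """Ticks where a 50% duty clock rises; inverting moves them by half a period."""
    offset = period // 2 if inverted else 0
    return list(range(offset, span, period))

def all_edges(period, span):
    """Ticks of both rising and falling edges of a 50% duty clock."""
    return list(range(0, span, period // 2))

span = 2 * TICKS_PER_6MHZ
edges_6 = set(all_edges(TICKS_PER_6MHZ, span))
plain = rising_edges(TICKS_PER_12MHZ, False, span)
inverted = rising_edges(TICKS_PER_12MHZ, True, span)

print("6MHz edges:            ", sorted(edges_6))
print("12MHz rising:          ", plain)
print("inverted 12MHz rising: ", inverted)
print("coincident after inversion:", sorted(edges_6 & set(inverted)))
```

In this model every rising edge of the plain 12MHz clock lands on a 6MHz edge, while the inverted clock rises exactly in the middle of each 6MHz half cycle.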
Back to the simulator. I must say, as an aside, that the simulator is absolutely invaluable. Without simulation, it would have been very hard, if not impossible, to track down some of the bugs and other subtle issues encountered here. At this point I patched some test code into the VGA scan doubler so that it writes its input signals to a .ppm file. What is special about this is that a ppm file (portable pixmap format) is an uncompressed image file (similar to a bitmap) but entirely in text format. As I ran the simulator, a sequence of ppm files would be output, each corresponding to a video frame, which I could then view with a graphics program. This allowed me to see at exactly which frame the faulty sprites appeared. The cause of all this grief turned out to be very simple: I had incorrectly inverted the signal coming out of gate 7C6 on page 4. This small mistake proved very costly in terms of the time it took to track down.
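For readers who have not met the format, this is roughly what a text PPM frame dump looks like. The original test code lives in the VHDL scan doubler; the Python below is only a stand-in showing the plain-text "P3" variant of the format, and the function name and the output file name are chosen for this example.

```python
import os
import tempfile

def write_ppm(path, pixels, max_value=255):
    """Write rows of (r, g, b) tuples as a plain-text (P3) PPM image."""
    height = len(pixels)
    width = len(pixels[0])
    with open(path, "w") as f:
        # Header: magic number, width and height, maximum channel value.
        f.write("P3\n%d %d\n%d\n" % (width, height, max_value))
        for row in pixels:
            f.write(" ".join("%d %d %d" % rgb for rgb in row) + "\n")

# A 2x2 test frame: red, green / blue, white.
frame = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
frame_path = os.path.join(tempfile.gettempdir(), "frame_0000.ppm")
write_ppm(frame_path, frame)
with open(frame_path) as f:
    print(f.read())
```

Because the file is plain text, a simulator testbench can produce it with ordinary text output, and most image viewers open it directly.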
As troubleshooting the video section took such a long time, in order not to lose my mind working on the same problem over and over without making much progress, I decided to take a "break" and work on the audio section. This is a fairly simple setup: a CPU with ROM and RAM driving three identical programmable sound generators (PSGs). These are AY-3-8910 types, but there is a proven YM2149 core written by MikeJ of fpgaarcade.com, the YM2149 being identical to the AY-3-8910 chip apart from one pin that lets you run the YM at double clock (by causing the chip to halve the clock internally).
Again, since the audio circuitry on pages 9 and 10 is quite self sufficient and just needs a value written to its input latch to select which sound to play, I decided to write a test jig for it, but one that could be run on the FPGA, not just in the simulator. It uses the buttons to select what value to write to the audio board while displaying relevant info on the VGA screen using another one of my projects, the VGA 7-segment display.
The main problem initially encountered here was that some of the signals are not labeled on the schematic, such as the clock to the PSGs and the mystery signal feeding the flip-flop which goes to the CPU NMI input, while some signals were mismatched in their labeling, such as the /SIORQ from the CPU that really does go to IORQ at chip 5D. However, when all those were sorted out by referencing the MAME source code for clues, the audio board finally played sound effects and music. One small issue that arose during testing was that the background music seemed to be missing one of the audio channels. Simulation showed that the audio CPU was explicitly writing the register of the PSG to actively mute that channel. This was very strange until, after more debugging and talking to MikeJ, the solution presented itself: I was missing the chip select to the RAM. With the RAM permanently enabled, some writes from the CPU that should not have gone to the RAM at all were in fact writing to memory and corrupting it, causing the CPU to then write incorrect data to the PSGs.
Up to this point, the CPU ROMs were running out of external SRAM and the video ROMs from internal FPGA BRAMs, and there wasn't even enough room for those inside the FPGA, so I'd had to comment out some ROMs such as the background generator ROMs. It was finally time to shift everything around. I moved the audio and main CPU ROMs to internal BRAMs and all the video ROMs to external SRAM, but I was still one BRAM short of fitting everything in the FPGA. I eventually changed the color palette design on page 8 so that instead of using BRAMs it uses vector arrays, which at synthesis are mapped to lookup tables and not BRAMs.
Yet more simulator time was required to figure out the exact times to read the external SRAM and present the data to the appropriate places inside the FPGA in order to mimic having a bunch of separate ROMs. I also fixed some more minor bugs, such as some of the sprite colors being wrong when running from external SRAM because I'd incorrectly swapped around some of the ROM chip address mapping. Eventually it all fell into place and I had the whole game running as it should.
Building the project
The project is organised into a number of folders relative to the main project folder: all the source code lives in /source, the BJ original ROMs are expected to be in /roms/bombjack, some handy scripts live in /scripts, and relevant documentation can be found in /doc. Finally, the Xilinx build occurs in the /build directory, which can become cluttered with temp files after each build. Feel free to delete all files in there but make sure you keep the .xise project files.
The basic steps to build this project are:
1) copy the binary ROM files to /roms/bombjack
2) run /scripts/build_roms_bombjack.bat to translate the binary ROMs to VHDL code
3) run /build/bombjack.xise to start the Xilinx ISE environment and generate the FPGA bit file
4) run /scripts/build_fpga_image.bat to concatenate the ROMs to the FPGA bit file
5) finally burn the resulting /scripts/fpga.bit file to FLASH using the command "papilio-prog.exe -b bscan_spi_lx9.bit -f fpga.bit"
It is important to burn the fpga.bit to FLASH rather than just soft upload it to the FPGA because the game ROMs must be present inside the FLASH so that the bootstrapper can then copy them to SRAM at power on.
This project, as it stands now, uses all 32 BRAM blocks of the LX9 FPGA, so it would not be easy to port it to another FPGA with fewer BRAMs unless the main CPU ROMs, the audio CPU ROMs or both could be run from external SRAM. The largest ROMs are the main CPU ROMs, totalling 40KB or 20 BRAMs, and they are the most difficult to run from SRAM because the main CPU runs at 4MHz, which does not sync up with the video ROMs being accessed on a 6MHz clock.
As such, this project runs specifically on the Papilio Plus platform with the MegaWing add on. The button labeled RESET on the MegaWing is not used as reset but as a shift to expand the functionality of the remaining four buttons.
The control buttons are as follows:
Buttons               Function
RESET+LEFT            Player 1 coin insert
RESET+RIGHT           Player 2 coin insert
RESET+UP              Start one player game
RESET+DOWN            Start two player game
UP+DOWN+LEFT+RIGHT    Hardware reset
UP                    In game up + jump button (these would be separate on the original game)
DOWN                  In game down button
LEFT                  In game left button
RIGHT                 In game right button
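The button table can be read as a small decoding function. The Python sketch below is only a model of the table, not the VHDL in the project; in particular, giving the four-button hardware reset chord priority over the RESET shift is an assumption, and the action strings are invented labels.

```python
DIRECTIONS = {"UP", "DOWN", "LEFT", "RIGHT"}

# Meaning of each direction button with and without the RESET "shift" held.
SHIFTED = {"LEFT": "player 1 coin", "RIGHT": "player 2 coin",
           "UP": "start one player game", "DOWN": "start two player game"}
PLAIN = {"UP": "up + jump", "DOWN": "down", "LEFT": "left", "RIGHT": "right"}

def decode_buttons(pressed):
    """Map a set of pressed MegaWing buttons to a list of actions."""
    pressed = set(pressed)
    if DIRECTIONS <= pressed:              # all four direction buttons together
        return ["hardware reset"]
    table = SHIFTED if "RESET" in pressed else PLAIN
    return [table[b] for b in sorted(pressed & DIRECTIONS)]

print(decode_buttons({"RESET", "LEFT"}))
print(decode_buttons({"UP", "RIGHT"}))
print(decode_buttons({"UP", "DOWN", "LEFT", "RIGHT"}))
```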
If you want to try this project on your Papilio Plus prototype you must have these wings:
Joystick wing
VGA wing
MicroSD wing
Audio wing

Since the first version uses an SD card, we can't use the Arcade MegaWing and need all the wings listed above to run this project instead, but Ben has been working on a new version that should work fine with the Arcade MegaWing.
Ben wrote a nice bootloader to select ROMs from a SD Card.
“Load the bit file : the bootloader shows the content of your SD card. Pick a rom and it gets loaded into the SRAM, and the system boots it : you’re ready to go !”
The full source code is available for download on Github.
Check out some of the exciting new features:

Famous audio chips such as the Commodore 64 SID, Atari ST YM2149, and Atari 800 Pokey are built in and ready to use in your sketches. No GPIOs are wasted and no soldering is required!
A new small VGA mode that is modeled after the Sinclair ZX Spectrum. This versatile but compact VGA core fits, along with the ZPUino to control it, in the Papilio 250K and 500K boards!
The SmallFs filesystem allows resources such as images or audio files to be placed in a directory with your sketch, where they automatically become available for use by your sketch.
Upload to RAM allows sketches and smallfs resources to be loaded directly to internal BRAM or the external SRAM of the upcoming Papilio Plus board.

If you are still looking for more ZPUino information and documentation, check out this page.
The missing instructions can be resolved at compile time using a modified build of GCC or by generating traps at runtime on encountering the unsupported instructions.
You can check the source code, FPGA project and FPGA bit files at OpenCores.
And here is an example of an MP3/WAV decoder that was made with this MIPS-like processor: an FPGA based MP3/WAV player using just an FPGA, some RAM and a stereo DAC.
“The project consists of a custom 32-bit soft core processor running at just under 60MHz which decodes the MP3 algorithm in software with no hardware acceleration apart from a single cycle Xilinx multiplier unit.”
The original article is complete with full hardware and software specifications, source code, schematics and much more…
At this point I have designed a small pipeline that grabs the pixels, takes the Y (luminance) component, downscales the image (640x480 -> 80x60) and sends the picture over serial at 3Mbaud. Frame grabbing and sending is done at 30Hz. There is no soft core involved; everything is performed using homemade modules (i2c, pixel grabbing, downscaling ...) and I'm only using some BRAM for configuration storage, one 80 pixel line store for downscaling, and a 128 byte FIFO for the serial communication.
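As a software reference for the downscaling stage, here is a small Python model. The post does not say how the hardware reduces each 8x8 block, so the block averaging used here is an assumption, as is the 8N1 serial framing (10 bits on the wire per byte) used for the bandwidth estimate. The function name is made up for this sketch.

```python
def downscale_luma(frame, factor=8):
    """Average each factor x factor block of a 2D list of 0..255 luminance values."""
    height, width = len(frame), len(frame[0])
    out = []
    for by in range(0, height, factor):
        row = []
        for bx in range(0, width, factor):
            total = sum(frame[y][x]
                        for y in range(by, by + factor)
                        for x in range(bx, bx + factor))
            row.append(total // (factor * factor))
        out.append(row)
    return out

# Synthetic 640x480 test frame: a horizontal gradient.
frame = [[x * 255 // 639 for x in range(640)] for _ in range(480)]
small = downscale_luma(frame)
print(len(small[0]), len(small))          # width and height of the result

# Serial budget: one byte per pixel, 30 frames per second, assumed 8N1 framing.
bytes_per_second = 80 * 60 * 30
link_capacity = 3_000_000 // 10
print(bytes_per_second, link_capacity)
```

Under these assumptions the 80x60 stream uses a little under half of the link, so the bandwidth itself should not be the limiting factor.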
I made a little Java app to display the picture and test everything. The pictures are fine but it seems that I sometimes get transmission errors.
The next step is to add an edge detector to the pipeline and try to detect a line, to build a line following robot in the future!
Some pictures:

the interface

and the board, the reset button is directly plugged into the female header ...
The sensor is connected straight into the Papilio headers; I just had to add a little stripboard to get access to the VCC and GND of the Papilio. There is still room to connect a second camera ... stereovision anyone?
Device Utilization:
Selected Device: 3s250evq100-4
Number of Slices:                  252 out of 2448   10%
Number of Slice Flip Flops:        215 out of 4896    4%
Number of 4 input LUTs:            483 out of 4896    9%
    Number used as logic:          474
    Number used as Shift registers:  9
Number of IOs:                      22
Number of bonded IOBs:              22 out of 66     33%
Number of BRAMs:                     3 out of 12     25%
Number of GCLKs:                     4 out of 24     16%
Number of DCMs:                      2 out of 4      50%
EDIT: The project can be checkout on
The VHDL and SystemC code are not very clean but I plan to do some refactoring.
The project also contains a small SystemC to VHDL application I designed to help with this project, but it is not a complete work. It only translates RTL-like SystemC.
If you want to read the whole thing, please check the original project thread here.

Small AVR project

By Dhia, in Papilio One

The provided sw_app directory includes a base C project and makefile, with a serial I/O driver, a printf/sprintf/vprintf library and a timer driver.
Typing 'go' in sw_app will build the source code and update prog_mem_content.vhd with the new program memory contents (then just rebuild in ISE).
Source:
- AVR MCU (cut down) - http://opencores.org...ect,cpu_lecture - GPL
- Fixed speed UART - Opencores - GPL
Tools Used:
- WinAVR 20100110 or newer
- Xilinx 13.1 webpack
Source Code: small_avr.zip

FPGA Games: Amidar

By Dhia, in Papilio Arcade

The game and the name have their roots in the Japanese lot drawing game Amidakuji. The bonus level in Amidar is a nearly exact replication of an Amidakuji game and the way the enemies move conform to the Amidakuji rules. A clone of this game was released for the Atari 2600, entitled Spiderdroid.
Source Code On Github
Amidar ROM Information
Amidar Technical Guide
Amidar Wiki Page

Game Play
As in Pac-Man, the player is opposed by enemies who kill on contact. The enemies increase in number as the player advances from one level to the next, but do not increase in speed. Their speed is always matched exactly to that of the player.
On odd-numbered levels, the player controls an ape (in some versions labeled "Copier"), and must collect coconuts while avoiding headhunters (labeled "Police" and "Thief"). On even-numbered levels, the player controls a paint roller (labeled "Rustler"), and must paint over each spot of the board while avoiding pigs (labeled "Cattle" and "Thief"). Each level is followed by a short bonus stage.
Whenever a rectangular portion of the board is cleared (either by collecting all surrounding coconuts, or painting all surrounding edges), the rectangle is colored in, and in the even levels, bonus points are awarded. This leads to some comparisons with the popular and influential Qix, although the similarities between these games are superficial at best. When the player clears all four corners of the board, he is briefly empowered to kill the enemies by touching them (just as when Pac-Man uses a "power pill").
The game controls consist of a joystick and a single button labeled "Jump," which can be used up to three times, resetting after a level is cleared or the player loses a life. Pressing the jump button does not cause the player to jump, but causes all the enemies to jump, enabling the player to walk under them.
Detailed Game Play and Characters Guide
Hardware Specifications

Z80 at 3.072 MHz
Z80 at 1.78975 MHz
Sound – AY-3-8910A at 1.789750 MHz

Resolution – 768 x 224
Orientation – Vertical
Refresh Rate – 60.61 Hz


FPGA Games: Frogger

By Dhia, in Papilio Arcade

Quick Links:
Frogger ROM Information
Frogger Technical Guide
Frogger Wiki Page

Game Play
Joystick: Use to direct the frog throughout the screen. Unlike many games, the joystick must be returned to the neutral position before being moved again to make the frog continue in the direction you wish for him to go.
1 or 2 Players: Push these buttons to begin a one or two player game.
Detailed Game Play and Characters Guide

Hardware Specifications

Z80 at 3.072 MHz
Z80 at 1.78975 MHz
Sound – AY-3-8910A at 1.789750 MHz

Resolution – 768 x 224
Orientation – Vertical
Refresh Rate – 60.61 Hz

Source Code: Github
In this maze-themed game, the player controls a paintbrush, used to paint the corridors of an aquarium, all while being pursued by two predatory fish (one blue, the other yellow). The player has to avoid these fish at all costs, as if they catch the paintbrush, it is destroyed. However, the player can turn the tables on these foes by using one of two rollers located at two points on the board and crush them for bonus points.
Crush Roller ROM Information
Crush Roller Wiki Page

Game Play

Joysticks: The game features a 4-way joystick, the only control needed for gameplay. This is used to guide the paintbrush around the maze.
1-2 Players: Used to begin a one- or two-player (alternating turn) game.

Characters, enemies and other items
The paintbrush: The hero of the game, his objective in life was simple: to paint the corridors of the maze surrounding the aquarium, where two predatory fish live.
The player begins the game with three lives (depending on the setting); an extra brush is awarded at 10,000 points. The game ends when all the lives are lost.
The fish: The primary adversaries of the paintbrush, these serve to pursue the paintbrush and stop it from painting the entire maze. They resemble goldfish; one of them is blue, the other yellow. The fish are considered very smart, and often team together in an attempt to corner the paintbrush. If either one (or both) catches the paintbrush, it is destroyed.
Other adversaries: Once per round, depending on the stage, one of six "mischief makers" (animals, people or other objects) appears, with the sole purpose of messing up the paintbrush's work.
Hardware Specifications
Platform — NAMCO 8-bit PCB
CPU — Z80A at 3.072 MHz
ROM — 16K in four, 4K chips
RAM — ~2K
Display — Raster
Orientation — Vertical
Resolution — 224x288
Colors — 16
Attributes — Eight 16x16 hardware sprites
Refresh rate — 60.61 Hz
Sound — Monophonic 3-voice waveform sound generator chip
Controls — One 4-way leaf joystick, 1P/2P buttons
Models — Upright, and Cocktail

Source Code: Github

FPGA Games: Pac-Man

By Dhia, in Papilio Arcade

Pac-Man's game play was a stark and refreshing contrast to the space-age shoot 'em ups that were popular at the time. Pac-mania became a phenomenon and video games' first mascot was met with an insatiable demand.
How to Play Pac-Man
Pac-Man Strategy Guide
Pac-Man ROM Information
Pac-man Wiki Page

Game Play
You control Pac-Man through the maze with the joystick.
You must eat every dot and power pellet to advance to the next stage.
You must avoid contact with the ghosts while they are their normal color.
If you eat a power pellet, the ghosts will turn blue, and you may eat them for bonus points until they turn back to their normal color.
Ghosts travel at half speed through the side escape tunnels. Use them to get away.
A bonus item will appear below the ghost pen twice per stage for extra points.

Detailed Game Play Guide

Hardware Specifications
Platform — NAMCO 8-bit PCB
CPU — Z80A at 3.072 MHz
ROM — 16K in four, 4K chips
RAM — ~2K
Display — Raster
Orientation — Vertical
Resolution — 224x288
Colors — 16
Attributes — Eight 16x16 hardware sprites
Refresh rate — 60.61 Hz
Sound — Monophonic 3-voice waveform sound generator chip
Controls — One 4-way leaf joystick, 1P/2P buttons
Models — Upright, and Cocktail
Source Code URL: Github
The Papilio Plus is based on a Xilinx Spartan 6 family FPGA, specifically the LX9, which has a total of 589824 bits of internal BRAM, or 72K bytes. The Papilio Plus also has an external 256K x 16 fast SRAM attached to the FPGA in the form of an IS61WV25616BLL chip, which significantly increases the amount of RAM the FPGA has access to. While it is true that the Papilio also has 4M bits of FLASH attached in the form of the SST25VF040B, that memory is significantly slower because its contents must be accessed serially and clocked out one bit at a time.
It would be nice to have a fast, parallel access ROM attached, but how does one turn a SRAM which loses its contents at power off into a permanent non volatile ROM? This project addresses that issue!
The basic principle is to store the ROM contents in the comparatively slow serial flash, then copy them into the SRAM at power on before handing over control to the user. In other words, we need to bootstrap the SRAM, so there are two issues to address: how to upload arbitrary user data into the flash, and how to copy that data from serial flash to parallel SRAM.
Storing arbitrary data in flash
There are tools like the Papilio Loader that upload FPGA bitstreams (.bit files) into the flash, and other tools like Xilinx's data2mem which can insert user provided data into the BRAM memory locations inside a .bit file, but none of these actually help us. Our user ROM data is not stored inside the FPGA BRAM and as such it is not part of the .bit file. If the FPGA had enough BRAM to hold our ROM, we would not need the external storage at all.
This is where bitmerge.py comes in. This is a Python script which will take a valid .bit file and a user provided binary file and merge them together into another .bit file which can be uploaded to the flash using the Papilio Loader. This will all make sense once we understand what a .bit file really is.
FPGA bitstream files
Once a user design is error free and synthesized, the FPGA compiler can produce a .bit file, which is really a binary file that contains a small header followed by the actual FPGA binary bitstream. The Papilio Loader does not in fact write the entire .bit file into the flash; it only writes the FPGA bitstream contained within the .bit file. The format of the .bit file is explained here, but basically it consists of a magic number and 5 sections labeled a, b, c, d, e. The first four sections contain strings such as the design name, the device type, and the date and time the .bit file was compiled. Section e contains the actual bitstream. What the bitmerge.py script does is append the user binary ROM data to the FPGA bitstream data and rewrite the section e length to be the total combined length of both the FPGA bitstream and the user ROM data. The Papilio Loader will happily parse this new .bit file and write section e to the flash, which now includes both the FPGA bitstream and the appended user ROM data.
But won't the FPGA get confused now because new data is appended to its bitstream? In actual fact, no. On power on, the LX9 FPGA will serially shift in from the flash only the FPGA bitstream and then stop. Provided these bits are a valid FPGA bitstream, which they are, because we haven't modified the FPGA bitstream in the .bit file, the FPGA will happily run the design.
As a form of sanity checking, the bitmerge.py script will parse the original .bit file and print out the contents of sections a through d, to ensure the input file is a valid .bit file. It will also do some basic checking to ensure the files do not exceed the flash capacity, then it will print out the address in flash where the user ROM data begins.
The example below shows a merge operation and expected output:
>bitmerge.py bootstrap_top.bit rom.bin output.bit
Section 'a' (size 36) 'bootstrap_top.ncd;UserID=0xFFFFFFFF '
Section 'b' (size 12) '6slx9tqg144 '
Section 'c' (size 11) '2012/02/17 '
Section 'd' (size 9) '00:40:54 '
Section 'e' (size 340604) 'FPGA bitstream'
Merged user data begins at FLASH address 0x05327C
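The merge step described above can be sketched in a few lines of Python. This is not the actual bitmerge.py source, only a minimal illustration: the header layout (a 2-byte length, a 9-byte magic field, a 2-byte field, then keyed sections with 2-byte lengths for a to d and a 4-byte length for e) is an assumption about the .bit format, and `make_bit`, `parse_bit` and `merge` are hypothetical helper names.

```python
import struct

MAGIC = bytes.fromhex("0ff00ff00ff00ff000")  # assumed 9-byte .bit magic field


def make_bit(fields, bitstream):
    """Build a synthetic .bit-style blob (for demonstration only)."""
    out = struct.pack(">H", len(MAGIC)) + MAGIC + struct.pack(">H", 1)
    for key, text in fields.items():
        out += key.encode() + struct.pack(">H", len(text)) + text
    return out + b"e" + struct.pack(">I", len(bitstream)) + bitstream


def parse_bit(blob):
    """Return ({section key: bytes}, offset of the section e length field)."""
    (magic_len,) = struct.unpack_from(">H", blob, 0)
    pos = 2 + magic_len + 2          # skip magic field and the 2-byte field after it
    sections = {}
    while pos < len(blob):
        key = chr(blob[pos])
        pos += 1
        if key == "e":               # section e carries a 4-byte length
            (length,) = struct.unpack_from(">I", blob, pos)
            sections[key] = blob[pos + 4:pos + 4 + length]
            return sections, pos
        (length,) = struct.unpack_from(">H", blob, pos)
        sections[key] = blob[pos + 2:pos + 2 + length]
        pos += 2 + length
    raise ValueError("no section 'e' found")


def merge(blob, user_data, flash_bytes=512 * 1024):
    """Append user_data to section e and rewrite the section e length."""
    sections, e_len_pos = parse_bit(blob)
    bitstream = sections["e"]
    user_address = len(bitstream)    # user data starts right after the bitstream
    total = user_address + len(user_data)
    if total > flash_bytes:
        raise ValueError("merged image does not fit in flash")
    merged = blob[:e_len_pos] + struct.pack(">I", total) + bitstream + user_data
    return merged, user_address


demo = make_bit({"a": b"demo.ncd\0", "b": b"6slx9tqg144\0"}, b"\xAA" * 100)
merged, address = merge(demo, b"ROM!")
print("Merged user data begins at FLASH address 0x%06X" % address)
```

With the real file from the example output, section e is 340604 bytes long, which is where the reported address 0x05327C comes from.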
Copying data from flash to SRAM
Now that we've stored the ROM contents in non volatile flash, we need to copy them to SRAM on every power on so that they are available for fast access. An example project is attached which shows how this can be accomplished. At power on or external reset, the signal "bootstrap_busy" is asserted and a state machine initializes the flash by sending it the address where the user data begins. This is the address we obtained from bitmerge.py and is hardcoded in the VHDL top level code as constant "user_address". Because we keep the flash chip select asserted, we do not need to send further addresses to the flash, we simply keep clocking it, and every 8 clocks we've shifted a new byte out of the flash. The flash auto-increments the address for us, and if the maximum address is reached it rolls over to zero, though that won't happen here as we stop reading the flash when we've reached the end. As each byte is read from flash, the state machine stores it in the low byte of the SRAM's 16 bit data bus, whereas the high byte is set to zero since it's not used. When the copy from flash to SRAM has finished, the signal "bootstrap_busy" is de-asserted. This signal can be directly used as an active high reset by the user design. Finally, the signal "bootstrap_busy" drives a multiplexer which "steals" the SRAM data, address and control lines during bootstrap and connects them to "bs_*" signals, then releases them once it's done. The user can access the SRAM via the signals "user_*".
During bootstrap, the flash data range from address "user_address" to the top of flash (0x7FFFF for the 512K byte device) is copied to the SRAM starting at address zero, even if the user ROM data does not in fact occupy that entire range. The performance penalty for copying the extra data is negligible.
In the example project, the user portion of the design reads the SRAM sequentially and sends the bytes out as rs232 data at 115200 baud 8N1, formatted as 16 hex bytes per line. This is just an example and the user can replace this part of the design with whatever they wish.
The FPGA bit stream for an LX9 is about 333K bytes and the serial flash is 4M bits, or 512K bytes so the amount of space available for user ROMs is about 179K bytes. The SRAM is significantly bigger than 179K bytes so there will not be enough space in the flash to fill the entire SRAM. Also the SRAM data bus is 16 bits or 2 bytes wide. It's conceivable that a user could use the low byte lane of the SRAM, for example, to store the ROM image while using the high byte lane as actual RAM. The limitation here is that one cannot access different "ROM" and RAM addresses simultaneously, so a staggered access must be implemented.
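The space budget and the copy itself can be modelled in a few lines. This is only a software sketch of the behaviour described above, not the VHDL state machine; `user_space` and `bootstrap_copy` are hypothetical names, and the 340604 byte bitstream size is taken from the bitmerge.py example output.

```python
FLASH_BYTES = 4 * 1024 * 1024 // 8   # 4M bit serial flash = 512K bytes


def user_space(bitstream_bytes, flash_bytes=FLASH_BYTES):
    """Bytes left in flash for user ROM data after the FPGA bitstream."""
    return flash_bytes - bitstream_bytes


def bootstrap_copy(flash, user_address):
    """Model of the copy: each flash byte lands in the low byte of one
    16-bit SRAM word, starting at SRAM address zero; the high byte is zero."""
    return [byte & 0x00FF for byte in flash[user_address:]]


free = user_space(340604)
print(free, "bytes free, about", round(free / 1024), "K bytes")

# Tiny stand-in for the flash contents: 2 "bitstream" bytes, then user data.
sram = bootstrap_copy(bytes([0x11, 0x22, 0xC3, 0xFF]), user_address=2)
print([hex(word) for word in sram])
```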
Also, because the SRAM LB and UB selectors are tied together, byte access to the SRAM is disabled, so the SRAM data can only be accessed 16 bits wide at a time. For that reason, if one wants to write to the SRAM, care must be taken to not overwrite the "ROM" contents. This is accomplished with a read-modify-write cycle. The SRAM contents at a given address are read to obtain both high (RAM) and low (ROM) bytes, the high byte is updated while preserving the low byte, then the word (2 bytes) is written to SRAM again.
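The read-modify-write cycle can be illustrated with a small software model, where a Python list of 16-bit words stands in for the SRAM. The function names are hypothetical; in the FPGA this would be a short sequence of bus cycles rather than a function call.

```python
def write_ram_byte(sram, address, value):
    """Update the high (RAM) byte of a word while preserving the low (ROM) byte."""
    word = sram[address]                             # read the full 16-bit word
    word = ((value & 0xFF) << 8) | (word & 0x00FF)   # modify the high byte only
    sram[address] = word                             # write the full word back


def read_rom_byte(sram, address):
    return sram[address] & 0x00FF                    # low byte lane holds the "ROM"


def read_ram_byte(sram, address):
    return sram[address] >> 8                        # high byte lane holds the RAM


sram = [0x0042, 0x0099]          # low bytes loaded from flash at bootstrap
write_ram_byte(sram, 0, 0xA5)
print(hex(sram[0]))              # the ROM byte 0x42 is still intact
```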
Project Files Direct Download
Google Code Repository
project wiki page

What is S/PDIF?
It is the digital audio output from CD players, PCs and other consumer devices.
In brief, it consists of a stream of subframes, each containing a header (equivalent in length to 4 bits), a 24 bit signed audio sample and 4 bits of subcode data.
The encoding is such that each subframe is encoded into 64 clock cycles (2 per bit). The signal always 'flips' between each data bit, and it also flips in the middle of a '0' bit. The binary value 11001010 will get encoded as either 11-00-10-10-11-01-00-10 or 00-11-01-01-00-10-11-01, depending on the level of the signal just before the first bit. So a 44,100Hz stream consists of 32 bits per subframe * 2 clocks per bit * 2 channels * 44,100 samples per second, which gives an S/PDIF signaling rate of 5,644,800Hz.
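As a quick check of the encoding rule as stated in this article (a flip between every pair of bits, plus a flip in the middle of a '0'), here is a small Python sketch. The function names are made up for illustration, and the starting level selects which of the two possible encodings comes out.

```python
def biphase_encode(bits, level=0):
    """Encode a string of '0'/'1' characters into two half-cells per bit."""
    cells = []
    for bit in bits:
        level ^= 1                  # the signal always flips at the start of a bit
        first = level
        if bit == "0":
            level ^= 1              # extra flip in the middle of a '0' bit
        cells.append(f"{first}{level}")
    return "-".join(cells)


def signalling_rate(sample_rate_hz):
    """32 bits per subframe * 2 clocks per bit * 2 channels * sample rate."""
    return 32 * 2 * 2 * sample_rate_hz


print(biphase_encode("11001010", level=0))
print(biphase_encode("11001010", level=1))
print(signalling_rate(48_000))      # another common sample rate
```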
To provide synchronization of subframes, three header patterns are used - 00010111, 00011011, and 00011101 (and their inversions 11101000, 11100100, 11100010). Because these patterns break the usual rule of a signal change at least every other cycle, they can be used to synchronize to the start of a subframe. The three different headers indicate which channel the subframe sample is for, and whether the subframe is the start of a block of frames.

Electrical interface
Over coax, the signal is sent as a 0.5v peak-to-peak signal that needs conversion into LVTTL before it can be processed by an FPGA. I found this schematic at http://sound.westhos.../project85.htm:

Converting S/PDIF from coax to LVTTL
Implemented on a breadboard it looks like:

Converting S/PDIF from coax to LVTTL
I have tried implementing it using the FPGA's I/O pins, but it wasn't reliable - it needed an occasional poke of a finger to get it to successfully convert to TTL. I attribute this to the short circuit protection resistors on my FPGA development board, or maybe the Schottky characteristics on the FPGA's outputs.
How to capture the signal
First thing is to convert the signal into the FPGA's clock domain. I also use this to detect the flips in the input bitstream:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity resync is
  Port ( clk       : in  STD_LOGIC;
         bitstream : in  STD_LOGIC;
         flipped   : out STD_LOGIC;
         synced    : out STD_LOGIC);
end resync;

architecture Behavioral of resync is
  signal ff1,ff2 : std_logic;
begin
  -- a flip is seen whenever the two most recent samples differ
  flipped <= ff1 xor ff2;
  synced  <= ff2;

  -- two flip-flops bring the asynchronous input into the clk domain
  process (clk)
  begin
    if clk'event and clk = '1' then
      ff2 <= ff1;
      ff1 <= bitstream;
    end if;
  end process;
end Behavioral;
Failure to reclock caused me much grief.
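For readers who want to see what the two flip-flops do cycle by cycle, here is a small Python model of the resync entity. It is only a behavioural sketch (one list element per FPGA clock) and says nothing about metastability, which is the real reason for reclocking.

```python
def resync(bitstream):
    """Model of resync: returns (flipped, synced), one value per clock."""
    ff1 = ff2 = 0
    flipped, synced = [], []
    for bit in bitstream:
        ff2, ff1 = ff1, bit          # both registers update on the clock edge
        flipped.append(ff1 ^ ff2)    # high for one clock after each input change
        synced.append(ff2)           # input delayed by two clocks
    return flipped, synced


flipped, synced = resync([0, 0, 1, 1, 1, 0])
print(flipped)
print(synced)
```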
One way to recover the S/PDIF data is to count the length of the pulses, giving pulses that are either one S/PDIF clock, two clock or three clocks in length. This works well, but needs a finite state machine to work out where the headers are and then to recover the data bits.
I chose to recover something close to the sender's original clock, and use this to sample the signal into a 64 bit shift register the size of a subframe. The highest 8 bits can be checked for a subframe header, and the bits can be recovered by comparing even and odd positions in the shift register. Here's how the subframe is assembled:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity frameCapture is
  Port ( clk        : in  STD_LOGIC;
         bitstream  : in  STD_LOGIC;
         takeSample : in  STD_LOGIC;
         data       : out STD_LOGIC_VECTOR (23 downto 0);
         channelA   : out STD_LOGIC;
         dataValid  : out std_logic);
end frameCapture;

architecture Behavioral of frameCapture is
  signal frame : STD_LOGIC_VECTOR (63 downto 0) := x"0000000000000000";
begin
  -- shift in one sample each time the recovered clock asks for it
  process (clk)
  begin
    if clk'event and clk='1' then
      if takeSample = '1' then
        frame <= frame(62 downto 0) & bitstream;
      end if;
    end if;
  end process;

  -- checking for a subframe header
  process (frame)
  begin
    dataValid <= '0';
    channelA  <= '0';
    if frame(63 downto 56) = "00010111" or
       frame(63 downto 56) = "11101000" then
      dataValid <= '1';
      channelA  <= '1';
    end if;

    if frame(63 downto 56) = "00011101" or
       frame(63 downto 56) = "11100010" then
      dataValid <= '1';
      channelA  <= '1';
    end if;

    if frame(63 downto 56) = "00011011" or
       frame(63 downto 56) = "11100100" then
      dataValid <= '1';
      channelA  <= '0';
    end if;
  end process;

  -- Recovery of data bits
  data( 0) <= not frame(55) xor frame(54);
  data( 1) <= not frame(53) xor frame(52);
  -- (data(2) to data(20) are not shown here; they follow the same pattern)
  data(21) <= not frame(13) xor frame(12);
  data(22) <= not frame(11) xor frame(10);
  data(23) <= not frame( 9) xor frame( 8);
end Behavioral;
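The header check and bit recovery can be mirrored in software to make the indexing easier to follow. In this sketch `samples[0]` is the oldest sample, i.e. frame(63) in the VHDL, so data bit n compares samples 8+2n and 9+2n. The function name is hypothetical and the model follows the listing above, not the S/PDIF standard text.

```python
HEADERS = {
    "00010111": True,  "11101000": True,    # channel A
    "00011101": True,  "11100010": True,    # channel A
    "00011011": False, "11100100": False,   # the other channel
}


def decode_subframe(samples):
    """Return (channel_a, 24-bit value) or None if no header is present."""
    if len(samples) != 64:
        raise ValueError("expected 64 samples (2 per bit)")
    header = "".join(str(s) for s in samples[:8])
    if header not in HEADERS:
        return None
    value = 0
    for n in range(24):
        # data(n) <= not frame(55-2n) xor frame(54-2n): 1 when both halves match
        if samples[8 + 2 * n] == samples[9 + 2 * n]:
            value |= 1 << n
    return HEADERS[header], value


# Build a test subframe: header, 24 data bits (LSB first), 4 subcode bits.
word = 0xABCDEF
test = [0, 0, 0, 1, 0, 1, 1, 1]
for n in range(24):
    test += [1, 1] if (word >> n) & 1 else [1, 0]
test += [0] * 8
print(decode_subframe(test))
```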
So, how to regenerate something approaching the sender's clock? I chose to find the length of the shortest pulse, and then sample at 0.5x, 1.5x and 2.5x the minimum pulse length from a flip of the input signal. If the signal does not flip within four times the minimum sample time it indicates that minimum pulse length is incorrect, or the signal is no longer present.

-- (entity declaration not shown; clk, flipped and the halfPulse,
--  oneAndAHalfPulse, twoAndAHalfPulse and fourPulse thresholds are declared elsewhere)
architecture Behavioral of reclock is
  type reclock_reg is record
    count             : STD_LOGIC_VECTOR(9 downto 0);
    takeSample        : STD_LOGIC;
    resetInputCounter : STD_LOGIC;
  end record;
  signal r : reclock_reg := ("0000000000",'0','0');
  signal n : reclock_reg;
begin
  -- Work out the next state (n is in the sensitivity list because n.count is read back)
  process(flipped, r, n, halfPulse, oneAndAHalfPulse, twoAndAHalfPulse, fourPulse)
  begin
    n.count             <= r.count+1;
    n.takeSample        <= '0';
    n.resetInputCounter <= '0';

    if n.count >= fourPulse then
      n.resetInputCounter <= '1';
    end if;

    if n.count = halfPulse then
      n.takeSample <= '1';
    elsif n.count = oneAndAHalfPulse then
      n.takeSample <= '1';
    elsif n.count = twoAndAHalfPulse then
      n.takeSample <= '1';
    end if;

    if flipped = '1' then
      n.count <= "0000000001";
    end if;
  end process;

  -- Assign next State
  process (clk)
  begin
    if clk'event and clk = '1' then
      r <= n;
    end if;
  end process;
end Behavioral;
Here is the original bitstream, and a second trace of the trigger used for sampling:

A S/PDIF frame, showing the Sampling trigger pulses
This is sub-optimal. If the minimum pulse is just under 5 FPGA cycles it is measured as 4 cycles, so the last sample is taken at 2.5 x 4 cycles = 10 cycles, which is close enough to the end of a pulse of two minimum lengths (just under 10 cycles) that a sampling error can occur. Maybe sampling at (minimum pulse len-1), (2*minimum pulse len-1), (3*minimum pulse len-1) would be better when the FPGA clock rate is not many times that of the S/PDIF signaling rate.
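The margin problem can be put into numbers with a few lines of Python. This is one reading of the concern, under stated assumptions: the minimum pulse is measured in whole FPGA clocks, the thresholds are 0.5x, 1.5x, 2.5x and 4x that measurement (how the real design derives them is not shown), and a pulse of two minimum lengths should be sampled exactly twice. The function names are hypothetical.

```python
def sample_points(measured_min):
    """Clock counts (after a flip) for the three samples and the timeout."""
    return (measured_min // 2,
            measured_min * 3 // 2,
            measured_min * 5 // 2,
            measured_min * 4)


def slack_before_extra_sample(measured_min, true_min, units=2):
    """FPGA clocks between the end of a pulse `units` long and the next
    sample point. A small value means a spurious extra sample is likely."""
    return (units + 0.5) * measured_min - units * true_min


print(sample_points(4))
print(slack_before_extra_sample(4, 4.9))   # pulse really 4.9 clocks, measured as 4
print(slack_before_extra_sample(8, 8.9))   # a faster FPGA clock gives more room
```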
And that is pretty much it
Converting samples back to audio
Once you have the data, it's pretty simple to send it into two generic 1-bit DACs and listen to the sound. Just remember to convert the signed integer sample into an unsigned value for the DAC by inverting bit 15:

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity dac16 is
  Port ( clk     : in  STD_LOGIC;
         data    : in  STD_LOGIC_VECTOR (15 downto 0);
         dac_out : out STD_LOGIC);
end dac16;

architecture Behavioral of dac16 is
  signal sum : STD_LOGIC_VECTOR (16 downto 0) := "01000000000000000";
begin
  -- the carry out of the 16 bit accumulator is the 1-bit output
  dac_out <= sum(16);

  process (clk)
  begin
    if clk'event and clk = '1' then
      -- Don't forget to flip data(15) to convert it to an unsigned int value
      sum <= ("0" & sum(15 downto 0)) + ("0" & (not data(15)) & data(14 downto 0));
    end if;
  end process;
end Behavioral;
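A quick software model shows why the accumulator works as a DAC: for a constant input, the fraction of clocks on which the output is high is proportional to the unsigned sample value. This is only a sketch that mirrors the arithmetic of the listing above; the function names are made up.

```python
def to_unsigned16(sample):
    """Signed 16-bit sample to offset binary, i.e. invert bit 15."""
    return (sample & 0xFFFF) ^ 0x8000


def dac_stream(sample, clocks):
    """1-bit output of the accumulator for a constant input sample."""
    acc = 0x8000                         # same start value as 'sum' in the VHDL
    level = to_unsigned16(sample)
    out = []
    for _ in range(clocks):
        acc = (acc & 0xFFFF) + level     # 17-bit sum of the low 16 bits and the input
        out.append(acc >> 16)            # bit 16 (the carry) drives the pin
    return out


for sample in (-32768, 0, 16384, 32767):
    ones = sum(dac_stream(sample, 65536))
    print(sample, "->", ones, "high clocks out of 65536")
```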
Output was just through headphones connected between the DAC output and ground. This is not ideal, but as my development board has 220 Ohm resistors on all lines it couldn't harm it. A better way would be to add a low pass filter, and a capacitor to block DC. All the same, the volume was loud enough that I had to use the inline volume control.
Seven segment displays are useful for showing data, debug info such as addresses, etc. Instead of using a physical expansion board (wing) which costs money and has a limited number of digits, this simply uses an existing VGA output on a wing to implement any number of seven segment displays limited only by the available screen real estate.
In principle, if your project already uses the VGA output, such as say PacMan, it should be quite easy to superimpose the seven segment display on top of the existing video output by logically ORing the seven segment display output with the existing VGA signal.
The position, size and color of each digit is configurable. A simplistic demo top module is included to generate some static and dynamic digits of various sizes and colors, see picture below.
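The idea of ORing the digits over existing video can be illustrated in software. The sketch below is not taken from the project source: it uses the conventional segment lettering (a at the top, then clockwise b to f, g in the middle), a toy 3x3 character grid instead of real pixel geometry, and hypothetical function names.

```python
# Segment patterns for hex digits 0..F, bit 0 = segment a ... bit 6 = segment g.
SEGMENTS = [0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07,
            0x7F, 0x6F, 0x77, 0x7C, 0x39, 0x5E, 0x79, 0x71]


def render(digit):
    """Return a 3x3 character mask for one digit."""
    pattern = SEGMENTS[digit]

    def seg(bit, char):
        return char if (pattern >> bit) & 1 else " "

    return [" " + seg(0, "_") + " ",
            seg(5, "|") + seg(6, "_") + seg(1, "|"),
            seg(4, "|") + seg(3, "_") + seg(2, "|")]


def overlay(background, mask, colour):
    """OR the digit colour into the background wherever a segment is lit."""
    return [[pixel | (colour if cell != " " else 0)
             for pixel, cell in zip(bg_row, mask_row)]
            for bg_row, mask_row in zip(background, mask)]


print("\n".join(render(8)))
frame = [[0b001] * 3 for _ in range(3)]      # existing video: dim blue everywhere
print(overlay(frame, render(1), 0b110))      # digit drawn in red+green bits
```

Because the combination is a plain OR, the digits can only add brightness to the picture underneath; they cannot darken it.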
Source Code : Source code on Google Code
6K sample memory at 32 channels, 24K sample memory at 8 channels
32 channels sampling at 100MHz
16 channels sampling at 200MHz
Four stage serial and parallel triggering
External clock input
Noise filter
RLE built into the hardware to make the most of available memory.
SPI protocol analysis (SPI debugger)
I2C protocol analysis (I2C debugger)
UART protocol analysis (UART debugger)
State Analysis

The FPGA can only sample 1.2V, 2.5V, and 3.3V. Any higher voltages can damage the input pins of the FPGA. Given time a plugin board will be developed to address this issue.

NOTE: The current 2.12 release is based on the Openbench Logic Sniffer 2.12 source code. There are issues with RLE and test mode, and there are some failing timing constraints in the project. This release has not been well tested; it is being released while the new Verilog branch of the project is completed. The Verilog branch should fix all of the above issues and will be available soon.
Sources and Attribution
Michael Poppitz was the original author of this great Logic Analyzer design. He wrote the original VHDL and Java client and released it under the GPL at http://www.sump.org/projects/analyzer/. Please visit his website for more information.
Jonas Diemer took the original design and ported it to the Spartan 3E by utilizing BRAM instead of SRAM. He also integrated RLE into the design. His source can be downloaded here.
The very latest development for the Java client is hosted on SourceForge here.
OakMicros has created a very nice tutorial for the Java client here. They also offer a nice buffer card to allow any voltage level to be sampled. It is not currently compatible with the Butterfly Platform but watch for an adapter in the future.


Main window (light theme) with scope.

Main window (light theme) with measure tooltip.

Main window (dark theme).

Measurement tool.

OLS general settings.

OLS trigger settings.

General preferences.

For more information about this project, please visit the Sump Logic Analyzer wiki page.
Papilio Barcode Genie Kit
Barcodes, Barcodes Everywhere. Everywhere we turn we are surrounded by barcodes, but without the right tools to capture them they are just wasted information. There are countless times in our day to day life where a portable and hackable barcode scanner can save loads of time.
The first time I wished I had a portable barcode scanner was several years ago when I was working as a system administrator. We were tasked with doing a physical inventory of every server in our data center. This daunting task was a perfect example of what drives me crazy about computers; they are supposed to save us from brain dead and repetitive tasks! But too often they actually end up forcing us into brain dead and repetitive tasks… In this case we were surrounded by enough processing power to send a man to the moon but we had to resort to writing down serial numbers on pads of paper. Every rack and server had barcodes that were just crying to be quickly scanned, but instead we had hours upon hours of practicing our penmanship. I mean come on, this is 2010 we are supposed to have HAL and flying cars! The least we can do is make a hackable barcode scanner to help keep simple tasks easy, right?
The Papilio Barcode Genie kit is a flexible, expandable, and portable barcode scanning kit. It is based on an FPGA for maximum hardware flexibility and written in the Arduino IDE for maximum ease of use. It is a platform that encourages hacking and remixing to adapt it to exactly what you want it to do. Anything you’ve ever wanted to do with barcodes is possible; save them to an excel spreadsheet on a SD card, send them out wirelessly over Zigbee, or add a TFT LCD with a touchscreen. Open Source code and modular building blocks help bring your ideas to life!
The Papilio Barcode Genie kit provides the base system needed to capture barcodes using an off the shelf PS/2 barcode scanner. Captured barcodes are saved to a microSD card in a spreadsheet format (csv) that can be opened directly by Microsoft Excel. A high speed USB channel allows barcodes to be saved directly to a computer if desired. But most importantly, three 8-bit Wing slots and FPGA fabric provide the unconstrained potential to make it your own.

The Papilio Barcode Genie kit is comprised of the Papilio One 250K, which provides the core, and Wings that provide the peripheral functionality. The Papilio One 250K provides the flexible FPGA core that ties everything together. A PS/2 Wing provides two PS/2 connectors that any PS/2 compatible barcode scanner can plug into. The microSD Wing provides the socket that a large capacity microSD card connects to. Finally, a Button/LED Wing provides visual feedback through the LED’s and input to control the application using buttons.
Papilio One FPGA Board
The Papilio One 250K FPGA board provides the empty canvas that the Barcode Genie is built on. The Xilinx Spartan 3E FPGA chip that is used on the Papilio One is like a blank, rewritable CD. The Barcode Genie ships with an AVR8 soft processor already ‘burned’ onto it. It is compatible with the ATMega103 microcontroller and provides a high degree of compatibility with the Arduino IDE and existing sketches. The beauty of a soft processor is that new features can be hacked in. If you want to add a SPI Ethernet Wing but are worried about the SPI pins already being used by the microSD card Wing then worry no more. The soft processor can move the SPI pins on command or another SPI core can be added and ‘re-burned’ to the Papilio One. And that is the key element that the FPGA buys you, no longer do you have to think about whether the SPI pins are connected or if the chip has such and such ability. You just know you can add whatever you need.
With the Papilio Platform the name of the game is flexibility, so once you are done scanning barcodes you can use the Papilio One as a Logic Analyzer, another Papilio kit, or even a full blown FPGA development kit. The very popular “Sump” Logic Analyzer is fully ported and supported by all Papilio One boards. Just download the project and enjoy using a 32 channel 100MHz Logic Analyzer. Or, just swap out the Wings to play with any future Papilio kits we develop, such as a Magnetic Strip Genie. Finally, and maybe most importantly, the Papilio One is a full FPGA development kit that you can start learning VHDL/Verilog with by simply downloading the free Xilinx ISE software.
PS/2 Wing

The PS/2 Wing provides two PS/2 sockets that enable the use of commodity barcode scanners. PS/2 barcode scanners are inexpensive, plentiful, and reliable. Instead of re-inventing the wheel with a homebrew barcode scanner, PS/2 barcode scanners were targeted instead. While the Barcode Genie does not include a barcode scanner it is easy and affordable to get one. At the time of writing there were barcode scanners on Ebay for as little as $10. In fact, we won an Ebay auction that included two used barcode scanners for $12.
The PS/2 Wing design is Open Source under the Creative Commons license. It is designed in EAGLE and can be downloaded from the development page at GadgetForge. The design provides a 5V source to pin 4 of the PS/2 connector and ground to pin 3. Data (PS/2 pin 1) and CLK (PS/2 pin 5) are connected to the I/O pins of the Wing header. Each data pin is protected by a 270 ohm current limiting series resistor that protects the pins from too much current that can be generated by the 5V PS/2 pins.
microSD Wing

The microSD Wing provides a convenient means of storing large amounts of data on removable media. The Barcode Genie kit does not include a SD card in order to allow the most flexibility in choosing the appropriate sized microSD card. All development and testing was done with a 2GB Kingston microSD card.
The microSD Wing design is also Open Source under the Creative Commons license. The EAGLE design files can be found at the GadgetForge development page. At the core of the design is a spring loaded microSD socket. The Barcode Genie uses SPI mode which only uses four pins but all six of the microSD card signals are routed to the Wing header. This will allow for faster communications if a full speed SD card core is added to the soft processor in the future. All I/O pins, except SCK, are routed to the Wing header through a 47K pull up resistor array.
Button/LED Wing

The Button/LED Wing provides all of the resources needed to communicate with the program running on the Barcode Genie kit. Four LED’s provide visual feedback for when the Barcode Genie is ready to scan the next barcode, when it has scanned the last barcode, and the state of the SD card.
Four buttons allow user input for things such as starting a new row of barcodes in the Excel csv file. It can also form the basis for a navigation system with the addition of a character LCD Wing.
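To make the row handling concrete, here is a small Python sketch of the logging behaviour described above: each scan is appended to the current row, and a button press starts a new row. The real kit does this in an Arduino sketch with the SdFat library; the class and method names here are purely illustrative, and the barcode values are made-up examples.

```python
import csv
import io


class ScanLog:
    """Collects scanned barcodes into rows of a csv file."""

    def __init__(self):
        self.rows = [[]]

    def scan(self, barcode):
        self.rows[-1].append(barcode)        # add to the current row

    def new_row(self):
        if self.rows[-1]:                    # ignore repeated button presses
            self.rows.append([])

    def to_csv(self):
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerows(row for row in self.rows if row)
        return buffer.getvalue()


log = ScanLog()
log.scan("012345678905")
log.scan("4006381333931")
log.new_row()                                # the "new row" button
log.scan("9780201379624")
print(log.to_csv(), end="")
```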


The software for the Barcode Genie is written as a sketch in a specially modified version of the Arduino IDE. The sketch is compiled to run on the AVR8 soft processor and is loaded to the Papilio One by the Arduino IDE. The Barcode Genie kit is shipped with the latest version of the sketch loaded into the SPI Flash so that the Papilio One will automatically run the sketch when powered on. Making changes to the sketch is as easy as making changes to the Arduino. Just hack/remix the code in the easy to use IDE and press the Upload icon. The Arduino IDE takes care of everything else and hides the nitty gritty technical details of the C++ libraries and FPGA hardware so you can focus on hacking!
The Barcode Genie sketch is kept simple in order to make it easy to make it your own. Out of the box it can scan barcodes and save them to an Excel compatible csv file on an SD card. But where the kit really excels is making it easy to adapt it to anything you want to do with barcodes. Need to make it wireless? Just add a Zigbee Wing, include the Zigbee libraries, and start using the high level methods from the library. That is the secret sauce of the Arduino IDE, it encourages “client” object oriented programming. Experienced programmers create C++ libraries that provide objects that are used by less experienced “client” programmers. The experienced programmers boil down the complexity of the library to just a few methods that client programmers need to learn and use. The end result is that relatively new programmers can accomplish amazing things by just including a library, studying some examples, and remixing a sketch to suit their needs. It’s very exciting, powerful, and inviting to new programmers. The Papilio Platform continues this tradition by providing libraries that are coupled with hardware Wings to simplify things even further!
The Barcode Genie finally makes it easy to capture barcodes without the huge overhead of carrying a laptop around. This simple device can be mounted next to a refrigerator to make grocery lists or carried around in a pocket for taking inventory. Being built from the ground up with the express purpose of easy hacking makes it simple to bring new ideas and uses to life.

Mount the Barcode Genie next to a refrigerator and scan empty containers as food is consumed. Pull the SD card before shopping for an instant shopping list that can be printed out or accessed on a smart phone. Access the list on a computer and do a Google search to find the best price on shippable goods like canned foods.
Plug a battery into the Barcode Genie and make short work of inventory tasks. Do a physical inventory of every server in a data center. Scan the barcode of every book at a thrift store to find books with value. Keep an inventory of electronics parts as they arrive.

Future Direction
Here are some proposed ideas for how the Barcode Genie can be hacked/remixed for new applications:

Add a character LCD module for the display of more meaningful information such as the barcode scanned or scan count.
Add a Zigbee Wireless Wing to allow the Barcode scanner to communicate back to the Internet or a central database.
Add a touchscreen display (a Wing is in the works) to display more complicated data such as price comparisons from the Internet.
Add a scale over a RS232 port that allows instant inventory of a bag or bin of parts. The barcode genie can store a database of what each part weighs and subtract the weight of the container to come up with an accurate count of parts. Couple this information with an advanced display or a wireless connection for an advanced inventory system.
Develop software that takes a shopping list generated by scanning barcodes and then uses the Internet to find coupons and the best price for the automatically generated shopping list.
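The scale idea in the list above comes down to simple arithmetic: subtract the container weight and divide by the weight of one part. A minimal sketch, with a hypothetical function name and made-up weights:

```python
def count_parts(gross_grams, container_grams, unit_grams):
    """Estimate how many parts are in a container from its total weight."""
    if unit_grams <= 0:
        raise ValueError("unit weight must be positive")
    net = gross_grams - container_grams
    return round(net / unit_grams)


# A bin weighing 12 g holding parts of 2.5 g each, 262 g on the scale.
print(count_parts(262.0, 12.0, 2.5))
```

In a real application the stored unit weight would be looked up by the scanned barcode.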

William Greiman – Wrote the SdFat library that plays such a central role in the Barcode Genie kit.
Benjamin Maus – Wrote the barcode scanning code that is equally critical to the operation of the Barcode Genie kit.
Arduino Team – Made a great IDE that makes it easy to modify the functionality of the Barcode Genie with sketches.
Wiring Team – Came up with the idea of porting Processing to microcontrollers.
Processing Team – And finally, the Processing team who got everything started and provided the framework for Wiring.

Please visit the Papilio Barcode Genie wiki page for more information about this project.
Demo Video
Alex shows the Papilio One playing sid files:

Readme.txt (Written by Alex)
Alex - Author of this project for the Papilio.
Markus Gritsch - HybridSID
PACEDev.net - The source for the SID VHDL project. We have attempted to determine the original author of the SID core but have been unable to yet. If this SID VHDL project is yours please contact us so we can provide proper attribution and clarify the license.

Source Code: NetSID.zip

TV Output Wing

By ben, in Papilio Wings,


Quick Links
Forum thread
Source code
Schematic in PDF format
EAGLE PCB Design on Github
Purchase PCB at BatchPCB
Parts List at Mouser


7/13/2011 Revision 1.0 of Wing was designed and submitted to BatchPCB for manufacturing.
7/15/2011 The design is available for $8 at BatchPCB, please be aware that it is not verified yet and may not work correctly.
7/26/2011 Design is built and verified. There is an issue with the example bit file: its pins are defined in the opposite order to the Wing.

BenL is working on a Sega Master implementation on the Papilio FPGA board. During that process he created a TV Output Wing:
"This TV output wing could be an interesting replacement for the VGA+jack outputs of the arcade kit. It's a very simple R-2R ladder 7 bit DAC for the video line, and a plain low pass filter for the audio (the DAC must be implemented through logic).
You can very easily output B/W video, and colors with some efforts :"

The schematics :

You should use R=115 ohms and 2R=230 ohms to get 0.3V from "010000" (black) on the input and 1.0V from "110000" (white), the remaining "001111" being left for color modulation at high luminance.
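These resistor values can be sanity checked with a little arithmetic. The sketch below assumes an ideal R-2R ladder driven from 3.3V outputs, whose output resistance equals R, loaded by the 75 ohm input of the TV, and it treats the quoted codes as 6-bit values. Those are modelling assumptions, not details taken from the schematic.

```python
def r2r_output(code, bits=6, vref=3.3, r=115.0, load=75.0):
    """Voltage across the load for an ideal R-2R ladder with output resistance r."""
    open_circuit = vref * code / (1 << bits)     # unloaded ladder voltage
    return open_circuit * load / (r + load)      # divider formed with the load


print(round(r2r_output(0b010000), 2))   # black level, should be near 0.3V
print(round(r2r_output(0b110000), 2))   # white level, should be near 1.0V
```

Under these assumptions the two codes come out at roughly 0.33V and 0.98V, which is in line with the figures quoted above.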
A picture of the homemade prototype :

Color Test Bar
A small demo: compile this, plug your tv_output wing in A8-A15 and enjoy 8 color bars in the middle of the screen: yellow, cyan, green, magenta, red, blue, black and white. The overscan is grey.

You can switch in main.vhd between pal and ntsc (*_video to produce the correct sync, *_encoder to transform rgb to quadrature modulated yuv)
The image is good, but far from perfect: low saturation on yellow and cyan, interference between luminance and color carrier, and some dot crawl (although you might not notice it because it scrolls vertically really fast).

How it works
You need to generate sync, luminance (monochrome image) and color information. See http://www.deetc.ise...ds/doc/ch08.pdf for details on the theory.
The provided encoder turns RGB with 2 bits per channel (64 colors) into YUV by a table lookup, and modulates the U/V signals by multiplying them by a square wave at the color carrier frequency (3.58MHz for NTSC, 4.43MHz for PAL) in phase (for U) and quadrature (for V). The color clock is generated from a 64MHz clock (thanks to a 21 bit accumulator), itself derived from the main 32MHz clock of the Papilio.
Sync, Y, U and V are summed inside the FPGA, and converted by the R-2R DAC into the corresponding voltage: the resistor values are chosen to turn the 3.3V of the Papilio output into the 0, 0.3 and 1.0V under 75 ohms that the TV requires.
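The accumulator trick for generating the color clock can be sketched as follows. A 21 bit phase accumulator clocked at 64MHz is incremented by a constant each clock, and its top bit is the square wave carrier. The increment formula and function names are illustrative assumptions rather than code from main.vhd; the example uses the standard NTSC subcarrier of 3.579545MHz.

```python
F_CLOCK = 64_000_000      # accumulator clock, from the text
BITS = 21                 # accumulator width, from the text


def phase_increment(f_carrier, f_clock=F_CLOCK, bits=BITS):
    """Constant added to the accumulator on every clock."""
    return round(f_carrier * (1 << bits) / f_clock)


def actual_frequency(increment, f_clock=F_CLOCK, bits=BITS):
    """Carrier frequency really produced by a given increment."""
    return increment * f_clock / (1 << bits)


def carrier_samples(increment, clocks, bits=BITS):
    """Top bit of the accumulator, one value per clock."""
    acc, out = 0, []
    for _ in range(clocks):
        acc = (acc + increment) & ((1 << bits) - 1)
        out.append(acc >> (bits - 1))
    return out


inc = phase_increment(3_579_545)
print(inc, actual_frequency(inc))
print(carrier_samples(1 << 19, 8))    # a quarter turn per clock gives a 16MHz square wave
```

With 21 bits at 64MHz the frequency step is about 30.5Hz, so the generated carrier can be at most about 15Hz away from the target.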
Useful Links
Information on converting Video Formats