awallin

Free FPGA resources with AVR8 or ZPUino?


Hi all,

 

I'm wondering what fraction of the FPGA resources on a Papilio Pro are free to use for custom logic when an AVR8 or ZPUino is loaded onto the FPGA?

 

Are there common things (software defined radio? or other big/nontrivial things) people want to do with the PPro that won't fit together with a Soft Processor?

 

Can someone give an example of the complexity/size of VHDL code that will fit (or not fit) along with an AVR8/ZPUino?

 

thanks,

AW


Hi,

 

Attached is a place-and-route report from a "vanilla" ZPU (no Wishbone etc.).

All in all, it's pretty clear that most of the floorspace is left for your own project: the report shows 11% of the LUTs and 16% of the slices in use.

 

Note that this ZPU variant emulates opcodes, needs a bit more RAM and is slower than it could be. For example, it uses zero DSP48 slices (multipliers), probably because multiplications are done in software (I never tried). There are other processors, like Pacoblaze (again, never tried it myself).

 

For radio experiments, why not prototype the control algorithms on a PC and use USB in between as a register-level interface?

 

 

Even if you are planning a complex design, get some cheap'n'cheerful low-end board first. Consider it disposable; it will probably become your "Swiss army knife".

With the Papilio Pro it takes about half a second to upload the bitstream. With high-end boards it can take much longer. Faster devices are possibly also more sensitive to damage. I've seen one 3 k€ board go up in smoke and no, it wasn't me.

 

Assuming you haven't worked with FPGAs before, expect that every step on the way is a small uphill battle. You can (and possibly should) spend days learning the tools, at a level where your design only makes the LED blink. For SDR, you'll have to re-think your algorithms with an FPGA mindset (i.e. pipelining). Start with some small problems.

 

The vendors try to make you believe that you can simply stack their DSP blocks like in a textbook, but that's largely a marketing ploy. Well, you can, but then the hardware cost goes through the roof.

 

 

Device Utilization Summary:

Slice Logic Utilization:
  Number of Slice Registers:                   164 out of  11,440    1%
    Number used as Flip Flops:                 164
    Number used as Latches:                      0
    Number used as Latch-thrus:                  0
    Number used as AND/OR logics:                0
  Number of Slice LUTs:                        663 out of   5,720   11%
    Number used as logic:                      656 out of   5,720   11%
      Number using O6 output only:             539
      Number using O5 output only:              61
      Number using O5 and O6:                   56
      Number used as ROM:                        0
    Number used as Memory:                       0 out of   1,440    0%
    Number used exclusively as route-thrus:      7
      Number with same-slice register load:      4
      Number with same-slice carry load:         3
      Number with other load:                    0

Slice Logic Distribution:
  Number of occupied Slices:                   229 out of   1,430   16%
  Number of MUXCYs used:                       140 out of   2,860    4%
  Number of LUT Flip Flop pairs used:          671
    Number with an unused Flip Flop:           511 out of     671   76%
    Number with an unused LUT:                   8 out of     671    1%
    Number of fully used LUT-FF pairs:         152 out of     671   22%
    Number of slice register sites lost
      to control set restrictions:               0 out of  11,440    0%

  A LUT Flip Flop pair for this architecture represents one LUT paired with
  one Flip Flop within a slice.  A control set is a unique combination of
  clock, reset, set, and enable signals for a registered element.
  The Slice Logic Distribution report is not meaningful if the design is
  over-mapped for a non-slice resource or if Placement fails.

IO Utilization:
  Number of bonded IOBs:                         4 out of     102    3%
    Number of LOCed IOBs:                        4 out of       4  100%

Specific Feature Utilization:
  Number of RAMB16BWERs:                         2 out of      32    6%
  Number of RAMB8BWERs:                          0 out of      64    0%
  Number of BUFIO2/BUFIO2_2CLKs:                 0 out of      32    0%
  Number of BUFIO2FB/BUFIO2FB_2CLKs:             0 out of      32    0%
  Number of BUFG/BUFGMUXs:                       1 out of      16    6%
    Number used as BUFGs:                        1
    Number used as BUFGMUX:                      0
  Number of DCM/DCM_CLKGENs:                     0 out of       4    0%
  Number of ILOGIC2/ISERDES2s:                   0 out of     200    0%
  Number of IODELAY2/IODRP2/IODRP2_MCBs:         0 out of     200    0%
  Number of OLOGIC2/OSERDES2s:                   0 out of     200    0%
  Number of BSCANs:                              0 out of       4    0%
  Number of BUFHs:                               0 out of     128    0%
  Number of BUFPLLs:                             0 out of       8    0%
  Number of BUFPLL_MCBs:                         0 out of       4    0%
  Number of DSP48A1s:                            0 out of      16    0%
  Number of ICAPs:                               0 out of       1    0%
  Number of MCBs:                                0 out of       2    0%
  Number of PCILOGICSEs:                         0 out of       2    0%
  Number of PLL_ADVs:                            0 out of       2    0%
  Number of PMVs:                                0 out of       1    0%
  Number of STARTUPs:                            0 out of       1    0%
  Number of SUSPEND_SYNCs:                       0 out of       1    0%
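
To turn a report like the one above into "percent free" figures, something along these lines might help. It is only a rough sketch in Python: the function names `parse_utilization` and `free_percent` are made up for illustration, and the regular expression assumes the "N out of M" line layout shown in this particular report.

```python
import re

# Matches lines such as "Number of Slice LUTs:   663 out of   5,720   11%".
# Assumption: the ISE map/PAR report keeps this "N out of M" layout.
LINE_RE = re.compile(
    r"^\s*Number of (?P<name>.+?):\s+(?P<used>[\d,]+) out of\s+(?P<total>[\d,]+)"
)


def parse_utilization(report: str) -> dict:
    """Return {resource name: (used, total)} for every 'N out of M' line."""
    result = {}
    for line in report.splitlines():
        m = LINE_RE.match(line)
        if m:
            used = int(m.group("used").replace(",", ""))
            total = int(m.group("total").replace(",", ""))
            result[m.group("name")] = (used, total)
    return result


def free_percent(used: int, total: int) -> float:
    """Percentage of a resource that is still unused."""
    return 100.0 * (total - used) / total


# A few lines copied from the vanilla ZPU report above.
SAMPLE = """\
  Number of Slice Registers:                   164 out of  11,440    1%
  Number of Slice LUTs:                        663 out of   5,720   11%
  Number of occupied Slices:                   229 out of   1,430   16%
  Number of RAMB16BWERs:                         2 out of      32    6%
"""

if __name__ == "__main__":
    for name, (used, total) in parse_utilization(SAMPLE).items():
        print(f"{name}: {free_percent(used, total):.1f}% free")
```

Keep in mind that these are raw counts only; they say nothing about whether a second design will still route and meet timing.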
 


Ok, so I just recently gathered these numbers for the Papilio Schematic Library. One of the main benefits of the ZPUino is that it is small so there is quite a bit of FPGA space left over for your designs.

 

Here is a snapshot of the ZPUino symbol for the Papilio Pro:

[Image: ZPUino schematic symbol for the Papilio Pro]

 

And here is a snapshot of the ZPUino symbol for the Papilio One, both 250K and 500K:

[Image: ZPUino schematic symbol for the Papilio One 250K and 500K]

 

As you can see, all of the important stats of the basic "Vanilla" design are written down in the schematic symbol.

 

For the Papilio Pro we have around 66% of FPGA slices available, 21 2KB Block Rams available, 1 PLL and 2 DCM clock generators available, and 8MB of code space available.

 

For the Papilio One 500K there is 63% of FPGA slices available, 2 2KB Block Rams available, 2 DCM clock generators available, and 27KB of code space available.

 

For the Papilio One 250K there is 30% of FPGA slices available, 2 2KB Block Rams available, 2 DCM clock generators available, and 12KB of code space available.

 

I would not recommend the Papilio One 250K, since there is not a lot of space free, but the 500K and the Papilio Pro have plenty of resources available.

 

We are working hard on getting the Papilio Schematic Library in place with lots of good tutorials and documentation so you will have everything you need to make your system as a Wishbone peripheral that can plug right into one of the Wishbone Slots. And even better, we have a Logic Analyzer that you can use to debug your design...

 

Jack.


Note that the number of free slices can be a bit misleading. P&R will spread the design over whatever resources are available in order to meet timing and be faster, so the number of used LUTs is probably the best way to tell how big a design is (flip-flops are effectively free, since fewer of them are used than LUTs).

 

You will get into trouble if you use more than 90% of the LUTs, because P&R will probably not be able to meet timing: it can no longer place the LUTs "freely" in the available space.
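
As a rough illustration of that rule of thumb, here is a small Python sketch that estimates how many LUTs are left before the 90% mark. The function name `lut_headroom` is hypothetical, and the 90% ceiling is just the guideline mentioned above, not a hard limit of the tools.

```python
def lut_headroom(used_luts: int, total_luts: int, ceiling_percent: int = 90) -> int:
    """LUTs that can still be added before utilization exceeds the ceiling.

    ceiling_percent is a rule-of-thumb value (assumption), not a tool limit.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    budget = total_luts * ceiling_percent // 100
    return max(0, budget - used_luts)


if __name__ == "__main__":
    # Vanilla ZPU from the report earlier in the thread: 663 of 5,720 LUTs used.
    print(lut_headroom(663, 5720))
```

For the vanilla ZPU report posted earlier (663 of 5,720 LUTs) this leaves 4,485 LUTs under the 90% ceiling, which is only an estimate of what will comfortably fit.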

 

Another problem is that sometimes flip-flops cannot be packed into the same slice because of control set restrictions.

 

In this design I have over here (ZPUino 2, PPRO) you can see that about 60% of the LUTs are being used:

 

  Number of Slice LUTs:                      3,479 out of   5,720   60%

 

But the slice utilization is considerably higher:

 

  Number of occupied Slices:                 1,275 out of   1,430   89%

 

We can also see that 6% of the slice register sites were lost because the registers could not be packed into the same slice:

 

    Number of slice register sites lost
      to control set restrictions:             763 out of  11,440    6%

 

So, there is no "direct" way to tell how much free space you have on the FPGA. It will depend on a variety of factors.
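
To make the gap between the LUT and slice figures concrete, here is a small Python sketch using the numbers quoted above. The helper name `packing_density` is made up for illustration; the value of 4 LUTs per slice is inferred from the report totals (5,720 LUTs across 1,430 slices).

```python
LUTS_PER_SLICE = 4  # inferred from the report totals: 5,720 LUTs / 1,430 slices


def utilization(used: int, total: int) -> float:
    """Fraction of a resource in use, between 0.0 and 1.0."""
    return used / total


def packing_density(used_luts: int, occupied_slices: int) -> float:
    """Fraction of the LUT sites inside occupied slices that are really used."""
    return used_luts / (occupied_slices * LUTS_PER_SLICE)


if __name__ == "__main__":
    # Numbers from the ZPUino 2 / Papilio Pro report lines quoted above.
    print(f"LUT utilization:   {utilization(3479, 5720):.1%}")
    print(f"Slice utilization: {utilization(1275, 1430):.1%}")
    print(f"Packing density:   {packing_density(3479, 1275):.1%}")
```

The occupied slices are on average only about two-thirds full of LUTs, which is why the slice count looks much worse than the LUT count for the same design.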

 

Alvie


Yes, please regard the free slices as just a guesstimate of how much space is available; the true picture is too complicated to convey easily. There are just too many factors, and it depends on what the VHDL code you implement does. There is no easy answer to how much space is available, so I usually just settle on slices as a general estimate.

 

Jack.


One way to get reliable numbers is to synthesize, draw a PBLOCK in the floorplanning window and assign all primitives to it. P&R will then take a little longer (or, in the worst case, never finish). If it succeeds, you can see that everything fits into the box, and you can be fairly certain that the remaining floorspace, including the MEM and DSP blocks, is really yours.

 

BTW, there is no need to buy any hardware to get started. Simply download ISE 14.7 / "PlanAhead" any time you like; it's free. In reality, though, I'd expect that few people will find the motivation to go through the mandatory epic struggle with the tools without having some shiny new toy in front of them.

