About EtchedPixels

  1. If you look at the fork of it I did, that already has banked CP/M 3.0 support. Mostly I did that because I wanted 512 byte/sector support, as it also supports the use of an SD card for disk.
  2. With suitable level shifters you can simply wire the 6502 directly to the FPGA. There are similar projects that use a PIC or similar device to do this. Another option, which has its own big set of advantages, is to use one of the 6502 FPGA cores and put the entire thing on the FPGA. That way you can have your 32MHz 6502 with 512KB of banked memory and all the other bits you fancy. Alan
  3. Most of a multi-core CPU is the memory interface. At least you've got dual ported RAM (with the odd fun 'feature' - definitely read the chip documentation!) so some of the horrible bits are done for you.
  4. I did look at it a bit, and figured it was easier to wait for the Duo board with SRAM. For now I'll stick with the SocZ80. It ain't simple but a 128MHz Z80 makes a wicked CP/M dev box.
  5. You don't need call/return providing you can get access to PC somewhere. Ultimately, however, all 8-bit microprocessor designs that pursue elegance evolve into a 6809 (and I say that as a Z80 fan).

     You don't actually need a lot of instructions to get a pretty effective processor, but how easy it is to program for and how short the code is are rather different questions. The 6502, for example, is pretty minimal and quite effective (if a pita to program), while the 8008 is minuscule but does lack a proper stack and arbitrary depth call/return.

     In your instruction set I'd say you can drop NAND (you have NOT and AND), and you can drop NOT (XOR). You can in theory even drop SUB as you have ADD and XOR (thus NOT). Some of the bigger machine word systems also didn't have a jump instruction as such: you merely need store-conditional and you can treat the program counter as a register. That also makes stacks or register link calls trivial.

     SL/SR can both be replaced with the more useful rotate operation, which is as cheap to implement but can do shift left/shift right/rotate left/rotate right if combined with AND.

     I suspect you can implement call/return and the stack OK as you've got register relative ops, so you can use a register of your choice as stack. The only ugly bit would be that you basically end up doing "load register with constant computed at link time", stick it in (Rstack), Rstack += 2, JMP xx, and your 'RET' is slightly ugly too.
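The instruction-dropping argument above can be checked with a small sketch. It assumes an 8-bit word and two's complement arithmetic; the helper names (`not_`, `nand`, `sub`, `rol`, `shl`, `shr`) are made up for illustration and are not taken from the instruction set being discussed.

```python
MASK = 0xFF  # assumption: 8-bit machine word


def xor(a, b):
    return (a ^ b) & MASK


def and_(a, b):
    return a & b & MASK


def add(a, b):
    return (a + b) & MASK


def not_(a):
    # NOT is just XOR with all ones
    return xor(a, MASK)


def nand(a, b):
    # NAND built from AND followed by NOT
    return not_(and_(a, b))


def sub(a, b):
    # two's complement: a - b == a + NOT(b) + 1
    return add(add(a, not_(b)), 1)


def rol(a, n):
    # rotate left by n bits within the 8-bit word
    n %= 8
    return ((a << n) | (a >> (8 - n))) & MASK


def shl(a, n):
    # rotate left, then AND away the bits that wrapped round
    return and_(rol(a, n), (MASK << n) & MASK)


def shr(a, n):
    # rotate right by n is rotate left by 8 - n; AND away the wrapped top bits
    return and_(rol(a, 8 - n), MASK >> n)
```

Whether the resulting code is pleasant to write is, as noted, a different question: each dropped instruction costs an extra operation or two at every use.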
  6. You could also just watch for patterns on the address bus and trigger an interrupt on those. Not only is it a good debug tool but of course you can interrupt on a device writing to a memory location - and without polling the bus - kind of like PCI MSI
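As a rough software model of the idea above (real hardware would be a comparator sitting on the bus), the watcher below flags an interrupt when a bus cycle matches an address pattern. The class name, the `writes_only` option and the pattern/mask representation are all assumptions made for illustration.

```python
class AddressWatch:
    """Model of a comparator that raises an IRQ on matching bus cycles."""

    def __init__(self, pattern, mask, writes_only=True):
        self.pattern = pattern
        self.mask = mask              # 1 bits are compared, 0 bits are "don't care"
        self.writes_only = writes_only
        self.irq_pending = False

    def bus_cycle(self, addr, is_write):
        # Called once per bus cycle; nothing polls memory, we only observe.
        if self.writes_only and not is_write:
            return False
        if (addr & self.mask) == (self.pattern & self.mask):
            self.irq_pending = True
            return True
        return False
```

For example, `AddressWatch(0xC000, 0xFFF0)` would fire on any write to 0xC000-0xC00F.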
  7. I can't think of a generic case, but then you would be wasting main memory bus bandwidth polling the address rather than being event triggered. The Z80 DMAC can do it (interrupt on match) but it was almost never a win to do so because of the bus contention.
  8. If you are doing an RT system then you can hide interrupts from users by making an interrupt an event. You never see an "interrupt", just your thread gets woken up again. Making that work often needs support for priorities and priority inversion handling, but I suspect you need them anyway, plus deadlock detection if you want it to be reasonably userproof. Controllers monitoring bits of I/O space seems sensible - it's not new, floppy controllers generally polled the disk change lines of the disks and turned it into an IRQ. Some Ethernet controllers support similar PHY polling schemes, so the hardware polls the PHY regularly and checks for certain changes.
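A minimal sketch of "an interrupt is just an event that wakes your thread", using ordinary Python threads as a stand-in for an RT kernel. The class and method names are hypothetical, and the sketch deliberately leaves out the priority, priority inversion and deadlock detection machinery mentioned above.

```python
import queue
import threading


class InterruptEvent:
    """User code blocks in wait(); the 'hardware' side calls raise_irq()."""

    def __init__(self):
        self._pending = queue.Queue()

    def raise_irq(self, data):
        # Called from the interrupt side; never blocks the caller.
        self._pending.put(data)

    def wait(self, timeout=None):
        # The user thread sleeps here and is simply woken with the event data.
        return self._pending.get(timeout=timeout)


def run_demo():
    irq = InterruptEvent()
    seen = []
    worker = threading.Thread(target=lambda: seen.append(irq.wait(timeout=5)))
    worker.start()
    irq.raise_irq("disk change")  # e.g. a controller noticed a line change
    worker.join(timeout=5)
    return seen
```

The user thread never installs a handler or sees an interrupt context; it just returns from `wait()` with the event.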
  9. That's confusing given that VHDL is a sort of bastardised Ada, which is (allegedly) a programming language.
  10. I can only speak for the Linux case, but I think Linux will be quite happy with physical mappings in supervisor mode. Do you need to force a page size, or can you match on a base/mask pair as the 68010/68451 pair did? Physical without proper caching would be bad though. I guess with 16MB TLB entries for the kernel it wouldn't be too bad.

      I guess the other alternative is segment based addressing 8) There are reasons a lot of the earlier microprocessors with memory protection used segments, even if it made programming them less fun in some cases (x86 due to the 16-bit size). It does make full virtual memory harder, but it makes the MMU architecture much simpler because you cache the entry with the segment register. It would limit you to uClinux but with protection (although in theory with a bit of core kernel hacking you could also get fork() etc. working), or perhaps a RetroBSD/2BSD.

      Would going to 8 or 16K pages help? It seems like it would also help for performance, especially if your code isn't very compact. There's definitely going to be a trade-off between how much time you spend reloading TLB entries and efficiency of memory use. 16K pages isn't that unreasonable, and x86 is really only 4K nowadays because of compatibility. 16K ought to mean fewer misses and two fewer match bits to worry about in the CAM.

      The other trick is to ignore some bits of the virtual address space for now (and support them later as needed). Some 64-bit CPUs do this today.

      Not sure I'd bother with a context/ASID. If you only have 8 entries then it'll be cheap to save/reload them on a task switch, and if that lets you have more TLB entries, I imagine that would be a bigger win?
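To put numbers on the page size remark, here is a small sketch assuming a 32-bit virtual address; the function names are made up. It also shows base/mask matching, where a single entry can cover a region of any power-of-two size, such as 16MB for the kernel.

```python
def tag_bits(va_bits, page_size):
    # Bits the TLB CAM must compare: address bits above the page offset.
    offset_bits = page_size.bit_length() - 1  # page_size must be a power of two
    return va_bits - offset_bits


def base_mask_match(vaddr, base, mask):
    # Base/mask entry: only bits set in mask take part in the comparison.
    return (vaddr & mask) == (base & mask)


bits_4k = tag_bits(32, 4 * 1024)    # 20 match bits
bits_16k = tag_bits(32, 16 * 1024)  # 18 match bits, two fewer per entry

# One entry covering a 16MB region starting at 0x01000000 (values illustrative)
KERNEL_BASE = 0x01000000
KERNEL_MASK = 0xFF000000
```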
  11. There is plenty wrong with VHDL and Verilog but I have to say the biggest problem I (and I think many people from a programming background) have is the business of thinking in parallel. Not just the idea that things like assignments take time and aren't instant but things like the fact that (except for power in some cases) it's actually not worth doing conditional evaluation of something, you can evaluate it every clock at no extra cost, in fact you can evaluate hundreds of un-needed things for free just in case they are relevant to a given cycle. Not sure a language can help much with that. There is simply a gap between the conceptual model of programming and the reality of FPGA.
  12. It's nastier than that if you are not very careful. Consider the sequence: TLB miss -> fetch TLB miss handler instruction -> oh bugger, it's not there -> BOOM. And: TLB miss -> fetch instruction -> save old stack pointer -> oh bugger -> BOOM (and that's a general trap handling issue with TLB misses: where do you put the trap vector and restart data that won't itself cause a TLB miss?).

      I would vote for running the TLB miss handler physically mapped. In fact, if you don't have many TLB entries, or your TLB entries don't have a size field, I'd vote for running "supervisor mode" code physically mapped always. If you've only got fixed size, say 4K, TLB entries then you have to take hits executing kernel code, which is stupid, and you have some other horrible cases (the infamous one is drawing a vertical line on a frame buffer).

      On x86 we try and do things like map the Linux kernel and its view of physical RAM using large pages, because even with a hardware TLB fetcher and a big TLB the TLB misses hurt.

      Another approach used by some processors is to in effect sacrifice a couple of bits of virtual address space to "direct mapped" and "uncached" and things like that. Then it becomes: address[31] set = physically mapped at "0" & address[30 downto 0], else go via the TLB. If you do that then I think you are probably OK, providing the user makes sure their TLB trap vector, code and the like are all in physical space.
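The "sacrifice a bit of address space" scheme can be sketched as follows. This is only a software model under assumptions: 32-bit addresses, 4K pages, and a dict standing in for the TLB; none of the names come from the design being discussed.

```python
PHYS_BIT = 1 << 31   # address[31] set means "bypass the TLB"
PAGE_SHIFT = 12      # assumption: fixed 4K pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class TLBMiss(Exception):
    """Raised when a mapped address has no TLB entry."""


def translate(vaddr, tlb):
    # tlb maps virtual page number -> physical page number
    if vaddr & PHYS_BIT:
        # Direct mapped: "0" & address[30 downto 0], can never miss
        return vaddr & (PHYS_BIT - 1)
    vpn = vaddr >> PAGE_SHIFT
    if vpn not in tlb:
        raise TLBMiss(hex(vaddr))
    return (tlb[vpn] << PAGE_SHIFT) | (vaddr & OFFSET_MASK)
```

With the miss handler, its vector and its save area all placed at addresses with bit 31 set, handling a miss cannot itself trigger another miss.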
  13. Most of the processors that do this dump enough internal state onto the stack for the trap and then throw the lot at the OS and say "you clean it up". For some processors this was *evil*, but it usually consisted of a chunk of nasty to understand assembler the manufacturer provided.

      I would think if your instructions are restartable then you probably only need to know the PC of the trapping instruction and a delay slot flag. At that point you can reconstruct and resume execution (you might need to know if the jump was taken). So I think I'd push condition codes and flags [delayslot, etc] onto the trap stack, or even push both a trap pc and a resume pc (the same except for delay slots, when resume pc is the jump). It becomes something like

          restartaddr = stack[trap_pc];
          if (stack[flags] & DELAY_SLOT) {
              restartaddr -= JUMP_SIZE;
          }
          stack[trap_pc] = restartaddr;
          ret

      (restores condition codes, continues in user mode. The branch will be re-executed and go the same way as before.)

      Probably even hideable in hardware. One thing that's nice about hiding it is that you keep compatibility if your behaviour has to change in future processors (e.g. x86 hides all sorts of parallelism in the real processor when it comes to throwing exceptions).
  14. On the store, trying to buy one goes to a broken link page?
  15. The SD only works in UZI and is a bit iffy, as it's driving it in SPI mode (1,1) not (0,0) as it should. I've added support for other SPI modes to the spimaster VHDL but so far have only tested (0,0) with Ethernet. Once it's a bit more tidied up I'll put up a new version of the 'classic' build with CP/M 3.x including SD support, Ethernet SPI port, etc.
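For reference, a small sketch of what the SPI mode pairs mean, assuming the (1,1) and (0,0) notation above is (CPOL, CPHA) as in the usual convention. The function names are made up and this is not the spimaster VHDL.

```python
def spi_mode_number(cpol, cpha):
    # Conventional numbering: mode = CPOL * 2 + CPHA
    return (cpol << 1) | cpha


def sample_edge(cpol, cpha):
    # CPOL sets the idle clock level, so it fixes which edge comes first.
    leading = "rising" if cpol == 0 else "falling"
    trailing = "falling" if cpol == 0 else "rising"
    # CPHA = 0 samples on the leading edge, CPHA = 1 on the trailing edge.
    return leading if cpha == 0 else trailing
```

Under that assumption, (0,0) and (1,1) both sample data on the rising clock edge and differ in idle level and in when data is shifted out, which may be why the card mostly works in (1,1) but is "a bit iffy".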