Stand-alone PPU adaptation #337
Comments
The PPU as it stands in the NES project is not suitable as a pin-accurate replacement for a real PPU. The FPGA implementation largely abstracts away the need for the ALE (address latch enable) pin and the multiplexing of the associated address pins, and it drops composite generation entirely in favor of RGB output. There is a branch in my repo that makes the PPU more closely match the output of the real PPU in terms of timing and clock behavior, but the ALE pin is untested there as it's still unused, and it still does not have composite generation. I think with enough work you could make it function the way you want, but it's going to require writing some code and a lot of testing; it won't just be a drop-in task. |
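For readers unfamiliar with the ALE mechanism mentioned above, here is a minimal behavioral sketch in Python (not HDL, and not taken from ppu.sv). On a real PPU the low eight address bits share pins with the data bus, and an external transparent latch captures them while ALE is high. The class and function names are hypothetical, and the model ignores all analog timing, which is exactly the part that would need comparison against real hardware.

```python
class AddressLatch:
    """Transparent 8-bit latch: follows its input while ALE is high,
    holds the last value once ALE goes low (hypothetical model)."""

    def __init__(self):
        self.q = 0

    def update(self, ale, ad):
        if ale:
            self.q = ad & 0xFF
        return self.q


def ppu_read(latch, addr, memory):
    """Two-phase read over a multiplexed address/data bus."""
    high = (addr >> 8) & 0x3F   # upper six address bits have dedicated pins
    low = addr & 0xFF           # lower eight bits share pins with data

    # Phase 1: ALE high, the shared pins carry the low address byte.
    latch.update(ale=1, ad=low)

    # Phase 2: ALE low, the shared pins now carry data (0xFF is a stand-in);
    # the latch must keep presenting the low address byte to memory.
    held_low = latch.update(ale=0, ad=0xFF)
    return memory[(high << 8) | held_low]


vram = bytearray(0x4000)        # 14-bit PPU address space
vram[0x23C5] = 0x7E
print(hex(ppu_read(AddressLatch(), 0x23C5, vram)))
```

An FPGA core that keeps the full address on an internal bus never needs the second phase, which is why the pin is unused in the MiSTer implementation.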
Thanks, @Kitrinx, for the reply and analysis, much appreciated! The (secondary) requirement of composite video output can be deferred, at least for the time being. The priority would be RGB output on pins 14, 15, and 16 (like an RP2C03) along with a composite sync signal output on pin 21. So yes, basically a drop-in replacement for an RP2C03 or similar chip. I didn't expect this to be a no-brainer task, which is also one of the reasons that I think it would be an enormous amount of work to do myself, being a beginner to FPGAs and Verilog. That's also the main reason why I'm prepared to pay someone for the work involved in getting the functionality to that point. I should mention that I would also cover the cost of any parts or supplies in such an arrangement. @Kitrinx, is this something that interests you, and if so, do you think you have the skill level and availability to accomplish it? From my perspective, NES_MiSTer looks like the open-source project that's already closest to this goal, which is why I thought it made sense to approach the folks here about it. |
I can't really commit to any kind of time frame, but I'm willing to assist in your efforts. There are probably some hardware comparisons to a real PPU that have to happen to get the timing of some things just right, particularly with the ALE stuff, and I'm not set up to do that here, so that would be up to you. |
@Kitrinx, awesome, that sounds great! Here are some of the basic things on my mind at the moment regarding this:
Looking forward to hearing your thoughts! |
I've used Visual 2C02 extensively in working on the PPU, and its 2A03 counterpart for the APU. They are helpful for a lot of things but excruciatingly slow, and can't be assumed to be accurate on an analog level, i.e. the time of a rising edge, etc. I can't comment on how much space the PPU would take up, as ALMs and LEs on other FPGAs aren't really 1:1, and there are features in the PPU that don't need to be there on an external implementation, like extra sprites and save states. It does need enough spare block RAM for the OAM RAM * 2 and the palette RAM. I think almost all FPGAs will have that available. As for the palette, the differences between PAL and NTSC palettes aren't really striking, per se. PAL will generally have less skew to its chroma angles at higher luma values, but this is also the same for early-model Famicoms. If there's one thing I've learned, it's that nobody will ever be happy with only a single palette. You can use MiSTer's built-in selections as a good guide to what some of the most popular ones are, though. The PPU has a few options that need consideration:
|
@Kitrinx, thanks for the reply. I do see the usefulness and limitations of the visual simulations that you described. How does one go about determining FPGA sizing requirements? Likewise, is there a different class or family of FPGAs (other than ICE40LP) that you would be inclined to recommend? It would make sense to remove any irrelevant code such as save state support, etc. It's true that it's impossible to make everyone happy with palettes! We could limit it to only original RGB PPU palettes since I believe simpler is better in this case, at least for now.
I believe all RGB PPUs such as the 2C03, 2C04, and 2C05 use NTSC timings. So this would be NTSC.
How would this impact the functioning of the PPU itself?
I'm inclined to omit this option. Without an overclocked CPU, is there a compelling reason for this?
Can you elaborate on this? Is this something that would be done in the chip's code?
There are three main use cases I can think of:
RGB PPUs are expensive and difficult to purchase, often costing $150-$200 or more, if you can even find them in the first place. As such, having an alternative drop-in replacement is appealing. While the same drop-in replacement functionality could potentially be convenient for the 2A03, 2C02, 2A07, and 2C07, it is specifically the scarcity and cost of RGB PPUs that make this idea attractive. As a final thought regarding palettes and arcade board compatibility in particular, it may be nice to have configuration for the following. This could either be in code only, requiring them to be set before programming the FPGA, or it could be something external such as a bank of tiny DIP switches on the board.
These goals are secondary for now, as I think the primary goal should be functional 2C03 behavior with zero necessary configuration. |
Generally, to figure out the size you have to compile it. You could compile it in Quartus and check there, but as I mentioned, it's not going to be a representation of what the ICE board needs. TinyNES, why bother? I don't even think that uses a real CPU, does it? So it has the bad pulse channels and all that going on, probably not worth spending money on from my PoV. Using MiSTer would get you more accuracy than that thing. Upgrade kits for existing NES consoles might be nice. Desoldering the PPU is quite a drag. Regarding the various options: Extra sprites don't have anything to do with an overclocked CPU. The PPU queries the cart hardware twice instead of once per fetch cycle, so it is able to get data for up to 8 additional sprites. Sometimes this won't work if the mapper is sensitive to it, but sometimes it will if it's a stateless mapper or it doesn't worry about counting address lines in that way. Dejittering is the practice of taking the uneven frame length (the NES will draw one fewer pixel every other frame when rendering) and pausing the clocks for one cycle to make the frames even in length. Modern televisions HATE uneven frame lengths and often won't work at all with this. There are hardware mods for original hardware to address this on the NES. The PPU can't do this on its own, as it requires the CPU to pause as well, but it can cooperate or make it easier. |
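The dejittering arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the frame-length bookkeeping, assuming the standard NTSC figures of 341 dots per scanline and 262 scanlines per frame; the function name, the choice of which frame parity is the short one, and the `dejitter` flag are hypothetical and not taken from ppu.sv.

```python
DOTS_PER_SCANLINE = 341      # NTSC PPU dots per scanline
SCANLINES_PER_FRAME = 262    # NTSC scanlines per frame
FULL_FRAME = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME  # 89342 dots


def frame_length(frame_index, rendering, dejitter=False):
    """Return the frame duration in PPU dot periods.

    Assumption for illustration: odd-numbered frames are the short ones.
    With dejitter enabled, one paused clock is added to each short frame
    so that every frame lasts the same amount of time.
    """
    short = rendering and frame_index % 2 == 1
    dots_drawn = FULL_FRAME - 1 if short else FULL_FRAME
    paused = 1 if (dejitter and short) else 0
    return dots_drawn + paused


print([frame_length(i, rendering=True) for i in range(4)])
print([frame_length(i, rendering=True, dejitter=True) for i in range(4)])
```

The pause has to stall the CPU clock as well, otherwise the CPU and PPU would drift out of their fixed ratio, which is why the PPU alone cannot implement it.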
@Kitrinx, I appreciate the explanations. Thanks!
I'll set up an ICE40 toolchain in order to synthesize the current ppu.sv so I can see how many logic cells it would need. This will tell us if the ICE40 is a feasible chip family or not. Is it safe to assume that a standalone modification would require roughly the same amount of FPGA logic as the current one?
The TinyNES has two 40-pin DIP sockets for the CPU and PPU. Most units ship with genuine 2A03 and 2C02 chips in the sockets. Clone chips are a cheaper option for folks too. Any chips that want either a 21.477270 MHz (NTSC) or 26.601712 MHz (PAL) master clock are supported (which is most of them) since both clock sources are available. All the RGB PPUs are supported too, including 2C05s, and all the hardware is open-source. I'm the creator of the TinyNES btw ;)
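As a quick worked example of what those two master clocks mean for the chips, the sketch below derives the PPU dot clock and CPU clock from each. The divider values (PPU /4 and CPU /12 on NTSC, PPU /5 and CPU /16 on PAL) are the standard ones for the original chips; the table and function names are made up for illustration.

```python
# Master clock in Hz and the divide ratios used by the original chips.
SYSTEMS = {
    "NTSC": {"master_hz": 21_477_270, "ppu_div": 4, "cpu_div": 12},
    "PAL":  {"master_hz": 26_601_712, "ppu_div": 5, "cpu_div": 16},
}


def derived_clocks(system):
    """Return (ppu_hz, cpu_hz) for the named system."""
    cfg = SYSTEMS[system]
    return (cfg["master_hz"] / cfg["ppu_div"],
            cfg["master_hz"] / cfg["cpu_div"])


for name in SYSTEMS:
    ppu_hz, cpu_hz = derived_clocks(name)
    print(f"{name}: PPU {ppu_hz / 1e6:.6f} MHz, CPU {cpu_hz / 1e6:.6f} MHz")
```

On NTSC the PPU therefore runs exactly three dots per CPU cycle, while on PAL the ratio is 3.2, which is one reason a replacement PPU has to know which system it is sitting in.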
I agree that inexpensive RGB upgrades could be awesome. Desoldering these chips is actually very easy and non-destructive if the right tools are used. It takes about two minutes to remove a PPU with a Hakko FR-301.
Ah, okay, now I know what you mean. If a PPU would only be more compatible without the extra 1 frame on reset, then my vote would be to omit that particular behavior. Is there any compelling reason not to?
I understand now. Extra sprite querying should certainly not be the default behavior. I could see it being an option, but I don't think it's an important one especially since no original systems do it. I'd be inclined to omit it for that reason.
What changes to the PPU would facilitate this, and would they introduce compatibility issues or be especially complicated? If not, then I see no reason why it shouldn't cooperate in this respect. I've begun sketching out the hardware, and I'll continue with fine layout and routing once we've determined if the ICE40 family is a valid FPGA target or not. |
If TinyNES uses a real 2A03, it's probably okay then, and adding some modern usability perks to it is a worthy goal. To judge the size you'd want to compile it with the toolchain, as you mentioned. I've never used the ICE one, but usually at the end they give you a report on the size of things and the fit in some way. You'd want to use this code here: https://github.com/Kitrinx/NES_MiSTer/blob/ppu3/rtl/ppu.sv It's my branch where I refactored the PPU using a lot of things from Visual 2C02 to make it work much more closely to the timing and asynchronous behavior of the real chip. It shouldn't be too far from working in a real-hardware scenario, and it is also much more efficient size-wise. The extra stuff is something one can hammer out later with DIP switches or something. |
Hey @Kitrinx, I'm having some trouble synthesizing ppu.sv as I'm getting a number of errors. I've omitted the full file paths for brevity, and I've also pruned similar/repeated error messages. Note that the last message is only a warning.
This is with Lattice iCEcube2 (2020.12.27943) & Synplify Pro (L-2016.09L+ice40). Any idea what I'm doing wrong here? |
I cannot tell from those messages, as the line numbers don't seem to line up with the code in my ppu3 branch, but if I had to guess I'd say it was some aspect of SystemVerilog that the toolchain didn't like. |
As a sanity check, I was able to synthesize https://github.com/strigeus/fpganes/blob/master/src/ppu.v without issue. So that's good! 😀 Here's the resource report:
So, it looks like that particular code wouldn't quite fit on an ICE40LP1K, but it would fit easily on an ICE40LP8K. |
I hope I didn't scare you off @Kitrinx ! Happy to provide anything I can to help facilitate this :) |
Sorry, I have not been scared off, just occupied elsewhere. I will take a look for you. |
No problem @Kitrinx, I understand. Thanks for the note. I spent some more time working on getting your files to synthesize, and I had some success. My goal for the moment was to change absolutely as little as possible in order to get things to work. Here's what I did:
Now the error that I get is: I'm not sure how to proceed with that. Any ideas? As an aside, I also separately spent some time going through Hope all is well! |
Just a bit of further clarification. After some more fiddling, I was able to get the Anyway, that's where I'm at for the moment. |
I can help you more with it soon. I'm finishing some work to fix some input devices for the Atari 7800 that have been an issue for a while, then I will put my full attention on the NES PPU. |
If you'd like to work more directly, I'm also available on the MiSTer FPGA Discord. |
Hi, just reading through this. What's stopping the idea of NES to HDMI with an FPGA? I know one existed, but it's long out of stock. library IEEE; entity NES_to_HDMI is architecture Behavioral of NES_to_HDMI is |
Hi @loglow, sorry to bump an old issue. There is a new standalone PPU implementation for FPGA that can be found here: https://github.com/andkorzh/RP2C02-7- Check the YouTube video in the readme; it shows it working in place of a real PPU on a Famicom. I wonder if this would be of any use to the NES MiSTer core? |
While outwardly it might seem like using a direct netlist would be a good idea, in practice I believe it would not. It doesn't allow for clock pausing, and the code is almost entirely unworkable by a human. What this means in practice is that things like extra sprites, save states, frame size evening to keep modern TVs from freaking out, some of the analog RAM decay emulation, etc. would all be nearly impossible using this implementation, and at this moment, I don't see any direct improvements that would come from it. |
That makes sense, thanks for explaining |
Hello!
I'm looking to adapt just the PPU code (ppu.sv and anything that it depends on) from this project to run on its own. It would run on a suitable FPGA chip with enough physical pins (>40) to replicate the behavior of an original PPU closely. The behavior of an RGB-output PPU (e.g. RP2C03) would be the primary target. The behavior of a composite-output PPU (e.g. RP2C02) could be a nice secondary target if it would be reasonably straightforward to implement.
I don't have much experience with FPGA development nor Verilog / SystemVerilog. I do have hardware development experience and would be able to handle all the physical design aspects of this project. The end result would be an assembled purpose-built PCB to be used as a stand-in PPU replacement.
I do have some funding available for someone who can assist with this project. The results of this effort will be entirely open-source. Any new, modified, or derivative code would of course be released under the GPL as required, and all board design files will be released CC BY-SA as part of the existing TinyNES project.
If this sounds interesting and/or you're willing to assist, feel free to discuss here, or you can also contact me directly at dan@tall-dog.com and we can talk further about how the funding for such work might proceed.
Thanks, and take care!
Dan