Some guidance on better timing #89

Open
tristanitschner opened this issue Apr 6, 2024 · 3 comments

@tristanitschner

Hi there,

I'm using your CPU in a quad-core configuration on a Nexys Video. The SoC was generated with:

python3 -m litex_boards.targets.digilent_nexys_video --cpu-type=naxriscv --bus-standard axi-lite \
--with-video-framebuffer --with-coherent-dma --with-sdcard --with-ethernet --xlen=64 \
--scala-args='rvc=true,rvf=true,rvd=true,alu-count=2,decode-count=2' --with-jtag-tap \
--sys-clk-freq 100000000 --cpu-count 4 --soc-json build/digilent_nexys_video/csr.json \
--soc-csv build/digilent_nexys_video/csr.csv \
--update-repo no --build

I used the most recent version of LiteX, except for LiteEth, which is currently broken, so for that I used the most recent release tag.

Utilization is quite high, although there is definitely room for a larger L2 cache:
[Image: quad_core_naxriscv]

The timing, however, is quite unsatisfactory:
[Image: quad_core_timing]

While the Debian system seems to be stable, I sometimes lose a core or get weird I/O errors.

So my question is: what measures (i.e. NaxRiscv configuration) could be taken to improve the timing of the SoC? Of course I could figure this out myself by trial and error, but it seemed appropriate to ask, since you as the developer probably know best :) Also, I would rather sacrifice some performance for frequency, if that is possible at all.

@Dolu1990
Member

Dolu1990 commented Apr 9, 2024

Hi,

Ahhhh so, I was facing the same issue when going above 2 cores.
The issue seems to be that timing degrades because the floorplanning/placer is too constrained. Critical paths were appearing in places that are quite isolated from the other cores.
So maybe tuning the Vivado settings could help, asking it to make more effort.
Also, the floor planning is quite bad in the picture you sent. I mean especially the bright pink core at the top, which gets completely messed up by the biiiiiig black area. For me that was even happening sometimes in a dual-core config, while the FPGA was half empty.
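
One way to try the "more effort" suggestion is to pass higher-effort directives to each Vivado stage from the LiteX command line. The sketch below only assembles such a command. The `--vivado-*-directive` flag names are an assumption about the LiteX Vivado backend (check them with `--help` on your LiteX version), `BASE_CMD` is an abbreviated form of the command from the first post, and the directive values are standard Vivado directive names chosen for illustration, not settings anyone in this thread has validated.

```python
import shlex

# Abbreviated version of the build command from the first post.
BASE_CMD = [
    "python3", "-m", "litex_boards.targets.digilent_nexys_video",
    "--cpu-type=naxriscv", "--cpu-count", "4",
    "--sys-clk-freq", "100000000", "--build",
]

# Vivado stage -> directive. Flag names derived from these keys are an
# ASSUMPTION about the LiteX CLI; the directive values are Vivado's own.
EFFORT_DIRECTIVES = {
    "synth": "PerformanceOptimized",
    "place": "ExtraTimingOpt",
    "post-place-phys-opt": "AggressiveExplore",
    "route": "AggressiveExplore",
    "post-route-phys-opt": "AggressiveExplore",
}


def with_effort(cmd, directives):
    """Return a new command list with one directive flag pair per stage."""
    extra = []
    for stage, directive in directives.items():
        extra += [f"--vivado-{stage}-directive", directive]
    return cmd + extra


if __name__ == "__main__":
    # Print a shell-ready command line for copy and paste.
    print(shlex.join(with_effort(BASE_CMD, EFFORT_DIRECTIVES)))
```

Higher-effort directives mostly trade build time for timing, so they are cheap to try before touching the CPU configuration.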

So, overall, it would probably need some manual floor planning. But note, I never tried it; I also kept myself in the 2-core space.
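
For reference, manual floor planning in Vivado is usually expressed as pblocks. The sketch below generates the Tcl for one pblock per core. The cell name patterns and clock-region ranges are hypothetical placeholders, not taken from this design: the real hierarchy names have to be looked up in the synthesized netlist, and the regions chosen from the device view.

```python
def pblock_tcl(name, cell_pattern, region):
    """Return the Tcl lines that create one pblock and pin matching cells to it."""
    return [
        f"create_pblock {name}",
        f"add_cells_to_pblock [get_pblocks {name}] "
        f"[get_cells -hierarchical -filter {{NAME =~ \"{cell_pattern}\"}}]",
        f"resize_pblock [get_pblocks {name}] -add {{{region}}}",
    ]


# HYPOTHETICAL layout: patterns and regions are placeholders for illustration.
CORES = {
    "pblock_core0": ("*cores_0*", "CLOCKREGION_X0Y0:CLOCKREGION_X0Y1"),
    "pblock_core1": ("*cores_1*", "CLOCKREGION_X1Y0:CLOCKREGION_X1Y1"),
}


def all_pblocks(cores):
    """Flatten the per-core Tcl into one list of constraint lines."""
    lines = []
    for name, (pattern, region) in cores.items():
        lines += pblock_tcl(name, pattern, region)
    return lines


if __name__ == "__main__":
    print("\n".join(all_pblocks(CORES)))
```

The generated lines would be applied before placement. Whether there is enough free area on the device for such regions to help is a separate question.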

As for NaxRiscv options which could help timing, maybe reducing the LSU LQ/SQ entries, but that would be really sad.
Overall, I would say that even if those options are reduced, the bad floor planning will create critical paths out of very OK stuff.

Note, I'm now working on https://github.com/SpinalHDL/VexiiRiscv/tree/dev/src/main/scala/vexiiriscv
Performance isn't far off NaxRiscv :)
If you are interested in some way, let me know XD
I have a LiteX port, but it is not merged upstream yet. The only missing feature now is the FPU; then it could run Debian as well.

> While the Debian system seems to be stable, I sometimes lose a core or get weird I/O errors.

Ahhh, when the slack violation is as much as 22%, it is likely timing related.
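
To put a number on trading frequency for stability: the highest clock a build can meet follows from the target period and the worst negative slack (WNS) in the timing report. The sketch below assumes, purely for illustration, that the 22% figure means a WNS of -2.2 ns against the 10 ns period of the 100 MHz target. The thread does not state what the percentage is measured against, so take the real WNS from the Vivado timing summary.

```python
def fmax_mhz(target_mhz, wns_ns):
    """Estimate the achievable clock in MHz from the target clock and WNS.

    wns_ns is negative when timing fails, so the critical path delay is
    the target period minus the slack.
    """
    period_ns = 1000.0 / target_mhz
    critical_path_ns = period_ns - wns_ns
    return 1000.0 / critical_path_ns


if __name__ == "__main__":
    # ASSUMED numbers: 100 MHz target, WNS of -2.2 ns (22% of the period).
    print(f"{fmax_mhz(100.0, -2.2):.1f} MHz")  # about 82 MHz
```

Under that assumption, rebuilding with `--sys-clk-freq` set somewhat below the estimate would be the simplest way to get a design that meets timing.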

@tristanitschner
Author

So I played a little with floor planning, but I don't think it can be done.

In fact, I have seen your VexiiRiscv effort, but decided to wait until it is done. Basically, I need a CPU for a project, and thought it would be best to use rv64gc, as that is supported by apt packages by default. I think I will rather settle on rv32gc and just compile the compiler myself. The main problem is that there is assembly involved, and the optional-extension architecture of RISC-V makes it very hard to port that code, in contrast to the hierarchical extension model of other ISAs. I'm definitely looking forward to the first stable release of VexiiRiscv :)

Also, if I may ask: I once saw a recommendation of yours for some books on out-of-order CPUs, but can't find it anymore. Could you state them here again? While by now I feel rather confident with in-order pipelined CPUs (I've written three smaller ones, but nothing worth publishing, rather good examples of what you call "wire mess"), I still struggle with understanding OoO cores and would very much like to devote some time to them. That would be great!

@Dolu1990
Member

> So I played a little with floor planning, but I don't think it can be done.

Yeah, not enough space to play around.

> I'm definitely looking forward to the first stable release of VexiiRiscv

So far, in simulation, Linux runs well with RV32/64 IMACSU.
I tested RV32IMA multi-core on hardware; it ran multiple Doom instances for 24 h straight on Linux :)

I'm now working on getting the FPU in; then it could be GC and run Debian as well.
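
To make the gap between "IMAC" and "GC" concrete: in RISC-V naming, G is shorthand for IMAFD plus Zicsr and Zifencei, so the FPU work corresponds to the F and D extensions. The sketch below expands simple ISA strings and lists the missing single-letter extensions. It is a simplified parser written for this illustration: it ignores multi-letter extensions such as Zicsr and treats the S and U in "IMACSU" as plain letters, although they denote privilege modes.

```python
G_EXPANSION = "imafd"  # G also implies Zicsr and Zifencei (not tracked here)


def base_extensions(isa):
    """Expand an ISA string such as 'rv64gc' into single-letter extensions."""
    isa = isa.lower()
    if not isa.startswith(("rv32", "rv64")):
        raise ValueError(f"unsupported ISA string: {isa}")
    exts = []
    for ch in isa[4:]:
        # Replace the G shorthand by its members, keep other letters as-is.
        for ext in (G_EXPANSION if ch == "g" else ch):
            if ext not in exts:
                exts.append(ext)
    return exts


def missing(have, want):
    """List the extensions required by `want` that `have` does not provide."""
    provided = base_extensions(have)
    return [ext for ext in base_extensions(want) if ext not in provided]


if __name__ == "__main__":
    print(missing("rv64imacsu", "rv64gc"))  # ['f', 'd']
```

Stock Debian riscv64 packages target rv64gc, which is why the F and D extensions matter for running them unmodified.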

> I once saw a recommendation of yours for some books on out-of-order CPUs

I'm not a book person; I really have a hard time reading text.
I would say the BOOM documentation is good for getting ideas. There is also the RSD core paper.

> I still struggle with understanding OoO cores

Maybe a NaxiiRiscv on the horizon ^^
There are a few design decisions that prevent NaxRiscv from scaling up enough, I think.
But first, with VexiiRiscv, I'm trying to solve the in-order design space as well as I can, to get it "done" XD
