Probable issue fix when sending more than 512 bytes #10

Open
lk-davidegironi opened this issue Oct 21, 2022 · 8 comments

Comments

@lk-davidegironi

Hello @WangXuan95 and thanks for this IP, it saves me a lot of time.

I've ported your code to Verilog.
I'm working on a Lattice ECP5, driving an FT2232H in synchronous mode.
I've built a simple project to loop back and test data. My sample programs are in C and C#, using the FTDI driver.

I've found an issue when sending/receiving more than 512 bytes.

It took me a couple of days, but I've finally been able to solve it.
It happens only when TXE and WR go high in the same cycle (see the attached image for reference); this causes txfifoo_wtrans_i to fill up.

My solution was to disable the input to this module and wait for its buffer to become empty when TXE goes high.

Hope this helps!

The changed code is below. It is in Verilog, but you can easily port it to SystemVerilog.

From this:

    stream_wtrans #
        (
            .I_DEXP(TXFIFO_DEXP),
            .O_DEXP(C_DEXP)
        ) txfifoo_wtrans_i (
            .rstn(rstn_usb_clk),
            .clk(usb_clk),
            .itvalid(txfifoo_valid),
            .itready(txfifoo_ready),
            .itdata(txfifoo_data),
            .otvalid(c_tx_valid),
            .otready(c_tx_ready),
            .otdata(c_tx_data)
        );

To this:

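    // Gate the TX FIFO output while the FT2232H reports TXE high,
    // and reopen it only after the width converter has drained.
    // Declarations come before their first use.
    reg usb_txe_wait = 1'b0;
    reg usb_txe_en = 1'b1;
    wire txfifoo_readyt;
    assign txfifoo_ready = txfifoo_readyt & ~usb_txe & usb_txe_en;
    always @(posedge usb_clk or negedge rstn_usb_clk)
    begin
        // Reset branch added to match the asynchronous reset in the sensitivity list
        if(~rstn_usb_clk)
        begin
            usb_txe_wait <= 1'b0;
            usb_txe_en <= 1'b1;
        end
        else if((stat == TXD) & usb_txe)
        begin
            usb_txe_wait <= 1'b1;
            usb_txe_en <= 1'b0;
        end
        else if(usb_txe_wait && ~c_tx_valid)
        begin
            usb_txe_wait <= 1'b0;
            usb_txe_en <= 1'b1;
        end
    end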
	
    stream_wtrans #
        (
            .I_DEXP(TXFIFO_DEXP),
            .O_DEXP(C_DEXP)
        ) txfifoo_wtrans_i (
            .rstn(rstn_usb_clk),
            .clk(usb_clk),
            .itvalid(txfifoo_valid & ~usb_txe & usb_txe_en),
            .itready(txfifoo_readyt),
            .itdata(txfifoo_data),
            .otvalid(c_tx_valid),
            .otready(c_tx_ready),
            .otdata(c_tx_data)
        );
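
For readers who want to trace the gating behaviour without a simulator, here is a minimal cycle-level sketch in Python. It only mirrors the two registers and the `txfifoo_ready` expression from the changed code; the function names `gate_step` and `fifo_ready` are made up for illustration, reset is ignored, and `in_txd` stands in for the `stat == TXD` comparison.

```python
def gate_step(en, wait, in_txd, usb_txe, c_tx_valid):
    """One usb_clk cycle of the gating registers; returns the next (en, wait)."""
    if in_txd and usb_txe:
        return False, True   # TXE went high while writing: close the gate
    if wait and not c_tx_valid:
        return True, False   # width converter has drained: reopen the gate
    return en, wait          # otherwise hold the current state


def fifo_ready(readyt, usb_txe, en):
    """Combinational txfifoo_ready = txfifoo_readyt & ~usb_txe & usb_txe_en."""
    return readyt and not usb_txe and en


# Hypothetical stimulus: TXE rises during TXD, the converter drains later.
stimulus = [
    # (in_txd, usb_txe, c_tx_valid)
    (True, False, True),    # normal transmit
    (True, True, True),     # TXE goes high while writing
    (False, True, True),    # converter still holds data
    (False, False, False),  # converter empty, TXE low again
    (False, False, False),  # gate is open again
]

en, wait = True, False
trace = []
for in_txd, usb_txe, c_tx_valid in stimulus:
    trace.append(fifo_ready(True, usb_txe, en))
    en, wait = gate_step(en, wait, in_txd, usb_txe, c_tx_valid)

print(trace)
```

In this trace the FIFO stays blocked for one extra cycle after TXE drops, until the converter reports empty, which is the waiting behaviour described in the post.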

sample_img

@WangXuan95
Owner

WangXuan95 commented Oct 21, 2022

Thank you for finding and fixing this bug, which is very helpful for improving this repo! I will review and update it when I have time (in a few days).

Also, if this repo is useful to you, feel free to star it.

@lk-davidegironi
Author

Thanks! Starred.

@WangXuan95
Owner

Hello, I looked carefully at the waveform you posted and have a question:
In synchronous FIFO mode, usb_wr and usb_txe (and all other signals directly connected to the FT2232H) should be synchronized with the rising edge of usb_clk. However, in your waveform, usb_clk seems to be slow and not synchronized with usb_wr and usb_txe. I suspect something is incorrect in your test, such as the pin constraints, or the FT2232H did not enter synchronous FIFO mode.

The figure on page 31 of the FT2232H datasheet shows the correct waveform for synchronous FIFO mode. Similar waveforms can be seen in the simulation examples I provide in the SIM folder.

To reproduce the situation in your test, I constructed a simulation case where the usb_wr and usb_txe signals go high simultaneously. I found no functional bug (not a single byte of data is lost).

To verify this, I also ran a hardware test and found that when the host PC received the data sent from the FPGA, there was no data loss (including at the boundary of each transmission), whether the send length was >=512B or <512B.

Perhaps you can tell me more details about the test in which you found this bug.

Note: usb_clk is an input of the FPGA and an output of the FT2232H. It should be connected to the CLKOUT pin of the FT2232H (pin 32, ACBUS5; see FT2232H datasheet page 9 for details).

@lk-davidegironi
Author

lk-davidegironi commented Oct 28, 2022

Hello,
I thought usb_clk looked slow because of the analyzer. My main clock (the one that runs the analyzer) is 50 MHz, while CLKOUT from the FT2232H is 60 MHz, so my guess is that the analyzer only catches some of the CLKOUT cycles. Indeed, if I use usb_clk (the ft_clk signal that comes from the FT245) as the analyzer's main clock, usb_clk always reads low in the analyzer.

Capture_ftclk_usbclk

About the C/C# code:
I'm pretty sure the FT2232H is entering FIFO mode, because CLKOUT shows 60 MHz on the oscilloscope.
In FTDI's FT_Prog software, Port A Hardware is set to 245 FIFO and Port A Driver to D2XX Driver.
The C software also sets the mode using FT_SetBitMode(handle, 0xFF, 0x40).
How can I check whether it's really in synchronous FIFO mode? Isn't CLKOUT at 60 MHz enough?

About the Verilog code and constraints:
The CLKOUT pin is connected to a clock input pin of my ECP5 (function GR_PCLK3_1).
The constraints are set according to "Table 4.1 FT245 Synchronous FIFO Interface Signal Timings" in the datasheet.

What's strange is that I cannot reach more than 4 Mbps, which looks like asynchronous mode to me.
Another thing is that if I disable the analyzer, things do not work; I have a timing problem. Below are the timing reports with and without the analyzer (look at the negative hold on ft_txen/ft_clk when the analyzer is disabled).

Analyzer Enabled:

// Input Setup and Hold Times
Port       Clock   Edge  Setup Performance_Grade  Hold Performance_Grade
ft_clk     sys_clk R     3.388      6      -0.656     M
ft_data[0] sys_clk R     2.647      6      -0.742     M
ft_data[0] ft_clk  R     1.780      6       0.274     6
ft_data[1] sys_clk R     2.646      6      -0.742     M
ft_data[1] ft_clk  R     1.348      6       0.342     6
ft_data[2] sys_clk R     2.426      6      -0.660     M
ft_data[2] ft_clk  R     1.550      6       0.247     6
ft_data[3] sys_clk R     2.867      6      -0.826     M
ft_data[3] ft_clk  R     1.790      6       0.377     6
ft_data[4] sys_clk R     2.474      6      -0.688     M
ft_data[4] ft_clk  R     1.348      6       0.388     6
ft_data[5] sys_clk R     2.413      6      -0.659     M
ft_data[5] ft_clk  R     1.228      6       0.380     6
ft_data[6] sys_clk R     2.470      6      -0.677     M
ft_data[6] ft_clk  R     1.159      6       0.377     6
ft_data[7] sys_clk R     2.776      6      -0.805     M
ft_data[7] ft_clk  R     1.699      6       0.401     6
ft_rxfn    sys_clk R     5.949      6      -1.147     M
ft_rxfn    ft_clk  R     4.775      6       0.156     6
ft_txen    sys_clk R     7.661      6      -1.010     M
ft_txen    ft_clk  R     5.833      6       0.544     6

// Clock to Output Delay
Port       Clock   Edge  Max_Delay Performance_Grade  Min_Delay Performance_Grade
ft_data[0] ft_clk  R    11.286         6        3.980          M
ft_data[1] ft_clk  R    11.286         6        3.847          M
ft_data[2] ft_clk  R    11.274         6        4.015          M
ft_data[3] ft_clk  R    10.989         6        3.953          M
ft_data[4] ft_clk  R    11.528         6        3.829          M
ft_data[5] ft_clk  R    11.274         6        3.920          M
ft_data[6] ft_clk  R    10.989         6        4.037          M
ft_data[7] ft_clk  R    11.308         6        4.214          M
ft_oen     ft_clk  R     9.511         6        3.852          M
ft_rdn     ft_clk  R    10.516         6        4.231          M
ft_wrn     ft_clk  R     8.580         6        3.480          M
led_o      sys_clk R     6.085         6        2.829          M

Analyzer Disabled:

// Input Setup and Hold Times
Port       Clock  Edge  Setup Performance_Grade  Hold Performance_Grade
ft_data[0] ft_clk R     1.255      6       0.404     6
ft_data[1] ft_clk R     1.254      6       0.407     6
ft_data[2] ft_clk R     1.255      6       0.404     6
ft_data[3] ft_clk R     1.384      6       0.407     6
ft_data[4] ft_clk R     1.210      6       0.470     6
ft_data[5] ft_clk R     1.780      6       0.378     6
ft_data[6] ft_clk R     1.341      6       0.412     6
ft_data[7] ft_clk R     1.309      6       0.471     6
ft_rxfn    ft_clk R     5.419      6       0.464     6
ft_txen    ft_clk R     6.847      6      -0.116     6
// Clock to Output Delay
Port       Clock   Edge  Max_Delay Performance_Grade  Min_Delay Performance_Grade
ft_data[0] ft_clk  R     9.276         6        3.478          M
ft_data[1] ft_clk  R     9.276         6        3.487          M
ft_data[2] ft_clk  R     9.262         6        3.553          M
ft_data[3] ft_clk  R     9.865         6        3.632          M
ft_data[4] ft_clk  R     9.611         6        3.538          M
ft_data[5] ft_clk  R     9.239         6        3.477          M
ft_data[6] ft_clk  R     9.495         6        3.560          M
ft_data[7] ft_clk  R     9.865         6        3.685          M
ft_oen     ft_clk  R    10.777         6        4.328          M
ft_rdn     ft_clk  R     9.761         6        3.945          M
ft_wrn     ft_clk  R    10.564         6        4.258          M
led_o      sys_clk R     5.648         6        2.657          M
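
With reports this long, a small script can pick out the rows in question. The Python sketch below assumes the seven-column layout of the "Input Setup and Hold Times" section shown above (port, clock, edge, setup, grade, hold, grade); `negative_hold_ports` is a hypothetical helper, and it only reports where the hold column is negative. Whether a negative value is an actual violation depends on how the tool defines that column.

```python
# Two rows copied from the "Analyzer Disabled" setup/hold section above.
REPORT = """\
Port       Clock  Edge  Setup Performance_Grade  Hold Performance_Grade
ft_rxfn    ft_clk R     5.419      6       0.464     6
ft_txen    ft_clk R     6.847      6      -0.116     6
"""


def negative_hold_ports(report):
    """Return (port, clock, hold_ns) for each setup/hold row with a negative hold."""
    hits = []
    for line in report.splitlines():
        fields = line.split()
        # Expected columns: port, clock, edge, setup, grade, hold, grade.
        if len(fields) != 7:
            continue
        port, clock, _edge, _setup, _grade1, hold, _grade2 = fields
        try:
            hold_ns = float(hold)
        except ValueError:
            continue  # header row or anything else that is not numeric
        if hold_ns < 0:
            hits.append((port, clock, hold_ns))
    return hits


print(negative_hold_ports(REPORT))
```

Feed it only the setup/hold section: the clock-to-output rows have the same number of columns but a different meaning.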

I also have some other timing problems. I'm using this library wrapped in a protocol I've written. If I enable the analyzer it works; if not, it does not. There are certainly timing problems that are not caught at Place & Route time.
How can I check whether I'm really in synchronous FIFO mode? Am I missing something?
Although I'm a senior dev in other languages, I'm a junior FPGA dev, and this design is driving me mad :)

Thanks for your help!

Attached my test projects.
FT245.zip

@WangXuan95
Owner

WangXuan95 commented Oct 28, 2022

Hi,

CLKOUT is 60 MHz, which is enough to show that the FT2232H is indeed in sync-245 FIFO mode.

I skimmed your Verilog (top.v). It seems you use 0xFF to control the transfer: when 0xFF is reached, you set transmit=1 and emit data from axis_fifo to my ftdi_245fifo. I suggest changing the code to this:

localparam STARTCHAR = 8'hFF;   // Emit when STARTCHAR reached
reg transmit = 0;
always @(posedge clk)
    if(m_axis_tvalid && m_axis_tdata == STARTCHAR && !transmit)
        transmit <= 1;
    else if (!s_axis_tvalid)
        transmit <= 0;

because m_axis_tdata is only valid when m_axis_tvalid=1.
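
To see why the m_axis_tvalid qualifier matters, here is a small Python model of the next-state logic of the suggested `transmit` register. It is a sketch only: `next_transmit` is a made-up name, and the signal roles are assumed from the Verilog snippet above.

```python
STARTCHAR = 0xFF  # same start marker as in the Verilog snippet


def next_transmit(transmit, m_tvalid, m_tdata, s_tvalid):
    """Value of `transmit` after the next clock edge."""
    if m_tvalid and m_tdata == STARTCHAR and not transmit:
        return True       # a valid STARTCHAR starts the transfer
    if not s_tvalid:
        return False      # stream ran dry: stop transmitting
    return transmit       # otherwise hold


# A stale 0xFF on the data bus with tvalid low must not start a transfer.
print(next_transmit(False, False, 0xFF, True))  # False
# The same byte with tvalid high does start it.
print(next_transmit(False, True, 0xFF, True))   # True
```

Without the tvalid term, the first case would start a transfer on whatever value happens to be left on m_axis_tdata.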

By the way, the loopback can be implemented more simply by connecting the RX interface directly to the TX interface:

ftdi_245fifo #(
	.TX_DEXP( 0), // TX data stream width = (2^TX_DEXP)*8
	.TX_AEXP(12), // TX FIFO depth = 2^TX_AEXP
	.RX_DEXP( 0), // RX data stream width = (2^RX_DEXP)*8
	.RX_AEXP(12), // RX FIFO depth = 2^RX_AEXP
	.C_DEXP ( 0)  // FTDI USB chip data width = (2^C_DEXP)*8 - FT232H is 0, for FT600 is 1, for FT601 is 2
) ftdi_245fifo_i (
	.rstn_async(~rst),
	.rx_clk(clk),
	.tx_clk(clk),
	.tx_valid(m_axis_tvalid),
	.tx_ready(m_axis_tready),
	.tx_data(m_axis_tdata),
	.rx_valid(m_axis_tvalid), 
	.rx_ready(m_axis_tready),
	.rx_data(m_axis_tdata),
	.usb_clk(ft_clk),
	.usb_rxf(ft_rxfn),
	.usb_txe(ft_txen),
	.usb_oe(ft_oen),
	.usb_rd(ft_rdn),
	.usb_wr(ft_wrn),
	.usb_data(ft_data),
	.usb_be()
  );

Now, back to the main problem: the probable issue you found when sending more than 512 bytes.
A few days ago I tested this (by the way, I updated this repo to make the code more convenient to use). I had the FPGA send continuously incrementing bytes (0x00 -> 0xFF -> 0x00 -> 0xFF ...) to the host PC, then used a Python program to receive and check the data. The received data kept incrementing, which means no bytes were lost. In the other direction, I had Python send incrementing bytes to the FPGA, which checks whether the bytes are incrementing and turns off the LED if they are not. The LED did not go off, indicating that there was no data loss.
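
The host-side check described above can be sketched in a few lines of Python. This is not the test program from the repo, only an illustration of the idea: `first_break` is a hypothetical helper that works on a buffer already received, so it needs no USB hardware.

```python
def first_break(data):
    """Return the index of the first byte that is not previous+1 (mod 256), or None."""
    expected = None
    for i, b in enumerate(data):
        if expected is not None and b != expected:
            return i
        expected = (b + 1) & 0xFF  # 0xFF wraps around to 0x00
    return None


# 1024 incrementing bytes, as the FPGA would send them.
good = bytes(i & 0xFF for i in range(1024))
# The same stream with one byte dropped at offset 600.
bad = good[:600] + good[601:]

print(first_break(good))  # None
print(first_break(bad))   # 600
```

A dropped byte shows up as a break at the exact offset, which makes it easy to see whether losses cluster at 512-byte boundaries.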

It doesn't matter that txfifoo_wtrans_i gets full. It has a valid/ready handshake with the stage before it, so when txfifoo_wtrans_i is full it stalls its input data stream and no data is discarded.
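
The argument can be illustrated with a toy cycle-based model in Python. It is a sketch under simplifying assumptions, not a model of the actual stream_wtrans module: a bounded buffer accepts a word only when it has room (ready) and the source has one (valid), and the sink is deliberately slow. The names `run_stream`, `depth`, and `sink_ready_pattern` are invented for this example.

```python
from collections import deque


def run_stream(n_words, depth, sink_ready_pattern):
    """Push n_words through a bounded buffer; return (received, stalled_cycles)."""
    buf = deque()
    received = []
    next_word = 0
    stalls = 0
    cycle = 0
    while len(received) < n_words:
        # Sample all handshake signals from the state at the start of the cycle.
        in_valid = next_word < n_words
        in_ready = len(buf) < depth        # buffer full -> ready deasserted
        out_valid = len(buf) > 0
        out_ready = sink_ready_pattern(cycle)
        if out_valid and out_ready:
            received.append(buf.popleft())
        if in_valid and in_ready:
            buf.append(next_word)          # transfer happens only on valid & ready
            next_word += 1
        elif in_valid:
            stalls += 1                    # source holds its word and retries
        cycle += 1
    return received, stalls


# Sink accepts a word only every fourth cycle; the buffer holds two words.
received, stalls = run_stream(100, depth=2, sink_ready_pattern=lambda c: c % 4 == 0)
print(received == list(range(100)), stalls > 0)
```

The source stalls many times, yet every word arrives once and in order, because a full buffer only delays the transfer and never drops it.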

I'd recommend first using my test code unmodified (fpga_top_example_ft232h.sv and the supporting Python test program in my repo). If that shows no issue, you can then adapt it to your application as you wish.

I have not used Lattice FPGAs and do not know their timing constraint flow, so I cannot directly advise you on timing.

But when I looked at the ECP5 board User Guide, I found no FT2232H chip on it that could be used for sync-245 FIFO mode communication (there is an FT2232H used for JTAG, but it cannot be used for sync-245 FIFO mode). It seems that you have an external FT2232H module, so here is a critical hint: the wiring from the FPGA to the FT2232H must be short (including the trace length on the PCBs), preferably within 5 cm, because the interface between them is clocked at up to 60 MHz. If you connect the FPGA and the FT2232H module with DuPont cables or flying wires, it is almost impossible for it to work normally. At the very least, they should be plugged together directly through pin headers. Even so, if the distance between them is not short enough, they may still not work properly. You can send me a photo of the boards.

Don't be discouraged by hardware debugging failures: once you succeed, the sense of achievement will be all the stronger. Feel free to contact me again, and I wish you all the best :)

@lk-davidegironi
Author

Thank you.

I have a half-duplex loopback because I use it as half duplex in my main logic, so I replicated that using a STARTCHAR for transmit. I've added the m_axis_tvalid signal; my mistake.

I need to find a board with an FT245 embedded.
I'll try your software, but I think the problem lies in the cabling. I had suspected it but was not sure, because it works sometimes.
Indeed, take a look at this mess :-/ :
processed-bc84b114-a037-4ae4-ae30-d66587e2d21a_1wez2Yhe

I'll keep you updated.
Thanks again.

@WangXuan95
Owner

WangXuan95 commented Oct 28, 2022

I've looked at the photo of your board. Clearly, connecting with DuPont wires is not reliable. By the way, it seems hard to find an FPGA development board with an FT232H/FT2232H chip (I use a PCB I designed myself). However, you can try plugging the FPGA board and the FT2232H board together directly through the 2.54 mm pin headers (make sure there is no connection error, and be careful not to burn the board).

In fact, you don't need to worry much about half duplex versus full duplex, and you don't need to enforce half duplex in Verilog. My ftdi_245fifo has a send/receive scheduler as well as send and receive buffers. For example, the host PC can first send a chunk of data to the FPGA. The FPGA obtains it from the RX interface of ftdi_245fifo and loops it straight back to the TX interface of ftdi_245fifo. If you do not call usb.receive() in the host PC's software, the data stays buffered in ftdi_245fifo and is not discarded until usb.receive() is called. In other words, my ftdi_245fifo will not lose data even if you use the TX and RX interfaces at the same time.

@lk-davidegironi
Author

Hello,
I get the same error with your host code. Also, if I disable the Lattice analyzer, still nothing works, and that is certainly a timing issue. I'm working on a supplementary PCB to which I will add an FTDI interface; that way I hope to remove the cabling problem and test this again. I'll let you know.
