Ping times of servers which are further down the server list are too high #49

Closed
corrados opened this issue Apr 10, 2020 · 27 comments

@corrados
Contributor

It seems that servers which are further down in the list get higher ping times reported in comparison to the first entries.

@corrados corrados added the bug Something isn't working label Apr 10, 2020
@corrados corrados self-assigned this Apr 10, 2020
@pljones
Collaborator

pljones commented Apr 14, 2020

Do you mean that, if Server A is the first server to register and Server Z is the latest server to register, even if they are both "close" to Client M, then Server Z will have an unexpectedly higher ping time, somehow related to the number of servers in the list? (Just to clarify what this is about.)

@corrados
Contributor Author

Yes. One example: someone ran his own central server plus five other servers on one PC. The five servers registered with that central server, so opening the server list showed all six servers on his PC. All of them should therefore have had the same ping time, but this was not the case. The one at the top of the list had, e.g., a ping time of about 20 ms, the next in the list 22 ms, then 24 ms, and so on. So the further down the list, the higher the ping time. This example shows that there is obviously an issue with the ping measurement in the server list.

@pljones
Collaborator

pljones commented Apr 14, 2020

There's a degree of serialization of the UDP messages both going out and coming back that will affect the computed time, of course. If all the messages get created "at the same time" with the same start timestamp, rather than the start time being the time the socket layer passes the message to the network UDP layer, then that will be amplified. Similarly, if the time of receipt is the time that the protocol message gets handled by the server list, rather than the time the UDP message is handled by the socket layer, there's more time added. Just some thoughts - not looked at the code.
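
The hypothesis above can be illustrated with a small model. This is only a sketch, not Jamulus code: the 20 ms round trip and the 2 ms per-message send cost are assumed numbers chosen to mirror the example earlier in the thread, and `measured_ping_times` is a hypothetical helper.

```python
# Illustrative model only; not Jamulus code. All numbers are assumptions.
TRUE_RTT_MS = 20.0   # assumed identical network round trip for every server
SEND_COST_MS = 2.0   # assumed time to hand one ping to the UDP layer


def measured_ping_times(num_servers, shared_start_timestamp):
    """Return the ping time the client would compute for each list position."""
    results = []
    batch_start = 0.0
    for position in range(num_servers):
        # Pings leave the machine one after another (serialization).
        actual_send_time = batch_start + position * SEND_COST_MS
        # The timestamp written into the ping is either the batch start
        # (shared) or the moment this particular ping is really sent.
        stamped_time = batch_start if shared_start_timestamp else actual_send_time
        receive_time = actual_send_time + TRUE_RTT_MS
        results.append(receive_time - stamped_time)
    return results


print(measured_ping_times(3, shared_start_timestamp=True))   # [20.0, 22.0, 24.0]
print(measured_ping_times(3, shared_start_timestamp=False))  # [20.0, 20.0, 20.0]
```

In this model, a shared start timestamp makes the reported value grow with list position even though the network round trip is identical for every server.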

@pljones
Collaborator

pljones commented Apr 25, 2020

Checked the creation of the timestamp for the outbound message - that looks fine: it's definitely at the time of the creation of each ping. The server receiving the ping definitely just sends the received value back. The client's calculation of the difference seems fine, too. So, now having read the code, I definitely don't think the numbers are wrong in themselves.

I'll try out actually starting up a bunch of servers locally, too.
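
The round trip described in this comment (client stamps the ping, server echoes the value, client subtracts) can be sketched as follows. The wire format (one unsigned 32-bit millisecond counter) and the function names are assumptions made for illustration; they are not taken from the Jamulus protocol.

```python
import struct

# Hypothetical wire format: one little-endian unsigned 32-bit millisecond value.
_PING_FORMAT = "<I"
_MASK_32 = 0xFFFFFFFF


def make_ping(now_ms):
    """Client side: stamp the ping with the time of its creation."""
    return struct.pack(_PING_FORMAT, now_ms & _MASK_32)


def echo_ping(payload):
    """Server side: send the received value back unchanged."""
    return payload


def ping_time_ms(reply, now_ms):
    """Client side: difference between now and the echoed timestamp."""
    (sent_ms,) = struct.unpack(_PING_FORMAT, reply)
    # Masking keeps the result correct if the 32-bit counter wrapped around.
    return (now_ms - sent_ms) & _MASK_32


reply = echo_ping(make_ping(1000))
print(ping_time_ms(reply, 1023))  # 23
```

Because the server only echoes the value, any delay between stamping the message and actually sending it, or between receiving the reply and processing it, ends up in the computed ping time.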

@vocobox

vocobox commented Apr 25, 2020

I have two public Jamulus servers running on the same physical machine. One has been up for 8 days, and I just rebooted the second to evaluate how ping time diverges with position in the list: I observe a varying ping difference, from 5 ms to 20 ms.

@pljones
Collaborator

pljones commented Apr 25, 2020

image
Central and three Slave servers - client on the same machine, zero ms ping to all slaves.

@corrados
Contributor Author

I don't think you will see the issue if all servers run on the same PC as the client. When I tried it, the servers were on a remote machine somewhere on the internet.

@pljones
Collaborator

pljones commented Apr 25, 2020

Oh, that does change matters - but it pushes the problem off the client, the slaves and the central server onto the network infrastructure, which makes isolation of it ... trickier...

@corrados
Contributor Author

If you give me the IP and port of your Central Server, I can try it out from my PC and see if I get the same ping numbers.

@pljones
Collaborator

pljones commented Apr 25, 2020

I'll open up the firewall, too :)

OK, try jamulus.drealm.info:55851 for the central server. The slaves should be 2-5 (added another - I can go up to 9). (Had to punch the same external port through the firewall as no NAT gets triggered just by listening on the server box, so the central advertises the internal port.)

@corrados
Contributor Author

I tried it out just now but cannot see any servers in the list.

@pljones
Collaborator

pljones commented Apr 25, 2020

Hm, that implies no reply from the central server, right? (It would list itself.)
Try with --showallservers? The firewall could be wrong...

@corrados
Contributor Author

Even with --showallservers I get an empty list.

@pljones
Collaborator

pljones commented Apr 25, 2020

D'oh - didn't hit save on the router's firewall config... Now it's logged me off! ...

OK... try again?

@corrados
Contributor Author

Now I see your central server. There is one thing you must change so that I can see your slave servers: You must register them as "permanent" servers by calling the Central Server with -o "myname;mycity;82;jamulus.drealm.info:[port_of_slave_server];slaveservername;slaveservercity;82". If you use -e on the slave servers, they will register with 127.0.0.1 which I obviously cannot ping.

@corrados
Contributor Author

BTW: This test has to be done with --showallservers, since I changed the normal server list so that only the minimum ping times are shown (to keep the numbers from changing all the time). In --showallservers mode, the current ping times are shown, as the normal server list did before.
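
The difference between the two display modes can be sketched as below. `PingDisplay` and its methods are hypothetical names for illustration, not Jamulus classes; the sketch only assumes that the normal list keeps the smallest ping seen per server, while the --showallservers mode shows the most recent one.

```python
class PingDisplay:
    """Tracks ping times per server for the two list modes (hypothetical)."""

    def __init__(self, show_all_servers=False):
        self.show_all_servers = show_all_servers
        self._min_ms = {}   # smallest ping seen so far, per server
        self._last_ms = {}  # most recent ping, per server

    def update(self, server, ping_ms):
        self._last_ms[server] = ping_ms
        self._min_ms[server] = min(ping_ms, self._min_ms.get(server, ping_ms))

    def displayed(self, server):
        # Normal list: minimum (stable). --showallservers: current value.
        if self.show_all_servers:
            return self._last_ms[server]
        return self._min_ms[server]


normal = PingDisplay()
for ping in (24, 21, 27):
    normal.update("Slave1", ping)
print(normal.displayed("Slave1"))  # 21
```

Showing the minimum hides jitter, which is why a systematic offset between servers on the same machine is easier to see after the list has been open for a while.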

@pljones
Collaborator

pljones commented Apr 25, 2020

Ah, ok... hm, I only see one now...

@corrados
Contributor Author

Have you registered all your slave servers as "permanent" servers? Here is what I read in the code:
// parse the predefined server infos (if any) according to definition:
// [server1 address];[server1 name];[server1 city]; ...
// [server1 country as QLocale ID]; ...
// [server2 address];[server2 name];[server2 city]; ...
// [server2 country as QLocale ID]; ...
// ...
So you have three entries for the Central Server, and for each slave server you have four entries (the IP address is the additional entry compared to the Central Server's server info).
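
The layout described above (three fields for the Central Server, then four fields per slave server) can be sketched as a small parser. This is an illustration under that assumption only, not the Jamulus parsing code; `ServerInfo` and `parse_server_info` are hypothetical names, and the slave name and city in the example string are made up.

```python
from typing import List, NamedTuple, Tuple


class ServerInfo(NamedTuple):
    address: str      # empty for the central server entry
    name: str
    city: str
    country_id: int   # country as QLocale ID


def parse_server_info(arg: str) -> Tuple[ServerInfo, List[ServerInfo]]:
    """Split a ';'-separated server info string into central + slave entries."""
    fields = arg.split(";")
    # Three fields for the central server, then groups of four per slave.
    if len(fields) < 3 or (len(fields) - 3) % 4 != 0:
        raise ValueError("expected 3 central fields plus 4 fields per slave")
    central = ServerInfo("", fields[0], fields[1], int(fields[2]))
    slaves = [
        ServerInfo(fields[i], fields[i + 1], fields[i + 2], int(fields[i + 3]))
        for i in range(3, len(fields), 4)
    ]
    return central, slaves


central, slaves = parse_server_info(
    "myname;mycity;82;jamulus.drealm.info:55852;Slave1;mycity;82"
)
print(central.name, len(slaves), slaves[0].address)
```

A field count that does not fit the 3 + 4n pattern is rejected here, which is one way a missing or extra field in the string could be caught early.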

@pljones
Collaborator

pljones commented Apr 25, 2020

Try now...

The two Slave1 entries are odd...

@corrados
Contributor Author

Still only one:
image

@pljones
Collaborator

pljones commented Apr 25, 2020

Can you try connecting to jamulus.drealm.info:55852 through 5? And 1 -- which is the central server, which now isn't listing itself, though it was without the fixed servers.

@pljones
Collaborator

pljones commented Apr 25, 2020

Hm. Now?

@corrados
Contributor Author

image

@corrados
Contributor Author

I am now using the normal list and have left it open for a while so that we can see the "true" minimum ping times of your servers:
image
As you can see, we have about 3 ms difference between the very first and very last server. But they should be exactly the same.

@pljones
Collaborator

pljones commented Apr 25, 2020

Odd - but clearly external to the server machine, the central server and the client, since the effect isn't seen when they're all running locally.

@corrados corrados removed their assignment May 20, 2020
@pljones
Collaborator

pljones commented Jul 1, 2020

OK, here's another odd one (or two)...

My jam server, jamulus.drealm.info, is showing a ping time of 2ms to 3ms from this machine, which is okay (long cable, switch, router in1, router out2, switch, short cable). (It shows 0-1ms from the machine sitting next to it on the same switch.)

Genre Rock Central Server List is showing 11-14ms
Genre Classical/Folk/Choir Server List is showing 11-14ms

... strange ... If I restart them, those values will drop.

Stranger still...

If I look from http://jamulus.softins.co.uk/, jamulus.drealm.info is ~9ms (which is what I see for all the London Docklands servers from here, so that's right). Hm... And so are the two Central Server List servers.

@corrados
Contributor Author

I am now running the "create ping time messages" code in a separate thread and applying some delay to it. With this change, the variance of the ping time measurement is much smaller, and Jamulus servers on the same hardware now show the same ping value (±1 ms, tested with the Worldjam server list). So I'll close this issue now.
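
The general idea of this fix can be sketched as follows. The thread only says that ping creation moved to a separate thread with some delay applied; the per-server pause, the 2 ms value, and the names `start_ping_thread` and `send_ping` are assumptions for illustration and are not the actual Jamulus change.

```python
import threading
import time


def start_ping_thread(servers, send_ping, delay_s=0.002):
    """Send one ping per server from a worker thread, pausing between sends.

    send_ping is a caller-supplied function (hypothetical) that stamps and
    sends a single ping; delay_s is an assumed spacing between pings.
    """
    def worker():
        for server in servers:
            send_ping(server)    # the timestamp is taken when this ping is created
            time.sleep(delay_s)  # spread the pings out instead of sending a burst

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


sent = []
ping_thread = start_ping_thread(["Slave1", "Slave2", "Slave3"], sent.append)
ping_thread.join()
print(sent)  # ['Slave1', 'Slave2', 'Slave3']
```

Spacing the pings out means the replies also arrive spread out, so a reply is less likely to wait behind other replies before it is processed.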


3 participants