Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interest in byte-level tailing? #13

Closed
gcla opened this issue Dec 23, 2019 · 9 comments
Closed

Interest in byte-level tailing? #13

gcla opened this issue Dec 23, 2019 · 9 comments
Labels
v2 Ideas for v2

Comments

@gcla
Copy link

gcla commented Dec 23, 2019

Hi - I made use of a fork of the original hpcloud/tail repo in termshark to tail a pcap (packet capture) file. The output from the tail process is fed into some tshark (CLI tool) processes which provide data for termshark's UI. Because pcaps are a binary format, I couldn't rely on a line-based tail to provide timely output for tshark's input, so I made some cheesy modifications to hpcloud/tail to tail in "chunks" of bytes. These chunks are not necessarily line-terminated, and possibly even 1-byte long if the pcap file being tailed grows very slowly. Would you be interested in these sorts of changes in nxadm/tail? If so, I could make a PR - and if you find it acceptable, I'd then switch termshark over to use nxadm/tail. Thanks!

@pocc
Copy link

pocc commented Dec 23, 2019

I have a project that would benefit from this.

@nxadm
Copy link
Owner

nxadm commented Jan 2, 2020

@gcla, I think it sounds very interesting. We need to think hard about the API to make sure we don't end with a Frankenstein-do-it-all API. So we have two options:

  • make sure it integrates transparently with today line-based API, e.g. by adding an option that switches it on. Line is used all over the place, so a PR will touch a lot of places. We need to make sure it makes sense.
  • leave the API as it is and integrate it with a new major version of the API where it's cleaned up. The API has a lot of historical baggage, like exposed internal variables and structs. I started the work here: [new major release] Test and release the noTomb branch #11. Because I removed a orphaned upstream dependency, this new branch needs a lot of testing though.

@pocc
Copy link

pocc commented Jan 12, 2020

The former would be cleaner but the latter will take less time, given that gcla/tail works right now. @gcla what's your take on this?

@gcla
Copy link
Author

gcla commented Jan 18, 2020

Hi @nxadm - my view is that I should follow the path you've set. :-) It looks to me as though the future of tail is in your new branch, and it would make sense to try to cleanly extend that to support byte-level tailing. What do you think? It would be a shame to retro-fit the old API, then have it be left behind.

If you agree, I can try to make time to build on your branch with a byte-level approach - or maybe I should describe it as a non-CRLF approach. My very simple-minded view of the design, without even having got down into the weeds recently, is that ReadLine is replaced by Read, up to some maximum number of bytes per time-window, and possibly subjected to some sort of leaky-bucket rate limiter as was in the original implementation. You can see my original change here - I can picture you wincing as you look at it :-)

gcla/tail@fb58b0d

It reads 1 byte at a time - it doesn't even try to buffer them up. Yuck!

@nxadm
Copy link
Owner

nxadm commented Jan 25, 2020

@gcla, I'll respond begin next week (aka a few days). I was AFK. Looking forward to the feature.

@nxadm
Copy link
Owner

nxadm commented Feb 2, 2020

@gcla, whatabout both ReadLine and ReadBytes methods? Reading one of the other is a very explicit show, so I don't see a problem with having two methods. It would make sense, though, to have one data structure.

What I really would like is to test the new branch to see if the changes I made don't break it. It passes all tests, but maybe it's just an indication that we need more tests :).

@nxadm
Copy link
Owner

nxadm commented Dec 24, 2020

Issue #20 is related to this.

@dolmen
Copy link
Contributor

dolmen commented Apr 27, 2021

I think it would be better to keep this package centered around text streams and leave the handling of binary streams to another package.
The case of a program having the need for tailing both text streams and binary streams is a niche, so I don't much benefit in having the two cases under the same package. A "see also" section that mention the other project should be enough.

@nxadm
Copy link
Owner

nxadm commented Nov 2, 2021

I agree with @dolmen on this. It would clutter the interface to the programmer. We can spin-off a second package when/if v2 is released that build on a common package. I moved the discussion for v2 here: #36

@nxadm nxadm closed this as completed Nov 2, 2021
@nxadm nxadm added v2 Ideas for v2 and removed investigate-for-v2 labels Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
v2 Ideas for v2
Projects
None yet
Development

No branches or pull requests

4 participants