coalesce (bytes2hex, num2hex, hex) and (bin, bits, etc.) #14418

Closed
StefanKarpinski opened this issue Dec 15, 2015 · 5 comments

Comments

@StefanKarpinski
Sponsor Member

It was noted in the conversation on #14341 that bytes2hex is very similar in functionality to hex and num2hex. There's also overlap between bin and bits. Of course hex2bytes is also a form of number parsing. This is a design issue to discuss how to reduce and systematize the number of functions for printing and parsing numbers in different bases.

@StefanKarpinski
Sponsor Member Author

cc: @mason-bially, @nalimilan, @tkelman

@mason-bially
Contributor

I'd like to reiterate that it would be amazing if there were a way to stick this functionality into the print and parse functions. (Or show? Honestly, I'm still a bit confused about the difference; it took me a while to find a type (symbols) that prints differently between them for testing @sprintf.) I feel like some work has already been moving in this direction (#14052 and #13825).

Additionally all of the existing x2y functions should probably make an appearance in a C compatibility library in the future.

With all of that said, it seems we already have:

  • bin
  • oct
  • dec
  • hex

All of these methods provide padding.

  • bits is bin with padding set to width of the type.
  • num2hex seems to be hex with padding set to the width of the type.

I think we could provide better standard behavior by having a way to set the padding to "max width of type". Maybe by setting padding to zero, or passing a type to dispatch off of like MaxPadding (I realize this is of course an abuse of the type system just discussed a few hours ago in another issue). Then we could remove those previous two methods.
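The "pad to the max width of the type" idea can be sketched as follows. This is only an illustration, assuming a recent Julia 1.x where `string(n, base=..., pad=...)`, `ndigits(n, base=...)` and `bitstring` are available (these postdate the functions discussed in this thread); `fullwidth` is a hypothetical helper name, and the sketch is restricted to unsigned integers so that no sign handling is needed.

```julia
# Hypothetical helper: print an unsigned integer in the given base,
# zero-padded to the number of digits needed for the largest value of its type.
function fullwidth(x::Unsigned, base::Integer)
    width = ndigits(typemax(typeof(x)), base=base)
    return string(x, base=base, pad=width)
end

fullwidth(UInt8(5), 2)      # same digits as bitstring(UInt8(5))
fullwidth(UInt16(255), 16)  # "00ff"
```

With a helper like this, the type-width variants become a special case of the padded ones rather than separate functions.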

bytes2hex returns a string hex representation of an array of bytes. This seems to match hex: return an object (ASCIIString) representing the given value in hex. The bytes argument could be valid for bin and oct (the last one is tricky), but dec would require a separator. However, I think all of them could do with separators, or other formatting arguments. For example, this one-liner: dec(Array{UInt8}([127,0,0,1]), '.')

While we're on the topic, it would be nice if bytestring, or something like it, took an Array{Int64} and turned it into bytes, throwing an exception for out-of-bounds values. That would make the one-liner: dec(bytestring([127,0,0,1]), '.'). This would also help with the hex2bytes family of functions, which seem to imply the creation of a bytes function (returning an Array{UInt8}), perhaps with a base argument to help it parse strings in various bases, and perhaps also for turning other things into Array{UInt8}.
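A rough sketch of the separator idea, assuming a recent Julia 1.x: `joinbase` is a hypothetical name (not an existing or proposed Base function), and the checked conversion from `Int64` to bytes is done here with the ordinary `UInt8` constructor, which throws an `InexactError` for out-of-range values.

```julia
# Hypothetical helper: print each byte in `base` and join the pieces with `sep`.
function joinbase(bytes::AbstractVector{UInt8}, base::Integer, sep; pad::Integer=1)
    return join((string(b, base=base, pad=pad) for b in bytes), sep)
end

# Checked conversion: UInt8.(v) throws InexactError if any element is outside 0:255.
addr = UInt8.([127, 0, 0, 1])

joinbase(addr, 10, '.')                      # "127.0.0.1"
joinbase(UInt8[0xde, 0xad], 16, ':', pad=2)  # "de:ad"
```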

hex2num is problematic (it takes hex and turns it into a float... who is using this!?); this should be in the parse function. Basically, something like parse(Float64, "4056200000000000", 16, binary=true) should just work (much longer, I know... but this can't be a common function call, can it?).
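For comparison, the same bit-pattern parse can be spelled with two existing calls in a recent Julia 1.x. This is a sketch and not the `binary=true` syntax proposed above, which is hypothetical; `hexfloat` is an illustrative name only.

```julia
# Parse 16 hex digits as a UInt64, then reinterpret those bits as a Float64.
hexfloat(s::AbstractString) = reinterpret(Float64, parse(UInt64, s, base=16))

hexfloat("4056200000000000")  # 88.5
```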

Now the last major hurdle in my opinion is useful printing of a large dump of hex, basically hexdump in a function. Aiming for something like (from wikipedia):

00105e0 e6b0 343b 9c74 0804 e7bc 0804 e7d5 0804
00105f0 e7e4 0804 e6b0 0804 e7f0 0804 e7ff 0804
0010600 e80b 0804 e81a 0804 e6b0 0804 e6b0 0804

or

0000: 57 69 6B 69 70 65 64 69 61 2C 20 74 68 65 20 66  Wikipedia, the f
0010: 72 65 65 20 65 6E 63 79 63 6C 6F 70 65 64 69 61  ree encyclopedia
0020: 20 74 68 61 74 20 61 6E 79 6F 6E 65 20 63 61 6E   that anyone can
0030: 20 65 64 69 74 00 00 00 00 00 00 00 00 00 00 00   edit...........

But I'm not sure how to encapsulate that in a function yet. Maybe have a DumpFormat type which would describe spacing, the starting location of the addresses, whether to include the ASCII view on the right, etc. So maybe:

hex(bytes, DumpFormat())

Where DumpFormat can be constructed with various different keywords to enable different behavior. But that's my wishlist. I'd be happy if we happen to prune all the methods we currently have down to 5.
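As a starting point, the second dump layout above can be produced with a short function. This is a minimal sketch assuming a recent Julia 1.x; the name `hexdump` and the `width` keyword are hypothetical, and a single keyword argument stands in for the richer DumpFormat type described above.

```julia
# Hypothetical hexdump: offset, hex bytes, then an ASCII column on the right.
function hexdump(io::IO, bytes::AbstractVector{UInt8}; width::Int=16)
    for offset in 0:width:(length(bytes) - 1)
        chunk = bytes[(offset + 1):min(offset + width, length(bytes))]
        hexpart = join([uppercase(string(b, base=16, pad=2)) for b in chunk], ' ')
        # Non-printable bytes are shown as '.' in the ASCII column.
        asciipart = join([0x20 <= b <= 0x7e ? Char(b) : '.' for b in chunk])
        # Pad the hex column so a short final row still lines up.
        println(io, uppercase(string(offset, base=16, pad=4)), ": ",
                rpad(hexpart, 3 * width - 1), "  ", asciipart)
    end
end

hexdump(stdout, Vector{UInt8}("Wikipedia, the free encyclopedia"))
```

Taking an `IO` as the first argument keeps it composable with `sprint` and file handles, which is the usual convention for printing functions in Julia.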

@mason-bially
Contributor

I did a search. This is the only file on GitHub that uses hex2num in a non-testing, non-defining, non-documentation way. And I don't think a more explicit function call to parse would be problematic here. The line would go from:

hex2num(bytes2hex(bytes[1:8])), 8

to

parse(Float64, hex(bytes[1:8]), 16, binary=true), 8

There is probably a faster way to pull a float out of memory and into Julia anyway...
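One possible way to skip the hex round trip, sketched for a recent Julia 1.x: reinterpret the eight bytes as a `UInt64`, fix the byte order, and reinterpret the bits as a `Float64`. The assumption here is that the bytes are in big-endian order, which is what the `hex2num(bytes2hex(...))` spelling implies; `bytes2float` is a hypothetical name.

```julia
# Read 8 big-endian bytes as a Float64 without going through a hex string.
function bytes2float(bytes::AbstractVector{UInt8})
    bits = reinterpret(UInt64, bytes[1:8])[1]  # host byte order
    return reinterpret(Float64, ntoh(bits))    # ntoh: big-endian to host order
end

bytes2float(hex2bytes("4056200000000000"))  # 88.5
```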

@StefanKarpinski
Sponsor Member Author

Yes, I suspect that hex2num can go away.

@oscardssmith
Member

We appear to have deleted most of these by now.


3 participants