Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

deprecate String(::IOBuffer) #21438

Closed
JeffBezanson opened this issue Apr 18, 2017 · 21 comments
Closed

deprecate String(::IOBuffer) #21438

JeffBezanson opened this issue Apr 18, 2017 · 21 comments
Assignees
Labels
domain:io Involving the I/O subsystem: libuv, read, write, etc. domain:strings "Strings!" kind:deprecation This change introduces or involves a deprecation
Milestone

Comments

@JeffBezanson
Copy link
Sponsor Member

JeffBezanson commented Apr 18, 2017

This constructor method does not seem to be necessary (left over from the bytestring function). It could be seekstart(b); readstring(b) or String(take!(b)).

Actually, String(::IOBuffer) is a bit of a strange function since it gives you a copy of the data in the buffer without affecting the stream state in any way. Does anybody use this?

@JeffBezanson JeffBezanson added kind:deprecation This change introduces or involves a deprecation domain:io Involving the I/O subsystem: libuv, read, write, etc. domain:strings "Strings!" labels Apr 18, 2017
@JeffBezanson JeffBezanson added this to the 1.0 milestone Apr 18, 2017
@lobingera
Copy link

lobingera commented Apr 19, 2017

I'm using String(take!(b)) a few times in Cairo.jl runtests, so i'd guess there are other who test if the output to an io stream contains something.

@KristofferC
Copy link
Sponsor Member

I use String(take!(b)) because the previous version (takebuf_string) was deprecated.

@lobingera
Copy link

same reason.

@nalimilan
Copy link
Member

BTW, the current readstring is so simple that it could even be deprecated in favor of String(read(b)). We could deprecate String(b::IOBuffer) in favor of String(take!(b)) rather than adding a new readstring method.

@JeffBezanson
Copy link
Sponsor Member Author

Sorry I wasn't clear. There's nothing wrong with String(take!(b)); I was wondering if anybody uses the String(::IOBuffer) method I want to deprecate.

readstring could now avoid allocating an intermediate vector, so that's a reason to keep it. String(read(f)) could end up copying all of the data. In fact it seems to do so in some cases now, which should be fixed!

@TotalVerb
Copy link
Contributor

TotalVerb commented Apr 19, 2017

Furthermore, although unfortunate, I think it's inaccurate to assume that every IO will contain UTF-8 data. There are still places today where ISO 8859-1 or UTF-16 are used, and having a separate encoding-aware readstring is useful in those cases.

(This is not relevant to IOBuffer, of course, which only contains UTF-8 data by definition.)

@nalimilan
Copy link
Member

Furthermore, although unfortunate, I think it's inaccurate to assume that every IO will contain UTF-8 data. There are still places today where ISO 8859-1 or UTF-16 are used, and having a separate encoding-aware readstring is useful in those cases.

Yes, but I figured String(read(b), enc"UTF-16") would be enough in that case.

But indeed if readstring can avoid a copy it's a good reason to keep it.

@JeffBezanson
Copy link
Sponsor Member Author

IOBuffer can contain arbitrary binary data, but String is by definition UTF-8, so it's a good point that the String constructor would not be a good way to read a non-UTF-8 string.

@nalimilan
Copy link
Member

IOBuffer can contain arbitrary binary data, but String is by definition UTF-8, so it's a good point that the String constructor would not be a good way to read a non-UTF-8 string.

As long as the constructor converts the buffer to UTF-8, I don't see the problem. It could also be interesting to be able to create an encoded string in a custom type like EncodedString{:UTF16} (cf. discussions at JuliaStrings/StringEncodings.jl#9), but that would use a separate constructor.

@JeffBezanson
Copy link
Sponsor Member Author

Yes, I guess that's true. Actually readstring could also be replaced with read(io, String), which would be more consistent with our existing API. An interface like StringType(read(io), ...) should be avoided since it will require copying data in general.

@JeffBezanson JeffBezanson changed the title deprecate String(::IOBuffer) to readstring deprecate String(::IOBuffer) to readstring or read(io, String) Jun 21, 2017
@JeffBezanson JeffBezanson self-assigned this Jul 13, 2017
@JeffBezanson JeffBezanson changed the title deprecate String(::IOBuffer) to readstring or read(io, String) deprecate String(::IOBuffer) Jul 13, 2017
@JeffBezanson
Copy link
Sponsor Member Author

Resolved: will deprecate this to String(take!(copy(iobuf))).

JeffBezanson added a commit that referenced this issue Jul 14, 2017
deprecate `String(io::IOBuffer)`. fixes #21438
jeffwong pushed a commit to jeffwong/julia that referenced this issue Jul 24, 2017
@rfourquet
Copy link
Member

Would it make sense to have take(iobuf) = take!(copy(iobuf))? That would allow the less verbose String(take(iobuf)).

@StefanKarpinski
Copy link
Sponsor Member

take is deprecated, but could be reintroduced, I suppose?

@yuyichao
Copy link
Contributor

Most of the usage I saw can just use take! fwiw....

@StefanKarpinski
Copy link
Sponsor Member

The copy is just there in case the I/O buffer needs to continue to work, which it often doesn't.

@yuyichao
Copy link
Contributor

I know.... I just mean that most of the time it's not needed so unless it's clear that a lot of the code actually need the copy we don't necessarily need to introduce take.

@rfourquet
Copy link
Member

rfourquet commented Jul 28, 2017

My encounter with IOBuffer is mostly in the repl code, where it's frequent to need String (::IOBuffer); so the new syntax doesn't only seem verbose, but conceptually more involved than necessary for "give me the string representation of this buffer": make a copy of something and destroy this copy to transform it into a String. But a special purpose private function can also be created where needed (still, I don't see what would be wrong with convert(::Type{String}, ::IOBuffer)).

@JeffBezanson
Copy link
Sponsor Member Author

The problem is that String(::IOBuffer) always copied the data, which is often not necessary. As Yichao says, very often you want just take!, which is more efficient.

@rfourquet
Copy link
Member

So is the plan to reintroduce String(io::IOBuffer) after deprecation to mean String(take!(io))? in this case, String(copy(io)) would be as good as an hypothetical String(take(io)).
Otherwise, convert(String, io) would also work no? convert usually is not destructive of its second argument.

@yuyichao
Copy link
Contributor

This should not be defined as convert. We don't want this to happen on assignment.

@JeffBezanson
Copy link
Sponsor Member Author

String can't mean String(take!(...)), because there would be a hidden !.

I'm not sure what to make of this; I see why the function is useful, but it's pretty ad-hoc. For example there is no Vector{UInt8}(::IOBuffer). Should we just continue to have the special case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:io Involving the I/O subsystem: libuv, read, write, etc. domain:strings "Strings!" kind:deprecation This change introduces or involves a deprecation
Projects
None yet
Development

No branches or pull requests

8 participants