-
Notifications
You must be signed in to change notification settings - Fork 7.7k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Deprecate use of mbstring to convert text to Base64/QPrint/HTML entit…
…ies/etc The purpose of mbstring is for working with Unicode and legacy text encodings; but Base64, QPrint, etc. are not text encodings and don't really belong in mbstring. PHP already contains separate implementations of Base64, QPrint, and HTML entities. It will be better to eventually remove these non-encodings from mbstring. Regarding HTML entities... there is a bit more to say. mbstring's implementation of HTML entities is different from the other built-in implementation (htmlspecialchars and htmlentities). Those functions convert <, >, and & to HTML entities, but mbstring does not. It appears that the original author of mbstring intended for something to be done with <, >, and &. He used a table to identify which characters should be converted to HTML entities, and </>/& all have a special value in that table. However, nothing ever checks for that special value, so the characters are passed through unconverted. This seems like a very useless implementation of HTML entities. The most important characters which need to be expressed as entities in HTML documents are those three!
- Loading branch information
Showing
6 changed files
with
54 additions
and
12 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
9308974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/phpstan/phpstan-src/blob/master/src/Type/Php/MbFunctionsReturnTypeExtension.php#L54
Code like this will trigger deprecation warnings now, and I can't think of an elegant way to solve it except for error suppression.
9308974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twose Thanks for letting us know the challenge you are facing.
It's unfortunate that some features which do not really make sense as part of MBString were added long ago. Some of these, like MBString's HTML entity handling feature, were completely broken from the day they were committed to the code base and have never worked properly, but now that we want to remove them, this has created some trouble for people like yourself. Sorry for that; it's not my intention to cause problems for any of our users.
The easiest and simplest way for you to avoid deprecation warnings in the code you link to is to include an
if...continue
clause in that loop. This is untested code, but just as a sample, it might look something like:Please let me know if something similar to that does the trick for you.
9308974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexdowad
Thank you for your quick response!
I wonder if these encodings will be removed from the
mb_list_encodings()
in the future.If yes, then we need to remove such compatibility code at that time, right?
And, maybe it is reasonable to allow the use of 'mb_encoding_aliases()' without throwing deprecation warnings before removing these encodings?
9308974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, whenever you do not support PHP 8.1 any more, then yes... you could remove it.
For your use case, it does seem reasonable. Then again, if someone is specifically passing in a deprecated encoding which we want to remove (i.e.
mb_encoding_aliases('BASE64')
), then we would like that person to be warned of the deprecation.Again, if
mb_list_encodings
was modified so as not to return the names of deprecated encodings, that would be better for your use case. But there is a chance that it might break existing code for someone else.In general, trying to refactor APIs which are already in use is very, very hard. (Please see Hyrum's Law.) The best solution would be if we could just make a time machine, then go back in time and ensure that API design mistakes were never committed.
Failing that, another approach is to say that once released, APIs can never change, so if we want to fix design problems with
mbstring
, we should instead addmbstring2
and leave the old version untouched. This is the approach taken by Go's standard libraries (and there are notable advantages to that approach). However, in this case, I think the maintenance burden of maintaining two different versions ofmbstring
is unacceptable.The remaining alternative, which has been chosen, is to 1) carefully think about breaking changes to make sure they are really better in the long term, 2) try to make changes in a way which minimizes the number of users affected, and 3) provide deprecation warnings in advance so users become aware of pending changes and can take action. In some cases, the "action" may be to provide feedback to the developers, which might impact when and how the changes are done.
We are very interested in hearing the views of PHP users. You have already expressed your view that
mb_encoding_aliases
should not raise deprecation warnings for encodings which may be removed later. If you can provide reasons why this is the best way to handle the deprecation (for all involved, not just yourself), and a developer is available to implement the suggestion, then it might be possible to go that way.Thanks again for the report.
9308974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexdowad
Thank you very much for your detailed explanation!
I don't actually have any point of view, I'm just asking. (Maybe my poor English expressive faculty makes you think I'm supporting another point of view 🥲).
Now I've got all the answers I want.
Thank you again for the quick response!
9308974
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alexdowad I don't see any RFC about this new deprecation. This looks weird to me.