Support JDK serialization/deserialization features #2730

Merged 1 commit on Dec 5, 2020

ziyilin
Collaborator

@ziyilin ziyilin commented Aug 3, 2020

JDK's serialization/deserialization features are implemented by the java.io.ObjectInputStream.readObject and java.io.ObjectOutputStream.writeObject APIs, which are not supported by native image. This patch adds support for these two APIs in native image.

Features:

  • Support serialization API Ljava/io/ObjectOutputStream;#writeObject(Ljava/lang/Object;)V
  • Support deserialization API Ljava/io/ObjectInputStream;#readObject()Ljava/lang/Object;
  • Add a new configuration file, serialization-config.json, to provide serialization/deserialization target class information to native-image at build time. The agent can intercept serialization/deserialization calls and store the target class information in the configuration file automatically.
  • This patch doesn't depend on the previously committed dynamic class loading feature (Support dynamic class loading #2442).
  • Unsupported usage of multiple class loaders is reported at build time. For example, suppose there are two classes with the same name "com.alibaba.test.serialze.Data", but one extends com.alibaba.test.serialize.DummyBase and the other doesn't. When they are loaded and serialized by different classloaders, the following error is reported at build time:
    [image: build-time error message]
  • JUnit is now supported. All tests in the attached tests.zip are JUnit tests.
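
For readers unfamiliar with the two APIs listed above, here is a minimal sketch of a serialization round trip on a regular JVM. It is not taken from the patch or from tests.zip; the class names SerializationRoundTrip and Data are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationRoundTrip {

    // Hypothetical serialization target class (an assumption, not from the patch).
    static class Data implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int value;

        Data(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    // Writes the object to an in-memory stream with ObjectOutputStream.writeObject.
    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Reads the object back with ObjectInputStream.readObject.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Data copy = (Data) deserialize(serialize(new Data("example", 42)));
        System.out.println(copy.name + " " + copy.value);
    }
}
```

In a native image, a class such as Data would presumably have to be listed in serialization-config.json (or recorded by the agent) for these calls to work; the exact file format is defined by the patch and is not reproduced here.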

Tests:

Tests are here:
tests.zip

There are 4 tests in tests.zip for this patch. Unzip the file and run each shell script whose name starts with "test" to see the result.

  • testCustomizedClassSerialize.sh: This test serializes and deserializes a customized class.
  • testDeserializeStream.sh: This test deserializes a float array.
  • testSerializeArrayList.sh: This test serializes and deserializes an ArrayList.
  • testDeserializeMultiClassloader.sh: This test serializes and deserializes two different classes with the same name using different classloaders. This test is expected to fail.
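
The contents of tests.zip are not shown in this thread. As a rough sketch of the kind of round trip the float array and ArrayList tests describe, assuming a plain in-memory stream and a hypothetical helper class JdkTypeRoundTrip:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;

class JdkTypeRoundTrip {

    // Serializes obj to memory and immediately deserializes it again.
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Primitive arrays are serializable without any extra code.
        float[] floats = {1.0f, 2.5f, -3.0f};
        float[] floatsCopy = roundTrip(floats);
        System.out.println("float[] equal: " + Arrays.equals(floats, floatsCopy));

        // ArrayList is a JDK class written with serialization in mind.
        ArrayList<String> list = new ArrayList<>(Arrays.asList("a", "b"));
        ArrayList<String> listCopy = roundTrip(list);
        System.out.println("ArrayList equal: " + list.equals(listCopy));
    }
}
```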

@peter-hofer
Member

Thank you for your contribution, @ziyilin! I've left some first comments and I'm looking forward to having a closer look.

@peter-hofer
Member

peter-hofer commented Aug 18, 2020

The style checker reports several errors, some of which break builds, please see: https://travis-ci.org/github/oracle/graal/jobs/718831254#L2394-L2434

@ziyilin
Collaborator Author

ziyilin commented Aug 19, 2020

The style checker reports several errors, some of which break builds, please see: https://travis-ci.org/github/oracle/graal/jobs/718831254#L2394-L2434

fixed

@ziyilin ziyilin force-pushed the serialization branch 2 times, most recently from 5df9526 to b2c20b2 Compare August 19, 2020 11:23
@ziyilin ziyilin force-pushed the serialization branch 4 times, most recently from b83c4e0 to 13c898b Compare August 21, 2020 09:32
@olpaw
Member

olpaw commented Dec 2, 2020

Class GeneratedSerializationConstructorAccessor2 is dynamically generated by method MethodAccessorGenerator.generateSerializationConstructor for deserialization usage. Instantiating an abstract class with new does indeed violate the JVM specification, but Hotspot somehow manages to make it work. Maybe we can follow the same strategy. I investigated Hotspot and found that the following check in InterpreterRuntime::_new has been bypassed, but I still haven't found out how Hotspot manages to do this. I would appreciate it if any of you has a clue.

void InstanceKlass::check_valid_for_instantiation(bool throwError, TRAPS) {
  if (is_interface() || is_abstract()) {
    ResourceMark rm(THREAD);
    THROW_MSG(throwError ? vmSymbols::java_lang_InstantiationError()
              : vmSymbols::java_lang_InstantiationException(), external_name());
  }
}

@ziyilin thanks for investigating further. There is now a PR on master that allows us to defer an illegal NewInstance to an image runtime error with --allow-incomplete-classpath (3f76cff). It allows us to work around the issue for now.

To have serialization in 21.0.0 as an experimental feature, we need to get this PR merged by Friday.
Please rebase the PR so that it applies to master without merge conflicts, and make sure that the copyright header passes the checks (see #2730 (comment)). Then I can create an internal PR and run it through the full set of internal gates.

Member

@olpaw olpaw left a comment

All good. One small thing left. See #2730 (comment)

@ziyilin
Collaborator Author

ziyilin commented Dec 3, 2020

@olpaw I have fixed the "new abstract class" issue.
Hotspot doesn't have the same problem because it never runs the GeneratedSerializationConstructorAccessor2.newInstance method in which the abstract class is instantiated, although it does dynamically generate the GeneratedSerializationConstructorAccessor2 class via MethodAccessorGenerator.generateSerializationConstructor. The deserialization target class cannot be abstract; only its super classes can be. And newInstance is only called for the deserialization target classes. There is no actual usage for super classes' generated ConstructorAccessor classes, but more like some kind of place holder. So when the first parameter of generateSerializationConstructor, which is the target of the new instruction in the generated bytecode, is an abstract class, I pass a stub class in its place. Meanwhile, the serialization checksum is always set to 0 for abstract classes, because there is no need to verify abstract classes against their corresponding GeneratedSerializationConstructorAccessor classes.
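
To illustrate the point about abstract super classes, here is a small sketch of standard JDK deserialization behavior on a regular JVM. It does not show the patch's stub class mechanism; the names AbstractSuperclassSketch, Base and Target are hypothetical. Only the concrete target class is instantiated, and deserialization runs the no-arg constructor of the first non-serializable super class (here an abstract one) instead of the target's own constructor.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class AbstractSuperclassSketch {
    static int baseConstructorCalls;
    static int targetConstructorCalls;

    // Abstract, non-serializable super class; it is never instantiated on its own.
    abstract static class Base {
        protected Base() {
            baseConstructorCalls++;
        }
    }

    // Concrete deserialization target.
    static class Target extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        final int value;

        Target(int value) {
            targetConstructorCalls++;
            this.value = value;
        }
    }

    // Returns {Base constructor calls, Target constructor calls, deserialized value}.
    static int[] demo() throws Exception {
        baseConstructorCalls = 0;
        targetConstructorCalls = 0;

        // Regular construction runs both constructors once.
        Target original = new Target(7);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }

        // Deserialization allocates a Target, runs Base() again, and skips Target(int).
        Target copy;
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            copy = (Target) in.readObject();
        }
        return new int[] {baseConstructorCalls, targetConstructorCalls, copy.value};
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println("Base() calls: " + r[0]
                + ", Target(int) calls: " + r[1] + ", value: " + r[2]);
    }
}
```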

@olpaw
Member

olpaw commented Dec 3, 2020

There is no actual usage for super classes' generated ConstructorAccessor classes, but more like some kind of place holder.

That matches my observation. Using 3f76cff with --allow-incomplete-classpath I was able to work around the issue because those synthetic constructors for abstract classes do not get called at image runtime (thus never resulting in image runtime exceptions).

Glad you found a proper fix so we do not need to rely on --allow-incomplete-classpath.

@olpaw
Member

olpaw commented Dec 3, 2020

@ziyilin created internal PR. Thanks a lot for your contribution.

@ziyilin
Collaborator Author

ziyilin commented Apr 27, 2021

@peter-hofer @olpaw I have a compatibility concern about the serialization feature. Native image is based on labs-openJDK or graal-jvmci-8, which are not exactly the same as OpenJDK. Is there any chance that a serialized target class is a JDK class whose contents differ between OpenJDK and labs-openJDK? In that case, the serialized data from native image could differ from OpenJDK's, leading to inconsistent runtime behavior.

@peter-hofer
Member

@ziyilin It's possible, but unlikely. The Labs JDKs generally incorporate JVMCI changes and various bugfixes and not changes to JDK classes, especially not those classes which we would expect to be used in serialization, like collections. These are typically also written with serialization in mind (transient fields, serialVersionUID).
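
One way to check this for a particular class is to compare its serialVersionUID and serializable fields on both JDKs. The following is a minimal sketch using the standard java.io.ObjectStreamClass API; the class name SerialVersionUidSketch and the helper uidOf are hypothetical. Running it on each JDK and comparing the printed output would show whether the serialized form of, for example, ArrayList differs.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.util.ArrayList;

class SerialVersionUidSketch {

    // Returns the serialVersionUID the JDK uses for the given serializable class.
    static long uidOf(Class<?> cls) {
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            // lookup returns null for classes that are not serializable
            throw new IllegalArgumentException(cls.getName() + " is not serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        Class<?> cls = ArrayList.class;
        System.out.println(cls.getName() + " serialVersionUID = " + uidOf(cls));

        // Transient fields are excluded from the default serialized form,
        // so they do not appear in this list.
        for (ObjectStreamField field : ObjectStreamClass.lookup(cls).getFields()) {
            System.out.println("  serializable field: " + field.getName());
        }
    }
}
```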

@ziyilin
Collaborator Author

ziyilin commented Apr 27, 2021

@peter-hofer Thanks for explaining. In that case, I don't need to worry about the compatibility problem.


Successfully merging this pull request may close these issues.

[native-image] UnsupportedFeatureError: ObjectOutputStream.writeObject()
9 participants