Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add classloader parameter to XmlFactory #480

Closed
wants to merge 1 commit into from
Closed

Conversation

timja
Copy link

@timja timja commented Jul 3, 2021

Allow setting a custom classloader rather than relying on the thread context classloader for resolving the XMLInputFactory

What do you think?

Background:

Azure/azure-sdk-for-java#22764
jenkinsci/jackson2-api-plugin#82 (comment)

if (cl != null) {
xmlIn = XMLInputFactory.newFactory(XMLInputFactory.class.getName(), cl);
} else {
xmlIn = XMLInputFactory.newInstance();
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is needed, it maintains compatibility fully, but I don't think the fallback functionality actually works for jackson, given jenkinsci/azure-storage-plugin#194 shows that the default jdk implementation doesn't work here

java.lang.UnsupportedOperationException: Cannot create XMLStreamReader or XMLEventReader from a org.codehaus.stax2.io.Stax2ByteArraySource
	at java.xml/com.sun.xml.internal.stream.XMLInputFactoryImpl.jaxpSourcetoXMLInputSource(XMLInputFactoryImpl.java:302)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the fallback functionality does work for the majority of Jackson users, where the thread's context class loader happens to include the Woodstox implementation. But it doesn't work for Jenkins plugins when Woodstox is removed from Jenkins core, because in that situation the plugin's class loader has access to Woodstox but core's class loader (which is the thread's context class loader) does not.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Jackson simply default to passing in its own class loader, rather than using the thread CCL? Would seem to make it work the way you expect without being told. Such a change might be incompatible in weird cases however. Can it detect that it is getting an incompatible implementation from the JRE and automatically switch to its own CCL if so?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can Jackson simply default to passing in its own class loader, rather than using the thread CCL? Would seem to make it work the way you expect without being told. Such a change might be incompatible in weird cases however.

Agreed on both counts.

Can it detect that it is getting an incompatible implementation from the JRE and automatically switch to its own CCL if so?

We could check to see if it is the known-incompatible com.sun.xml.internal.stream.XMLInputFactoryImpl I suppose. But we probably couldn't do that with instanceof because it's an implementation-specific class and might not be present in all Java implementations. So you'd likely have to do string comparison on .getClass().getName() which isn't the most elegant.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given https://github.com/FasterXML/jackson-dataformat-xml/#maven-dependency and

<!-- and a Stax impl is needed: SJSXP (from JDK 1.6) might work, but always
has odd issues. Let's default to Woodstox: caller can upgrade to Aalto
(needs to block this dep)
-->
<!-- 09-May-2016, tatu: With Jackson 2.8, let's make this compile-dep to make it
less likely users accidentally try to use Sjsxp from JDK, which leads to probs
-->
<!-- 16-Jul-2016, tatu: For 2.10, need Woodstox 6.x to get module info -->
<dependency>
<groupId>com.fasterxml.woodstox</groupId>
<artifactId>woodstox-core</artifactId>
<version>6.2.6</version>
<exclusions>
<exclusion>
<groupId>javax.xml.stream</groupId>
<artifactId>stax-api</artifactId>
</exclusion>
</exclusions>
</dependency>
it seems like “strongly encouraging” use of Woodstox from the same class loader as Jackson would be reasonable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like “strongly encouraging” use of Woodstox from the same class loader as Jackson would be reasonable.

If we wanted to "strongly encourage" use of the same class loader as Jackson, this would do the trick (tested, works):

diff --git a/src/main/java/com/fasterxml/jackson/dataformat/xml/XmlFactory.java b/src/main/java/com/fasterxml/jackson/dataformat/xml/XmlFactory.java
index f9391d8d..97a8831a 100644
--- a/src/main/java/com/fasterxml/jackson/dataformat/xml/XmlFactory.java
+++ b/src/main/java/com/fasterxml/jackson/dataformat/xml/XmlFactory.java
@@ -110,14 +110,22 @@ public class XmlFactory extends JsonFactory
         _xmlGeneratorFeatures = xgFeatures;
         _cfgNameForTextElement = nameForTextElem;
         if (xmlIn == null) {
-            xmlIn = XMLInputFactory.newInstance();
+            try {
+                xmlIn = XMLInputFactory.newFactory(XMLInputFactory.class.getName(), XmlFactory.class.getClassLoader());
+            } catch (FactoryConfigurationError e) {
+                xmlIn = XMLInputFactory.newInstance();
+            }
             // as per [dataformat-xml#190], disable external entity expansion by default
             xmlIn.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, Boolean.FALSE);
             // and ditto wrt [dataformat-xml#211], SUPPORT_DTD
             xmlIn.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
         }
         if (xmlOut == null) {
-            xmlOut = XMLOutputFactory.newInstance();
+            try {
+                xmlOut = XMLOutputFactory.newFactory(XMLOutputFactory.class.getName(), XmlFactory.class.getClassLoader());
+            } catch (FactoryConfigurationError e) {
+                xmlOut = XMLOutputFactory.newInstance();
+            }
         }
         _initFactories(xmlIn, xmlOut);
         _xmlInputFactory = xmlIn;

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cowtowncoder Did you have an opinion about the approach described here: "strongly encouraging", as Jesse put it, the use of these classes from XmlFactory's class loader rather than the thread's current context class loader? I tested that in the Jenkins use case this would avoid SJSXP completely in favor of Woodstox. This seems more consistent to me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@basil I wish I had an opinion here: I can see why it'd make sense but feel I may not have the full picture of possible downsides. My only concern would be that of breaking someone's existing usage somewhere due to changing resolution (I also realize it is obv impossible to know 100% sure in advance, and like Jesse said, there may be weird edge cases).

How about this: I'll create a new issue for proposed change (linking back to this PR) and we can discuss that change there? If anyone has a link/links to article describing the usual logic to determine between context classloader vs classes own, in context of libraries, that'd be great.

{
this(oc, DEFAULT_XML_PARSER_FEATURE_FLAGS, DEFAULT_XML_GENERATOR_FEATURE_FLAGS,
xmlIn, xmlOut, null);
xmlIn, xmlOut, null, null);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be plumbing through cl to the protected method rather than passing in null.

}

public XmlFactory(ObjectCodec oc, XMLInputFactory xmlIn, XMLOutputFactory xmlOut)
public XmlFactory(ObjectCodec oc, XMLInputFactory xmlIn, XMLOutputFactory xmlOut, ClassLoader cl)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically speaking this removes a public API; not sure if this was intentional. Maybe you wanted to preserve the existing public XmlFactory(ObjectCodec, XMLInputFactory, XMLOutputFactory) constructor and add a new one?

{
super(oc);
_xmlParserFeatures = xpFeatures;
_xmlGeneratorFeatures = xgFeatures;
_cfgNameForTextElement = nameForTextElem;
if (xmlIn == null) {
xmlIn = XMLInputFactory.newInstance();
if (cl != null) {
Copy link

@basil basil Jul 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, I think we'd want to do the same thing for XMLOutputFactory.newInstance a few lines below on line 128.

Copy link

@jglick jglick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like there is an incompatible signature change here.

@jglick
Copy link

jglick commented Jul 3, 2021

Please test whether #481 fixes the issue in Jenkins.

@basil
Copy link

basil commented Jul 3, 2021

Please test whether #481 fixes the issue in Jenkins.

Yes it does. Tested with the following diff to jackson2-api-plugin:

diff --git a/pom.xml b/pom.xml
index 4eba2d1..31130c2 100644
--- a/pom.xml
+++ b/pom.xml
@@ -67,9 +67,9 @@
     <revision>2.12.4</revision>
     <changelist>-SNAPSHOT</changelist>
     <java.level>8</java.level>
-    <jenkins.version>2.222.4</jenkins.version>
+    <jenkins.version>2.300</jenkins.version>
     <jackson.version>2.12.3</jackson.version>
-    <jackson-databind.version>${jackson.version}</jackson-databind.version>
+    <jackson-databind.version>2.13.0-rc1-SNAPSHOT</jackson-databind.version>
   </properties>
 
   <repositories>
@@ -89,8 +89,8 @@
     <dependencies>
       <dependency>
         <groupId>io.jenkins.tools.bom</groupId>
-        <artifactId>bom-2.222.x</artifactId>
-        <version>807.v6d348e44c987</version>
+        <artifactId>bom-2.289.x</artifactId>
+        <version>887.vae9c8ac09ff7</version>
         <scope>import</scope>
         <type>pom</type>
       </dependency>
@@ -131,13 +131,13 @@
     <dependency>
       <groupId>com.fasterxml.jackson.core</groupId>
       <artifactId>jackson-core</artifactId>
-      <version>${jackson.version}</version>
+      <version>2.13.0-rc1-SNAPSHOT</version>
     </dependency>
 
     <dependency>
       <groupId>com.fasterxml.jackson.core</groupId>
       <artifactId>jackson-annotations</artifactId>
-      <version>${jackson.version}</version>
+      <version>2.13.0-rc1-SNAPSHOT</version>
     </dependency>
 
     <dependency>
@@ -149,7 +149,7 @@
     <dependency>
       <groupId>com.fasterxml.jackson.dataformat</groupId>
       <artifactId>jackson-dataformat-xml</artifactId>
-      <version>${jackson.version}</version>
+      <version>2.13.0-rc1-SNAPSHOT</version>
     </dependency>
 
     <dependency>
diff --git a/src/test/java/com/fasterxml/jackson/dataformat/xml/XmlMapperTest.java b/src/test/java/com/fasterxml/jackson/dataformat/xml/XmlMapperTest.java
index a864ecd..825d5a4 100644
--- a/src/test/java/com/fasterxml/jackson/dataformat/xml/XmlMapperTest.java
+++ b/src/test/java/com/fasterxml/jackson/dataformat/xml/XmlMapperTest.java
@@ -10,8 +10,6 @@ import org.jvnet.hudson.test.RealJenkinsRule;
 
 import java.nio.charset.StandardCharsets;
 
-import javax.xml.stream.XMLInputFactory;
-
 public class XmlMapperTest {
 
     @Rule public RealJenkinsRule rr = new RealJenkinsRule();
@@ -22,11 +20,7 @@ public class XmlMapperTest {
     }
 
     private static void _smokes(JenkinsRule r) throws Throwable {
-        XMLInputFactory inputFactory =
-                XMLInputFactory.newFactory(
-                        XMLInputFactory.class.getName(), XmlFactory.class.getClassLoader());
-        XmlFactory factory = new XmlFactory(inputFactory);
-        XmlMapper mapper = new XmlMapper(factory);
+        XmlMapper mapper = new XmlMapper();
         String content = "<foo><bar><id>123</id></bar></foo>";
         Foo foo = mapper.readValue(content.getBytes(StandardCharsets.UTF_8), Foo.class);
         assertNotNull(foo.getBar());

@timja
Copy link
Author

timja commented Jul 3, 2021

Shouldn’t your diff be passing a class loader @basil?

@basil
Copy link

basil commented Jul 3, 2021

Shouldn’t your diff be passing a class loader @basil?

No, I don't think so. Recall that I was testing #481, not this PR. :-)

@cowtowncoder
Copy link
Member

One quick note: any new configurability settings should go via XmlFactoryBuilder and not through XmlFactory constructor overloads. That's more of a technicality but should remove the need to add new constructor overloads.

Other than that I guess I better read the linked-to issues to have an opinion other than "do we really need to add classloader stuff" :)

@cowtowncoder
Copy link
Member

Ok... I see things get a bit hairy through the stack, SPI and so on. That makes sense I guess. There are couple of sort of orthogonal concerns, approaches and I am open to considering various ones.

First: although Woodstox definitely is the recommended Stax implementation (along with Aalto), it might be possible to resolve the specific issue of not being able to use Stax2ByteArraySource as input for SJSXP. This would likely be something I could even backport into older Jackson module version(s) (at least 2.12, possibly 2.11).
I am not sure if that would be good or bad in the sense that there may be further issues but wanted to mention it first.

Second: although I am fine adding ClassLoader argument to XmlFactoryBuilder -- and changing the logic to make XmlFactoryBuilder create instances so it need not pass CL at all -- I wonder how many valid options there are.
If this comes down to just 2 choices (context class loader vs... one XmlFactoryBuilder is loaded from?), we could probably have true/false setting instead, to reduce likelihood of issues.

Third: changing of the default logic sounds acceptable too (we have RC1 of 2.13.0 to sanity check changes), so even if a new option is exposed (to opt-in to old behavior, I assume, if that is rarely needed).
This would still be needed for common case of implicit or direct construction of XmlFactory (i.e. one that does not go through XmlFactoryBuilder) anyway.

Fourth: downstream frameworks may want to explicitly construct and pass XMLInputFactory and XMLOutputFactory, going forward. That won't solve all issues but may help avoid one error mode.

I hope above makes sense? Apologies if I missed something obvious.

Timeline-wise, I am hoping to get 2.13.0-rc1 out in 2-3 weeks, so timing here is quite good.

@cowtowncoder
Copy link
Member

cowtowncoder commented Jul 4, 2021

Just noticed there is #481, as an alternative. Will check it out.

EDIT: ended up writing #482 based on that, which should avoid the specific problem with non-stax2 implementations.
It may still be the case that default stax factory instantiation should use different ClassLoader, but I am bit wary of changing that -- feel free to convince me that change would still be useful.

cowtowncoder added a commit that referenced this pull request Jul 4, 2021
@timja
Copy link
Author

timja commented Jul 4, 2021

I think in general if you're using an SPI to resolve classes a classloader option should be exposed, e.g. see https://www.jenkins.io/doc/developer/plugin-development/dependencies-and-class-loading/#context-class-loaders

but given there's a work around of constructing your own instance it may not be needed.

I think I'll close this, and it can be revived if required.

@jglick may have an opinion on it

Thanks everyone for your work on this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants