Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-12327: Remove MethodHandle usage in CompressionType #10123

Merged
merged 5 commits into from
Feb 14, 2021

Conversation

ijuma
Copy link
Contributor

@ijuma ijuma commented Feb 13, 2021

We don't really need it and it causes problems in older Android versions
and GraalVM native image usage (there are workarounds for the latter).

Move the logic to separate classes that are only invoked when the
relevant compression library is actually used. Place such classes
in their own package and enforce via checkstyle that only these
classes refer to compression library packages.

To avoid cyclic dependencies, moved BufferSupplier to the utils
package.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

We don't really need it and it causes problems in older Android versions
and GraalVM usage.

Move the logic to separate classes that are only invoked when the
relevant compression library is actually used. Place such classes
in their own package and enforce via checkstyle that only these
classes refer to compression library packages.
@ijuma
Copy link
Contributor Author

ijuma commented Feb 13, 2021

@chia7712 Thoughts on this change?

Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does previous solution (dynamically find constructor) still load native library?

<allow pkg="net.jpountz.xxhash" />
<allow pkg="org.apache.kafka.common.compression" />
<allow pkg="org.xerial.snappy" />
<!-- For testing -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a bit weird since ZstdFactory does reference the org.apache.kafka.common.record.BufferSupplier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. It probably means we need to move BufferSupplier to another package to avoid circular dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this.

@@ -144,15 +155,14 @@
</subpackage>

<subpackage name="record">
<allow pkg="net.jpountz" />
<allow pkg="org.apache.kafka.common.compression" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we move CompressionType to org.apache.kafka.common.compression and then make both ZstdFactory and SnappyFactory be package-private?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly. Let me see what other impact it has.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This affects a lot of files and I'd rather do that in a separate PR as I may backport this one to stable branches.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make sense.

@ijuma
Copy link
Contributor Author

ijuma commented Feb 13, 2021

@chia7712 Both solutions should behave the same at runtime. At compile time, the previous solution did not require snappy libraries in the classpath (but it did require lz4 and zstd). So, overall, I think there is no difference that matters.

Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

BTW, is it an issue that kafka client running on either older Android versions or GraalVM can't consume compressed data from brokers? Maybe broker can decompress those data on server-side for those clients which can't support compression.

@ijuma
Copy link
Contributor Author

ijuma commented Feb 14, 2021

@chia7712 with these changes, older Android clients should be able to start (before they would fail when the method handles code was reached even if no compression was actually used). If compressed data is used, then it depends on whether the native libraries are available. If not, then it would still fail.

Today, the solution would be to either:

  1. Include the relevant native compression libraries
  2. Limit topic data to the compression algorithms supported on the relevant platform

Both seem doable.

@ijuma
Copy link
Contributor Author

ijuma commented Feb 14, 2021

JDK 15 build passed, JDK 8 had unrelated flaky failures and JDK 11 died abruptly.

@ijuma ijuma merged commit 744d05b into apache:trunk Feb 14, 2021
@ijuma ijuma deleted the remove-method-handle-compression-type branch February 14, 2021 16:12
@ijuma
Copy link
Contributor Author

ijuma commented Feb 14, 2021

Merged to trunk and 2.8.

ijuma added a commit that referenced this pull request Feb 14, 2021
We don't really need it and it causes problems in older Android versions
and GraalVM native image usage (there are workarounds for the latter).

Move the logic to separate classes that are only invoked when the
relevant compression library is actually used. Place such classes
in their own package and enforce via checkstyle that only these
classes refer to compression library packages.

To avoid cyclic dependencies, moved `BufferSupplier` to the `utils`
package.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>
ijuma added a commit to ijuma/kafka that referenced this pull request Feb 14, 2021
…e-allocations-lz4

* apache-github/trunk: (118 commits)
  KAFKA-12327: Remove MethodHandle usage in CompressionType (apache#10123)
KAFKA-12297: Make MockProducer return RecordMetadata with values as
per contract
  MINOR: Update zstd and use classes with no finalizers (apache#10120)
KAFKA-12326: Corrected regresion in MirrorMaker 2 executable
introduced with KAFKA-10021 (apache#10122)
KAFKA-12321 the comparison function for uuid type should be 'equals'
rather than '==' (apache#10098)
  MINOR: Add FetchSnapshot API doc in KafkaRaftClient (apache#10097)
  MINOR: KIP-631 KafkaConfig fixes and improvements (apache#10114)
  KAFKA-12272: Fix commit-interval metrics (apache#10102)
  MINOR: Improve confusing admin client shutdown logging (apache#10107)
  MINOR: Add BrokerMetadataListener (apache#10111)
  MINOR: Support Raft-based metadata quorums in system tests (apache#10093)
MINOR: add the MetaLogListener, LocalLogManager, and Controller
interface. (apache#10106)
  MINOR: Introduce the KIP-500 Broker lifecycle manager (apache#10095)
MINOR: Remove always-passing validation in
TestRecordTest#testProducerRecord (apache#9930)
KAFKA-5235: GetOffsetShell: Support for multiple topics and consumer
configuration override (KIP-635) (apache#9430)
MINOR: Prevent creating partition.metadata until ID can be written
(apache#10041)
  MINOR: Add RaftReplicaManager (apache#10069)
MINOR: Add ClientQuotaMetadataManager for processing QuotaRecord
(apache#10101)
  MINOR: Rename DecommissionBrokers to UnregisterBrokers (apache#10084)
MINOR: KafkaBroker.brokerState should be volatile instead of
AtomicReference (apache#10080)
  ...

clients/src/main/java/org/apache/kafka/common/record/CompressionType.java
core/src/test/scala/unit/kafka/coordinator/group/GroupMetadataManagerTest.scala
@chia7712
Copy link
Contributor

Today, the solution would be to either:
Include the relevant native compression libraries
Limit topic data to the compression algorithms supported on the relevant platform
Both seem doable.

@ijuma Thanks for the sharing. IIRC, kafka producer which does not support the compression can send uncompressed data to server. The data get compressed on server-side. It is a useful feature since the compression does not obstruct us from producing data on env which can't load native compression libraries. In contrast with producer, kafka consumer can't get data from compressed topic if it can't load native compression libraries. WDYT? Does it pay to support such scenario?

@ijuma
Copy link
Contributor Author

ijuma commented Feb 17, 2021

I'm not sure it's worth it since the cost of compressing on the broker during fetches is significantly higher than compressing during produce (the data is already on the heap for the latter and there are usually multiple fetches per produce). That is not to say that it's never useful, just that the ROI seems a bit low.

zakkak added a commit to zakkak/quarkus that referenced this pull request Apr 26, 2021
Kafka 2.8.0 removes the `MethodHandle` usage in `CompressionType` (see
apache/kafka#10123) removing
`SnappyConstructors` as well.  As a result the substitution is no
longer valid.

The update to Kafka 2.8.0 was done in
quarkusio@f8ce993

Fixes quarkusio#16721
luca-digrazia pushed a commit to luca-digrazia/DatasetCommitsDiffSearch that referenced this pull request Sep 4, 2022
    Kafka 2.8.0 removes the `MethodHandle` usage in `CompressionType` (see
    apache/kafka#10123) removing
    `SnappyConstructors` as well.  As a result the substitution is no
    longer valid.

    The update to Kafka 2.8.0 was done in
    quarkusio/quarkus@f8ce993

    Fixes #16721
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants