Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Allow non-padded base64 data to be decoded by decode_base64_field #27311

Merged
merged 2 commits into from
Aug 23, 2021

Conversation

michaelarnauts
Copy link
Contributor

@michaelarnauts michaelarnauts commented Aug 11, 2021

What does this PR do?

This change allows decoding of any raw base64 input strings that were previously encoded without standard padding character (=).

By stripping the padding, we can use base64.RawStdEncoding.DecodeString to decode the base64 string. This is easier then appending the padding characters.

Another attempt to fix this has been made in #25817, but that PR has been closed.

Why is it important?

When attempting to decode the payload (middle) section of a JWT token, it was discovered that the decode was failing, because padding characters are not included in a JWT token string. Padding is not required in base64, so it makes sense to allow to decode both unpadded and padded strings.

Currently, there is a workaround for some beats by appending the =-signs in a javascript processor, but that isn't available in all beats, and is a ugly workaround anyway. See https://medium.com/@guyromb/decode-jwt-traefik-access-logs-using-filebeat-95d935eb7c4f

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • [ ]

How to test this PR locally

Tests have been added that will test this behavior.

Related issues

Use cases

Screenshots

Logs

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Aug 11, 2021
@cla-checker-service
Copy link

cla-checker-service bot commented Aug 11, 2021

💚 CLA has been signed

@elasticmachine
Copy link
Collaborator

elasticmachine commented Aug 11, 2021

💔 Build Failed

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-08-21T08:13:30.607+0000

  • Duration: 99 min 8 sec

  • Commit: e4d4126

Test stats 🧪

Test Results
Failed 0
Passed 52574
Skipped 5133
Total 57707

Trends 🧪

Image of Build Times

Image of Tests

Steps errors 6

Expand to view the steps failures

filebeat-goIntegTest - mage goIntegTest
  • Took 5 min 4 sec . View more details on here
  • Description: mage goIntegTest
filebeat-windows-8-windows-8 - mage build unitTest
  • Took 5 min 18 sec . View more details on here
  • Description: mage build unitTest
filebeat-windows-8-windows-8 - mage build unitTest
  • Took 4 min 44 sec . View more details on here
  • Description: mage build unitTest
filebeat-windows-8-windows-8 - mage build unitTest
  • Took 4 min 46 sec . View more details on here
  • Description: mage build unitTest
gsutil -m -q cp -a public-read test-build-artifacts-filebeat-windows-8-windows-8-tgz gs://beats-ci-t
  • Took 0 min 3 sec . View more details on here
  • Description: @echo off gsutil -m -q cp -a public-read test-build-artifacts-filebeat-windows-8-windows-8-tgz gs://beats-ci-temp/Beats/beats/PR-27311-6
Error signal
  • Took 0 min 0 sec . View more details on here
  • Description: Error 'hudson.AbortException: script returned exit code 1'

Log output

Expand to view the last 100 lines of log output

[2021-08-21T09:49:02.331Z] 'gsutil' is not recognized as an internal or external command,
[2021-08-21T09:49:02.331Z] operable program or batch file.
[2021-08-21T09:49:02.454Z] Running in C:\Users\jenkins\workspace\PR-27311-6-919f8b9b-7134-44fd-9ccb-9a6baa9d89ef\src\github.com\elastic\beats\build@tmp
[2021-08-21T09:49:02.775Z] 
[2021-08-21T09:49:02.775Z] C:\Users\jenkins\workspace\PR-27311-6-919f8b9b-7134-44fd-9ccb-9a6baa9d89ef\src\github.com\elastic\beats\build@tmp>wget --version  1>NUL 
[2021-08-21T09:49:03.046Z] Masking supported pattern matches of $FILE_CREDENTIAL
[2021-08-21T09:49:03.114Z] 
[2021-08-21T09:49:03.114Z] C:\Users\jenkins\workspace\PR-27311-6-919f8b9b-7134-44fd-9ccb-9a6baa9d89ef\src\github.com\elastic\beats\build@tmp>wget -q -O gsutil.zip https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-319.0.0-windows-x86_64-bundled-python.zip 
[2021-08-21T09:49:03.358Z] + gcloud auth activate-service-account --key-file ****
[2021-08-21T09:49:03.928Z] Activated service account credentials for: [beats-ci-gcs-plugin@elastic-ci-prod.iam.gserviceaccount.com]
[2021-08-21T09:49:04.248Z] + gsutil -m -q cp eC1wYWNrL2ZpbGViZWF0LXdpbmRvd3MtMTAtd2luZG93cy0xMGU0ZDQxMjZjOGM5NTMzMzUwZWNiZmRmOTdlYTA5NWY1ZjZjZWZlYmQ gs://beats-ci-temp/ci/cache/
[2021-08-21T09:49:04.397Z] module\http\test_http.py s.                                              [ 52%]
[2021-08-21T09:49:04.397Z] module\jolokia\test_jolokia.py ssss                                      [ 53%]
[2021-08-21T09:49:04.397Z] module\kafka\test_kafka.py ssssssssssssssssssss                          [ 61%]
[2021-08-21T09:49:04.397Z] module\kibana\test_kibana.py ss                                          [ 62%]
[2021-08-21T09:49:04.397Z] module\logstash\test_logstash.py sss                                     [ 63%]
[2021-08-21T09:49:04.397Z] module\memcached\test_memcached.py s                                     [ 64%]
[2021-08-21T09:49:04.397Z] module\mongodb\test_mongodb.py s                                         [ 64%]
[2021-08-21T09:49:04.397Z] module\munin\test_munin.py s                                             [ 65%]
[2021-08-21T09:49:04.397Z] module\mysql\test_mysql.py sssssss                                       [ 68%]
[2021-08-21T09:49:04.397Z] module\nats\test_nats.py ssssssssssssssssss                              [ 75%]
[2021-08-21T09:49:04.397Z] module\openmetrics\test_openmetrics.py s                                 [ 75%]
[2021-08-21T09:49:04.397Z] module\php_fpm\test_phpfpm.py s                                          [ 76%]
[2021-08-21T09:49:04.397Z] module\postgresql\test_postgresql.py ssssssssssssssssssss                [ 84%]
[2021-08-21T09:49:04.397Z] module\prometheus\test_prometheus.py sss                                 [ 85%]
[2021-08-21T09:49:04.666Z] module\redis\test_redis.py ssssssssssss                                  [ 90%]
[2021-08-21T09:49:09.618Z] Extracting from C:\Users\jenkins\workspace\PR-27311-6-919f8b9b-7134-44fd-9ccb-9a6baa9d89ef\src\github.com\elastic\beats\build@tmp\gsutil.zip
[2021-08-21T09:49:22.806Z] Extracted: 15728 files
[2021-08-21T09:49:23.020Z] Masking supported pattern matches of %FILE_CREDENTIAL%
[2021-08-21T09:49:23.819Z] 
[2021-08-21T09:49:23.819Z] C:\Users\jenkins\workspace\PR-27311-6-919f8b9b-7134-44fd-9ccb-9a6baa9d89ef\src\github.com\elastic\beats\build>gcloud auth activate-service-account --key-file **** 
[2021-08-21T09:49:28.023Z] Activated service account credentials for: [beats-ci-gcs-plugin@elastic-ci-prod.iam.gserviceaccount.com]
[2021-08-21T09:49:31.368Z] CommandException: No URLs matched: test-build-artifacts-filebeat-windows-8-windows-8-tgz
[2021-08-21T09:49:31.369Z] CommandException: 1 file/object could not be transferred.
[2021-08-21T09:49:31.479Z] ERROR: script returned exit code 1
[2021-08-21T09:49:31.813Z] 
[2021-08-21T09:49:31.813Z] C:\Users\jenkins\workspace\PR-27311-6-919f8b9b-7134-44fd-9ccb-9a6baa9d89ef\src\github.com\elastic\beats>go clean -modcache 
[2021-08-21T09:50:29.420Z] Failed in branch filebeat-windows-8-windows-8
[2021-08-21T09:51:01.968Z] module\system\test_system.py .....s..s...s.s.                            [ 96%]
[2021-08-21T09:51:01.968Z] module\traefik\test_traefik.py s                                         [ 97%]
[2021-08-21T09:51:01.968Z] module\uwsgi\test_uwsgi.py ss                                            [ 97%]
[2021-08-21T09:51:01.968Z] module\vsphere\test_vsphere.py sss                                       [ 99%]
[2021-08-21T09:51:01.968Z] module\zookeeper\test_zookeeper.py ss                                    [100%]
[2021-08-21T09:51:01.968Z] 
[2021-08-21T09:51:01.968Z] ============================== warnings summary ===============================
[2021-08-21T09:51:01.968Z] tests\system\test_lightmodules.py:57
[2021-08-21T09:51:01.968Z]   C:\Users\jenkins\workspace\PR-27311-6-d195dc3e-a21d-4772-bd10-43d9f6ad0d3f\src\github.com\elastic\beats\metricbeat\tests\system\test_lightmodules.py:57: PytestCollectionWarning: cannot collect test class 'TestHTTPHandler' because it has a __init__ constructor (from: metricbeat/tests/system/test_lightmodules.py)
[2021-08-21T09:51:01.968Z]     class TestHTTPHandler(http.server.BaseHTTPRequestHandler):
[2021-08-21T09:51:01.968Z] 
[2021-08-21T09:51:01.968Z] -- Docs: https://docs.pytest.org/en/stable/warnings.html
[2021-08-21T09:51:01.968Z] - generated xml file: C:\Users\jenkins\workspace\PR-27311-6-d195dc3e-a21d-4772-bd10-43d9f6ad0d3f\src\github.com\elastic\beats\metricbeat\build\TEST-python-unit.xml -
[2021-08-21T09:51:01.968Z] ============================ slowest 20 durations =============================
[2021-08-21T09:51:01.968Z] 16.98s call     metricbeat/tests/system/test_cmd.py::TestCommands::test_modules_enable
[2021-08-21T09:51:01.968Z] 16.95s call     metricbeat/tests/system/test_cmd.py::TestCommands::test_modules_disable
[2021-08-21T09:51:01.968Z] 16.94s call     metricbeat/tests/system/test_cmd.py::TestCommands::test_modules_list
[2021-08-21T09:51:01.968Z] 14.17s call     metricbeat/tests/system/test_config.py::ConfigTest::test_service_name
[2021-08-21T09:51:01.968Z] 12.57s call     metricbeat/tests/system/test_reload.py::Test::test_start_stop
[2021-08-21T09:51:01.968Z] 11.13s call     metricbeat/module/system/test_system.py::Test::test_process
[2021-08-21T09:51:01.968Z] 10.82s call     metricbeat/tests/system/test_cmd.py::TestCommands::test_modules_test_error
[2021-08-21T09:51:01.968Z] 10.69s call     metricbeat/tests/system/test_processors.py::Test::test_dropevent_with_condition
[2021-08-21T09:51:01.968Z] 9.61s call     metricbeat/tests/system/test_reload.py::Test::test_reload
[2021-08-21T09:51:01.968Z] 9.19s call     metricbeat/tests/system/test_processors.py::Test::test_multiple_actions
[2021-08-21T09:51:01.968Z] 9.15s call     metricbeat/tests/system/test_processors.py::Test::test_dropevent_with_complex_condition
[2021-08-21T09:51:01.968Z] 9.11s call     metricbeat/tests/system/test_processors.py::Test::test_dropfields_with_condition
[2021-08-21T09:51:01.968Z] 9.09s call     metricbeat/tests/system/test_processors.py::Test::test_contradictory_multiple_actions
[2021-08-21T09:51:01.969Z] 9.08s call     metricbeat/tests/system/test_processors.py::Test::test_include_fields
[2021-08-21T09:51:01.969Z] 8.96s call     metricbeat/tests/system/test_base.py::Test::test_export_index_pattern_migration
[2021-08-21T09:51:01.969Z] 8.95s call     metricbeat/tests/system/test_timeseries.py::TestTimeseries::test_enable_timeseries
[2021-08-21T09:51:01.969Z] 8.94s call     metricbeat/tests/system/test_base.py::Test::test_export_index_pattern
[2021-08-21T09:51:01.969Z] 8.88s call     metricbeat/tests/system/test_base.py::Test::test_export_config
[2021-08-21T09:51:01.969Z] 8.88s call     metricbeat/module/http/test_http.py::Test::test_server
[2021-08-21T09:51:01.969Z] 8.85s call     metricbeat/tests/system/test_base.py::Test::test_export_template
[2021-08-21T09:51:01.969Z] =========== 40 passed, 207 skipped, 1 warning in 403.34s (0:06:43) ============
[2021-08-21T09:51:01.969Z] >> python test: Unit Testing Complete
[2021-08-21T09:51:02.370Z] 
[2021-08-21T09:51:02.370Z] C:\Users\jenkins\workspace\PR-27311-6-d195dc3e-a21d-4772-bd10-43d9f6ad0d3f\src\github.com\elastic\beats>FOR / %d IN ("ve") DO @IF EXIST "%d" rmdir /s /q "%d" 
[2021-08-21T09:51:17.101Z] 
[2021-08-21T09:51:17.101Z] C:\Users\jenkins\workspace\PR-27311-6-d195dc3e-a21d-4772-bd10-43d9f6ad0d3f\src\github.com\elastic\beats>python .ci/scripts/pre_archive_test.py 
[2021-08-21T09:51:18.485Z] Copy .\metricbeat\build into build\metricbeat\build
[2021-08-21T09:51:18.485Z] Copy .\metricbeat\null\build into build\metricbeat\null\build
[2021-08-21T09:51:18.505Z] Running in C:\Users\jenkins\workspace\PR-27311-6-d195dc3e-a21d-4772-bd10-43d9f6ad0d3f\src\github.com\elastic\beats\build
[2021-08-21T09:51:18.523Z] Recording test results
[2021-08-21T09:51:22.368Z] [Checks API] No suitable checks publisher found.
[2021-08-21T09:51:22.736Z] 
[2021-08-21T09:51:22.736Z] C:\Users\jenkins\workspace\PR-27311-6-d195dc3e-a21d-4772-bd10-43d9f6ad0d3f\src\github.com\elastic\beats>go clean -modcache 
[2021-08-21T09:52:32.231Z] + gsutil --version
[2021-08-21T09:52:33.651Z] Masking supported pattern matches of $FILE_CREDENTIAL
[2021-08-21T09:52:33.963Z] + gcloud auth activate-service-account --key-file ****
[2021-08-21T09:52:34.532Z] Activated service account credentials for: [beats-ci-gcs-plugin@elastic-ci-prod.iam.gserviceaccount.com]
[2021-08-21T09:52:34.846Z] + gsutil -m -q cp bWV0cmljYmVhdC13aW5kb3dzLTEwLXdpbmRvd3MtMTBlNGQ0MTI2YzhjOTUzMzM1MGVjYmZkZjk3ZWEwOTVmNWY2Y2VmZWJk gs://beats-ci-temp/ci/cache/
[2021-08-21T09:52:36.383Z] Stage "Packaging" skipped due to earlier failure(s)
[2021-08-21T09:52:36.426Z] Stage "Packaging-Pipeline" skipped due to earlier failure(s)
[2021-08-21T09:52:36.497Z] Running in /var/lib/jenkins/workspace/Beats_beats_PR-27311/src/github.com/elastic/beats
[2021-08-21T09:52:37.690Z] Running on Jenkins in /var/lib/jenkins/workspace/Beats_beats_PR-27311
[2021-08-21T09:52:37.749Z] [INFO] getVaultSecret: Getting secrets
[2021-08-21T09:52:37.804Z] Masking supported pattern matches of $VAULT_ADDR or $VAULT_ROLE_ID or $VAULT_SECRET_ID
[2021-08-21T09:52:38.532Z] + chmod 755 generate-build-data.sh
[2021-08-21T09:52:38.532Z] + ./generate-build-data.sh https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-27311/ https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-27311/runs/6 FAILURE 5947656
[2021-08-21T09:52:38.532Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-27311/runs/6/steps/?limit=10000 -o steps-info.json
[2021-08-21T09:52:50.629Z] INFO: curl https://beats-ci.elastic.co/blue/rest/organizations/jenkins/pipelines/Beats/beats/PR-27311/runs/6/tests/?status=FAILED -o tests-errors.json

💚 Flaky test report

Tests succeeded.

Expand to view the summary

Test stats 🧪

Test Results
Failed 0
Passed 52574
Skipped 5133
Total 57707

@michaelarnauts
Copy link
Contributor Author

x Author of the following commits did not sign a Contributor Agreement:
0f25a96

Please, read and sign the above mentioned agreement if you want to contribute to this project

The CLA has been signed now.

@mergify
Copy link
Contributor

mergify bot commented Aug 16, 2021

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b base64_unpadded upstream/base64_unpadded
git merge upstream/master
git push upstream base64_unpadded

@michaelarnauts
Copy link
Contributor Author

Sure, I can fix the PR easily if somebody from the team could indicate if this is an accepted fix.

@jsoriano
Copy link
Member

/test

@jsoriano jsoriano added the Team:Elastic-Agent Label for the Agent team label Aug 20, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/agent (Team:Agent)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Aug 20, 2021
@jsoriano
Copy link
Member

Hey @michaelarnauts, thanks for your contribution!

I think the fix looks good, but not sure about the implications of decoding unpadded base64 data (if any), maybe @adriansr or @kvch can review or give their opinions? Thanks!

@adriansr
Copy link
Contributor

I've moved the changelog entry from Breaking Changes to Added as I don't think anyone could have been taking advantage of the fact that this processor failed for non-padded strings. Also the documentation didn't make any claims regarding padding. For the same reason it's also not a bug fix.

@adriansr
Copy link
Contributor

/test

@michaelarnauts
Copy link
Contributor Author

I've moved the changelog entry from Breaking Changes to Added as I don't think anyone could have been taking advantage of the fact that this processor failed for non-padded strings. Also the documentation didn't make any claims regarding padding. For the same reason it's also not a bug fix.

Sorry, I didn't mean to put it in a breaking change.

Not sure if it should be called a feature though, especially since the docs didn't indicate that the base64 strings should be padded. I assumed it would support all variants. I don't know what the policy of backporting to a point release is, and if that depends if it's a bug or a feature :)

@kvch
Copy link
Contributor

kvch commented Aug 23, 2021

I would consider it a feature. With this change, Beats can process non-padded base64 data. 🎉

@kvch
Copy link
Contributor

kvch commented Aug 23, 2021

Failing tests are unrelated.

@kvch kvch added the backport-v7.16.0 Automated backport with mergify label Aug 23, 2021
@kvch kvch merged commit 9c4f7f9 into elastic:master Aug 23, 2021
@kvch
Copy link
Contributor

kvch commented Aug 23, 2021

@michaelarnauts Thank you!

mergify bot pushed a commit that referenced this pull request Aug 23, 2021
…7311)

## What does this PR do?

This change allows the decoding of any raw base64 input strings that were previously encoded without standard padding character (`=`).

By stripping the padding, we can use `base64.RawStdEncoding.DecodeString` to decode the base64 string. This is easier than appending the padding characters.

Another attempt to fix this has been made in #25817, but that PR has been closed.

## Why is it important?

When attempting to decode the payload (middle) section of a JWT token, it was discovered that the decode was failing, because padding characters are not included in a JWT token string. Padding is not required in base64, so it makes sense to allow to decode both unpadded and padded strings.

Currently, there is a workaround for some beats by appending the `=`-signs in a javascript processor, but that isn't available in all beats and is an ugly workaround anyway. See https://medium.com/@guyromb/decode-jwt-traefik-access-logs-using-filebeat-95d935eb7c4f

Co-authored-by: Adrian Serrano <adrisr83@gmail.com>
(cherry picked from commit 9c4f7f9)
kvch pushed a commit that referenced this pull request Aug 24, 2021
…7311) (#27554)

## What does this PR do?

This change allows the decoding of any raw base64 input strings that were previously encoded without standard padding character (`=`).

By stripping the padding, we can use `base64.RawStdEncoding.DecodeString` to decode the base64 string. This is easier than appending the padding characters.

Another attempt to fix this has been made in #25817, but that PR has been closed.

## Why is it important?

When attempting to decode the payload (middle) section of a JWT token, it was discovered that the decode was failing, because padding characters are not included in a JWT token string. Padding is not required in base64, so it makes sense to allow to decode both unpadded and padded strings.

Currently, there is a workaround for some beats by appending the `=`-signs in a javascript processor, but that isn't available in all beats and is an ugly workaround anyway. See https://medium.com/@guyromb/decode-jwt-traefik-access-logs-using-filebeat-95d935eb7c4f

Co-authored-by: Adrian Serrano <adrisr83@gmail.com>
(cherry picked from commit 9c4f7f9)

Co-authored-by: Michaël Arnauts <michael.arnauts@gmail.com>
Icedroid pushed a commit to Icedroid/beats that referenced this pull request Nov 1, 2021
…astic#27311)

## What does this PR do?

This change allows the decoding of any raw base64 input strings that were previously encoded without standard padding character (`=`).

By stripping the padding, we can use `base64.RawStdEncoding.DecodeString` to decode the base64 string. This is easier than appending the padding characters.

Another attempt to fix this has been made in elastic#25817, but that PR has been closed.

## Why is it important?

When attempting to decode the payload (middle) section of a JWT token, it was discovered that the decode was failing, because padding characters are not included in a JWT token string. Padding is not required in base64, so it makes sense to allow to decode both unpadded and padded strings.

Currently, there is a workaround for some beats by appending the `=`-signs in a javascript processor, but that isn't available in all beats and is an ugly workaround anyway. See https://medium.com/@guyromb/decode-jwt-traefik-access-logs-using-filebeat-95d935eb7c4f

Co-authored-by: Adrian Serrano <adrisr83@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v7.16.0 Automated backport with mergify enhancement review Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

decode_base64_field doesn't allow to decode unpadded base64 strings
5 participants