Refactor performance tests: capture memory usage, add README #2455

Merged (1 commit) on Jul 7, 2023
4 changes: 1 addition & 3 deletions .github/workflows/integration-tests.yaml
@@ -23,7 +23,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Set up tools
run: |
# Install ginkgo version from go.mod
@@ -39,8 +39,6 @@ jobs:
- name: Run e2e tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CONFORMANCE: true
4 changes: 1 addition & 3 deletions .github/workflows/nightly-cron-tests.yaml
@@ -22,7 +22,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Set up tools
run: |
# Install ginkgo version from go.mod
@@ -38,8 +38,6 @@ jobs:
- name: Run e2e tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CONFORMANCE: true
4 changes: 2 additions & 2 deletions .github/workflows/pr-automated-tests.yaml
@@ -19,7 +19,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Set up tools
run: |
go install golang.org/x/lint/golint@latest
@@ -49,7 +49,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Build CNI images
run: make multi-arch-cni-build
- name: Build CNI Init images
5 changes: 1 addition & 4 deletions .github/workflows/pr-manual-tests.yaml
@@ -29,7 +29,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Set up tools
run: |
# Install ginkgo version from go.mod
@@ -45,11 +45,8 @@ jobs:
- name: Run e2e tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CONFORMANCE: true
RUN_INTEGRATION_DEFAULT_CNI: false
run: |
./scripts/run-integration-tests.sh
2 changes: 1 addition & 1 deletion .github/workflows/release.yaml
@@ -22,7 +22,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Generate CNI YAML
run: make generate-cni-yaml
- name: Create eks-charts PR
18 changes: 5 additions & 13 deletions .github/workflows/weekly-cron-tests.yaml
@@ -23,7 +23,7 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v3
with:
go-version: "1.19"
go-version: "1.20"
- name: Set up tools
run: |
# Install ginkgo version from go.mod
@@ -39,55 +39,47 @@ jobs:
- name: Run perf tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CNI_INTEGRATION_TESTS: false
PERFORMANCE_TEST_S3_BUCKET_NAME: cni-performance-tests
RUN_PERFORMANCE_TESTS: true
RUN_TESTER_LB_ADDONS: true
RUN_INTEGRATION_DEFAULT_CNI: false
run: |
./scripts/run-integration-tests.sh
- name: Run kops tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CNI_INTEGRATION_TESTS: false
RUN_KOPS_TEST: true
RUN_TESTER_LB_ADDONS: true
K8S_VERSION: 1.26.5
KOPS_VERSION: v1.26.4
RUN_INTEGRATION_DEFAULT_CNI: false
run: |
./scripts/run-integration-tests.sh
if: always()
- name: Run bottlerocket tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CNI_INTEGRATION_TESTS: false
RUN_BOTTLEROCKET_TEST: true
RUN_TESTER_LB_ADDONS: true
RUN_INTEGRATION_DEFAULT_CNI: false
run: |
./scripts/run-integration-tests.sh
if: always()
- name: Run calico tests
env:
DISABLE_PROMPT: true
S3_BUCKET_CREATE: false
Member:

What's the motivation of these changes, especially the removal of S3 bucket creates and setting RUN_INTEGRATION_DEFAULT_CNI to false in the weekly cron?

Contributor Author:

S3_BUCKET_CREATE and S3_BUCKET_NAME were unused variables that appear to have been missed in a previous cleanup, so I removed them here.

For RUN_INTEGRATION_DEFAULT_CNI, we were running the CNI integration tests with every invocation of run-integration-tests. For the weekly tests, that means we were running them 4 times with no real benefit. Since they already run nightly as part of the nightly-cron-tests, I concluded they should be removed from the weekly tests. This should drop our weekly test runtime by about 1.5 hours, as well.

S3_BUCKET_NAME: ${{ secrets.S3_BUCKET_NAME }}
ROLE_CREATE: false
ROLE_ARN: ${{ secrets.EKS_CLUSTER_ROLE_ARN }}
RUN_CNI_INTEGRATION_TESTS: false
RUN_CALICO_TEST: true
RUN_LATEST_CALICO_VERSION: true
RUN_TESTER_LB_ADDONS: true
RUN_INTEGRATION_DEFAULT_CNI: false
run: |
./scripts/run-integration-tests.sh
if: always()
53 changes: 53 additions & 0 deletions scripts/README.md
@@ -0,0 +1,53 @@
## Integration Test Scripts

This package contains shell scripts and libraries used for running integration tests.
This README covers the prerequisites and instructions for running the scripts.

### run-integration-tests.sh

`run-integration-tests.sh` can run several integration test suites against the current revision of the repository in the directory it is invoked from.

#### Prerequisites:
1. Valid AWS credentials for an account capable of creating EKS clusters
2. Docker installed and able to publish to an ECR repository in your account that can store test images
(run `aws ecr get-login-password --region ${REGION} | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com` to log into Docker)
3. Repositories in your ECR named `amazon-vpc-cni` and `amazon-vpc-init`
4. For performance tests, an S3 bucket in your account to store test results. The name is passed in `PERFORMANCE_TEST_S3_BUCKET_NAME`
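
The prerequisites above could be checked before a run with a small preflight script. The following is a minimal sketch, not part of this PR: the function names `missing_tools`, `ecr_registry`, and `perf_bucket_ok` are hypothetical, and the registry hostname assumes the standard private ECR naming scheme.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of the repository.

# Print the name of every requested tool that is not on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Build the private ECR registry hostname for an account ID and a region.
ecr_registry() {
  local account_id="$1" region="$2"
  echo "${account_id}.dkr.ecr.${region}.amazonaws.com"
}

# Return non-zero if performance tests are requested without a results bucket.
perf_bucket_ok() {
  if [ "${RUN_PERFORMANCE_TESTS:-false}" = true ] && [ -z "${PERFORMANCE_TEST_S3_BUCKET_NAME:-}" ]; then
    echo "PERFORMANCE_TEST_S3_BUCKET_NAME must be set for performance tests" >&2
    return 1
  fi
  return 0
}
```

For example, `missing_tools aws docker` prints nothing when both CLIs are installed, and `ecr_registry "$ACCOUNT_ID" "$REGION"` yields the hostname to pass to `docker login`.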

#### Tests
The following test suites are available. Set the corresponding environment variable to `true` to run a suite:
1. CNI Integration Tests - `RUN_CNI_INTEGRATION_TESTS`
2. Calico Tests - `RUN_CALICO_TEST`
3. Conformance Tests - `RUN_CONFORMANCE`
4. Performance Tests - `RUN_PERFORMANCE_TESTS`
5. KOPS Tests - `RUN_KOPS_TEST`
6. Bottlerocket Tests - `RUN_BOTTLEROCKET_TEST`

Example for running performance tests:
```
RUN_CNI_INTEGRATION_TESTS=false RUN_PERFORMANCE_TESTS=true PERFORMANCE_TEST_S3_BUCKET_NAME=cniperftests ./scripts/run-integration-tests.sh
```
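
To see which suites a given combination of variables would select, a helper along these lines may be useful. This is a sketch under assumptions: `enabled_suites` is a hypothetical function, and the defaults (CNI integration tests on, everything else off) are inferred from the example above rather than confirmed by the script.

```shell
#!/usr/bin/env bash
# Sketch only: enabled_suites is hypothetical, and the defaults used here
# (CNI integration on, all other suites off) are assumptions.

# Print a space-separated list of the suites whose toggle is set to "true".
enabled_suites() {
  local suites=""
  if [ "${RUN_CNI_INTEGRATION_TESTS:-true}" = true ]; then suites="$suites cni-integration"; fi
  if [ "${RUN_CALICO_TEST:-false}" = true ]; then suites="$suites calico"; fi
  if [ "${RUN_CONFORMANCE:-false}" = true ]; then suites="$suites conformance"; fi
  if [ "${RUN_PERFORMANCE_TESTS:-false}" = true ]; then suites="$suites performance"; fi
  if [ "${RUN_KOPS_TEST:-false}" = true ]; then suites="$suites kops"; fi
  if [ "${RUN_BOTTLEROCKET_TEST:-false}" = true ]; then suites="$suites bottlerocket"; fi
  # Strip the leading space left by the first append.
  echo "${suites# }"
}
```

With the variables from the performance example exported and the other toggles unset, `enabled_suites` would list only the performance suite.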

#### Other
`run-integration-tests.sh` creates a new cluster by default, based on the test(s) being run.
1. For KOPS tests, the `kops` binary will be used to create the cluster.
2. For others, the cluster will be created from a template in [scripts/test/config](https://github.com/aws/amazon-vpc-cni-k8s/tree/master/scripts/test/config)

Note that some tests create clusters with both ARM64 and x86-64 (AMD64) node groups, so test cases must pass on both. In particular, any image a test case pulls must be able to run on both architectures.

#### Manually running performance tests
The following steps cover how to manually run the performance tests:

1. Copy `scripts/test/config/perf-cluster.yml` to wherever you are driving the test from.
2. Get the AMI ID to use from `aws ssm get-parameter --name /aws/service/eks/optimized-ami/${EKS_CLUSTER_VERSION}/amazon-linux-2/recommended/image_id --region us-west-2 --query "Parameter.Value" --output text`
3. Set the following values in the template:
- Replace `CLUSTER_NAME_PLACEHOLDER` with a cluster name of your choice.
- Replace `managedNodeGroups.ami` with the AMI value derived above for your EKS version of choice.
- Replace `max-pods` and `managedNodeGroups.instanceType` with your values of choice.
4. Create the cluster with `eksctl create cluster -f $CLUSTER_CONFIG`, where `CLUSTER_CONFIG` is the template you modified above.
5. Deploy the cluster autoscaler with: `kubectl create -f scripts/test/config/cluster-autoscaler-autodiscover.yml`
6. Deploy the metrics server with: `kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml`
7. Apply the latest CNI manifest with `kubectl apply -f config/master/aws-vpc-cni.yaml`
8. Modify the init/main container image in the `aws-node` daemonset with your image of choice.
9. Deploy a performance deployment, e.g. `kubectl apply -f testdata/deploy-130-pods.yaml`
10. Collect statistics
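
Two of the manual steps above lend themselves to small helpers. The sketch below is illustrative only and not part of this PR: `render_cluster_config` covers the cluster-name substitution from step 3, and `sum_memory_mib` is one possible way to summarize memory for step 10. Both function names are hypothetical, and the README does not specify how statistics are collected.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical and not part of the repository.

# Step 3 (partial): copy the template and substitute the cluster name.
# The AMI, max-pods and instance type still need to be edited by hand.
# Assumes the cluster name contains no "/" characters.
render_cluster_config() {
  local template="$1" output="$2" cluster_name="$3"
  sed "s/CLUSTER_NAME_PLACEHOLDER/${cluster_name}/g" "$template" > "$output"
}

# Step 10: total the memory column of `kubectl top pod --no-headers` output
# read from stdin (lines such as "aws-node-abcde 3m 45Mi") and print MiB.
# Rows whose memory value is not expressed in Mi are ignored.
sum_memory_mib() {
  awk '$3 ~ /^[0-9]+Mi$/ { sub(/Mi$/, "", $3); total += $3 } END { print total + 0 }'
}
```

Assuming the metrics server from step 6 is running and the daemonset pods carry the `k8s-app=aws-node` label, usage could look like `kubectl top pod -n kube-system -l k8s-app=aws-node --no-headers | sum_memory_mib`.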
7 changes: 4 additions & 3 deletions scripts/lib/cluster.sh
@@ -28,7 +28,6 @@ function up-test-cluster() {
if [[ "$RUN_BOTTLEROCKET_TEST" == true ]]; then
echo "Copying bottlerocket config to $CLUSTER_CONFIG"
cp $CLUSTER_TEMPLATE_PATH/bottlerocket.yaml $CLUSTER_CONFIG

elif [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then
echo "Copying perf test cluster config to $CLUSTER_CONFIG"
cp $CLUSTER_TEMPLATE_PATH/perf-cluster.yml $CLUSTER_CONFIG
@@ -38,7 +37,6 @@ function up-test-cluster() {
grep -r -q $AMI_ID $CLUSTER_CONFIG
export RUN_CONFORMANCE="false"
: "${PERFORMANCE_TEST_S3_BUCKET_NAME:=""}"

else
echo "Copying test cluster config to $CLUSTER_CONFIG"
cp $CLUSTER_TEMPLATE_PATH/test-cluster.yaml $CLUSTER_CONFIG
@@ -57,7 +55,10 @@ function up-test-cluster() {
export KUBECONFIG=$KUBECONFIG_PATH

if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then
echo "Deploying cluster autoscaler"
kubectl create -f $DIR/test/config/cluster-autoscaler-autodiscover.yml
echo "Deploying metrics server"
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
fi
}

@@ -123,4 +124,4 @@ function down-kops-cluster {
$KOPS_BIN delete cluster --name ${CLUSTER_NAME} --yes
aws s3 rm ${KOPS_STATE_STORE} --recursive
aws s3 rb ${KOPS_STATE_STORE} --region $AWS_DEFAULT_REGION
}
}
5 changes: 2 additions & 3 deletions scripts/lib/common.sh
@@ -21,10 +21,9 @@ function display_timelines() {
echo ""
echo "Displaying all step durations."
echo "TIMELINE: Upping test cluster took $UP_CLUSTER_DURATION seconds."
if [[ $RUN_INTEGRATION_DEFAULT_CNI == true ]]; then
echo "TIMELINE: Default CNI integration tests took $DEFAULT_INTEGRATION_DURATION seconds."
if [[ "$RUN_CNI_INTEGRATION_TESTS" == true ]]; then
echo "TIMELINE: Current image integration tests took $CURRENT_IMAGE_INTEGRATION_DURATION seconds."
fi
echo "TIMELINE: Current image integration tests took $CURRENT_IMAGE_INTEGRATION_DURATION seconds."
if [[ "$RUN_CONFORMANCE" == true ]]; then
echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds."
fi