deploy/gcp: support zonal and multi-zonal cluster #809

Merged · 6 commits · Aug 26, 2019
Changes from 3 commits
13 changes: 13 additions & 0 deletions deploy/gcp/examples/multi-zonal.tfvars
@@ -0,0 +1,13 @@
#
# This will create a zonal cluster in zone us-central1-b with one additional zone.
# Worker nodes will be created in the primary zone us-central1-b and the additional zone us-central1-c.
#
gke_name = "multi-zonal"
vpc_name = "multi-zonal"
location = "us-central1-b"
pd_instance_type = "n1-standard-2"
tikv_instance_type = "n1-highmem-4"
tidb_instance_type = "n1-standard-8"
node_locations = [
"us-central1-c"
]
13 changes: 13 additions & 0 deletions deploy/gcp/examples/single-zonal.tfvars
@@ -0,0 +1,13 @@
#
# This will create a zonal cluster in zone us-central1-b without additional zones.
# Worker nodes will be created in a single zone only.
#
gke_name = "single-zonal"
vpc_name = "single-zonal"
location = "us-central1-b"
pd_instance_type = "n1-standard-2"
tikv_instance_type = "n1-highmem-4"
tidb_instance_type = "n1-standard-8"
pd_count = 3
tikv_count = 3
tidb_count = 3
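
For comparison, a hedged sketch of the remaining case that the new location and node_locations variables support: a regional cluster whose worker nodes are restricted to specific zones. This file is not part of the PR; the zone names and instance types below are illustrative assumptions.

#
# This would create a regional cluster in region us-west1.
# Worker nodes would be created only in zones us-west1-a and us-west1-b.
#
gke_name = "regional-restricted"
vpc_name = "regional-restricted"
location = "us-west1"
pd_instance_type = "n1-standard-2"
tikv_instance_type = "n1-highmem-4"
tidb_instance_type = "n1-standard-8"
node_locations = [
  "us-west1-a",
  "us-west1-b"
]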
25 changes: 25 additions & 0 deletions deploy/gcp/examples/tidb-customized.tfvars
@@ -0,0 +1,25 @@
pd_instance_type = "n1-standard-2"
tikv_instance_type = "n1-highmem-4"
tidb_instance_type = "n1-standard-8"

# specify tidb version
tidb_version = "3.0.2"

# override tidb cluster values
override_values = <<EOF
pd:
  hostNetwork: true
tikv:
  resources:
    requests:
      cpu: "1"
      memory: 1Gi
      storage: 10Gi
  hostNetwork: true
tidb:
  resources:
    requests:
      cpu: "1"
      memory: 1Gi
  hostNetwork: true
EOF
10 changes: 9 additions & 1 deletion deploy/gcp/main.tf
@@ -1,3 +1,8 @@
locals {
# Create a regional cluster in the current region by default.
location = var.location != "" ? var.location : var.GCP_REGION
}

provider "google" {
credentials = file(var.GCP_CREDENTIALS_PATH)
region = var.GCP_REGION
@@ -39,10 +44,13 @@ module "tidb-operator" {
vpc_name = var.vpc_name
subnetwork_name = module.vpc.private_subnetwork_name
gcp_project = var.GCP_PROJECT
gcp_region = var.GCP_REGION
gke_version = var.gke_version
location = local.location
node_locations = var.node_locations
kubeconfig_path = local.kubeconfig
tidb_operator_version = var.tidb_operator_version
maintenance_window_start_time = var.maintenance_window_start_time
operator_helm_values = var.operator_helm_values
}

module "bastion" {
5 changes: 5 additions & 0 deletions deploy/gcp/tidbclusters.tf
@@ -31,9 +31,14 @@ module "default-tidb-cluster" {
pd_instance_type = var.pd_instance_type
tikv_instance_type = var.tikv_instance_type
tidb_instance_type = var.tidb_instance_type
pd_image_type = var.pd_image_type
tikv_image_type = var.tikv_image_type
tidb_image_type = var.tidb_image_type
monitor_instance_type = var.monitor_instance_type
pd_node_count = var.pd_count
tikv_node_count = var.tikv_count
tidb_node_count = var.tidb_count
monitor_node_count = var.monitor_count
tikv_local_ssd_count = var.tikv_local_ssd_count
override_values = var.override_values
}
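
The image-type pass-throughs added above can be overridden from a tfvars file. A minimal hedged sketch (the values are assumptions; UBUNTU and COS are the options named in the variable descriptions):

pd_image_type = "UBUNTU"
tikv_image_type = "UBUNTU"
tidb_image_type = "UBUNTU"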
60 changes: 48 additions & 12 deletions deploy/gcp/variables.tf
@@ -10,6 +10,17 @@ variable "GCP_PROJECT" {
description = "The GCP project in which to create the necessary resources"
}

variable "location" {
description = "The GKE cluster location. If you specify a zone (such as us-central1-a), the cluster will be a zonal cluster with a single cluster master. If you specify a region (such as us-west1), the cluster will be a regional cluster with multiple masters spread across zones in the region. If not specified, the cluster will be a regional cluster in GCP_REGION."
type = string
}

variable "node_locations" {
description = "The list of zones in which the cluster's nodes should be located. These must be in the same region as the cluster zone for zonal clusters, or in the region of a regional cluster. In a multi-zonal cluster, the number of nodes specified in initial_node_count is created in all specified zones as well as the primary zone. If specified for a regional cluster, nodes will be created in only these zones."
type = list(string)
default = []
}

variable "tidb_version" {
description = "TiDB version"
default = "v3.0.1"
@@ -24,6 +35,12 @@ variable "tidb_operator_chart_version" {
default = ""
}

variable "operator_helm_values" {
description = "Operator helm values"
type = string
default = ""
}
Contributor Author: declare it in the root module to make it configurable.

variable "create_vpc" {
default = true
}
@@ -33,6 +50,12 @@ variable "gke_name" {
default = "tidb-cluster"
}

variable "gke_version" {
description = "Kubernetes version to use for the GKE cluster"
type = string
default = "latest"
}
Contributor Author: declare it in the root module to make it configurable.

variable "default_tidb_cluster_name" {
description = "The name that will be given to the default tidb cluster created."
default = "tidb-cluster"
@@ -43,18 +66,6 @@ variable "vpc_name" {
default = "tidb-cluster"
}

variable "pd_replica_count" {
default = 3
}

variable "tikv_replica_count" {
default = 3
}

variable "tidb_replica_count" {
default = 3
}
Contributor Author: these three variables are not used anymore.

variable "pd_count" {
description = "Number of PD nodes per availability zone"
default = 1
@@ -80,6 +91,26 @@ variable "tikv_instance_type" {}

variable "tidb_instance_type" {}

variable "pd_image_type" {
description = "PD image type, avaiable: UBUNTU/COS"
default = "COS"
}

variable "tidb_image_type" {
description = "TiDB image type, avaiable: UBUNTU/COS"
default = "COS"
}

variable "tikv_image_type" {
description = "TiKV image type, avaiable: UBUNTU/COS"
default = "COS"
}

variable "tikv_local_ssd_count" {
description = "TiKV node pool local ssd count (cannot be changed after the node pool is created)"
default = 1
}

variable "monitor_instance_type" {
default = "n1-standard-2"
}
@@ -92,3 +123,8 @@ variable "maintenance_window_start_time" {
description = "The time in HH:MM GMT format to define the start of the daily maintenance window"
default = "01:00"
}

variable "override_values" {
description = "YAML formatted values that will be passed in to the tidb-cluster helm release"
default = ""
}
4 changes: 3 additions & 1 deletion deploy/modules/gcp/tidb-cluster/data.tf
@@ -19,9 +19,11 @@ data "external" "monitor_port" {
}

locals {
# Examples of location: us-central1 (region), us-central1-b (zone), us-central1-c (zone).
# A zone name has three hyphen-separated parts, so it is passed via --zone; a region via --region.
cluster_location_args = "%{if length(split("-", var.gke_cluster_location)) == 3}--zone ${var.gke_cluster_location} %{else}--region ${var.gke_cluster_location} %{endif}"
# TODO: update related code when node locations are available in the attributes of the cluster resource.
cmd_get_cluster_locations = <<EOT
gcloud --project ${var.gcp_project} container clusters list --filter='name=${var.gke_cluster_name}' --format='json[no-heading](locations)' --region ${var.gke_cluster_location} | jq '.[0] | .locations |= join(",")'
gcloud --project ${var.gcp_project} container clusters list --filter='name=${var.gke_cluster_name}' --format='json[no-heading](locations)' ${local.cluster_location_args} | jq '.[0] | .locations |= join(",")'
EOT
}

2 changes: 1 addition & 1 deletion deploy/modules/gcp/tidb-cluster/main.tf
@@ -51,7 +51,7 @@ resource "google_container_node_pool" "tikv_pool" {
image_type = var.tikv_image_type
// This value cannot be changed (instead a new node pool is needed)
// 1 SSD is 375 GiB
local_ssd_count = 1
local_ssd_count = var.tikv_local_ssd_count

taint {
effect = "NO_SCHEDULE"
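
Since local_ssd_count is now driven by the tikv_local_ssd_count variable, a hedged tfvars line illustrating the sizing arithmetic from the comment above (each GCP local SSD is 375 GiB):

tikv_local_ssd_count = 2 # 2 x 375 GiB = 750 GiB of local SSD per TiKV node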
11 changes: 8 additions & 3 deletions deploy/modules/gcp/tidb-cluster/variables.tf
@@ -47,15 +47,20 @@ variable "monitor_instance_type" {

variable "pd_image_type" {
description = "PD image type, avaiable: UBUNTU/COS"
default = "COS"
default = "COS"
}

variable "tidb_image_type" {
description = "TiDB image type, avaiable: UBUNTU/COS"
default = "COS"
default = "COS"
}

variable "tikv_image_type" {
description = "TiKV image type, avaiable: UBUNTU/COS"
default = "COS"
default = "COS"
}

variable "tikv_local_ssd_count" {
description = "TiKV node pool local ssd count (cannot be changed after the node pool is created)"
default = 1
}
26 changes: 19 additions & 7 deletions deploy/modules/gcp/tidb-operator/main.tf
@@ -1,9 +1,10 @@
resource "google_container_cluster" "cluster" {
name = var.gke_name
network = var.vpc_name
subnetwork = var.subnetwork_name
location = var.gcp_region
project = var.gcp_project
name = var.gke_name
network = var.vpc_name
subnetwork = var.subnetwork_name
location = var.location
node_locations = var.node_locations
project = var.gcp_project

master_auth {
username = ""
@@ -42,11 +43,22 @@ resource "google_container_cluster" "cluster" {
}
}

locals {
# Because `gcloud container clusters get-credentials` does not accept a location
# argument, we must use the --zone or --region flag according to the location's
# value.
# This is the same approach as in the google terraform provider, see
# https://github.com/terraform-providers/terraform-provider-google/blob/24c36107e03cbaeb38ae1ebb24de7aa51a0343df/google/resource_container_cluster.go#L956-L960.
cmd_get_cluster_credentials = length(split("-", var.location)) == 3 ? "gcloud --project ${var.gcp_project} container clusters get-credentials ${google_container_cluster.cluster.name} --zone ${var.location}" : "gcloud --project ${var.gcp_project} container clusters get-credentials ${google_container_cluster.cluster.name} --region ${var.location}"
}

resource "null_resource" "get-credentials" {
depends_on = [google_container_cluster.cluster]
triggers = {
command = local.cmd_get_cluster_credentials
}
provisioner "local-exec" {
command = "gcloud --project ${var.gcp_project} container clusters get-credentials ${google_container_cluster.cluster.name} --region ${var.gcp_region}"

command = local.cmd_get_cluster_credentials
environment = {
KUBECONFIG = var.kubeconfig_path
}
14 changes: 10 additions & 4 deletions deploy/modules/gcp/tidb-operator/variables.tf
@@ -11,14 +11,20 @@ variable "subnetwork_name" {
description = "The name of the subnetwork in which to place the cluster"
}

variable "gcp_region" {
description = "The GCP region"
}

Contributor Author: replaced by location.

variable "gcp_project" {
description = "The GCP project name"
}

variable "location" {
description = "The GKE cluster location. If you specify a zone (such as us-central1-a), the cluster will be a zonal cluster with a single cluster master. If you specify a region (such as us-west1), the cluster will be a regional cluster with multiple masters spread across zones in the region."
type = string
}

variable "node_locations" {
description = "The list of zones in which the cluster's nodes should be located. These must be in the same region as the cluster zone for zonal clusters, or in the region of a regional cluster. In a multi-zonal cluster, the number of nodes specified in initial_node_count is created in all specified zones as well as the primary zone. If specified for a regional cluster, nodes will be created in only these zones."
Contributor: What is the purpose of this for a single-zone cluster? Doesn't the above location variable already specify this properly?

Contributor: Oh, I see, there are zonal clusters that are multi-zone clusters. I wonder why not just use a regional cluster, though?

Contributor Author: Yes, if you need to run a multi-zone cluster, it's better to use a regional cluster. However, if users want a small cluster for testing or for low network latency between worker nodes, they can use a zonal cluster and run nodes in a single zone.

node_locations can also be used to choose node locations for a regional cluster (at least two locations for a regional cluster).

Currently, in GCP, some machine types are only available in a few zones; for example, in region us-central1, c2-standard machines are only available in the us-central1-b and us-central1-c zones.

If users want to use these compute-optimized machines, they must specify node_locations to deploy nodes in those zones.

type = list(string)
}
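
To illustrate the author's comment above, a hedged sketch of root-module tfvars for a zonal cluster that adds a second zone so that compute-optimized machines can be used. The zone list and machine-type availability are illustrative assumptions and may change over time:

location = "us-central1-b"
node_locations = [
  "us-central1-c"
]
tikv_instance_type = "c2-standard-16"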

variable "gke_version" {
description = "Kubernetes version to use for the GKE cluster"
type = string