remove zone #1243

Merged
merged 8 commits into from
Apr 24, 2022
Changes from 5 commits
2 changes: 1 addition & 1 deletion docs-2.0/20.appendix/0.FAQ.md
@@ -39,7 +39,7 @@ Nebula Graph is still under development. Its behavior changes from time to time.

Starting with Nebula Graph 3.0.0, the statements `LOOKUP`, `GO`, and `FETCH` must output results with the `YIELD` clause. For more information, see [YIELD](../3.ngql-guide/8.clauses-and-options/yield.md).

### "How to resolve the error `Zone not enough!`?"
### "How to resolve the error `Host not enough!`?"

From Nebula Graph version 3.0.0, the Storage services added in the configuration files **CANNOT** be read or written directly. The configuration files only register the Storage services into the Meta services. You must run the `ADD HOSTS` command to read and write data on Storage servers. For more information, see [Manage Storage hosts](../4.deployment-and-installation/manage-storage-host.md).
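
As a minimal sketch, assuming two Storage services listening at the illustrative addresses `192.168.8.100:9779` and `192.168.8.101:9779`, adding the hosts and then checking their status might look like this:

```ngql
# The addresses below are placeholders; use the ones from your own configuration files.
nebula> ADD HOSTS 192.168.8.100:9779, 192.168.8.101:9779;
nebula> SHOW HOSTS;
```

Once the hosts are listed by `SHOW HOSTS`, statements that previously failed with `Host not enough!` can be retried.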

19 changes: 19 additions & 0 deletions docs-2.0/3.ngql-guide/4.job-statements.md
@@ -6,6 +6,25 @@ The long-term tasks run by the Storage Service are called jobs, such as `COMPACT

All job management commands can be executed only after selecting a graph space.

## SUBMIT JOB BALANCE DATA

!!! enterpriseonly

Only available for the Nebula Graph Enterprise Edition.

The `SUBMIT JOB BALANCE DATA` statement starts a job to balance the distribution of storage partitions in the current graph space. It returns the job ID.

For example:

```ngql
nebula> SUBMIT JOB BALANCE DATA;
+------------+
| New Job Id |
+------------+
| 28 |
+------------+
```
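
Because job statements run only after a graph space is selected, a fuller session could look like the sketch below. It assumes the graph space `basketballplayer` and reuses the job ID `28` from the sample output above; your job ID will differ.

```ngql
# Select the graph space first; job statements act on the current space.
nebula> USE basketballplayer;
nebula> SUBMIT JOB BALANCE DATA;
# Replace 28 with the job ID returned by the previous statement.
nebula> SHOW JOB 28;
```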

<!-- balance-3.1
## SUBMIT JOB BALANCE IN ZONE

13 changes: 12 additions & 1 deletion docs-2.0/3.ngql-guide/9.space-statements/4.describe-space.md
@@ -14,11 +14,22 @@ The `DESCRIBE SPACE` statement is different from the `SHOW SPACES` statement. Fo

## Example

```ngql
nebula> DESCRIBE SPACE basketballplayer;
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+---------+
| ID | Name | Partition Number | Replica Factor | Charset | Collate | Vid Type | Atomic Edge | Comment |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+---------+
| 1 | "basketballplayer" | 10 | 1 | "utf8" | "utf8_bin" | "FIXED_STRING(32)" | false | |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+---------+
```

<!--
```ngql
nebula> DESCRIBE SPACE basketballplayer;
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+-------------------------------+---------+
| ID | Name | Partition Number | Replica Factor | Charset | Collate | Vid Type | Atomic Edge | Zones | Comment |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+-------------------------------+---------+
| 1 | "basketballplayer" | 10 | 1 | "utf8" | "utf8_bin" | "FIXED_STRING(32)" | false | "default_zone_127.0.0.1_9779" | |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+-------------------------------+---------+
```
```
-->
115 changes: 112 additions & 3 deletions docs-2.0/8.service-tuning/load-balance.md
@@ -2,9 +2,118 @@

You can use the `BALANCE` statement to balance the distribution of partitions and Raft leaders, or clear some Storage servers for easy maintenance. For details, see [BALANCE](../synchronization-and-migration/2.balance-syntax.md).

!!! compatibility "Legacy version compatibility"
!!! danger

The `BALANCE` commands migrate data and balance the distribution of partitions by creating and executing a set of subtasks. **DO NOT** stop any machine in the cluster or change its IP address until all the subtasks finish. Otherwise, the follow-up subtasks fail.

## Balance partition distribution

!!! enterpriseonly

Only available for the Nebula Graph Enterprise Edition.

The `BALANCE DATA` command starts a job to balance the distribution of storage partitions in the current graph space by creating and executing a set of subtasks.

### Examples

After you add new storage hosts into the cluster, no partition is deployed on the new hosts.

1. Run `SHOW HOSTS` to check the partition distribution.

```ngql
nebula> SHOW HOSTS;
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| Host | Port | HTTP port | Status | Leader count | Leader distribution | Partition distribution | Version |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| "192.168.8.101" | 9779 | 19669 | "ONLINE" | 0 | "No valid partition" | "No valid partition" | "3.1.0-ent" |
| "192.168.8.100" | 9779 | 19669 | "ONLINE" | 15 | "basketballplayer:15" | "basketballplayer:15" | "3.1.0-ent" |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
```

2. Enter the graph space `basketballplayer`, and execute the command `BALANCE DATA` to balance the distribution of storage partitions.

```ngql
nebula> USE basketballplayer;
nebula> BALANCE DATA;
+------------+
| New Job Id |
+------------+
| 2 |
+------------+
```

3. The job ID is returned after running `BALANCE DATA`. Run `SHOW JOB <job_id>` to check the status of the job.

```ngql
nebula> SHOW JOB 2;
+------------------------+------------------------------------------+-------------+---------------------------------+---------------------------------+-------------+
| Job Id(spaceId:partId) | Command(src->dst) | Status | Start Time | Stop Time | Error Code |
+------------------------+------------------------------------------+-------------+---------------------------------+---------------------------------+-------------+
| 2 | "DATA_BALANCE" | "FINISHED" | "2022-04-12T03:41:43.000000000" | "2022-04-12T03:41:53.000000000" | "SUCCEEDED" |
| "2, 1:1" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:53.000000 | "SUCCEEDED" |
| "2, 1:2" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:53.000000 | "SUCCEEDED" |
| "2, 1:3" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:53.000000 | "SUCCEEDED" |
| "2, 1:4" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:53.000000 | "SUCCEEDED" |
| "2, 1:5" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:53.000000 | "SUCCEEDED" |
| "2, 1:6" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:43.000000 | "SUCCEEDED" |
| "2, 1:7" | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000 | 2022-04-12T03:41:53.000000 | "SUCCEEDED" |
| "Total:7" | "Succeeded:7" | "Failed:0" | "In Progress:0" | "Invalid:0" | "" |
+------------------------+------------------------------------------+-------------+---------------------------------+---------------------------------+-------------+
```

4. When all the subtasks succeed, the load balancing process finishes. Run `SHOW HOSTS` again to make sure the partition distribution is balanced.

!!! Note

`BALANCE DATA` does not balance the leader distribution. For more information, see [Balance leader distribution](#balance-leader-distribution).

```ngql
nebula> SHOW HOSTS;
+-----------------+------+-----------+----------+--------------+----------------------+------------------------+-------------+
| Host | Port | HTTP port | Status | Leader count | Leader distribution | Partition distribution | Version |
+-----------------+------+-----------+----------+--------------+----------------------+------------------------+-------------+
| "192.168.8.101" | 9779 | 19669 | "ONLINE" | 7 | "basketballplayer:7" | "basketballplayer:7" | "3.1.0-ent" |
| "192.168.8.100" | 9779 | 19669 | "ONLINE" | 8 | "basketballplayer:8" | "basketballplayer:8" | "3.1.0-ent" |
+-----------------+------+-----------+----------+--------------+----------------------+------------------------+-------------+
```
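
Since `BALANCE DATA` leaves the leader distribution untouched, a follow-up step could be sketched as below. It assumes the `basketballplayer` graph space is still selected and uses the `BALANCE LEADER` statement from the BALANCE syntax table.

```ngql
# Starts a job that balances the Raft leaders in the current graph space.
nebula> BALANCE LEADER;
# Check the "Leader count" and "Leader distribution" columns afterwards.
nebula> SHOW HOSTS;
```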

If any subtask fails, run `RECOVER JOB <job_id>` to recover the failed job. If rerunning the load balancing does not solve the problem, ask for help in the [Nebula Graph community](https://discuss.nebula-graph.io/).
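
For instance, assuming the failed balance job has the ID `2` as in the steps above, the recovery could be sketched as:

```ngql
# Replace 2 with the ID of the job that contains failed subtasks.
nebula> RECOVER JOB 2;
# Inspect the subtasks again to confirm that they now succeed.
nebula> SHOW JOB 2;
```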

### Stop data balancing

To stop a balance job, run `STOP JOB <job_id>`.

* If no balance job is running, an error is returned.

* If a balance job is running, `Job stopped` is returned.

!!! note

- `STOP JOB <job_id>` does not stop the running subtasks but cancels all follow-up subtasks. The status of follow-up subtasks is set to `INVALID`. The status of ongoing subtasks is set to `SUCCEEDED` or `FAILED` based on the result. You can run the `SHOW JOB <job_id>` command to check the stopped job status.
- After a job is stopped and then restarted, the job status is set to `QUEUE`. Subtasks whose previous status was `INVALID` or `FAILED` are set to `IN_PROGRESS`. Subtasks that were `IN_PROGRESS` or `SUCCEEDED` keep their status.

Once all the subtasks are finished or stopped, you can run `RECOVER JOB <job_id>` to balance the partitions again. The subtasks resume from their original state.
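
A stop-and-resume session might look like the following sketch, assuming a running balance job with the illustrative ID `2`:

```ngql
# Cancels the follow-up subtasks; subtasks that are already running finish on their own.
nebula> STOP JOB 2;
# Follow-up subtasks should now be INVALID; ongoing ones end as SUCCEEDED or FAILED.
nebula> SHOW JOB 2;
# Later, resume the balancing from where it stopped.
nebula> RECOVER JOB 2;
```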

### Migrate partition

To migrate the partitions off specified Storage hosts so that you can scale in the cluster, run `BALANCE DATA REMOVE <ip>:<port> [,<ip>:<port> ...]`.

For example, to migrate the partitions off the server `192.168.8.100:9779`, run the following command:

```ngql
nebula> BALANCE DATA REMOVE 192.168.8.100:9779;
nebula> SHOW HOSTS;
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| Host | Port | HTTP port | Status | Leader count | Leader distribution | Partition distribution | Version |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| "192.168.8.101" | 9779 | 19669 | "ONLINE" | 15 | "basketballplayer:15" | "basketballplayer:15" | "3.1.0-ent" |
| "192.168.8.100" | 9779 | 19669 | "ONLINE" | 0 | "No valid partition" | "No valid partition" | "3.1.0-ent" |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
```
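
The syntax also accepts several hosts in one statement. The sketch below assumes a hypothetical third Storage host, `192.168.8.102:9779`, which does not appear in the example cluster above:

```ngql
# 192.168.8.102:9779 is an assumed extra host, used only to illustrate the list form.
nebula> BALANCE DATA REMOVE 192.168.8.100:9779, 192.168.8.102:9779;
# Verify that the removed hosts no longer hold partitions.
nebula> SHOW HOSTS;
```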

!!! note

The `BALANCE DATA` commands are not supported.
This command migrates partitions to other storage hosts but does not delete the current storage host from the cluster. To delete the Storage hosts from the cluster, see [Manage Storage hosts](../4.deployment-and-installation/manage-storage-host.md).

<!-- balance-3.1
!!! danger
@@ -159,4 +268,4 @@ nebula> SHOW HOSTS;

!!! caution

In Nebula Graph {{ nebula.release }}, switching leaders will cause a large number of short-term request errors (Storage Error `E_RPC_FAILURE`). For solutions, see [FAQ](../20.appendix/0.FAQ.md).
In Nebula Graph {{ nebula.release }}, switching leaders will cause a large number of short-term request errors (Storage Error `E_RPC_FAILURE`). For solutions, see the [FAQ](../20.appendix/0.FAQ.md).
8 changes: 3 additions & 5 deletions docs-2.0/synchronization-and-migration/2.balance-syntax.md
@@ -2,15 +2,13 @@

The `BALANCE` statements support the load balancing operations of the Nebula Graph Storage services. For more information about storage load balancing and examples for using the `BALANCE` statements, see [Storage load balance](../8.service-tuning/load-balance.md).

!!! compatibility "Legacy version compatibility"

The `BALANCE DATA` commands are not supported.

The `BALANCE` statements are listed as follows.

|Syntax|Description|
|:---|:---|
|`BALANCE LEADER`| Starts a job to balance the distribution of storage leaders in the current graph space. It returns the job ID.|
|`BALANCE DATA`| Starts a job to balance the distribution of storage partitions in the current graph space. It returns the job ID. |
|`BALANCE DATA REMOVE <ip>:<port> [,<ip>:<port> ...]`| Migrates the partitions on the specified storage hosts to other storage hosts in the current graph space. |
|`BALANCE LEADER`| Starts a job to balance the distribution of storage leaders in the current graph space. It returns the job ID. |
<!-- balance-3.1
|`BALANCE IN ZONE [REMOVE <ip>:<port> [,<ip>:<port> ...]]`| Starts a job to balance the distribution of storage partitions in each zone in the current graph space. It returns the job ID. You can use the `REMOVE` option to specify the Storage services that you want to clear. The partitions of these services will be moved to other services for easy maintenance.|
|`BALANCE ACROSS ZONE [REMOVE "zone_name" [,"zone_name" ...]]`| Starts a job to balance the distribution of storage partitions across each zone in the current graph space. It returns the job ID. You can use the `REMOVE` option to specify the zones that you want to clear. The partitions of these services will be moved to other services for easy maintenance.|