500 Internal Server Error when specifying incorrect prefix for oxide system networking bgp announce #6517

Open
elaine-oxide opened this issue Sep 4, 2024 · 0 comments
Labels
networking Related to the networking.
elaine-oxide commented Sep 4, 2024

I am running a4x2 with:

  • omicron 1059fe109fadbccf559e4adad134eaffcb67b85a
  • Oxide CLI version compiled from main with this commit:
$ oxide version
Oxide CLI 0.7.0+20240821.0
Built from commit: 90031e0c3ed6af105b66ea00ed0d70b8e9d3b8f4 
Oxide API: 20240821.0

I was working through the commands in https://docs.oxide.computer/guides/operator/expanding-connectivity for a second time. On the first run I slightly modified the IP addresses; on the second run I further modified the .json file names and IP addresses.

Initial state:

$ oxide system networking address-lot list
[
...
  }, {
    "description": "a lot for cloud communications",
    "id": "9b3f0925-20f9-4d90-8386-b4f5c4a23652",
    "kind": "pool",
    "name": "cloud-pool",
    "time_created": "2024-08-30T00:15:43.988077Z",
    "time_modified": "2024-08-30T00:15:43.988077Z"
  }, {
    "description": "a lot for cloud communications2",
    "id": "b1f20e14-de62-42d0-819c-e2769bb6b41a",
    "kind": "pool",
    "name": "cloud-pool2",
    "time_created": "2024-09-03T20:05:56.197327Z",
    "time_modified": "2024-09-03T20:05:56.197327Z"
  }, {
...
]

$ oxide system networking address-lot block list --address-lot cloud-pool
[
  {
    "first_address": "203.0.133.1",
    "id": "8829bdb3-7496-4657-b09a-f88caf3ea6e0",
    "last_address": "203.0.133.254"
  }
]

$ oxide system networking address-lot block list --address-lot cloud-pool2
[
  {
    "first_address": "203.0.153.1",
    "id": "485595ad-a9d1-4662-9ba7-fba570a9acb3",
    "last_address": "203.0.153.254"
  }
]

Then I ran a modified version of one of the examples, but I only partially modified it: I changed the --address-lot but forgot to change the --prefix.

$ oxide system networking bgp announce \
    --announce-set as65547-announce \
    --address-lot cloud-pool2 \
    --prefix 203.0.113.0/24
Error Response: status: 500 Internal Server Error; headers: {"content-type": "application/json", "x-request-id": "ea4ea832-ef7e-4a15-90b1-042bf0aff541", "content-length": "124", "date": "Tue, 03 Sep 2024 20:27:43 GMT"}; value: Error { error_code: Some("Internal"), message: "Internal Server Error", request_id: "ea4ea832-ef7e-4a15-90b1-042bf0aff541" }

The arguments provided above do not satisfy the requirement described in the help output:

$ oxide system networking bgp announce --help
Announce a prefix over BGP.

This command adds the provided prefix to the specified announce set. It is
required that the prefix be available in the given address lot. The add is
performed as a read-modify-write on the specified address lot.
...

The prefix 203.0.113.0/24 that I provided is not available in the given address lot cloud-pool2. As the output above shows, address lot cloud-pool2 contains addresses 203.0.153.1 through 203.0.153.254. In addition, as the announcement list further below shows, the prefix 203.0.113.0/24 is already in the specified announce set.
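To make the expected check concrete, here is a minimal sketch of a prefix-in-lot test using Python's standard ipaddress module. It is not the Nexus implementation. The function name prefix_in_block is hypothetical, and the meaning of "available" is an assumption: the block must cover the prefix's usable host range (excluding the network and broadcast addresses), since the block 203.0.153.1 to 203.0.153.254 does not include .0 or .255.

```python
import ipaddress


def prefix_in_block(prefix: str, first_address: str, last_address: str) -> bool:
    """Return True if the block [first_address, last_address] covers the
    usable host range of `prefix`.

    Hypothetical helper for illustration only; the definition of
    "available" used here is an assumption, not Nexus behavior.
    """
    net = ipaddress.ip_network(prefix)
    first = ipaddress.ip_address(first_address)
    last = ipaddress.ip_address(last_address)

    lo, hi = net.network_address, net.broadcast_address
    # For prefixes with more than two addresses, skip the network and
    # broadcast addresses so that a block of .1 to .254 matches a /24.
    if net.num_addresses > 2:
        lo, hi = lo + 1, hi - 1

    return first <= lo and hi <= last


# The cloud-pool2 block from the output above.
BLOCK = ("203.0.153.1", "203.0.153.254")

wrong_prefix_ok = prefix_in_block("203.0.113.0/24", *BLOCK)
right_prefix_ok = prefix_in_block("203.0.153.0/24", *BLOCK)
```

Under this assumption, the prefix I passed by mistake (203.0.113.0/24) fails the check, while 203.0.153.0/24 passes.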

On a4x2 g3:

root@oxz_nexus_579cb51c:~# cat $(svcs -L nexus) | grep ea4ea832-ef7e-4a15-90b1-042bf0aff541 | looker
...
20:27:43.995Z ERRO 579cb51c-5cdc-4b01-a46c-53f6bbb78393 (dropshot_external): bgp_update_announce_set failed
    actor_id = 62ea6c2b-4ec3-48c2-9f4a-adeeeb04e384
    authenticated = true
    error = DatabaseError(UniqueViolation, "duplicate key value violates unique constraint \\"bgp_announcement_pkey\\"")
    file = nexus/db-queries/src/db/datastore/bgp.rs:595
    local_addr = 172.30.2.6:80
    method = PUT
    remote_addr = 172.20.2.90:37702
    req_id = ea4ea832-ef7e-4a15-90b1-042bf0aff541
    uri = /v1/system/networking/bgp-announce-set
20:27:43.995Z INFO 579cb51c-5cdc-4b01-a46c-53f6bbb78393 (dropshot_external): request completed
    error_message_external = Internal Server Error
    error_message_internal = unexpected database error: duplicate key value violates unique constraint "bgp_announcement_pkey"
    file = /home/elaine/.cargo/git/checkouts/dropshot-a4a923d29dccc492/06c8dab/dropshot/src/server.rs:902
    latency_us = 122344
    local_addr = 172.30.2.6:80
    method = PUT
    remote_addr = 172.20.2.90:37702
    req_id = ea4ea832-ef7e-4a15-90b1-042bf0aff541
    response_code = 500
    uri = /v1/system/networking/bgp-announce-set

After I corrected the --prefix, the command succeeded and returned no error messages.

$ oxide system networking bgp announce \
    --announce-set as65547-announce \
    --address-lot cloud-pool2 \
    --prefix 203.0.153.0/24

The error in the Nexus log above occurs because the prefix 203.0.113.0/24 is already in the BGP announcement list (the output below was taken after running the command above with the corrected prefix):

$ oxide system networking bgp announcement list --name-or-id as65547-announce
[
  {
    "address_lot_block_id": "94e1528f-5f97-404e-b894-9aa931177e40",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "198.51.100.0/24"
  },
  {
    "address_lot_block_id": "94e1528f-5f97-404e-b894-9aa931177e40",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.113.0/24"
  },
  {
    "address_lot_block_id": "94e1528f-5f97-404e-b894-9aa931177e40",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.133.0/24"
  },
  {
    "address_lot_block_id": "94e1528f-5f97-404e-b894-9aa931177e40",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.153.0/24"
  }
]

Rather than the 500 Internal Server Error shown above, a more useful error message could be propagated to the user so that they know what to correct. In this case, the user would want to know that the requirement "It is required that the prefix be available in the given address lot." has not been satisfied, and/or that the provided prefix is already in the specified announce set.
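As a rough illustration of the kind of translation I have in mind (a sketch only, not how Nexus or Dropshot handle errors), the internal database message from the log could be mapped to a client-facing error. The ApiError type, the to_external_error function, the 409 status, and the "AlreadyExists" error code are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class ApiError:
    """Hypothetical external error shape, loosely modeled on the CLI output."""
    status: int
    error_code: str
    message: str


def to_external_error(internal_message: str, prefix: str, announce_set: str) -> ApiError:
    """Map an internal error message to something the user can act on.

    The status code and error_code values are illustrative assumptions.
    """
    # The unique-constraint violation seen in the Nexus log means the
    # announcement is already present in the announce set.
    if 'unique constraint "bgp_announcement_pkey"' in internal_message:
        return ApiError(
            status=409,
            error_code="AlreadyExists",
            message=f"prefix {prefix} is already in announce set {announce_set}",
        )
    # Anything unrecognized stays an opaque internal error.
    return ApiError(status=500, error_code="Internal", message="Internal Server Error")


err = to_external_error(
    'unexpected database error: duplicate key value violates unique '
    'constraint "bgp_announcement_pkey"',
    prefix="203.0.113.0/24",
    announce_set="as65547-announce",
)
```

Matching on the message string is only for brevity here; a real fix would presumably match on the typed database error instead.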

=====

Upon further investigation, I see that the requirement "It is required that the prefix be available in the given address lot." is not enforced/validated.

I tried running the following command where the provided prefix is not in the given address lot.

$ oxide system networking bgp announce \
    --announce-set as65547-announce \
    --address-lot cloud-pool2 \
    --prefix 203.0.173.0/24

The above command went through successfully with no error messages, and resulted in the following state:

$ oxide system networking bgp announcement list --name-or-id as65547-announce
[
  {
    "address_lot_block_id": "eae2f630-05a7-47f3-b16e-c249a7caa989",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "198.51.100.0/24"
  },
  {
    "address_lot_block_id": "eae2f630-05a7-47f3-b16e-c249a7caa989",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.113.0/24"
  },
  {
    "address_lot_block_id": "eae2f630-05a7-47f3-b16e-c249a7caa989",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.133.0/24"
  },
  {
    "address_lot_block_id": "eae2f630-05a7-47f3-b16e-c249a7caa989",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.153.0/24"
  },
  {
    "address_lot_block_id": "eae2f630-05a7-47f3-b16e-c249a7caa989",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.173.0/24"
  }
]

=====

I also tried providing a nonexistent address lot. The following command completed with no error messages returned.

$ oxide system networking bgp announce \
    --announce-set as65547-announce \
    --address-lot cloud-pool3 \
    --prefix 203.0.193.0/24

Resulting state:

$ oxide system networking bgp announcement list --name-or-id as65547-announce
[
  {
    "address_lot_block_id": "204fe754-ccc1-40b3-aab2-adef3a36b266",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "198.51.100.0/24"
  },
  {
    "address_lot_block_id": "204fe754-ccc1-40b3-aab2-adef3a36b266",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.113.0/24"
  },
  {
    "address_lot_block_id": "204fe754-ccc1-40b3-aab2-adef3a36b266",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.133.0/24"
  },
  {
    "address_lot_block_id": "204fe754-ccc1-40b3-aab2-adef3a36b266",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.153.0/24"
  },
  {
    "address_lot_block_id": "204fe754-ccc1-40b3-aab2-adef3a36b266",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.173.0/24"
  },
  {
    "address_lot_block_id": "204fe754-ccc1-40b3-aab2-adef3a36b266",
    "announce_set_id": "8abc3e65-9433-46ac-8da2-2acec6a5afc6",
    "network": "203.0.193.0/24"
  }
]

To confirm that the address lot does not exist:

$ oxide system networking address-lot block list --address-lot cloud-pool3
[]
Error Response: status: 404 Not Found; headers: {"content-type": "application/json", "x-request-id": "ef10dada-5555-423b-a0b0-46d9ad11d928", "content-length": "157", "date": "Wed, 04 Sep 2024 18:12:55 GMT"}; value: Error { error_code: Some("ObjectNotFound"), message: "not found: address-lot with name \"cloud-pool3\"", request_id: "ef10dada-5555-423b-a0b0-46d9ad11d928" }
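
The same not-found lookup that address-lot block list performs could, in principle, run before an announcement is written. The following is a minimal sketch under stated assumptions: the in-memory LOTS map, the AddressLotNotFound type, and the resolve_address_lot function are hypothetical stand-ins, not Nexus code.

```python
class AddressLotNotFound(LookupError):
    """Hypothetical error type standing in for a 404-style API error."""


def resolve_address_lot(lots: dict, name: str) -> list:
    """Return the blocks for the named lot, or fail loudly if it is missing."""
    try:
        return lots[name]
    except KeyError:
        # Mirror the wording of the 404 message shown above.
        raise AddressLotNotFound(
            f'not found: address-lot with name "{name}"'
        ) from None


# Only the two lots shown in this issue; the real list has more entries.
LOTS = {
    "cloud-pool": [("203.0.133.1", "203.0.133.254")],
    "cloud-pool2": [("203.0.153.1", "203.0.153.254")],
}

blocks = resolve_address_lot(LOTS, "cloud-pool2")
```

With a check like this in front of the write, the cloud-pool3 command above would be rejected instead of silently adding 203.0.193.0/24 to the announce set.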
@elaine-oxide elaine-oxide added the networking Related to the networking. label Sep 4, 2024
@elaine-oxide elaine-oxide added this to the 11 milestone Sep 4, 2024