docs: support for peer label in DNS lookups #15374

Closed
wants to merge 6 commits into from
193 changes: 150 additions & 43 deletions website/content/docs/discovery/dns.mdx
@@ -133,20 +133,47 @@ it is recommended to use the HTTP API to retrieve the list of nodes.

### Standard Lookup

The format of a standard service lookup is:
The following are valid formats for standard service lookups:

```text
[<tag>.]<service>.service[.<datacenter>].<domain>
```
- Lookup a local service:

```text
[<tag>.]<service>.service.<domain>
```

The lookup uses the datacenter of the Consul agent acting as the DNS server.

- Lookup a service in a cluster peer:
Contributor

Suggested change
- Lookup a service in a cluster peer:
- Add the `peer` segment to lookup a service in a cluster peer:

Correct?

Contributor Author

Mostly. They need to specify two labels. If the peer name is "foo", they'd have to add foo.peer.

At a higher level, I'm wondering about the benefit of changing this text from "context in which the format is useful" to "actions you need to take to use the format below". It feels strange to me to have one of the "valid formats for standard service lookups" be titled "lookup a local service" and the next be in a different form: "add the peer segment to lookup a service in a cluster peer".

Contributor

I would still prefer that we describe what users are supposed to do in each of these rather than leaving it for them to figure it out based on the snippet we're providing. That is my reason for these suggestions -- we're giving the what without the how.


```text
[<tag>.]<service>.service.<peer>.peer.<domain>
```

- Lookup a service in a WAN federated datacenter:
Contributor

Suggested change
- Lookup a service in a WAN federated datacenter:
- Specify the name of a datacenter to lookup a service in a WAN federated datacenter:


```text
[<tag>.]<service>.service.<datacenter>[.dc].<domain>
```

Use of the optional `.dc` label is recommended for readability.
jkirschner-hashicorp marked this conversation as resolved.

In all formats, `tag` is optional.
If no tag is provided, no filtering is done on tag.
jkirschner-hashicorp marked this conversation as resolved.

To lookup services directly from a mesh service's application code
with transparent proxy enabled, refer to
[service virtual IP lookups](#service-virtual-ip-lookups) instead.

The `tag` is optional, and, as with node lookups, the `datacenter` is as
well. If no tag is provided, no filtering is done on tag. If no
datacenter is provided, the datacenter of this Consul agent is assumed.
The following table shows example lookups for the `api` service
depending on its location and whether you want to filter for instances
of the service by tag. All queries assume the `domain` is `consul`.

If we want to find any redis service providers in our local datacenter,
we could query `redis.service.consul.` If we want to find the PostgreSQL
primary in a particular datacenter, we could query
`primary.postgresql.service.dc2.consul.`
| Tag | Location | Query |
| --- | -------- | ----- |
| None | Local | `api.service.consul` |
| `v2` | Local | `v2.api.service.consul` |
| None | Peer `us-east` | `api.service.us-east.peer.consul` |
| None | WAN federated datacenter `dc3` | `api.service.dc3.dc.consul` |
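
As a rough illustration of the formats above, the following Python sketch assembles standard lookup names from their parts. The function `standard_lookup` and its parameters are hypothetical helpers, not part of Consul. The sketch always emits the optional `.dc` label and rejects a peer combined with a datacenter; that restriction is an assumption based on the two formats being listed separately.

```python
def standard_lookup(service, tag=None, peer=None, datacenter=None, domain="consul"):
    """Build a standard service lookup name (hypothetical helper)."""
    if peer and datacenter:
        # Assumption: the peer and datacenter formats are never combined.
        raise ValueError("specify a peer or a datacenter, not both")
    labels = []
    if tag:
        labels.append(tag)  # optional tag filter comes first
    labels += [service, "service"]
    if peer:
        labels += [peer, "peer"]  # two labels: <peer>.peer
    elif datacenter:
        labels += [datacenter, "dc"]  # always include the optional .dc label
    labels.append(domain)
    return ".".join(labels)


if __name__ == "__main__":
    print(standard_lookup("api"))
    print(standard_lookup("api", tag="v2"))
    print(standard_lookup("api", peer="us-east"))
    print(standard_lookup("api", datacenter="dc3"))
```

Run as a script, this should print the four queries from the table. The review thread notes that `.peer` support for `.service` lookups depends on the Consul version, so the peer form may not resolve on older agents.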

The DNS query system makes use of health check information to prevent routing
to unhealthy nodes. When a service query is made, any services failing their health
@@ -183,19 +210,24 @@ foobar.node.dc1.consul. 0 IN A 10.1.10.12

### RFC 2782 Lookup

Valid formats for RFC 2782 SRV lookups depend on
whether you want to filter results based on a service tag:
RFC 2782 style lookups have the same behavior as
[standard service lookups](#standard-lookup),
but differ in format.
Contributor

Suggested change
RFC 2782 style lookups have the same behavior as
[standard service lookups](#standard-lookup),
but differ in format.
You can use RFC 2782-style lookups to query for services.

We don't really describe the behavior of standard lookups--just gave some examples of how to format the lookups--so there isn't really anything to compare it to. Even if there were details about the behavior, it would be better to just state the behavior of this lookup format here.

The RFC 2782 lookup format supports the same suffixes as standard lookups
for optionally specifying a cluster peer or WAN federated datacenter,
but differs in how service name and optional tag are specified.
The valid RFC 2782 lookup formats below assume a local service lookup:
Comment on lines +223 to +226
Contributor

Suggested change
The RFC 2782 lookup format supports the same suffixes as standard lookups
for optionally specifying a cluster peer or WAN federated datacenter,
but differs in how service name and optional tag are specified.
The valid RFC 2782 lookup formats below assume a local service lookup:
You can query for services in peered clusters and WAN-federated datacenters using the RFC 2782 lookup format. The following examples demonstrate how to query a local service using the RFC 2782 lookup format:

Not ideal because we just give some examples instead of describing the format, but I think it gets the point across. I think the assumption is that readers would just know how to format an RFC 2782 lookup? Can we link to the standard?

Contributor Author

The challenge here is that standard and RFC 2782 lookups support all the same stuff after .service, but RFC 2782 lookups provide multiple ways to specify a tag. (There's an outstanding PR that will add a format not currently listed here.)

What I'm trying to communicate is: The only difference is how <service> and <tag> can be specified. Everything else (<datacenter> and <peer>) is the same as a standard lookup.

Comment on lines +223 to +226
Contributor

Suggested change
The RFC 2782 lookup format supports the same suffixes as standard lookups
for optionally specifying a cluster peer or WAN federated datacenter,
but differs in how service name and optional tag are specified.
The valid RFC 2782 lookup formats below assume a local service lookup:
The RFC 2782 lookup format supports the same suffixes as standard lookups for specifying cluster peers or WAN federated datacenters, but it follows a different format for specifying service names and tags.
The following RFC 2782 lookups show how to format queries for a local service:

Switch to active voice
I have the same comment as above w/r/t relying on users to recognize the differences by looking at the snippet rather than helping them with a description.


- No filtering on service tag:

```text
_<service>._tcp[.service][.<datacenter>].<domain>
_<service>._tcp[.service].<domain>
```

- Filtering on service tag specified in the RFC 2782 protocol field:

```text
_<service>._<tag>[.service][.<datacenter>].<domain>
_<service>._<tag>[.service].<domain>
```

Per [RFC 2782](https://tools.ietf.org/html/rfc2782), SRV queries must
@@ -204,8 +236,16 @@ prevent DNS collisions.
To perform no tag-based filtering, specify `tcp` in the RFC 2782 protocol field.
To filter results on a service tag, specify the tag in the RFC 2782 protocol field.

Other than the query format and default `tcp` protocol/tag value, the behavior
of the RFC style lookup is the same as the standard style of lookup.
The following table shows example lookups for the `api` service
depending on its location and whether you want to filter for instances
of the service by tag. All queries assume the `domain` is `consul`.

| Tag | Location | Query |
| --- | -------- | ----- |
| None | Local | `_api._tcp.service.consul` |
| `v2` | Local | `_api._v2.service.consul` |
| None | Peer `us-east` | `_api._tcp.service.us-east.peer.consul` |
| None | WAN federated datacenter `dc3` | `_api._tcp.service.dc3.dc.consul` |
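
The RFC 2782 variant can be sketched the same way. In this hypothetical Python helper (`rfc2782_lookup` is not a Consul API), the tag goes in the protocol field and falls back to `tcp` when no tag filter is wanted, as described above. The sketch always includes the optional `.service` label and, as an assumption, treats peer and datacenter as mutually exclusive.

```python
def rfc2782_lookup(service, tag=None, peer=None, datacenter=None, domain="consul"):
    """Build an RFC 2782 style lookup name (hypothetical helper)."""
    if peer and datacenter:
        # Assumption: the peer and datacenter suffixes are never combined.
        raise ValueError("specify a peer or a datacenter, not both")
    protocol = tag if tag else "tcp"  # "tcp" means no tag-based filtering
    # Service and protocol labels carry a leading underscore.
    labels = ["_" + service, "_" + protocol, "service"]
    if peer:
        labels += [peer, "peer"]
    elif datacenter:
        labels += [datacenter, "dc"]
    labels.append(domain)
    return ".".join(labels)


if __name__ == "__main__":
    print(rfc2782_lookup("api"))
    print(rfc2782_lookup("api", tag="v2"))
    print(rfc2782_lookup("rabbitmq", tag="amqp"))
```

Only the prefix differs from a standard lookup name; everything after `.service` is built identically.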

If you registered the service `rabbitmq` on port 5672 and tagged it with `amqp`,
you could make an RFC 2782 query for its SRV record as `_rabbitmq._amqp.service.consul`:
@@ -353,32 +393,77 @@ $ echo -n "20010db800010002cafe000000001337" | perl -ne 'printf join(":", unpack

By default, all service lookups use the `default` namespace
within the partition and datacenter of the Consul agent that received the DNS query.
To lookup services in another namespace, partition, and/or datacenter,
To lookup services in another namespace, partition, datacenter, or cluster peer,
jkirschner-hashicorp marked this conversation as resolved.
use the [canonical format](#canonical-format).

Consul server agents are in the `default` partition.
If DNS queries are addressed to Consul server agents,
service lookups to non-`default` partitions must explicitly specify
the partition of the target service.

To lookup services imported from a cluster peer,
refer to [service virtual IP lookups for Consul Enterprise](#service-virtual-ip-lookups-for-consul-enterprise) instead.
To lookup services directly from a mesh service's application code
with transparent proxy enabled, refer to
[service virtual IP lookups for Consul Enterprise](#service-virtual-ip-lookups-for-consul-enterprise) instead.

#### Canonical format

Use the following query format to specify namespace, partition, and/or datacenter
for `.service`, `.connect`, `.virtual`, and `.ingress` service lookup types.
All three fields (`namespace`, `partition`, `datacenter`) are optional.
```text
[<tag>.]<service>.service[.<namespace>.ns][.<partition>.ap][.<datacenter>.dc]<domain>
```
Use the following query formats to specify namespace, partition,
WAN federated datacenter, or cluster peer in service lookups.
In each format, all fields in square brackets are optional.

- Lookup a service, optionally in a cluster peer:
Contributor

Suggested change
- Lookup a service, optionally in a cluster peer:
- Lookup a service in a cluster peer:

I was a little confused by the word "optionally" in this sentence. The syntax shows the peer segment.

Contributor Author

Everything in square brackets is optional, as stated in Line 412:

In each format, all fields in square brackets are optional.

If you don't include .peer-name.peer, then it does a local lookup. Partition and namespace are also optional.

Therefore, I'm inclined to decline this change.

Contributor

I still don't think this line makes sense. The explanation actually confirms that we should change it because we already state in the introduction that the peer is optional.


```text
[<tag>.]<service>.service[.<namespace>.ns][.<peer>.peer][.<partition>.ap].<domain>
Member

We don't currently support this format with .service.p1.peer in 1.14.0, so we'll need to remove anything referring to it. I believe we'll only be able to use the new .peer syntax with virtual entries for the time-being.

Contributor Author

Sounds good - I'll move that stuff out into another PR ...

... once the accuracy of some of the other content is reviewed (so I can batch all my changes).

I'm particularly not sure about the changes I made to the ### Service Virtual IP Lookups section.

Contributor Author

Note to self: .peer support varies by service lookup type and version.

.virtual has had support as of 1.14.0.
.service and .node have support as of 1.14.2.

.ingress and .connect lack support.

```

This format supports `.service` lookup types in Consul 1.14.2 or later
and supports `.virtual` lookup types in Consul 1.14.0 or later.

Contributor

Suggested change
This format supports `.service` lookup types in Consul 1.14.2 or later
and supports `.virtual` lookup types in Consul 1.14.0 or later.
~> **Compatibility warning**: Consul 1.14.2+ is required for `.service` lookup types.

We should only call out the minor version where the change was introduced.

Contributor Author

We still need to mention somewhere which lookup types are supported by each format. This suggested change removes mention that .virtual lookup types are supported.

Contributor Author

I'll mention that .service and .virtual types are supported above the compatibility warning.

The `peer` is a cluster peer of the `partition`.
If no `peer` is specified, the service lookup applies to the specified `partition`
rather than one of its cluster peers.

For example, assume that an agent in the `default` partition is acting as the
DNS server, the `k8s-app-1` partition is in the same datacenter, and
the `k8s-app-1` partition has a cluster peer named `billing`.
To perform a lookup of non-mesh service `api` in namespace `foo`
of that cluster peer, use the query
`api.service.foo.ns.billing.peer.k8s-app-1.ap.consul`.

- Lookup a service, optionally in a WAN federated datacenter:
Contributor

Suggested change
The `peer` is a cluster peer of the `partition`.
If no `peer` is specified, the service lookup applies to the specified `partition`
rather than one of its cluster peers.
For example, assume that an agent in the `default` partition is acting as the
DNS server, the `k8s-app-1` partition is in the same datacenter, and
the `k8s-app-1` partition has a cluster peer named `billing`.
To perform a lookup of non-mesh service `api` in namespace `foo`
of that cluster peer, use the query
`api.service.foo.ns.billing.peer.k8s-app-1.ap.consul`.
- Lookup a service, optionally in a WAN federated datacenter:
Specify the name of a peered cluster in the `peer` segment.
Specify the name of the admin partition in the cluster in the `partition` segment.
If you do not include a `peer`, the service lookup queries the `partition` instead of one of the peered clusters.
In the following example, a Consul agent in the `default` partition acts as the DNS server, the `k8s-app-1` partition is in the same datacenter, and
the `k8s-app-1` partition has a cluster peer named `billing`.
To perform a lookup of non-mesh service `api` in namespace `foo`
of that cluster peer, use the query
`api.service.foo.ns.billing.peer.k8s-app-1.ap.consul`.
- Lookup a service in a WAN federated datacenter:

Did I restate that correctly? The original text is a little unclear.

Contributor Author

@jkirschner-hashicorp, Jan 9, 2023

I took your 2nd paragraph, but changed the first paragraph to the following:

If `.<peer>.peer` is included, the `<peer>` label identifies a cluster peer
of the query's partition in which to lookup the specified `<service>`.
If the query does not specify a partition,
the query uses the partition of the Consul agent that received the DNS query.

I agree that the original text wasn't clear. Let me know how much this change helps (or not)!


```text
[<tag>.]<service>.service[.<namespace>.ns][.<partition>.ap][.<datacenter>.dc].<domain>
```

This format supports `.service`, `.connect`, `.virtual`, and `.ingress` service lookup types
in Consul 1.14.0 or later.

jkirschner-hashicorp marked this conversation as resolved.
Cluster peering must be used to connect datacenters containing admin partitions,
not WAN federation. Therefore, this format is not applicable to lookup a partition
in a different datacenter.
jkirschner-hashicorp marked this conversation as resolved.
Contributor

Suggested change
Cluster peering must be used to connect datacenters containing admin partitions,
not WAN federation. Therefore, this format is not applicable to lookup a partition
in a different datacenter.
You must establish a peer connection between your Consul clusters to connect datacenters containing admin partitions. You cannot query an admin partition in another datacenter over a WAN federated network.

Contributor Author

The limitation is not on connection / querying. WAN federation and admin partitions are mutually exclusive. If two datacenters are WAN federated, they can't have (non-default) admin partitions.

I'm not sure if that nuance is important here. I'll have to think about whether there's a clearer way than the original text...

Contributor Author

New suggestion:

Suggested change
Cluster peering must be used to connect datacenters containing admin partitions,
not WAN federation. Therefore, this format is not applicable to lookup a partition
in a different datacenter.
This format does not support specifying both a partition and another datacenter.
Datacenters containing admin partitions can only be connected using cluster peering,
not WAN federation. To lookup a service in a partition of another datacenter,
you must establish a cluster peering relationship with that partition and
use the cluster peer lookup format.


The following table shows example `.service` lookups for the `api` service
depending on its location and whether you want to filter for instances
of the service by tag. All queries assume the `domain` is `consul` and the
DNS server is a Consul server agent in the `default` partition.

| Location | Query |
| -------- | ----- |
| The `default` namespace of the local `billing` partition | `api.service.billing.ap.consul` |
| The `app-1` namespace of cluster peer `us-east` of the local `billing` partition | `api.service.app-1.ns.us-east.peer.billing.ap.consul` |
| The `app-1` namespace of WAN federated datacenter `dc3` | `api.service.app-1.ns.dc3.dc.consul` |

For an enterprise service lookup, an RFC 2782 style prefix (`_<service>._tcp`)
can be used instead of a standard lookup prefix (`<service>`).
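
To make the label ordering in the two canonical formats concrete, here is a minimal Python sketch. The helper `enterprise_lookup` and its parameters are hypothetical. It follows the `.ns`, `.peer`, `.ap`, and `.dc` labels exactly as they appear in the format strings, and it assumes that a peer and a WAN federated datacenter are never specified together because they belong to separate formats.

```python
def enterprise_lookup(service, tag=None, namespace=None, peer=None,
                      partition=None, datacenter=None, domain="consul"):
    """Build a canonical-format .service lookup name (hypothetical helper)."""
    if peer and datacenter:
        # Assumption: peer and datacenter belong to different formats.
        raise ValueError("specify a peer or a datacenter, not both")
    labels = [tag] if tag else []
    labels += [service, "service"]
    if namespace:
        labels += [namespace, "ns"]
    if peer:
        labels += [peer, "peer"]  # the peer precedes the partition
    if partition:
        labels += [partition, "ap"]
    if datacenter:
        labels += [datacenter, "dc"]  # the datacenter follows the partition
    labels.append(domain)
    return ".".join(labels)


if __name__ == "__main__":
    print(enterprise_lookup("api", partition="billing"))
    print(enterprise_lookup("api", namespace="app-1", datacenter="dc3"))
    print(enterprise_lookup("api", namespace="app-1", peer="us-east",
                            partition="billing"))
```

Every optional segment is a pair of labels (a value followed by its type), so omitting a segment drops both labels at once.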

#### Alternative formats for specifying namespace

Though the [canonical format](#canonical-format) is recommended for readability,
you can use the following query formats to specify namespace but not partition:

- Specify both namespace and datacenter:
- Specify both namespace and WAN federated datacenter:

```text
[<tag>.]<service>.service.<namespace>.<datacenter>.<domain>
@@ -394,6 +479,9 @@ you can use the following query formats specify namespace but not partition:
[<tag>.]<service>.service.<namespace>.<domain>
```

To lookup a namespaced service in a cluster peer,
the [canonical format](#canonical-format) must be used instead.
Comment on lines +500 to +501
Contributor

Suggested change
To lookup a namespaced service in a cluster peer,
the [canonical format](#canonical-format) must be used instead.
Use the [canonical format](#canonical-format) to lookup services bound to a namespace in a cluster peer.

It's a bit of a gray area, but I think "namespace" is a noun. I recall seeing it in the K8s docs as a verb, but it read as poor grammar to me. "Bound" might not be the right verb, either. I won't die on this hill if you disagree.


### Prepared Query Lookups

The format of a prepared query lookup is:
@@ -444,34 +532,53 @@ If you need more complex behavior, please use the

### Service Virtual IP Lookups

To find the unique virtual IP allocated for a service:
Service virtual IP lookups are only applicable when performed from a downstream
service in the mesh with [transparent proxy](/docs/connect/transparent-proxy) enabled.
Such a downstream service's application code can directly reference its upstream with
a service virtual IP lookup, rather than referencing a `localhost:port` combination
that maps to an [explicit upstream](/consul/docs/k8s/annotations-and-labels#consul-hashicorp-com-connect-service-upstreams).
To use this Consul DNS lookup format in Consul on Kubernetes,
set [`dns.enableRedirection`](/consul/docs/k8s/helm#v-dns-enableredirection) to `true`.

A service virtual IP lookup returns a unique virtual IP allocated for a
[Connect-capable](/docs/connect) service. This virtual IP is also included in the
service's [Tagged Addresses](/docs/discovery/services#tagged-addresses)
under the `consul-virtual` tag.

Use the following query format to lookup a service, optionally in a cluster peer:

```text
<service>.virtual[.<peer>].<domain>
```

This will return the unique virtual IP for any [Connect-capable](/docs/connect)
service. Each Connect service has a virtual IP assigned to it by Consul - this is used
by sidecar proxies for the [Transparent Proxy](/docs/connect/transparent-proxy) feature.
The peer name is an optional part of the FQDN, and it is used to query for the virtual IP
of a service imported from that peer.
To lookup services outside the context of a mesh service's application code
with transparent proxy enabled, refer to non-virtual
[service lookups](#service-lookups) instead.
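
As a sketch of how application code might use this format, the Python below builds the virtual lookup name and resolves it through the system resolver. Both function names are hypothetical. Resolution is assumed to succeed only inside a mesh workload with transparent proxy enabled and DNS forwarded to Consul; elsewhere `resolve_virtual_ip` will raise `socket.gaierror`.

```python
import socket


def virtual_ip_lookup(service, peer=None, domain="consul"):
    """Build a service virtual IP lookup name (hypothetical helper)."""
    labels = [service, "virtual"]
    if peer:
        labels.append(peer)  # optional peer name, per the format above
    labels.append(domain)
    return ".".join(labels)


def resolve_virtual_ip(name):
    """Resolve a lookup name with the system resolver.

    Assumes the resolver forwards the Consul domain to a Consul agent.
    """
    infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    # Each entry ends with a sockaddr tuple whose first item is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    print(virtual_ip_lookup("api"))
    print(virtual_ip_lookup("api", peer="us-east"))
```

The application can then connect to the resolved address directly instead of a `localhost:port` pair mapped to an explicit upstream.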

The virtual IP is also added to the service's [Tagged Addresses](/docs/discovery/services#tagged-addresses)
under the `consul-virtual` tag.

#### Service Virtual IP Lookups for Consul Enterprise <EnterpriseAlert inline />

By default, a service virtual IP lookup uses the `default` namespace
within the partition and datacenter of the Consul agent that received the DNS query.

To lookup services imported from a cluster peered partition or open-source datacenter,
specify the namespace and peer name in the lookup:
```text
<service>.virtual[.<namespace>].<peer>.<domain>
```
Use the following query formats to perform a service virtual IP lookup
on namespaced or partitioned services:
Comment on lines +581 to +582
Contributor

Suggested change
Use the following query formats to perform a service virtual IP lookup
on namespaced or partitioned services:
Use the following query formats to perform a service virtual IP lookup
on services within a namespace or partition:

Same comment as above about parts of speech w/r/t namespace (and partition).


- Use the [canonical format](#canonical-format) to optionally specify
namespace, peer, and/or partition:

```text
<service>.virtual[.<namespace>.ns][.<peer>.peer][.<partition>.ap].<domain>
```

- Specify cluster peer and optional namespace in the lookup:
Comment on lines +584 to +591
Contributor

Suggested change
- Use the [canonical format](#canonical-format) to optionally specify
namespace, peer, and/or partition:
```text
<service>.virtual[.<namespace>.ns][.<peer>.peer][.<partition>.ap].<domain>
```
- Specify cluster peer and optional namespace in the lookup:
- Use the [canonical format](#canonical-format) to specify
namespace, peer, and/or partition:
```text
<service>.virtual[.<namespace>.ns][.<peer>.peer][.<partition>.ap].<domain>
```
- Specify cluster peer and namespace in the lookup:

Do we need to keep prefacing these parts as optional?


```text
<service>.virtual[.<namespace>].<peer>.<domain>
```

To lookup services not imported from a cluster peer,
refer to [service lookups for Consul Enterprise](#service-lookups-for-consul-enterprise) instead.
To lookup services outside the context of a mesh service's application code
with transparent proxy enabled, refer to non-virtual
[service lookups for Consul Enterprise](#service-lookups-for-consul-enterprise) instead.

### Ingress Service Lookups
