ACME DNS-01 challenge, end to end
DNS-01 is how Let's Encrypt verifies you control a domain when port 80 isn't an option, and the only path to wildcard certificates. Here's what happens over the wire, what can go wrong, and the part CertMate automates.
When a certificate authority issues you a certificate for
example.com, it has to first convince itself you actually
control that domain. The ACME protocol defines three ways to prove it:
HTTP-01 (serve a file at a specific URL), TLS-ALPN-01 (negotiate a
specific TLS extension), and DNS-01 (place a TXT record at a specific
name). DNS-01 is the only one of the three that works for hosts that
aren't on the public internet, and the only one that can issue
wildcards.
The wire-level dance
Here's exactly what happens when you ask Let's Encrypt for a cert via DNS-01, step by step.
- Order. The ACME client POSTs to the CA's newOrder endpoint (discovered via the CA's directory), listing the identifiers it wants (example.com, *.example.com). The CA responds with a list of authorization URLs, one per identifier.
- Challenge selection. The client fetches each authorization, picks DNS-01 from the offered challenges, and reads its token. From the token it computes the key authorization: the token, a dot, and the base64url-encoded SHA-256 thumbprint of the client's account key.
- Provisioning. The client SHA-256s the key authorization, base64url-encodes the digest, and writes a TXT record at _acme-challenge.example.com with that value as the payload. The same name, _acme-challenge.example.com, is also the validation location for *.example.com, which has a subtle consequence: an order covering both example.com and *.example.com produces two challenges at the same DNS name, and you must publish both values (most providers handle this via multi-value TXT records).
- Notify. The client POSTs to the challenge URL to tell the CA "ready, check me."
- Validation. The CA's validators (multiple, geographically distributed; Let's Encrypt published the count as five in 2020 and the architecture as "multi-perspective") resolve _acme-challenge.example.com via the public DNS hierarchy. If at least the configured quorum sees the expected value, the challenge passes.
- Finalize. The client POSTs the CSR, the CA mints the certificate, the order moves to valid, and the client downloads the cert chain.
- Cleanup. The client deletes the TXT record.
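The provisioning arithmetic is easy to get subtly wrong, so here is a minimal sketch of it in Python using only the standard library. It follows the key-authorization and thumbprint rules from the ACME and JWK Thumbprint specifications; the function names are illustrative and are not taken from CertMate or certbot, and the sample key in the usage note is a placeholder, not a real key.

```python
import base64
import hashlib
import json

# Members that go into a JWK thumbprint, per key type (RFC 7638).
_THUMBPRINT_MEMBERS = {
    "RSA": ("e", "kty", "n"),
    "EC": ("crv", "kty", "x", "y"),
    "OKP": ("crv", "kty", "x"),
}


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as ACME requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 thumbprint of the account's public JWK."""
    members = {k: jwk[k] for k in _THUMBPRINT_MEMBERS[jwk["kty"]]}
    # Canonical form: required members only, sorted keys, no whitespace.
    canonical = json.dumps(members, sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, jwk: dict) -> str:
    """token + '.' + thumbprint of the account key."""
    return f"{token}.{jwk_thumbprint(jwk)}"


def dns01_txt_value(token: str, jwk: dict) -> str:
    """The value to publish in the TXT record."""
    digest = hashlib.sha256(key_authorization(token, jwk).encode("ascii")).digest()
    return b64url(digest)


def challenge_record_name(identifier: str) -> str:
    """example.com and *.example.com share one validation name."""
    base = identifier[2:] if identifier.startswith("*.") else identifier
    return f"_acme-challenge.{base}"
```

Calling challenge_record_name on "example.com" and on "*.example.com" yields the same name, which is why an order covering both needs two TXT values published side by side.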
Where it goes wrong
In practice, almost every DNS-01 failure is one of these four:
DNS propagation
Your DNS provider's API confirms the TXT record was written. The
authoritative nameservers haven't picked it up yet. The validator
queries, sees stale data, marks the challenge failed. The fix is to
wait for propagation before notifying the CA. CertMate does this
via certbot's --dns-X-propagation-seconds argument (a fixed delay
before validation), tuned per provider.
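A fixed delay is the simplest approach. An alternative is to actively poll until the record is visible. The sketch below is a hypothetical helper, not something CertMate or certbot ships: the lookup callable is an assumption standing in for whatever resolver you use to query the zone's authoritative nameservers, and it is injected so the loop itself stays dependency-free.

```python
import time
from typing import Callable, Iterable


def wait_for_txt(
    lookup: Callable[[str], Iterable[str]],  # hypothetical: returns TXT values for a name
    name: str,
    expected: str,
    timeout: float = 120.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until `expected` appears among the TXT values at `name`.

    Returns True once the value is seen, False if `timeout` seconds pass first.
    """
    deadline = clock() + timeout
    while True:
        try:
            values = set(lookup(name))
        except Exception:
            # Treat resolver errors (NXDOMAIN, timeouts) as "not there yet".
            values = set()
        if expected in values:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In practice you would want lookup to query each authoritative nameserver directly rather than a caching resolver, since a cached negative answer can outlive the actual propagation delay.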
CAA records
Your zone has a CAA record permitting only
digicert.com (because three years ago you bought an EV
cert and never cleaned up). Let's Encrypt walks the tree, sees the
restriction, refuses to issue. The fix is to delete or amend the CAA
before you start. The symptom is a clear CAA error message from the CA,
typically at validation or finalize; CertMate surfaces it in the audit log.
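To make the "walks the tree" behaviour concrete, here is a simplified sketch of the CAA decision in Python. It is an illustration under stated assumptions, not Let's Encrypt's implementation: the zone data is a hypothetical in-memory dict of name to (tag, value) pairs rather than live DNS, and it ignores critical flags, CNAMEs, and parameters after the semicolon.

```python
from typing import Dict, List, Tuple

CaaRecord = Tuple[str, str]  # (tag, value), e.g. ("issue", "digicert.com")


def relevant_caa(zone_data: Dict[str, List[CaaRecord]], domain: str) -> List[CaaRecord]:
    """Climb from the domain toward the root; the first CAA set found wins."""
    name = domain.lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    labels = name.split(".")
    for i in range(len(labels)):
        records = zone_data.get(".".join(labels[i:]))
        if records:
            return records
    return []


def issuer_permitted(records: List[CaaRecord], issuer: str, wildcard: bool = False) -> bool:
    """Apply the issue / issuewild rules to one CAA record set."""
    issue = [v for tag, v in records if tag == "issue"]
    issuewild = [v for tag, v in records if tag == "issuewild"]
    # For wildcard names, issuewild takes precedence when present.
    relevant = issuewild if (wildcard and issuewild) else issue
    if not relevant:
        return True  # no applicable restriction
    return any(v.split(";")[0].strip().lower() == issuer for v in relevant)
```

With a zone holding only ("issue", "digicert.com") at example.com, issuer_permitted returns False for "letsencrypt.org" on any name under example.com, which is exactly the stale-record scenario described above.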
Provider account scope
Your Cloudflare API token has Zone:DNS:Edit for
example.com but not for internal.example.com,
which lives in a separate zone. The TXT write returns 403. The fix
is to widen the token scope, or to issue against a per-zone token
(CertMate's multi-account support is exactly this).
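One way to think about per-zone tokens is as a longest-suffix match from the name being validated to the zone that actually hosts it. The sketch below illustrates that idea only; the function name and the zone-to-account mapping are hypothetical and do not describe CertMate's actual configuration format.

```python
from typing import Dict, Optional


def pick_account(domain: str, zone_accounts: Dict[str, str]) -> Optional[str]:
    """Return the account for the most specific zone containing `domain`.

    `zone_accounts` maps a zone name to an account label (hypothetical).
    """
    name = domain.lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    best_account = None
    best_len = -1
    for zone, account in zone_accounts.items():
        z = zone.lower().rstrip(".")
        # Match only on label boundaries, so "badexample.com" is not in "example.com".
        if (name == z or name.endswith("." + z)) and len(z) > best_len:
            best_len = len(z)
            best_account = account
    return best_account
```

Given zones example.com and internal.example.com, a name such as db.internal.example.com resolves to the account for the more specific zone, so the TXT write is attempted with a token that is actually scoped to it.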
CNAME loops or split-horizon DNS
You have example.com resolving differently inside your
VPC than on the public internet. The validator only ever queries the
public side, so a TXT record written to the internal view is never
seen. The general fix is to delegate the ACME validation name,
_acme-challenge.example.com, via CNAME to a name in a zone you fully
control on the public side.
The bit CertMate does for you
Concretely, CertMate's DNS-01 pipeline does the following so you don't have to:
- Selects the DNS provider plugin from the certificate's dns_provider field, loads the credentials from the configured account, and invokes certbot with the right --dns-X-credentials arguments.
- Writes the TXT record, waits for propagation with provider-tuned timeouts, validates with the CA, and reads back the certificate.
- Stores the result in /app/certificates/<domain>/{cert.pem, privkey.pem, fullchain.pem} with restricted permissions.
- Fires the configured deploy hooks so whatever consumes the cert gets the new bytes.
- Logs the operation in the audit log with the user / token identity and the wall-clock timing of each phase. If anything fails, the failure mode is in the log with the certbot error verbatim.
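For a sense of what the certbot invocation in the first step might look like, here is a rough sketch that assembles the argument list. It assumes the --dns-X-... naming pattern used by plugins such as certbot-dns-cloudflare; not every plugin follows it (the Route 53 plugin, for example, takes no credentials file). The function and its parameters are hypothetical, and CertMate's real invocation may pass different or additional flags.

```python
from typing import List, Sequence


def certbot_dns01_argv(
    plugin: str,
    credentials_path: str,
    domains: Sequence[str],
    propagation_seconds: int = 60,
) -> List[str]:
    """Build a certbot command line for a DNS-01 issuance (illustrative only)."""
    if not domains:
        raise ValueError("at least one domain is required")
    argv = [
        "certbot", "certonly", "--non-interactive",
        f"--dns-{plugin}",
        f"--dns-{plugin}-credentials", credentials_path,
        f"--dns-{plugin}-propagation-seconds", str(propagation_seconds),
    ]
    for domain in domains:
        argv += ["-d", domain]
    return argv
```

Building the list rather than a shell string means the result can be handed to subprocess.run without a shell, so a wildcard such as *.example.com is never subject to glob expansion.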
When NOT to use DNS-01
DNS-01 is the right answer for almost everything, but there are two cases where HTTP-01 still wins:
- You don't have DNS API access. A managed DNS that only exposes a web UI is fine for HTTP-01 (the cert manager just serves a file). CertMate doesn't offer this path because the central premise is automation.
- Your DNS provider has slow propagation under load and your renewal cadence is tight. A handful of small providers have publicly documented 60+ second propagation under contention. HTTP-01 is instant on those.
For the wildcard case, there is no alternative. If you want
*.example.com, you're doing DNS-01. Continue to
the wildcard topic for
the practical walkthrough.