Verification API¶
The verification package chains individual checkers into a 5-step pipeline.
Pipeline¶
coldreach.verify.pipeline ¶
Email verification pipeline — chains individual checkers in order.
Steps (in order):
1. Syntax: RFC 5322 structure validation
2. Disposable: known throwaway domain blocklist
3. DNS / MX: async domain existence + MX record lookup
4. Reacher: SMTP verification via Reacher microservice (optional)
5. Holehe: platform-presence check across 120+ sites (optional, slow)
Each check contributes a score_delta to a running confidence score.
The pipeline stops early on a hard FAIL so downstream checks aren't wasted.
Score baseline
All emails start at a neutral baseline of 30. Checks add or subtract:
Syntax PASS: implied (no delta — just a gate)
Not disposable: +5
MX records found: +10
SMTP valid: +20 (Reacher — requires Docker service)
Holehe platforms: +15 (Holehe — slow, opt-in)
Found on website: +35 (source hint from crawler)
Final score is clamped to [0, 100].
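The scoring rule above can be sketched as a small function. This is an illustration only: `BASE_SCORE` and `final_score` are hypothetical names and are not part of the coldreach API.

```python
BASE_SCORE = 30  # neutral baseline stated above


def final_score(deltas: list[int], base: int = BASE_SCORE) -> int:
    """Add every check's score_delta to the baseline and clamp to [0, 100]."""
    return max(0, min(100, base + sum(deltas)))


# Not disposable (+5), MX records found (+10), SMTP valid (+20)
print(final_score([5, 10, 20]))  # 65
# All positive signals listed above add up to 115, so the clamp applies
print(final_score([5, 10, 20, 15, 35]))  # 100
```

Clamping after summation keeps the result in a fixed range no matter how many optional checks contribute.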
PipelineResult dataclass ¶
Aggregated result of all verification checks for one email address.
Attributes:
| Name | Type | Description |
|---|---|---|
| email | str | The raw input email (not normalised). |
| checks | dict[str, CheckResult] | Ordered mapping of check name → CheckResult. |
| base_score | int | Starting score before check deltas are applied. |
normalized_email property ¶
RFC-normalised email if syntax check passed, else raw input.
to_dict ¶
Serialise to a plain dict (for JSON output).
Source code in coldreach/verify/pipeline.py
run_basic_pipeline async ¶
run_basic_pipeline(
email,
*,
dns_timeout=5.0,
reacher_url=None,
reacher_timeout=15.0,
run_holehe=False,
holehe_timeout=30.0,
)
Run the full verification pipeline for one email address.
Steps: syntax → disposable → DNS → Reacher (optional) → Holehe (optional).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | The email address to verify. | required |
| dns_timeout | float | Timeout in seconds for the DNS resolver. | 5.0 |
| reacher_url | str \| None | Base URL of the Reacher SMTP service (e.g. …). | None |
| reacher_timeout | float | HTTP timeout for Reacher requests (SMTP handshakes can be slow). | 15.0 |
| run_holehe | bool | If True, check whether the email is registered on 120+ platforms. This is slow (15-45 s); only enable it for high-value candidates. | False |
| holehe_timeout | float | Per-request HTTP timeout for holehe module calls. | 30.0 |
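The control flow described for this function (run checkers in order, collect their results, stop at the first hard FAIL) might look roughly like the sketch below. It does not reproduce the coldreach source: `StepResult`, `run_steps`, and the three stub checkers are hypothetical stand-ins so that the example runs without any network access.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StepResult:  # hypothetical stand-in for CheckResult
    status: str  # "pass", "fail", "warn" or "skip"
    score_delta: int = 0


async def run_steps(email, steps, base_score=30):
    """Run (name, checker) pairs in order and stop at the first hard FAIL."""
    checks = {}
    score = base_score
    for name, checker in steps:
        result = await checker(email)
        checks[name] = result
        score += result.score_delta
        if result.status == "fail":
            break  # later checks are never awaited
    return checks, max(0, min(100, score))


# Stub checkers with fixed outcomes (assumptions for the demo only)
async def syntax_ok(email):
    return StepResult("pass")


async def disposable_hit(email):
    return StepResult("fail", -50)


async def dns_ok(email):
    return StepResult("pass", 10)


steps = [("syntax", syntax_ok), ("disposable", disposable_hit), ("dns", dns_ok)]
checks, score = asyncio.run(run_steps("test@mailinator.com", steps))
print(list(checks), score)  # ['syntax', 'disposable'] 0
```

Optional steps such as Reacher or Holehe would simply be left out of `steps` when they are not configured.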
Source code in coldreach/verify/pipeline.py
Check types¶
coldreach.verify._types ¶
Internal types for the verification pipeline.
CheckResult is the standard return type of every checker function.
It carries a pass/fail status, a human-readable reason, a score delta that
feeds into the final confidence score, and an open-ended metadata dict for
checker-specific data (e.g. MX records, platform list).
CheckStatus ¶
Bases: StrEnum
Outcome of a single verification check.
PASS class-attribute instance-attribute ¶
The check passed: the email looks good for this criterion.
FAIL class-attribute instance-attribute ¶
The check failed: the email should be discarded or heavily penalised.
WARN class-attribute instance-attribute ¶
The check raised a concern but did not hard-fail.
SKIP class-attribute instance-attribute ¶
The check was not applicable or the service was unavailable.
CheckResult dataclass ¶
Result of a single verification check.
Attributes:
| Name | Type | Description |
|---|---|---|
| status | CheckStatus | Pass, fail, warn, or skip. |
| reason | str | Human-readable explanation, shown in CLI output. |
| score_delta | int | Amount to add to (positive) or subtract from (negative) the running confidence score. Checkers that are informational only use … |
| metadata | dict[str, Any] | Checker-specific extra data (e.g. …). |
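For orientation, the two types described above could be modelled as in the following sketch. The enum string values, the field defaults, and the bodies of `passed` and `failed` are assumptions; only the attribute names and the status members come from this page (the `passed` and `failed` accessors appear in the doctest examples further down).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


# The page lists StrEnum as the base class; str + Enum is used here so the
# sketch also runs on Python versions older than 3.11.
class CheckStatus(str, Enum):
    PASS = "pass"  # string values are assumed
    FAIL = "fail"
    WARN = "warn"
    SKIP = "skip"


@dataclass
class CheckResult:
    status: CheckStatus
    reason: str = ""
    score_delta: int = 0
    metadata: dict[str, Any] = field(default_factory=dict)

    @property
    def passed(self) -> bool:
        return self.status is CheckStatus.PASS

    @property
    def failed(self) -> bool:
        return self.status is CheckStatus.FAIL


result = CheckResult(
    CheckStatus.PASS,
    reason="MX records found",
    score_delta=10,
    metadata={"mx_records": ["mx.example.com"]},
)
print(result.passed, result.failed, result.score_delta)  # True False 10
```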
Individual checkers¶
Syntax¶
coldreach.verify.syntax ¶
Email syntax validation (RFC 5321 / 5322).
Uses the email-validator library which implements the full RFC spec,
including international domain names (IDN), quoted local parts, IP address
domains, and correct normalisation (lowercase domain, Unicode NFC).
This is the first and fastest check in the pipeline — pure CPU, no network.
check_syntax ¶
Validate an email address against RFC 5322 syntax rules.
Does not check whether the mailbox actually exists — this is a structural check only.
On success the metadata dict contains:
- "normalized": the RFC-normalised form of the address (lowercase
domain, Unicode NFC local part).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | The raw email address string to validate. | required |

Returns:
| Type | Description |
|---|---|
| CheckResult | PASS with normalized form in metadata, or FAIL with reason. |

Examples:
>>> result = check_syntax("John.Smith@Example.COM")
>>> result.passed
True
>>> result.metadata["normalized"]
'john.smith@example.com'
Source code in coldreach/verify/syntax.py
Disposable domain¶
coldreach.verify.disposable ¶
Disposable / throwaway email domain detection.
Checks whether the domain part of an email address belongs to a known throwaway or temporary email service. These addresses are useless for lead generation — they're created once, read once, and abandoned.
The blocklist is bundled in coldreach/data/disposable_domains.txt
and loaded once, then cached via functools.lru_cache.
To extend the list: add new lowercase domain names (one per line) to that file, or contribute upstream at: https://github.com/disposable-email-domains/disposable-email-domains
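A minimal sketch of the load-once-and-cache pattern described above follows. To stay self-contained it parses an inline string instead of the bundled file, and `load_blocklist` is a hypothetical helper name; the real module may be organised differently.

```python
from functools import lru_cache

# Stand-in for the bundled blocklist file: one lowercase domain per line.
_BLOCKLIST_TEXT = """\
mailinator.com
guerrillamail.com
"""


@lru_cache(maxsize=1)
def load_blocklist() -> frozenset[str]:
    """Parse the blocklist once; later calls return the cached set."""
    return frozenset(
        line.strip().lower()
        for line in _BLOCKLIST_TEXT.splitlines()
        if line.strip()
    )


def is_disposable(email: str) -> bool:
    """Check only the part after the last '@' against the blocklist."""
    domain = email.rsplit("@", 1)[-1].strip().lower()
    return domain in load_blocklist()


print(is_disposable("test@mailinator.com"))  # True
print(is_disposable("john@stripe.com"))      # False
```

A frozenset gives constant-time membership tests and cannot be mutated by callers that receive the cached object.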
is_disposable ¶
Return True if email uses a known disposable / throwaway domain.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | Full email address. Only the domain part is checked. | required |

Returns:
| Type | Description |
|---|---|
| bool | True if the domain is a known disposable domain, False otherwise. |

Source code in coldreach/verify/disposable.py
check_disposable ¶
Pipeline checker: fail if email uses a disposable domain.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | Full email address string. | required |

Returns:
| Type | Description |
|---|---|
| CheckResult | FAIL (score -50) if disposable, PASS (score +5) otherwise. |

Examples:
>>> check_disposable("test@mailinator.com").failed
True
>>> check_disposable("john@stripe.com").passed
True
Source code in coldreach/verify/disposable.py
DNS / MX¶
coldreach.verify.dns_check ¶
Async DNS / MX record checker.
Resolves MX records for an email's domain to confirm the domain is capable of receiving email. This rules out typos, NXDOMAIN addresses, and domains that have never been configured as mail receivers.
Uses dnspython's native async resolver (dns.asyncresolver) — no thread
pool hacks. Requires Python 3.11+ and dnspython >= 2.0.
Priority order of DNS checks
- MX records (primary — what mail servers accept for this domain?)
- A record fallback (RFC 5321 §5: a domain with no MX but a valid A record is still technically a valid mail destination — we warn rather than fail)
get_mx_records async ¶
Resolve MX records for domain, sorted by priority (lowest first).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| domain | str | The domain to look up (e.g. …). | required |
| timeout | float | DNS query timeout in seconds. | 5.0 |

Returns:
| Type | Description |
|---|---|
| list[str] | MX hostnames in priority order (lowest preference value first). Empty list if the domain has no MX records or does not exist. |

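The ordering rule can be illustrated without a resolver. In this sketch, `sort_mx` is a hypothetical helper that takes (preference, hostname) pairs such as a DNS library might return; it is not the function documented here and performs no lookup.

```python
def sort_mx(records: list[tuple[int, str]]) -> list[str]:
    """Return MX hostnames ordered by preference value, lowest first."""
    return [host for _pref, host in sorted(records, key=lambda rec: rec[0])]


print(sort_mx([(20, "alt.example.com"), (10, "mx.example.com")]))
# ['mx.example.com', 'alt.example.com']
print(sort_mx([]))  # []
```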
Source code in coldreach/verify/dns_check.py
domain_exists async ¶
Return True if domain resolves to at least one A or AAAA record.
Used as a fallback when no MX records are found, since some small domains rely on the implicit MX → A record fallback defined in RFC 5321 §5.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| domain | str | Domain name to check. | required |
| timeout | float | DNS query timeout in seconds. | 5.0 |

Source code in coldreach/verify/dns_check.py
check_dns async ¶
Pipeline checker: verify the email's domain has a valid MX record.
Scoring
- MX records found: +10 points
- No MX but A record: WARN, +0 points (RFC fallback, unusual)
- NXDOMAIN / no A record: FAIL, -30 points (domain does not exist)
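The three scoring cases above reduce to a small decision function. The sketch below takes already-resolved inputs so it runs offline; `classify_dns` and its return shape are hypothetical and chosen only for illustration.

```python
def classify_dns(mx_records: list[str], has_address_record: bool) -> tuple[str, int]:
    """Map DNS findings to (status, score_delta) following the rules above."""
    if mx_records:
        return ("PASS", 10)   # MX records found
    if has_address_record:
        return ("WARN", 0)    # no MX, but the A record fallback applies
    return ("FAIL", -30)      # domain does not resolve at all


print(classify_dns(["mx.example.com"], True))  # ('PASS', 10)
print(classify_dns([], True))                  # ('WARN', 0)
print(classify_dns([], False))                 # ('FAIL', -30)
```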
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | Full email address string. | required |
| timeout | float | DNS resolution timeout in seconds. | 5.0 |

Returns:
| Type | Description |
|---|---|
| CheckResult | PASS / WARN / FAIL as described above. PASS metadata includes … |

Examples:
>>> import asyncio
>>> result = asyncio.run(check_dns("test@gmail.com"))
>>> result.passed
True
>>> "mx_records" in result.metadata
True
Source code in coldreach/verify/dns_check.py
Reacher SMTP¶
coldreach.verify.reacher ¶
Reacher SMTP verification client.
Reacher (https://reacher.email) is a self-hosted Rust microservice that performs full SMTP verification including:
- SMTP connection test
- RCPT TO probe (is the mailbox deliverable?)
- Catch-all detection
- MX record lookup (can be skipped if already known)
Requires the Reacher Docker service to be running: docker compose up reacher
API endpoint: POST /v0/check_email
Request: { "to_email": "john@example.com" }
Response: Full JSON with smtp, mx, misc, syntax sections.
This checker is optional — gracefully returns SKIP if Reacher is not running or not configured.
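The request shape described above can be sketched with the standard library alone. This is not the coldreach client, which is async and may use a different HTTP library; `build_reacher_request` is a hypothetical helper, and the base URL `http://localhost:8080` is an assumed example value. The request is only constructed here, never sent.

```python
import json
import urllib.request


def build_reacher_request(reacher_url: str, email: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v0/check_email request."""
    url = reacher_url.rstrip("/") + "/v0/check_email"
    body = json.dumps({"to_email": email}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The base URL below is an assumption for the demo
req = build_reacher_request("http://localhost:8080", "john@example.com")
print(req.get_method(), req.full_url)
# POST http://localhost:8080/v0/check_email
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` with a timeout, then handling a connection error as the SKIP case.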
check_reacher async ¶
Verify email via the Reacher SMTP microservice.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | The email address to verify. | required |
| reacher_url | str | Base URL of the Reacher service, e.g. … | required |
| timeout | float | HTTP timeout in seconds (SMTP handshakes can be slow). | 15.0 |

Returns:
| Type | Description |
|---|---|
| CheckResult | |

Source code in coldreach/verify/reacher.py
Catch-all detection¶
coldreach.verify.catchall ¶
Catch-all domain detection.
A "catch-all" mail server accepts RCPT TO for ANY address at the domain — including completely random ones. This makes SMTP verification useless for individual addresses.
Detection method: Ask Reacher to verify a randomly-generated address at the domain. If it comes back "deliverable", the domain is catch-all.
Fallback (no Reacher): always returns UNKNOWN — we can't detect catch-all without sending an actual SMTP probe.
Result is cached per domain for the session to avoid redundant probes.
check_catchall async ¶
Probe the domain to detect catch-all behaviour.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| domain | str | The domain to probe (e.g. …). | required |
| reacher_url | str \| None | Base URL of the Reacher microservice. If None, returns SKIP. | None |
| timeout | float | HTTP timeout for the Reacher request. | 10.0 |

Returns:
| Type | Description |
|---|---|
| CheckResult | |

Source code in coldreach/verify/catchall.py
is_catch_all async ¶
Return True/False/None for catch-all status.
Convenience wrapper around check_catchall that returns a plain boolean (or None if unknown) rather than a CheckResult.
Source code in coldreach/verify/catchall.py
Holehe platform check¶
coldreach.verify.holehe ¶
Holehe platform-presence check.
Uses the holehe library (github.com/megadose/holehe) to check whether an email address is registered on 120+ public platforms (GitHub, Discord, Spotify, Slack, etc.).
A positive result on multiple platforms confirms the email is real and the person is active — especially valuable for catch-all domains where SMTP verification is unreliable.
IMPORTANT: This check makes up to 120 HTTP requests. It is SLOW (15-45s). Only enable it explicitly via --holehe or use_holehe=True in FinderConfig.
Score deltas:
- +15: registered on ≥ 2 platforms
- +5: registered on exactly 1 platform
- 0: not found (WARN, not FAIL, because private persons may not be on these platforms)
- 0: SKIP, holehe not installed
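The score deltas above map onto a short function. In this sketch `holehe_delta` is a hypothetical name, the default of 2 for `min_platforms` is inferred from the "≥ 2 platforms" rule rather than read from the source, and treating any count between 1 and the threshold as +5 is an assumption.

```python
def holehe_delta(platform_count: int, min_platforms: int = 2) -> tuple[str, int]:
    """Map the number of platform registrations to (status, score_delta)."""
    if platform_count >= min_platforms:
        return ("PASS", 15)
    if platform_count >= 1:
        return ("PASS", 5)
    return ("WARN", 0)  # not found is a warning, not a failure


print(holehe_delta(3))  # ('PASS', 15)
print(holehe_delta(1))  # ('PASS', 5)
print(holehe_delta(0))  # ('WARN', 0)
```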
check_holehe async ¶
Check if email is registered on public platforms via holehe.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| email | str | Email address to check. | required |
| timeout | float | Per-request HTTP timeout for holehe module calls. | 10.0 |
| min_platforms | int | Registrations needed to award the full +15 score delta. | _MIN_PLATFORMS |
| concurrency | int | Maximum simultaneous holehe module requests. | _CONCURRENCY |

Returns:
| Type | Description |
|---|---|
| CheckResult | PASS (+15) if registered on ≥ min_platforms platforms. PASS (+5) if registered on exactly 1 platform. WARN (0) if 0 registrations found. SKIP if holehe is not installed. |
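One common way to honour a `concurrency` limit like the one above is an `asyncio.Semaphore`. The sketch below shows that general technique with fake jobs; it is an assumption about how such a limit could be enforced, and `run_bounded` is a hypothetical helper, not code from coldreach or holehe.

```python
import asyncio


async def run_bounded(jobs, concurrency: int):
    """Await all coroutine factories with at most `concurrency` running at once."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(job):
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def demo():
    running = 0
    peak = 0

    async def fake_module():
        # Stand-in for one platform check; tracks how many run simultaneously
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)
        running -= 1
        return True

    results = await run_bounded([fake_module] * 10, concurrency=3)
    return len(results), peak


print(asyncio.run(demo()))  # (10, 3)
```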