
Verification API

The verification package chains individual checkers into a 5-step pipeline.


Pipeline

coldreach.verify.pipeline

Email verification pipeline — chains individual checkers in order.

Steps (in order):
  1. Syntax: RFC 5322 structure validation
  2. Disposable: known throwaway domain blocklist
  3. DNS / MX: async domain existence + MX record lookup
  4. Reacher: SMTP verification via Reacher microservice (optional)
  5. Holehe: platform-presence check across 120+ sites (optional, slow)

Each check contributes a score_delta to a running confidence score. The pipeline stops early on a hard FAIL so downstream checks aren't wasted.
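
The early-exit behaviour can be sketched in isolation. The block below is a minimal illustration, not the package's implementation: StepResult, Outcome, run_steps and the two toy checkers are hypothetical stand-ins for CheckResult, PipelineResult and the real checkers.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class StepResult:
    """Hypothetical stand-in for CheckResult: a status plus a score delta."""
    status: str  # "pass", "fail", "warn" or "skip"
    score_delta: int = 0


@dataclass
class Outcome:
    """Hypothetical stand-in for PipelineResult: checks in execution order."""
    checks: dict[str, StepResult] = field(default_factory=dict)


def run_steps(email: str, steps: list[tuple[str, Callable[[str], StepResult]]]) -> Outcome:
    """Run checkers in order and stop at the first hard FAIL."""
    outcome = Outcome()
    for name, checker in steps:
        result = checker(email)
        outcome.checks[name] = result
        if result.status == "fail":
            break  # later (slower) checks never run
    return outcome


# Toy checkers, for illustration only.
def toy_syntax(email: str) -> StepResult:
    return StepResult("pass") if "@" in email else StepResult("fail", -100)


def toy_dns(email: str) -> StepResult:
    return StepResult("pass", 10)


STEPS = [("syntax", toy_syntax), ("dns", toy_dns)]
```

With these toy steps, an input without "@" records only the syntax check, while a well-formed address records both.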

Score baseline

All emails start at a neutral baseline of 30. Checks add or subtract:

Syntax PASS:         implied (no delta — just a gate)
Not disposable:     +5
MX records found:   +10
SMTP valid:         +20  (Reacher — requires Docker service)
Holehe platforms:   +15  (Holehe — slow, opt-in)
Found on website:   +35  (source hint from crawler)

Final score is clamped to [0, 100].
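
As a worked example of the arithmetic above: an address that is not disposable (+5), has MX records (+10) and passes SMTP (+20) ends at 30 + 5 + 10 + 20 = 65. The sketch below assumes the deltas are summed onto the baseline and clamped once at the end; final_score is a hypothetical helper, not part of the package.

```python
BASELINE = 30  # neutral starting score, per the table above


def final_score(deltas: list[int], base: int = BASELINE) -> int:
    """Sum the deltas onto the baseline and clamp the total to [0, 100]."""
    return max(0, min(100, base + sum(deltas)))


# Not disposable (+5), MX found (+10), SMTP valid (+20)
typical = final_score([5, 10, 20])

# A syntax failure carries -100, so the clamp floors the score at 0
rejected = final_score([-100])

# Every bonus in the table at once would exceed 100 without the clamp
everything = final_score([5, 10, 20, 15, 35])
```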

PipelineResult dataclass

PipelineResult(
    email, checks=dict(), base_score=_BASELINE_SCORE
)

Aggregated result of all verification checks for one email address.

Attributes:

  • email (str): The raw input email (not normalised).
  • checks (dict[str, CheckResult]): Ordered mapping of check name → CheckResult.
  • base_score (int): Starting score before check deltas are applied.

normalized_email property

normalized_email

RFC-normalised email if syntax check passed, else raw input.

score property

score

Final confidence score clamped to [0, 100].

passed property

passed

True if no check returned FAIL status.

failed property

failed

True if at least one check returned FAIL status.

mx_records property

mx_records

MX records returned by the DNS check, if available.

domain property

domain

Domain part of the (normalised) email.

failure_reason property

failure_reason

Reason of the first failing check, if any.

to_dict

to_dict()

Serialise to a plain dict (for JSON output).

Source code in coldreach/verify/pipeline.py
def to_dict(self) -> dict[str, Any]:
    """Serialise to a plain dict (for JSON output)."""
    return {
        "email": self.email,
        "normalized": self.normalized_email,
        "passed": self.passed,
        "score": self.score,
        "mx_records": self.mx_records,
        "checks": {
            name: {
                "status": check.status.value,
                "reason": check.reason,
                "score_delta": check.score_delta,
            }
            for name, check in self.checks.items()
        },
    }

run_basic_pipeline async

run_basic_pipeline(
    email,
    *,
    dns_timeout=5.0,
    reacher_url=None,
    reacher_timeout=15.0,
    run_holehe=False,
    holehe_timeout=30.0,
)

Run the full verification pipeline for one email address.

Steps: syntax → disposable → DNS → Reacher (optional) → Holehe (optional).

Parameters:

  • email (str, required): The email address to verify.
  • dns_timeout (float, default 5.0): Timeout in seconds for the DNS resolver.
  • reacher_url (str | None, default None): Base URL of the Reacher SMTP service (e.g. "http://localhost:8083"). Pass None to skip SMTP verification.
  • reacher_timeout (float, default 15.0): HTTP timeout for Reacher requests (SMTP handshakes can be slow).
  • run_holehe (bool, default False): If True, check whether the email is registered on 120+ platforms. This is slow (15-45s), so enable it only for high-value candidates.
  • holehe_timeout (float, default 30.0): Per-request HTTP timeout for holehe module calls.

Source code in coldreach/verify/pipeline.py
async def run_basic_pipeline(
    email: str,
    *,
    dns_timeout: float = 5.0,
    reacher_url: str | None = None,
    reacher_timeout: float = 15.0,
    run_holehe: bool = False,
    holehe_timeout: float = 30.0,
) -> PipelineResult:
    """Run the full verification pipeline for one email address.

    Steps: syntax → disposable → DNS → Reacher (optional) → Holehe (optional).

    Parameters
    ----------
    email:
        The email address to verify.
    dns_timeout:
        Timeout in seconds for the DNS resolver.
    reacher_url:
        Base URL of the Reacher SMTP service (e.g. ``"http://localhost:8083"``).
        Pass ``None`` to skip SMTP verification.
    reacher_timeout:
        HTTP timeout for Reacher requests (SMTP handshakes can be slow).
    run_holehe:
        If True, check whether the email is registered on 120+ platforms.
        This is slow (15-45s) — only enable for high-value candidates.
    holehe_timeout:
        Per-request HTTP timeout for holehe module calls.
    """
    result = PipelineResult(email=email)

    # ── Step 1: Syntax ────────────────────────────────────────────────────────
    syntax_result = check_syntax(email)
    result.checks["syntax"] = syntax_result

    if syntax_result.failed:
        logger.debug("Pipeline stopped at syntax for %r", email)
        return result

    normalized = str(syntax_result.metadata.get("normalized", email))

    # ── Step 2: Disposable domain ─────────────────────────────────────────────
    disposable_result = check_disposable(normalized)
    result.checks["disposable"] = disposable_result

    if disposable_result.failed:
        logger.debug("Pipeline stopped at disposable for %r", normalized)
        return result

    # ── Step 3: DNS / MX records ──────────────────────────────────────────────
    dns_result = await check_dns(normalized, timeout=dns_timeout)
    result.checks["dns"] = dns_result

    if dns_result.failed:
        logger.debug("Pipeline stopped at DNS for %r", normalized)
        return result

    # ── Step 4: Reacher SMTP verification (optional) ─────────────────────────
    if reacher_url:
        reacher_result = await check_reacher(
            normalized,
            reacher_url=reacher_url,
            timeout=reacher_timeout,
        )
        result.checks["reacher"] = reacher_result
        if reacher_result.failed:
            logger.debug("Pipeline stopped at Reacher for %r", normalized)
            return result

    # ── Step 5: Holehe platform presence (optional, slow) ────────────────────
    if run_holehe:
        holehe_result = await check_holehe(normalized, timeout=holehe_timeout)
        result.checks["holehe"] = holehe_result

    logger.debug(
        "Pipeline complete for %r — passed=%s score=%d",
        normalized,
        result.passed,
        result.score,
    )
    return result

Check types

coldreach.verify._types

Internal types for the verification pipeline.

CheckResult is the standard return type of every checker function. It carries a pass/fail status, a human-readable reason, a score delta that feeds into the final confidence score, and an open-ended metadata dict for checker-specific data (e.g. MX records, platform list).

CheckStatus

Bases: StrEnum

Outcome of a single verification check.

PASS class-attribute instance-attribute

PASS = 'pass'

The check passed — email looks good for this criterion.

FAIL class-attribute instance-attribute

FAIL = 'fail'

The check failed — email should be discarded or heavily penalised.

WARN class-attribute instance-attribute

WARN = 'warn'

The check raised a concern but did not hard-fail.

SKIP class-attribute instance-attribute

SKIP = 'skip'

The check was not applicable or the service was unavailable.
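
Because CheckStatus derives from StrEnum, its members compare equal to their plain string values and can be looked up from them, which is convenient when reading serialised results back in. The sketch below redeclares the enum locally so it runs standalone; parse_status is a hypothetical helper, and the fallback class is only there for interpreters older than Python 3.11.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # older interpreters: equivalent str-mixin fallback
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class CheckStatus(StrEnum):
    """Local mirror of the documented enum, redeclared for this sketch."""
    PASS = "pass"
    FAIL = "fail"
    WARN = "warn"
    SKIP = "skip"


def parse_status(raw: str) -> CheckStatus:
    """Look a member up by its string value, e.g. from a serialised check."""
    return CheckStatus(raw)


# Members compare equal to their string values
is_pass = CheckStatus.PASS == "pass"
```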

CheckResult dataclass

CheckResult(
    status, reason="", score_delta=0, metadata=dict()
)

Result of a single verification check.

Attributes:

  • status (CheckStatus): Pass, fail, warn, or skip.
  • reason (str): Human-readable explanation, shown in CLI output.
  • score_delta (int): Amount to add to (positive) or subtract from (negative) the running confidence score. Checkers that are informational only use 0.
  • metadata (dict[str, Any]): Checker-specific extra data (e.g. {"mx_records": [...]}).

passed property

passed

True if status is PASS.

failed property

failed

True if status is FAIL.

warned property

warned

True if status is WARN.

skipped property

skipped

True if status is SKIP.

pass_ classmethod

pass_(reason='', score_delta=0, **metadata)

Create a passing result.

Source code in coldreach/verify/_types.py
@classmethod
def pass_(
    cls,
    reason: str = "",
    score_delta: int = 0,
    **metadata: Any,
) -> CheckResult:
    """Create a passing result."""
    return cls(CheckStatus.PASS, reason, score_delta, dict(metadata))

fail classmethod

fail(reason, score_delta=0, **metadata)

Create a failing result.

Source code in coldreach/verify/_types.py
@classmethod
def fail(
    cls,
    reason: str,
    score_delta: int = 0,
    **metadata: Any,
) -> CheckResult:
    """Create a failing result."""
    return cls(CheckStatus.FAIL, reason, score_delta, dict(metadata))

warn classmethod

warn(reason, score_delta=0, **metadata)

Create a warning result.

Source code in coldreach/verify/_types.py
@classmethod
def warn(
    cls,
    reason: str,
    score_delta: int = 0,
    **metadata: Any,
) -> CheckResult:
    """Create a warning result."""
    return cls(CheckStatus.WARN, reason, score_delta, dict(metadata))

skip classmethod

skip(reason='service unavailable')

Create a skipped result.

Source code in coldreach/verify/_types.py
@classmethod
def skip(cls, reason: str = "service unavailable") -> CheckResult:
    """Create a skipped result."""
    return cls(CheckStatus.SKIP, reason, 0, {})

Individual checkers

Syntax

coldreach.verify.syntax

Email syntax validation (RFC 5321 / 5322).

Uses the email-validator library which implements the full RFC spec, including international domain names (IDN), quoted local parts, IP address domains, and correct normalisation (lowercase domain, Unicode NFC).

This is the first and fastest check in the pipeline — pure CPU, no network.

check_syntax

check_syntax(email)

Validate an email address against RFC 5322 syntax rules.

Does not check whether the mailbox actually exists — this is a structural check only.

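On success the metadata dict contains:
  • "normalized": the normalised form of the address. The checker lowercases the whole address (local part and domain) after RFC normalisation, so "John.Smith@Example.COM" becomes "john.smith@example.com".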

Parameters:

  • email (str, required): The raw email address string to validate.

Returns:

  • CheckResult: PASS with normalized form in metadata, or FAIL with reason.

Examples:

>>> result = check_syntax("John.Smith@Example.COM")
>>> result.passed
True
>>> result.metadata["normalized"]
'john.smith@example.com'
>>> check_syntax("not-an-email").passed
False
Source code in coldreach/verify/syntax.py
def check_syntax(email: str) -> CheckResult:
    """Validate an email address against RFC 5322 syntax rules.

    Does **not** check whether the mailbox actually exists — this is a
    structural check only.

    On success the ``metadata`` dict contains:
    - ``"normalized"``: the normalised form of the address, lowercased in
      full (local part and domain) after RFC normalisation.

    Parameters
    ----------
    email:
        The raw email address string to validate.

    Returns
    -------
    CheckResult
        PASS with normalized form in metadata, or FAIL with reason.

    Examples
    --------
    >>> result = check_syntax("John.Smith@Example.COM")
    >>> result.passed
    True
    >>> result.metadata["normalized"]
    'john.smith@example.com'

    >>> check_syntax("not-an-email").passed
    False
    """
    if not email or not isinstance(email, str):
        return CheckResult.fail(
            "Email must be a non-empty string",
            score_delta=-100,
        )

    email = email.strip()

    try:
        validated = validate_email(email, check_deliverability=False)
        # RFC 5321 says local parts are technically case-sensitive, but in
        # practice all real mail servers treat them case-insensitively.
        # We lowercase the full address for consistent storage and deduplication.
        normalized = validated.normalized.lower()
        logger.debug("syntax OK: %s -> %s", email, normalized)
        return CheckResult.pass_(
            "Valid RFC 5322 syntax",
            normalized=normalized,
        )
    except EmailNotValidError as exc:
        logger.debug("syntax FAIL: %s: %s", email, exc)
        return CheckResult.fail(
            str(exc),
            score_delta=-100,
        )

Disposable domain

coldreach.verify.disposable

Disposable / throwaway email domain detection.

Checks whether the domain part of an email address belongs to a known throwaway or temporary email service. These addresses are useless for lead generation — they're created once, read once, and abandoned.

The blocklist is bundled in coldreach/data/disposable_domains.txt and loaded once, then cached via functools.lru_cache.

To extend the list: add new lowercase domain names (one per line) to that file, or contribute upstream at: https://github.com/disposable-email-domains/disposable-email-domains
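
The load-once-then-cache pattern can be sketched as follows. This is an illustration under stated assumptions, not the package's _load_domains: the file location is a throwaway temp file standing in for the bundled blocklist, load_domains and is_disposable_sketch are hypothetical names, and skipping blank lines is an assumption about the file format.

```python
import tempfile
from functools import lru_cache
from pathlib import Path

# Hypothetical location: a temp file stands in for the bundled blocklist.
_BLOCKLIST = Path(tempfile.mkdtemp()) / "disposable_domains.txt"
_BLOCKLIST.write_text("mailinator.com\n\nguerrillamail.com\n", encoding="utf-8")


@lru_cache(maxsize=1)
def load_domains() -> frozenset[str]:
    """Read the blocklist once; later calls reuse the cached frozenset."""
    lines = _BLOCKLIST.read_text(encoding="utf-8").splitlines()
    return frozenset(line.strip().lower() for line in lines if line.strip())


def is_disposable_sketch(email: str) -> bool:
    """Membership test on the domain part only."""
    _, _, domain = email.lower().rpartition("@")
    return domain in load_domains()
```

A frozenset keeps lookups constant-time and makes the cached value immutable, so callers cannot accidentally modify the shared blocklist.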

is_disposable

is_disposable(email)

Return True if email uses a known disposable / throwaway domain.

Parameters:

  • email (str, required): Full email address. Only the domain part is checked.

Returns:

  • bool: True if the domain is on the blocklist; False otherwise, including when the address has no domain part.

Examples:

>>> is_disposable("user@mailinator.com")
True
>>> is_disposable("user@gmail.com")
False
Source code in coldreach/verify/disposable.py
def is_disposable(email: str) -> bool:
    """Return True if *email* uses a known disposable / throwaway domain.

    Parameters
    ----------
    email:
        Full email address. Only the domain part is checked.

    Returns
    -------
    bool

    Examples
    --------
    >>> is_disposable("user@mailinator.com")
    True
    >>> is_disposable("user@gmail.com")
    False
    """
    try:
        domain = email.lower().split("@")[1]
    except IndexError:
        return False

    return domain in _load_domains()

check_disposable

check_disposable(email)

Pipeline checker: fail if email uses a disposable domain.

Parameters:

  • email (str, required): Full email address string.

Returns:

  • CheckResult: FAIL (score -50) if disposable, PASS (score +5) otherwise. SKIP if the address contains no "@" to extract a domain from.

Examples:

>>> check_disposable("test@mailinator.com").failed
True
>>> check_disposable("john@stripe.com").passed
True
Source code in coldreach/verify/disposable.py
def check_disposable(email: str) -> CheckResult:
    """Pipeline checker: fail if email uses a disposable domain.

    Parameters
    ----------
    email:
        Full email address string.

    Returns
    -------
    CheckResult
        FAIL (score -50) if disposable, PASS (score +5) otherwise.

    Examples
    --------
    >>> check_disposable("test@mailinator.com").failed
    True
    >>> check_disposable("john@stripe.com").passed
    True
    """
    if not email or "@" not in email:
        return CheckResult.skip("Invalid format — cannot extract domain")

    try:
        domain = email.lower().split("@")[1]
    except IndexError:
        return CheckResult.skip("Invalid format — no domain part")

    if domain in _load_domains():
        logger.debug("Disposable domain detected: %s", domain)
        return CheckResult.fail(
            f"Known disposable email service: {domain}",
            score_delta=-50,
            domain=domain,
        )

    logger.debug("Domain %s is not in disposable blocklist", domain)
    return CheckResult.pass_(
        "Not a disposable email domain",
        score_delta=5,
        domain=domain,
    )

DNS / MX

coldreach.verify.dns_check

Async DNS / MX record checker.

Resolves MX records for an email's domain to confirm the domain is capable of receiving email. This rules out typos, NXDOMAIN addresses, and domains that have never been configured as mail receivers.

Uses dnspython's native async resolver (dns.asyncresolver) — no thread pool hacks. Requires Python 3.11+ and dnspython >= 2.0.

Priority order of DNS checks
  1. MX records (primary — what mail servers accept for this domain?)
  2. A record fallback (RFC 5321 §5: a domain with no MX but a valid A record is still technically a valid mail destination — we warn rather than fail)
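
The priority order above can be sketched as a small decision function. This is a simplified illustration rather than the real check_dns: classify_domain is a hypothetical name, and the two lookups are injected as stubs so the sketch runs without network access. The statuses and deltas mirror the scoring documented for check_dns.

```python
import asyncio
from typing import Awaitable, Callable

Lookup = Callable[[str], Awaitable[bool]]


async def classify_domain(domain: str, has_mx: Lookup, has_address: Lookup) -> tuple[str, int]:
    """Return (status, score_delta) using MX first, then the A/AAAA fallback."""
    if await has_mx(domain):
        return ("pass", 10)
    if await has_address(domain):
        # Implicit-MX fallback: deliverable in theory, unusual in practice
        return ("warn", 0)
    return ("fail", -30)


# Stub lookups standing in for real DNS queries.
async def yes(_: str) -> bool:
    return True


async def no(_: str) -> bool:
    return False
```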

get_mx_records async

get_mx_records(domain, timeout=5.0)

Resolve MX records for domain, sorted by priority (lowest first).

Parameters:

  • domain (str, required): The domain to look up (e.g. "stripe.com").
  • timeout (float, default 5.0): DNS query timeout in seconds.

Returns:

  • list[str]: MX hostnames in priority order (lowest preference value first). Empty list if the domain has no MX records, does not exist, or the lookup times out or fails.

Examples:

>>> import asyncio
>>> records = asyncio.run(get_mx_records("gmail.com"))
>>> len(records) > 0
True
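
The ordering step can be shown without a resolver. In this sketch, sort_mx is a hypothetical helper that takes (preference, exchange) pairs as plain tuples instead of dnspython answer objects; it sorts by preference and strips the trailing root dot, as the source listing does.

```python
def sort_mx(records: list[tuple[int, str]]) -> list[str]:
    """Order (preference, exchange) pairs by preference; drop the trailing dot."""
    ordered = sorted(records, key=lambda record: record[0])
    return [host.rstrip(".") for _, host in ordered]


# Example input with made-up hostnames; the lower preference value comes first
hosts = sort_mx([(20, "alt.example.com."), (10, "mx.example.com.")])
```
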
Source code in coldreach/verify/dns_check.py
async def get_mx_records(domain: str, timeout: float = 5.0) -> list[str]:
    """Resolve MX records for *domain*, sorted by priority (lowest first).

    Parameters
    ----------
    domain:
        The domain to look up (e.g. ``"stripe.com"``).
    timeout:
        DNS query timeout in seconds.

    Returns
    -------
    list[str]
        MX hostnames in priority order (lowest preference value first).
        Empty list if the domain has no MX records or does not exist.

    Examples
    --------
    >>> import asyncio
    >>> records = asyncio.run(get_mx_records("gmail.com"))
    >>> len(records) > 0
    True
    """
    resolver = dns.asyncresolver.Resolver()
    resolver.lifetime = timeout

    try:
        answers = await resolver.resolve(domain, "MX")
        sorted_records = sorted(
            [(int(rdata.preference), str(rdata.exchange).rstrip(".")) for rdata in answers],
            key=lambda x: x[0],
        )
        return [hostname for _, hostname in sorted_records]

    except dns.resolver.NXDOMAIN:
        logger.debug("DNS NXDOMAIN: domain %r does not exist", domain)
        return []

    except dns.resolver.NoAnswer:
        logger.debug("DNS NoAnswer: no MX records for %r", domain)
        return []

    except dns.exception.Timeout:
        logger.warning("DNS timeout for %r (%.1fs)", domain, timeout)
        return []

    except dns.exception.DNSException as exc:
        logger.warning("DNS error for %r: %s", domain, exc)
        return []

domain_exists async

domain_exists(domain, timeout=5.0)

Return True if domain resolves to at least one A or AAAA record.

Used as a fallback when no MX records are found — some small domains rely on the implicit MX → A record fallback defined in RFC 5321 §5.

Parameters:

  • domain (str, required): Domain name to check.
  • timeout (float, default 5.0): DNS query timeout in seconds.

Source code in coldreach/verify/dns_check.py
async def domain_exists(domain: str, timeout: float = 5.0) -> bool:
    """Return True if *domain* resolves to at least one A or AAAA record.

    Used as a fallback when no MX records are found — some small domains
    rely on the implicit MX → A record fallback defined in RFC 5321 §5.

    Parameters
    ----------
    domain:
        Domain name to check.
    timeout:
        DNS query timeout in seconds.
    """
    resolver = dns.asyncresolver.Resolver()
    resolver.lifetime = timeout

    for record_type in ("A", "AAAA"):
        try:
            await resolver.resolve(domain, record_type)
            return True
        except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
            continue
        except dns.exception.DNSException:
            continue

    return False

check_dns async

check_dns(email, timeout=5.0)

Pipeline checker: verify the email's domain has a valid MX record.

Scoring
  • MX records found: +10 points
  • No MX but A record: WARN, +0 points (RFC fallback, unusual)
  • NXDOMAIN / no A record: FAIL, -30 points (domain does not exist)

Parameters:

  • email (str, required): Full email address string.
  • timeout (float, default 5.0): DNS resolution timeout in seconds.

Returns:

  • CheckResult: PASS / WARN / FAIL as described above. PASS metadata includes mx_records (list of MX hostnames).

Examples:

>>> import asyncio
>>> result = asyncio.run(check_dns("test@gmail.com"))
>>> result.passed
True
>>> "mx_records" in result.metadata
True
Source code in coldreach/verify/dns_check.py
async def check_dns(email: str, timeout: float = 5.0) -> CheckResult:
    """Pipeline checker: verify the email's domain has a valid MX record.

    Scoring
    -------
    - MX records found:        +10 points
    - No MX but A record:      WARN, +0 points (RFC fallback, unusual)
    - NXDOMAIN / no A record:  FAIL, -30 points (domain does not exist)

    Parameters
    ----------
    email:
        Full email address string.
    timeout:
        DNS resolution timeout in seconds.

    Returns
    -------
    CheckResult
        PASS / WARN / FAIL as described above. PASS metadata includes
        ``mx_records`` (list of MX hostnames).

    Examples
    --------
    >>> import asyncio
    >>> result = asyncio.run(check_dns("test@gmail.com"))
    >>> result.passed
    True
    >>> "mx_records" in result.metadata
    True
    """
    if not email or "@" not in email:
        return CheckResult.fail(
            "Cannot extract domain — invalid email format",
            score_delta=-100,
        )

    try:
        domain = email.lower().split("@")[1]
    except IndexError:
        return CheckResult.fail(
            "No domain part found in email address",
            score_delta=-100,
        )

    if not domain or "." not in domain:
        return CheckResult.fail(
            f"Domain {domain!r} is not a valid FQDN",
            score_delta=-30,
            domain=domain,
        )

    mx_records = await get_mx_records(domain, timeout=timeout)

    if mx_records:
        logger.debug("MX records for %r: %s", domain, mx_records)
        return CheckResult.pass_(
            f"Found {len(mx_records)} MX record(s)",
            score_delta=10,
            domain=domain,
            mx_records=mx_records,
        )

    # No MX — try A/AAAA fallback (RFC 5321 §5)
    if await domain_exists(domain, timeout=timeout):
        logger.debug("No MX for %r, but A/AAAA record exists (RFC fallback)", domain)
        return CheckResult.warn(
            f"No MX records for {domain} — mail may use A record fallback (unusual)",
            score_delta=0,
            domain=domain,
            mx_records=[],
        )

    logger.debug("Domain %r does not exist (NXDOMAIN / no A record)", domain)
    return CheckResult.fail(
        f"Domain {domain!r} has no MX records and no A record — undeliverable",
        score_delta=-30,
        domain=domain,
        mx_records=[],
    )

Reacher SMTP

coldreach.verify.reacher

Reacher SMTP verification client.

Reacher (https://reacher.email) is a self-hosted Rust microservice that performs full SMTP verification including:
  • SMTP connection test
  • RCPT TO probe (is the mailbox deliverable?)
  • Catch-all detection
  • MX record lookup (can be skipped if already known)

Requires the Reacher Docker service to be running: docker compose up reacher

API endpoint: POST /v0/check_email
Request: { "to_email": "john@example.com" }
Response: Full JSON with smtp, mx, misc, syntax sections.

This checker is optional — gracefully returns SKIP if Reacher is not running or not configured.
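
The request shape described above can be sketched without any network call. build_reacher_request is a hypothetical helper, not part of the package; it only assembles the endpoint URL and JSON body that would be POSTed to the check_email endpoint.

```python
import json


def build_reacher_request(reacher_url: str, email: str) -> tuple[str, bytes]:
    """Return the endpoint URL and JSON body for a single check_email call."""
    # Strip a trailing slash so the path is not doubled
    url = f"{reacher_url.rstrip('/')}/v0/check_email"
    body = json.dumps({"to_email": email}).encode("utf-8")
    return url, body


url, body = build_reacher_request("http://localhost:8083/", "john@example.com")
```

Normalising the base URL first means "http://localhost:8083" and "http://localhost:8083/" produce the same endpoint.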

check_reacher async

check_reacher(email, *, reacher_url, timeout=15.0)

Verify email via the Reacher SMTP microservice.

Parameters:

  • email (str, required): The email address to verify.
  • reacher_url (str, required): Base URL of the Reacher service, e.g. "http://localhost:8083".
  • timeout (float, default 15.0): HTTP timeout in seconds (SMTP handshakes can be slow).

Returns:

CheckResult:
  • PASS (+20): SMTP accepted, not catch-all
  • FAIL (-20): SMTP rejected or unreachable mailbox
  • WARN (0): deliverable but catch-all domain
  • SKIP: Reacher unreachable, timed out, or returned a non-200 status or invalid JSON

Source code in coldreach/verify/reacher.py
async def check_reacher(
    email: str,
    *,
    reacher_url: str,
    timeout: float = 15.0,
) -> CheckResult:
    """Verify *email* via the Reacher SMTP microservice.

    Parameters
    ----------
    email:
        The email address to verify.
    reacher_url:
        Base URL of the Reacher service, e.g. ``"http://localhost:8083"``.
    timeout:
        HTTP timeout in seconds (SMTP handshakes can be slow).

    Returns
    -------
    CheckResult
        - PASS (+20): SMTP accepted, not catch-all
        - FAIL (-20): SMTP rejected or unreachable mailbox
        - WARN (0): deliverable but catch-all domain
        - SKIP: Reacher service unavailable
    """
    if not email or "@" not in email:
        return CheckResult.fail("Invalid email for Reacher check", score_delta=-10)

    url = f"{reacher_url.rstrip('/')}/v0/check_email"
    payload = {"to_email": email}

    try:
        async with httpx.AsyncClient(timeout=timeout) as client:
            resp = await client.post(url, json=payload)
    except httpx.ConnectError:
        logger.debug("Reacher not reachable at %s — skipping SMTP check", reacher_url)
        return CheckResult.skip("Reacher service not running")
    except httpx.TimeoutException:
        logger.debug("Reacher timed out for %s", email)
        return CheckResult.skip("Reacher request timed out")
    except httpx.RequestError as exc:
        logger.debug("Reacher request error: %s", exc)
        return CheckResult.skip(f"Reacher request failed: {exc or 'connection error'}")

    if resp.status_code != 200:
        return CheckResult.skip(f"Reacher HTTP {resp.status_code}")

    try:
        data = resp.json()
    except Exception:
        return CheckResult.skip("Reacher returned invalid JSON")

    return _parse_reacher_response(email, data)

Catch-all detection

coldreach.verify.catchall

Catch-all domain detection.

A "catch-all" mail server accepts RCPT TO for ANY address at the domain — including completely random ones. This makes SMTP verification useless for individual addresses.

Detection method: Ask Reacher to verify a randomly-generated address at the domain. If it comes back "deliverable", the domain is catch-all.

Fallback (no Reacher): always returns SKIP with the status left unknown, because catch-all behaviour cannot be detected without sending an actual SMTP probe.

Result is cached per domain for the session to avoid redundant probes.
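
The detection method and the per-domain cache can be sketched together. Everything here is illustrative: random_local, is_catch_all_sketch and the stub probe are hypothetical, and the real module sends the probe through Reacher rather than an injected callable.

```python
import asyncio
import secrets
from typing import Awaitable, Callable, Optional

Probe = Callable[[str], Awaitable[Optional[bool]]]
_cache: dict[str, Optional[bool]] = {}


def random_local(length: int = 16) -> str:
    """Hypothetical helper: a local part very unlikely to be a real mailbox."""
    return "probe-" + secrets.token_hex(length // 2)


async def is_catch_all_sketch(domain: str, probe: Probe) -> Optional[bool]:
    """True = catch-all, False = not catch-all, None = unknown. Cached per domain."""
    if domain not in _cache:
        # If a made-up address is "deliverable", the server accepts anything
        _cache[domain] = await probe(f"{random_local()}@{domain}")
    return _cache[domain]


calls: list[str] = []


async def accept_everything(address: str) -> Optional[bool]:
    """Stub probe standing in for a Reacher round trip."""
    calls.append(address)
    return True
```

Calling the function twice for the same domain should trigger the probe only once; the second call is answered from the cache.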

check_catchall async

check_catchall(domain, *, reacher_url=None, timeout=10.0)

Probe the domain to detect catch-all behaviour.

Parameters:

  • domain (str, required): The domain to probe (e.g. "stripe.com").
  • reacher_url (str | None, default None): Base URL of the Reacher microservice. If None, returns SKIP.
  • timeout (float, default 10.0): HTTP timeout for the Reacher request.

Returns:

CheckResult:
  • PASS (score_delta=0): domain is NOT catch-all
  • FAIL (score_delta=-40): domain IS catch-all
  • FAIL (score_delta=-10): the domain argument is empty
  • SKIP: Reacher not configured, or the probe was inconclusive

Source code in coldreach/verify/catchall.py
async def check_catchall(
    domain: str,
    *,
    reacher_url: str | None = None,
    timeout: float = 10.0,
) -> CheckResult:
    """Probe the domain to detect catch-all behaviour.

    Parameters
    ----------
    domain:
        The domain to probe (e.g. ``"stripe.com"``).
    reacher_url:
        Base URL of the Reacher microservice. If None, returns SKIP.
    timeout:
        HTTP timeout for the Reacher request.

    Returns
    -------
    CheckResult
        - PASS (score_delta=0): domain is NOT catch-all
        - FAIL (score_delta=-40): domain IS catch-all
        - SKIP: Reacher not configured, or the probe was inconclusive
    """
    if not domain:
        return CheckResult.fail("Empty domain", score_delta=-10)

    # Return cached result immediately
    if domain in _cache:
        cached = _cache[domain]
        if cached is True:
            return CheckResult.fail(
                f"{domain} is a catch-all domain",
                score_delta=-40,
                is_catch_all=True,
            )
        if cached is False:
            return CheckResult.pass_(score_delta=0)
        return CheckResult.skip("Catch-all status unknown (cached)")

    if not reacher_url:
        return CheckResult.skip("Reacher not configured — catch-all unknown")

    probe_email = f"{_random_local()}@{domain}"
    result = await _probe_via_reacher(probe_email, reacher_url, timeout)
    _cache[domain] = result
    logger.debug("Catch-all probe for %s: %s", domain, result)

    if result is True:
        return CheckResult.fail(
            f"{domain} is a catch-all domain",
            score_delta=-40,
            is_catch_all=True,
        )
    if result is False:
        return CheckResult.pass_(score_delta=0)
    return CheckResult.skip("Catch-all probe inconclusive")

is_catch_all async

is_catch_all(domain, *, reacher_url=None, timeout=10.0)

Return True/False/None for catch-all status.

Convenience wrapper around check_catchall that returns a plain boolean (or None if unknown) rather than a CheckResult.

Source code in coldreach/verify/catchall.py
async def is_catch_all(
    domain: str,
    *,
    reacher_url: str | None = None,
    timeout: float = 10.0,
) -> bool | None:
    """Return True/False/None for catch-all status.

    Convenience wrapper around :func:`check_catchall` that returns a
    plain boolean (or None if unknown) rather than a CheckResult.
    """
    result = await check_catchall(domain, reacher_url=reacher_url, timeout=timeout)
    if result.passed:
        return False
    meta = result.metadata.get("is_catch_all")
    if meta is True:
        return True
    return None

clear_cache

clear_cache()

Clear the in-memory catch-all cache (useful in tests).

Source code in coldreach/verify/catchall.py
def clear_cache() -> None:
    """Clear the in-memory catch-all cache (useful in tests)."""
    _cache.clear()

Holehe platform check

coldreach.verify.holehe

Holehe platform-presence check.

Uses the holehe library (github.com/megadose/holehe) to check whether an email address is registered on 120+ public platforms (GitHub, Discord, Spotify, Slack, etc.).

A positive result on multiple platforms confirms the email is real and the person is active — especially valuable for catch-all domains where SMTP verification is unreliable.

IMPORTANT: This check makes up to 120 HTTP requests. It is SLOW (15-45s). Only enable it explicitly via --holehe or use_holehe=True in FinderConfig.

Score deltas:
  • +15: registered on ≥ 2 platforms
  • +5: registered on exactly 1 platform
  • 0: not found (WARN, not FAIL, since private persons may not be on these platforms)
  • 0: SKIP, holehe not installed
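
The score deltas above map onto a small function. holehe_delta is a hypothetical helper for illustration, and the default of 2 for min_platforms is an assumption inferred from the "≥ 2 platforms" threshold; the real default is the module constant _MIN_PLATFORMS.

```python
def holehe_delta(count: int, min_platforms: int = 2) -> tuple[str, int]:
    """Map a registration count to (status, score_delta)."""
    if count >= min_platforms:
        return ("pass", 15)
    if count >= 1:
        return ("pass", 5)
    # Zero hits is a warning, not a failure
    return ("warn", 0)
```

Note that raising min_platforms widens the +5 band: with min_platforms=3, two registrations earn +5 rather than +15.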

check_holehe async

check_holehe(
    email,
    *,
    timeout=10.0,
    min_platforms=_MIN_PLATFORMS,
    concurrency=_CONCURRENCY,
)

Check if email is registered on public platforms via holehe.

Parameters:

  • email (str, required): Email address to check.
  • timeout (float, default 10.0): Per-request HTTP timeout for holehe module calls.
  • min_platforms (int, default _MIN_PLATFORMS): Registrations needed to award the full +15 score delta.
  • concurrency (int, default _CONCURRENCY): Maximum simultaneous holehe module requests.

Returns:

CheckResult:
  • PASS (+15) if registered on ≥ min_platforms platforms.
  • PASS (+5) if registered on at least 1 platform but fewer than min_platforms.
  • WARN (0) if 0 registrations found.
  • SKIP if holehe is not installed.
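
The concurrency parameter bounds how many module requests are in flight at once. The sketch below shows that pattern with asyncio.Semaphore under simplifying assumptions: run_bounded and fake_module are hypothetical, and a short sleep replaces the HTTP call each holehe module would make.

```python
import asyncio


async def run_bounded(jobs, limit: int) -> int:
    """Run coroutine functions with at most `limit` in flight; return the peak seen."""
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def guarded(job) -> None:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                await job()
            except Exception:
                pass  # one failing module must not abort the whole batch
            finally:
                in_flight -= 1

    await asyncio.gather(*(guarded(job) for job in jobs))
    return peak


async def fake_module() -> None:
    """Stand-in for one holehe module call."""
    await asyncio.sleep(0.01)
```

Swallowing per-job exceptions inside the guarded wrapper keeps asyncio.gather from propagating the first failure, which matches the way the source listing logs module errors and carries on.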

Source code in coldreach/verify/holehe.py
async def check_holehe(
    email: str,
    *,
    timeout: float = 10.0,
    min_platforms: int = _MIN_PLATFORMS,
    concurrency: int = _CONCURRENCY,
) -> CheckResult:
    """Check if *email* is registered on public platforms via holehe.

    Parameters
    ----------
    email:
        Email address to check.
    timeout:
        Per-request HTTP timeout for holehe module calls.
    min_platforms:
        Registrations needed to award the full +15 score delta.
    concurrency:
        Maximum simultaneous holehe module requests.

    Returns
    -------
    CheckResult
        PASS (+15) if registered on ≥ min_platforms platforms.
        PASS (+5) if registered on exactly 1 platform.
        WARN (0) if 0 registrations found.
        SKIP if holehe is not installed.
    """
    try:
        from holehe.core import get_functions, import_submodules
    except ImportError:
        logger.debug("holehe not installed — skipping platform check")
        return CheckResult.skip("holehe not installed")

    modules = import_submodules("holehe.modules")
    websites = get_functions(modules)

    out: list[dict[str, object]] = []
    sem = asyncio.Semaphore(concurrency)

    async def _run(module: Any, client: httpx.AsyncClient) -> None:
        async with sem:
            try:
                await module(email, client, out)
            except Exception as exc:
                logger.debug(
                    "holehe module %s error: %s",
                    getattr(module, "__name__", "?"),
                    exc,
                )

    async with httpx.AsyncClient(timeout=timeout, follow_redirects=True) as client:
        await asyncio.gather(*[_run(m, client) for m in websites])

    registered = [r for r in out if r.get("exists")]
    count = len(registered)
    names = [str(r.get("name") or r.get("domain") or "?") for r in registered]

    logger.debug(
        "holehe: %d/%d platforms matched for %s: %s",
        count,
        len(websites),
        email,
        names[:8],
    )

    if count >= min_platforms:
        return CheckResult.pass_(
            f"Registered on {count} platform(s): {', '.join(names[:5])}",
            score_delta=15,
            platforms=names,
            platform_count=count,
        )
    if count >= 1:
        return CheckResult.pass_(
            f"Registered on {count} platform(s): {', '.join(names[:5])}",
            score_delta=5,
            platforms=names,
            platform_count=count,
        )
    return CheckResult.warn(
        "Not found on any checked platforms",
        score_delta=0,
        platforms=[],
        platform_count=0,
    )