Data Models
Core Pydantic models and enums that flow through the entire ColdReach pipeline, from source discovery through verification and into storage.
Core models
coldreach.core.models
ColdReach core Pydantic models.
These are the primary data structures that flow through the entire pipeline — from source discovery through verification and into storage/output.
Design rules
- Models are mutable (frozen=False, for update ergonomics), but validation runs on assignment.
- All email strings are normalized to lowercase on input.
- Confidence is always in range [0, 100].
- Timestamps are always naive datetimes that hold UTC values (no tzinfo is embedded, which keeps SQLite storage simple).
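The confidence-range and timestamp rules can be illustrated with plain standard-library helpers. This is a minimal sketch, not the ColdReach implementation: `to_utc_naive` and `check_confidence` are hypothetical names, and the real models presumably enforce these rules through Pydantic validators.

```python
from datetime import datetime, timedelta, timezone


def to_utc_naive(dt: datetime) -> datetime:
    """Return dt as a naive datetime holding a UTC value.

    Aware datetimes are converted to UTC and stripped of tzinfo.
    Naive datetimes are assumed to be UTC already and returned unchanged.
    """
    if dt.tzinfo is None:
        return dt
    return dt.astimezone(timezone.utc).replace(tzinfo=None)


def check_confidence(value: int) -> int:
    """Reject confidence scores outside the closed range [0, 100]."""
    if not 0 <= value <= 100:
        raise ValueError(f"confidence must be in [0, 100], got {value}")
    return value


# 12:00 at UTC+2 is 10:00 UTC; the stored value carries no tzinfo.
local = datetime(2024, 1, 1, 12, 0, tzinfo=timezone(timedelta(hours=2)))
stored = to_utc_naive(local)
```

Treating naive input as already-UTC is an assumption of this sketch; a stricter variant could reject naive datetimes outright.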
VerificationStatus
Bases: StrEnum
Result of the full verification pipeline for one email address.
| Member | Description |
|---|---|
| VALID | SMTP accepted the address and it's not a catch-all. |
| INVALID | Definitively invalid: bad syntax, NXDOMAIN, or SMTP 550. |
| RISKY | Passes basic checks but has low-confidence signals. |
| UNKNOWN | Cannot determine: catch-all domain or SMTP unreachable. |
| CATCH_ALL | Domain accepts all RCPT TO addresses, so it is unverifiable via SMTP. |
| DISPOSABLE | Known throwaway / temporary email service. |
| UNDELIVERABLE | No MX records, so the domain cannot receive email. |
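Because the enum derives from StrEnum, its members compare equal to plain strings, which helps when statuses are read back from SQLite or a CSV file. The sketch below is illustrative only: the member values (lowercase names) are assumed, since the reference lists only the member names.

```python
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # older interpreters: equivalent for the comparisons shown here
    from enum import Enum

    class StrEnum(str, Enum):
        pass


class VerificationStatus(StrEnum):
    # Values are assumptions; only the member names come from the reference.
    VALID = "valid"
    INVALID = "invalid"
    RISKY = "risky"
    UNKNOWN = "unknown"
    CATCH_ALL = "catch_all"
    DISPOSABLE = "disposable"
    UNDELIVERABLE = "undeliverable"


stored = "catch_all"                  # e.g. a value read from a database column
status = VerificationStatus(stored)   # look up the member by its string value
same = status == stored               # str-based members compare equal to str
```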
EmailSource
Bases: StrEnum
Where an email address was discovered.
SourceRecord
Bases: BaseModel
A single discovery event: one source that found one email.
EmailRecord
Bases: BaseModel
A single email address with its verification state and discovery sources.
Attributes:
| Name | Type | Description |
|---|---|---|
| email | str | The email address (normalized to lowercase on input). |
| confidence | int | Integer in [0, 100]. Higher = more likely to be valid and deliverable. |
| status | VerificationStatus | Verification status from the pipeline. |
| sources | list[SourceRecord] | All sources that discovered this address (de-duplicated upstream). |
| is_catch_all_domain | bool | True if the email's domain accepts all RCPT TO probes, in which case SMTP verification is meaningless. |
| mx_records | list[str] | MX hostnames for the domain, sorted by priority. |
| holehe_platforms | list[str] | Platform names where this email was confirmed registered (via Holehe). |
| checked_at | datetime | When verification was last run. |
normalise_email classmethod
Lowercase and strip whitespace.
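The normalization step amounts to two string operations. The standalone function below is a stand-in for the classmethod (which is presumably wired in as a Pydantic field validator; that wiring is not shown in this reference and is an assumption).

```python
def normalise_email(value: str) -> str:
    """Strip surrounding whitespace, then lowercase the whole address."""
    return value.strip().lower()


cleaned = normalise_email("  Jane.Doe@Example.COM \n")
```

Here `cleaned` is `"jane.doe@example.com"`, so two differently cased spellings of the same address end up as one key.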
confidence_label
to_dict
Flat dict suitable for CSV export.
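A flat dict matters for CSV because every cell must be a scalar, while several EmailRecord fields are lists. The sketch below shows one way such flattening could work; `flatten_for_csv`, the `";"` separator, and the sample values are assumptions, and the actual output keys of `to_dict` are not documented here.

```python
import csv
import io


def flatten_for_csv(record: dict) -> dict:
    """Join list values into one string so every CSV cell is a scalar."""
    flat = {}
    for key, value in record.items():
        if isinstance(value, list):
            # The ";" separator is an assumption of this sketch.
            flat[key] = ";".join(str(item) for item in value)
        else:
            flat[key] = value
    return flat


# Sample data using field names from the EmailRecord attribute table.
record = {
    "email": "jane@example.com",
    "confidence": 85,
    "status": "valid",
    "is_catch_all_domain": False,
    "mx_records": ["mx1.example.com", "mx2.example.com"],
}

row = flatten_for_csv(record)

# Write one header line and one data line to an in-memory CSV.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
```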
DomainResult
Bases: BaseModel
All email addresses discovered for one domain.
Attributes:
| Name | Type | Description |
|---|---|---|
| domain | str | The domain that was scanned (e.g. …). |
| company_name | str \| None | Human-readable company name if known. |
| emails | list[EmailRecord] | Discovered and verified email addresses. |
| is_catch_all | bool | True if the domain's mail server accepts all RCPT TO probes. |
| mx_records | list[str] | MX records for the domain. |
| crawled_at | datetime | Timestamp when the scan completed. |
sorted_emails
Return emails sorted by confidence descending.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| min_confidence | int | Exclude emails below this confidence threshold. | 0 |
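The documented behaviour (filter by threshold, then sort descending) can be sketched with a small stand-in type. `Email` below is a hypothetical simplification of EmailRecord, and the function is written as a free function rather than a method for brevity.

```python
from dataclasses import dataclass


@dataclass
class Email:
    """Stand-in for EmailRecord with only the fields this sketch needs."""
    email: str
    confidence: int


def sorted_emails(emails: list, min_confidence: int = 0) -> list:
    """Drop records below min_confidence, then sort by confidence descending."""
    kept = [e for e in emails if e.confidence >= min_confidence]
    return sorted(kept, key=lambda e: e.confidence, reverse=True)


emails = [
    Email("info@example.com", 40),
    Email("jane@example.com", 90),
    Email("sales@example.com", 65),
]
top = sorted_emails(emails, min_confidence=50)
```

With the default threshold of 0 nothing is filtered out, since confidence never drops below 0.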
add_email
Add or merge an email record, avoiding exact duplicates.
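The reference does not spell out the merge policy, so the sketch below assumes one plausible choice: records are keyed by address, the higher confidence wins, and source lists are unioned. The `Email` stand-in, the string-valued sources, and the source names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    """Stand-in for EmailRecord; sources are plain strings in this sketch."""
    email: str
    confidence: int
    sources: list = field(default_factory=list)


def add_email(emails: list, new: Email) -> None:
    """Append new, or merge it into the existing record with the same address."""
    for existing in emails:
        if existing.email == new.email:
            # Assumed merge policy: keep the higher confidence, union the sources.
            existing.confidence = max(existing.confidence, new.confidence)
            for source in new.sources:
                if source not in existing.sources:
                    existing.sources.append(source)
            return
    emails.append(new)


found = []
add_email(found, Email("jane@example.com", 60, ["website"]))
add_email(found, Email("jane@example.com", 80, ["github", "website"]))
add_email(found, Email("bob@example.com", 50, ["whois"]))
```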
Exceptions
coldreach.exceptions
ColdReach custom exceptions.
Hierarchy
- ColdReachError
  - ConfigError: bad or missing configuration
  - SourceError: data source (scraper / API) failed
    - RateLimitError: upstream rate limit hit
  - VerificationError: error during email verification
  - ServiceUnavailableError: a Docker service is not reachable
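The hierarchy lets callers handle errors at whatever granularity they need, as long as the most specific class is checked first. The classes below mirror the documented tree as empty stand-ins; `classify` and its return values are a hypothetical handling policy, not part of ColdReach.

```python
class ColdReachError(Exception):
    """Base exception for all ColdReach errors."""


class ConfigError(ColdReachError):
    pass


class SourceError(ColdReachError):
    pass


class RateLimitError(SourceError):
    pass


class VerificationError(ColdReachError):
    pass


class ServiceUnavailableError(ColdReachError):
    pass


def classify(exc: Exception) -> str:
    """Map an exception to a coarse handling decision (hypothetical policy)."""
    # Check the most specific class first: RateLimitError is also a SourceError.
    if isinstance(exc, RateLimitError):
        return "retry-later"
    if isinstance(exc, SourceError):
        return "skip-source"
    if isinstance(exc, ColdReachError):
        return "abort"
    return "unexpected"
```

The same ordering applies to `except` clauses: an `except SourceError` placed before `except RateLimitError` would swallow rate-limit errors.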
ColdReachError
Bases: Exception
Base exception for all ColdReach errors.
ConfigError
Bases: ColdReachError
Raised when configuration is invalid or missing.
SourceError
Bases: ColdReachError
Raised when a data source fails to return results.
RateLimitError
Bases: SourceError
Raised when an upstream service rate-limits the request.
Attributes:
| Name | Type | Description |
|---|---|---|
| service | | Human-readable service name (e.g. …). |
| retry_after | | Suggested number of seconds to wait before retrying, if provided by the upstream service. |
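The `retry_after` attribute is what makes a polite retry loop possible. The sketch below assumes a constructor of the form `RateLimitError(service, retry_after=None)`; that signature, the `call_with_retry` helper, and its parameters are hypothetical, and the class is a stand-in rather than an import from `coldreach.exceptions`.

```python
import time
from typing import Optional


class RateLimitError(Exception):
    """Stand-in for the documented exception; the constructor is assumed."""

    def __init__(self, service: str, retry_after: Optional[float] = None) -> None:
        super().__init__(f"{service} rate limit hit")
        self.service = service
        self.retry_after = retry_after


def call_with_retry(fetch, max_attempts=3, default_wait=1.0, sleep=time.sleep):
    """Call fetch(), pausing after each rate limit and re-raising on the last attempt.

    The pause is exc.retry_after when the upstream supplied one,
    otherwise default_wait.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise
            wait = exc.retry_after if exc.retry_after is not None else default_wait
            sleep(wait)
```

Passing `sleep` as a parameter keeps the helper easy to test: a list's `append` method can stand in for `time.sleep` to record the waits without actually pausing.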
VerificationError
Bases: ColdReachError
Raised when the verification pipeline encounters an unrecoverable error.
ServiceUnavailableError
Bases: ColdReachError
Raised when a required Docker service cannot be reached.
Attributes:
| Name | Type | Description |
|---|---|---|
| service | | Short service name used in docker-compose.yml (e.g. …). |
| url | | The URL that was attempted. |