Pattern Generation API
Generates likely email addresses from a person's name and a company domain. When known emails are available, the domain's format is inferred first so only targeted guesses are produced.
Pattern generation
coldreach.generate.patterns
Email pattern generator — produces candidate addresses from a person's name + domain.
Given a full name like "John Smith" and domain "acme.com", generates the 12 most common professional email formats used by B2B companies:
- john@acme.com
- john.smith@acme.com
- jsmith@acme.com
- j.smith@acme.com
- smithj@acme.com
- smith.j@acme.com
- johnsmith@acme.com
- smith@acme.com
- johns@acme.com
- john-smith@acme.com
- j-smith@acme.com
- js@acme.com
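The twelve formats above can be sketched as a simple expansion of a first and last name. This is an illustrative sketch, not the library's implementation: `expand_formats` is a hypothetical helper, and only the labels `first`, `first.last`, and `flast` appear on this page; the remaining labels are assumed for illustration.

```python
def expand_formats(first: str, last: str, domain: str) -> list[tuple[str, str]]:
    """Return (format_name, email) pairs in the order listed above.

    Assumes `first` and `last` are non-empty, already-normalised tokens.
    """
    fi, li = first[0], last[0]  # first and last initials
    local_parts = [
        ("first", first),
        ("first.last", f"{first}.{last}"),
        ("flast", f"{fi}{last}"),
        # Labels below are hypothetical; the page does not name them.
        ("f.last", f"{fi}.{last}"),
        ("lastf", f"{last}{fi}"),
        ("last.f", f"{last}.{fi}"),
        ("firstlast", f"{first}{last}"),
        ("last", last),
        ("firstl", f"{first}{li}"),
        ("first-last", f"{first}-{last}"),
        ("f-last", f"{fi}-{last}"),
        ("fl", f"{fi}{li}"),
    ]
    return [(name, f"{local}@{domain}".lower()) for name, local in local_parts]


pairs = expand_formats("john", "smith", "acme.com")
print(pairs[1])  # ('first.last', 'john.smith@acme.com')
```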
Names are normalised: accents stripped, hyphenated names split, suffixes (Jr, Sr, III, etc.) removed before pattern expansion.
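A minimal sketch of the normalisation steps described above, using only the standard library. `normalise_name` and the `SUFFIXES` set are hypothetical; the library's actual suffix list and its handling of hyphenated names may differ.

```python
import re
import unicodedata

# Assumed suffix list; the page only names "Jr, Sr, III, etc."
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def normalise_name(full_name: str) -> list[str]:
    """Return lowercase ASCII name tokens with suffixes removed."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", full_name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat hyphens like whitespace so "Müller-Schmidt" becomes two tokens.
    tokens = re.split(r"[\s\-]+", unaccented.lower())
    # Strip anything that is not a-z (e.g. the dot in "Jr.").
    # Letters with no ASCII decomposition, such as "ø", are dropped here.
    cleaned = [re.sub(r"[^a-z]", "", token) for token in tokens]
    return [token for token in cleaned if token and token not in SUFFIXES]


print(normalise_name("José Müller-Schmidt Jr."))  # ['jose', 'muller', 'schmidt']
```

An empty input yields an empty token list, which matches the documented behaviour of returning no candidates when the name cannot be parsed.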
Usage
from coldreach.generate.patterns import generate_patterns
candidates = generate_patterns("John Smith", "acme.com")
# → [EmailPattern(email="john@acme.com", format_name="first"), ...]
EmailPattern dataclass
A single generated email candidate.
Attributes:
| Name | Type | Description |
|---|---|---|
| email | str | The generated email address (already lowercased). |
| format_name | str | Short identifier for the pattern. |
generate_patterns
Generate candidate email addresses for full_name at domain.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| full_name | str | The person's full name. | required |
| domain | str | The company domain. | required |
Returns:
| Type | Description |
|---|---|
| list[EmailPattern] | Deduplicated list of candidates, ordered from most common to least common. Empty list if the name cannot be parsed (e.g. empty string). |
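Deduplication matters because short names can make two formats collapse into the same address: for a first name of "J", both the first.last and initial-dot-last formats give j.smith. The following sketch shows one way to drop such collisions while keeping the most-common-first order. `dedupe_preserving_order` is a hypothetical helper, not part of the documented API.

```python
def dedupe_preserving_order(emails: list[str]) -> list[str]:
    """Drop repeated addresses, keeping the first occurrence of each."""
    seen: set[str] = set()
    unique: list[str] = []
    for email in emails:
        key = email.lower()  # addresses are compared case-insensitively
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


print(dedupe_preserving_order(["j.smith@acme.com", "jsmith@acme.com", "j.smith@acme.com"]))
# ['j.smith@acme.com', 'jsmith@acme.com']
```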
Examples:
>>> patterns = generate_patterns("John Smith", "acme.com")
>>> [p.format_name for p in patterns[:3]]
['first', 'first.last', 'flast']
generate_role_emails
Generate common role-based email candidates for domain.
Returns candidates like info@domain.com, sales@domain.com.
These are low-confidence guesses — always verify before using.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| domain | str | The company domain. | required |
Returns:
| Type | Description |
|---|---|
| list[EmailPattern] | Role email candidates with … |
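Role-based candidates can be sketched as a fixed list of prefixes joined to the domain. In this sketch, `role_candidates` is a hypothetical stand-in; `info` and `sales` come from the description above, while the other prefixes are assumptions and the library's real list is not shown on this page.

```python
# "info" and "sales" are named in the docs; the rest are assumed examples.
ROLE_PREFIXES = ("info", "sales", "support", "contact")


def role_candidates(domain: str) -> list[str]:
    """Build low-confidence role addresses for a domain."""
    domain = domain.strip().lower()
    return [f"{prefix}@{domain}" for prefix in ROLE_PREFIXES]


print(role_candidates("acme.com")[:2])  # ['info@acme.com', 'sales@acme.com']
```

None of these addresses is guaranteed to exist, so each one should still go through verification before use.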
most_likely_format
Infer the most common email format from a list of known addresses.
Useful when you already have one confirmed email at a domain and want to generate candidates for other people using the same format.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| known_emails | list[str] | List of confirmed email addresses at the domain. | required |
| domain | str | The domain to analyse. | required |
Returns:
| Type | Description |
|---|---|
| str \| None | The … |
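One way to infer a dominant format is to label each known address by the shape of its local part and take a majority vote. The sketch below is an assumption-heavy illustration, not the library's algorithm: `classify_local_part` and `most_common_format` are hypothetical, the labels other than `first.last` are invented, and the classifier only handles dotted local parts.

```python
from collections import Counter
from typing import Optional


def classify_local_part(local: str) -> Optional[str]:
    """Guess a format label from the shape of a local part alone."""
    left, sep, right = local.partition(".")
    if sep and left.isalpha() and right.isalpha():
        if len(left) == 1:
            return "f.last"  # hypothetical label
        if len(right) == 1:
            return "last.f"  # hypothetical label
        # Cannot distinguish first.last from last.first without the name.
        return "first.last"
    # No separator: "jsmith" and "john" look alike without knowing the person.
    return None


def most_common_format(known_emails: list[str], domain: str) -> Optional[str]:
    """Return the label seen most often among addresses at `domain`."""
    labels = []
    for email in known_emails:
        local, _, host = email.lower().partition("@")
        if host != domain.lower():
            continue  # ignore addresses at other domains
        label = classify_local_part(local)
        if label is not None:
            labels.append(label)
    if not labels:
        return None
    return Counter(labels).most_common(1)[0][0]


print(most_common_format(["jane.doe@acme.com", "bob.lee@acme.com", "j.roe@acme.com"], "acme.com"))
# first.last
```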
Format learner
coldreach.generate.learner
Domain email format learner.
Infers a company's email format from confirmed addresses at that domain, then generates targeted candidates for a specific person — only the format(s) that match the domain's known pattern.
This avoids the shotgun approach of generating all 12 variants and running each through SMTP verification (expensive and likely to trigger rate limits).
Confidence tiers:
- Known format match → confidence_hint = 10 (format confirmed from real emails)
- Blind guess → confidence_hint = 5 (no known emails, guessing top-3 formats)
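The two tiers amount to a single branch on whether a format was learned. A minimal sketch, assuming a hypothetical `confidence_hint` helper and constant names; only the values 10 and 5 come from the text above.

```python
from typing import Optional

KNOWN_FORMAT_CONFIDENCE = 10  # format confirmed from real emails
BLIND_GUESS_CONFIDENCE = 5    # no known emails, guessing common formats


def confidence_hint(learned_format: Optional[str]) -> int:
    """Map 'was a format learned?' onto the two documented tiers."""
    if learned_format is not None:
        return KNOWN_FORMAT_CONFIDENCE
    return BLIND_GUESS_CONFIDENCE


print(confidence_hint("first.last"), confidence_hint(None))  # 10 5
```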
Example
from coldreach.generate.learner import targeted_patterns
# Domain uses "first.last" format (inferred from jane.doe@acme.com)
patterns = targeted_patterns("John Smith", "acme.com", ["jane.doe@acme.com"])
# → [EmailPattern("john.smith@acme.com", "first.last")]
# Domain format unknown — return top-3 guesses
patterns = targeted_patterns("John Smith", "acme.com", [])
# → [EmailPattern("john.smith@acme.com", "first.last"),
# EmailPattern("jsmith@acme.com", "flast"),
# EmailPattern("john@acme.com", "first")]
learn_format
Return the most likely email format_name for domain.
Analyses the local parts of known_emails and returns the format_name
(e.g. "first.last", "flast") that best describes them.
Returns None if the format cannot be determined (too few emails,
or local parts are too ambiguous like role addresses info@, hr@).
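Role addresses carry no information about how personal addresses are built, so a learner would plausibly filter them out before analysing local parts. The sketch below illustrates that filtering step only; `personal_local_parts` and the `ROLE_LOCAL_PARTS` set are hypothetical, with `info` and `hr` taken from the description above and the rest assumed.

```python
# "info" and "hr" are named in the docs; the others are assumed examples.
ROLE_LOCAL_PARTS = {"info", "hr", "sales", "support", "admin", "contact"}


def personal_local_parts(known_emails: list[str], domain: str) -> list[str]:
    """Return local parts at `domain` that do not look like role addresses."""
    parts = []
    for email in known_emails:
        local, sep, host = email.strip().lower().partition("@")
        if not sep or host != domain.lower():
            continue  # malformed address or a different domain
        if local in ROLE_LOCAL_PARTS:
            continue  # role mailbox, says nothing about the naming format
        parts.append(local)
    return parts


print(personal_local_parts(["info@acme.com", "Jane.Doe@acme.com", "hr@acme.com"], "acme.com"))
# ['jane.doe']
```

If this list comes back empty, there is nothing to learn from, which corresponds to the documented None result.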
targeted_patterns
Generate targeted email candidates for full_name at domain.
When a domain format can be inferred from known_emails, only patterns matching that format (plus close companions) are returned.
When the format is unknown, the max_fallback most common B2B formats are returned.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| full_name | str | Person's full name. | required |
| domain | str | Company domain. | required |
| known_emails | list[str] | Confirmed email addresses at domain (used to infer format). | required |
| max_fallback | int | Number of fallback formats to try when domain format is unknown. | 3 |
Returns:
| Type | Description |
|---|---|
| list[EmailPattern] | Targeted candidates, deduplicated, ordered by confidence. Empty if full_name cannot be parsed. |
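The selection logic can be sketched as a choice between the learned format and a capped fallback list. This is a simplified illustration under stated assumptions: `select_targeted` is hypothetical, candidates are plain (email, format_name) tuples rather than EmailPattern objects, the fallback order is copied from the module example above, and the "close companions" behaviour is not modelled.

```python
from typing import Optional

# Fallback order as shown in the module example: first.last, flast, first.
FALLBACK_ORDER = ("first.last", "flast", "first")


def select_targeted(
    candidates: list[tuple[str, str]],
    learned_format: Optional[str],
    max_fallback: int = 3,
) -> list[tuple[str, str]]:
    """Pick (email, format_name) pairs for a learned format, or fall back."""
    by_format = {name: email for email, name in candidates}
    if learned_format is not None:
        wanted = [learned_format]
    else:
        wanted = list(FALLBACK_ORDER[:max_fallback])
    # Skip any wanted format that was not generated for this name.
    return [(by_format[name], name) for name in wanted if name in by_format]


candidates = [
    ("john@acme.com", "first"),
    ("john.smith@acme.com", "first.last"),
    ("jsmith@acme.com", "flast"),
]
print(select_targeted(candidates, "first.last"))
# [('john.smith@acme.com', 'first.last')]
```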