Build a Daily New-Business Lead List with the API
Every business registered in Florida becomes public record once the state files it. With the Sunbiz Daily API you can pull recently registered businesses — narrowed to your city, entity type, or ZIP — and turn them into a lead list that refreshes itself every morning. The key is to pull a rolling window rather than a single day, because Florida releases filings over the days after they're filed.
Which endpoint?
Corporate filings (corporations and LLCs) live at /api/v2/filings/. The
period parameter gives you a rolling window without computing dates yourself —
use 7d for a daily feed, or 14d/30d to cast a wider net.
Add city, filing_type, zip, status, or
county to focus on the leads you actually want.
period only accepts 7d, 14d, and 30d (plus
yesterday and all) — any other value silently falls back to 7 days, so
for a window wider than 30 days use an explicit start_date instead.
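One way to guard against the silent 7-day fallback is to build the window parameters in one place. The sketch below is an illustration, not part of the API: `window_params` is a hypothetical helper, and it assumes `start_date` accepts an ISO 8601 date (YYYY-MM-DD), which the text above does not confirm.

```python
from datetime import date, timedelta

# Values the article lists as accepted by `period`.
VALID_PERIODS = {"yesterday", "7d", "14d", "30d", "all"}


def window_params(days, today=None):
    """Return query params for a rolling window of `days` days.

    Uses `period` when the API supports that width directly and an
    explicit `start_date` otherwise, so an unsupported width is never
    sent as a `period` value that would fall back to 7 days.
    """
    period = f"{days}d"
    if period in VALID_PERIODS:
        return {"period": period}
    today = today or date.today()
    # Assumption: start_date takes an ISO 8601 date (YYYY-MM-DD).
    return {"start_date": (today - timedelta(days=days)).isoformat()}
```

Merge the returned dict into the rest of your query parameters, for example `{**window_params(90), "city": "miami"}`.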
curl -H "X-API-Key: sb_your_key_here" \
"https://sunbizdaily.com/api/v2/filings/?period=7d&city=miami&filing_type=FLAL,DOMP&per_page=100"
filing_type accepts comma-separated codes. For a new-business lead list the two that
matter are FLAL (Florida LLCs) and DOMP (Florida profit corporations) —
together they're the overwhelming majority of new registrations. Other valid corporate codes:
DOMNP (domestic non-profit), FORP (foreign profit), FORL
(foreign LLC), FORNP (foreign non-profit), FORLP (foreign registered LLP),
DOSLP (domestic limited partnership), and TRUST.
If you pass a filing_type the API doesn't recognize (a typo, or a real-but-unlisted code), it's
quietly dropped. A mixed list keeps its valid codes (FLAL,BADCODE behaves like
FLAL), but if nothing valid is left you get an empty result set with a
normal 200 response (no error, just total: 0). Use the codes above
verbatim and sanity-check your total: an unexpected zero usually means a mistyped
code.
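Because an unknown code fails silently, it can help to check codes on the client before sending them. This is a minimal sketch: the set of codes is copied from the list above, while `check_filing_types` and `filing_type_param` are hypothetical helper names, and uppercasing the input is a convenience choice rather than documented API behavior.

```python
# Corporate filing_type codes listed in this article.
KNOWN_FILING_TYPES = {
    "FLAL", "DOMP", "DOMNP", "FORP", "FORL",
    "FORNP", "FORLP", "DOSLP", "TRUST",
}


def check_filing_types(codes):
    """Split a comma-separated string into (valid, unknown) code lists."""
    parts = [c.strip().upper() for c in codes.split(",") if c.strip()]
    valid = [c for c in parts if c in KNOWN_FILING_TYPES]
    unknown = [c for c in parts if c not in KNOWN_FILING_TYPES]
    return valid, unknown


def filing_type_param(codes):
    """Return a clean filing_type value, or raise if any code is unknown."""
    valid, unknown = check_filing_types(codes)
    if unknown:
        raise ValueError(f"unrecognized filing_type code(s): {', '.join(unknown)}")
    return ",".join(valid)
```

Raising early turns a confusing `total: 0` into an explicit error at the point where the typo was made.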
How fresh is the data — and why pull a window?
The date the API filters and sorts on, file_date, is the date Florida filed the
document — not the date it became available through the API. Those differ: records become
available roughly six days after their file_date, and they keep
arriving for a week or two. So the most recent few days of any window are incomplete and revise upward — run the
same query a week later and yesterday's count will have grown several-fold.
That's why a single-day pull undercounts badly, and why the right pattern is a rolling
window plus deduplication: pull period=7d (or wider) every morning, and skip
any corporation_number you've already recorded. New rows that appeared since your last
run are your fresh leads — including ones the state released late.
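The deduplication step can be sketched as follows. It keeps the set of seen `corporation_number` values in a JSON file between runs; the file-based store and the helper names (`load_seen`, `save_seen`, `fresh_leads`) are assumptions for illustration, and a database table would serve the same purpose.

```python
import json
from pathlib import Path


def load_seen(path):
    """Load the set of corporation numbers recorded by earlier runs."""
    p = Path(path)
    return set(json.loads(p.read_text())) if p.exists() else set()


def save_seen(path, seen):
    """Persist the seen set so the next run can skip these records."""
    Path(path).write_text(json.dumps(sorted(seen)))


def fresh_leads(rows, seen):
    """Return rows not seen before, adding their numbers to `seen`."""
    fresh = []
    for row in rows:
        number = row["corporation_number"]
        if number not in seen:
            seen.add(number)
            fresh.append(row)
    return fresh
```

Each morning, load the set, pass the rolling-window rows through `fresh_leads`, and save the set again. Rows the state released late appear as fresh on the day they first show up, regardless of their `file_date`.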
- Statutory conversions (e.g. an LLP converting to an LLC) get a new record but keep the original entity's file_date, so a business "created" this year can carry a file_date from decades ago and won't show up in a recent window. These are a small, unavoidable undercount for date-range queries.
- The window never runs into the future. The upper bound is always clamped to today, so future-effective filings stay hidden until their date arrives, and the rare record with no file_date never appears in any list (look those up by name or number).
Page through the results
Every list response is an envelope: a filings array plus a pagination
object with total, total_capped, page, per_page,
total_pages, and max_pages. Request up to 100 results per page and walk the
pages until you reach total_pages.
Pagination is capped at max_pages = 100,
and total_pages is capped to match, so a while page <= total_pages loop
terminates correctly, but only the first 10,000 records (100 pages × 100) are reachable. For a broad
window, pagination.total itself saturates at 10000 and the response sets
total_capped: true, so check total_capped (not total) to tell
when a window overflows the ceiling. Request a page past the cap and you get an empty
filings array, with no error and no duplicates. If a window is capped, split it by
city, zip, or filing_type to bring each query under 10,000.
import time

import requests

API_KEY = "sb_your_key_here"
HOST = "https://sunbizdaily.com"
BASE = f"{HOST}/api/v2/filings/"
HEADERS = {"X-API-Key": API_KEY}


def get_page(url, params):
    """Fetch one page. A query too broad to run inline is offloaded to a
    background job (HTTP 202); poll it until it's done. Returns (data, url).
    The url switches to the job's poll URL once offloaded, so the caller keeps
    paging there with the same envelope."""
    resp = requests.get(url, headers=HEADERS, params=params, timeout=30)
    if resp.status_code == 202:
        url = HOST + resp.json()["poll_url"]
        while True:
            time.sleep(2)
            poll = requests.get(url, headers=HEADERS, params=params, timeout=30)
            poll.raise_for_status()
            job = poll.json()
            if job["status"] == "failed":
                raise RuntimeError(job.get("error", "search job failed"))
            if job["status"] == "done":
                return job, url
    resp.raise_for_status()
    return resp.json(), url


def fetch_new_filings(city=None, filing_type=None):
    params = {"period": "7d", "per_page": 100, "sort": "file_date", "order": "desc"}
    if city:
        params["city"] = city
    if filing_type:
        params["filing_type"] = filing_type
    url, page = BASE, 1
    while True:
        params["page"] = page
        data, url = get_page(url, params)  # url sticks to the job once offloaded
        yield from data["filings"]
        pg = data["pagination"]
        # total_pages is capped at max_pages, so this terminates at the ceiling.
        if page >= min(pg["total_pages"], pg["max_pages"]):
            break
        page += 1


leads = list(fetch_new_filings(city="Miami", filing_type="FLAL,DOMP"))
print(f"{len(leads)} filings in the rolling window")

Most queries are answered inline (HTTP 200). But a query too large to run inline (say, an unfiltered
period=all) is offloaded to a background job, and the API returns 202 with a
poll_url. Poll /api/v2/jobs/{id}/ until status is
done, then read the same filings+pagination envelope (a
truncated: true flag means pagination.total is a floor, not an exact count).
The code above handles this for you. You can have a few jobs in flight before the API returns
429.
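To act on the total_capped flag from the pagination envelope, a pull can check the first page and, if the window overflows, re-issue it as several narrower queries. The following is a rough sketch: `is_capped` and `split_by_filing_type` are hypothetical helpers, and splitting on filing_type is only one of the options named above (city and zip work the same way).

```python
def is_capped(pagination):
    """True when the window overflows the 10,000-record ceiling.

    Reads total_capped rather than total, because total saturates
    at 10000 and cannot distinguish "exactly 10,000" from "more".
    """
    return bool(pagination.get("total_capped"))


def split_by_filing_type(params, codes):
    """Yield one copy of `params` per filing_type code.

    Each copy is an independent, narrower query; the original
    dict is left untouched.
    """
    for code in codes:
        yield {**params, "filing_type": code}
```

A narrower query can still be capped, so check `is_capped` again on each sub-query's first page and split further (for example by zip prefix) if needed.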
Get contact details for each lead
The list response is deliberately lean: each row carries only
corporation_number, corporation_name, filing_type,
filing_type_display, status, file_date, county,
and the principal city/state/zip. Add
?include=parties to inline each row's officers and
registered_agent (redaction-masked) — so you get the people in the same paged request,
no per-row detail call. The full principal and mailing addresses and fei_number
remain detail-only; for those, call /api/v2/filings/{corporation_number}/ once
per record.
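Since the lean list rows are flat, they can go straight into a spreadsheet-friendly file. The sketch below writes only the fields named above; `write_leads_csv` is a hypothetical helper, and any extra keys on a row (such as the nested officers from ?include=parties) are ignored rather than flattened.

```python
import csv

# Flat fields the list response carries for every row, per the text above.
LEAN_FIELDS = [
    "corporation_number", "corporation_name", "filing_type",
    "filing_type_display", "status", "file_date", "county",
]


def write_leads_csv(rows, fileobj):
    """Write list rows to an open text file as CSV.

    Keys outside LEAN_FIELDS are skipped and missing keys are
    written as empty cells, so partial rows do not raise.
    """
    writer = csv.DictWriter(
        fileobj, fieldnames=LEAN_FIELDS, extrasaction="ignore", restval=""
    )
    writer.writeheader()
    writer.writerows(rows)
```

Open the target with `open("leads.csv", "w", newline="")` so the csv module controls line endings.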
If officers and the registered agent are all you need, ?include=parties returns them inline and skips the
per-lead call entirely. You still need a detail call for full addresses or fei_number:
1,000 such lookups is 1,000 requests against a 1,000-per-hour limit, so fetch details only for the
leads you'll actually use, and pace the calls (watch X-RateLimit-Remaining).
def fetch_detail(corporation_number):
    resp = requests.get(BASE + f"{corporation_number}/", headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()  # officers, registered_agent, principal/mailing address, fei_number


for lead in leads[:50]:  # fetch details for a slice, not the whole list
    detail = fetch_detail(lead["corporation_number"])
    lead["officers"] = detail["officers"]
    lead["registered_agent"] = detail["registered_agent"]

Names come back as LAST FIRST with padding, so parse them
before using. And principal_address.state is often blank even for Florida businesses
(the mailing address may carry the "FL"), so a state=FL filter silently misses those
rows. Filter by city or zip instead when you can.
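A minimal sketch of parsing the LAST FIRST padding mentioned above. The exact layout is not specified here, so this assumes whitespace padding with the surname as the first token and everything after it as given names; `parse_padded_name` is a hypothetical helper. Multi-word surnames and company names (a registered agent can be a business) will not split correctly, so treat the result as a best guess.

```python
def parse_padded_name(raw):
    """Split a padded 'LAST FIRST [MIDDLE...]' string into (first, last).

    Assumption: the first whitespace-separated token is the surname
    and the remaining tokens are given names.
    """
    tokens = raw.split()  # split() with no arguments collapses the padding
    if not tokens:
        return "", ""
    last, given = tokens[0], tokens[1:]
    return " ".join(given).title(), last.title()
```

For example, `parse_padded_name("SMITH          JOHN   A  ")` gives `("John A", "Smith")`. Note that `str.title()` is a rough capitalizer and will turn a name like MCDONALD into "Mcdonald".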
Filter by location the right way
city is an exact match (case-insensitive), not a substring search.
city=miami returns Miami but not "Miami Beach" — that's a separate city. To
cover a metro, pass comma-separated cities (city=miami,miami beach,hialeah) or filter
by a zip prefix instead (zip=331 matches all ZIPs starting 331). The
zip filter is a prefix match, which makes it the simplest way to pull a whole area.
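The two matching rules can be made concrete with a small sketch. `location_params` builds the query values described above; `city_matches` and `zip_matches` are local illustrations of the semantics the text describes, not the API's actual implementation. All three names are hypothetical.

```python
def location_params(cities=None, zip_prefix=None):
    """Build city/zip query params for a metro-area pull."""
    params = {}
    if cities:
        # Comma-separated list; each entry is matched exactly.
        params["city"] = ",".join(c.strip() for c in cities)
    if zip_prefix:
        params["zip"] = str(zip_prefix)
    return params


def city_matches(row_city, wanted):
    """Exact, case-insensitive comparison (no substring matching)."""
    return row_city.strip().lower() == wanted.strip().lower()


def zip_matches(row_zip, prefix):
    """Prefix comparison: '331' matches '33101', '33139', and so on."""
    return row_zip.startswith(prefix)
```

Pass the dict to `requests` as part of `params` and it will URL-encode the space in a value such as "miami beach" for you.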
Turn it into a daily job
- Schedule a rolling pull: run the script each morning against period=7d (or wider). Because recent days keep filling in, a rolling window catches late-released filings that a single-day pull would miss.
- Deduplicate: store the corporation_number of every lead you've already seen so overlapping windows and re-runs don't create duplicates. New numbers are your fresh leads.
- Mind the rate limit: requests are capped at 1,000 per hour per key, and each detail lookup is its own request. Every response includes X-RateLimit-Remaining so you can pace large pulls.
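One possible way to pace a large pull is to slow down once X-RateLimit-Remaining gets low. In this sketch the threshold (`floor=50`) is an arbitrary assumption, and the 3.6-second spacing is simply 3,600 seconds divided by the 1,000-per-hour limit; `pause_seconds` and `paced_get` are hypothetical helpers, and `session` can be a `requests.Session` or anything with a compatible `get` method.

```python
import time


def pause_seconds(remaining, floor=50, spacing=3.6):
    """Seconds to wait before the next request.

    remaining: value of the X-RateLimit-Remaining header (str, int, or
    None if the header was absent). While plenty of budget is left,
    no wait is needed; at or below `floor`, fall back to an even
    spacing of one request per 3.6 s (1,000 per hour).
    """
    if remaining is None:
        return spacing  # header missing: be conservative
    return spacing if int(remaining) <= floor else 0.0


def paced_get(session, url, **kwargs):
    """GET through `session`, then sleep according to the remaining budget."""
    resp = session.get(url, **kwargs)
    time.sleep(pause_seconds(resp.headers.get("X-RateLimit-Remaining")))
    return resp
```

Swapping `requests.get` for `paced_get(session, ...)` in the detail loop keeps a long run from hitting the hourly cap partway through.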
Next steps
Create a free key on your API Keys dashboard, then see the API documentation for the full list of filters, fields, and the other resources (fictitious names and partnerships, which use the same rolling-window and pagination patterns described here).