pytidycensus.estimates

Population estimates data retrieval functions.

Functions

discover_available_variables([vintage, ...])

Discover all available variables in a PEP dataset.

get_estimates(geography[, product, ...])

Obtain data from the US Census Bureau Population Estimates Program.

get_estimates_variables([year])

Get available population estimates variables for a given year.

Exceptions

APIError

Raised when there are issues with API requests.

DataNotAvailableError

Raised when requested data is not available.

InvalidGeographyError

Raised when an invalid geography is specified.

InvalidVariableError

Raised when an invalid variable is specified.

PopulationEstimatesError

Base exception class for Population Estimates errors.

exception pytidycensus.estimates.PopulationEstimatesError

Bases: Exception

Base exception class for Population Estimates errors.

exception pytidycensus.estimates.InvalidGeographyError

Bases: PopulationEstimatesError

Raised when an invalid geography is specified.

exception pytidycensus.estimates.InvalidVariableError

Bases: PopulationEstimatesError

Raised when an invalid variable is specified.

exception pytidycensus.estimates.DataNotAvailableError

Bases: PopulationEstimatesError

Raised when requested data is not available.

exception pytidycensus.estimates.APIError

Bases: PopulationEstimatesError

Raised when there are issues with API requests.
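
Because every exception above derives from PopulationEstimatesError, a caller can handle specific failures first and fall back to the base class. The following is a minimal sketch under that assumption. The stand-in classes are defined only when pytidycensus cannot be imported, and `describe_failure` and `failing` are hypothetical helpers for illustration, not part of the library.

```python
try:
    # Use the real classes when pytidycensus is installed.
    from pytidycensus.estimates import (
        APIError,
        DataNotAvailableError,
        InvalidGeographyError,
        InvalidVariableError,
        PopulationEstimatesError,
    )
except ImportError:
    # Stand-ins that mirror the documented hierarchy (illustration only).
    class PopulationEstimatesError(Exception):
        pass

    class InvalidGeographyError(PopulationEstimatesError):
        pass

    class InvalidVariableError(PopulationEstimatesError):
        pass

    class DataNotAvailableError(PopulationEstimatesError):
        pass

    class APIError(PopulationEstimatesError):
        pass


def describe_failure(action):
    """Run `action` and return a short status string (hypothetical helper)."""
    try:
        action()
    except (InvalidGeographyError, InvalidVariableError) as exc:
        return f"bad request: {exc}"
    except DataNotAvailableError as exc:
        return f"no data: {exc}"
    except PopulationEstimatesError as exc:
        # Catches APIError and any other subclass of the base class.
        return f"estimates error: {exc}"
    return "ok"


def failing(exc):
    """Return a zero-argument callable that raises `exc` (demonstration only)."""
    def action():
        raise exc
    return action


print(describe_failure(failing(APIError("timeout"))))
```

In real use, `action` would wrap a call such as `get_estimates(...)`; which exception a given failure raises is not specified here beyond the one-line descriptions above.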

pytidycensus.estimates.get_estimates(geography, product=None, variables=None, breakdown=None, breakdown_labels=False, vintage=2024, year=None, state=None, county=None, time_series=False, output='tidy', geometry=False, keep_geo_vars=False, api_key=None, show_call=False, **kwargs)

Obtain data from the US Census Bureau Population Estimates Program.

The Population Estimates Program (PEP) produces estimates of the population for the United States, its states, counties, cities, and towns. For years 2020 and later, data is retrieved from flat CSV files. For years 2019 and earlier, data comes from the Census API.
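
The year-based split described above can be restated as a small decision rule. This is only a sketch: `resolve_year`, `data_source`, and `needs_api_key` are hypothetical helpers, not pytidycensus functions, and the sketch assumes the rule is keyed on the data year, which falls back to the vintage when no year is given.

```python
def resolve_year(vintage=2024, year=None):
    """Return the data year, falling back to the vintage when year is omitted."""
    return vintage if year is None else year


def data_source(year):
    """Return 'csv' for 2020 and later, 'api' for 2019 and earlier."""
    return "csv" if year >= 2020 else "api"


def needs_api_key(vintage=2024, year=None):
    """True when the request would go through the Census API (assumed rule)."""
    return data_source(resolve_year(vintage, year)) == "api"


for vintage in (2019, 2024):
    print(vintage, data_source(resolve_year(vintage)), needs_api_key(vintage))
```

A check like this can help decide whether to set an API key before calling get_estimates, but the library's own handling is authoritative.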

Parameters:
  • geography (str) – The geography of your data. Options include:
      - ‘us’ (United States)
      - ‘region’ (Census regions)
      - ‘division’ (Census divisions)
      - ‘state’ (States and DC)
      - ‘county’ (Counties)
      - ‘cbsa’ (Core Based Statistical Areas)
      - ‘metropolitan statistical area/micropolitan statistical area’ (alias for cbsa)
      - ‘combined statistical area’ (Combined Statistical Areas)
      - ‘place’ (Incorporated places and Census designated places)

  • product (str, optional) – The data product. Options include:
      - ‘population’ (population totals)
      - ‘components’ (components of population change)
      - ‘characteristics’ (population by demographics)
    For years 2020 and later, only ‘characteristics’ requires this parameter.

  • variables (str or list of str, optional) – Variable ID(s) to retrieve. Use ‘all’ to get all available variables. Common variables include: ‘POP’, ‘BIRTHS’, ‘DEATHS’, ‘DOMESTICMIG’, ‘INTERNATIONALMIG’

  • breakdown (list of str, optional) – Population breakdown for characteristics product. Options include:
      - ‘AGEGROUP’ (age groups)
      - ‘SEX’ (sex)
      - ‘RACE’ (race)
      - ‘HISP’ (Hispanic origin)
    Can be combined, e.g., [‘SEX’, ‘RACE’].

  • breakdown_labels (bool, default False) – Whether to include human-readable labels for breakdown categories.

  • vintage (int, default 2024) – The PEP vintage (dataset version year). Using the most recent vintage is recommended.

  • year (int, optional) – The specific data year. Defaults to vintage if not specified.

  • state (str, int, or list, optional) – State(s) to retrieve data for. Accepts names, abbreviations, or FIPS codes.

  • county (str, int, or list, optional) – County(ies) to retrieve data for. Must be used with state.

  • time_series (bool, default False) – Whether to retrieve time series data back to 2010.

  • output (str, default "tidy") – Output format (“tidy” or “wide”).

  • geometry (bool, default False) – Whether to include geometry for mapping.

  • keep_geo_vars (bool, default False) – Whether to keep all geographic variables from shapefiles.

  • api_key (str, optional) – Census API key for years 2019 and earlier.

  • show_call (bool, default False) – Whether to print the API call URL (for API-based requests).

  • **kwargs – Additional parameters passed to geography functions.
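
The constraints listed in the parameters above can be checked before a request is made. The sketch below is one way to do that, not the library's own validation. `check_estimates_args` is a hypothetical helper, and it treats the documented option lists as exhaustive, which the text ("Options include") does not guarantee.

```python
GEOGRAPHIES = {
    "us", "region", "division", "state", "county", "cbsa",
    "metropolitan statistical area/micropolitan statistical area",
    "combined statistical area", "place",
}
PRODUCTS = {"population", "components", "characteristics"}
BREAKDOWNS = {"AGEGROUP", "SEX", "RACE", "HISP"}
OUTPUTS = {"tidy", "wide"}


def check_estimates_args(geography, product=None, breakdown=None,
                         state=None, county=None, output="tidy"):
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if geography not in GEOGRAPHIES:
        problems.append(f"unknown geography: {geography!r}")
    if product is not None and product not in PRODUCTS:
        problems.append(f"unknown product: {product!r}")
    # breakdown is documented as a list of str; None means no breakdown.
    unknown = [b for b in (breakdown or []) if b not in BREAKDOWNS]
    if unknown:
        problems.append(f"unknown breakdown: {unknown!r}")
    if county is not None and state is None:
        # Documented rule: county must be used with state.
        problems.append("county requires state")
    if output not in OUTPUTS:
        problems.append(f"unknown output: {output!r}")
    return problems


print(check_estimates_args("county", breakdown=["SEX", "AGEGROUP"], state="TX"))
print(check_estimates_args("county", county="Harris"))
```

Returning a list of problems instead of raising keeps the sketch independent of the library's exception classes; a caller could just as well raise on the first problem found.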

Returns:

Population estimates data, optionally with geometry.

Return type:

pandas.DataFrame or geopandas.GeoDataFrame
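
The difference between the two output formats can be illustrated without the library. The sketch below assumes the conventional meaning of the terms: "tidy" has one row per geography and variable, while "wide" has one row per geography with one column per variable. The column names (`GEOID`, `variable`, `value`) and the helper `to_wide` are assumptions for illustration and may not match the columns pytidycensus returns; the numbers are made-up placeholders, not real estimates.

```python
def to_wide(tidy_rows, id_key="GEOID", name_key="variable", value_key="value"):
    """Pivot tidy rows into one dict per geography (hypothetical helper)."""
    wide = {}
    for row in tidy_rows:
        # Create the per-geography record on first sight, then add the variable.
        record = wide.setdefault(row[id_key], {id_key: row[id_key]})
        record[row[name_key]] = row[value_key]
    return list(wide.values())


# Placeholder rows in the assumed tidy layout (values are not real data).
tidy = [
    {"GEOID": "01", "variable": "POP", "value": 100},
    {"GEOID": "01", "variable": "BIRTHS", "value": 10},
    {"GEOID": "02", "variable": "POP", "value": 200},
]

wide = to_wide(tidy)
for record in wide:
    print(record)
```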

Examples

>>> import pytidycensus as tc
>>> tc.set_census_api_key("your_key_here")
>>>
>>> # Get total population estimates by state
>>> state_pop = tc.get_estimates(
...     geography="state",
...     variables="POP",
...     year=2022
... )
>>>
>>> # Get population by age and sex for counties in Texas
>>> tx_pop_demo = tc.get_estimates(
...     geography="county",
...     variables="POP",
...     breakdown=["SEX", "AGEGROUP"],
...     state="TX",
...     breakdown_labels=True
... )

pytidycensus.estimates.discover_available_variables(vintage=2024, geography='state')

Discover all available variables in a PEP dataset.

Parameters:
  • vintage (int, default 2024) – The vintage year of the dataset

  • geography (str, default "state") – The geography to check for available variables

Returns:

DataFrame with variable names and descriptions

Return type:

pandas.DataFrame

pytidycensus.estimates.get_estimates_variables(year=2022)

Get available population estimates variables for a given year.

Parameters:

year (int, default 2022) – Estimates year

Returns:

Available variables with metadata

Return type:

pandas.DataFrame