pytidycensus.geography

Geographic boundary data retrieval and processing using TIGER shapefiles.

Functions

get_block_group_boundaries(state[, county, year])

Get block group boundaries for a state, optionally filtered by county.

get_county_boundaries([state, year])

Get US county boundaries, optionally filtered by state.

get_geography(geography[, year, state, ...])

Download and load geographic boundary data from TIGER/Line shapefiles.

get_state_boundaries([year])

Get US state boundaries.

get_tract_boundaries(state[, county, year])

Get census tract boundaries for a state, optionally filtered by county.

Classes

TigerDownloader([cache_dir])

Downloads and processes TIGER/Line shapefiles from the US Census Bureau.

class pytidycensus.geography.TigerDownloader(cache_dir=None)[source]

Bases: object

Downloads and processes TIGER/Line shapefiles from the US Census Bureau.

BASE_URL = 'https://www2.census.gov/geo/tiger'

__init__(cache_dir=None)[source]

Initialize the TIGER downloader.

Parameters:

cache_dir (str, optional) – Directory for caching downloaded files
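
BASE_URL is the root of the Census Bureau's TIGER directory tree. As a rough illustration of how per-geography archives are laid out beneath it, the sketch below builds TIGER/Line zip URLs following the naming used for recent vintages such as 2022. The helper tiger_url and the lookup tables are hypothetical and not part of pytidycensus; whether the library assembles its URLs this way is an assumption.

```python
from typing import Optional

# Same value as TigerDownloader.BASE_URL.
BASE_URL = "https://www2.census.gov/geo/tiger"

# (directory name, file suffix) for a few geographies in the TIGER/Line tree.
_LAYOUT = {
    "state": ("STATE", "state"),
    "county": ("COUNTY", "county"),
    "tract": ("TRACT", "tract"),
    "block group": ("BG", "bg"),
}

# Geographies published as one nationwide file rather than one file per state.
_NATIONAL = {"state", "county"}


def tiger_url(geography: str, year: int = 2022, state_fips: Optional[str] = None) -> str:
    """Hypothetical helper: build a TIGER/Line zip URL for one geography."""
    directory, suffix = _LAYOUT[geography]
    if geography in _NATIONAL:
        scope = "us"
    else:
        if state_fips is None:
            raise ValueError(f"{geography!r} files are published per state")
        # State FIPS codes are two digits, zero-padded ("6" -> "06").
        scope = str(state_fips).zfill(2)
    return f"{BASE_URL}/TIGER{year}/{directory}/tl_{year}_{scope}_{suffix}.zip"


print(tiger_url("county"))
print(tiger_url("tract", state_fips="48"))
```

Older vintages use different directory layouts, so a table like this would likely need per-year variations.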

static download_with_wget_or_curl(url, zip_path)[source]

download_and_extract(url, filename)[source]

Download and extract a TIGER shapefile.

Parameters:
  • url (str) – Download URL

  • filename (str) – Local filename for caching

Returns:

Path to the directory containing the extracted shapefile

Return type:

str
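
A download-then-extract step with caching can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's implementation: the function name fetch_and_unzip, the explicit cache_dir argument, and the choice of urllib for the transfer are all hypothetical, and filename is assumed to end in .zip.

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_unzip(url: str, filename: str, cache_dir: str) -> str:
    """Sketch: fetch `url` into `cache_dir` once, unzip it, return the extract dir."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)

    zip_path = cache / filename
    # "tl_2022_us_county.zip" is extracted into "<cache>/tl_2022_us_county/".
    extract_dir = cache / zip_path.stem

    # Reuse an earlier extraction instead of downloading again.
    if extract_dir.is_dir() and any(extract_dir.iterdir()):
        return str(extract_dir)

    # Download only when the archive is not already in the cache.
    if not zip_path.exists():
        urllib.request.urlretrieve(url, zip_path)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_dir)
    return str(extract_dir)
```

Keying the cache on the archive name means a second call with the same filename returns immediately without touching the network.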

get_shapefile_path(extract_dir)[source]

Find the shapefile (.shp) in the extracted directory.

Parameters:

extract_dir (str) – Directory containing extracted files

Returns:

Path to the .shp file

Return type:

str
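
A TIGER archive unpacks into several sidecar files (.dbf, .prj, .shx, and so on) next to the .shp itself, so locating the shapefile amounts to a filtered directory search. The sketch below shows one way to do that with pathlib; the name find_shapefile and the choice to raise FileNotFoundError are assumptions for illustration, not the documented behaviour of get_shapefile_path.

```python
from pathlib import Path


def find_shapefile(extract_dir: str) -> str:
    """Sketch: return the path of the first .shp file under `extract_dir`."""
    # "*.shp" matches names ending in .shp, so sidecars such as
    # "tl_2022_us_county.shp.xml" are not picked up.
    matches = sorted(Path(extract_dir).rglob("*.shp"))
    if not matches:
        raise FileNotFoundError(f"No .shp file found in {extract_dir}")
    return str(matches[0])
```

Sorting the matches keeps the result deterministic if an archive ever contains more than one shapefile.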

pytidycensus.geography.get_geography(geography, year=2022, state=None, county=None, keep_geo_vars=False, cache_dir=None, **kwargs)[source]

Download and load geographic boundary data from TIGER/Line shapefiles.

Parameters:
  • geography (str) – Geography type (e.g., 'county', 'tract', 'block group')

  • year (int, default 2022) – Year (vintage) of the TIGER/Line boundary files

  • state (str, int, or list, optional) – State or states to filter by

  • county (str, int, or list, optional) – County or counties to filter by (requires state)

  • keep_geo_vars (bool, default False) – Whether to keep all geographic variables

  • cache_dir (str, optional) – Directory for caching downloaded files

  • **kwargs – Additional filtering parameters

Returns:

Geographic boundary data

Return type:

geopandas.GeoDataFrame
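
The state and county parameters accept a string, an integer, or a list. Census GEOIDs begin with a 2-digit state FIPS code followed by a 3-digit county FIPS code, so one plausible way to apply such filters is to normalize the inputs to zero-padded strings and match on GEOID prefixes. The sketch below shows that idea on plain lists; the helper names are hypothetical, it handles numeric FIPS codes only (mapping abbreviations such as "TX" to "48" is left out), and it is not taken from the pytidycensus source.

```python
from typing import Iterable, List, Optional, Union

Code = Union[str, int]
Codes = Union[Code, Iterable[Code]]


def as_fips_list(value: Codes, width: int) -> List[str]:
    """Normalize one code or a list of codes to zero-padded strings."""
    items = [value] if isinstance(value, (str, int)) else list(value)
    return [str(item).zfill(width) for item in items]


def geoid_prefixes(state: Codes, county: Optional[Codes] = None) -> List[str]:
    """Build GEOID prefixes: state FIPS, optionally followed by county FIPS."""
    states = as_fips_list(state, 2)
    if county is None:
        return states
    counties = as_fips_list(county, 3)
    return [s + c for s in states for c in counties]


def filter_geoids(geoids: Iterable[str], state: Codes,
                  county: Optional[Codes] = None) -> List[str]:
    """Keep only the GEOIDs that fall inside the requested state(s)/county(ies)."""
    prefixes = tuple(geoid_prefixes(state, county))
    return [g for g in geoids if g.startswith(prefixes)]


tracts = ["48201100000", "48113000100", "06037101110"]
print(filter_geoids(tracts, state=48, county="201"))
```

With a GeoDataFrame, the same prefixes could be applied to a GEOID column instead of a list.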

Examples

>>> from pytidycensus.geography import get_geography
>>>
>>> # Get county boundaries for Texas
>>> tx_counties = get_geography("county", state="TX", year=2022)
>>>
>>> # Get tract boundaries for Harris County, TX (county FIPS code 201)
>>> harris_tracts = get_geography(
...     "tract",
...     state="TX",
...     county="201",
...     year=2022
... )

pytidycensus.geography.get_state_boundaries(year=2022, **kwargs)[source]

Get US state boundaries.

Return type:

GeoDataFrame

pytidycensus.geography.get_county_boundaries(state=None, year=2022, **kwargs)[source]

Get US county boundaries, optionally filtered by state.

Return type:

GeoDataFrame

pytidycensus.geography.get_tract_boundaries(state, county=None, year=2022, **kwargs)[source]

Get census tract boundaries for a state, optionally filtered by county.

Return type:

GeoDataFrame

pytidycensus.geography.get_block_group_boundaries(state, county=None, year=2022, **kwargs)[source]

Get block group boundaries for a state, optionally filtered by county.

Return type:

GeoDataFrame