pytidycensus.decennial

Decennial Census data retrieval functions.

Functions

get_decennial(geography[, variables, table, ...])

Obtain data from the US Decennial Census.

get_decennial_variables([year, sumfile])

Get available decennial Census variables for a given year.

pytidycensus.decennial.get_decennial(geography, variables=None, table=None, cache_table=False, year=2020, sumfile=None, state=None, county=None, output='tidy', geometry=False, keep_geo_vars=False, shift_geo=False, summary_var=None, pop_group=None, pop_group_label=False, api_key=None, show_call=False, **kwargs)

Obtain data from the US Decennial Census.

Parameters:
  • geography (str) – The geography of your data (e.g., ‘county’, ‘tract’, ‘block group’).

  • variables (str, list of str, or dict, optional) – Variable ID(s) to retrieve. Can be a single variable, list of variables, or dictionary mapping custom names to variable IDs. If not provided, must specify table.

  • table (str, optional) – Census table ID to retrieve all variables from.

  • cache_table (bool, default False) – Whether to cache table names for faster future access.

  • year (int, default 2020) – Census year (2000, 2010, or 2020). Note: 1990 data is not available via the API.

  • sumfile (str, optional) – Summary file to use. Defaults to ‘pl’ for 2020, ‘sf1’ for earlier years. Available options vary by year.

  • state (str, int, or list, optional) – State(s) to retrieve data for. Accepts names, abbreviations, or FIPS codes.

  • county (str, int, or list, optional) – County or counties to retrieve data for. Must be used with state.

  • output (str, default "tidy") – Output format (“tidy” or “wide”).

  • geometry (bool, default False) – Whether to include geometry for mapping.

  • keep_geo_vars (bool, default False) – Whether to keep all geographic variables from shapefiles.

  • shift_geo (bool, default False) – Deprecated. If True, a warning is issued advising the use of an alternative method for shifting geometry.

  • summary_var (str, optional) – Summary variable from the decennial Census to include for comparison.

  • pop_group (str, optional) – Population group code for which you’d like to request data (for selected sumfiles).

  • pop_group_label (bool, default False) – If True, return a pop_group_label column with the population group description.

  • api_key (str, optional) – Census API key. If not provided, looks for CENSUS_API_KEY environment variable.

  • show_call (bool, default False) – Whether to print the API call URL.

  • **kwargs – Additional parameters passed to geography functions.
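The three accepted forms of variables (a single ID, a list of IDs, or a dictionary of custom names) can be thought of as collapsing into one name-to-ID mapping. The sketch below illustrates that idea only; normalize_variables is a hypothetical helper written for this page, not a function exported by pytidycensus, and the library's internal handling may differ.

```python
from typing import Dict, Mapping, Optional, Sequence, Union

VariablesArg = Union[str, Sequence[str], Mapping[str, str], None]


def normalize_variables(
    variables: VariablesArg, table: Optional[str] = None
) -> Dict[str, str]:
    """Return a {output_name: variable_id} mapping (hypothetical helper)."""
    if variables is None:
        # The parameter description says a table must be given in this case.
        if table is None:
            raise ValueError("Specify either 'variables' or 'table'.")
        return {}  # the table's variables would be resolved elsewhere
    if isinstance(variables, str):
        # Check str first: a string is also a sequence of characters.
        return {variables: variables}
    if isinstance(variables, Mapping):
        # Custom names map to variable IDs, as in {"total": "P1_001N"}.
        return dict(variables)
    # A plain list: each variable keeps its own ID as its name.
    return {var_id: var_id for var_id in variables}


print(normalize_variables("P1_001N"))
print(normalize_variables(["P1_003N", "P1_004N"]))
print(normalize_variables({"total": "P1_001N", "white": "P1_003N"}))
```

Reducing every form to a single mapping means later steps only have to handle one shape, and the dictionary form's custom names carry through unchanged.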

Returns:

Decennial Census data, optionally with geometry.

Return type:

pandas.DataFrame or geopandas.GeoDataFrame
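To make the difference between the "tidy" and "wide" output formats concrete, here is a small standard-library sketch that reshapes tidy rows (one row per geography and variable) into wide rows (one row per geography, one column per variable). The column names GEOID, NAME, variable, and value are assumptions borrowed from the usual tidycensus convention, and the numbers are placeholders, not Census figures.

```python
from typing import Any, Dict, List

# Tidy layout: one row per (geography, variable) pair. Placeholder values only.
tidy_rows: List[Dict[str, Any]] = [
    {"GEOID": "001", "NAME": "Area A", "variable": "total", "value": 100},
    {"GEOID": "001", "NAME": "Area A", "variable": "white", "value": 60},
    {"GEOID": "002", "NAME": "Area B", "variable": "total", "value": 200},
    {"GEOID": "002", "NAME": "Area B", "variable": "white", "value": 90},
]


def tidy_to_wide(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Pivot tidy rows so each variable becomes its own column."""
    wide: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        # Start a record the first time a geography is seen.
        record = wide.setdefault(
            row["GEOID"], {"GEOID": row["GEOID"], "NAME": row["NAME"]}
        )
        record[row["variable"]] = row["value"]
    return list(wide.values())


wide_rows = tidy_to_wide(tidy_rows)
for record in wide_rows:
    print(record)
```

The tidy layout tends to suit grouping and plotting by variable, while the wide layout is convenient for computing ratios between columns of the same geography.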

Examples

>>> import pytidycensus as tc
>>> tc.set_census_api_key("your_key_here")
>>>
>>> # Get total population by state for 2020
>>> pop_2020 = tc.get_decennial(
...     geography="state",
...     variables="P1_001N",
...     year=2020
... )
>>>
>>> # Get race/ethnicity data with geometry
>>> race_data = tc.get_decennial(
...     geography="county",
...     variables=["P1_003N", "P1_004N", "P1_005N"],
...     state="CA",
...     year=2020,
...     geometry=True
... )
>>>
>>> # Get data with named variables and summary variable
>>> pop_data = tc.get_decennial(
...     geography="county",
...     variables={"total": "P1_001N", "white": "P1_003N"},
...     state="TX",
...     year=2020,
...     summary_var="P1_001N"
... )
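
The examples above set the key explicitly, but the api_key parameter is documented as falling back to the CENSUS_API_KEY environment variable. A minimal sketch of that lookup order follows; resolve_api_key is a hypothetical helper, and raising an error when no key is found is an assumption rather than confirmed library behavior.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Prefer an explicit key, then fall back to CENSUS_API_KEY (hypothetical helper)."""
    env = os.environ if environ is None else environ
    key = api_key or env.get("CENSUS_API_KEY")
    if not key:
        # Assumption: a missing key is treated as an error.
        raise ValueError(
            "No Census API key found; pass api_key or set CENSUS_API_KEY."
        )
    return key


# An explicit argument takes precedence over the environment.
print(resolve_api_key("explicit_key", {"CENSUS_API_KEY": "env_key"}))
# With no argument, the environment variable is used.
print(resolve_api_key(None, {"CENSUS_API_KEY": "env_key"}))
```

Passing the environment as a parameter keeps the sketch easy to exercise without modifying the real process environment.
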

pytidycensus.decennial.get_decennial_variables(year=2020, sumfile=None)

Get available decennial Census variables for a given year.

Parameters:
  • year (int, default 2020) – Census year (2000, 2010, or 2020).

  • sumfile (str, optional) – Summary file. Defaults to ‘pl’ for 2020, ‘sf1’ for earlier years.

Returns:

Available variables with metadata.

Return type:

pandas.DataFrame
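
The default rule for sumfile described above ('pl' for 2020, 'sf1' for earlier years) can be written as a short function. This is an illustrative sketch: default_sumfile is a hypothetical name, and the check against 2000, 2010, and 2020 is an assumption carried over from the year description of get_decennial.

```python
from typing import Optional

SUPPORTED_YEARS = (2000, 2010, 2020)  # assumed from the get_decennial year description


def default_sumfile(year: int = 2020, sumfile: Optional[str] = None) -> str:
    """Return the summary file to query (hypothetical helper)."""
    if year not in SUPPORTED_YEARS:
        raise ValueError(f"Unsupported decennial Census year: {year}")
    if sumfile is not None:
        # An explicit choice always wins over the default.
        return sumfile
    return "pl" if year == 2020 else "sf1"


print(default_sumfile(2020))  # pl
print(default_sumfile(2010))  # sf1
```

Because variable IDs differ between summary files and years (for example, the 2020 examples above use IDs such as P1_001N), looking up the variables for the same year and sumfile you plan to query helps avoid requesting IDs that do not exist.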