pytidycensus.utils

Utility functions for data processing and validation.

Functions

add_margin_of_error(df, variables[, ...])

Add margin of error columns for ACS data with confidence level adjustment.

add_name_column(df)

Add NAME column using national_county.txt lookup table for geographic areas.

build_geography_params(geography[, state, ...])

Build geography parameters for Census API call.

check_overlapping_acs_periods(years, survey)

Check for and warn about overlapping ACS periods.

get_credentials()

load_county_lookup()

Load county lookup table from national_county.txt.

lookup_county_fips(county_name, state_fips)

Look up county FIPS code by name.

process_census_data(data, variables[, output])

Process raw Census API response into pandas DataFrame.

validate_county(county, state_fips)

Validate and convert county identifiers to FIPS codes.

validate_geography(geography[, dataset])

Validate geography parameter.

validate_state(state)

Validate and convert state identifiers to FIPS codes.

validate_year(year, dataset)

Validate year for given dataset.

pytidycensus.utils.get_credentials()[source]

pytidycensus.utils.load_county_lookup()[source]

Load county lookup table from national_county.txt.

Returns:

DataFrame with columns: state_abbrev, state_fips, county_fips, county_name

Return type:

pd.DataFrame

pytidycensus.utils.add_name_column(df)[source]

Add NAME column using national_county.txt lookup table for geographic areas.

Works for state, county, and tract level geographies by matching GEOID. For tract-level data, shows county and state name without tract number.

Parameters:

df (pd.DataFrame) – DataFrame with GEOID column

Returns:

DataFrame with NAME column added

Return type:

pd.DataFrame
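The GEOID matching described above can be illustrated with a small standalone sketch. It does not call pytidycensus: the `COUNTIES` table, the `name_for_geoid` helper, and the "County, ST" name format are all assumptions made for illustration. The only fact relied on is the standard GEOID layout (2-digit state, then 3-digit county, then 6-digit tract).

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for the national_county.txt lookup:
# (state_fips, county_fips) -> (state_abbrev, county_name)
COUNTIES: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("06", "037"): ("CA", "Los Angeles County"),
    ("48", "201"): ("TX", "Harris County"),
}


def name_for_geoid(geoid: str) -> Optional[str]:
    """Hypothetical helper: derive a display name from a GEOID prefix."""
    state = geoid[:2]
    if len(geoid) == 2:
        # State-level GEOID: report the state abbreviation if we know it.
        for (state_fips, _), (abbrev, _) in COUNTIES.items():
            if state_fips == state:
                return abbrev
        return None
    # County (5 digits) and tract (11 digits) share the same first 5 digits,
    # so the tract number is simply ignored.
    entry = COUNTIES.get((state, geoid[2:5]))
    if entry is None:
        return None
    abbrev, county_name = entry
    return f"{county_name}, {abbrev}"


print(name_for_geoid("06037"))
print(name_for_geoid("06037920336"))
```

Both calls resolve to the same county entry, which mirrors the documented behavior of showing county and state without the tract number.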

pytidycensus.utils.validate_state(state)[source]

Validate and convert state identifiers to FIPS codes.

Parameters:

state (str, int, or list) – State name(s), abbreviation(s), or FIPS code(s)

Returns:

List of 2-digit FIPS codes

Return type:

List[str]

Raises:

ValueError – If state identifier is invalid
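As a rough illustration of this kind of normalization, the sketch below maps names, abbreviations, and numeric codes to 2-digit FIPS strings. It is a standalone approximation, not the library's implementation: `STATE_FIPS` is a deliberately tiny table and `to_state_fips` is a hypothetical name.

```python
from typing import List, Union

# Tiny illustrative table; a real one would cover every state and territory.
STATE_FIPS = {
    "AL": ("Alabama", "01"),
    "CA": ("California", "06"),
    "TX": ("Texas", "48"),
}


def to_state_fips(state: Union[str, int, list]) -> List[str]:
    """Hypothetical helper: normalize state identifiers to 2-digit FIPS codes."""
    items = state if isinstance(state, list) else [state]
    by_abbrev = {abbrev: fips for abbrev, (_, fips) in STATE_FIPS.items()}
    by_name = {name.lower(): fips for name, fips in STATE_FIPS.values()}
    valid = set(by_abbrev.values())

    result = []
    for item in items:
        text = str(item).strip()
        if text.isdigit():
            # Integers such as 6 become zero-padded strings such as "06".
            fips = text.zfill(2)
            if fips not in valid:
                raise ValueError(f"Invalid state identifier: {item!r}")
        elif text.upper() in by_abbrev:
            fips = by_abbrev[text.upper()]
        elif text.lower() in by_name:
            fips = by_name[text.lower()]
        else:
            raise ValueError(f"Invalid state identifier: {item!r}")
        result.append(fips)
    return result


print(to_state_fips(["Texas", "ca", 1]))
```

Returning a list even for a single input keeps the return type uniform, which matches the `List[str]` return type documented above.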

pytidycensus.utils.validate_county(county, state_fips)[source]

Validate and convert county identifiers to FIPS codes.

Parameters:
  • county (str, int, or list) – County name(s) or FIPS code(s)

  • state_fips (str) – State FIPS code

Returns:

List of 3-digit county FIPS codes

Return type:

List[str]

Raises:

ValueError – If county identifier is invalid

pytidycensus.utils.lookup_county_fips(county_name, state_fips)[source]

Look up county FIPS code by name.

Parameters:
  • county_name (str) – County name to look up

  • state_fips (str) – State FIPS code

Returns:

County FIPS code if found, None otherwise

Return type:

Optional[str]

pytidycensus.utils.validate_year(year, dataset)[source]

Validate year for given dataset.

Parameters:
  • year (int) – Census year

  • dataset (str) – Dataset type (‘acs’, ‘dec’, ‘estimates’)

Returns:

Validated year

Return type:

int

Raises:

ValueError – If year is not available for dataset

pytidycensus.utils.check_overlapping_acs_periods(years, survey)[source]

Check for and warn about overlapping ACS periods.

Overlapping ACS periods (e.g., 2018 and 2019 for ACS5) share common years and should not be used for statistical comparisons or trend analysis.

Parameters:
  • years (list of int) – Years being requested

  • survey (str) – Survey type (‘acs1’, ‘acs3’, or ‘acs5’)

Return type:

None

Warns:

UserWarning – If overlapping periods are detected
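The overlap rule can be sketched as follows, assuming each requested year is the end year of its period (so the 2019 ACS5 covers 2015 to 2019). Two periods then share at least one year whenever their end years differ by less than the period length. The helper names `overlapping_pairs` and `warn_if_overlapping` are hypothetical and this is not the library's code.

```python
import warnings
from itertools import combinations
from typing import List, Tuple

# Number of calendar years pooled into each ACS product.
PERIOD_LENGTH = {"acs1": 1, "acs3": 3, "acs5": 5}


def overlapping_pairs(years: List[int], survey: str) -> List[Tuple[int, int]]:
    """Return pairs of end years whose survey periods share at least one year."""
    span = PERIOD_LENGTH[survey]
    pairs = []
    for earlier, later in combinations(sorted(set(years)), 2):
        if later - earlier < span:
            pairs.append((earlier, later))
    return pairs


def warn_if_overlapping(years: List[int], survey: str) -> None:
    """Emit a UserWarning when any two requested periods overlap."""
    pairs = overlapping_pairs(years, survey)
    if pairs:
        warnings.warn(
            f"Overlapping {survey} periods requested: {pairs}. "
            "Avoid comparing estimates across these years.",
            UserWarning,
        )


print(overlapping_pairs([2014, 2018, 2019], "acs5"))
```

Under this rule, 2014 and 2019 are safe to compare for ACS5 (2010 to 2014 versus 2015 to 2019), while any pair of ACS1 years never overlaps.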

pytidycensus.utils.validate_geography(geography, dataset=None)[source]

Validate geography parameter.

Parameters:
  • geography (str) – Geography level

  • dataset (str, optional) – Dataset type (“acs”, “decennial”, “estimates”) for context-aware validation

Returns:

Validated geography

Return type:

str

Raises:

pytidycensus.utils.build_geography_params(geography, state=None, county=None, **kwargs)[source]

Build geography parameters for Census API call.

Parameters:
  • geography (str) – Geography level

  • state (str, int, or list, optional) – State identifier(s)

  • county (str, int, or list, optional) – County identifier(s)

  • **kwargs – Additional geography parameters

Returns:

Geography parameters for API call

Return type:

Dict[str, str]

Raises:

NotImplementedError – If geography is recognized but not yet implemented
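For orientation, the Census Data API selects geographies through `for` and `in` query predicates (for example `for=county:*` with `in=state:06`). The sketch below builds such a dictionary for three geography levels. It is an assumption that the function above returns these keys; `geography_params` and its arguments are hypothetical names, and already-validated FIPS codes are taken as input.

```python
from typing import Dict, Optional


def geography_params(
    geography: str,
    state_fips: Optional[str] = None,
    county_fips: Optional[str] = None,
) -> Dict[str, str]:
    """Hypothetical helper: build Census API `for`/`in` predicates."""
    if geography == "state":
        return {"for": f"state:{state_fips or '*'}"}

    if geography == "county":
        params = {"for": f"county:{county_fips or '*'}"}
        if state_fips:
            params["in"] = f"state:{state_fips}"
        return params

    if geography == "tract":
        if not state_fips:
            raise ValueError("Tract requests need a state")
        # Nested geographies are listed in the `in` predicate, separated by spaces.
        parents = [f"state:{state_fips}"]
        if county_fips:
            parents.append(f"county:{county_fips}")
        return {"for": "tract:*", "in": " ".join(parents)}

    # Mirrors the documented behavior for levels that are not handled yet.
    raise NotImplementedError(f"Geography {geography!r} is not implemented in this sketch")


print(geography_params("tract", state_fips="06", county_fips="037"))
```

The resulting dictionary can be merged with the variable list and API key when assembling the request query string.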

pytidycensus.utils.process_census_data(data, variables, output='tidy')[source]

Process raw Census API response into pandas DataFrame.

Parameters:
  • data (List[Dict[str, Any]]) – Raw Census API response

  • variables (List[str]) – Variable codes requested

  • output (str, default "tidy") – Output format (“tidy” or “wide”)

Returns:

Processed data

Return type:

pd.DataFrame
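The difference between the two output formats can be shown without pandas. In the sketch below, a "wide" record holds one key per variable, while the "tidy" form holds one record per geography and variable. The `to_tidy` helper and the `variable` and `value` column names are assumptions for illustration; the library's actual column names may differ.

```python
from typing import Any, Dict, List, Sequence


def to_tidy(
    rows: List[Dict[str, Any]],
    variables: Sequence[str],
    id_cols: Sequence[str] = ("GEOID",),
) -> List[Dict[str, Any]]:
    """Hypothetical helper: reshape wide records into one record per variable."""
    tidy = []
    for row in rows:
        ids = {col: row[col] for col in id_cols}
        for var in variables:
            tidy.append({**ids, "variable": var, "value": row[var]})
    return tidy


# Made-up numbers, used only to show the reshaping.
wide = [
    {"GEOID": "06", "B01003_001E": 100, "B19013_001E": 200},
    {"GEOID": "48", "B01003_001E": 300, "B19013_001E": 400},
]

for record in to_tidy(wide, ["B01003_001E", "B19013_001E"]):
    print(record)
```

With a DataFrame, the same reshaping corresponds to `pandas.DataFrame.melt` with the identifier columns passed as `id_vars`.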

pytidycensus.utils.add_margin_of_error(df, variables, moe_level=90, output='tidy')[source]

Add margin of error columns for ACS data with confidence level adjustment.

Parameters:
  • df (pd.DataFrame) – Census data

  • variables (List[str]) – Variable codes

  • moe_level (int, default 90) – Confidence level (90, 95, or 99)

  • output (str, default "tidy") – Output format (“tidy” or “wide”)

Returns:

Data with margin of error columns

Return type:

pd.DataFrame
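The confidence level adjustment can be sketched with plain arithmetic. The Census Bureau publishes ACS margins of error at the 90 percent level, so a margin for another level is obtained by scaling with the ratio of z-scores (1.645, 1.96, and 2.576 for 90, 95, and 99 percent). The `rescale_moe` helper is a hypothetical name and may not match how the library implements this step.

```python
# Standard normal z-scores for the supported confidence levels.
Z_SCORES = {90: 1.645, 95: 1.96, 99: 2.576}


def rescale_moe(moe_90: float, moe_level: int = 90) -> float:
    """Hypothetical helper: convert a published 90% MOE to another level."""
    if moe_level not in Z_SCORES:
        raise ValueError("moe_level must be 90, 95, or 99")
    return moe_90 * Z_SCORES[moe_level] / Z_SCORES[90]


for level in (90, 95, 99):
    print(level, round(rescale_moe(100.0, level), 2))
```

Because the adjustment is a constant multiplier per level, it can be applied to an entire margin-of-error column in one vectorized operation.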