pytidycensus.utils

Utility functions for data processing and validation.

Functions

add_margin_of_error(df, variables[, ...])

Add margin of error columns for ACS data with confidence level adjustment.

add_name_column(df)

Add NAME column using national_county.txt lookup table for geographic areas.

build_geography_params(geography[, state, ...])

Build geography parameters for Census API call.

get_credentials()

load_county_lookup()

Load county lookup table from national_county.txt.

lookup_county_fips(county_name, state_fips)

Look up county FIPS code by name.

process_census_data(data, variables[, output])

Process raw Census API response into pandas DataFrame.

validate_county(county, state_fips)

Validate and convert county identifiers to FIPS codes.

validate_geography(geography[, dataset])

Validate geography parameter.

validate_state(state)

Validate and convert state identifiers to FIPS codes.

validate_year(year, dataset)

Validate year for given dataset.

pytidycensus.utils.get_credentials()[source]

pytidycensus.utils.load_county_lookup()[source]

Load county lookup table from national_county.txt.

Returns:

DataFrame with columns: state_abbrev, state_fips, county_fips, county_name

Return type:

pd.DataFrame
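
To illustrate the table shape described above, the following sketch parses a few rows in the comma-separated layout of the Census Bureau's national_county.txt (state abbreviation, state FIPS, county FIPS, county name, class code). The helper name read_county_lookup and the inline sample are assumptions made for illustration; this is not the library's own loader.

```python
import io

import pandas as pd

# Three sample rows in the national_county.txt layout (no header row).
SAMPLE = """AL,01,001,Autauga County,H1
AL,01,003,Baldwin County,H1
CA,06,037,Los Angeles County,H1
"""


def read_county_lookup(source):
    """Hypothetical loader: returns the four columns named in the docs."""
    df = pd.read_csv(
        source,
        header=None,
        names=["state_abbrev", "state_fips", "county_fips",
               "county_name", "class_code"],
        dtype=str,  # keep leading zeros in FIPS codes
    )
    return df[["state_abbrev", "state_fips", "county_fips", "county_name"]]


lookup = read_county_lookup(io.StringIO(SAMPLE))
```

Reading every column as a string matters here: parsing FIPS codes as integers would turn "01" into 1 and break later GEOID matching.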

pytidycensus.utils.add_name_column(df)[source]

Add NAME column using national_county.txt lookup table for geographic areas.

Works for state-, county-, and tract-level geographies by matching on GEOID. For tract-level data, NAME contains the county and state names without the tract number.

Parameters:

df (pd.DataFrame) – DataFrame with GEOID column

Returns:

DataFrame with NAME column added

Return type:

pd.DataFrame
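
The GEOID matching described above can be sketched as follows. A GEOID begins with the 2-digit state FIPS code, followed by the 3-digit county FIPS code for county and tract geographies. The helper name_for_geoid, the two toy dictionaries, and the exact "County, State" label format are assumptions for illustration, not the library's implementation.

```python
# Toy lookup tables; a real table would come from national_county.txt.
STATE_NAMES = {"06": "California"}
COUNTY_NAMES = {"06037": "Los Angeles County"}


def name_for_geoid(geoid):
    """Hypothetical helper: derive a NAME label from a GEOID prefix."""
    state = STATE_NAMES[geoid[:2]]  # first 2 digits: state FIPS
    if len(geoid) == 2:
        return state
    # Digits 3-5 are the county FIPS; any remaining digits (the tract
    # number) are ignored, so tracts get the county and state label.
    county = COUNTY_NAMES[geoid[:5]]
    return f"{county}, {state}"
```

Under these assumptions, a state GEOID such as "06" maps to the state name, while a county GEOID and an 11-digit tract GEOID in the same county map to the same label.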

pytidycensus.utils.validate_state(state)[source]

Validate and convert state identifiers to FIPS codes.

Parameters:

state (str, int, or list) – State name(s), abbreviation(s), or FIPS code(s)

Returns:

List of 2-digit FIPS codes

Return type:

List[str]

Raises:

ValueError – If state identifier is invalid
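
As a rough sketch of the kind of normalization described above, the function below accepts names, abbreviations, or FIPS codes and returns zero-padded 2-digit strings. The name to_state_fips and the four-state lookup tables are assumptions for illustration; the library's own tables and matching rules may differ.

```python
# Small illustrative subset of states.
STATE_FIPS = {"AL": "01", "CA": "06", "NY": "36", "TX": "48"}
STATE_NAMES = {"alabama": "AL", "california": "CA",
               "new york": "NY", "texas": "TX"}


def to_state_fips(state):
    """Hypothetical helper: normalize state identifiers to 2-digit FIPS."""
    items = state if isinstance(state, (list, tuple)) else [state]
    codes = []
    for item in items:
        text = str(item).strip()
        if text.isdigit():
            code = text.zfill(2)  # 6 -> "06"
            if code not in STATE_FIPS.values():
                raise ValueError(f"Unknown state FIPS code: {item!r}")
        elif text.upper() in STATE_FIPS:
            code = STATE_FIPS[text.upper()]
        elif text.lower() in STATE_NAMES:
            code = STATE_FIPS[STATE_NAMES[text.lower()]]
        else:
            raise ValueError(f"Unknown state identifier: {item!r}")
        codes.append(code)
    return codes
```

A single value and a list are handled the same way, so the result is always a list, matching the documented List[str] return type.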

pytidycensus.utils.validate_county(county, state_fips)[source]

Validate and convert county identifiers to FIPS codes.

Parameters:
  • county (str, int, or list) – County name(s) or FIPS code(s)

  • state_fips (str) – State FIPS code

Returns:

List of 3-digit county FIPS codes

Return type:

List[str]

Raises:

ValueError – If county identifier is invalid
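
A minimal sketch of county normalization under similar assumptions: numeric input is zero-padded to three digits, and names are resolved within the given state. The helper to_county_fips, the two-entry table, and the tolerance for a missing "County" suffix are illustrative assumptions, not documented behavior of validate_county.

```python
# Toy lookup keyed by (state FIPS, lowercase county name).
COUNTY_FIPS = {
    ("06", "alameda county"): "001",
    ("06", "los angeles county"): "037",
}


def to_county_fips(county, state_fips):
    """Hypothetical helper: normalize county identifiers to 3-digit FIPS."""
    items = county if isinstance(county, (list, tuple)) else [county]
    codes = []
    for item in items:
        text = str(item).strip()
        if text.isdigit():
            codes.append(text.zfill(3))  # 37 -> "037"
            continue
        name = text.lower()
        if not name.endswith("county"):
            name += " county"  # assumed convenience: accept "Alameda"
        code = COUNTY_FIPS.get((state_fips, name))
        if code is None:
            raise ValueError(f"Unknown county {item!r} in state {state_fips}")
        codes.append(code)
    return codes
```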

pytidycensus.utils.lookup_county_fips(county_name, state_fips)[source]

Look up county FIPS code by name.

Parameters:
  • county_name (str) – County name to look up

  • state_fips (str) – State FIPS code

Returns:

County FIPS code if found, None otherwise

Return type:

Optional[str]

pytidycensus.utils.validate_year(year, dataset)[source]

Validate year for given dataset.

Parameters:
  • year (int) – Census year

  • dataset (str) – Dataset type ('acs', 'dec', 'estimates')

Returns:

Validated year

Return type:

int

Raises:

ValueError – If year is not available for dataset

pytidycensus.utils.validate_geography(geography, dataset=None)[source]

Validate geography parameter.

Parameters:
  • geography (str) – Geography level

  • dataset (str, optional) – Dataset type ("acs", "decennial", "estimates") for context-aware validation

Returns:

Validated geography

Return type:

str

Raises:

pytidycensus.utils.build_geography_params(geography, state=None, county=None, **kwargs)[source]

Build geography parameters for Census API call.

Parameters:
  • geography (str) – Geography level

  • state (str, int, or list, optional) – State identifier(s)

  • county (str, int, or list, optional) – County identifier(s)

  • **kwargs – Additional geography parameters

Returns:

Geography parameters for API call

Return type:

Dict[str, str]

Raises:

NotImplementedError – If geography is recognized but not yet implemented
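
For orientation, the Census Data API selects geographies with a "for" predicate and an optional "in" predicate naming the parent geography. The sketch below builds such a dictionary for three levels. The function geography_query and its handling of each level are assumptions for illustration and may not match what build_geography_params returns.

```python
def geography_query(geography, state_fips=None, county_fips=None):
    """Hypothetical helper: build 'for'/'in' predicates as a dict."""
    if geography == "state":
        return {"for": f"state:{state_fips or '*'}"}
    if geography == "county":
        params = {"for": f"county:{county_fips or '*'}"}
        if state_fips:
            params["in"] = f"state:{state_fips}"
        return params
    if geography == "tract":
        if not state_fips:
            raise ValueError("tract queries need a state")
        parent = f"state:{state_fips}"
        if county_fips:
            # Nested parents are space-separated inside one 'in' value.
            parent += f" county:{county_fips}"
        return {"for": "tract:*", "in": parent}
    raise NotImplementedError(f"geography {geography!r} is not covered here")
```

For example, geography_query("county", "06") yields a "for" value of "county:*" and an "in" value of "state:06", which requests every county in the state with FIPS code 06.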

pytidycensus.utils.process_census_data(data, variables, output='tidy')[source]

Process raw Census API response into pandas DataFrame.

Parameters:
  • data (List[Dict[str, Any]]) – Raw Census API response

  • variables (List[str]) – Variable codes requested

  • output (str, default "tidy") – Output format ("tidy" or "wide")

Returns:

Processed data

Return type:

pd.DataFrame
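
The difference between the two output formats can be sketched with pandas. In the wide form each variable is a column; in the tidy form each row holds one geography and variable pair. The function reshape, the variable codes "A" and "B", the values, and the "variable"/"value" column names are made-up placeholders, not the library's actual output schema.

```python
import pandas as pd

# Placeholder rows shaped like List[Dict[str, Any]]; values are made up.
rows = [
    {"GEOID": "01", "A": "100", "B": "200"},
    {"GEOID": "06", "A": "300", "B": "400"},
]


def reshape(rows, variables, output="tidy"):
    """Hypothetical helper: convert raw rows to a wide or tidy DataFrame."""
    wide = pd.DataFrame(rows)
    # API values arrive as strings; coerce the requested variables to numbers.
    wide[variables] = wide[variables].apply(pd.to_numeric, errors="coerce")
    if output == "wide":
        return wide
    id_cols = [c for c in wide.columns if c not in variables]
    return wide.melt(id_vars=id_cols, value_vars=variables,
                     var_name="variable", value_name="value")


tidy = reshape(rows, ["A", "B"])
wide = reshape(rows, ["A", "B"], output="wide")
```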

pytidycensus.utils.add_margin_of_error(df, variables, moe_level=90, output='tidy')[source]

Add margin of error columns for ACS data with confidence level adjustment.

Parameters:
  • df (pd.DataFrame) – Census data

  • variables (List[str]) – Variable codes

  • moe_level (int, default 90) – Confidence level (90, 95, or 99)

  • output (str, default "tidy") – Output format ("tidy" or "wide")

Returns:

Data with margin of error columns

Return type:

pd.DataFrame
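
The confidence level adjustment can be illustrated with the standard conversion for ACS margins of error, which the Census Bureau publishes at the 90 percent level (z = 1.645). Rescaling to another level multiplies by the ratio of z-scores. The function scale_moe is a hypothetical helper for illustration; whether add_margin_of_error uses exactly these constants is an assumption.

```python
# z-scores for the supported confidence levels.
Z_SCORES = {90: 1.645, 95: 1.96, 99: 2.576}


def scale_moe(moe_90, moe_level=90):
    """Hypothetical helper: rescale a 90% MOE to another confidence level."""
    if moe_level not in Z_SCORES:
        raise ValueError("moe_level must be 90, 95, or 99")
    return moe_90 * Z_SCORES[moe_level] / Z_SCORES[90]
```

A higher confidence level always widens the margin: a published margin of 100 grows by a factor of 1.96 / 1.645 at the 95 percent level and by 2.576 / 1.645 at the 99 percent level.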