pytidycensus.variables

Census variable loading and caching functionality.

Functions

  • clear_cache([cache_dir]) – Clear the variables cache.

  • get_table_variables(table, year, dataset[, ...]) – Get all variables for a specific table.

  • list_available_datasets(year) – List available datasets for a given year.

  • load_variables(year, dataset[, survey, ...]) – Load Census variables for a given dataset and year.

  • search_variables(pattern, year, dataset[, ...]) – Search for variables by pattern in labels, concepts, or names.

pytidycensus.variables.load_variables(year, dataset, survey=None, cache=True, cache_dir=None)

Load Census variables for a given dataset and year.

Parameters:
  • year (int) – Census year

  • dataset (str) – Dataset name ('acs', 'dec', 'pep', etc.)

  • survey (str, optional) – Survey type (e.g., 'acs5', 'acs1', 'sf1', 'pl')

  • cache (bool, default True) – Whether to cache variables for faster future access

  • cache_dir (str, optional) – Directory for caching. Defaults to user cache directory.

Returns:

Variables with columns: name, label, concept, predicateType, group, limit

Return type:

pd.DataFrame
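
The cache and cache_dir parameters suggest a cache-or-fetch pattern. The following is a minimal sketch of that pattern using only the standard library, not the library's actual implementation: the helper names (cache_path, load_with_cache, fake_fetch), the JSON file format, the file naming scheme, and the temporary directory used as a fallback are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def cache_path(cache_dir, year, dataset, survey=None):
    """Build a hypothetical cache file path for one (year, dataset, survey) request."""
    parts = [dataset, survey, str(year)]
    name = "_".join(p for p in parts if p) + ".json"
    return Path(cache_dir) / name


def load_with_cache(fetch, year, dataset, survey=None, cache=True, cache_dir=None):
    """Return variable records, reading from disk when a cached copy exists.

    `fetch` is a hypothetical callable standing in for the network request.
    """
    # Assumption: fall back to the system temp directory, not a real user cache dir.
    cache_dir = Path(cache_dir or tempfile.gettempdir())
    path = cache_path(cache_dir, year, dataset, survey)
    if cache and path.exists():
        return json.loads(path.read_text())
    records = fetch(year, dataset, survey)
    if cache:
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(records))
    return records


# Stand-in fetcher that records each call; the record below is illustrative only.
calls = []


def fake_fetch(year, dataset, survey):
    calls.append((year, dataset, survey))
    return [{"name": "B19013_001E", "label": "Median household income"}]


demo_dir = tempfile.mkdtemp()
first = load_with_cache(fake_fetch, 2022, "acs", "acs5", cache_dir=demo_dir)
second = load_with_cache(fake_fetch, 2022, "acs", "acs5", cache_dir=demo_dir)
```

In this sketch the second call is served from disk, so the stand-in fetcher runs only once. Passing cache=False would skip both the read and the write.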

Examples

>>> from pytidycensus.variables import load_variables
>>>
>>> # Load ACS 5-year variables for 2022
>>> acs_vars = load_variables(2022, "acs", "acs5")
>>>
>>> # Filter the result for income-related labels
>>> income_vars = acs_vars[acs_vars['label'].str.contains('income', case=False)]
>>>
>>> # Load decennial census variables for 2020
>>> dec_vars = load_variables(2020, "dec", "pl")

pytidycensus.variables.search_variables(pattern, year, dataset, survey=None, field='label')

Search for variables by pattern in labels, concepts, or names.

Parameters:
  • pattern (str) – Search pattern (case-insensitive)

  • year (int) – Census year

  • dataset (str) – Dataset name

  • survey (str, optional) – Survey type

  • field (str, default 'label') – Field to search in ('label', 'concept', 'name', or 'all')

Returns:

Matching variables

Return type:

pd.DataFrame
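
To make the field parameter concrete, here is a rough sketch of a case-insensitive search over plain dictionaries. It is not the library's code: search_records and the sample records are hypothetical, and treating the pattern as a regular expression is an assumption, since the documentation above only says the match is case-insensitive.

```python
import re

SEARCHABLE_FIELDS = ("label", "concept", "name")


def search_records(records, pattern, field="label"):
    """Return records whose chosen field(s) match `pattern`, ignoring case."""
    if field == "all":
        fields = SEARCHABLE_FIELDS
    elif field in SEARCHABLE_FIELDS:
        fields = (field,)
    else:
        raise ValueError(f"field must be one of {SEARCHABLE_FIELDS + ('all',)}")
    # Assumption: the pattern is interpreted as a regular expression.
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        rec for rec in records
        if any(regex.search(str(rec.get(f, ""))) for f in fields)
    ]


# Illustrative records only, not real API output.
sample = [
    {"name": "B19013_001E", "label": "Median household income", "concept": "INCOME"},
    {"name": "B01003_001E", "label": "Total", "concept": "TOTAL POPULATION"},
]

by_label = search_records(sample, "income")
by_concept = search_records(sample, "population", field="concept")
by_any = search_records(sample, "b01003", field="all")
```

With field="all" a record is returned if any of the three fields matches, which is why the lowercase pattern "b01003" finds the second record through its name.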

Examples

>>> from pytidycensus.variables import search_variables
>>>
>>> # Search for income variables in ACS
>>> income_vars = search_variables("income", 2022, "acs", "acs5")
>>>
>>> # Search for population in concepts
>>> pop_vars = search_variables("population", 2020, "dec", "pl", field="concept")

pytidycensus.variables.get_table_variables(table, year, dataset, survey=None)

Get all variables for a specific table.

Parameters:
  • table (str) – Table code (e.g., 'B19013', 'P1')

  • year (int) – Census year

  • dataset (str) – Dataset name

  • survey (str, optional) – Survey type

Returns:

Variables for the specified table

Return type:

pd.DataFrame
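
One way to think about selecting a table's variables is as a filter on the variable name. The sketch below assumes names of the form table code, underscore, then a suffix (as in B19013_001E); table_records and the sample names are hypothetical, and the library may select variables differently, for example through the group column.

```python
def table_records(records, table):
    """Return records whose name starts with the table code plus an underscore."""
    # Including the underscore keeps B19013A_... out of a search for B19013.
    prefix = table.upper() + "_"
    matches = [r for r in records if r["name"].upper().startswith(prefix)]
    return sorted(matches, key=lambda r: r["name"])


# Illustrative variable names only, not real API output.
sample = [
    {"name": "B19013A_001E"},
    {"name": "B19013_001E"},
    {"name": "B01003_001E"},
]

names = [r["name"] for r in table_records(sample, "b19013")]
```

Matching on the bare table code without the underscore would also pick up related tables that share the same leading characters.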

Examples

>>> from pytidycensus.variables import get_table_variables
>>>
>>> # Get all variables for the median household income table
>>> b19013_vars = get_table_variables("B19013", 2022, "acs", "acs5")
>>>
>>> # Get all variables for the race table in the 2020 Census
>>> p1_vars = get_table_variables("P1", 2020, "dec", "pl")

pytidycensus.variables.clear_cache(cache_dir=None)

Clear the variables cache.

Parameters:

cache_dir (str, optional) – Cache directory to clear. Defaults to user cache directory.

Return type:

None
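
As a rough illustration of what clearing a cache directory can involve, the sketch below deletes cached files from a directory and leaves everything else alone. It is a stand-in under stated assumptions: clear_cache_dir is a hypothetical helper, the .json suffix is assumed rather than taken from the library, and unlike clear_cache (which returns None) it returns a count so the effect is visible.

```python
import tempfile
from pathlib import Path


def clear_cache_dir(cache_dir, suffix=".json"):
    """Delete files ending in `suffix` from `cache_dir`; return how many were removed."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        # Nothing to clear if the directory was never created.
        return 0
    removed = 0
    for path in cache_dir.glob(f"*{suffix}"):
        path.unlink()
        removed += 1
    return removed


# Demo in a throwaway directory with one cached file and one unrelated file.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "acs_acs5_2022.json").write_text("[]")
(demo_dir / "notes.txt").write_text("keep me")

removed = clear_cache_dir(demo_dir)
remaining = sorted(p.name for p in demo_dir.iterdir())
```

Restricting deletion to a known suffix is a defensive choice for the case where cache_dir points at a directory that also holds other files.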

pytidycensus.variables.list_available_datasets(year)

List available datasets for a given year.

Parameters:

year (int) – Census year

Returns:

Available datasets and their surveys

Return type:

Dict[str, list]
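
The return type is a mapping from dataset name to a list of surveys. The sketch below shows one way to walk such a mapping. The sample dictionary is a hand-written stand-in built from names mentioned in this reference, not actual output of list_available_datasets, and format_datasets is a hypothetical helper.

```python
from typing import Dict, List


def format_datasets(datasets: Dict[str, List[str]]) -> List[str]:
    """Render one 'dataset: survey, survey' line per dataset, sorted by name."""
    lines = []
    for dataset in sorted(datasets):
        surveys = ", ".join(datasets[dataset]) or "(no surveys listed)"
        lines.append(f"{dataset}: {surveys}")
    return lines


# Hand-written sample with the documented shape; real output may differ.
sample = {"dec": ["pl"], "acs": ["acs1", "acs5"], "pep": []}

lines = format_datasets(sample)
for line in lines:
    print(line)
```

A mapping like this can also be used to check a dataset and survey pair before calling load_variables.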