{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Working with Census microdata\n", "\n", "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mmann1123/pytidycensus/blob/main/examples/05_pums_data.ipynb)\n", "\n", "An introduction to Public Use Microdata Sample (PUMS) analysis.\n", "\n", "### Census API Key\n", "\n", "To use pytidycensus, you need a free API key from the US Census Bureau. Get one at: https://api.census.gov/data/key_signup.html\n", "\n", "Set your API key:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import pytidycensus as tc\n", "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "tc.set_census_api_key(\"YOUR API KEY GOES HERE\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Understanding PUMS Data\n", "\n", "PUMS provides anonymized individual-level survey responses, which support custom tabulations beyond the published summary tables."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Note: PUMS functionality may be implemented in a future version\n", "# This is a conceptual example of what PUMS analysis might look like\n", "\n", "print(\"PUMS data analysis capabilities:\")\n", "print(\"- Individual-level demographic data\")\n", "print(\"- Custom crosstabulations\")\n", "print(\"- Statistical modeling on microdata\")\n", "print(\"- Smallest geography: Public Use Microdata Areas (PUMAs), each with at least 100,000 residents\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## PUMS Variable Dictionary\n", "\n", "PUMS includes hundreds of variables describing individuals and housing units:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Conceptual sketch of what PUMS variable exploration might look like (not runnable as-is)\n", "# pums_vars = tc.load_pums_variables(2022, survey=\"acs1\")\n", "# person_vars = pums_vars[pums_vars['level'] == 'person']\n", "# housing_vars = pums_vars[pums_vars['level'] == 'housing']\n", "\n", "print(\"Example PUMS variables:\")\n", "print(\"Person-level: Age, Race, Education, Income, Employment\")\n", "print(\"Housing-level: Bedrooms, Value, Rent, Utilities\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Survey Weights and Statistical Analysis\n", "\n", "PUMS records are a sample, so estimates must apply the survey weights to represent the full population:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Conceptual example of weighted analysis\n", "print(\"Survey weight considerations:\")\n", "print(\"- Person weights (PWGTP) for person-level analysis\")\n", "print(\"- Housing weights (WGTP) for housing-level analysis\")\n", "print(\"- 80 replicate weights (PWGTP1-PWGTP80, WGTP1-WGTP80) for variance estimation\")\n", "print(\"- Use survey-analysis packages for standard errors and statistical inference\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", 
"version": "3.9.0" } }, "nbformat": 4, "nbformat_minor": 4 }