Chapter 7. Methods


In this technical chapter, we detail our methodology. Section 1 describes the data sources we used; in Section 2, we establish how we constructed and operationalized the measures we used in this report (e.g., child poverty, economic and demographic concepts). Finally, in Section 3, we describe our analytic approaches, including the various models we ran, as well as the limitations of our approaches.

As a reminder, we use different analytic approaches to examine the influence of economic and demographic shifts (Chapter 2) and social safety net programs (Chapters 3 and 4) on child poverty. To look at economic and demographic influences, we follow the approach developed by Hoynes, Page, and Stevens: Capitalizing on state-level variation in the timing and degree of changes in each of the factors examined, we use state-and-year fixed effects regression models to estimate the associations between changes in each economic and demographic factor and changes in child poverty. Because changes in federal policies often affect all states at the same time, this method does not allow us to evaluate the impact of federal policies on child poverty rates. Therefore, we examine the role of the social safety net programs by following the approach used by the United States Census Bureau: We compare actual child poverty rates using the Supplemental Poverty Measure (SPM) to estimated counterfactual poverty rates if an individual federal tax and transfer program (or the entire social safety net) were removed from the calculation of SPM household resources.

Read An Additional Note About Methodology for further thoughts on our methods.


Section 1: Data sources

Data for our analyses were drawn from two main data sources: the Annual Social and Economic Supplement (ASEC) of the Current Population Survey (CPS) (accessed through IPUMS)[1] and Columbia University’s Historical SPM Data.[2] Our analyses use data from 1980 to 2019; we focus our analysis on the time period from 1993 to 2019 and occasionally provide data from 1980 to 1992 as context.

The CPS is a monthly survey that provides current estimates and trends in employment, unemployment, earnings, and other characteristics of the general labor force, the population as a whole, and various population subgroups. The CPS-ASEC—conducted annually, mostly in March—provides annual estimates based on a survey of more than 75,000 households and is the source of official national estimates of poverty levels and rates, and of widely used income measures. The CPS-ASEC contains detailed questions covering social and economic characteristics of each person in a household as of the interview date. Income questions refer to income received during the previous calendar year.

The historical SPM dataset uses historical data from the CPS-ASEC and the Consumer Expenditure Survey (CEX) to produce historical SPM estimates dating back to 1967; the SPM was developed in 2009 as another way to conceptualize poverty that considers more information than the official poverty measure (OPM; see our resource on measuring poverty for more details on how the SPM differs from the OPM). The Census Bureau began publishing the SPM in 2011; therefore, Columbia calls its estimated series a historical SPM to distinguish it from the Census Bureau’s SPM.

We merged the CPS-ASEC data with the historical SPM data to create our analytic dataset. In this process, we dropped cases that were not matched between the two datasets (0.91% of all cases). The dropped cases were primarily due to 1) slight differences in the poverty universe between the two data sources (the historical SPM dataset excludes individuals living in group quarters) and 2) differences in how the 2014 sample was treated across the two datasets. We also dropped a handful of duplicate cases (< 0.004%) from the CPS-ASEC data, which were only present in the 1980 to 1987 data files.
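The matching step can be illustrated with a minimal sketch. It assumes each record carries a unique identifier shared by both files; the field name `person_id`, the function name, and the choice of denominator for the unmatched share are all hypothetical and are not taken from either dataset's documentation.

```python
def merge_matched(cps_rows, spm_rows, key="person_id"):
    """Keep only records present in both files and report the unmatched share.

    cps_rows, spm_rows: lists of dicts, one per person.
    key: hypothetical shared identifier (assumed unique within each file).
    """
    spm_by_key = {row[key]: row for row in spm_rows}
    cps_keys = {row[key] for row in cps_rows}

    # Inner join: combine the fields of matched records.
    merged = [
        {**row, **spm_by_key[row[key]]}
        for row in cps_rows
        if row[key] in spm_by_key
    ]

    # Assumption: the share is computed over all distinct cases in either file.
    all_keys = cps_keys | set(spm_by_key)
    unmatched_share = 1 - len(merged) / len(all_keys)
    return merged, unmatched_share
```

In practice the unmatched share would be inspected before any cases are dropped, to confirm that it is small and explained by known differences between the files.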

For our analyses of economic and demographic shifts, we linked the CPS-ASEC and historical SPM data to various additional data sources, which we describe below along with the measures using other data sources.


Section 2: Measures

Child poverty

We measure poverty based on the SPM. The SPM differs from the OPM in that it includes near-cash government supports (e.g., Supplemental Nutrition Assistance Program, or SNAP, and housing assistance) and tax benefits (e.g., the Earned Income Tax Credit, or EITC), in addition to cash income, in its calculation of a family’s economic resources. The SPM also subtracts necessary expenses (such as tax burdens, out-of-pocket medical expenses, and work and child care expenses) from a family’s resources. Finally, the SPM uses more up-to-date assumptions about current living needs and includes adjustments for geographic differences in the cost of living.

The SPM is a quasi-relative poverty measure that does not use fixed thresholds. This means that historical changes in poverty could be at least partly due to changes in thresholds. For this reason, we use historical SPM data anchored to 2012 thresholds, which allow us to calculate comparable population-level child poverty rates back to 1980. This provides a cleaner estimate of the role that economic, demographic, and policy changes play in poverty trends over time. By holding the thresholds constant (but adjusting for inflation), we are able to examine the extent to which changes in poverty are due to changes in economics, demographics, and policy factors alone, and not as a result of changes in standard of living.[3]

We constructed two measures of child poverty using the SPM data: Specifically, we calculated the percentage of children from birth to age 17 who were in poverty and deep poverty in a given year. All children in households[4] with resources less than 100 percent of each respective poverty threshold were categorized as in poverty. All children with resources below 50 percent of their poverty threshold were categorized as in deep poverty. These groups are not mutually exclusive: Children classified in deep poverty are also considered to be in poverty. We repeated this calculation for all years from 1980 to 2019, at the national level and at the state-year level, and used the resulting variables in different analyses (see next section).
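As a rough illustration of the two measures, the sketch below classifies each child against 100 percent and 50 percent of the threshold and returns weighted rates. The tuple layout and the function name are assumptions made for the example; they do not correspond to actual variable names in the data.

```python
def poverty_rates(children):
    """Return (poverty rate, deep poverty rate) in percent.

    children: iterable of (resources, threshold, weight) tuples, one per child.
    The layout is hypothetical and chosen only for this example.
    """
    total = poor = deep = 0.0
    for resources, threshold, weight in children:
        total += weight
        if resources < threshold:          # below 100 percent of the threshold
            poor += weight
        if resources < 0.5 * threshold:    # below 50 percent of the threshold
            deep += weight                 # also counted as poor above
    return 100 * poor / total, 100 * deep / total
```

Because the deep poverty test is a stricter version of the poverty test, every child counted in the second rate is also counted in the first.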

Demographic subgroups

We also repeated the poverty calculations for subgroups of children to provide data for the analyses in Chapters 2 and 4. To do this, we first generated flags for stable parental employment, family structure, race and ethnicity, and parental nativity, as described below. We then calculated poverty and deep poverty rates for each group.

Stable parental employment. Using the CPS-ASEC, we generated a flag for whether each child lived with either 1) no parent who was employed more than 25 weeks in the prior year, or 2) at least one parent who was employed for at least 26 weeks in the prior year.[5] Residential same-sex and cohabiting parents are included in our measures. To calculate this flag, we added up the total number of weeks a child’s residential parent(s) worked in the past calendar year, regardless of the intensity of their work (full-time or part-time). If one parent worked for 24 weeks and the other parent worked for 3 weeks, the child would be considered to have stably employed parents, as together they pass the 26-week threshold.
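The combined-weeks rule described above can be sketched as follows. The function name and input format are hypothetical; the 26-week threshold and the worked example come from the text.

```python
STABLE_WEEKS_THRESHOLD = 26  # weeks worked in the prior calendar year

def has_stably_employed_parents(parent_weeks_worked):
    """Return True if the residential parents' combined weeks meet the threshold.

    parent_weeks_worked: list with one entry per residential parent
    (empty if the child lives with no parent).
    """
    return sum(parent_weeks_worked) >= STABLE_WEEKS_THRESHOLD


# Example from the text: 24 weeks plus 3 weeks passes the threshold.
print(has_stably_employed_parents([24, 3]))
```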

Family structure. Using the CPS-ASEC, we generated a flag for whether each child lives with either 1) two parents, or 2) one or no parents.[6] Biological parents, stepparents, and adoptive parents, as well as parents’ cohabiting partners, are included in the count, as are same-sex parents. The Current Population Survey has improved its ability to capture cohabiting parents over time, with a new measure introduced in 2007 and an improvement in processing by IPUMS in 2016; thus, our ability to capture families with cohabiting parents improves over time.

Race and ethnicity. The CPS changed how it asked about race and ethnicity in 2003, limiting the groups that we could examine over time. Using the CPS-ASEC, we grouped children into four mutually exclusive racial and ethnic groups: non-Hispanic Asian/Hawaiian/Pacific Islander, non-Hispanic Black, Hispanic, and non-Hispanic White. Survey respondents who indicated that the child is multiracial (an option starting in 2003) or a race other than the ones listed above are excluded from analyses broken down by race and ethnicity but are included in all other analyses. In 2019, approximately 5 percent of the child population was non-Hispanic and of two or more racial groups, and 1 percent was non-Hispanic American Indian/Alaskan Native alone.

Parental nativity. Using the CPS-ASEC, we generated a flag to indicate whether each child lived 1) with any parent born outside of the United States or 2) with all U.S.-born parents. Children whose parent(s) were born in the United States, inclusive of territories, were coded as living with all U.S.-born parents and in a non-immigrant family. If any of a child’s residential parents were born outside of the United States, they were considered to live in an immigrant family.

State-year macroeconomic, demographic, and policy conditions

We use state-year data for our fixed effects regression analyses in Chapter 2 to exploit variation in the timing and level of economic and demographic changes across states. Economic, labor market, and demographic factors are described below, along with the state-level policy conditions included in our models.

State-year economic and labor market conditions. Economic and labor market conditions are important contributing factors to child poverty, and we examined several: unemployment rate, real (inflation-adjusted) gross domestic product (GDP) per capita, median wage, state-level minimum wage, and single mothers’ labor force participation. We operationalized the unemployment rate as a state-year’s annual average unemployment for its population ages 16 and older, using data from the Bureau of Labor Statistics’ (BLS) State and Metro Area Database. Real GDP per capita was constructed using GDP data from the St. Louis Federal Reserve’s Federal Reserve Economic Database that we converted to 2019 dollars and divided by a state-year’s population. It is measured in thousands of dollars. Minimum wage was constructed as either a binding federal wage or a state-year’s minimum wage (the higher of the two), based on data from the U.S. Department of Labor. The single mother labor force participation rate was calculated using the CPS-ASEC as the percentage of single mothers ages 16 to 64 who are employed or are looking for work. Women who are not married (or are married but whose spouse is not present) and who live with their own child under age 18 are considered single mothers. Mothers with a cohabiting partner are considered single for this analysis.
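Two of these constructions are simple enough to sketch directly. The function names and the `deflator_to_2019` argument are hypothetical; the sketch assumes a state with no minimum wage of its own is passed as `None`.

```python
def binding_minimum_wage(federal_wage, state_wage=None):
    """Return the higher of the federal and state minimum wage.

    state_wage is None when the state sets no minimum of its own (assumption).
    """
    if state_wage is None:
        return federal_wage
    return max(federal_wage, state_wage)


def real_gdp_per_capita_thousands(nominal_gdp, deflator_to_2019, population):
    """Convert nominal GDP to 2019 dollars, divide by population, express in thousands.

    deflator_to_2019 is a hypothetical multiplier from that year's dollars to 2019 dollars.
    """
    return nominal_gdp * deflator_to_2019 / population / 1000
```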

Median wage was constructed from the Merged Outgoing Rotation Group (MORG) files of the CPS (accessed through the National Bureau of Economic Research). Hourly workers report their hourly wages directly, while salary workers report their weekly earnings and their usual hours of work. This information can be used to construct a consistent wage series for the time period we are studying. Following Lemieux (2006), we 1) calculated wages for full-time, non-self-employed workers with non-allocated wage information; 2) trimmed extreme wage values of more than $100 per hour or less than $1 per hour in 1979 dollars; 3) multiplied all top-coded wages by a factor of 1.4, to better preserve the distribution of wages; and 4) converted all wages into 2019 dollars.
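The four steps can be sketched as a small cleaning function. This is an illustration under stated assumptions, not the procedure as implemented: the record fields, the function names, and the two price conversion factors are hypothetical, the steps are applied in the order listed above, and the median here is unweighted for brevity.

```python
from statistics import median

TOPCODE_FACTOR = 1.4    # multiplier applied to top-coded wages
MAX_WAGE_1979 = 100.0   # upper trimming bound, in 1979 dollars per hour
MIN_WAGE_1979 = 1.0     # lower trimming bound, in 1979 dollars per hour


def clean_hourly_wage(rec, to_1979, to_2019):
    """Return a 2019-dollar hourly wage, or None if the record is excluded.

    rec: dict with hypothetical fields describing one worker.
    to_1979, to_2019: hypothetical price conversion factors for the survey year.
    """
    # Step 1: full-time, non-self-employed workers with non-allocated wages.
    if not rec["full_time"] or rec["self_employed"] or rec["allocated"]:
        return None
    if rec["paid_hourly"]:
        wage = rec["hourly_wage"]
    elif rec["usual_hours"] > 0:
        wage = rec["weekly_earnings"] / rec["usual_hours"]
    else:
        return None
    # Step 2: trim extreme values, judged in 1979 dollars.
    wage_1979 = wage * to_1979
    if wage_1979 > MAX_WAGE_1979 or wage_1979 < MIN_WAGE_1979:
        return None
    # Step 3: scale up top-coded wages.
    if rec["top_coded"]:
        wage *= TOPCODE_FACTOR
    # Step 4: convert to 2019 dollars.
    return wage * to_2019


def median_wage(records, to_1979, to_2019):
    """Unweighted median of the cleaned wages (survey weights omitted for brevity)."""
    wages = [clean_hourly_wage(r, to_1979, to_2019) for r in records]
    wages = [w for w in wages if w is not None]
    return median(wages) if wages else None
```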

State-year demographic conditions. All demographic characteristics, except for teen birth rates, were constructed using CPS-ASEC data. We operationalized share of children in two-parent families as the percentage of children (in a given state and year) from birth to age 17 who lived with two parents (compared to children who lived with one or no parents), following the definition for family structure described above. Share of population with a high school degree was operationalized as the percentage of a state-year’s population ages 25 and older with at least a high school degree or the equivalent. We captured differences in the share of children living in immigrant families by constructing the percentage of a state-year’s child population categorized as having at least one parent who is foreign-born (compared to children for whom all parent(s) were born in the United States). This variable is available starting in 1993 and is aligned with the definitions used in the subgroup analysis described above. The racial and ethnic composition of the child population (in a given state-year) was operationalized, in a series of variables, as the percentage of a state-year’s child population that identifies as non-Hispanic Asian/Hawaiian/Pacific Islander, non-Hispanic Black, Hispanic, non-Hispanic White, or some other non-Hispanic race/ethnicity (following the definitions described above). Teen birth rate data are derived from the National Center for Health Statistics and are operationalized as the number of births per 1,000 females ages 15 to 19. Finally, we included the child population (counts of persons birth to age 17[7]) in our fixed effects models to control for within-state variation in the child population over time, which may be correlated with child poverty rates.

State-year policy factors. States may adopt policies explicitly and/or implicitly aimed at combating poverty by distributing resources more equitably across their population. To account for within-state variation in state policies that are not captured by our state or year fixed effects and likely correlated with child poverty rates, we included three policies with state-year variation as control variables: 1) state earned income tax credit generosity, which we operationalized as the percent of the federal credit at which a state sets its own EITC using data from the National Bureau of Economic Research; 2) TANF/AFDC benefit generosity, which we operationalized as the maximum monthly benefit for a family of three in a given state-year, using data from the Urban Institute’s Welfare Rules Database and converted to 2019 dollars; and 3) whether a state, in a given year, had expanded Medicaid, using data from Kaiser Family Foundation.


Section 3: Analytic approach

Chapter 2: Fixed effects regression models

We used fixed effects regression models to examine the extent to which changes in economic and demographic factors within states over time are associated with changes in child SPM poverty rates. For this analysis, our data included state-year observations for 1993 to 2019 on child poverty, economic and labor force conditions, demographic characteristics, and state-level policy factors.

We estimated the following equation (after performing a within transformation), which relates child-specific SPM poverty rates within a state-year to macroeconomic, demographic, and policy factors:

$$P_{st} = E_{st}\beta + D_{st}\gamma + Z_{st}\delta + \theta_t + \alpha_s + \varepsilon_{st}$$

where $P_{st}$ is the poverty rate for all persons under age 18 in state s in year t.

The vector $E_{st}$ controls for economic and labor market variation within a state over time and includes a state-year’s unemployment rate to account for macroeconomic cycles; real GDP per capita (using a GDP deflator to convert all state-year GDPs to 2019 dollars) to account for growth in a state’s economy; median wage as a measure of cash income at the 50th percentile; minimum wage as a measure of cash income at the lowest end of the spectrum (those most likely to be experiencing poverty); and single mother labor force participation rate to account for growth in single mothers’ participation in the labor force.

The vector $D_{st}$ controls for demographic variation within a state over time related to economic resources and access to the labor market and social safety net. This includes population-level estimates of educational attainment, race and ethnicity, children living in immigrant families, children living in one-parent or no-parent families, and teen birth rates.

The vector $Z_{st}$ controls for policy variation within a state over time and is intended to account for time-varying within-state changes that may have an association with poverty. The policies include a state’s EITC generosity and its TANF/AFDC benefit generosity in a given year to account for state transfers to those on the moderate to low end of the income distribution, as well as an indicator of whether the state in a certain year has expanded Medicaid to account for differences in medical expenses and health.

The year fixed effect parameter, $\theta_t$, purges estimates of any omitted variable bias from variables common to all states that are changing over time (for instance, that every state is benefiting from technological advances). That is, if child poverty is declining in all states over time because of some omitted variable that impacts all states, the year fixed effect will absorb this problematic correlation. The state fixed effect parameter, $\alpha_s$, is removed by the demeaning process, which takes each state’s averages for all variables and subtracts them from the same state’s observed values for all variables. This effectively removes any problematic unobserved heterogeneity that is time-invariant within a state (meaning, any fixed differences in child poverty rates across states are removed).
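The demeaning process can be illustrated with a minimal sketch that uses a single explanatory variable and omits year effects and weights for brevity. The function names and the dictionary-based data layout are assumptions made for this example, not the estimation code used for the report.

```python
from collections import defaultdict


def within_transform(rows, state_key, value_keys):
    """Subtract each state's mean from that state's observed values."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row[state_key]] += 1
        for key in value_keys:
            sums[row[state_key]][key] += row[key]
    demeaned = []
    for row in rows:
        state = row[state_key]
        demeaned.append(
            {key: row[key] - sums[state][key] / counts[state] for key in value_keys}
        )
    return demeaned


def within_slope(rows, state_key, y_key, x_key):
    """Least squares slope of demeaned y on demeaned x (one regressor only)."""
    demeaned = within_transform(rows, state_key, [y_key, x_key])
    sxy = sum(r[x_key] * r[y_key] for r in demeaned)
    sxx = sum(r[x_key] ** 2 for r in demeaned)
    return sxy / sxx


# Two hypothetical states with very different fixed levels but the same slope of 2.
rows = [
    {"state": "A", "x": 1.0, "y": 12.0},
    {"state": "A", "x": 2.0, "y": 14.0},
    {"state": "A", "x": 3.0, "y": 16.0},
    {"state": "B", "x": 2.0, "y": 34.0},
    {"state": "B", "x": 4.0, "y": 38.0},
    {"state": "B", "x": 6.0, "y": 42.0},
]
print(within_slope(rows, "state", "y", "x"))
```

Because each state's own mean is subtracted, the large fixed difference between the two hypothetical states drops out and only the within-state relationship remains.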

After estimating our model, we created counterfactual data that held our data constant at 1993 levels and allowed each poverty-decreasing variable that was statistically significant to vary (one at a time) to visually examine the relative role of these variables in reducing poverty. All other variables were held constant at their 1993 levels and predictions were produced using the estimated coefficients from the model(s) starting with only allowing a state’s unemployment rate to vary, then allowing one additional variable from above to vary. We continued this process until each variable found to be statistically associated with changes in child poverty was allowed to vary and graphed the results.
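A minimal sketch of this cumulative procedure, assuming a linear model, is shown below. With every other variable held at its 1993 level, the predicted change in poverty relative to 1993 is the sum of each varying variable's coefficient times its change since 1993. The function name, data layout, and variable names are hypothetical.

```python
def counterfactual_paths(data_by_year, coefs, vary_order, base_year=1993):
    """Predicted change in poverty relative to the base year, adding one varying variable at a time.

    data_by_year: {year: {variable: value}} for one state or the nation.
    coefs: {variable: estimated coefficient}.
    vary_order: variables released from their base-year level, in order.
    Returns {variable: {year: predicted change}}, where each entry lets that
    variable and all earlier ones in vary_order vary.
    """
    base = data_by_year[base_year]
    paths = {}
    allowed = []
    for var in vary_order:
        allowed.append(var)
        path = {}
        for year in sorted(data_by_year):
            values = data_by_year[year]
            # Variables held at base-year levels contribute zero change.
            path[year] = sum(coefs[v] * (values[v] - base[v]) for v in allowed)
        paths[var] = path
    return paths
```

Adding the base-year predicted poverty rate to each path gives the counterfactual poverty series that would be graphed.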

Limitations of analyses in Chapter 2. Our child poverty model estimates will not generally coincide with their true population parameter counterparts and, as such, will not have any causal interpretation. Generally, for population parameters to be identified, the population error term must not be correlated with any explanatory variable. This will be violated if we have any unobserved time-variant variable that is correlated with both child poverty rates and any explanatory variable. This violation could also occur if past policies affect current child poverty rates, or if past child poverty rates affect current policies. This kind of bias, induced by simultaneity, may affect some or all of our estimates. It is likely that several of our explanatory variables are themselves functions of the child poverty rate. One can imagine that high levels of current child poverty cause, in part, unemployment to change or the government to respond with more benefits. Additionally, it is likely that the current period’s child poverty is a function of past realizations of some or all of our variables. One strategy for reducing these sources of endogeneity would be to include lagged values of some of our explanatory variables, lagged values of our dependent variable, or a combination of both, or (if we were only interested in one effect) to pursue an instrument; however, all these strategies come with costs. Under any of these scenarios, one or more of our coefficient estimates may be biased. Furthermore, several variables have been constructed from sample data using dataset-provided weights. To the extent these variables are measured with error, our estimates may also be attenuated. The model weights each state-year observation by the state’s average child population across all years to be more reflective of the actual distribution of child poverty, ensuring that states with lower child populations are not disproportionately affecting estimates.

Chapter 3: Descriptive poverty rates with and without the social safety net

In Chapter 3, we used descriptive analyses to examine how child poverty would change with changes in the social safety net. All analyses were at the national level, from 1980 to 2019.

The Historical SPM Data contain information for each child’s household on the monetary value of each benefit that a family did or did not receive. Some benefit amounts were captured directly from respondents (e.g., by asking them the monetary value of SNAP benefits they had received), while others were imputed based on other financial information provided (most notably, the EITC). Detailed descriptions of these imputations are available in Fox et al. (2015).

With these data, we calculated child poverty rates with and without the social safety net, following the method used by Fox and Burns in the annual SPM report from the Census Bureau. We recalculated each child’s family’s income without benefits by subtracting the benefit from their resources. We then recalculated poverty rates for each child, comparing the adjusted resources to the household’s SPM poverty threshold. We did this for each child and then aggregated to the national level and weighted the cases to be nationally representative.
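The recalculation can be sketched as follows: subtract the selected benefits from each child's resources, compare the adjusted resources to the threshold, and aggregate with weights. The record fields, program labels, and function name are hypothetical and do not correspond to variable names in the Historical SPM Data.

```python
def child_poverty_rate(children, remove=()):
    """Weighted child poverty rate (percent), optionally without some programs.

    children: list of dicts with hypothetical fields 'resources', 'threshold',
              'weight', and 'benefits' (a dict mapping program name to dollar value).
    remove: program names whose value is subtracted from resources.
    """
    total = poor = 0.0
    for child in children:
        removed = sum(child["benefits"].get(program, 0.0) for program in remove)
        adjusted = child["resources"] - removed
        total += child["weight"]
        if adjusted < child["threshold"]:
            poor += child["weight"]
    return 100 * poor / total
```

Calling the function with an empty `remove` gives the actual rate, and calling it with one program name gives the counterfactual rate without that program. As in the analysis described above, the sketch does not model any behavioral response to losing a benefit.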

We first looked at the overall role of the social safety net by estimating each child’s family’s pre-tax-and-transfer income—that is, their income minus the cash value of the sum of all the government benefits they receive. We repeated this process separately for each policy, examining the role of individual policies on child poverty rates. The results can be interpreted as the role that a given policy plays in reducing child poverty, without accounting for any behavioral change or any other interrelated policy. When we looked at the overall role of the social safety net, we included all government transfers and estimated the role that all policies play together. When we considered the full social safety net, we included all federal taxes—including the Child Tax Credit and other tax obligations and credits—as well as the individual programs described in the text. Except for state taxes, this analysis focuses on the role of federal anti-poverty policies and does not consider the role of any state or local policies or community supports (e.g., the local food bank). Federal programs administered by states, such as TANF and SNAP, are included.

We present three complementary numbers to help readers understand how programs are associated with changes in child poverty rates: the percentage point reduction in child poverty, the percent reduction in poverty, and the number of children protected from poverty. To calculate the percentage point reduction, we subtract the SPM poverty rate from the SPM poverty rate without the program. To calculate the percent reduction in poverty, we divide this number by what the SPM poverty rate would have been without the program. The number of children protected from poverty is the percentage point reduction multiplied by the total number of children in a year.
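The arithmetic behind the three numbers is shown in the short sketch below. The function name and the example figures are illustrative only and are not results from the report.

```python
def poverty_reduction_summary(rate_with, rate_without, n_children):
    """Return (percentage point reduction, percent reduction, children protected).

    rate_with: SPM child poverty rate with the program, in percent.
    rate_without: counterfactual rate without the program, in percent.
    n_children: total number of children in the year.
    """
    point_reduction = rate_without - rate_with
    percent_reduction = 100 * point_reduction / rate_without
    children_protected = point_reduction * n_children / 100
    return point_reduction, percent_reduction, children_protected


# Illustrative figures: a rate of 10 percent with the program and 25 percent without it.
print(poverty_reduction_summary(10.0, 25.0, 1_000_000))
```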

Limitations of analyses in Chapter 3. Our ability to capture the true role of the EITC in reducing child poverty was limited. Since 1996, Social Security numbers have been required of all family members in order to receive the EITC, making 21 percent of children in poverty ineligible to receive its benefits. Our analyses do not adjust for this limitation, so we talk about the potential role of the EITC as a poverty reduction policy. Our estimates of the relative role of the EITC also are likely overstated, as the EITC values in the Historical SPM Dataset are simulated based on family income, and on family structure and composition, and do not account for the approximately 20 percent of tax filers who are eligible for, but do not claim, the EITC. Nor does the simulation account for workers who do not routinely file taxes because they have no tax liability, but are nonetheless eligible for, and would receive, the EITC if they were to file taxes.[8]

We did not adjust for underreporting of resource receipt. Underreporting is a known issue, and TRIM3—a microsimulation rule-based model—has frequently been used to adjust for underreporting after 1993. However, this approach is difficult and resource-intensive, particularly for modeling multiple programs over a 40-year period. Furthermore, the current microsimulation, TRIM3, does not cover the full range of years included in this report (L. Wheaton, personal communication, April 19, 2022). New methods, such as regression-based approaches and model-based imputation, are emerging. Public-use files with misreporting-corrected SNAP receipt and benefit amounts based on the modeling approach outlined by Rothbaum, Fox, and Shantz (2021) are forthcoming.

In sum, our simulations of the role of social safety net programs in protecting children from poverty likely overestimate the role of the EITC and underestimate the role of most other social safety net programs, particularly SNAP and TANF.

Chapter 4: Descriptive poverty rates with and without the social safety net by subgroups

The main analytic approach of Chapter 4 parallels that used for Chapter 3. We again use descriptive analyses at the national level to examine how child poverty would change with and without the social safety net, but this time we conducted the analyses on subgroups of children: children with and without stably employed parents, children with two versus one or no parents, children in immigrant versus non-immigrant families, and children by race and Hispanic ethnicity. Again, we present the results in multiple ways. We examine both the absolute size of the social safety net (examining the percentage point decline in poverty for each group) and the relative role of the social safety net (examining the percent decline in poverty for each group). We emphasize the relative measure when comparing the role of the social safety net between groups as this helps account for differences in poverty levels at different time periods and between groups.

Limitations of analyses in Chapter 4. We were not able to conduct any tests for statistically significant differences between groups when examining the role of the safety net in preventing different subgroups of children from falling into poverty, so we instead focus our discussion on effect sizes. We could not conduct significance testing because there is an unmeasured amount of uncertainty built into the estimates of child poverty. Uncertainty is built into the estimates at multiple stages, including when the amounts of benefits provided to households by different anti-poverty programs are estimated or simulated (estimations vary by program and year), and when poverty rates are estimated based on the CPS-ASEC sample. We do not know how much uncertainty is added by the simulations. This information is not included in the Historical SPM Data because standard errors cannot be constructed due to the complexity of the imputations.


Chapter 7 endnotes

[1] Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles, J. Robert Warren and Michael Westberry. Integrated Public Use Microdata Series, Current Population Survey: Version 9.0 [dataset]. Minneapolis, MN: IPUMS, 2021. https://doi.org/10.18128/D030.V9.0

[2] Chris Wimer, Liana Fox, Irwin Garfinkel, Neeraj Kaushal, Jennifer Laird, Jaehyun Nam, Laura Nolan, Jessica Pac, and Jane Waldfogel. Historical Supplemental Poverty Measure Data. Columbia Population Research Center. 2017. https://www.povertycenter.columbia.edu/

[3] In the 1980s and 1990s, the anchored SPM rates tend to be higher than the historical SPM rates due to the assumption of a higher standard of living (based on a 2012 standard of living).

[4] SPM is calculated at the SPM unit level, which is generally equivalent to households. For simplicity, we refer to these units as families throughout our report.

[5] To calculate parental employment, we used IPUMS’ built-in feature to link residential parent(s)’ employment information to the child, specifically the “wkswork1” variable, using the “momloc” and “poploc” variables. IPUMS-CPS updated its methods for constructing family relationship linking variables in 2016, and changes served to preserve comparability over time.

[6] We used the “momloc” and “poploc” variables created by and available through IPUMS to count the total number of parents a child has present in their household. IPUMS-CPS updated its methods for constructing family relationship linking variables in 2016, and changes served to preserve comparability over time.

[7] Although we utilized the same construction procedure as our other data, our child population estimates differ from published estimates such as those from the National Historical Geographic Information System. All other demographic estimates were replicated using other nationally representative datasets.

[8] The overestimation of the anti-poverty role of the EITC using simulation methods has also been noted in recent work by Maggie Jones and James Ziliak. Previous research that corrects for the overestimation of the EITC, however, has also found it to be among the most important anti-poverty programs for children.
