You may think that the census is soooo 2020, but you’d be dead wrong. There’s still a lot more data that’s going to be released next year from the once-a-decade head count of every U.S. resident, which means more stories to write.
Even though the census was conducted two years ago, the U.S. Census Bureau has yet to release its most detailed data about families, household relationships and comprehensive breakdowns of the U.S. population by age, race, ethnicity and tribal affiliation. The first of these releases is scheduled for next spring, with more to follow later in the year.
Information gathered from the largest civilian mobilization by the federal government is used to determine political power and how federal funding is distributed.
Numbers used to divide congressional seats among the states and to redraw political districts were made public last year. But those data sets were limited to basic race and Hispanic origin categories; a breakdown of ages 18 or older; whether homes were occupied; and the number of people in group quarters, like prisons or dorms.
The main reason for the delay in the release of the more detailed data is that the bureau, for the first time, is applying a new privacy method aimed at protecting the confidentiality of those who put their personal information on census forms or tell census takers about their households. Confidentiality is required by law, but the delay is causing headaches for some city planners, demographers, business owners and others who rely on the detailed data to decide where schools and roads should be built and where to locate their stores. When the data is released next year, it will have been three years since it was collected, raising fears among researchers that it will be outdated.
The most comprehensive data sets from the 2020 census will be released on a rolling basis next year.
- Demographic Profile: Set to be released in May, this provides data on age groups broken down by five-year intervals, along with sex, race and Hispanic origin categories. These numbers also show relationships in a household and other information about households, such as their sizes.
The smallest level of geography for this data set is “places,” which typically are cities, towns and villages.
- Demographic and Housing Characteristics File: Also being released in May, this includes numbers on family relationships, as well as detailed data on race, Hispanic origin, age, sex and housing. The data is at much smaller levels than the Demographic Profile, getting as small as a census block or about the size of a neighborhood.
Numbers from the Demographic and Housing Characteristics File summarized for congressional districts will be released in August.
Among the countless queries that can be made with this data are the characteristics of same-sex families, the number of centenarians living in a particular area and places where the number of preschoolers has grown the most.
- The Detailed Demographic and Housing Characteristics File: Planned for release starting in August, this is the most detailed of the detailed data sets and is being released in three bursts.
- The Detailed DHC-A being released in August will offer sex and age information for about 370 comprehensive race, ethnic and ancestry groups, such as Japanese, Mexican and Turkish. It also includes figures for 1,200 tribal groups.
- The next release, with a date to be determined, covers household and housing information for those same groups.
- The final data set, also with a release date yet to be determined, will be the Supplemental Demographic and Housing Characteristics File, which contains information about the size of households and families.
The privacy tool
The Census Bureau is still hammering out the details of how the privacy method, known as differential privacy, will be applied to these yet-to-be-released numbers, with a goal of balancing precision with privacy. The privacy tool injects random errors into the data so that information about households can’t be traced back to individuals. The data is then adjusted so that it stays internally consistent, for instance by making sure that the population counts from Florida’s 67 individual counties add up to the state population total. Only three types of data from the 2020 census are being released without the application of differential privacy: state population totals, the number of housing units and the number and type of group quarters, such as jails, dorms and nursing homes.
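For reporters who want a feel for the two steps described above, here is a minimal Python sketch. It is not the Census Bureau's actual algorithm: the county names and counts are made up, the Laplace noise and the `scale` value are assumptions chosen for illustration, and the rescale-and-round step is just one simple way to force county numbers to add up to a state total that is left untouched.

```python
import random

def add_noise(counts, scale, rng):
    """Add Laplace-distributed random error to each count.

    The difference of two exponential draws with mean `scale`
    follows a Laplace distribution centered on zero.
    """
    noisy = {}
    for name, value in counts.items():
        noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
        noisy[name] = value + noise
    return noisy

def make_consistent(noisy, state_total):
    """Turn noisy values into non-negative whole numbers that sum to state_total."""
    clamped = {k: max(v, 0.0) for k, v in noisy.items()}
    total = sum(clamped.values())
    if total == 0:
        # Degenerate case: spread the state total evenly.
        scaled = {k: state_total / len(clamped) for k in clamped}
    else:
        scaled = {k: v * state_total / total for k, v in clamped.items()}
    # Round down, then hand out the leftover people to the
    # counties with the largest fractional parts.
    result = {k: int(v) for k, v in scaled.items()}
    leftover = state_total - sum(result.values())
    by_remainder = sorted(scaled, key=lambda k: scaled[k] - result[k], reverse=True)
    for k in by_remainder[:leftover]:
        result[k] += 1
    return result

# Hypothetical counties and counts, for illustration only.
true_counts = {"County A": 1200, "County B": 45, "County C": 310}
state_total = sum(true_counts.values())  # state totals are published without noise

rng = random.Random(2020)  # fixed seed so the run is repeatable
published = make_consistent(add_noise(true_counts, scale=5.0, rng=rng), state_total)
print(published, "sum:", sum(published.values()))
```

Notice that the same amount of noise matters far more for the 45-person county than for the 1,200-person one, which is the heart of the critics' complaint about small geographies.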
Critics say the tool makes the data inaccurate at the smallest geographic levels and shortchanges some minority groups of potential political power and government funding. It also has raised questions about whether data users can have confidence in the accuracy of the numbers. Disagreements over whether differential privacy causes more harm than good have created a rift among some demographers, statisticians and researchers who use census data.
Supporters say modern-day computing has gotten so sophisticated that, combined with third-party data from private entities, it makes the threat of identifying participants from the data real. They argue that the Census Bureau has always fudged the numbers at low geographic levels to protect people’s privacy; with differential privacy, bureau officials are simply being a lot more transparent about it than in the past.
Some data won’t be available based on where you live because of the privacy plan. For instance, you may not be able to find out the breakdown by age and sex of people of Japanese ancestry living in North Dakota or Puerto Rico. Under the Census Bureau plan, sex and age breakdowns will be limited for detailed racial, ethnic and tribal groups based on the size of those groups in each state, county or place. Groups that have fewer than 50 people in a particular geography will get only a total population count, rather than a breakdown by age and sex, in an effort to protect privacy.
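The size rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the bureau's procedure: the function name, the data layout and the example numbers are hypothetical, and only the 50-person cutoff comes from the plan as described here.

```python
MIN_POPULATION_FOR_BREAKDOWN = 50  # cutoff described in the text

def publishable_table(total, age_sex_breakdown):
    """Return what would be published for one group in one geography.

    Groups under the cutoff get only a total; larger groups also get
    their age-and-sex breakdown. (Hypothetical helper for illustration.)
    """
    if total < MIN_POPULATION_FOR_BREAKDOWN:
        return {"total": total}
    return {"total": total, "by_age_sex": dict(age_sex_breakdown)}

# Made-up numbers for two places.
small_group = publishable_table(32, {"female_under_18": 7, "male_under_18": 5})
large_group = publishable_table(480, {"female_under_18": 60, "male_under_18": 55})

print(small_group)  # total only
print(large_group)  # total plus breakdown
```

A reporter could use the same logic in reverse: if a table for a local group shows only a total, the group likely fell under the cutoff in that geography.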
Reporters looking to do census stories can ask their local government planners how the application of differential privacy is affecting the census data they use. Are there certain questions they are unable to answer because they don’t have confidence in the data, or because the data isn’t yet available?
How good was the head count?
Some methods for gathering data for the census are considered more reliable than others. A household member who answers the census questions for the household is considered the gold standard, followed by an interview of someone living in the home by a census taker. Next are administrative records, such as those from the IRS, followed by a census taker interviewing a proxy for the household, like a neighbor or landlord. If information about a household can’t be found, the Census Bureau utilizes a statistical method called count imputation that uses the information of a similar nearby household to fill in the gaps. Obviously, this is the least reliable method.
The Census Bureau recently released data showing what kind of methods were used at the county and census tract level across the U.S., allowing journalists to write stories about which method dominated their areas and report on why: Census Bureau Releases 2020 Census Operational Quality Metrics for Counties and Tracts
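As a starting point for that kind of story, here is a minimal Python sketch that finds the dominant method in each tract and ranks it against the reliability order described above. The tract labels, the share figures and the method names used as keys are all hypothetical; the bureau's actual files will use their own column names and should be checked against its documentation.

```python
# Methods ordered from most to least reliable, following the text above.
METHODS_BY_RELIABILITY = [
    "self_response",
    "enumerator_interview",
    "administrative_records",
    "proxy_interview",
    "count_imputation",
]

def dominant_method(shares):
    """Return the method that accounts for the largest share of households."""
    return max(shares, key=shares.get)

def reliability_rank(method):
    """Return 1 for the most reliable method, 5 for the least."""
    return METHODS_BY_RELIABILITY.index(method) + 1

# Made-up shares (percent of households) for two hypothetical tracts.
tracts = {
    "Tract 1": {"self_response": 71.0, "enumerator_interview": 14.0,
                "administrative_records": 6.0, "proxy_interview": 8.0,
                "count_imputation": 1.0},
    "Tract 2": {"self_response": 31.0, "enumerator_interview": 22.0,
                "administrative_records": 9.0, "proxy_interview": 35.0,
                "count_imputation": 3.0},
}

for name, shares in tracts.items():
    method = dominant_method(shares)
    print(name, method, "reliability rank:", reliability_rank(method))
```

Tracts where a lower-ranked method dominates, like the second made-up tract here, are the ones worth a phone call to find out why.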
Last summer, the Census Bureau also released a report card on how good a job it did counting residents by state: Census Coverage Estimates for People in the United States by State and Census Operations. The Post-Enumeration Survey had census takers reinterview households but at a much smaller scale and compared those results with the census results to determine where more people were counted than actually exist and where people were missed. The survey found that there were undercounts in Arkansas, Florida, Illinois, Mississippi, Tennessee and Texas. There were overcounts in Delaware, Hawaii, Massachusetts, Minnesota, New York, Ohio, Rhode Island and Utah. Reporters can examine what caused the overcounts and undercounts and whether state and local get-out-the-count efforts paid off. Reporters also can examine what local leaders in their area plan to do to get a better count in 2030.
In the meantime …
Census stories are always examinations of how our communities are changing, and measuring that change doesn’t just take place every 10 years. The Census Bureau puts out other products that provide a yearly glimpse of our demographic transformation. The most comprehensive of these products is the American Community Survey (ACS), which provides insight into everything from educational attainment to fertility to Internet access: Here are the subjects included in the survey.
The one-year ACS for 2021 was released earlier this year, offering the first real perspective on changes during the pandemic. The 2020 survey wasn’t considered fit for use because of problems gathering data during the height of the COVID-19 outbreak, so any measurement of changes during the pandemic is going to be made by comparing 2019 data to 2021 figures.
During the pandemic, the Census Bureau also introduced a new product, the Household Pulse Survey, which provides the closest thing to real-time data on the day-to-day experiences of Americans. Topics include vaccinations, education, mental health, one’s ability to pay rent, sexual orientation and gender identity.
So, there’s a lot out there. Happy story digging!