What Does Privacy Mean to the Census Bureau?
What is Differential Privacy?
According to the Census Bureau, the agency protects your privacy and your responses “by employing a technique called differential privacy.” What is differential privacy? How does it protect identifying information if the information is shared or released? Differential privacy is a system that allows companies, governments, and organizations to share information contained in a dataset by describing patterns across groups, without disclosing the exact data about any individual.
Often the released data set is equivalent to the original in its overall statistics, but it contains some substituted values that help to protect the individual. This also makes it harder for companies to combine several data sets and use computing algorithms to work backward to the person to whom the data belongs.
The basic idea behind differential privacy is that if the substitutions are small enough, query results cannot provide details about a specific individual, yet the information remains useful. The data can be studied by healthcare and other agencies to learn more about society, health, and more. One way to understand the definition of differential privacy is “that a person’s privacy cannot be compromised by a statistical release if their data are not in the database.” The goal is to give each person in the database roughly the same privacy they would have if their data had been left out.
Because differential privacy was initially developed by cryptographers, it is often associated with cryptography. Government agencies, such as the Census Bureau, have been known to use differentially private algorithms to publish demographic data. This is done by releasing statistical aggregates, which give clues about the population's behavior without giving up any identifying information.
What is a differentially private algorithm?
While it sounds complicated, an algorithm is considered differentially private if someone seeing its output cannot tell whether a particular individual's information was used in the computation. Differentially private algorithms are designed to resist identification and re-identification attacks.
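For readers who want the formal statement, the definition commonly given in the differential privacy literature can be written as below. The symbols are notation introduced here for illustration, not taken from Census Bureau materials: M is the algorithm, D and D' are any two data sets that differ in one person's record, S is any set of possible outputs, and ε (epsilon) is the privacy parameter.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

A smaller ε forces the two probabilities closer together, so the output reveals less about whether any one person's record was included, at the cost of less precise statistics.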
What is the New Privacy Tool?
Sixteen states are currently involved in a lawsuit with the Census Bureau to reduce the privacy of American citizens’ data. As Americans everywhere are clamoring for more privacy protections, there are obvious concerns related to the ways in which private and pertinent information is safeguarded from public consumption.
The lawsuit began with the state of Alabama challenging the Census Bureau's use of a new privacy tool known as “differential privacy.” The concept of differential privacy is to protect the data contained in a database. This means that the Census Bureau uses all available means to protect your privacy from corporations that would manipulate and misuse it.
In 2006, researchers Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam D. Smith presented the concept of differential privacy in a scientific article. The article detailed the idea of adding “noise” to a database to protect the individual data points. Noise, meaning minute mathematical adjustments, alters the data so slightly that it can still be used in studies, business, or health research. Noise is needed because generalizing the data, or releasing the exact results of even a few queries, can still expose individual data to the public.
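To make the idea of noise concrete, here is a minimal Python sketch of one textbook approach, the Laplace mechanism, applied to a simple count. It is an illustration under stated assumptions, not the Census Bureau's production method: the function names, the `epsilon` value, and the sample residents are all hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centered at zero.

    The difference of two independent exponential draws, each with mean
    `scale`, follows a Laplace(0, scale) distribution.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a count with Laplace noise calibrated to a sensitivity of 1."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical block of 40 residents, 12 of whom report Hispanic origin.
residents = [{"hispanic": i < 12} for i in range(40)]
rng = random.Random(2020)  # fixed seed only so the sketch is repeatable
released = noisy_count(residents, lambda r: r["hispanic"], epsilon=0.5, rng=rng)
print(f"true count: 12, released count: {released:.2f}")
```

Each released count is off by a small random amount, but the noise averages out to zero, so totals over many blocks stay close to the truth. Lowering `epsilon` adds more noise and more privacy; raising it does the opposite.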
Re-identification is one reason why the Census Bureau would choose to implement differential privacy. Re-identification occurs when a company or other entity attempts to uncover the private information in a particular database. It uses algorithms and additional sources of data, matching corresponding data points across multiple databases. Through these means, private details can be found even when names have been removed.
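The matching step described above can be sketched in a few lines of Python. Everything here is invented for illustration: the two tiny data sets, the field names, and the `link` helper are hypothetical, and real attacks work on far larger files.

```python
# Hypothetical "de-identified" release: names removed, other fields kept.
survey = [
    {"zip": "00001", "birth_year": 1961, "sex": "F", "income": 48000},
    {"zip": "00001", "birth_year": 1988, "sex": "M", "income": 61000},
    {"zip": "00002", "birth_year": 1961, "sex": "F", "income": 39000},
]

# Hypothetical public list (for example, a voter roll) that includes names.
public_list = [
    {"name": "A. Example", "zip": "00001", "birth_year": 1961, "sex": "F"},
    {"name": "B. Sample", "zip": "00002", "birth_year": 1975, "sex": "M"},
]

# Fields that are harmless alone but identifying in combination.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link(release, auxiliary, keys=QUASI_IDENTIFIERS):
    """Attach names to released rows that match exactly one named person."""
    by_key = {}
    for row in release:
        by_key.setdefault(tuple(row[k] for k in keys), []).append(row)

    matches = []
    for person in auxiliary:
        candidates = by_key.get(tuple(person[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match ties the row to a name
            matches.append({"name": person["name"], **candidates[0]})
    return matches


reidentified = link(survey, public_list)
print(reidentified)
```

In this sketch one "anonymous" income is tied back to a named person purely by matching shared fields. Adding noise to published statistics is meant to make this kind of exact matching unreliable.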
What Does Census Privacy Mean to You?
As the Census Bureau holds the private information of hundreds of millions of people living in the United States, it is imperative that this data is collected and protected in the most effective manner possible. Typical questions on a census form ask about:
- Whether you are of Hispanic origin.
- Who lives with you.
- Your type of housing.
While Census statistics are compiled under the guise of documenting the ever-changing demographics of the American population, changes in the ways in which this information is collected, stored, and disseminated to the public will undoubtedly raise questions. For example, because Census data is central to the American political landscape, it is very important that all people who live within a particular voting bloc are accurately represented and accounted for at all times. In the eyes of some American citizens, differential privacy raises questions about the processes the Census Bureau uses to collect and publish data.
Who is Objecting?
The premise of the state of Alabama’s lawsuit against the Census Bureau is that the use of differential privacy leaves room for undocumented individuals to corrupt and manipulate demographic numbers. In turn, the state of Alabama posits that improper documentation undermines its political power within the country, as electoral votes and congressional seats are apportioned based upon the population of each state. While Alabama was the first state to raise concerns about the Census Bureau’s practices, fifteen other states have also filed their own respective lawsuits, including:
- New Mexico.
- South Carolina.
- West Virginia.
The fundamental problem presented in the lawsuit by the sixteen states is that because differential privacy creates false information by design, it prevents the states from accessing municipal-level information crucial to performing their essential government functions. Moreover, the distorting impact of differential privacy will likely fall hardest on some of the most vulnerable populations, such as individuals living in rural areas and ethnic minority groups.
As such, many civil rights groups within the African American community have raised concerns about the use of differential privacy. As the U.S. does not have a strong history of racial tolerance, much less acceptance, it is understandable that changes to the Census Bureau’s system would alarm leaders in the Black community. As minorities are already at a numerical disadvantage when it comes to political power, failure to provide precise data makes it increasingly difficult for these groups to form a majority in a given community or district. These civil rights groups question whether the implementation of differential privacy could dilute or even negate their local political power.
Though California has not yet joined the suit, officials from the state have raised concerns with the current administration. While the state has shown some reluctance to join the lawsuit, it is one of the many states in which lawmakers have begun to question the ways in which the Census Bureau collects data. The primary concern is the impact this could have on a state’s ability to ensure legitimate voting results across elections. While there are currently sixteen states involved in the lawsuit, twenty-seven states face deadlines relating to political redistricting.
These states are seeking alternative methods to obtain the required data and have even gone so far as to rewrite laws dealing with redistricting deadlines. While differential privacy may appear to be a minor tool in the arsenal that the Census Bureau employs to track American citizenship, it has without question sparked a nationwide discussion on the importance of personal privacy and the best ways to go about maintaining it.