For at least two whole weeks, a database containing information on 198 million potential US voters – more than half of the American population – lay exposed on the internet, accessible to anyone who stumbled upon it while looking for unsecured assets.
Whose data is it, and who left it exposed?
All in all, between June 1 and June 14, some 25 terabytes of data were exposed, of which 1.1 terabytes were available for download.
The database, owned by Deep Root Analytics, a data analytics contractor employed by the Republican National Committee, contains voters’ names, addresses, dates of birth, and registered party affiliations, as well as computer-modeled data about each voter’s likely positions on 48 different policy issues.
According to a statement the company sent to The Hill, the data on voters’ expected positions is the result of the company’s own analysis and is used to inform decisions about local television ad buying. “The data accessed was not built for or used by any specific client,” the company said.
“We take full responsibility for this situation,” they added, and said that they contracted security firm Stroz Friedberg to investigate how the exposure happened.
UpGuard security researcher Chris Vickery was the one who found this database in an unsecured, publicly accessible Amazon Web Services S3 bucket, and notified the relevant regulatory bodies about it.
His colleague Dan O’Sullivan also took a look at the unearthed files, and searched for his name in the uncovered database. He said that the company made a pretty good guess about his preferences on many political issues.
“It is a testament both to their talents, and to the real danger of this exposure, that the results were astoundingly accurate,” he pointed out.
There is currently no information on whether these files have been accessed by anyone else.
“Deep Root Analytics, TargetPoint, and Data Trust—all Republican data firms—were among the RNC-hired outfits working as the core of the Trump campaign’s 2016 general election data team, relied upon in the GOP effort to influence potential voters and accurately predict their behavior. The RNC data repository would ultimately acquire roughly 9.5 billion data points regarding three out of every five Americans, scoring 198 million potential US voters on their likely political preferences using advanced algorithmic modeling across forty-eight different categories,” UpGuard noted.
“This exposure raises significant questions about the privacy and security Americans can expect for their most privileged information. That such an enormous national database could be created and hosted online, missing even the simplest of protections against the data being publicly accessible, is troubling,” O’Sullivan noted.
“The ability to collect such information and store it insecurely further calls into question the responsibilities owed by private corporations and political campaigns to those citizens targeted by increasingly high-powered data analytics operations.”