The New World of Massive Data Mining

Flickr user: Daremoshiranai http://www.flickr.com/photos/daremoshiranai/

Private and government groups are finding new ways to mine massive troves of digital data. Tom Gjelten and a panel of experts look at the implications for national security, education, science and medicine, as well as the privacy concerns.

Every time you go on the Internet, make a phone call, send an email, pass a traffic camera or pay a bill, you create data: electronic information. In all, 2.5 quintillion bytes of data are created each day. This massive pile of information from all sources is called “Big Data.” It gets stored somewhere, and every day the pile gets bigger. Government and industry are finding new ways to analyze it. Last week the administration announced an initiative to aid the development of Big Data computing. A panel of experts joins guest host Tom Gjelten to discuss the opportunities for business, science, medicine, education and security, but also the privacy concerns.

Guests

John Villasenor

senior fellow at the Brookings Institution and professor of electrical engineering at UCLA.

Michael Leiter

senior counselor, Palantir Technologies; former director, National Counterterrorism Center.

Dr. Suzanne Iacono

co-chair, Big Data Senior Steering Group and senior science adviser, Directorate for Computer and Information Science and Engineering at the National Science Foundation.

Daphne Koller

professor, Stanford Artificial Intelligence Laboratory.

Program Highlights

The term big data refers to the massive amounts of digital information companies and governments collect about us and our surroundings every day: pictures, records, temperatures, conversations. Our guests discuss how government and private industry are using big data and the main concerns surrounding its collection and use.

What Is "Big Data?"

Villasenor said that big data is "really big." The amount of data that's estimated to have been created or replicated would fill 11 billion iPod classics, each holding about 160 gigabytes. "Remember that the world population is only 7 billion so that's a truly incomprehensible amount of data," Villasenor said.
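For readers who want to check the scale of that comparison, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above (11 billion iPod classics, about 160 gigabytes each, 7 billion people) and assumes decimal units, where one gigabyte is a billion bytes. The variable names are illustrative only, and the result is an approximation, not a figure from the program.

```python
# Rough check of the iPod comparison, using only the figures quoted above.
# Assumption: decimal units (1 GB = 10**9 bytes, 1 ZB = 10**21 bytes).

IPOD_CAPACITY_BYTES = 160 * 10**9   # one iPod classic, about 160 gigabytes
IPOD_COUNT = 11 * 10**9             # 11 billion iPods
WORLD_POPULATION = 7 * 10**9        # 7 billion people

# Total data implied by the comparison, in bytes and in zettabytes.
total_bytes = IPOD_CAPACITY_BYTES * IPOD_COUNT
total_zettabytes = total_bytes / 10**21

# How many full iPods that works out to for each person on Earth.
ipods_per_person = IPOD_COUNT / WORLD_POPULATION

print(f"Total: about {total_zettabytes:.2f} zettabytes")
print(f"iPods per person: about {ipods_per_person:.2f}")
```

Under those assumptions the total comes to a little under two zettabytes, or roughly one and a half full iPods for every person on the planet.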

Practical Uses

Every organization, whether it's government or private sector, uses information in different ways, said Leiter. In the world of terrorism, data that was collected clandestinely could be cross-checked with information that was available publicly to try to identify people who were doing suspicious things. In the private sector, organizations like banks use data routinely to identify cyber fraud and organized crime activity. "There's almost no application, either in government or the private sector, that can't benefit from some of this big data," Leiter said.

Privacy An "Enormous" Concern

Privacy is an enormous concern, but the size of a data set doesn't necessarily track how private it is, Villasenor said. For instance, the total amount of data needed to represent all the websites an average person visits in one year is not that big, about one or two megabytes. But a lot of people would consider that information very private, Villasenor said. "That said, of course, the more data that's out there, then the more opportunity there is that it could potentially be used in ways that were detrimental to privacy," he said.

The Diane Rehm Show is produced by member-supported WAMU 88.5 in Washington DC.