NSA to store yottabytes of surveillance data in Utah megarepository (update: not so much)
There's an interesting article in the current New York Review of Books (predictably, a book review) detailing the history of the National Security Agency, that shadowy power-behind-the-power to which we surrender much of our privacy. That in itself is interesting, but I found the introduction a bit shocking: the NSA is constructing a datacenter in the Utah desert that they project will be storing yottabytes of surveillance data. And what is a yottabyte? I'm glad you asked.
There are a thousand gigabytes in a terabyte, a thousand terabytes in a petabyte, a
thousand petabytes in an exabyte, a thousand exabytes in a zettabyte, and a thousand
zettabytes in a yottabyte. In other words, a yottabyte is 1,000,000,000,000,000GB.
Are you paranoid yet?
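For anyone who wants to check the zeroes, here is a minimal sketch of the unit ladder described above. It assumes decimal (SI) units, where each step is a factor of 1,000; the names `UNITS` and `gigabytes_in` are made up for this illustration.

```python
# Decimal (SI) storage units, each 1,000 times the size of the previous one.
UNITS = ["GB", "TB", "PB", "EB", "ZB", "YB"]


def gigabytes_in(unit: str) -> int:
    """Return how many gigabytes make up one of the given unit."""
    steps = UNITS.index(unit)  # number of 1,000x steps above a gigabyte
    return 1000 ** steps


if __name__ == "__main__":
    for unit in UNITS:
        print(f"1 {unit} = {gigabytes_in(unit):,} GB")
```

Five steps of 1,000 separate a gigabyte from a yottabyte, which is where the fifteen zeroes come from.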
The more salient question is, of course, what are they storing that, by some estimates, is
going to take up thousands of times more space than all the world's known computers
combined? Don't think they're going to say; they didn't grow to their
current level of shadowy omniscience by disclosing things like that to the public.
However, speculation isn't too hard on this topic. Now more than ever, surveillance
is a data game. What with millions of phones being tapped and all data duplicated,
constant recording of all radio traffic, and 24-hour high-definition video surveillance by
satellite, there are terabytes of data, at least, coming in every day. And who knows when
you'll have to sift through August 2007's overhead footage of Baghdad for heat
signatures in order to confirm some other intelligence?
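To put "terabytes at least" per day next to a yottabyte, here is a rough back-of-the-envelope sketch. The ingest rates are purely illustrative assumptions, not figures from the article or the NSA, and `years_to_fill` is a hypothetical helper.

```python
# Decimal (SI) sizes in bytes.
TB = 10**12
PB = 10**15
YB = 10**24


def years_to_fill(capacity_bytes: int, bytes_per_day: int) -> float:
    """Years needed to fill a store at a constant daily ingest rate."""
    return capacity_bytes / bytes_per_day / 365.25


if __name__ == "__main__":
    # Assumed ingest rates, for illustration only.
    for label, rate in [("1 TB/day", TB), ("1 PB/day", PB)]:
        print(f"{label}: {years_to_fill(YB, rate):,.0f} years to reach 1 YB")
```

At a flat terabyte a day, a yottabyte would take on the order of billions of years to fill, so an estimate that large only makes sense if the intake is assumed to grow enormously.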
As for the medium on which the data might be stored, that's anybody's guess. Whoever's making the estimates is probably playing a bit fast and loose with exponential curves, but if any of the alternative storage technologies we cover here on CG are any indication, yottabytes won't seem so big a few years from now. We can be sure, however, that despite their better dollars-per-gigabyte cost, spinning hard disks won't be in use as a main medium. The electricity required, mean time before failure, and other maintenance issues are probably unacceptable for an economy-minded government agency. Interestingly, it seems that lack of electricity is one of the NSA's primary concerns.
The article mentions that the NSA's equivalent in the UK, the Government Communications Headquarters, asked that all telecoms providers store and hand over a huge amount of customer data for an entire year. They refused, citing "grave misgivings" and noting that at any rate the level of data collection expected was "impossible in principle." Tut tut! Those Brits lacked the American can-do spirit. Thus it was that AT&T and other telecoms instantly complied with US mandates following September 11. The extent of the government's meddling with switches, routers, antennas, and so on may never be fully known, but I wouldn't be surprised if everyone reading this article is on the record somewhere. Storage capacity of this magnitude implies a truly unprecedented number of subjects for monitoring.
There is talk of the NSA shutting down altogether or being rolled into another agency, but I suspect that the "too big to fail" idea, as well as the "our safety is worth any price" dogma, will prevent that eventuality. It's more reasonable to ask when or if its expansion will cease being sustainable. These datacenters, and the yottabytes they will hold, are extremely expensive and practically have bull's-eyes painted on them for the enemy (whoever he is), though at under $10bn the NSA's budget is a footnote compared to other programs and agencies. So is the increasingly (to use a semi-word that is only rarely usable) tentacular NSA a necessary evil of the digital age, or a cancerous money sink born from the colossal intelligence competition of the Cold War?
The answer will only be visible in retrospect years from now, perhaps when a sequel to the book being reviewed (The Secret Sentry: The Untold History of the National Security Agency, by Matthew M. Aid) is released covering the heavily redacted records of the early 2000s. In the meantime, it's probably best to assume that the walls have ears.
(Updated with a note on storage medium)
Update 2: A commenter points out that in the study cited, yottabytes are only one possible estimate for total storage requirements. The more realistic estimates are in the hundreds of petabytes, which is much easier for a datacenter to accommodate. That said, I'm leaving the post as it is because the speculation still stands with only hundreds of petabytes being stored in these datacenters. However, adjust your tinfoil hats accordingly. (Crunch Gear, 11.01, Devin Coldewey) http://www.crunchgear.com/2009/11/01/nsa-to-store-yottabytes-of-surveillance-data-in-utah-megarepository