The Midnight Ride of Metadata

We have heard a lot of opinions about Snowden’s leak over the last few weeks. One that keeps coming up is that, for all the money the NSA has spent capturing literally everything that goes over the internet and telephone networks, they didn’t get much bang for their buck. Put another way, since they have only recovered the content of a very few conversations of American persons, it’s no big deal.

First, we only have their word on that, and DIRNSA has been willing to lie under oath to conceal this program from Congress before. So their credibility is not exactly stainless here.

Second, we also have their word that the program is hugely successful and the only reason we haven’t had a terrorist attack since 9/11. (Let us step aside to snark. Except for Nidal Hasan shooting up a Fort Hood troop medical clinic after months of communications with Anwar Al-Awlaki, which they somehow missed. And the guy who ran over a bunch of people, which they somehow missed. The guy who shot up a recruiting office, which they somehow missed. And the two Boston Marathon bombers, Speedbump and Flashbang, whom they somehow missed, despite Speedbump spending six months in Dagestan and/or Chechnya and being fingered to the US Intelligence Community by a helpful Russian liaison, all of which they somehow missed.)

But, apart from those niggling little things they missed, this program caught all the terrorists, about whom they can’t tell you. Because you might say something about it, and some security risk Booz-Allen contractor might blab it to the world.

Like, uh, Ed Snowden.

But actually, both statements could be true. They could never listen to the content of conversations, and derive great intelligence value from the stuff they learn. Don’t believe me? Try searching the net for “traffic analysis,” a venerable intelligence discipline that lets interceptors determine a surprising amount about an opposition organization just by the relationships between stations and the forms of the messages they exchange, even if they can’t decode the messages.

That’s exactly what you can do with metadata. Indeed, the sort of information traffic analysts traditionally used was the metadata of radio communications: message length, group count, speed of transmission, frequencies used, location of transmitters and signal strength, radio fingerprints, and changes in any of those things.

Partly as a thought experiment, partly to develop new analytical techniques, and partly as a means to explore a fascinating historical figure, a few scholars have examined the American Revolution’s Paul Revere using metadata and social network analysis, a discipline that has been known to intelligence analysts for decades.

  • Kieran Healy from Duke University explicitly ties his examination of Revere to what the NSA might do with Prism data. He writes in a tongue-in-cheek tone, in character as a British analyst trying to make sense of 1773 Boston’s many insurgent movements:

Rather than relying on tables, we can make a picture of the relationship between the groups, using the number of shared members as an index of the strength of the link between the seditious groups. Here’s what that looks like.

[Figure: revere-group-view]
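
The group-to-group picture described in the passage above comes down to counting shared members. Here is a minimal sketch in Python of that counting step. The roster is an invented toy example (the people and clubs are hypothetical placeholders, not the actual Revere data), and the function name is ours:

```python
from itertools import combinations

# Hypothetical toy roster: person -> set of groups they belong to.
# Invented for illustration only; this is NOT the historical dataset.
MEMBERSHIPS = {
    "Alder":   {"Tavern Club", "Lodge", "Committee"},
    "Birch":   {"Tavern Club", "Lodge"},
    "Cedar":   {"Lodge", "Committee"},
    "Dogwood": {"Tavern Club"},
    "Elm":     {"Committee"},
}


def shared_member_counts(memberships):
    """Return {(group_a, group_b): number of people who belong to both}."""
    counts = {}
    for groups in memberships.values():
        # Each person links every pair of groups they belong to.
        # Sorting keeps each pair in one canonical (alphabetical) order.
        for a, b in combinations(sorted(groups), 2):
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


GROUP_LINKS = shared_member_counts(MEMBERSHIPS)

for (a, b), n in sorted(GROUP_LINKS.items()):
    print(f"{a} -- {b}: {n} shared member(s)")
```

The count on each pair is the "strength of the link" in the quote: the more members two groups share, the heavier the line between them. Nothing about what anyone said or wrote enters into it.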

Healy goes on to do a similar chart with individuals instead of organizations. Drawing lines between 200-odd patriots representing their connections to one another, he finds that Revere is in a remarkably central location.

Once again, I remind you that I know nothing of Mr Revere, or his conversations, or his habits or beliefs, his writings (if he has any) or his personal life. All I know is this bit of metadata, based on membership in some organizations. And yet my analytical engine, on the basis of absolutely the most elementary of operations in Social Networke Analysis, seems to have picked him out of our 254 names as being of unusual interest.

He goes further with the math, as do practitioners of social network analysis in and out of the intelligence community.
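
To see how a membership list alone can single out one person, here is a rough sketch of the person-to-person version of the same idea. It links any two people who share at least one group, then ranks everyone by how many distinct contacts they have. This is one elementary measure chosen for illustration, not necessarily the one used in the post quoted above, and the roster and function names are invented:

```python
# Hypothetical toy roster: person -> set of groups. Invented for illustration.
MEMBERSHIPS = {
    "Alder":   {"Lodge", "Committee", "Tavern Club"},  # belongs to all three
    "Birch":   {"Lodge"},
    "Cedar":   {"Lodge"},
    "Dogwood": {"Committee"},
    "Elm":     {"Tavern Club"},
    "Fir":     {"Tavern Club"},
}


def co_membership(memberships):
    """Return {person: {other_person: number of groups they share}}."""
    people = list(memberships)
    links = {p: {} for p in people}
    for i, p in enumerate(people):
        for q in people[i + 1:]:
            shared = len(memberships[p] & memberships[q])
            if shared:
                links[p][q] = shared
                links[q][p] = shared
    return links


def rank_by_ties(links):
    """Rank people by distinct contacts, then by total shared groups."""
    def score(person):
        return (len(links[person]), sum(links[person].values()))
    return sorted(links, key=score, reverse=True)


LINKS = co_membership(MEMBERSHIPS)
RANKING = rank_by_ties(LINKS)

for person in RANKING:
    print(person, "contacts:", len(LINKS[person]))
```

In this toy roster the one person who sits in all three groups is tied to everyone else, while the others only know their own club. That is the bridging position the analyst is looking for, found without reading a single letter.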

  • Shin-Kap Han’s 2009 article on Revere (pdf), which seems to have inspired Healy, uses social network analysis to argue that Revere, like Dr. Joseph Warren, who dispatched him on his ride, was a vital bridging character connecting various otherwise-isolated patriots and patriot groups. Han is a sociology professor at UIUC. The article, being a product of modern academia, can’t resist lapsing into quasi-Marxist class cant:

Where was Paul Revere in this picture? First and foremost, he was a silversmith. The master artisans like him were separated from the journeymen and apprentices in wealth and rank, and there was a hierarchy of trades that put silversmiths, goldsmiths, and distillers at the top among them. Still, in the overall colonial social hierarchy, he stood in the middle, between patricians and plebeians. Both a mechanic who made buckles and mended buttons for fellow artisans and their families and an artist who designed rococo-style “scalop’d salvers” for the merchant elite, he moved back and forth between the worlds of artisans and gentlemen, including many of Boston’s leading Whigs. The nature of his work rendered Revere a potentially useful bridge between the “bully boys” of Boston’s waterfront and the Harvard-educated gentlemen who led the American Revolution.

And as that passage shows, even as cant Han is interesting. (We did delete the citations for readability.) Han goes further in the analysis of the same data than Healy did.

We found both papers highly interesting, both because we’re interested in the Revolution, having visited many of the remaining sites where Revere and his peers acted, and because we’re interested in intelligence analysis. The process Healy and Han use here was first demonstrated in 1974 by Ronald Breiger in this paper. Breiger’s key insight was that persons and groups relate in ways that can be analyzed independently using matrix mathematics, yielding deeper insights into group dynamics than previous social-analytic approaches. He called his approach “membership network analysis.”
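
For readers who want the matrix mathematics spelled out, the idea can be written in a few lines. The notation below is our own choice for illustration (n persons, m groups), not copied from the 1974 paper:

```latex
% Assumed notation: n persons, m groups, A is the membership matrix.
\[
A \in \{0,1\}^{n \times m}, \qquad
A_{ik} =
\begin{cases}
1 & \text{if person } i \text{ belongs to group } k, \\
0 & \text{otherwise.}
\end{cases}
\]

% Person-to-person matrix: how many groups two persons share.
\[
P = A A^{\mathsf{T}}, \qquad
P_{ij} = \sum_{k=1}^{m} A_{ik} A_{jk}
\]

% Group-to-group matrix: how many members two groups share.
\[
G = A^{\mathsf{T}} A, \qquad
G_{kl} = \sum_{i=1}^{n} A_{ik} A_{il}
\]
```

The same membership table thus yields two networks: P links persons through the groups they share, and G links groups through the members they share. The diagonal entries are informative too: P_ii is the number of groups person i belongs to, and G_kk is the size of group k.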

If you’re inclined to play with the data, Healy has put it online here. Meanwhile, read those papers and think about what someone can figure out about you if they can only see the metadata of your calls and computer activity.

It’s a good thing Lord North and his King didn’t have Healy or Han on their side. It’s a particularly good thing for Paul Revere!

Update

Healy posted a follow-up to his Revere post, with some notes about the methodology he used (very basic for social network analysis) and some later work in the field.

5 thoughts on “The Midnight Ride of Metadata”

  1. Tom Schultz

    As an old 982, think it would have been fun to spend some time in the bird cage with Healey and Han.

  2. Y.

    1) good point on what they missed, like that fuckwit Hassan.

    2) The thing is, anyone paranoid enough to use serious encryption is likely to have his messages go unread. They might twig that he’s sending something


    They could never listen to the content of conversations, and derive great intelligence value from the stuff they learn.

    If you think for a moment they’re not archiving and indexing all emails, phone calls and IM convos, or planning to do so, I’ve got an ancient bridge for sale…

    Someone somewhere computed it’d cost NSA ~ 30 million a year to keep all phonecalls recorded for one year. The number is going down each month as better drives appear..

    1. Hognose Post author

      Point may have been badly worded. Point is: they can derive great intel value WITHOUT listening to the content of the conversations, as the linked documents on Paul Revere show. As long as they know that the lists of members are of groups that are hostile to the Crown, just analyzing who’s a member of what with whom — something modern computers can simplify and accelerate — identifies to you the nodes on the link-list that can be most disruptive if whacked.

      For instance, if you were an analyst with BATFE and were tasked to conduct IPB (intelligence preparation of the battlefield) for an upcoming confiscation order, you could prioritize your targets by using:

      Members of national gun-rights organizations
      Members of local gun-clubs
      Members of gun forums and auction sites
      People whose credit card receipts show firearms or ammunition purchases
      Letter writers who opposed gun control in public newspapers
      Concealed handgun license holders (and other licenses in the states that require them).

      You can weight the responses: Class III dealers, people who buy lots of ammo, people who write for the gun media, might go to the head of the line. The more such lists the subject is on, the more likely he’s got guns that need confiscated. You don’t have to listen to a single one of his calls or read a single email or text message.

      Why do you think ATF crawls the auction sites, collecting account names and serial numbers? What use is this data to them? None, if they are interested in operating within current law. It’s all IPB for the future law of their dreams. And it’s perfectly legal. (Access to the NSA Prism data, which I do not believe they have, would not be).

      You can calculate the betweenness centrality to determine just who to take out that will disable the political opposition. But as chilling as these examples are, this kind of thing is a crude, brute-force application of SNA. It can be done with a great deal more subtlety and deftness.

      1. Y.

        True, I get that, however, it’s important to observe Federal gov’t has long ago moved beyond hewing to the letter of the law ..

  3. William Turner

    Just because we are paranoid doesn’t necessarily mean that they aren’t really after us!
    Why have all this data if there isn’t some reason to use it and use it they will at the first opportunity.
    As stated by one of the founders, I can’t remember and there isn’t time to look it up, “Government, like fire, is a dangerous servant and a fearful master”

    All the ALPHABET agencies, taken as a group, are becoming fearful masters. It seems that metadata could be used on them too.

    “He who would trade freedom for security deserves neither freedom nor security” Benjamin Franklyn. (I hope it was him who said that and I spelled his name right) It’s the thought that counts.

    Keep your ammo dry.
