NSA, Web Companies and the NY Times

I'll put it bluntly: The whole flap about the NSA tapping all the data on the web does not pass the bullshit meter. It's not fucking possible.

Why do I say this? I've worked for companies that handle large, and I mean large, amounts of user data, metadata, click-throughs, etc. It's not an easy thing to capture, route or store all those lines of logging. Most companies do it by sampling - they don't log every click, they can't write it out fast enough! 
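
To make the sampling idea concrete, here is a minimal Python sketch. The function name, the event shape and the 0.03 rate are my own assumptions for illustration, not the code any real company runs; the point is only that most events never get written at all.

```python
import random


def log_sampled(events, sample_rate, rng=None):
    """Return only the events that survive random sampling.

    sample_rate is the fraction of events to keep (0.03 means about 3%).
    Everything else is dropped without ever being written anywhere.
    """
    if rng is None:
        rng = random.Random()
    # rng.random() is uniform in [0.0, 1.0), so this keeps ~sample_rate of events.
    return [event for event in events if rng.random() < sample_rate]


if __name__ == "__main__":
    # Hypothetical click events; a real log line would carry far more fields.
    clicks = [{"click_id": i} for i in range(100_000)]
    kept = log_sampled(clicks, sample_rate=0.03, rng=random.Random(42))
    print(f"logged {len(kept)} of {len(clicks)} clicks")
```

Run it and only a few thousand of the 100,000 clicks get "logged". The rest are gone for good, which is the whole reason sampling keeps the write load manageable.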

One set of systems I worked on would fill up each machine's log, at a 3% sampling rate, in less than 12 hours. Each box only kept the last 10 logs, so a maximum of 5 days. There were hundreds of just this one type of machine. During slow periods, a process would come along and collect these logs - because they recorded ads shown and ads clicked - but only the 3% sample. This process was slow, and the place it wrote them to usually ran out of storage before a big Hadoop cluster could process them (reduce them to abstractions and summaries). That processing took hours.
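
The arithmetic behind those numbers, as a rough Python sketch. It assumes log volume scales linearly with the sampling rate and that the log size and the 10-log limit stay the same; both are simplifying assumptions of mine, and the function names are made up for illustration. Since the logs filled in *less* than 12 hours, every result is an upper bound.

```python
def retention_hours(hours_per_log, logs_kept):
    """How far back a box can see if it keeps only the last N logs."""
    return hours_per_log * logs_kept


def hours_to_fill(hours_at_old_rate, old_rate, new_rate):
    """Time to fill one log at a new sampling rate.

    Assumes volume grows linearly with the sampling rate.
    """
    return hours_at_old_rate * old_rate / new_rate


if __name__ == "__main__":
    # At 3% sampling: 10 logs of at most 12 hours each.
    sampled_window = retention_hours(12, 10)
    print(f"3% sampling: {sampled_window / 24:.1f} days of history at most")

    # Same box, same log size, but logging every single event.
    full_fill = hours_to_fill(12, old_rate=0.03, new_rate=1.0)
    print(f"100% logging: one log fills in {full_fill * 60:.1f} minutes")
    print(f"100% logging: {retention_hours(full_fill, 10):.1f} hours of history")
```

With the figures above, that is a 5 day window at 3% sampling, versus a log that fills in under 22 minutes and a window of under 4 hours if you tried to log everything.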

This was just ad data - not browser meta data - that was sampled at only 3%, and it still required a lot of processing. The companies do it because that's how they get paid by advertisers. Once the data is processed, it's discarded, to make room for more.
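
Here is a toy Python version of that "process, then discard" step. The event fields (`ad_id`, `kind`) and the function name are hypothetical, and a real pipeline would run as a Hadoop job rather than a single function; this only sketches the idea that sampled raw lines get boiled down to a few estimated totals.

```python
from collections import Counter


def summarize(sampled_events, sample_rate):
    """Reduce sampled raw events to estimated per-ad totals."""
    counts = Counter()
    for event in sampled_events:
        counts[(event["ad_id"], event["kind"])] += 1
    # Each sampled line stands in for roughly 1 / sample_rate real events.
    return {key: round(n / sample_rate) for key, n in counts.items()}


if __name__ == "__main__":
    raw = [
        {"ad_id": "ad-1", "kind": "shown"},
        {"ad_id": "ad-1", "kind": "shown"},
        {"ad_id": "ad-1", "kind": "clicked"},
        {"ad_id": "ad-2", "kind": "shown"},
    ]
    summary = summarize(raw, sample_rate=0.03)
    raw.clear()  # the raw lines are thrown away once the summary exists
    print(summary)
```

What survives is a handful of numbers per ad. Who clicked what, and when, is not in the summary, and the raw lines that did hold it are already gone.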

The NYT and Guardian act like the NSA just comes in, hooks up a cable to a server, and starts sucking data for free. It doesn't work that way. Phone logs are different: it's just the to and from of phone numbers, and simple cellular routing. The system is designed for each *phone* to record that, and then for the home office to grab it as needed for billing.

Web providers can't do that - each page view is like a minute of a phone call, but all different. Sure, they can capture the incoming IP, referer, and what page got rendered. But the large companies can't store that as raw data for long. There's too much. Add in mail and chat, and it's an avalanche.

That's actually why they spend a lot of processing and compute power on personalization - because they want to tailor things to you. But they can't keep the specifics for long; they have to distill them down into a bunch of numbers and keywords.

As for the big web companies just *giving* this raw big data to the government? No, for a lot of reasons:
  1. It is proprietary - it is what differentiates them from their competition.
  2. It's expensive to store, process and transport the mass of raw data that the NSA is supposedly getting. Expensive enough to be a significant line item on a company's balance sheet. Waaaaay more expensive for even one company than $20 million a year.
  3. You can't gather and shuffle this much stuff around without the rank and file knowing. Even "backdoors" are obvious to any sysadmin or programmer who has to deal with the code. 
  4. There are tens of thousands of servers all over the world at companies like Google and Facebook.  You can't just connect up to them and start pulling all the user metadata without screwing up the software or the network. Too much load, too much bandwidth used.
Fulfilling a specific FISA request is a little easier - the data is small enough to sift out of the river flowing by without stopping up the process and impacting the property/sites. Companies scrutinize these requests and interpret them fairly narrowly - if their customers didn't trust them not to hand out even metadata indiscriminately, they wouldn't stay. Customer trust is a big thing in the web business - ask Facebook, which loses customers over how much it shares with advertisers without consent.

These theoretical backdoors would be a security risk for any web site. Sites have multiple layers of security, so a backdoor probably wouldn't work anyway. Just a small security hole in one application isn't enough to give the NSA any access to logs, which is where the user-identifiable metadata lives. You would have to have separate access to each server, and believe me, where I've worked I'd know if people had log-sucking processes feeding data outside the network.

Even tapping the user input stream and duplicating it for internal load testing is a non-trivial problem, and that's only turned on for a few machines, not thousands.

Yes, in theory, it could be done, if the NSA had a dedicated, high capacity storage server bank and a high bandwidth, dedicated network pipe to each and every data center of every company. Such a thing would cost billions, yes, with a "B", and would not be even close to a secret.

Now, some people might say "what about man in the middle?" Well, this might be possible for unencrypted connections, but it would have to be done at the ISP level, or at the edge of each web company's network, and it still would be too expensive and too obvious.

Again, there are companies that independently analyze web traffic, and they need an extra application layer to do so, and even then they still have the big data and sampling problems. They also aren't cheap.

So no, the whole "The NSA is spying on the web!!!111!!!!!" thing just doesn't pass my bullshit detector.

I work with these huge server farms. Even with high speed networks and huge filers, it is time consuming, IO intensive and expensive to shuffle that kind of log data around. Yes, some companies do it, with a percentage of their logs, and spend a big wad of cash to do it, and then discard the raw data so they can process more. They don't store and forward it all to the government - the bandwidth, personnel, and storage space just aren't there to do that for free, and the NSA's budget isn't big enough to get even the big companies set up.

FISA requests about specific stuff? Yeah, that's easier to pull out - but only as the traffic happens, not for stuff from weeks ago. It still would cost overhead and bandwidth, and if not done carefully could cause service outages. So companies only deliver the minimum required, to minimize the impact to their business. Plus, they really, really hate the gag order that comes with FISA requests, so they are not inclined to give the government any more than the absolute minimum.

The NYT and Guardian claims are outrageous, and don't pass the practicality test.  Sorry to piss on your outrage. The secret FISA courts and Patriot Act crap are bad enough, and you shouldn't let the fantastic dilute your anger at the real stuff that goes on all the time. Pay attention to the real issues - secret courts and fishing expeditions - and ignore the sensationalists who are trying to make fools of you.


White Person's Guide to Dealing With PoC.


Hopefully I don't stick my foot in it, but I have seen enough asininity on the net to choke a horse. I also don't believe it is the PoC's (Person of Color's) responsibility to teach white folks how not to be bigoted assholes.  This is my attempt to teach politeness to the rude.

So here goes...
  • Do not quote your "black/hispanic/asian friend" as the source of all truth about that group, especially to that group. Not all PoC are alike, really. Individuality is the rule, not the exception. Stereotypes are mental straitjackets.
  • Don't expect your PoC friend/acquaintance/net.person to be the spokesperson for all PoC. Just like you don't speak for all whites, they don't speak for all PoC. They'll usually tell you how common an experience is among their friends and family, if you ask politely.
  • Do not expect to be praised for talking to a PoC. Really. You don't get brownie points for having "a black/hispanic/asian friend". You don't get brownie points for simply being a considerate, polite human being. You might, however, get the benefit of having a broader outlook.
  • If a PoC, live or on the net, says something about what they've experienced being a PoC in this society, listen. Don't contradict them, don't minimize it, don't negate it. It's not your place to tell them where they've been or what has happened to them. Repeat it, retweet it, try to understand how they feel about it, but don't even imply that it is somehow insignificant or didn't happen. Institutionalized racism is real, and even if you don't notice it, it still bites PoC.
  • If you inadvertently stick your foot in your mouth, either just the toes or up to the hip, apologize, without "but", without victim blaming, and keep a lid on your privilege next time. Don't expect them to educate you on why you were wrong, offensive or just insensitive - it's not their job to make you not be a bigot. Don't expect them to forgive you, either. They aren't obligated to soothe your guilt.
  • PoC have standards about who they will accept as friends too. You may not measure up. Deal with it. They may not feel a need to have a token white friend who can regale them with a high level of cluelessness.
  • Don't make comments about their appearance that touch on stereotypes. "You look great today" is fine, "Your hair is really pretty, can I touch it?" is most decidedly not. Would you like it if someone commented, "Hey, your neck is less red today!"?
  • Realize that PoC may not have had the same opportunities as you did. Realize that predominantly minority schools are often underfunded. Realize that a PoC has to deal with shit every day that you can't even imagine having to put up with. Realize that even if you have been poor, you haven't been poor and a PoC.
  • PoC often come from a different subculture. Expecting them to relate to white, suburban, middle class jokes when they are black, urban, and working class will probably be a big flop - just like you don't get all the jokes from your white, rural, farming country relatives.
  • Claims of "it's a free country" and such ring pretty hollow to PoC - the police usually treat them like outsiders, criminals who just haven't been caught yet. Support them when they say there are problems between the authorities and their communities.
  • Yes, it's true, a black person can use the "N" word, and you can't. Deal with it. If you'd had "Cracker" thrown at you in the way they have had the "N" word used, you'd want to find a way to lessen the sting too. That goes for other slurs too.
  • If a PoC indicates that you have been an ass, don't justify, or demand that they explain how and why. Be glad they even told you, and didn't just cut you off like you didn't exist. Just apologize. Think about the incident, what you said, and how it might have sounded to them. Nine times out of ten, with a little bit of thought and empathy you will get it. Ask a friend if you still can't figure it out.
  • There are areas of intersection of minority status and bigotry. Don't take them for granted. Build on common ground, but respect differences and different impacts. Shit you take for granted often is hard won by PoC.
  • The hardest part: If you don't have something really constructive or helpful to say when someone has a problem due to being a PoC in the US, keep your yap shut. Make sure that other people know that they've been wronged, if it's a public discussion, but don't throw your two cents in. It isn't helpful.
  • Don't do the "Yes, but" thing when a PoC tells you about an incident or entrenched bigotry. It doesn't help, and will likely get you mocked, chewed out, or given the cold shoulder. When in doubt, keep your mouth shut. It's harder to stick your foot in that way.
  • Don't use the "tone" argument, don't derail, don't dismiss, minimize, or otherwise negate a PoC's experiences. Would you want to be erased, treated like nothing you experienced was true, or even real, and that how you felt was inconsequential or invalid? So don't do it to someone else. Remember, when in doubt, shut up.
  • If a white friend of yours is being racist, call them on it. Don't expect that they will listen well, but they might be more inclined to hear it from you than from a stranger. If they don't stop that kind of crap, rethink whether you really want to associate with that type of jerk. There is such a thing as guilt by association. Yes, it's harder with family.
I'm sure I'll think of more, but this has been knocking around in the back of my head for a while. BTW, I won't accept "yes, but" criticism from fellow whites on this one. 

White Privilege: Unpacking the Invisible Knapsack
Derailing For Dummies - yes, this is sarcasm
the tone argument