Bruce Schneier on identity theft in 2005:
The second issue is the ease with which a criminal can use personal data to commit fraud. It doesn’t take much personal information to apply for a credit card in someone else’s name. It doesn’t take much to submit fraudulent bank transactions in someone else’s name. It’s surprisingly easy to get an identification card in someone else’s name. Our current culture, where identity is verified simply and sloppily, makes it easier for a criminal to impersonate his victim.
With the recent Anthem data breach, the world is now on notice that, for a significant portion of the United States population, the combination of a Social Security number and a birth date is not a secure way to verify identity. The financial services industry has thus developed new identity verification schemes and fraud detection tools. The IRS now uses big data to detect and prevent fraud. PayPal, Dropbox, and Google have all adopted two-factor authentication. Stripe has adopted a machine learning model that adapts itself to your business. The tools to mitigate and prevent identity fraud exist, and companies and governments should employ them.
An interesting slide deck on something that I think most professors already knew and noticed: students often skip lectures, and attendance drops off near the end of the semester. This happened both in my undergrad and in law school. My general sense is that a lot of students felt they could simply learn through self-study and did not get much value out of the lectures. If you can pull it off, it’s probably not a huge problem, but for students who struggle to get good grades I think that this kind of data will be critical in figuring out how to engage them and help them improve their performance.
Of course this data is not nearly as bad as the rather dismal completion rate for MOOCs.
My friend Lon Seidman explains how he used data to learn about his YouTube audience and improve his videos:
The other day, to much applause from the Internet community, Tom Wheeler announced that the FCC would issue regulations to ensure network neutrality. This Wall Street Journal article tells the story of how this victory was achieved. For startups and small businesses that are unable to negotiate special deals with ISPs, this keeps the playing field level. It is a win for capitalism and the Internet.
Even something like the loading speed of webpages makes a significant impact on a business. That is why companies like Google, Amazon, and Etsy have conducted studies on the impact of page speed on their businesses. Lara Hogan at Etsy has a helpful overview of designing for performance that explains why they care. Key fact: after 3 seconds, 40% of users will abandon your website. If you are not thinking about page speed, you risk cutting out as much as 40% of your potential viewers or customers. That is why companies are rightfully worried that their websites could end up in a slow lane, and why people will pay money for services like Amazon CloudFront.
Rather than write out a long post today, I wanted to link to a bunch of interesting data-related developments:
IBM Watson released five new services to its developer cloud - speech to text, text to speech, visual recognition, concept insights, and tradeoff analytics. There are some fun mini-demos on the website, but if you want to take advantage of visual recognition today, one of the cool things you can do is upload a photo set to your Google Drive and search for unlabeled images using text descriptions.
The White House released an interim big data and privacy report
Lex Machina is using data and machine learning to help companies win lawsuits - This software uses data from litigation to help attorneys ascertain what the most effective litigation strategies are. I think software like this can give new attorneys a leg up against more experienced attorneys who instinctively build this knowledge base over time.
Planet Money investigated Amazon Mechanical Turk - If you need a large volume of data entered inexpensively, then Amazon Mechanical Turk is probably the best solution. However, many of the workers are not making much money from it.