Blockspring is brilliant. Maybe they know that

A couple of months ago I was having a chat with a prospective client about where the data-driven marketing ecosystem is going. In his job, it’s important to understand and plan for that. I suggested that we will see a consolidation of APIs, and that he’d be wise to start thinking through the scenarios.

Two weeks later I read about Blockspring, which had recently raised some money from Andreessen-Horowitz. Blockspring is a genius idea, and I’m going to explain why. And incidentally, I do not work for Blockspring, nor do I work for a16z (startup-speak for Venture Capital firm Andreessen-Horowitz, because it’s their very clever domain name). In fact, I’m not sure if I even know anyone who works for either of those companies. But I do think Blockspring is on to something huge – and wide open.

Before I dive into my case, let’s just spend a sentence or two on what Blockspring actually does from a user perspective.

But Wait … What’s an API Anyway?

In keeping with my theme – writing for businesspeople who are not technologists – I’ll start here [1].

An Application Programming Interface (API) is simply a way apps talk to one another. For example, Twitter has an API that allows programmers from any other company (or even individuals) to pull information directly from Twitter’s data stream. Programmers can then use this interface to build their own products and features around Twitter’s data.

APIs are very commonly used these days. Continuing with the Twitter example, there are dozens of products out there (if not many more) built in part or wholly on Twitter data. Followerwonk is a currently popular example; they provide analysis of Twitter audiences using the Twitter API. Their product is a bundling of features and functions that Twitter could build into their own product, but hasn’t. This turns out to be a very important point.
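For the curious, here is a minimal sketch of what "apps talking to one another" looks like in practice. Nothing here is Twitter's real API: the `fake_follower_service` function, the `handle` field, and the numbers are all made up for illustration. The point is only the shape of the exchange: one program sends a structured request, and the other sends back structured data (JSON here) that the first can build a feature on.

```python
import json

# Hypothetical stand-in for a remote service. A real API would sit behind
# a URL on someone else's servers; this one is a plain function so the
# example runs without a network connection.
_FAKE_DATABASE = {"example_user": {"followers": 1200, "following": 310}}


def fake_follower_service(request_json: str) -> str:
    """Take a JSON request, return a JSON response (both plain text)."""
    request = json.loads(request_json)
    record = _FAKE_DATABASE.get(request["handle"])
    if record is None:
        return json.dumps({"status": "not_found"})
    return json.dumps({"status": "ok", **record})


def follower_ratio(handle: str) -> float:
    """A 'product feature' built on top of the service's data."""
    request_json = json.dumps({"handle": handle})
    response = json.loads(fake_follower_service(request_json))
    if response["status"] != "ok":
        raise LookupError(f"no data for {handle}")
    return response["followers"] / response["following"]


if __name__ == "__main__":
    print(round(follower_ratio("example_user"), 2))
```

The `follower_ratio` function never touches the "database" directly. It only knows how to ask the service a question and read the answer, which is exactly the position an outside developer is in.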

So What’s Blockspring?

OK, bear with me here. Blockspring has a tool that allows the user to build spreadsheet formulas out of APIs. Right, I know that’s probably the most confusing sentence I’ve ever written, so let’s break it down.

  1. Spreadsheet Formulas. As readers of my erstwhile LinkedIn posts (and gazillions of non-readers) know, one of the most basic features that makes spreadsheets so powerful is built-in formulas. So, for example, if I want to find the monthly payments on an apartment that costs $500,000 (hey, I live in NYC!), assuming I pay 20% down and get a loan at 4% per year for 30 years to pay the rest, I can tell Excel: [image: ExcelFormula1]
    and Excel will tell me: [image: ExcelFormula2]
    This all happens in one cell of the spreadsheet, because the formula PMT is built into Excel. It knows exactly what to do with the numbers. (This, by the way, is an algorithm, another heavily-used term these days, which I’ll cover in another post.)
  2. APIs. Well, we’ve already described what they are (in the broadest possible terms). I’ll add that APIs are all over the place, and (for our purposes) they’re really the principal way that databases and applications talk to each other over the internet.
  3. So knowing this, let’s take a look at a Blockspring formula. In this example, I’m using Google Sheets instead of Excel; Blockspring can work in either! I’m using the API for Alchemy, which is a text analysis tool from IBM: [image: AlchemyAPI1]
    It looks complicated, but it’s just like any other spreadsheet formula – it has a function (“Blockspring”), which tells the spreadsheet what to look for. And it has arguments: in this case, a “Block”, which is Blockspring’s word for what to do with the API connection; “url”, which tells the function what it’s looking at generally; and “http://gothamist.com/2015/08/23/ny_times_sane_people_struggle_to_un.php”, the actual URL. The formula then connects to IBM and executes, all inside the spreadsheet. Here’s the result: [image: AlchemyAPI2]
    The API looked up the URL and classified the content. It is saying with 98.9242% “confidence” that the web page at that address is a News site, that it’s roughly 53% sure it falls into the Movies category, and about 52% sure that it’s about Reality TV. For those interested, the URL in question points to an article in Gothamist titled “NY Times, Sane People Struggle To Understand Donald Trump’s Appeal”.
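To make the spreadsheet-formula half of this concrete, here is a sketch of the arithmetic behind the apartment example in item 1. It uses the standard fixed-rate loan formula, payment = P × r / (1 − (1 + r)^−n), where P is the amount borrowed, r the monthly rate, and n the number of payments. The function name `monthly_payment` is my own, and this is not Excel's actual code. One difference to be aware of: Excel's PMT reports the payment as a negative number (cash leaving your pocket), while this sketch returns it as a positive one.

```python
def monthly_payment(price: float, down_fraction: float,
                    annual_rate: float, years: int) -> float:
    """Fixed monthly payment on a fully amortizing loan."""
    principal = price * (1 - down_fraction)   # amount actually borrowed
    r = annual_rate / 12                      # monthly interest rate
    n = years * 12                            # number of monthly payments
    if r == 0:
        return principal / n                  # interest-free edge case
    return principal * r / (1 - (1 + r) ** -n)


if __name__ == "__main__":
    # $500,000 apartment, 20% down, 4% per year, 30 years
    print(f"{monthly_payment(500_000, 0.20, 0.04, 30):,.2f}")
```

Run as a script, this prints 1,909.66, or roughly $1,910 a month. A Blockspring-style formula plays the same role in the spreadsheet, except that the "knowing what to do with the numbers" part happens on someone else's server instead of inside the spreadsheet program.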

All of this is really cool, especially if you’re a data journalist. But everything up to this point is really just background information, and doesn’t have much to do with the real genius of Blockspring. Let’s get to that!

Data wants to centralize

You may not remember it, but if you ever took Economics 101 you probably learned about “Natural Monopolies”.
A Natural Monopoly is an industry characterized by very high relative fixed costs and very low relative variable costs. Let me make that more digestible: if it costs a whole lot to get into the business, but not very much to provide your product or service (usually service) once you’re in, that’s a natural monopoly. The classic example is telephone companies. Building the infrastructure was enormously expensive, but once it was built, the cost of providing one minute of service for customers is very small.
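A quick back-of-the-envelope sketch shows why that cost structure pushes toward a few big players. The dollar figures are invented purely for illustration; the only real content is the arithmetic: average cost per unit is the fixed cost spread over every unit sold, plus the variable cost of one more unit.

```python
def average_cost(fixed_cost: float, variable_cost: float, units: int) -> float:
    """Cost per unit = fixed cost spread over all units + variable cost."""
    return fixed_cost / units + variable_cost


if __name__ == "__main__":
    FIXED = 1_000_000_000   # hypothetical cost of building the network
    VARIABLE = 0.01         # hypothetical cost of one more minute of service
    for minutes in (1_000_000, 100_000_000, 10_000_000_000):
        cost = average_cost(FIXED, VARIABLE, minutes)
        print(f"{minutes:>14,} minutes -> ${cost:,.2f} each")
```

With these made-up numbers the cost per minute falls from about $1,000 at a million minutes, to about $10 at a hundred million, to about 11 cents at ten billion. The biggest provider can profitably charge a price that would bankrupt the smallest one.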

These are called “natural” monopolies because the economic pressure is toward fewer and more powerful competitors, regardless of the state of the industry at any point in time. Let’s take a look at the “Data Industry” [2] through this lens.

Structured Data: Centralization of Storage

I don’t want to get too bogged down here in what “structured” and “unstructured” data are; let your CTO worry about that for now. I’ll probably cover it in some more detail in a later post. For the purposes of this post, let’s look at it like this: structured data is stored in rows & columns. Say you have a spreadsheet with three columns: address, latitude, and longitude. It has, let’s say, 1 million rows, representing the addresses of Austin, Texas. This is “structured data”.

The thing is, most structured data that underpins the Internet requires dedicated storage. And dedicated storage is – or was – expensive. Here’s a great chart from Matthew Komorowski showing the history of those storage costs [3]:

[image: cost-per-gigabyte-large]

Matt does a great job of explaining why he stopped updating this chart in mid-2014. It comes down to this: dedicated storage is becoming both less expensive and less relevant.

But let’s talk about the time up until about 2010. Storing data was expensive, and the cost was largely a fixed cost: you had to buy the storage empty, and build it into your enterprise. And then you had to gather, “clean”, and upload all the data in the right format. It was an expensive infrastructure cost.

But once you had all that data – cleaned, uploaded, and formatted – the relative cost of servicing and acting on that data was pretty small. This looks like a natural monopoly. In fact, in the U.S., it was not a monopoly; it was an oligopoly. Actually, several different oligopolies. A few examples:

  1. Consumer Marketing data. Dominated for years by essentially three players: Experian, Acxiom, and InfoUSA. [Disclosure: I have done a bunch of consulting work for Experian and Acxiom]
  2. Consumer Credit data. Very similar to marketing data, this industry segment has long had its three “majors”: Experian [4], TransUnion, and Equifax.
  3. Health Insurance data. Until early 2009, there was one de facto database that U.S. health insurance companies used to set the prices of medical procedures. It was owned by Ingenix, which was a subsidiary of United Healthcare. To cut a long story (that I know very little about) short, this didn’t sit well with regulators. From an economic standpoint, it makes perfect sense, though, that this database would have been so centrally controlled and operated.

And there are many more examples.

So we’ve uncovered a very interesting development: a dramatic shift in the relative costs of providing data-driven services on the web. What’s that got to do with Blockspring? We’re getting there! [Uber-geek Note: Data Storage Industry.]




Unstructured Data: Era of Tools & APIs

I’ll tell you a secret. Yes, there’s a lot of hype around “data science”, and I’ve written about that before. No, the idea of analyzing data to gain insights is not new. Yes, even the statistical methods behind all the algorithms you read about have been around for a very long time.

But there is something really new, and it has had an enormous impact on the evolution of the Internet, and on data analysis in general. That something is “Unstructured Data”.

If Structured Data is structured in columns and rows, then Unstructured Data must be … well, unstructured. It doesn’t have to be laid out in neat tables, where you can look up Row 352,627/Column 477 of Table 68. This post is unstructured data. The email you got from your boss at 2AM – unstructured data. That episode of The Young Turks you watched on YouTube last night – unstructured data. Here’s the key point: the vast majority of data out there “in the wild” is unstructured. But our old methods of data analysis – putting everything into columns and rows – made analysis of all this data a practical impossibility.
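Here is a small sketch of the difference, using made-up data. With the structured table you can ask for an exact row and column. With the unstructured text there is no column to ask for, so you have to process the text before you can answer even a simple question. The word count below is about the crudest analysis possible; it only stands in for the far more sophisticated tools that exist.

```python
import csv
import io
import re
from collections import Counter

# Structured: rows and columns (hypothetical addresses and coordinates).
TABLE = """address,latitude,longitude
100 Example St,30.2672,-97.7431
200 Sample Ave,30.2850,-97.7335
"""

# Unstructured: free text, like an email from your boss at 2AM.
EMAIL = "Need the Austin numbers by Monday. Austin is the priority, not Dallas."


def lookup(table_text: str, row: int, column: str) -> str:
    """Return the value at a given row (0-based) and named column."""
    rows = list(csv.DictReader(io.StringIO(table_text)))
    return rows[row][column]


def word_counts(text: str) -> Counter:
    """Count words after lowercasing and dropping punctuation."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


if __name__ == "__main__":
    print(lookup(TABLE, 1, "latitude"))      # direct row/column lookup
    print(word_counts(EMAIL)["austin"])      # had to process the text first
```

Notice that `lookup` needs no understanding of the data at all, while `word_counts` already embeds decisions about what counts as a word. That gap is why unstructured data stayed out of reach for so long.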

New data analysis tools have changed that. We now have the power to analyze all of this “unstructured” – and highly distributed – information. So what? What do the new capabilities mean from a strategic point of view?

Shifting Power

It’s all about changing market forces. To analyze data, you (obviously) need two things:

  1. Data
  2. Analytical tools

In the “old days”, just getting data was a barrier. And since the data itself lived in a natural monopoly, having great analysts with great tools didn’t matter much if you didn’t have the market power to buy, or – even more foreboding – to collect and store vast quantities of data.

But the very nature of all that unstructured data – emails, Tweets, web sites, videos, “social graphs”, etc. – is that it is highly decentralized. This leaves a power vacuum, and nature abhors a vacuum. Enter the API.

APIs are poised to fill the vacuum

First of all, APIs are not that new either. Kin Lane (“the API Evangelist”), who is clearly much more passionate about all things API than I am, attributes their initial development to Roy Thomas Fielding in his 2000 doctoral dissertation. What has changed is the environment they live in.

APIs live in the world of “network benefits”, about which much has been written since the rise of the Internet. I still think the old Fax machine analogy is best: the very first Fax machine had no value at all. The second Fax machine gave the first one value. Each additional Fax machine made the network of Fax machines more valuable on the whole, and this would remain true until some replacement network arose.

It is similar with APIs. The more APIs there are, the more valuable the network of APIs becomes. And this collective value is poised to fill the vacuum in market power left by both the “democratization” of data, and the dramatic decrease in storage costs.
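One common way to put numbers on this (my assumption, not something the Fax analogy strictly requires) is to count the possible pairwise connections in a network of n members: n × (n − 1) / 2. It is a rough proxy for value at best, but it captures the point that the first machine is worth nothing alone and each new one adds more than the last.

```python
def possible_connections(members: int) -> int:
    """Number of distinct pairs in a network of `members` nodes."""
    return members * (members - 1) // 2


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100):
        print(n, possible_connections(n))
```

One machine gives 0 connections, two give 1, ten give 45, and a hundred give 4,950. Swap "Fax machines" for "APIs that can be combined with each other" and you have the same curve.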

Are you starting to see where Blockspring fits in? Don’t worry; I’ve still got some ‘splaining to do.

Belling the Cat

In August of 2012, Twitter announced API 1.1, which introduced some big policy changes including rate limiting, user caps, and instructions on where & how to develop apps that interface with Twitter data. Let’s not get too deep into the technology details of all this, and instead focus on the business strategy ramifications. Suffice it to say, if you were an app developer in August 2012, you definitely remember this! It was a huge deal. And if you had ever studied Porter [5], you’d have seen this change coming from miles away.

I can’t find the original source of the chart below; maybe it even came from Twitter themselves. But in their August 2012 announcement, they told developers, “hey, kinda stay away from the upper right quadrant. That’s what we want for ourselves.” (My wording; their message.)

[image: TwitterAPIv1.1 2x2]

Developers went nuts. Lots of companies had made huge bets on products that assumed the providers of these APIs would always allow free access, in both the monetary and unrestricted sense of the word “free”. And now many of these companies faced a real existential threat. Basically, they had been getting their raw materials for free, and now someone wanted to end that.

This was, forgive me, a stupid miscalculation. Of course as they gain more market power, or as their backers look for a lucrative exit, companies that provide a key strategic value to other companies’ businesses will want a piece of the action. It’s the nature of free market capitalism.

(Incidentally, and nakedly self-servingly, when Brad & I first built Moveable Feast Mobile Media, APIs were still just plain free. Mobile was still in its earliest stage of experimentation, as was Big Data, and the “sharing economy”. Brad & I, being somewhat more “seasoned” in the ways of the internet, recognized what was happening, and we knew what would happen next. We deliberately made very limited use of external APIs, and we explicitly assumed that August 2012 would eventually arrive.)

Twitter’s move in August 2012 sent a signal: “these APIs are the backbone of the future Internet, the Internet of web services, and there’s money to be made here. The competitive dynamics of the Internet are going to change again.” In a lot of ways, APIs are to the Internet now what hyperlinks were to the Internet of 1995. And Blockspring is trying to index them.

Blockspring’s Big Idea: An Index of APIs!

Does all this sound familiar? Technology changes lead to a huge disaggregation, leading to uncertainty about the power balance in a natural monopoly? Yes! As I said above, it sounds exactly like the World Wide Web, circa 1995. Yahoo! was founded in 1994; Google first started as Sergey & Larry’s project in ’96. Lycos & AltaVista – oh well, never mind those.

The eventual brilliance was that the product being offered – a catalog of information – provided value for its users, but it also extracted value from their use. End users were trading their usage data for the value of the index. And this turned out to be unimaginably valuable, as we all now know.

Blockspring is doing this. I don’t really know if Jerry Yang & David Filo (the founders of Yahoo!) or Sergey Brin & Larry Page (the founders of Google) really understood how valuable their catalogs of the internet would eventually be; I tend to think they didn’t. Maybe Blockspring’s founders and investors do. They’re very smart folks, and they have a precedent in Google & Yahoo! But whether they realize it or not, they’re positioning themselves very well.

Finally, let me repeat my disclaimer/disclosure: I don’t know anyone at Blockspring or at Andreessen-Horowitz. I also don’t know anything about the state of competition for what Blockspring’s trying to do. I chose to use them in this post because I stumbled across their product and liked it – and saw it as an excellent case study in Internet Economics and Entrepreneurial Strategy.

Google & Yahoo!, to a very large degree, became the tollgate for the modern Internet, by offering a roadmap to the world beyond the toll plaza. The API ecosystem is still unmapped.

[image: Internet Tollgate]

I’m going to leave it here, because in one of my next posts I’d like to pick up on this idea of trading value for value. It’s a very interesting topic in data and mobile, my primary areas of expertise and interest.

And please let me know what you think!

References

1. A person with whom I’m very close has a CXO title and a lot of responsibility, and has been quoted in the public domain saying smart-sounding things about data analysis. S/he has absolutely no idea what s/he is talking about, and jokes privately about that. My litmus test for this blog is: “explain this concept to that person so that s/he will understand it.” Or, as Denzel Washington’s character in Philadelphia said, “explain it to me like I’m a 6-year-old.”
2. “Data Industry” is in quotes here because I don’t think it’s really one contiguous industry; it’s more of an entire ecosystem.
3. This chart presents retail storage costs, but that should be strongly correlated with enterprise storage. I’m not going to present a 3-D chart of price/scale/time, at least not here.
4. This is a completely different unit of the same company, and segregated from the Marketing data by Federal law.
5. https://hbr.org/1979/03/how-competitive-forces-shape-strategy/ar/1