Nick Beim

Thoughts on the Economics of Innovation

The Barbell Effect of Machine Learning


If there is one technology that promises to change the world more than any other over the next several decades, it is arguably machine learning. By enabling computers to learn certain things more efficiently than humans and discover certain things that humans cannot, machine learning promises to bring increasing intelligence to software everywhere and enable computers to develop ever new capabilities – from driving cars to diagnosing disease – that were previously thought impossible.

While most of the core algorithms that drive machine learning have been around for decades, what has magnified its promise so dramatically in recent years is the extraordinary growth of the two fuels that power these algorithms – data and computing power. Both continue to grow at exponential rates, suggesting that machine learning is at the beginning of a very long and productive run.

As revolutionary as machine learning will be, its impact will be highly asymmetric. While most machine learning algorithms, libraries and tools are in the public domain and computing power is a widely available commodity, data ownership is highly concentrated.

This means that machine learning will likely have a profound barbell effect on the technology landscape. On one hand, it will democratize basic intelligence through the commoditization and diffusion of services such as image recognition and translation into software broadly. On the other, it will concentrate higher-order intelligence in the hands of a relatively small number of incumbents that control the lion’s share of their industry’s data.

For startups seeking to take advantage of the machine learning revolution, this barbell effect is a helpful lens to look for the biggest business opportunities. While there will be many new kinds of startups that machine learning will enable, the most promising will likely cluster around the incumbent end of the barbell.

Democratization of Basic Intelligence

One of machine learning’s most lasting areas of impact will be to democratize basic intelligence through the commoditization of an increasingly sophisticated set of semantic and analytic services, most of which will be offered for free, enabling step-function changes in software capabilities. These services today include image recognition, translation and natural language processing and will ultimately include more advanced forms of interpretation and reasoning.

Software will become smarter, more anticipatory and more personalized, and we will increasingly be able to access it through whatever interface we prefer – chat, voice, mobile application, web, or others yet to be developed. Beneficiaries will include technology developers and users of all kinds.

This burst of new intelligent services will give rise to a boom in new startups that use them to create new products and services that weren’t previously cost effective or possible. Image recognition, for example, will enable new kinds of visual shopping applications. Facial recognition will enable new kinds of authentication and security applications. Analytic applications will grow ever more sophisticated in their ability to identify meaningful patterns and predict outcomes.

Startups that end up competing directly with this new set of intelligent services will be in a difficult spot. Competition in machine learning can be close to perfect, wiping out any potential margin, and it is unlikely many startups will be able to acquire data sets to match Google or other consumer platforms for the services they offer. Some of these startups may be bought for the asset values of their teams and technologies (which at the moment are quite high), but most will have to change tack in order to survive.

This end of the barbell effect is being accelerated by open source efforts such as OpenAI as well as by the decision of large consumer platforms, led by Google with TensorFlow, to open source their artificial intelligence software and offer machine learning-driven services for free, as a means of both selling additional products and acquiring additional data.

Concentration of Higher-Order Intelligence

At the other end of the barbell, machine learning will have a deeply monopoly-inducing or monopoly-enhancing effect, enabling companies that have or have access to highly differentiated data sets to develop capabilities that are difficult or impossible for others to develop.

The primary beneficiaries at this end of the spectrum will be the same large consumer platforms offering free services such as Google, as well as other enterprises in concentrated industries that have highly differentiated data sets.

Large consumer platforms already use machine learning to take advantage of their immense proprietary data to power core competencies in ways that others cannot replicate – Google with search, Facebook with its newsfeed, Netflix with recommendations and Amazon with pricing.

Incumbents with large proprietary data sets in more traditional industries are beginning to follow suit. Financial services firms, for example, are beginning to use machine learning to take advantage of their data to deepen core competencies in areas such as fraud detection, and ultimately they will seek to do so in underwriting as well. Retail companies will seek to use machine learning in areas such as segmentation, pricing and recommendations and healthcare providers in diagnosis.

Most large enterprises, however, will not be able to develop these machine learning-driven competencies on their own. This opens an interesting third set of beneficiaries at the incumbent end of the barbell: startups that develop machine learning-driven services in partnership with large incumbents based on these incumbents’ data.

Where the Biggest Startup Opportunities Are

The most successful machine learning startups will likely result from creative partnerships and customer relationships at this end of the barbell. The magic ingredient for creating revolutionary new machine learning services is extraordinarily large and rich data sets. Proprietary algorithms can help, but they are secondary in importance to the data sets themselves. The magic ingredient for making these services highly defensible is privileged access to these data sets. If possession is nine tenths of the law, privileged access to dominant industry data sets is at least half the ballgame in developing the most valuable machine learning services.

The dramatic rise of Google provides a glimpse into what this kind of privileged access can enable. What allowed Google to rapidly take over the search market was not primarily its PageRank algorithm or clean interface, but these factors in combination with its early access to the data sets of AOL and Yahoo, which enabled it to train its algorithms on the best available data on the planet and become substantially better at determining search relevance than any other product. Google ultimately chose to use this capability to compete directly with its partners, a playbook that is unlikely to be possible today since most consumer platforms have learned from this example and put legal barriers in place to prevent it from happening to them.

There are, however, a number of successful playbooks to create more durable data partnerships with incumbents. In consumer industries dominated by large platform players, the winning playbook in recent years has been to partner with one or ideally multiple platforms to provide solutions for enterprise customers that the platforms were not planning (or, due to the cross-platform nature of the solutions, were not able) to provide on their own, as companies such as Sprinklr, Hootsuite and Dataminr have done. The benefits to platforms in these partnerships include new revenue streams, new learning about their data capabilities and broader enterprise dependency on their data sets.

In concentrated industries dominated not by platforms but by a cluster of more traditional enterprises, the most successful playbook has been to offer data-intensive software or advertising solutions that provide access to incumbents’ customer data, as Palantir, IBM Watson, Fair Isaac, AppNexus and Intent Media have done. If a company gets access to the data of a significant share of incumbents, it will be able to create products and services that will be difficult for others to replicate.

New playbooks are continuing to emerge, including creating strategic products for incumbents or using exclusive data leases in exchange for the right to use incumbents’ data to develop non-competitive offerings.

Of course, the best playbook of all, where possible, is for startups to grow fast enough and generate sufficiently large data sets in new markets to become incumbents themselves and forgo dependencies on others, as Tesla, for example, has done in the emerging field of autonomous driving. This tends to be the exception rather than the rule, however, which means most machine learning startups need to look to partnerships or large customers to achieve defensibility and scale.

Machine learning startups should be particularly creative when it comes to exploring partnership structures as well as financial arrangements to govern them – including discounts, revenue shares, performance-based warrants and strategic investments. In a world where large data sets are becoming increasingly valuable to outside parties, it is likely that such structures and arrangements will continue to evolve rapidly.

Perhaps most importantly, startups seeking to take advantage of the machine learning revolution should move quickly, because many top technology entrepreneurs have woken up to the scale of the business opportunities this revolution creates, and there is a significant first-mover advantage to get access to the most attractive data sets.

This post also appeared on TechCrunch

Dataminr and the Science of Real-Time Information Discovery

Today Dataminr announced a $130m round of financing from a group of leading financial institutions and prominent financial thought leaders including John Mack, Vikram Pandit, Tom Glocer and Noam Gottesman.  

A number of friends have asked me about the company and what I find most interesting about it. This seemed like a good opportunity to highlight a few thoughts. 

What I find most interesting about Dataminr is that in addition to building a business, it is pioneering a new science. The science is real-time information discovery, and it involves sifting through the ever-growing tidal wave of real-time public data to identify and determine the significance of breaking events by their nascent digital signatures, as they happen. Sometimes these events are well-wrapped, for example by someone witnessing an event and tweeting about it, with others providing corroboration. Sometimes they aren’t, with algorithms figuring out what is happening by seeing thousands of facets of something larger. The company has a deep strategic partnership with Twitter that makes this kind of discovery possible. 

This new science is, without a doubt, very cool. It enables one to discover news before it’s news and market-moving information before markets move. It provides a kind of X-ray vision into what is going on in the world in real-time with a filter for what is significant, and to whom. All on the basis of publicly available data.

In a period of five months, Dataminr has become the real-time wire service used almost universally by major news organizations, beating out the next best service by over an hour and discovering troves of unknown unknowns that would never have otherwise come to light. It has also been adopted by the lion’s share of leading financial institutions, giving them access to the frontier of breaking information in real time.

What’s also interesting is how Dataminr will change the world. In my view most industries that rely on real-time information (an ever-increasing number) will be influenced by it, and some will be transformed by it. The wave of change began in the fields of finance, news and public safety, and I think it will move quickly to risk management, security and PR, and undoubtedly to other verticals in ways that are difficult to predict. I am particularly excited about what the company and its technology can do to help save lives in the fields of public safety and humanitarian assistance.

Dataminr is in the early days of a long journey, but it is already impacting the world in significant ways, and it’s exciting to be a part of.

Are Venture Capitalists Biased Against Female Entrepreneurs?

In her article Taking a Hammer to the Silicon Ceiling, Amanda Bennett hits on a real problem in the venture industry where spoken and unspoken biases have a significant impact: it is harder for women to raise money than it is for men. However hopeful one’s outlook, this is an uncomfortable and inescapable truth that the industry should acknowledge.

What’s the reason for it? I’ve been in the venture business for 14 years, and rarely, but sometimes, I’ve seen it come from unabashed bias about women’s ability to do as good a job as men. Generally this relates to the subject of women already having or potentially having children. I’ve heard people remark: “Wouldn’t that be a big distraction for the company, and how could they possibly be as productive as men in those circumstances?” This particular kind of bias is rarely expressed in a public manner but certainly affects the thinking of some. The good news is that as younger generations of investors assume more prominent roles in the industry, I think it will substantially diminish.

More often, I’ve seen the challenges female entrepreneurs face in raising money result from a bias that is rooted in the primary way venture capitalists make decisions, which is through pattern recognition. In a private conversation, a successful west coast venture capitalist expressed the issue to a friend of mine in a backward-looking empirical fashion that was an attempt to be unbiased: “Look at the numbers – most successful startups are started by men in their 20s and 30s; the number of successful startups founded by women is much smaller.” Yes, but most startups in any historical timeframe were started by men in their 20s and 30s. The numbers therefore say nothing about the likelihood of women succeeding, particularly since a significantly larger number of women are starting companies today than in the past.

Social scientists call this logical flaw selecting on your dependent variable: determining that A is a principal cause of B by looking only at cases of B. Used as the primary lens for evaluating new investment opportunities in venture capital, it creates all sorts of intellectual distortions and inertia and is the principal reason most venture capitalists are late to promising new trends and only jump on board when there is a significant pattern of success. I think this is the cause of the biggest challenge that female entrepreneurs face in raising money. Most venture capitalists have not internalized the success of female entrepreneurs to a sufficient degree to have it influence their intuitive pattern recognition, partly due to what they perceive as a lack of a large enough n and partly no doubt due to the fact that they have not worked with female entrepreneurs directly. It was also the cause of challenges that entrepreneurs faced in raising money in a variety of pioneering new fields, from personal computers to the internet to digital animation. Success by entrepreneurs in these fields was not yet a large enough historical pattern to influence investors’ thinking.
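To make the flaw concrete, here is a minimal sketch in Python. The founder and success counts are entirely hypothetical, chosen only for illustration and not drawn from any real data. It shows how a group can account for most of the successes while having exactly the same success rate as everyone else, which is why looking only at the successes tells you nothing about who is more likely to succeed.

```python
# Hypothetical counts, for illustration only (not real data).
founders = {"men": 900, "women": 100}   # everyone who started a company
successes = {"men": 90, "women": 10}    # those whose companies succeeded


def share_of_successes(successes):
    """What you see when you look only at the successful startups."""
    total = sum(successes.values())
    return {group: count / total for group, count in successes.items()}


def success_rate_by_group(successes, founders):
    """What you see when you also count the startups that did not succeed."""
    return {group: successes[group] / founders[group] for group in founders}


shares = share_of_successes(successes)
rates = success_rate_by_group(successes, founders)

for group in founders:
    print(f"{group}: {shares[group]:.0%} of successes, "
          f"{rates[group]:.0%} success rate")
```

In this made-up example, men account for 90% of the successes only because they account for 90% of the founders. The success rate, which is the number an investor should actually care about, is identical for both groups.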

I believe this is changing. When I look at the number of female entrepreneurs who have built successful companies over the past 20 years or are doing so today, a significant historical pattern is definitely emerging. This group is composed of some very impressive people, all the more so since they’ve had to clear higher bars than their male counterparts. Some of their companies are already significant successes, and others are on their way. A very partial set of examples that comes to mind includes Judy Faulkner (Epic Systems), Diane Greene (VMware), Julia Hartz (Eventbrite), Jilliene Helman (Realty Mogul), Sheila Marcelo (Care.com), Natalie Massenet (Net-a-Porter), Alexis Maybank and Alexandra Wilkis Wilson (Gilt Groupe), Mariam Naficy (Minted), Alison Pincus and Susan Feldman (One Kings Lane), Kim Popovits (Genomic Health), Victoria Ransom (Wildfire), Clara Shih (Hearsay Social), Adi Tatarko (Houzz), Lynda Weinman (Lynda.com) and Anne Wojcicki (23andMe), along with many dozens of others. If one does not see a pattern there, I think it may be due to lack of awareness of the facts.

I personally believe that the magnitude of success of these entrepreneurs and their peers is precisely what will finally move the needle for the silent majority of venture capitalists stuck on historical pattern recognition, for they will represent a significant historical pattern that one would ignore only at one’s peril. It’s only when venture capitalists fear they will miss out on something big that their behavior will ultimately change. Remember all those venture capitalists who thought that it would be challenging to make money on the internet, or in social media or on mobile? Those debates have been definitively won and lost, and today everyone invests in these areas. I think that those harboring concerns about investing in female entrepreneurs, even if they won’t say so directly, will ultimately abandon those concerns in the face of significant and increasing data relating to their success.

There is another bias that Bennett mentions in her article, one that creates disadvantages for female entrepreneurs but advantages for female venture capitalists: that the venture capital industry as a whole, given that it is composed primarily of men, is slow to recognize opportunities in female-dominated industries. The first people to see big new opportunities in female-dominated industries are generally women, and many male venture capitalists may never catch on. This can lead to a particularly significant adverse selection problem for venture firms in today’s internet world, where social media and ecommerce, to name two major fields, are both dominated by female users. I believe the large number of successful ecommerce and media startups focused primarily on female users, from Pinterest and Houzz to the Honest Company and Net-a-Porter, has now become a historical pattern of sufficient scale that it will help increase the number of women in the venture industry going forward (although the industry moves slowly), since they will likely be better able to spot these opportunities than their male counterparts. And this will certainly help female entrepreneurs.

For all the problems that the venture industry has with investing in female entrepreneurs, there are some investors who do care and who do support female entrepreneurs in a significant way. And often this works out particularly well for them given the biases mentioned above. In her article, Bennett asks “would a man have seen what Sheila Marcelo saw: the need for a way to connect caregivers with those who need child, elder and pet care?” Certainly much less clearly than Sheila did, but yes, there was one. I invested in Sheila the day the company was founded based on my belief in her and in her vision. I invested in Alexis Maybank and Alexandra Wilkis Wilson in the very early days of the Gilt Groupe for similar reasons. I am close to investing in my fifth female founder. I invested in these entrepreneurs primarily because they were extraordinary individuals with big ideas who understood their industries and customers extremely well, and sometimes this understanding related to the fact that they were women. I’m very glad I made these investments and look forward to investing in more female entrepreneurs in the future.

I believe that in the long term, markets do tend to be efficient, and the success of these and other female entrepreneurs will ultimately erase the regrettable biases that female entrepreneurs have to fight against today.

Full disclosure: Beyond investing in female entrepreneurs, I actually married one (in a field very different from my own). She has been the greatest source of insight and learning for me on this subject.  

A Very Cool Thing I Learned About My Dad

Every so often a family member does something significant that makes you really proud. That happened to me this week when I learned the full details of the role my Dad played in trying to prevent regulatory failure by the New York Federal Reserve. It is a pretty astonishing story that was uncovered by Pulitzer Prize-winning reporter Jake Bernstein at ProPublica and This American Life and covered subsequently by Michael Lewis on Bloomberg and by the Washington Post.

The story involves a former employee of the New York Federal Reserve named Carmen Segarra who was fired for refusing to back down from her conclusion that Goldman Sachs fell short of regulatory requirements for dealing with certain conflicts of interest. Sensing that she was working in an overly deferential regulatory system that would reject her conclusions, she secretly recorded meetings that supported her case.

The most notable smoking gun quotes were from a Goldman employee who said that “once clients are wealthy enough, certain consumer laws don’t apply to them” and from a fellow Fed regulator who responded to Segarra’s surprise at this statement by saying “you didn’t hear that.”

The background for this story is that in 2009, the head of the New York Federal Reserve asked my father, David Beim, a Professor at Columbia Business School, to write an internal report on how the Fed could have missed all the incredibly risky behavior at investment banks that helped cause the 2008 financial crisis.

After dozens of interviews, he came to a conclusion that surprised him. He expected to find a failure of financial analysis, but what he found instead was a cultural failure. The NY Fed had become overly risk averse, and its employees kept their heads down, prioritized peaceful coexistence over challenging conversations and allowed institutional consensus to weaken their findings.

His report suggested a path forward, including recommendations to find more independently-minded employees and to create a culture that would enable them to speak freely, come to uncomfortable conclusions and let the truth bubble up. The Fed sought to take his advice after receiving the report and went on a hiring spree to find more outspoken people like Carmen Segarra, although as suggested by subsequent events, many cultural problems remained.

The report had been kept secret until it was released in legal proceedings relating to Carmen Segarra’s departure from the Fed. Now it is out in the open for all to read. It is, as Michael Lewis says, an extraordinary document.

What impressed me was not only that my Dad hit upon one of the core uncomfortable truths that helped create the financial crisis, but that he did so in a thoughtful, unafraid manner and refused to bend even when the Fed tried to get him to modify his report so they wouldn’t look so bad. Pressured by senior Fed officials to remove a quotation from someone he had interviewed that “regulatory capture” set in very quickly after new employees joined, he refused to do so. This was a core failure of a core institution in the U.S. financial system, and he did not want to let politics interfere with the truth.

Courage, thoughtfulness, honesty and unwavering integrity are things I’ve always admired in my Dad. If the majority of financial managers and regulators on Wall Street were made of similar material, I honestly don’t think we would have had a financial crisis in the first place.

(My Dad recently joined Twitter. You can find him @dobeim.)  #BeimReport

Postscript: The Wall Street Journal just published an article and video interview with my Dad on his report.

A Debate about the Future of NY Tech

HotTopics recently hosted an interesting debate moderated by Jeff Glueck about the future of the NY tech scene that I participated in along with Kevin Ryan, Dennis Crowley, Jessica Lawrence, Alfred Lin and Bob Goodman.

Here are the highlights. Some of the big questions we hit were:

-How is the NY technology ecosystem different from Silicon Valley?
-What are NY’s key strengths and challenges?
-Where does NY tech go from here?
-Does NY favor startups that focus on making money over big-swing platforms that defer their focus on revenue?
-In this inning of information technology, what kinds of industries are disrupted by insiders vs. outsiders?

I wish we had had more time to discuss this last question, as it is a very interesting one worth a debate or series of blog posts in its own right.