Why can't India have its own OpenAI?

Joydeep Sen Sarma

(Discussions on this blog are welcome and can be found on this LinkedIn post and this Twitter thread)

Many in the Indian tech ecosystem wonder why we haven't produced a globally important AI company despite our vast engineering talent and thriving IT industry. Is it just a matter of investing a large amount of capital? Are highly profitable Indian IT firms to blame for holding back? Are we constrained only by our ambition? Similar laments are raised during each new software wave. Why couldn’t we create our own Google? Or our own Facebook?

I was Co-Founder & CTO at Qubole from 2011-20 (and an early Facebook engineer prior). It would be delusional to compare Qubole to OpenAI (it was closer to the likes of Together.ai/Fireworks.ai/Modal today) - but we did try to build a large deep-tech product out of India. Most of the engineering was done in India. The company raised $100M in total, got to $55M ARR at peak with marquee Fortune-500 clients - but still couldn’t quite make it across the finishing line.

What lessons, if any, can one learn from Qubole’s journey about building enterprise deep-tech out of India? It’s been a few years since I exited Qubole - and I have had a lot of time to think about our journey. This blog collates some personal perspectives on the topic.

The Beginning

Our story began in 2011. My co-founder and I had just come out of Facebook where we had built Apache Hive and run Data-as-a-Service for the whole company. We managed a Petabyte of data (it was a big deal then) and operated thousands of machines for ETL/OLAP running Hadoop/Hive/HDFS etc. Given our open-source credentials and pedigree - we had little trouble raising venture money. The idea was to provide Big-Data as a Service for the whole world in the public cloud. Our experience building Hive and running it as a service for a large Internet firm was core to this.

Sounds vaguely familiar? 

Yup - this was the Snowflake (SF) & Databricks (DB) model before they were born. And a direct competitor to AWS's own budding services in this area. It was an idea slightly ahead of its time - the public cloud was seen as a toy for developers in 2011 - and not for serious enterprises. And big data was still in its infancy.

Building Product out of India

I had decided to move back to India after leaving Facebook. So almost from the start the company was split between the US & India - with business teams almost exclusively in the US and (over time) Engineering mostly based out of India.

We found it very difficult and expensive to hire top-notch dev talent in the US. My presence in India meant there was very high trust across continents and local management that allowed us to scale in India. We were also unique - cutting edge work in core systems was rare (then) in India. As a bonus - we also offered engineers a chance to join a well known Silicon Valley startup at a founding phase. So we were able to land some great talent in Bangalore. And once there is critical mass in one geo - it tends to accrete. So we ended up with most of Engineering being based out of India.

At its peak - the India team was almost 220 strong - with about 160 in Engineering (the rest in Support, Field Engineering, S&M and other supporting groups).

Promising initial success

Our initial few years were swell. We did a Series-A/B/C at a decent pace. We were getting outraised by DB & SF (by a big margin) - but our revenues were growing fast. And our cost basis was lower (given that Engineering was ~50% of the company and was based mostly out of India). Business velocity probably peaked around 2016 and was ok until 2018 or so.

As a testament to the quality of Engineering & Product - we were able to easily take business from AWS. If someone wanted a better Big-Data platform on the Cloud - we beat them hands down. For a few years we were also competitive with SF & DB. I remember we narrowly lost Nielsen and CapitalOne to SF and had many joint customers with DB (MediaIQ in India, for example, split its workload across DB & Qubole). At one point, three of the largest cab firms, Lyft, Grab Taxi and Ola, were all using Qubole as their Data Platform.

We heard from AWS folks that Qubole’s tech blog was required reading for their big-data teams - and we even ended up hiring one of AWS's PMs as our first product manager. The success was also recognized from outside - getting on the CNBC Top 50 Disruptor list was probably the highlight for me.

To summarize: up until 2018, one of the top products in Big-Data - then the hottest deep-tech space - was built largely out of India, with marquee clients the world over (something very few people in India know).

Troubling times

Post 2017-18 came darker times. Our growth rate came down. We struggled to land large deals needed to meet quarterly targets. Competition intensified, not just from Databricks and Snowflake but from a wave of new entrants. While our retention and NDR remained world class (a testament to the product and support quality), organic revenue growth masked the weakness in acquiring new customers. 

One key inflection point was when Microsoft tied up with Databricks. We had already spent a year (and a large team) working on an offering for Azure. That was obliterated with all the Azure business going to Databricks.

Covid struck us particularly hard. We had a lot of concentration in the Travel vertical (Expedia and Lyft were our largest customers) - and all these customers cut back on usage. With even retention and NDR struggling - nothing could now mask the weakness in landing new customers. The venture world has no appetite for companies that are not growing. Not being able to raise funds means not being able to spend on marketing and sales - which means even lower growth. A virtual doom loop.

Finally - the rest of the market was booming. AWS, Azure, GCP, Databricks, Snowflake, Dremio, Trino .. a long list of companies doing similar work were seeing continued success. We started losing our most valuable assets - prized engineers and solutions architects. A flat and uncertain company valuation means stock compensation goes for a toss. And the amazing team we had assembled started seeing attrition as people left for companies with growing/higher valuations that could offer a much higher total compensation.

In 2020, Qubole was sold quietly to a PE firm. Aside from our customers, not many even noticed. It was a damp end to a once promising venture.

Lessons building Deep-Tech from India

A business outcome like this has many factors behind it. To be fair - there was a strong luck factor as well - much better outcomes came close to happening (but did not materialize). It is not possible to cover all the issues in a piece like this. But there are specific lessons about building Deep-Tech out of India, and how they relate to this business outcome, that I will attempt to document here.

Lesson 1: New Software Super-Cycle is a Land-Grab

Most revenue in Enterprise Software accrues to firms selling to very large enterprises. Large enterprises are also very sticky - and moats are created by acquiring a large number of Fortune-500 clients. As a result - in early phases of a software cycle like Big-Data - there is an insane rush to acquire large customers. The long-term winners and losers are often decided in this early phase.

Today, Databricks and Snowflake are winners of the Big-Data Platform land-grab. Okta is the winner of the Identity/Auth provider land-grab. We see similar dominance in Security (with Zscaler, SentinelOne and Cloudflare) and so on.

Top Venture Capital firms understand this pattern intimately - and there’s a corresponding insane rush to invest in companies that seem to be winning land-grabs. The difference in outcome between an early leader and the next contestant is not proportional to the gap between them - it can be exponential. Today the valuations of Databricks and Snowflake are ~1000x Qubole’s exit value.

The Foundation Model and AI space today is seen as a similar super-cycle - and the insane amount of money pouring in is a similar attempt at a land-grab (whether it works or not is a different question).

Lesson 2: Top Technologists in such cycles are insanely valuable

Since this land-grab is all about having better technology and product - and the key to that is super-star technologists - the value of such techies is sky-high. The numbers can be astronomical and may not make sense from a profitability or even revenue perspective. But they make eminent sense if one understands that grabbing land early could lead to super-normal returns far into the future. This recent post on X caught my eye and is emblematic of the current bull market in AI:

There are two interesting corollaries here:

  • Lower-Cost Engineering isn’t really an advantage at the beginning of a software super-cycle. Rather than focusing primarily on cost, companies need to prioritize speed and differentiation to grow fast and secure major enterprise customers early in the cycle.
  • Engineers in domains not relevant to these cycles are not as valuable (while they may still be paid very well). While a database internals engineer was one of the most valuable hires in 2020 - today an LLM-internals engineer is likely far more valuable.

Lesson 3: Founders don’t scale

Many ventures are started by gifted technical founders. In the early phase - such founders can create a lot of technological differentiation even with a small (or zero!) supplementary team.  Qubole was no different. As a reference point - it took AWS almost 6 years to catch up to some of the initial work I had done around auto-scaling and using Spot-instances in Hadoop (which took just a few months to implement).

But founders don’t scale. Ultimately a large impactful company can only be built with a large technology team - with at least a good handful of people being as good as (or better than!) the founders.

Unfortunately, the initial accomplishments with a small team, largely based on founder acumen, can be deceptive. They can lead to an extrapolation that the success can be replicated at a far bigger scale, long into the future - but that can be untrue. As we have seen by now - scaling beyond the initial founder acumen requires hiring expensive engineers all across the world (but particularly in the US/EU), raising large amounts of capital and having a strong GTM motion that can sustain the funding for a long time.

Lesson 4: Market for top Technologists is global with a strong US bias

In 2019 or so - I was looking at one of the open-source projects from Databricks and checking the author’s profile. The person was a PhD dropout from one of the top Univs in Europe and had a deep background in Databases. He was a mid-level engineer at DB. While Qubole had amazing talent relative to India - we only had a handful of top engineers of that pedigree (taking the profile at face value).

That day I had a realization that no matter how hard we worked or how thriftily we ran the firm - we could never win against DB. They had hundreds of such star engineers where we had maybe half a dozen: engineers who could truly build ground-breaking new products or make big technological jumps in the area of Databases or ML all by themselves. (DB was launched out of Berkeley/Stanford and was chock full of talent from top Univs in the US and EU).

The take-away here is that building a globally competitive deep-tech team largely from India is a bit of an oxymoron:

  • The market for top technologists is global.
  • The US (and to a lesser extent the EU) is far ahead of India in terms of talent depth and density.
  • Top Indian talent leaves for the US and developed world as well. (We faced immense pressure from many engineers to sponsor H1/L1 visas).
  • Our PhD programs are nowhere close to those in the US/EU. While a Doctorate itself may not be required for a professional software engineering job - the technical exposure a top PhD program gives its students is invaluable.

Lesson 5: Strong GTM motion is required for success

While Venture capital acts as a seed - only serious market traction and revenues can lead to continuous large injections of capital to fund the land-grab (and the technology team behind it).

Putting it differently - while a strong technology team and product can be the Egg - the Chicken is the revenue and that must follow in spades - and this depends on the success of the GTM motion. One of the challenges for firms from India has always been establishing scalable GTM motion and landing top Fortune-XXX clients at scale. Raising some initial capital and hiring a good technology team is not enough.

While Qubole did reasonably well early on - it really struggled in this respect overall. The reasons for this are unrelated to India (since S&M teams were based in the US) - yet it was a very important factor in our outcome.

Lesson 6: Engineering presence in Valley important for Developer marketing

One of the things I realized early in our journey was that the fact that most of Qubole’s engineers were in India led to a lack of presence in the developer community in the US. Technical products like Big-Data are all about fitting into the tech stack of developers. A large engineering team situated in the Bay Area and US - also acts as an implicit developer marketing and product evangelist team of sorts on the ground. 

(One of my memories talking to the then CEO of MongoDB in ~2010 was their story of how being located bang in the middle of Manhattan and Silicon Alley in the early days was critical to building the initial MongoDB user community).

We were very good and prolific at writing technical blog pieces online - but that was not enough. We also had Sales and Marketing teams in the US - but developers hate being sold to. The best way to get traction in the developer community is via developers. Arguably, this point matters less post-Covid - as we have all become used to remote work and workers are scattered all over the world. But I continue to see the viral spread of technologies within the Bay Area act as an initial springboard for startups.

Putting it all together

Qubole lost the Big-Data platform land-grab race. We had strong initial product differentiation (made possible by strong founders and early employees) - but our early scaling of revenues & growth on top of it was less than extraordinary, and that eventually became a misstep we could not recover from. Ordinary growth meant ordinary capital raises. Ordinary capital raises meant we didn’t have the scale and depth of technical talent (and S&M muscle) that our primary competitors had. The thinking that a low-cost engineering base in India could substitute for larger capital raises (and allow the firm to do well in this high-tech land-grab race) was inherently flawed in retrospect. Over time this resulted in the product falling behind, and the company eventually entered a doom loop of low growth leading to low investment leading to even lower growth.

Is it all hopeless for India then?

The lessons here are very specifically from one company and a very specific kind of a horse-race - the race to be one of the winning and dominant horses of an Enterprise Software Super-Cycle. OpenAI, Anthropic and others in that category are engaged in this fight today. To that extent - repeating the strategy of Qubole - to try to beat the top dogs in the AI super-cycle with a technology team out of India - is likely to lead to a disappointing outcome (not necessarily the same outcome - but likely not a huge success either).

But this is not the only horse-race one can run in. There are many other ways to build value without attempting to take on tanks head-on. What alternative business strategies can companies from India embrace in booming new domains? How can the Government help? Why is China able to do far better in such domains? 

There’s lots more to write about on this topic - something I plan to take up in a follow-on blog.

(Comments are welcome on this LinkedIn post or this Twitter thread)