Season 2: Episode #8

When Netflix Bet On AWS

Right from the start, Netflix understood the potential of the cloud and embraced AWS. But this wasn’t just one company using another’s services. There was a deep collaboration between Netflix and AWS that literally helped shape AWS and its services. We’ll hear the detailed, behind-the-scenes story of how Netflix helped make AWS, as told by Adrian Cockcroft, tech adviser, former Netflix cloud architect, and former AWS VP of Sustainability Architecture.

Guest

Adrian Cockcroft

Tech adviser, former Netflix cloud architect, AWS VP of Sustainability Architecture, founding member of eBay Research Labs

Transcript

Hilary Doyle: This is going to be good. Today, we are looking at a company that, in many ways, helped shape AWS. That company, Netflix.

Netflix set to make a run at being a big Hollywood studio with plans to release about 90 films a year. That’s according to the New York Times. There will be originals along with indie films, and then the addition of documentaries and animation as well.

Yes, they changed television, but they also changed culture inside the tech world. And our guest, Adrian Cockcroft, was Netflix’s major change agent.

Rahul Subramaniam: Netflix was one of the first companies to really adopt the distributed nature of the cloud. AWS even took a lot of tips and learnings from Netflix, like what to test, how to test, and even how to talk about the cloud. Netflix articulated this better than anyone, so what can we all learn from Adrian’s success and challenges?

This is AWS Insiders, an original podcast from CloudFix, bringing you what you need to know about AWS through the people and the companies that know it best. I’m Rahul Subramaniam, and I’m the founder and CEO at CloudFix.

Hilary Doyle: CloudFix is the nonstop automated way to find and fix AWS-recommended cost savings. It never stops working. I’m Hilary Doyle, I’m the co-founder of Wealthie Works Daily.

We often talk about the cloud being the center of the universe, but what we talk less about are the people and the companies that really shaped the cloud as we know it. Yes, we talk about Google, AWS, Microsoft, and we slag on Oracle. But with AWS in particular, its customer obsession means it really does take guidance on its growth from its customers.

In the case of Netflix, you are going to hear from Adrian about just how far ahead of AWS early Netflix was. In other words, you have more to thank Netflix for than Bridgerton alone.

Rahul Subramaniam: Customer obsession is absolutely right, Hilary. Many years ago, I was told by AWS product managers that they never add a feature in AWS unless a customer has a real use case and has demanded it.

Of all the AWS customers, I can say with a fair bit of confidence that Adrian has probably had some of the greatest impact in shaping AWS, even as a customer. Later in his career, of course, he made an even larger contribution as an AWS insider.

Hilary Doyle: Not just an AWS insider, maybe the ultimate AWS insider. But before we chat with Adrian, here are this week’s AWS news headlines.

First up, a news item for the fax machine nostalgics, the printer scanner haters, and the pun lovers. Hold on to your hats: AWS has announced that its machine learning service, Textract, now has some serious accuracy enhancements for the AnalyzeDocument forms feature. It’s going to help folks better automate their document processing workflows.

Rahul, I have high expectations for this service, so tell us how it’s going to play out in real life.

Rahul Subramaniam: Okay. Now, this might sound odd, but let me describe just some of the activities I’ve had on my plate this week.

Hilary Doyle: I’m looking forward to this.

Rahul Subramaniam: I’ve been filling out my expense reports from five weeks of crazy travel and, of course, filing my income tax returns, since it’s tax season in India.

Hilary Doyle: Right.

Rahul Subramaniam: Now, sifting through all these bank, brokerage, and insurance documents has been quite a pain. I would have been a raging mess today if it weren’t for the automation that I built with Textract to sift through all of these documents and submit all of my data to my chartered accountant.

Hilary Doyle: Nice plug.

Rahul Subramaniam: Now, imagine the other side of this exercise, where organizations need to process thousands of these documents on a daily basis. Textract is awesome at doing just that. It’s been around for a while, in fact, and it just got a lot better.

Now, we’ve been talking so much about generative AI that we’ve forgotten some of these amazing AI tools from Amazon that have been around for years and continue to solve some critical real-world problems.

Hilary Doyle: People tend to ignore and/or underestimate anything that features a pun, so I feel Textract’s pain.

Listen, we’re going to stay in this world of machine learning. But first, a bit of context might be helpful. Embeddings are numerical representations, or vectors. They’re created from generative AI and help to capture the semantic meaning of a text input, so that it can feed into a large language model. Amazon Aurora PostgreSQL-Compatible Edition now supports the pgvector extension to store embeddings from machine learning models in your database, and to perform efficient similarity searches. In other words, it can store and search embeddings from Amazon Bedrock, from SageMaker, the list goes on.

How should we think about using or deploying this, Rahul?

Rahul Subramaniam: Okay. So, we are back to talking about generative AI and large language models.

Hilary Doyle: Good. So good to be back here.

Rahul Subramaniam: So, the challenge here is, where and how do we store all of these vectors? Aurora now supports pgvector, which means that developers who are already familiar with Postgres, which is one of the most popular databases in the world, can now start using it to store vectors as well. It really flattens that learning curve for all of us who are just trying to keep our heads above water when it comes to keeping up with all the developments in the GenAI space right now.

Hilary Doyle: Right.

Rahul Subramaniam: Most importantly, it’s also a reminder that, when you make a bet on the cloud, benefits like this are free and nonstop.
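As an aside to the transcript, here is a rough Python sketch of what a similarity search over stored embeddings involves. It is not pgvector or Aurora code: the three-dimensional vectors, the document names, and the `top_k` helper are made up for illustration, and a real database would typically use an index rather than the linear scan shown here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]

# Toy embeddings (hypothetical values; real ones have hundreds of dimensions).
store = {
    "tax_form": [0.9, 0.1, 0.0],
    "expense_report": [0.8, 0.3, 0.1],
    "movie_review": [0.0, 0.2, 0.9],
}

# A query vector that points in roughly the same direction as the
# two finance documents, and away from the movie review.
nearest = top_k([1.0, 0.2, 0.0], store)
print(nearest)
```

The design point is the one made in the conversation: the vectors sit next to the rest of the application data, and "search" means ranking rows by a distance or similarity function.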

Hilary Doyle: Okay. We’re going to pivot hard over to the Greeks, because we love all things Greek on this podcast, which brings us to the great news we have this week from AWS Lambda, which can now detect and stop recursive loops in Lambda functions. Glory be.

Rahul, why do we need this and why is it good news for the health of your Lambda functions?

Rahul Subramaniam: Okay. So, you already know that I’m a huge fan of Lambda, mostly because of its pricing. Now, it costs you a dollar for a million invocations of a Lambda function, right? That just blows my mind. That said, there have been times when my Lambda bills ran into thousands of dollars in a matter of minutes, and we traced them back to the fact that there was a problem with the way some logic was implemented, and Lambdas just got called over and over again, and of course at a crazy scale.

Hilary Doyle: Wow.

Rahul Subramaniam: Now, what I love about this solution is that AWS didn’t just build a solution that detects such situations and halts execution, saving you from these massive bills; they turned it on as a default for everyone. And they made it really hard to turn it off. You actually have to call customer support and beg them and make your case.

Hilary Doyle: Nobody’s going to do that.

Rahul Subramaniam: Exactly. No one should do that. So, AWS is working incredibly hard to prevent customers from causing harm to themselves with all this amazing power.
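As an aside to the transcript, the idea of stopping a runaway invocation loop can be sketched in a few lines of Python. This is a toy model and not how Lambda itself is implemented: the `MAX_CHAIN_DEPTH` value, the `hops` field, and every function name below are assumptions invented for this illustration. Each event carries a count of how many invocations preceded it in the same chain, and the dispatcher refuses to continue once that count reaches the limit.

```python
MAX_CHAIN_DEPTH = 16  # hypothetical threshold chosen for this sketch

class RecursiveLoopError(Exception):
    """Raised when an invocation chain exceeds the allowed depth."""

def invoke(handler, event):
    """Run a handler and follow any event it emits, guarded by a hop counter.

    Returns the number of invocations that actually ran.
    """
    invocations = 0
    while event is not None:
        hops = event.get("hops", 0)
        if hops >= MAX_CHAIN_DEPTH:
            raise RecursiveLoopError(f"stopped after {invocations} invocations")
        next_event = handler(event)
        invocations += 1
        if next_event is not None:
            # Propagate the counter so the chain length survives each hop.
            next_event = {**next_event, "hops": hops + 1}
        event = next_event
    return invocations

def buggy_handler(event):
    # Bug: always re-triggers itself, e.g. by writing to its own input queue.
    return {"payload": event["payload"]}

def well_behaved_handler(event):
    # Re-triggers itself only while there is work left.
    remaining = event["payload"] - 1
    return {"payload": remaining} if remaining > 0 else None
```

With this guard, `invoke(well_behaved_handler, {"payload": 3})` finishes normally, while `invoke(buggy_handler, {"payload": 1})` raises `RecursiveLoopError` instead of running (and billing) forever.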

Hilary Doyle: Well, thank you for that wrap up. That’s it, my friends, for this week’s AWS news headlines.

Joining us now is Adrian Cockcroft. Adrian is a best-selling author, a musician, a tech advisor, and an executive who oversaw Netflix’s original move to the cloud. He then joined AWS where he became VP of sustainability architecture. He is also a founding member of eBay Research Labs. Adrian, welcome to the show.

Adrian Cockcroft: Well, thanks for having me. Great to be here.

Hilary Doyle: It’s great to have you. Let’s start here. Netflix was one of the first companies to truly embrace the cloud. You went all in on a new and relatively unproven technology. You took a whole company with you, and then that migration took seven years.

Now, there is a corporate and a personal answer here, and I’m hoping you’ll share the personal with us. How do you maintain the confidence of your convictions over seven years?

Adrian Cockcroft: In 2008, Netflix had a very large outage. We were down for several days in our data center, and that was what really got the question going, are we doing the right thing? Should we be doing something different from the way we run our systems? And at the same time, we were running a DVD business that was growing, but it was a pretty small footprint to run the DVD business. Customers interacted with it once a week to pick some different movies, and most of the delivery was over the mail system.

So, when we turned on streaming, you’re interacting with that all the time. Basically, the compute load per customer went up by a factor of, I don’t know, a thousand or something, per customer streaming versus per customer on DVD shipping. So, we were going like, “We need to start building big data centers, or we need something else.” And we didn’t know how big and we didn’t know quite what we were going to need, so we couldn’t capacity plan effectively, and that was one of the big drivers for moving to cloud. So, we said, “Okay, let’s talk to Amazon,” in 2008. And they basically said, “We’re not ready for you, go away for a year.”

Hilary Doyle: Oh, wow.

Adrian Cockcroft: And yeah, the only way you could buy AWS then was on a credit card.

Hilary Doyle: Really?

Rahul Subramaniam: That’s true.

Adrian Cockcroft: Right. There was no corporate purchasing, there were no sales teams, there was no sales rep at all at AWS. And so, that was one of the pioneering things we did with AWS. And it’s a different kind of enterprise deal than what you typically get with other enterprise technology providers, and part of that is from the influence that, I think, Netflix had on it.

So, the first things we did were not customer facing. In 2009, we got movie encoding working. So, that’s just a bunch of cute… You get raw video in and you need to encode it down to the right formats for streaming. At one point, we said, “Why not set an auto scaler to 4,000 and see what happens?” And we had 4,000 machines appear and went, “Wow.” This took about an hour or so. That’s cool. We just needed them for an hour or two. So, that was one of those moments when you realize that, “Yeah, this cloud thing actually works in ways that you can’t do in a data center.”

And then, in 2010, we started moving the front end of Netflix, all the web pages that you’d interact with, and that took about nine months. So, by the time we got to Christmas… The thing was, if we didn’t do that, we’d need a new data center. So, there was a hard limit. If we don’t do this in this time, we’ll run out of capacity, right? So, we had a picture in all our presentations of a plane going down the runway, and it either crashed into a building at the end of the runway or it took off.

By the end of 2011, the thing you think of as the Netflix product was running in the cloud. So, it’s really a two-year migration. Billing was still being done in the data center; corporate IT was in the data center. It took many more years to get rid of all of that stuff. But what you think of as the Netflix product was up and running in the cloud.

Rahul Subramaniam: Adrian, the way you describe this feels like the cloud was a very natural and easy choice to make, when given the choice between a data center and the cloud. However, today, when you look at the enterprise landscape, there are still a large number of enterprises that are still very reluctant and hesitant to adopt the cloud. What, in your opinion, both… I mean, having lived on both sides of this particular story, which is at Netflix and at AWS, why do you think that hesitation is still there, even a decade down the line?

Adrian Cockcroft: I think, in some sense, even if you go back to before this… I used to work at Sun Microsystems, and Sun was trying to do cloud in about 2000 and 2003. Right. Yup.

And Sun only knew how to sell to CIOs, it didn’t understand credit card selling. There was no direct to consumer. And the CIOs didn’t like cloud then and they don’t like it now. A CIO likes to have “I built a big data center” on their resume. It’s a bit of a cynical way of looking at it. I think everybody now has something in the cloud. If you’re a big company that’s been around a long time, you also have something in the data center.

So then, the question is, where are you putting most of your investment? And if sort of 80% of your budget for new things is going into AWS, then we’ll say, “Okay, you’re all in on AWS.” There’s some large reports. Doesn’t mean that absolutely everything is, but most of it is. And AWS is very good at making those customers successful.

And the other thing that’s been going on recently is a lot of cost-cutting. Right. So, the nice thing when you’re on the cloud is, you do some performance tuning, and next month your cloud bill is down by 20% or 30%, depending on how hard you worked at it. And you just stop spending so much. But if you’re in a data center, you’ve capitalized, your spend is locked in. And then the other thing I see in enterprises, in particular, is that the way they’re using cloud is terrifically inefficient.

Rahul Subramaniam: Very true.

Adrian Cockcroft: I mean, they were inefficient in the data center, and they took the same practices to the cloud, and they’re using cloud very, very inefficiently as well.

One of the things that I was quite proud of at Netflix was, Netflix runs extremely efficiently, very high utilization, very low waste, very highly tuned, and gets the most out of what it’s running on, and has done that from the beginning.

Hilary Doyle: When you talk about AWS in those early days, it really sounds like you were leading them into their own technology. Can you tell us a little bit about what that relationship was like, early on with AWS?

Adrian Cockcroft: Yeah. It was very much a partnership, and that’s something I’ve seen. I’ve been on both sides of that now. AWS is very good at listening to its customers about what’s needed. With Netflix, it was so early that things like… The security model didn’t exist, right?

Rahul Subramaniam: Yeah. I mean, it did not exist back in 2009. It came much later. I mean, you just had a single account with one user, and you had to potentially share that with everyone in your organization.

Adrian Cockcroft: Yeah. We were using EC2-Classic and we had security groups. And that was it. And it worked. And AWS now has so many services. I actually don’t know how to use AWS anymore. I really only know how to use EC2-Classic because it was simple enough-

Hilary Doyle: Oh my God.

Adrian Cockcroft: … I could figure it out. But I wasn’t ever really that hands on, anyway. I was sort of architecting the system rather than building bits of it.

We had a weekly meeting. We had two weekly meetings with AWS. One was the account team talking about all the latest issues, what’s going on, us telling them, “Hey, we’re launching in Europe, we need a thousand machines in Dublin.” And they’re going, “Okay, fine. Whatever.” That was easy, and they said, “Yeah, easy. Have a thousand machines in Dublin, that’s not a problem.” And then, we said, “Okay, we’re going to replicate all of Netflix US in Oregon.” And they said, “Oh, hang on a minute. We need to schedule some hardware for that.”

Hilary Doyle: Let’s set up a lunch.

Adrian Cockcroft: I like to say, you want to be a small fish in a big pond, preferably an ocean. You don’t want to be a shark in a paddling pool.

Rahul Subramaniam: Right.

Adrian Cockcroft: Right. And if you tried to run Netflix in some of the smaller regions, it just wouldn’t fit. It’s a very big footprint. And we were growing fast. So, we were hitting limits and things like that. And the whole limits management thing became a big issue. And there was a second weekly meeting; it was more of a training meeting where they’d do a deep dive on some new thing they were doing.

At some point, though, we hired Jeremy Edberg who’d been at Reddit, which was another large AWS customer. And he joined us, and his reaction was… Because he’d been on the customer advisory board and things like that, and he thought he was pretty plugged into AWS. And he joined us and said, “They tell you everything. They didn’t tell us all these things.”

So, there was a very deep level of disclosure going on. And that comes from building trust.

Hilary Doyle: Yes.

Rahul Subramaniam: Yup.

Adrian Cockcroft: Right? So, Netflix was providing very deep feedback, and needed to understand what was going on. And that trust was really the key thing that made all that work.

Hilary Doyle: Do you think that that back and forth made you both move faster?

Adrian Cockcroft: Sure. Yeah. Netflix moves way faster than AWS. We’re always waiting for them to do stuff. Netflix is ridiculously faster getting things done.

Hilary Doyle: Yes.

Adrian Cockcroft: Mostly, because it’s not trying other than… I mean, there’s APIs to TV sets that are very stable. But internally, you can turn things over very quickly. If you’re building public services like AWS does, you have to bake them a lot more. So, sort of the pace of the… But AWS is moving vastly faster than all the other enterprise providers that you could talk to. So, that was mostly the way it felt.

Rahul Subramaniam: Adrian, at Netflix, back in 2009, 2010, you had already started building a culture, which was a paradigm shift from what existed across, pretty much, every other enterprise out there, right? I mean, the way you thought about technology, the way you drove the adoption of all this distributed computing, stateless services, serverless, all of that stuff, it was fundamentally different, in every way, from what had happened in the past.

What was that core thing that held that culture together or that created that culture of taking those risks? Because at that time, it was all taking those risks, making those bets, and moving on with it.

Adrian Cockcroft: The culture was already there. It was set up at the beginning. Netflix was set up with a very different culture. There’s the Netflix culture deck, and there’s a whole story around that, but that was really just publishing the way that they worked internally, anyway. And when I was being interviewed in ’07, I went, “This just sounds fascinating. I mean, this is real. I want to be part of it just to figure out how it works.”

Netflix, at the time, at least, was very, very willing to try things, very able to try what most people would think was too risky, and to make it work. So, we weren’t just the earliest user of AWS at scale, we were the first people to use Nginx, and kind of helped create Nginx as a company. We wrote a check to the guy that had built this open source thing saying, “We need support,” and he had to form a company to cash the check, that kind of level.

And then, JFrog Artifactory, one of the first customers for that.

Rahul Subramaniam: Right.

Adrian Cockcroft: And let me think, Cassandra with DataStax, not just… The partnership there, we had committers on the Cassandra project. What you get with that is that, particularly, with startups, you become one of the marquee customers that they have. You get huge discounts. I mean, the marketing value of Netflix is worth more than you can pay the customer.

Rahul Subramaniam: That’s true.

Adrian Cockcroft: So, it’s almost like the other way around. So, it becomes very cost-effective to use things, to be like that. And similarly, with AWS, we needed to be in a very strong partnership, so that AWS saw us as… Because Amazon Prime’s a competitor, so you want to make sure that AWS feels that it has to be super supportive of Netflix and be very publicly supportive of Netflix, so that if Prime ever said, “Can you stop doing that because it’s helping Netflix?” they’d say, “No, we can’t do it. We have to…”

So, that was part of the way that Netflix managed all of these suppliers, was to get a really good deal.

Hilary Doyle: Let’s talk a little bit more about the culture at Netflix. Was Chaos Monkey just a logical evolution of that culture, or did you see a shift once Chaos Monkey and Chaos Engineering were really integrated?

Adrian Cockcroft: So, the previous hardware architecture we had was IBM pSeries machines, Oracle, lots of corporate stuff, but it still went down. So, the promise-

Hilary Doyle: Sorry.

Adrian Cockcroft: … was that, this was going to be super reliable hardware, and the developers could ignore failures because the Ops people would take care of it and-

Hilary Doyle: That wouldn’t happen.

Adrian Cockcroft: … it would just work.

Hilary Doyle: Yeah.

Adrian Cockcroft: And of course, it kept going down, and we had SAN corruption. Storage area network corruption was what took us down for three days. Well, that didn’t work. So, if we assume that the machines we’re running on are unreliable, then we can run on cheap machines and make them disposable. And there’s this cattle versus pets analogy that some of you may know. It’s like, how much milk do-

Hilary Doyle: We appreciate cattle here, too, though. We give them a lot of respect. I mean, we need to change the analogy, but we do hear you. Yes.

Adrian Cockcroft: So, we switched from pets to cattle, as the analogy. Everything we deployed was an auto scaling group. And initially, we had those set at fixed levels. And when you want to grow an autoscaler, you just set it up, and now you’ve got more machines. But if you want to shrink it, you have to figure out, how do I shut down a machine? And do I have to carefully drain all of the traffic from it and very gently make sure it’s fine before I turn it off? Or do I just turn it off and the system will deal with it?

So, what the Chaos Monkey did was, it just went around. And every now and again, we’d pick an auto scaling group and shut down a machine, and the autoscaler put it back again. And that way we were exercising auto scaling, scale-down, and resilience against failures. And we were running on cheap Intel hardware.

At that time, the failure rate of the original M1 series instances that AWS had was a lot higher than the machines today. So, you can look at the original Chaos Monkey itself as, really, an architecture design control that says, “We are auto scaling, and we want to scale down as well as up.” That’s one way of looking at it, as well as saying, “We don’t want any state on any of these machines.” The state goes somewhere else. We don’t want any session-aware cookies and things like that. So, that was the model.

And then, we had a Chaos Gorilla, which shut down the whole zone, and it picked the zone at random at runtime. Whenever you ran it, you didn’t know which zone was going to go down, and it would just shut down all of the capacity in that zone. And they started running that every couple of weeks.

Hilary Doyle: The whole Simian Army.

Adrian Cockcroft: Yeah. And then, there was another one for checking security stuff. It would go and try and make sure the certs weren’t timing out. That was Security Monkey. They’re all just sort of daemons. And then, Janitor Monkey would clean up all of the unattached EBS volumes and other junk like that, that people-

Hilary Doyle: So … 

Adrian Cockcroft: … would leave lying around.

And Chaos Kong was region level, if we switched out of one region and… Because Netflix, at that time, was running in east and west, so we’d switch all of the traffic to west and make sure that that worked. Or switch the traffic from Europe to east to make sure that we could tolerate the loss of one of the three regions.
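As an aside to the transcript, the Chaos Monkey and Chaos Gorilla behavior described above can be modeled with a small Python simulation. This is a toy sketch and not Netflix’s code: the `AutoScalingGroup` class, the zone names, and the placement rule are all hypothetical stand-ins. It shows the core loop, which is to terminate something at random and then let the auto scaling group restore its desired size.

```python
import random

class AutoScalingGroup:
    """Toy auto scaling group that keeps a fixed number of instances."""

    def __init__(self, name, zones, per_zone):
        self.name = name
        self.zones = list(zones)
        self.desired = per_zone * len(self.zones)
        self.instances = {}  # instance id -> zone
        self._launched = 0
        self.reconcile()

    def _launch(self, zone):
        self._launched += 1
        self.instances[f"{self.name}-{self._launched}"] = zone

    def terminate(self, instance_id):
        del self.instances[instance_id]

    def reconcile(self, healthy_zones=None):
        """Launch replacements until the group is back at its desired size."""
        zones = list(healthy_zones) if healthy_zones else self.zones
        while len(self.instances) < self.desired:
            placed = list(self.instances.values())
            # Place the new instance in the zone that currently has the fewest.
            self._launch(min(zones, key=placed.count))

def chaos_monkey(groups, rng):
    """Terminate one random instance in one random group."""
    group = rng.choice(groups)
    victim = rng.choice(sorted(group.instances))
    group.terminate(victim)
    return group, victim

def chaos_gorilla(groups, zones, rng):
    """Terminate every instance in one randomly chosen zone."""
    zone = rng.choice(zones)
    for group in groups:
        doomed = [i for i, z in group.instances.items() if z == zone]
        for instance_id in doomed:
            group.terminate(instance_id)
    return zone

rng = random.Random(42)
zones = ["zone-a", "zone-b", "zone-c"]
groups = [AutoScalingGroup("api", zones, 2), AutoScalingGroup("web", zones, 2)]

group, victim = chaos_monkey(groups, rng)
group.reconcile()  # the autoscaler puts a machine back

lost_zone = chaos_gorilla(groups, zones, rng)
survivors = [z for z in zones if z != lost_zone]
for g in groups:
    g.reconcile(healthy_zones=survivors)  # capacity moves to the remaining zones
```

Because no instance holds state, a terminated machine is simply replaced by a new one with a new identity, which is the property the exercise is meant to enforce.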

Hilary Doyle: We’re pressing pause on the monkey talk just for a moment because… Well, Rahul, you tell him.

Rahul Subramaniam: Well, this podcast isn’t the only way to hear all about AWS and feel like an insider. You have to check out the AWS Made Easy livestream.

Hilary Doyle: Every week, Rahul is joined by AWS enthusiast Stephen Barr, along with some amazing guests from AWS itself, to talk about what’s new and, most importantly, to answer any questions you may have, live.

Rahul Subramaniam: You can find out more at cloudfix.com/livestream. We stream on LinkedIn, YouTube, Twitch, Facebook, and Twitter. So, please join us.

Hilary Doyle: All right. I’m hitting the play button. Let’s get back to Netflix.

Adrian, all of these stories I hear about Netflix, just smack of such certainty and confidence. And I want to understand how much of that was actually true. I mean, as you were making this migration, were there moments of doubt where you thought, maybe a hybrid solution is better or have we made the right call? Did you personally have any of these moments? It’s just you and me here.

Adrian Cockcroft: Well, it wasn’t a young company.

Hilary Doyle: Right.

Adrian Cockcroft: And I wasn’t the oldest engineer. We didn’t have any graduate hires or interns. It was a bunch of senior people that had done it over and over again, taking everything they’d learned from decades of experience. I had 30 years’ experience, then.

So, the company itself was pretty mature. It was a decent size, but everything we did was to de-risk. So, at some point, there’s another story I can tell you. There was an argument over how we were going to replicate state between Virginia and Oregon, right?

Rahul Subramaniam: Right.

Adrian Cockcroft: And we had been in an argument in a meeting over whether it would work just using Cassandra or whether we needed to build a service to replicate the data over. And so, I came out of this meeting and said, “Well, let’s just try it.” I wandered over with the Cassandra manager that I was in the meeting with, and went over to one of the engineers and said, “We just got a bunch of machines.” I think we had 48 machines in Virginia and 48 in Oregon that we hadn’t deployed yet. I said, “Can we just run a test on it? Let’s just set it up as a 96-node Cassandra cluster, and just beat the crap out of it. Run it, write as much stuff as you can and just see, does it get there?” And she said, “Sure.”

It was about 4:00 PM, so by the end of that day, the cluster existed and was loading up a recent backup of data. And then, she came in the next day and was running some benchmarks. And we called Amazon before, because we were running about… The bandwidth across the country was like 480 gigabits per second.

Rahul Subramaniam: Wow.

Adrian Cockcroft: We didn’t actually end up pushing that much, but we could have… It turns out Cassandra was single… Right? We were running-

Rahul Subramaniam: Yeah, that was single-threaded, then. Yup.

Adrian Cockcroft: We ended up running about 10 gigabits per second of replication traffic for a while, which wasn’t too bad, but we didn’t know that at the time. It was plenty… But anyway, it turns out, without doing anything to anything we’d got, we had at least 10 times more capacity than we needed to do it.

So, we went to the next meeting the following week and said, “Well, we ran this test.” I mean, the whole configuration existed for a few hours. We retired that risk.

Rahul Subramaniam: Got it.

Adrian Cockcroft: Went to the meeting and said, “If you don’t like our test, run your own test.” Okay. Fine. We’ll just do it that way. And as far as I know, that’s the way they still do replication. So, we did that over and over again. If somebody had an idea, you just ran a test to retire that risk or prove that it worked or didn’t work.

And the ability to do that in the cloud by just firing stuff up in a few hours was revolutionary. That’s nothing like my old days back at Sun Microsystems. We’d sort of have to go argue for budget and go build something in the lab, and it would be months. We were doing stuff in hours that, in the data center, would take months or be completely impractical. And you’d go talk to your Ops people and say, “I need a hundred machines scattered around the world for a few hours and see what happens.” Right?

Rahul Subramaniam: True.

Adrian Cockcroft: It’s not happening.
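As an aside to the transcript, the numbers from the replication test above reduce to a quick headroom calculation. The sketch below uses only the two figures quoted in the conversation (roughly 480 gigabits per second available across the country, and about 10 gigabits per second of replication traffic actually run); the function name is made up for illustration.

```python
def headroom_factor(available_gbps, used_gbps):
    """How many times the observed load fits into the available capacity."""
    return available_gbps / used_gbps

available = 480  # cross-country bandwidth quoted in the interview, in Gbit/s
used = 10        # replication traffic they ended up running, in Gbit/s

factor = headroom_factor(available, used)
print(f"{factor:.0f}x headroom")  # prints "48x headroom"
```

On these figures the measured headroom is comfortably above the "at least 10 times more capacity than we needed" that was reported back to the meeting.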

Rahul Subramaniam: Adrian, we’ve spoken a lot about open source and a bunch of other partnerships also that you had. But I mean, at Netflix, you guys kind of pioneered a lot of the open source AWS tooling or cloud tooling, so to speak, which was, in a certain sense, Netflix could have taken the view that this is intellectual property because it sets Netflix apart from other competitors. But instead, decided to open source it, and really bring about a lot of the acceleration in folks moving into AWS.

What was the specific thinking behind it? Why was the decision to open source it instead of keeping it all internal?

Adrian Cockcroft: So, we were moving from an enterprise-centric world in the data center where we had contracts with Oracle, and we used AquaLogic, and all these… And we had… I forget who the different vendors were, but we had all of these enterprise deals that were expensive. And we were moving to the cloud where we were running on a hundred times more machines. Like, the node count was up enormously. The data center was only ever like a hundred machines or something, maximum, and there were eight backends or something like six Oracle databases or something. It wasn’t scaled out. We couldn’t afford the enterprise licensing.

Rahul Subramaniam: Oh, licensing by the-

Adrian Cockcroft: We didn’t want to pay for enterprise licensing, so we made a conscious decision to move to open source as we went to cloud. So, it wasn’t just a cloud transition, it was an enterprise to open source. And this was one way we cut our costs, because Cassandra is a lot cheaper than Oracle. And getting yourself off of Oracle is a whole problem on its own, as many people have discovered. So, that was part of it.

Then, we started fixing things like Cassandra and building things and just saying, “Well, we need to contribute the fixes, so we need to understand how to contribute to open source projects because we’re fixing the code.” So, we signed up with the Apache Foundation, did a contributor license that covered the whole company and said, “Okay, if it’s Apache license, you can contribute fixes, don’t need to talk to anyone, just tell us you did it. You don’t need to get legal approval or whatever.”

Rahul Subramaniam: Got it.

Adrian Cockcroft: If it’s some other license, talk to legal, and we’ll deal with that. And then, somebody said, “Well, I got this completely new thing I’ve built, that I’d like to open source. And it’s a new project.” And said, “Should I go and get the lawyers to look at it?” And I said, “Well, if you think the lawyers are going to find any bugs in your code, sure.” But it’s not really… This is kind of the attitude. This was my manager, at the time, Yuri, the VP of cloud. There’s an email. I think it got copied into a presentation once.

So, Jordan Zimmerman was the guy, and the code was a ZooKeeper library called Curator. And we ended up putting it out there. Actually, it is now part of the ZooKeeper distribution. I mean, it’s a top level Apache project, but it was the first thing we put out. And it created a process around, so we need to be able to put these things out there. So, that was part of it.

And then, there were two other reasons for doing open source. Well, actually, there were three or four different reasons. One was, we wanted to be able to hire people that were interesting people, that had open source project experience. So, that was-

Rahul Subramaniam: Got it.

Adrian Cockcroft: … one. We wanted to make sure the way we were using AWS, that more people were using AWS the same way, and one way of doing that was to put out the way we operated as code. So, people that adopted our code would use AWS with a similar pattern to us, so we wouldn’t end up as a shark in a paddling pool.

Hilary Doyle: I love that analogy.

Rahul Subramaniam: Got it.

Adrian Cockcroft: Yeah. We want to be surrounded by lots of other people doing things the same way. So, it was in Netflix’s best interests to make AWS successful and have as many other companies come in as possible.

I spent a lot of time encouraging Capital One to do that, and I was personally involved and did some contract work with them after I left Netflix. Capital One was probably one of the next really big significant things. It was the first bank to really commit to AWS, and they followed a lot of the path that we took. They have a big open source program. They did a lot of the same things.

And the final reason was to build a technology brand for Netflix. And it may feel odd now. Everyone thinks of Netflix as a high-tech company. At the time, it was not seen like that. Netflix was some little backwater company deep in the South Bay. They weren’t really doing high-tech, they were tiny. They had some stuff to do with recommender algorithms that was cool. And that was the only thing anyone thought was cool at Netflix.

And we were a movie brand or a TV brand. It’s content. So, what we did was, we deliberately created a technology brand as a halo brand around the movie experience, partly so we could hire people, partly to get better leverage with startups and discounts and things like that. And partly, as I said, to just be contributing back, and be able to hire the kind of people that contribute back a lot.

Rahul Subramaniam: Adrian, I’d love to finish up by hearing what advice you have for anyone in the process of moving to the cloud.

Adrian Cockcroft: Are there still really people that are still in that deciding to move to cloud thing?

Rahul Subramaniam: Lots. It’s shockingly a very, very large number of people.

Adrian Cockcroft: Okay. So, the book that I’d read is The Value Flywheel Effect by David Anderson.

Hilary Doyle: Right.

Adrian Cockcroft: It came out last year. It talks about how to map out what’s important to your company and what isn’t, and how to use… Wardley mapping is a powerful strategy technique to decide what to do, and it tells stories of Liberty Mutual, which is a 150 or so year-old insurance company, how they are. They’re one of the fastest moving companies now.

You can have an idea for a product there and roll it out tomorrow, like a new insurance product, which is an insane… Nobody can do that, right? But they are doing things like that. They are not just rolling out code updates in a day, they’re doing completely new products, completely new services, customer facing in hours, rather than weeks or months. And if that sounds ridiculous, then you need to go read the book to see how they do that.

Hilary Doyle: Unfortunately, we have to leave it here. Thank you so much for being with us, Adrian, and for sharing your incredible stories.

Rahul Subramaniam: Absolutely. Thank you so much, Adrian.

Adrian Cockcroft: Sure. Happy to share it. Thank you.

Rahul Subramaniam: And it’s been an absolute pleasure listening to all of these stories and learning from you. Thank you so much.

Adrian Cockcroft: Cheers.

Hilary Doyle: Adrian is a walking masterclass in business and culture building. I loved hearing about Netflix’s rationale for moving to open source. They just didn’t want to pay for enterprise licensing. Who cannot relate to that? Bonus points to them for then using their open source code base as a way to fish for employees out in the wild because now, Netflix could actually interview prospects who’d already dug into and familiarized themselves with the code. It is a brilliant way of protecting a culture of individual exploration, efficiency, and excellence.

Netflix really mixed the ingenuity and agility of a startup with a deep, vast experience in its employee base, and that combination works. Offline, I want to mention that Adrian also highlighted the difference between cultures at Netflix and AWS. He said that AWS, as we know, is driven by processes that aid collaboration within and among its teams. So, if there’s a problem, you look to resolve it at the process level. Whereas, at Netflix, he spoke about assembling an Olympic team: you’re hiring for excellence, and then you get out of the way. So, if your Olympian doesn’t perform, then you know you’ve made the wrong hire, and you solve for that. It’s two very different approaches to culture building and two extremely successful outcomes.

Rahul Subramaniam: I was also fascinated by how forward the mindset at Netflix was, even in those early days. I mean, they truly grasped the idea of what the cloud could do for them and embraced it wholeheartedly. I mean, you could say that they didn’t really have a choice, especially at their scale. But even today, you see so many organizations go the other way when faced with similar choices.

I mean, I absolutely love the stories about the deep collaboration between Netflix and AWS, that literally shaped AWS and its services, even though you could say Prime Video and Netflix are probably archrivals in the media and the entertainment space. I mean, this is in contrast to so many organizations that are still debating whether a bet on AWS is going to cause some sort of conflict of interest. I mean, both Netflix and AWS have such differentiated cultures that allowed them to grow rapidly together. I mean, there’s just so much to learn from both these organizations.

Hilary Doyle: Is there ever.

But enough from us. What do you think? If you have thoughts on this show or anything AWS, please reach out to us at [email protected]

Rahul Subramaniam: And we’d really appreciate a review in your favorite podcast app, and if you told your friends to check us out.

Hilary Doyle: AWS Insiders is brought to you by CloudFix. They are an AWS cost optimization tool, and you can learn more about them at cloudfix.com.

Rahul Subramaniam: Thanks for listening. Bye-bye.

Meet your hosts

Rahul Subramaniam

Host

Rahul is the Founder and CEO of CloudFix. Over the course of his career, Rahul has acquired and transformed 140+ software products in the last 13 years. More recently, he has launched revolutionary products such as CloudFix and DevFlows, which transform how users build, manage, and optimize in the public cloud.

Hilary Doyle

Host

Hilary Doyle is the co-founder of Wealthie Works Daily, an investment platform and financial literacy-based media company for kids and families launching in 2022/23. She is a former print journalist, business broadcaster, and television writer and series developer working with CBC, BNN, CTV, CTV NewsChannel, CBC Radio, W Network, Sportsnet, TVA, and ESPN. Hilary is also a former Second City actor, and founder of CANADA’S CAMPFIRE, a national storytelling initiative.
