Binpress Podcast Episode 11: Slava Akhmechet of RethinkDB
This week, we chat with Slava Akhmechet, co-founder of RethinkDB, an open-source distributed database taking the development world by storm. Slava discusses why experimental work often doesn’t make the cut for commercial codebases, why you should focus on ideas instead of venture capital, and the biggest opportunity in commercial open source. He also covers how assumptions keep people from coming up with great ideas, how Star Trek: The Next Generation explains why he built RethinkDB, and much more!
Alexis: Thank you, Slava, for taking time out of your schedule to join us here on the podcast.
Slava: It’s my pleasure. I’m looking forward to this.
Alexis: Before we dive into RethinkDB, tell us a little bit about yourself.
Slava: Well, my name is Slava Akhmechet, I’m one of the founders at RethinkDB. I was born in Ukraine and moved with my parents to New York City when I was about 13 years old in 1996, and I’ve been programming basically for as long as I can remember. Back when I was a kid, I really loved to program. All my peers loved playing games and I loved programming computer games, so that’s how I got started. I can tell you a little bit more about that.
Alexis: Yeah, absolutely.
Slava: Then later on, I was always into computers and computer science. I graduated with a computer science degree and spent some time in grad school, again studying computer science and just starting RethinkDB has sort of been a very natural thing for me. I always wanted to start a tech company and build technology products that improve people’s lives in some ways or make them happy and that’s how RethinkDB came along.
Alexis: So how did you first get interested in, “You know, I want to build a database,” because that’s probably not the first thing that comes to people’s minds. “I want to build a game,” or “I want to build an app,” that kind of thing.
Slava: Yes. I was in grad school at Stony Brook University. One year in, I passed all my qualifiers and I was looking at “Which lab am I going to join?” I was playing around with a few ideas. So one was doing massively parallelized simulations of mammalian brains on supercomputers, and we had access to an IBM Blue Gene. So it sounds pretty fancy but it was actually this big problem where neurons, human neurons or mammalian neurons, have thousands of connections and they all interact with each other, which is really hard to simulate and parallelize on modern computers because you get metric bottlenecks.
So the thing I was working on, I was trying to figure out, “Okay, how do you take this simulation model and how do you make it work efficiently on modern computers?” And the second thing I was playing with is a file systems lab where people were building file systems for Linux and they were just experimenting with various ideas and that’s where I met my co-founder Mike who was actually into human-computer interaction.
So we’re sitting around, bouncing ideas off each other and we thought, “Hey, how can we combine our skills to build something interesting for people?” We have backgrounds in infrastructure and human-computer interaction, and we realized that the way people are building and deploying applications, specifically web applications, has changed dramatically in the past ten years. And databases were designed 40 years ago, back when none of this stuff existed. So the kernel of the idea was: what happens if you sit down and rethink, redesign these systems to work in the modern world? What assumptions would you throw out? What new assumptions would we deal with? And that’s how the idea came along, and it was very exploratory at that time.
It wasn’t that we wanted to build a database, it’s that we thought, “What happens if we explore like how people build web applications and how can we apply our skills in the most effective way possible?” And I think that’s how the project came along.
Alexis: Was it with the intent of commercializing it at some point or was it more of an exploratory thing as you’d previously mentioned?
Slava: I think it started out as an exploratory thing, and then that changed when we started talking to people about it. So we were in New York and we got into Y Combinator. So there was some kernel of commercializing it. We thought, “We’re going to give it a try.” We moved to California, started talking to people about it, and we realized that there was just a tremendous amount of excitement when we talked to developers and CEOs and even business leaders about this kind of technology. So it just became clear that there is something there, and commercialization came out of that, but initially we didn’t think it through all that far. We just wanted to build something interesting.
Alexis: So after all that exploration, what resulted in RethinkDB? What did it form into?
Slava: So, RethinkDB is an open source distributed document database.
Alexis: You haven’t said that a thousand times.
Slava: That’s right. But I never get tired of saying it because I really love the product. What we let people do is build and scale reactive applications. And just to give you an idea what that means, if you’ve ever used, for example, Gmail, and you’re looking at the Gmail app, just reading conversations, and a new email comes along, there’s this little notification bar that pops up at the bottom that says you’ve gotten a new email.
Or another example of this is if you’ve ever used a product like Quora for questions and answers: you’re looking at a question and an answer, and then someone else edits it in a different browser, and you see the update instantaneously. This type of real-time reactive experience is a relatively new thing, and companies like Facebook and Twitter and Google have basically trained consumers, they trained all of us, to expect that kind of an experience. But if you’re a company and you have a website or an app and you are trying to build something like that, it turns out to be really time-consuming and really hard, because there is a lot of innovation in the web world and server world, like Node.js and Socket.IO or SockJS, that lets people do that, but there hasn’t been very much in the database world to let people do that.
So in RethinkDB, what we do is we allow people to write queries in a query language designed for JSON documents, and it’s very, very convenient to use. If you’ve used something like jQuery, it’s very similar. You can just write queries and get answers. You could scale it out at the click of a button to multiple machines.
But then what you could do is, when you write a query, add a .changes command, and then you get a real-time stream of updates to the results. So for example, you could say something like “give me the average age of my users,” though this isn’t actually a very useful query, and you could type .changes at the end, and anytime someone inserts something into the database and the average age changes, you just get that update, the updated value. And it turns out that that makes the development experience of building these reactive applications just dramatically easier for people, and that’s the core of RethinkDB. That’s why people pick it up and use it for the most part.
And there’s a lot of different things around it. The user experience: we wanted it to be super easy to get started with, and we wanted it to be super easy to scale out as their application scales, and we really care about the quality of the query language, all that stuff. But the fundamental core of the product is just redesigning a database in a way that lets people build and scale these reactive applications in the modern world.
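To make the .changes idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model and not RethinkDB's driver API: the ToyTable class, its changes and insert methods, and the old_val/new_val keys are all hypothetical names chosen for illustration. The point it shows is that the subscriber is called only when the query result (here, the average age) actually changes.

```python
from typing import Callable, Dict, List, Optional

Update = Dict[str, Optional[float]]


class ToyTable:
    """Hypothetical in-memory stand-in for a table (not the RethinkDB API)."""

    def __init__(self) -> None:
        self._ages: List[int] = []
        self._subscribers: List[Callable[[Update], None]] = []

    def average_age(self) -> Optional[float]:
        # The "query": average age of all users, or None for an empty table.
        if not self._ages:
            return None
        return sum(self._ages) / len(self._ages)

    def changes(self, callback: Callable[[Update], None]) -> None:
        # Register a subscriber that receives pushed updates to the average.
        self._subscribers.append(callback)

    def insert(self, age: int) -> None:
        old = self.average_age()
        self._ages.append(age)
        new = self.average_age()
        # Push an update only when the query result actually changed.
        if new != old:
            for callback in self._subscribers:
                callback({"old_val": old, "new_val": new})


def demo() -> List[Update]:
    table = ToyTable()
    updates: List[Update] = []
    table.changes(updates.append)
    table.insert(20)  # average goes from None to 20.0
    table.insert(40)  # average goes from 20.0 to 30.0
    table.insert(30)  # average stays 30.0, so nothing is pushed
    return updates
```

Calling demo() collects two updates, because the third insert leaves the average unchanged. A real database would do this work on the server and across machines; the sketch only illustrates the push model the application sees.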
Alexis: So listeners might be thinking, “Alright, I’ve got MongoDB or something like Cassandra.” What makes RethinkDB different in some other ways?
Slava: So RethinkDB is actually quite similar to MongoDB. So if you’ve used MongoDB, I don’t think you would be very surprised when you get started with Rethink. It’s a very similar experience, it’s very easy to get started and it’s familiar. Now, the fundamental difference in RethinkDB is this reactive scalability component. The fact that you could get an incremental real-time update to any results, or to most results, is a huge difference. That absolutely changes the way you program, because instead of writing a query and then polling it, let’s say, every five seconds, the database pushes the updates out to you. That just makes building applications dramatically different and dramatically easier.
So that’s the fundamental part, and then there’s a lot of things around the edges. So for example, RethinkDB supports distributed joins. There are very few NoSQL databases that do that, and you can use joins the way you would use them with a traditional relational database. You could use that with JSON documents, which gives people enormous flexibility in how to structure their data. And all of this comes from just our love and care for the user experience, and I don’t mean just the administrator console, I mean the whole thing, how people write applications, because developers spend 8 hours a day in this environment and we really wanted to make that pleasant for people.
So that query then gets sent to the server, compiled on the server, and then gets distributed across the distributed system of nodes where the computation happens. As a user, you don’t have to worry about any of that at all, you just get the result, and the system takes care of where is the data, how do I send a query, how do I parallelize it, how do I make it efficient. So from the user’s point of view, you could use this extremely flexible query language to write your application the way you want and then you can get a real-time stream of updates, which is just a dramatic change in how you would go about building web and mobile applications.
Alexis: You’ve already kind of answered this question, but we asked some users on Reddit, Twitter, Facebook and the like for questions for you. Could you give us some examples, use cases? They ask, “When is RethinkDB a better choice for building an app than other databases?”
Slava: So we talked a little bit about an example like Quora, where you’re looking at an answer in your browser and someone else makes a change, and you see the answer right away, so that would be just a wonderful, wonderful use case for RethinkDB. If you’re building an application like that and you want to bring such a sophisticated user experience to your customers or to your users, RethinkDB makes that really easy.
But let me give you another example of how that could work. Some RethinkDB users are using it for games, so RethinkDB is the backend store for game worlds. Imagine you have a game where you’re selling in-game items to the players, so you have this in-game economy, which is a very common kind of thing right now. It turns out, if you talk to the people who build these games, what’s really, really interesting is that if you measure demand for particular items in the game and you start changing how many items of a particular type you put into the game world, you could dramatically change, like maximize, sales, so you could quadruple your revenue just by paying attention to the in-game economy.
And if you look at how people do this traditionally, they take a snapshot of their game world, let’s say every 24 hours, they transfer it into Hadoop and then in Hadoop they do some data crunching, and then they figure out “Okay, this is how we want to modify our economy” and then they affect the game world. So if you have a functionality where your core database that the game is built on supports real-time queries and real-time update streams on these queries, you can modify this economy in real-time. So you don’t have to wait 24 hours to do this back and forth. You could just do it immediately and the queries can be quite sophisticated. So you can write a MapReduce query and then say give me an incremental set of updates to this MapReduce query.
You could write something like group users by location in the game world and figure out what they’re buying between these times, and the query can get progressively more complicated and sophisticated and then you just say .changes and you can affect your game world, you can modify your game world in real-time which is just a dramatic change in how… I guess people would call this analytics but it isn’t quite analytics because it’s merging this gap between analytics and the real-time sort of experience. RethinkDB is just great for all kinds of use cases like this. Any time you want to do any kind of real-time or reactive experience or you need results to queries in real-time, RethinkDB is great for those kinds of use cases.
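The grouped, incrementally updated query described in this answer can be sketched the same way. This is again a plain-Python toy under stated assumptions, not RethinkDB's query language: GroupedPurchases, record, changes and snapshot are hypothetical names. It keeps purchase totals grouped by game-world location and pushes each change to listeners as it happens, instead of recomputing everything from a daily snapshot.

```python
from collections import Counter, defaultdict
from typing import Callable, DefaultDict, Dict, List

Update = Dict[str, object]


class GroupedPurchases:
    """Hypothetical toy model of a grouped query with a change feed."""

    def __init__(self) -> None:
        # location -> item -> total quantity bought
        self._totals: DefaultDict[str, Counter] = defaultdict(Counter)
        self._listeners: List[Callable[[Update], None]] = []

    def changes(self, listener: Callable[[Update], None]) -> None:
        # Subscribe to incremental updates of the grouped totals.
        self._listeners.append(listener)

    def record(self, location: str, item: str, quantity: int = 1) -> None:
        # Update only the affected group, then push the new total.
        self._totals[location][item] += quantity
        update: Update = {
            "location": location,
            "item": item,
            "total": self._totals[location][item],
        }
        for listener in self._listeners:
            listener(update)

    def snapshot(self) -> Dict[str, Dict[str, int]]:
        # Full grouped result, as a one-off (non-streaming) query would return.
        return {loc: dict(items) for loc, items in self._totals.items()}


def demo_grouped():
    feed = GroupedPurchases()
    seen: List[Update] = []
    feed.changes(seen.append)
    feed.record("castle", "sword")
    feed.record("castle", "sword", 2)
    feed.record("forest", "potion")
    return seen, feed.snapshot()
```

Here each purchase touches a single group and triggers a single pushed update, which is the property that lets the game react immediately rather than waiting for a batch job.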
Alexis: How long has it been since you started RethinkDB?
Slava: We started the company in May of 2009 so it’s been about five years, a little more than five years.
Alexis: At what point did you get into Y Combinator?
Slava: May 2009. So we’ve been throwing around some ideas before that but really, like the fundamental aspects of the company came out after we joined Y Combinator, so I sort of think of it as May 2009 as the germination time for the company.
Alexis: So what’s your monetization model for RethinkDB?
Slava: Well, as I already mentioned, RethinkDB is open source and we want anybody to be able to use it. If you can’t afford to pay, it’s free software. You can download it on the internet and start using it. What happens with free software or open source software is, if something is at the core of what you’re doing, at the core of your business, and it’s sufficiently sophisticated, people traditionally need to pay for operational support. So people download RethinkDB, they pick it up, they build an application within their organization on top of the product, then they hand it off to operations. And operations are the people, basically the guys that have to wake up at 3 a.m. in case something goes wrong. They generally pay for operational support for products which are the core of their stuff, and RethinkDB is one of those products.
We haven’t actually opened up monetization publicly, so you can’t pay for RethinkDB right now, but we will be doing that very soon and that’s the fundamental revenue stream for the product. We want anybody to be able to use it, but when people need sophisticated support, they can pick up the phone and call someone, get someone who is really knowledgeable on the other end of the line who can help them out with their problem.
Alexis: Are you interested in monetization when it comes to RethinkDB as a service, like having it online and having folks spin up their own instances of RethinkDB?
Slava: Right, so you could do that right now with EC2, and we offer people various ways of doing it, but we don’t want to be a service company because it’s almost a different business altogether. If you look at the development process and actually the marketing and just how the business works of building a software company, a software that you ship to people, it’s really dramatically different in many ways from building a service company. And I don’t think you can effectively do that under one roof without significantly splitting focus. So I don’t think we’re going to do the service thing for a while, but there have been a couple of services that were built by our community members and that’s only going to get better. So we don’t quite outsource it, but it’s just something that gets built in the ecosystem around the product by other people, and we’re perfectly happy for that to happen.
Alexis: I usually ask folks about what they’ve learned about pricing but when it comes to services, when it comes to support, it’s a bit harder to answer and nail down. You don’t have set prices, I guess, everything is pretty fluid, but have you learned anything that might be applicable to other folks who are providing support for their software?
Slava: Yeah, definitely. This particular advice I don’t think would apply to products that you sell to consumers, but when you sell products to businesses, for beginners at least if it’s the first business you’ve started it’s very hard to estimate the value of the product for other people from their point of view, so what people just end up doing is they say “Well it costs us this much money to develop this and this much money to support it” and they figure out how much it costs them to build the product, and then they mark it up by some amount so they make some money by let’s say 30 percent or 200 percent or whatever their margins are, and then they go out and charge people that.
But it turns out that if you build a really valuable product for people, you can charge 10 times as much or a hundred times as much and people will be perfectly happy to pay that, and they’ll feel like they’re getting a lot of value out of it because it’s really important to them. I found that it’s really hard for people to estimate that early on, especially because when companies get big, the kind of fundamental assumptions change. So for example, there are companies where time is way more important than money for them, like they have a lot of cash, but they need to get to market quicker, and they’re perfectly happy to trade their cash for time. When you’re selling a product, you generally don’t think about it that way, right? You just think about how to get your startup off the ground or something.
So that would be the lesson I would tell people, is to make sure to sit down with the customer and try to figure out, and you can just ask people, they’ll be happy to tell you. You could ask, “How valuable is this to you? How much money would you be willing to pay?” And people are kind of afraid to do it or sometimes they think, “Well, the other party has the incentive to give you a low-ball number,” but in practice I think if you sit down with these people and they are honest and genuine and you’re offering a good product, people are happy to pay way more than you usually think they would.
Alexis: How did you spread the word about RethinkDB in the early days?
Slava: So, I think open source has helped a lot, though you can’t just open source a product and expect people will know about it. It doesn’t quite work that way. But open source has been the core of our beliefs. We really care about it. Everyone who works at RethinkDB, we all love open source and we’ve used open source software our entire lives, so it’s kind of a fundamental part of our culture. When you do that, it’s not just that the software is somewhere on some FTP server or something where you could download it. The development process is set up, the issue tracker is set up so any user can come in and comment on a feature or a feature request or a bug, or when we have technical discussions about features, anybody could come in and comment. So we always think of ourselves as just a part of the RethinkDB ecosystem, we just happen to get paid for doing it.
So the product, it’s not just the source code. When we say it’s free or open source, the whole thing is fundamentally open, like the development process is open, the company is open, and we really care about maintaining that kind of culture. I think when you do that and you use social media like Twitter, people begin to really identify with the product, they care about making it good because they know they will be heard, and then people start organizing meetups or they write about it themselves, and we can help them out by sending them information on how to do it and little gifts, things like that. That kind of thing works really, really well because basically it’s word of mouth: people really care about the product and you just have to help them out a little bit.
Alexis: So how do you get devs to try a new database and more importantly not just try it but push it into production?
Slava: People try new things all the time. It’s actually a really nice aspect of our users because developers love to tinker, and when people first pick up RethinkDB, they might not necessarily think, “Oh, I’m going to build this big project and put it in production.” They start out saying, “I want to build a weekend app,” and the product is really easy to get started with, so they try it out, they build some app, and then they just absolutely fall in love with the product. Then they go to their organization and, when something new gets built, they say, “I’ve used this amazing product. It’s a lot of fun to use. It’s really easy and it solves a lot of problems,” and then people look at it and that’s how things get adopted.
This isn’t a new idea, that hobbyists can often drive markets and they can make a tremendous difference. So that was the premise of RethinkDB, it was all bottom-up growth. We really wanted to make it just absolutely amazing for people who are tinkering and building apps, and then from there, they just go out and spread the word, and if the product is valuable, then people will pick it up.
Alexis: You have any examples of who uses RethinkDB that you could share with us?
Slava: We’re actually going to announce this pretty soon so I don’t want to share specific names of companies. We’re going to announce pricing and things like that in the next couple of months, so we’re going to do all of that together. I’ve been very surprised, because we originally built it for web developers and web applications, but it’s been used by municipalities and by certain federal agencies, it’s been used in the financial industry, it’s been used in biotech where people do DNA analysis, and of course it’s used in web and mobile. It’s very exciting because it turned out that the product is just really, really horizontal. You could apply it almost anytime you build something with data or for the internet, which at this point is almost anybody who is building software.
Alexis: What have you learned from interacting with the community over these years?
Slava: There have been, I think, a lot of really interesting lessons. The main one is that what you think people care about isn’t necessarily what people will actually care about, and that’s really cool because for us, once we got the development process on GitHub and people came in and started commenting, we realized, “Oh man, their problems are not the same thing as we think their problems are.” So you really have to go out and talk to people and measure things. I think people who build products are really good at coming up with the initial germination or initial vision for what it’s going to be, but then your users and your community are really, really good for refining it.
And it’s always this two-step process where you build something but then you really need the real world to refine it. So that was the first lesson, just what people care about is not necessarily the same thing that you think they care about. The second thing we learned that we felt was very exciting is, and it’s kind of interesting maybe unfortunate in some ways, is that like how hard something is to build is completely uncorrelated to how valuable people will think it will be. So you could do something really simple and people will perceive it as something extremely valuable, or maybe they’ll perceive it as something extremely difficult.
I’ll give you one example of this. One example is the backup tool for RethinkDB. So RethinkDB of course allows you to export and back up data, and people cared about this a lot, they always thought it was really important and it was a missing piece. This was a while back, and we couldn’t quite understand why that is, because RethinkDB supports this flexible query language and it takes 10 minutes to write a script that will let you back up your data. But somehow people felt that this feature was missing, and then we built out this backup command, and you can say RethinkDB backup and it will back up your data and you can restore it back.
It wasn’t quite 10 minutes of work. I’m of course oversimplifying because it had to deal with failures and it had to be efficient, parallelized, all that stuff, so maybe it took a week of work or so, but it still seemed really, really simple. But people thought of it as a really valuable tool, and with this we realized, “Oh, you could spend 20 percent of the time that you would think you’d have to spend and people will find it immensely valuable.” So I think that was the second biggest lesson, and really all of this is just one big point: you have to listen to your users really, really carefully and find out what they care about, because when you build anything, creative people generally really care about the thing that they’re building. It’s a very emotional thing.
They love the creative process and they love the software, but I think in reality, it’s not about you. It’s about the people who are using the things that you’re building. So you have to love your community more than you love your product, if that makes sense. You have to really, really sit down and care about how people perceive it and what problems you’re solving for them and how they think about it, which is totally unobvious and it’s kind of a platitude. It’s almost like when people say you have to eat right and exercise: it sounds really easy but it’s really hard to do in practice.
Alexis: Right. Continuing this community thread…. A lot of open source projects, when it comes to hiring, they look towards their contributors particularly the very active ones. Is this something that you all do and what qualities do you look for when hiring other than just contributions?
Slava: But there’s this qualitative jump from there to the core server and the distributed infrastructure, and because that’s been changing pretty rapidly and there is a large amount of internal knowledge and internal muscle memory built up around it, it isn’t well documented and it isn’t easy for people to just get into it from the outside. You really have to spend a lot of time to understand how the system works, and we just haven’t quite gotten it to the point where that’s easy to do because it hasn’t necessarily been a priority. So we did hire contributors from the community when they contributed to the drivers, but it’s different. Again, it’s not the same thing as the core database. So that would be the first question.
The second one was what qualities to look for. So we never look for people with database experience or anything like that, we really look for just traditional things. You have to have the passion for what the group is building. If you care about building an amazing database experience for your users and you’re hiring someone who cares about functional programming languages, that doesn’t necessarily look good because they kind of have a different agenda, they want a different thing. So the number one thing we care about is: do people want to build this amazing database experience for their users? Are they passionate about it, are they curious about it? Do they have a lot of energy around it, is this something they care about?
And then once people have that, we just look at very traditional computer science knowledge like algorithms, knowing the tool chain, being able to just code out solutions to problems, things like that. I don’t think it’s very different from any traditional interview process you’d see at Google or Facebook or any other high-tech company.
Alexis: Okay. Returning to funding for a little while, you all have raised more than 10 million dollars, what kind of wisdom could you impart to folks who are beginning to walk down the venture capital road?
Slava: So I think you can raise money in two ways. Actually three ways. You could raise money on reputation because you’ve done something amazing before, and then people just respect you, they trust you, they know that if they give you capital, you’re going to go and do something amazing or at least attempt to do something amazing. You could raise money because your product is really taking off and you’ve built something great already, and you get a lot of users, a lot of growth, and then people want to invest in that business. And you could raise money just on an idea because you find someone who absolutely falls in love with your idea and just wants to fund it.
And the third one is the hardest because investors fundamentally… So again, you have to look at it from their point of view and their business. They get money from their limited partners and it’s a financial instrument, so they have to return money to their investors and they really care about businesses that are succeeding. When it comes to venture capital, I think it can be immensely useful just from the point of view of getting capital to grow your business, and getting advice from investors on how to build the business is super useful, but I think again I’d focus on your users. I think if you build something amazing, everything else is almost an afterthought, that’s just going to happen. But if you start focusing on venture capital and saying “How do I raise money to build my business,” that typically doesn’t work so well.
And I think right now, development has gotten way easier and way cheaper, and you can always find ways to build something that people love on very little capital. So I would go after that first. I’d try to figure out what is the most important company I could start or the most important product I could build, and once you do that, if you really do that, I think venture capital is just something that follows. It’s much more about figuring out “How do I pick a good startup idea? How do I actually go about things like that?” I’d focus on that rather than venture capital itself.
It’s almost like actually if you ever read Paul Graham’s essay on The Python Paradox, he said, “You want to hire people who are learning programming languages because they care about the programming languages and paradoxically they are learning these but they can’t get jobs doing it,” so the paradox is it’s easier to get a job if you learn something where there are not very many job offers for that language.
I think it’s something similar with companies. You want to build a company because you really care about the users even though there may not necessarily be very much venture capital excitement around it. Because if there is a lot of excitement around it, it means it’s probably already been built, and if you build the company like that then everything else follows.
Alexis: In your past five years at RethinkDB, what’s one mistake that you’d rather not repeat?
Slava: Oh man.
Alexis: We’ve got time for several if you want.
Slava: Let me think about this for a second. So I think the most important thing that I’ve learned is that… after building RethinkDB for a while, I really believe in the idea of efficient markets, and efficient markets for ideas to be specific. And what I mean by that is, if you look at how people start companies or how they build features or how they do anything, creative people, they typically look at an idea and they say, “Oh, I’m going to build X, and X doesn’t exist now, and I’m going to go and try to do that.” I think that’s just a fundamental thing that creative people do, like they look at the world and they say, “Something doesn’t exist, I’m going to go build that and I think it’s going to be successful.” I think it’s really important to try and find out why it doesn’t exist because nine times out of 10, there are fundamental structural reasons for why the world is the way it is. Does that make sense? I’m not sure…
Alexis: Absolutely, yeah. It sounds like it’s part of the saying, “Don’t build it if it isn’t useful or if it’s not really needed.”
Slava: Yeah. So once you realize, once you start asking these questions, say, “Why doesn’t it exist?” Basically 90 percent of ideas turn out to be not worth building. And they’re worth building if you’re doing it as a hobby, right? You have to be honest with yourself, “Am I doing it because it’s fun or am I doing this because it’s useful to people?”
So I think it’s really, really important. So then you get to this one idea out of ten where you find out, “Okay, it doesn’t exist for this structural reason and I think I can overcome this structural reason and it will be very valuable” and then you can go on and do it. So that’s a mistake that we’ve made I think quite a bit. We’d build a feature and we’d say, “Oh man, this feature will be super cool for people,” and then we’d discover it doesn’t exist for a particular reason, like there’s a reason why it doesn’t work that way. But once you find this one thing out of ten where the reason is something that’s changed in the world, like there has been a fundamental change that happened and now the old reason is no longer valid, then you’ve got something really, really valuable.
Because people, like human beings are really good at internalizing answers to things, they’re really good at internalizing cultures, and they’re really bad at reevaluating these things and reconsidering them. So when you look at the world and you say, “Okay, there has been this fundamental change.” Something about the world changed, like more people on the internet for example. What does that mean for everything we already know, for everything we believe, and then you go back from there and examine these beliefs and you don’t even know what to examine. You very often don’t even know you have certain beliefs.
Alexis: Yeah. The fish doesn’t know it’s in water, yeah.
Slava: Yeah, exactly, and that’s the hardest thing to do and we screw that up probably more than anyone else at the beginning, but I think that is extremely useful so I wouldn’t… the one mistake is don’t assume that something doesn’t exist because no one tried it. Usually, at least ten teams tried it and there was a reason why it’s not there, so you really have to stumble on that one thing that happened because the world has changed in some fundamental way and people haven’t realized it yet.
Alexis: That’s a very anthropological way of looking at things.
Slava: I guess so.
Alexis: Do you have any tips for how to kind of separate yourself from these assumptions that you’ve already made without knowing that you’ve made them?
Slava: So that’s really hard. It’s really difficult. I’ve kind of been thinking about that a lot and I haven’t quite puzzled it out. I’d really love to write a blog post on it at some point, and I tried a couple of times and I haven’t come up with anything like pragmatic and useful. So one thing I can think of is you could look at trends, like you could look at companies that are still small but are looking like they’re growing quickly, and then you could ask yourself, “Okay, what would it mean if this company takes over the world? What is the next best thing to build?”
So for example, when GitHub got started, there were all these other companies saying “Oh, it’s really cool and valuable to build a hosting and collaboration service around the source control system.” So Bitbucket went out and did it for Mercurial, I think, and there were a lot of others. So what happens is people start to emulate. Actually, for all the Bitbucket fans, it’s not at all implausible that Bitbucket was there first so don’t kill me for this. But I hope you see the bigger sort of point.
So what happens is when something is beginning to succeed, people start to emulate it and they think they can outcompete it. I think it’s much more valuable to say, “Okay, this thing is succeeding. Suppose it just takes over the world, what is the next thing you would build?” And then you go and build that. And by the time that company has succeeded, like you’ve got something really valuable because you’ve sort of looked at the assumptions differently. Does that make sense?
Alexis: Yeah. Absolutely, yeah.
Slava: So that would be one way of doing it. I would like to think of some more but I can’t think of any off the top of my head.
Alexis: Well, so email me when you got that blog post written. On the flipside, instead of focusing on the negative things, what’s one decision that you’re particularly proud of?
Slava: I’m really, really proud of the team that we built here. I’m really, really proud of the product itself, and it’s really been this creative kind of collaboration of different kinds of people in the company that has worked phenomenally well. So for example, I’m a systems person and a programming language person, I really love programming languages and I ended up just kind of inadvertently hiring other people who care about that. And my co-founder is someone who really cares about the user experience and he’s taught me a tremendous amount about it and really, like, inadvertently hired people who cared about that.
And then there are people in the company who care about security, who care about just various other aspects, performance, things like that, so when you take people who care about different things deeply and you put them in a room together and there’s this battlefield of ideas in a very creative, constructive way, then I think something wonderful happens and you build something real and amazing. So I’m super proud of the team we put together and how people work together, how respectful they are, and at the same time how critical they are with each other’s ideas. I think that’s kind of been the key to building a really pleasant and useful product for people.
Alexis: Speaking of the team, how large has it grown?
Slava: We are 17 people right now.
Alexis: Wow. Are you all distributed or located in one office?
Slava: Right now, we’re all local.
Alexis: Okay, was that in opposition to a distributed team or was it just a fact that you just prefer being local?
Slava: So I think it depends on what you are building, and there is sort of a lot of conventional wisdom now being questioned around local teams versus distributed teams and people talk about collaboration a lot and how some companies are entirely distributed. So for me, that hasn’t actually worked because if you don’t have, human beings are very much, I mean, we’re still human beings, right? We’ve evolved in a certain way and when you put two people in a room, the kind of creative spark you get out of that is not the same thing as when people are in different cities. I think geography still matters to us immensely.
It’s been very deliberate. I think for a product like this where you require just an immense amount of collaboration between people, it would be very hard to get something of this quality if people were distributed. And I’m not saying it wouldn’t work for other companies or other projects. It’s just not something that would work particularly well for us.
Alexis: Now, are you sure you studied computer science and you didn’t sit in a few anthropology or sociology classes because there’s a very humanistic analysis and consideration to some of your questions.
Slava: I learned all of that here in the process of building RethinkDB and making mistakes. Yeah, I’m definitely an engineer at heart but then software is this very careful and creative mix of engineering and people. You have to care about engineering and you have to care about the people who are building it, funding it, using it, who are writing about it, talking about it, so really, I’ve learned to care about that. And actually, you used the word anthropology a couple of times. I really think what happens is if you take an engineer like a scientist, someone who measures things, and you put him in a world of people and he has to succeed in that world but he knows nothing about it, well he’s going to fall back on what he knows. He’s going to observe them and create hypotheses and test them, and that’s really what I’ve done.
Alexis: Applied anthropology, okay. So what’s the biggest opportunity that you see now in open source?
Slava: I think there have been a lot of companies built, like Microsoft or Oracle, around closed source software that built infrastructure, and I don’t think that can happen again. So early adopters basically, like, made open source just so much the prerequisite. If you’re going to build infrastructure software and developers are a fundamental part of your audience, you have to make it open source. I think that there will be a lot more infrastructure companies built around it, and the reason why I say infrastructure is because if you build client-side tools, you can’t make money from it. People will just download it and use it for free and people won’t really pay for support because it’s not a fundamental part of your business. It’s something that you kind of use to build as opposed to the core of it.
So I think there will be a lot of really interesting infrastructure companies around it, and you can look at Docker, CoreOS and they’re building just fundamental pieces of infrastructure, just rethinking all of the old assumptions again. Before, like Red Hat has been the open source company offering operating systems, and Canonical was another one. And now, enough has changed in the world that you could go out and build a new operating system that does things differently and CoreOS is doing that. Same thing for Docker and VMware or Xen.
So I think there’s a lot of opportunities to look at infrastructure and say, “How has the world changed where old assumptions don’t apply?” and then go build that. I think there will be a lot of open source companies doing it. So that’s one thing.
Another aspect of this is that Canonical I think used to be like this leading open source company that everyone would look to for community values and just propagating open source, the idea of open source and philosophy of open source.
Alexis: The shining beacon on the hill for open source.
Slava: That’s right, and I think they’ve lost their way a little bit. When I ask people “If you could think of one company that does that, who can you think of?” and people don’t really name Canonical anymore. So Mozilla is doing that now to a large degree but Mozilla isn’t quite a business in the same way. They have a different mission and it’s amazing, but I think there’s an opportunity for another company to arise as the shining beacon of open source and really return to these philosophical and human values of what open source represents.
Alexis: Now this changes from situation to situation but in general, what are some ways that you consider to be the best way to sustain open source?
Slava: Do you mean an open source business, the company or the movement in general?
Alexis: Both but in specific, I was more asking for a person who has an open source project and they’d love to make a living on it or help that support.
Slava: So open source is really interesting because from a business point of view, it’s actually, unless you’re doing things very specifically, it’s not necessarily a really good way of making money. So I’ll give you an example. Traditionally, if you talk to economists, and we mentioned anthropology and now we’re switching to economics a little bit and I really care about it too, but if you talk to economists, they would generally say that money is a really good way and probably the only way to measure value.
Traditionally that has been really true. When you build a product or you build a service and you figure out how much money that service is making, well you’ve just figured out how valuable it is to people. But I think again something changed in the world where this isn’t specifically true and one good example of this is Wikipedia. Wikipedia does not really make money but just how valuable is that for humanity, like if you took Wikipedia away, I think we’d just be set back so tremendously. So there are things that are getting built now that are extremely valuable but we can’t charge for it for various reasons. And I think open source, for a lot of open source projects, there’s something similar going on.
Again you have to be really honest with yourself, “Am I in it for the money or am I in it because I really care about the things that I’m building and it’s a hobby for me,” or a philosophical reason or whatever reasons people have. So if you’re in it for the money, I think starting a hobbyist open source project is probably not the best way of doing it.
But if you have started an open source project and you really care about it, you can do it I think in two ways. So if it’s a client-side tool, you really can’t make a business out of it. You have to just ask people for donations and you have to structure it as a non-profit and use Kickstarter or use social media and say, “Hey, we need that much money to fund this product for the next year. If you care about it, please donate.”
And people love doing that and you can give them incentives too. So again back to economics, incentives work really well so you could sell mugs or sell t-shirts, like sell little kind of trinkets that exude the philosophy of your project and mark them up and be honest about it and people will be super happy to do that. So that’s for client-side stuff, but if you’re building infrastructure, you could charge people for support and they will be happy to do that, but that’s hard to do for an individual contributor because you have to work on the projects and then it basically becomes consulting.
Alexis: A pair of questions from the mailbag before we wrap things up. One Reddit user says, “Has your approach to working with large code bases or architecture changed over the course of working on the RethinkDB project? Any stylistic best practice or architectural ideas that you’ve adopted or rescinded?”
Slava: Yes, the process has changed dramatically. So one thing I’ve realized, and this is kind of similar to the question you asked before about what we’ve learned. So when we got started, we did it under the premise like, “Hey, there are so many papers in academia about really interesting ideas so let’s go out and implement some of them and see if they’re valuable to people.” But it turns out that once you get to a big system, you can’t really do that because there’s so much just engineering low-hanging fruit, there is so much work to do. You really have to keep things like super simple and if you’re making something complicated, you better have a good reason for why you’re making it complicated because it’s probably not going to work.
And this is kind of the philosophy that Linus adopted for the Linux kernel, like there are so many research papers on how to improve the Linux kernel and so many experimental algorithms and none of them make it into the kernel because they’re just too complicated for people to understand. So that’s one thing that we’ve learned, just keep things super, super simple, like once the code base gets big, you just can’t afford to put complicated things around it. You simply can’t. You have to keep things simple. And I think that’s what a lot of people that are writing papers, so you could write papers because it’s interesting and that’s great, you’re just building up knowledge for humanity, but you have to understand that you typically just can’t put that into a big code base. Empirically it doesn’t work. It’s too hard for people to do.
So that’s lesson number one. Lesson number two is code reviews. This is again like diet and exercise, like everyone knows you’re supposed to do it but very few people actually do it.
Alexis: And flossing, you can’t forget flossing.
Slava: That’s right, and floss. So halfway in, we implemented a code review policy and we also learned that there are some things like flossing that you can’t do halfway, like you either do it every day or you may as well not do it at all. We code review absolutely every single commit. And code reviews are amazing for two reasons. One is quality and two is dissemination of knowledge among the team because knowledge is not just one person who knows how this piece of code works but it’s at least two people, and that worked phenomenally well, and also that helps to keep things simple because if you built something really complicated, the reviewer isn’t going to understand it and they’re going to force you to simplify it. So code reviews have been just like absolutely amazing for this kind of thing.
Alexis: And it keeps everybody intimately familiar with the code, yeah?
Slava: Yep, yep. It really, really helps because you’ve got more than one person looking at a piece of code, more than one person knows how this works. If I sit alone in a room and think of something, like we really need other human beings to test ideas and refine ideas. Just like a product, you need users to refine the ideas for your product. For code, you need other people to look at it to refine your ideas and your code base or a feature or whatever or the class that you’ve built, so code reviews have been just incredibly useful and I encourage everyone. Like if you’re not doing code review on every commit and you’re doubting like, “Man, is this worth spending the time?” like the answer is yes, it’s worth it.
Alexis: Alright, now, wrapping up the mailbag here, these are the last two questions from listeners out there. What’s next for RethinkDB?
Slava: So we’ve been working on the product for quite a while and RethinkDB has a big surface area. There’s just like a lot to do, right? There’s the storage engine and the distributed system around it and the client drivers and the administrative UI, there’s a lot to do and we’ve been doing it for a while and we’ve been working super hard on polishing the product and making it absolutely the best product there is.
So right now, it’s gotten to the point where the product is good enough that it’s extremely useful to people and people like absolutely love it. So now what I’m working on is just I just want to go out to the world and get as many people to know about it as possible, and I just tell people exactly what it is, how it was built, what it’s for, what it will do for them, and I think you’ll be hearing about RethinkDB a lot more because I switched my personal focus to doing that. And I’m very excited about it because for the users of the product, it’s very useful because the more people know about it, the more stuff gets built, the more resources there are around it, so I think the community is about to get a lot. It’s already very vibrant and it’s very intimate, it’s great to be a member of it, but I think it’s going to get a lot more vibrant pretty soon. So that’s from just the community point of view.
From a features point of view, there are a couple of things that I’m really, really, really excited about. I can’t wait. I’ve been playing with prototypes and it’s absolutely amazing, so I’ll tell you what those are. We’re shipping geospatial queries and geospatial indexes hopefully next week.
Alexis: This will be very neat.
Slava: It’s very neat and it’s very exciting. It’s just absolutely awesome to play with. It’s very visual and it’s very useful. If you’re building anything with maps or locations, it’s just awesome, so I’m very excited about it, and a lot of users have been asking for it for a long time, we just took the time to do it right. The thing that’s shipping after that, it’s going to take a little bit of time but what I’ve been playing with is the current version and it’s awesome, is the cluster management and the administration API, and just to give you very briefly just what this is. So when you are using RethinkDB right now and you want to shard your database or shard your table, it’s very easy to do. You could say I want 3 shards and the cluster will automatically partition everything or add replicas or whatever. It will just work and it’s great. But right now it doesn’t have very much visibility. So it’s not clear to people how that happens internally. It’s pretty hard to understand and you can’t easily make programmatic changes. You either have to do it by the web UI or script it with a command-line tool.
So we’re integrating all of these into the ReQL query language so you’ll be able to write queries that manipulate your cluster and look at its status. And we’ve put an enormous amount of effort into taking like these super complicated aspects of the distributed system and making it just dramatically simple for people. So the API is probably going to be the easiest thing to use or as easy as anything else in RethinkDB, but there’s this super complicated thing underneath and we’ve worked a lot, it’s like a crucible where you take these complicated concepts and make them simpler and simpler and simpler until the thing is really beautiful. So I think it will be super useful to our users because a lot of people have been running clusters of RethinkDB and have been working around these issues, so it’s just useful pragmatically for programmatically changing clusters and monitoring what they look like, monitoring performance, and it’s also really, really beautiful from the distributed system point of view and making it simple to understand, so I think people are going to love these two features and I’m super excited about the upcoming two releases.
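To make the “I want 3 shards” idea above concrete, here is a toy Python sketch of range-based partitioning, where a table’s keys are split into contiguous key ranges. It is an illustration under stated assumptions only, not RethinkDB’s implementation or API: the function names `split_points`, `shard_for`, and `partition` are hypothetical, and a real cluster also has to handle replication, rebalancing, and failures.

```python
from bisect import bisect_right

def split_points(keys, shards):
    """Pick shards-1 boundary keys so each shard gets a roughly equal share.

    Hypothetical helper for illustration; not a RethinkDB function.
    """
    ordered = sorted(keys)
    return [ordered[len(ordered) * i // shards] for i in range(1, shards)]

def shard_for(key, boundaries):
    """Return the index of the shard whose key range contains `key`."""
    # Keys below the first boundary go to shard 0; a key equal to a
    # boundary belongs to the shard that starts at that boundary.
    return bisect_right(boundaries, key)

def partition(keys, shards):
    """Split `keys` into `shards` contiguous ranges and return the layout."""
    boundaries = split_points(keys, shards)
    layout = [[] for _ in range(shards)]
    for key in sorted(keys):
        layout[shard_for(key, boundaries)].append(key)
    return boundaries, layout

keys = ["alice", "bob", "carol", "dave", "erin", "frank"]
boundaries, layout = partition(keys, 3)
for index, shard in enumerate(layout):
    print(f"shard {index}: {shard}")
```

The visibility Slava mentions corresponds to being able to inspect things like `boundaries` and `layout` from a query rather than only through a web UI.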
Alexis: One question I ask everybody on the show is what is your text editor of choice?
Slava: Emacs. No doubt about it. I do everything in Emacs.
Alexis: Which text editor do you think is in the lead so far?
Slava: In the lead, I think it’s…
Alexis: Among I should say our guests.
Slava: Well, I think in our company it’s vi. It’s about three quarters vi. The rest is Emacs and a couple of other editors that people use. I think in the Linux world and the Unix world, it’s undoubtedly vi but it’s not going to make me switch. You can pry Emacs from my cold, dead hands.
Alexis: Mitchell Hashimoto was a switch-over from Emacs to Vim and now Sublime Text which, not an official count, it seems to be winning among our guests.
Alexis: Sublime, yep.
Slava: Oh really? Cool. I didn’t know that. So actually just for the record, I use vi for like one-off tasks and stuff and I love that editor too but like I spend 90 percent of my time, it’s split between Emacs and Gmail.
Alexis: Alright. And one last surprise question unique to you. I noticed on your website, you linked to a clip of Star Trek: The Next Generation. I was wondering why you chose that scene and if you could explain to the listeners which one it is.
Slava: Oh are you talking about the Commander Data?
Slava: Yeah. So I love Star Trek and we could probably spend another hour just talking about that show. I think it’s absolutely amazing. By the way, I think we need a new Star Trek show that reexamines the world and does Star Trek for the new world and I think JJ Abrams has done a pretty good job with the movies. As a Star Trek fan I’m not entirely happy about it but at least we’ve got something and the mainstream audience cares about it. But I think Star Trek is amazing and I know it’s not perfect and it’s campy and I get all that but I think it envisions a society that would be great if we all kind of try to move towards. So I love the show.
You mentioned anthropology and they do this really well, like they look at humanity and they’ll say “Well, if humans act like this in this situation…” It’s almost like you’re watching National Geographic, like it’s about yourself, it’s great. So I love that show, and the specific clip you’re referring to is where Commander Data, who is this android, he’s an artificial being. He meets his creator and he asks his creator like “Hey, why did you create me?” The creator goes, “Why does a boxer box or why does a painter paint?”
And the reason why I picked that clip is because people often ask, “Why did you build RethinkDB?” And there are all these answers like yeah, you know, we let people build and scale reactive apps and we give them a really pleasant experience and we give them distributed joins and we let them build the stuff and we give them all the stuff, and all of that is certainly true and I really care about it. But if you really start peeling that onion and get to the core of it, I think the core of it is like it’s basically that. Why did I do this? Well because I don’t know what else I would do. It’s just who I am, so that’s why I picked that clip.
Alexis: Well, I think that is a fantastic note to end the interview on. If folks would like to dive into RethinkDB, where can they find more information?
Slava: Please go to RethinkDB.com. Everything’s there. It’s very easy. You can download the product, there are tutorials so you could get started in just a couple of minutes. If you have any questions, go to GitHub, search for RethinkDB, or you could look up @RethinkDB on Twitter and just ping us on Twitter, we’ll answer right away, so there are many ways to engage with the community. Try the product, let us know what you think, build a couple of apps on it. We’re super excited about it and we’d love for you to try it.
Alexis: And where can we stalk you on Twitter?
Slava: So I am @RethinkDB all the time. My personal Twitter account is @spakhm which is my first name and last name but really @RethinkDB on Twitter, I always check that stream, so if you ask anything, I’ll be there to answer.
Alexis: And for us, you can find us @binpress or binpress.com and you can find myself @AlexisSantos. And before the listeners shuffle on to the rest of their schedule, the rest of their podcast listening time, please I’d like to petition you to check us out on iTunes. Subscribe if you haven’t already and give us a rating. We prefer honesty but five stars would be very nice, and help spread the word. Send out a tweet or something about the podcast.
Well Slava, it has been a pleasure.
Slava: Alexis, thank you very much. That was a very fun interview.
Slava: Thanks for having me.
Author: Alexis Santos