Binpress Podcast Episode 16: Shay Banon of Elasticsearch
This week we talk with Shay Banon, co-founder and CTO of Elasticsearch, the open source distributed search engine. Shay covers how a cooking app birthed Elasticsearch, why open source businesses are better based on runtimes than libraries, and much more. He also discusses the importance of an open product roadmap, how to make a distributed team feel connected, and how the virtues of The Big Lebowski apply to open source.
- Shay Banon: Website, Twitter, Github
- Elasticsearch: Website, Twitter, Github
- Logstash: Website, Twitter, Github
- Kibana: Website
- Marvel: Website
- Marketing to Developers Event
Alexis: Shay, thank you for coming on the show!
Shay: Thanks for having me!
Alexis: Not a problem! Now people probably know about Elasticsearch. I mean, if you pull a developer off the street in San Francisco and ask them, “Hey, who’s Shay Banon and what is Elasticsearch?” They might know. But folks listening to the podcast, maybe they’re a mobile developer and they’re not quite so sure. So tell us a little bit about yourself.
Shay: Yeah, sure. My name is Shay Banon. I’m the one who wrote the first few lines of code of Elasticsearch itself. I’ve been doing it for a few years until I started a company with a few other founders around it, so now we also have Elasticsearch, the company.
Elasticsearch, the company itself, doesn’t only do Elasticsearch, the open source project. We also have Logstash and Kibana, which I can talk about also as well as other projects. Elasticsearch itself, the product, effectively it’s an open source, distributed search engine that also happens to do analytics pretty well in real-time.
Alexis: So before we get into Elasticsearch and the nitty-gritty about that, and Kibana and Logstash, tell us a bit about yourself. How did you get started when you first started programming?
Shay: Yeah, sure. Well, I’m not one of those guys that started to program when I was six or something along those lines.
Alexis: It was seven, right?
Shay: [Chuckles] More like 18. At high school, I did the math, physics and – the classes that we had during high school were mostly electronics, so I did electronics. When I enrolled to university I was actually going to the electronics track. I was on the fence – whether I wanted to study electronics or computer science, to be honest. I did some computer science before.
The decision was seemingly simple. One weekend, I opened up the newspaper and counted the job ads for computer software engineers and electronics engineers, and computer software won so I changed my major to computer science. I’m pretty happy that I did that [chuckles].
Alexis: So when you first started programming, what kind of stuff were you into? Were you always into web stuff and search, or was it something else? Maybe video games, as many people start out with.
Shay: Well I’ve been around programming for almost 20 years now, so I’ve been around for quite some time. When I was studying, I think in the second semester, I started to work full-time as well – programming, effectively. I was effectively working on real-time systems – embedded real-time systems – effectively building the heads-up display instrumentation for helicopter pilots.
Alexis: Oh, wow.
Shay: Yeah, so helicopter pilots would put on their night vision goggles and they can’t look down in order to watch the helicopter instruments, so we built a system that injected that data directly into the night vision goggles.
Alexis: So Google Glass is nothing for you [chuckles].
Shay: [Chuckling] Yeah. So that was very, very exciting, obviously. I got to program then see real-time systems – it’s another layer of complication compared to other type of apps, obviously; the importance of injecting the right instrument at the right time can mean life and death sometimes. That was super, super exciting – to work across the world, helping integrate those systems. I’ve learned tons out of it.
Alexis: Now that sounds like a very closed source system, since you don’t necessarily want your code that you’re working on to be open, just, “Let’s see – how can we jam this guy’s heads-up display?”
Shay: Mm-hm, indeed.
Alexis: So how did you make the shift from that to open source? What was your first taste of open source?
Shay: Well after I got married, my wife changed her career and wanted to try to become a chef, so she went to London to study to become a chef at the Cordon Bleu and I followed her. When I lived in London, I didn’t have any job, so I was trying to make myself attractive to the job market [chuckles]. I started to learn – I was doing Java for quite some time before but I was starting to get used to things that I didn’t use before. Open source projects, mainly, things like Spring framework was just happening, Hibernate – things along those lines.
I used that opportunity to try and build software for my wife so she can manage all the knowledge that she had when she was going through all of her studies.
Alexis: How nice!
Shay: Yeah, so I started to build what I named iCook, which was effectively an app that would manage all of her knowledge around cooking and culinary and what have you. Obviously I overdesigned the hell out of it because [chuckling] I wanted to learn Spring and I wanted to learn Hibernate and I wanted to learn tons of other stuff.
But at the core of it, I wanted to have this simple search box where she could just start to type something and find whatever relevant information that she had. Obviously, at the time, Lucene was pretty new still but it was even then the best open source information retrieval library out there, so I immediately started to use it as well; I really got into it. The interesting bit is that I saw that I was using tools that helped map a business domain model to another representation, namely –. For example, Hibernate was an ORM tool, right? Mapping objects to relational databases. And I thought that it would be interesting to have a project that will help map objects to Lucene itself.
So I started to work on that as a side project and it ended up being, I thought, to be very beneficial so I open sourced it. I’ve been doing it for years; it proved to be super, super successful, very usable for users, and that’s how I got into open source [chuckles].
Alexis: And so that was Compass, right, which eventually morphed into Elasticsearch?
Shay: Indeed. Yes, exactly.
Alexis: Talk about a project that grows out of something – the project that you’re working on now that grows out of a much smaller project. Whoever would’ve imagined that it would’ve come out of a cooking app?
Shay: I certainly didn’t [chuckles].
Alexis: So at what point did you realize, “Hold on a second – this thing could actually be a business”?
Shay: Well I think it was mainly around Elasticsearch. I didn’t really think that Compass can be a business, to be completely honest. If you try to build a business around open source, I think it’s important that it will have some characteristics around it.
For example, it’s important that open source will be a runtime. Operations people should see it. Compass was a library, so people would just embed it in their applications, so you can’t really build a business around something like that.
When Compass morphed into Elasticsearch, that process was also very, very interesting. I realized, through Compass, how search can be such a vital component in applications. The fact that Compass allowed users to map any domain model into search allowed me to be in the midst of talking to users about how they’re mapping domain models that I would never think about into search because it was such a great exploration tool, right?
And that just made me realize how powerful search can be. If you looked at search at the time, people would look at it as – I called it just search, like just do full-text search or something along those lines. Compass helped me realize how much search can be more than “just search.”
Together with that, I was working at the time with a company called GigaSpaces, doing distributed in-memory data grids. I saw trends of people having more and more data, and they were struggling, “how the hell do we make sense of it?” Obviously, it didn’t take me a long time to realize that search is a great way to explore your massive amounts of data.
It took me – I think I was thinking about Elasticsearch close to a year before I mustered the courage to write the first line of code, to be honest. I learned through Compass that if you want to have an open source project that really succeeds, you have to invest a lot in it – weekends, nights, mailing lists, IRC, everything that comes with it; it’s effectively a second job. You have to make conscious decisions if you want to make sure that it’s going to be successful.
But it got to a point where Elasticsearch was just waiting to come out. I had all the design of the initial version of Elasticsearch in my head, just waiting to write it down. I ended up writing it down, and after a few months I open sourced the first version.
Back to your question, I think that Elasticsearch, which is different from Compass, it’s a runtime system, right? I mean, it’s distributed systems, people running 50, 100, 200, 500 nodes of a cluster of Elasticsearch depending on their data and their scale. It’s something that ends up being important for applications. I mean, the usability aspects of apps can deteriorate tremendously if search or analytics are not available. Those aspects help build a strong and sustainable company that runs an open source project.
Alexis: Before we get further, I guess the most straightforward scenario for Elasticsearch is basically a basic search. When people hear big data, some of their eyes tend to glaze over – I know mine do. Do you have any quick examples that can give us an idea of what folks can build with it?
Shay: Sure. I think that the power of Elasticsearch comes from its ability to combine three different aspects into a coherent view, which are the ability to do a full-text search, a structured search, and what we call aggregations, but they’re effectively analytics in real-time.
A few examples that I can give, at least public ones – GitHub uses Elasticsearch to drive all of their search across their website. I think that’s a good example of obviously full-text search, because you search across issues or commits or code base, or something along those lines.
It also includes analytics; you can use that to see the top languages that are being used that matches your search, or the top committers or things along those lines.
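As a rough illustration of how those three aspects can sit in one request, here is a minimal sketch that builds an Elasticsearch-style search body as a plain Python dictionary. The field names (`body`, `repo`, `language`) and the helper `build_search_body` are hypothetical and are not taken from GitHub's actual setup, and the structure follows the present-day query DSL, which differs in places from what was available when this episode was recorded.

```python
import json


def build_search_body(text, repo, top_n=5):
    """Build a request body that combines the three aspects in one search.

    Field names here are hypothetical placeholders for illustration.
    """
    return {
        "query": {
            "bool": {
                # Full-text search: analyzed, relevance-scored match.
                "must": [{"match": {"body": text}}],
                # Structured search: exact-value filter, not scored.
                "filter": [{"term": {"repo": repo}}],
            }
        },
        # Aggregation: bucket the matching documents by language.
        "aggs": {
            "top_languages": {"terms": {"field": "language", "size": top_n}}
        },
    }


body = build_search_body("connection timeout", "example/repo")
print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the JSON body of a search request; the query part narrows the result set while the aggregation summarizes whatever matched.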
Another interesting use case is Foursquare, for example, and Yelp, to a certain degree as well. They’re using Elasticsearch to search across geolocation-based data, find venues that are next to me or things along those lines. Obviously, Elasticsearch is also pretty big in the context of the logging space, which I can spend some time into why the hell that happened as well.
Tons of different cases, billions and billions of records being stored in Elasticsearch, visualized through Kibana and streamed into Elasticsearch through Logstash, that helped effectively create more visibility within organizations, and that use case is super popular as well.
Is that enough?
Alexis: Oh yeah, absolutely. You’re at the point where you’ve decided, “Elasticsearch can be a business. I’m building it.” What were your plans for monetization and how has that changed over time?
Shay: Well, when we started Elasticsearch the company, there were a few of us, obviously, and we immediately started to provide production support for our subscription services for Elasticsearch itself. This works really well for us; Elasticsearch ends up being used as a core component, as I mentioned, within applications, and people do want to have help and support when it comes to – either from the beginning, in terms of how to design a system that can ingest and process typically massive amounts of data, to help if there is a production issue.
We also provide training around Elasticsearch – both public and private trainings, so that’s another way to monetize open source. Those have been super successful. I still remember giving and writing the first training and we gave it in San Francisco, and I think ever since then, all of our public trainings are just completely booked and we’re just trying to get more and more out there.
Once we settled on that and obviously increased the team and everything along building a company around such a project, we started to build our first commercial add-on on top of Elasticsearch. It’s called Marvel. It helps you manage and monitor your data, your Elasticsearch clusters.
The nice thing that we did with Marvel is that it effectively uses open APIs of Elasticsearch, so we’re using existing open APIs and we’re actually using Elasticsearch to monitor Elasticsearch itself [chuckles].
Alexis: Wow, that’s nice.
Alexis: That’s pretty cool.
Shay: So that’s another way people can easily use our open APIs to connect to any other monitoring tools, but within that we provide a lot of value as a product. We made it relatively cheap so people can easily go and start to use it. We think it’s beneficial for almost any deployment of Elasticsearch.
Alexis: So Marvel is $500/year for the first five nodes and $3,000 for each five-node cluster after that, right?
Shay: Yeah, correct. At least now, we’re changing the pricing for it. We wanted to make it super, super cheap when it comes to the first five nodes, and then obviously as your cluster grows, we think that it becomes – we thought, at the time at least, that it would become more viable for you. We found out that people really want to use more nodes in the context of Elasticsearch.
Data keeps growing. If you think about logs, as time passes, you have more of them. That huge jump between the first five nodes and the next five nodes was deterring users from buying Marvel, and we want to make it as approachable as possible.
We changed our pricing model; we made the first five nodes between $500-$1,000 but the next five nodes are only $250. We really want users to go and use Marvel as much as possible, and we fixed, at least, the mistake that we did initially with the pricing. We saw how valuable it is that we want users to easily add more nodes and not worry too much about the additional price hike that they might have.
Alexis: Are there any tips for pricing that you might be able to share from other lessons that you’ve learned?
Shay: Yeah. I would say that probably your first aim at pricing would be a mistake, so be open to changes [chuckling]. Listen to your users and your customers and try to measure. We ran a lot of scenarios with existing users and customers in terms of the pricing and what would make them feel like they’re getting a valuable product while they still feel like they can pay for it.
Once we ran all of those scenarios, we managed, I think, to come up with a much better pricing model that will, at the end of the day, to be honest, show the value and help people realize the value of Marvel and use it all over their deployments.
Alexis: Speaking of showing people value, Logstash and Kibana came before Marvel, right?
Alexis: They were both open source, right?
Shay: Correct. Elasticsearch is interesting. One of the most fascinating things that happened in the context of Elasticsearch is the breadth of open source ecosystem that was born “on top of it.” I think it started just with people in the community building language clients for it, whether it’s Go, PHP, Python, Ruby, .NET.
When we started the company, one of the first things that we did is I tried to make sure that all of those community members have 100% of their time being able to devote to building those language clients, so I think we pretty much hired all of them to make sure that they’re fully devoted into making sure that the language clients work well.
I think that that was super important. In Elasticsearch, we tried to design a very simple API to consume, but at the end of the day, if you have a very strong integration with different programming languages and frameworks, it ends up making it even simpler to use so we put a lot of focus there.
Together with that, we saw other open source projects happening. Logstash, which is not tied to Elasticsearch specifically – Elasticsearch is just one output. Logstash can take logs or different types of time-based events – really, any time-based event – process it, munge it, enrich it, and then send it to any output.
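The take, process, enrich, send flow described here can be sketched in a few lines of Python. This is only an illustration of the pipeline idea, not how Logstash itself is implemented or configured; the log format, the field names, and the helpers `parse`, `enrich`, and `run_pipeline` are all assumptions made for the example.

```python
import re
from datetime import datetime, timezone

# Hypothetical log format: "<ISO timestamp> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def parse(line):
    """Turn a raw log line into an event dict, or None if it does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    parsed = datetime.strptime(event.pop("ts"), "%Y-%m-%dT%H:%M:%SZ")
    event["@timestamp"] = parsed.replace(tzinfo=timezone.utc).isoformat()
    return event


def enrich(event, host):
    """Add fields that were not in the raw line."""
    event["host"] = host
    event["is_error"] = event["level"] == "ERROR"
    return event


def run_pipeline(lines, output, host="web-1"):
    """Parse each line, enrich it, and hand it to any callable output."""
    for line in lines:
        event = parse(line)
        if event is not None:
            output(enrich(event, host))


collected = []
run_pipeline(
    ["2014-10-01T12:00:00Z ERROR disk full", "garbage"],
    collected.append,
)
print(collected)
```

Because the output is just a callable, the same pipeline could write to a list, a file, or a search index, which mirrors the point that Elasticsearch is only one of many possible outputs.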
Elasticsearch ended up being one of the most popular outputs out there; it ended up being very, very good when it comes to processing any time-based event, including logs, obviously. Jordan started this project and after this project has been out there for a while, Rashid who used this project, got fed up with the pretty miserable UI that Logstash had to explore the data that ended up being in Elasticsearch, so he went ahead and built Kibana to just have a better visualization of Logstash events.
Both of those projects ended up being super popular. Our approach in Elasticsearch is to – if we see a project that is super popular, we want to make sure that we build a team around it. We have the full breadth of resources that we can give it as a company to make sure that it’s successful.
About half a year or so after Elasticsearch started, I talked to Rashid. We both shared the vision that Kibana can be a great way to visualize any data that exists in Elasticsearch, and he joined the company and built Kibana 3. We just released the first beta of Kibana 4, which is super exciting.
The same thing happened with Jordan. Jordan joined our company together with Logstash so we can make sure that there’s enough people working on it full-time. It’s quite amazing – I mean, both of those projects have an amazing community, but the fact that we now have, I think, close to eight people working full-time just on Kibana and the same number working full-time on Logstash means that they can move forward faster, we can become better and that’s just great for everybody.
Alexis: Now how did you resist the temptation of thinking to yourself, “Alright, Elasticsearch is already popular. I’m going to make Logstash and Kibana closed source so I can use this to monetize rather than just providing support”?
Shay: Well first of all, both of them were already open source so there was no question that they should remain open source.
At the end of the day, I think that there’s a lot of responsibility when you build a company around an open source project when it comes to the community and people that invest a lot in your open source projects and we try to live up to it. I think that you can build a very strong business around open source projects and start to add commercial add-ons that do not directly contradict the open source aspect of your projects.
I think Marvel is a great example. If you look around you’ll see tons of open source monitoring tools or real-time monitoring tools on top of Elasticsearch because it’s built on top of an open API, but Marvel was a great use case where we feel like we provide enough value that we can make it commercial. Other than that, at the end of the day, we want as many users and as many people using our products. We think they’re super successful and super usable, and open source is a great platform to make it happen.
Alexis: You’re now about four and a half years into Elasticsearch, right?
Alexis: And about two or plus years into Elasticsearch the company?
Shay: Correct, yes.
Alexis: What’s changed over time with the community?
Shay: There are different communities, I would say, for our different products, specifically around Elasticsearch. I think one of the interesting things that happened when we started the company – I was doing Elasticsearch on my own and I wrote almost all of the code for it for the first two years. I think one of the interesting things that happened is that one of the founders of the company, a fellow named Simon Willnauer – at the time he was a PMC member in Apache Lucene, obviously coming with a lot of background when it comes to managing open source projects at the scale of Apache Lucene. He effectively took over Elasticsearch, which I’m very happy about [chuckling]. He’s super talented.
He helped take Elasticsearch to a whole new level when it comes to it being open and managing it to a level where there’s a lot of interaction with the community. Elasticsearch itself, it’s not a simple project. It’s quite complex; it’s distributed systems. It has to deal with all the way from interactions with Apache Lucene to networking or clustering and what have you, so it’s quite hard for people sometimes to get in and try to contribute to it. But on the other hand, we try very hard to have tons of discussions with users to make sure that we understand what they want in order to validate an approach or understand bugs or understand feature requests, to make sure that they end up being implemented.
So I think on that level, I think Elasticsearch is light years ahead compared to two years ago when it comes to being open and issues being open and what have you. Two years before, a lot of it was happening in my mind [chuckling]. I was just cracking code all the time, so I think that that’s a huge move forward.
Even then, I think that there are areas where the project itself needs to help – for example, make sure that the openness is more visible. I can give an example where we, in the last year, have been working very, very hard on resiliency aspects of Elasticsearch. If you think about it, that’s a major, major thing, right? I mean, we had to go back all the way to Lucene and add features there and then in Elasticsearch. It’s just a huge scope of projects.
It wasn’t very evident because – it was still open, but there were GitHub issues all over the place opened, and in Lucene, JIRA issues opened there as well, and it wasn’t very easy for people to understand all the effort that goes into resiliency in one simple place. People obviously deeply care about it; it’s quite fascinating to see that aspect just being used more and more in mission-critical systems, so obviously they care a lot about the resiliency of the system that they put the data in.
Once we found out, we were naïve. Everything is in the open – there are issues all over the place open and we’re discussing it and what have you. But when someone came in that doesn’t follow the projects to a level that we do, or other more established people in the community do – I don’t know if established is the right word, but people interact with it daily – then they didn’t get how much work went into it.
Here’s an example where we said, “Okay, that’s super important.” Even though it’s super open, people just don’t see the breadth of work that goes into it, so we just created this dedicated page that effectively summarizes all the work that goes into it so people can go into that page and immediately understand what’s going on.
Alexis: Let’s see – you are $104 million with venture capital [chuckling]?
Shay: Yes, something like that.
Alexis: How did you decide to – or when did you decide to take venture capital and why?
Shay: I think it goes back to – pretty quickly after I started Elasticsearch, I saw the project was exponentially more popular compared to anything that Compass ever was. The first thing that I did was I decided to quit my job and just do Elasticsearch full-time. It was a big decision, with a wife and a young kid at the time, and quitting the job was obviously something that itself is not easy. But I felt that there was something there; people used it, deployed it and saw a lot of value in it. I was super excited. Tons of community and community development that happened on top of Elasticsearch, as I mentioned.
At the time, it was still pretty young, right? So I said, “I’m going to invest a lot of time just building Elasticsearch itself, on my own.” And make sure that if it makes sense to build a company around it, then it will just be all over the place that people won’t be able to ignore it [chuckling].
The first indicators were there, that it can reach that level regardless of where you store your data or regardless of what you do with it, if you put Elasticsearch next to it, then it provides applications with tons of additional value.
After a few years of doing it on my own, it got to a point where people in the community started to say, “Fella, we need support; we need professional help for it.” There was the question of the “hit by a bus” type scenario and what have you and I was thinking about starting a company as well, so it got to a point where it just became so obvious that I decided to see if I can start a company around it.
Ended up starting a company with a few people that I know – Steven Schuurman, who was heavily involved in a previous open source project and company called SpringSource; Uri, who is a great friend; and Simon Willnauer, as I mentioned. We started the company immediately without any venture capital or something like that, and then we sought out some financing. To be honest, at the time, Elasticsearch was just so prevalent that any venture capitalist that we were talking to –.
Alexis: They’ve heard about it [chuckles].
Shay: They’ve heard about it. Most of their portfolio companies were using it, and I think that nothing beats that, right? If you show value and you show actual value in a product that is being used and people just like it, then it makes everything else simple.
Alexis: So you survived several series of funding and you’re the CTO, so I take it you’re more focused on the development process and not so much the business. Is that a personal kind of –? I’ve talked to quite a few developers who liked to stick to the more technical aspects. Is that the same case here?
Alexis: You and me both.
Shay: I’m more of a backend guy [chuckles]. So I’m definitely still technical, but there’s –. As a founder, you run a company so you build a company, you want to have the company behave in a certain way, have a certain culture, so it goes beyond being just technical. There’s a lot of responsibility that all the founders take when it comes to making sure that the company grows correctly, especially with the company being completely distributed. So I’m definitely obviously heavily involved in that.
There were tons of bets that I took with Elasticsearch, and one of them was joining forces with Steven who is the CEO of the company, whom I didn’t know before and I just heard about.
Alexis: Oh, wow.
Shay: Yeah, it’s pretty nuts. But looking back, it’s one of the best decisions that I ever made; both Steven and myself work really, really well together. Steven came from an engineering background and we just enjoy running the company together. I think it works beautifully.
Alexis: I asked the question because it seems to me, at least, when it comes to funding, this might have been uncharted territory for yourself at the time. What have you learned that might help other developers who have no experience with funding? This opportunity or the circumstances present themselves as like, “Well, you know, I might as well look for funding. This business needs it if I really want it to take off.”
Shay: I can only speak from my experience, to be honest, and what I went through with Elasticsearch made it relatively simple, obviously. We had the occasional VC that didn’t get it, but most of them just got it, and they got it because Elasticsearch was so popular and they could see how Elasticsearch was being used in the wild.
When we were walking into meetings with VCs, they already knew about Elasticsearch. They already did their due diligence; they already heard their portfolio companies raving about it, so the discussion was less about the applicability of Elasticsearch or whether it even makes sense to build such a system. It moved to more interesting discussions around how big can it get and what are we trying to solve and where are we going to be.
Those are the types of discussions that always excite me. You show value – you start by showing value and then show how far that value can take you [chuckling]. And it ended up being simple discussions, to be honest.
Alexis: I’m glad to hear it was smooth sailing; it seems like it continues to be.
Alexis: Let’s see – Marvel came out how long ago now?
Shay: I think seven months or something along those lines? Eight? Something like that.
Alexis: Alright, so it’s pretty fresh. What did you do to spread the word? Because I often hear from developers that have established themselves and their company that even though they have name recognition and they’ve got a lot of people following their work, they still have to do quite a bit of PR and use some techniques to make sure that people learn about their new products.
Shay: Well first of all, the Elasticsearch site is something that every developer that interacts with Elasticsearch ends up visiting. Just announcing Marvel on the site, I think, ended up putting it in front of most of the users of Elasticsearch, so that was a big help.
Another thing that was super important, I think, is that we made it super simple to get it. It’s a single command line, you run it, it gets installed immediately – you immediately see value in it, and if you think that it’s of value to you and you’re running it in production, then you can go back and pay for it. It’s the combination of those two aspects that helped the adoption of Marvel to a level that’s very hard to reach with other products.
Alexis: Let’s go back to hiring for a little bit. You mentioned you brought on Steve – what qualities did you look for when you were hiring in the early days, and now when the company’s much more mature?
Shay: In the early days, it was relatively simple, to be honest. Because I was doing Elasticsearch for two years, I knew a lot of people in the community – people that developed Apache Lucene, people that contributed to Elasticsearch, people that ended up building language clients, or Kibana, or Logstash on top of it.
In the beginning, we were hiring mostly engineers. You want to start with a strong product, and that was our focus when we started. I knew those people, effectively personally, right? Even though I probably never met with most of them, we had long –.
Alexis: You’ve seen their commits. [chuckles].
Shay: Exactly, long discussions on IRC. I just knew those people and I knew that I liked working with them and it just made sense.
That was relatively easy. Now, we always try to reach out to – if we see people that are heavily involved or committers in Apache Lucene or something along those lines, that we can give them the opportunity to work full time on open source, then we’re always happy to have them join us. Obviously there are a limited number of people there.
What we’re looking for mainly is just hiring to our network, which I think works best. Our people that work at our company come from strong open source backgrounds – they’re speakers in meetups; they’re well-known, so they have a very wide network when it comes to people that they know and people that they appreciate in terms of technical capabilities and cultural aspects. That makes the hiring relatively easy because we know who we want to get next if we are in a state that we can go ahead and, for example, we increase the Kibana team, or increase the Logstash team, or Elasticsearch, or Lucene or anything along those lines.
Alexis: So now that you’re a household name, among developer households at least, how do you deal with competitors or potential competitors? How do you make sure you’re staying on top of things?
Shay: To be honest, I don’t spend a lot of time thinking about competitors. It boils down to the early days of Elasticsearch; it’s more about showing users what they can do with their data and Elasticsearch is a unique space, a unique place to be there and show them what they can do with it if they change a bit the way that they look at their data.
For example, when I started using Elasticsearch, the enterprise search market was pretty hot. Big companies were involved there. The problem was that the enterprise search market ended up reducing search to something that was very basic and I was very annoyed by it, to be completely honest, because I saw everything that search can do.
At the end of the day, it wasn’t about looking at competitors and comparing feature sets or something along those lines; it was more about educating the users about what they can do with search and trying to even redefine it. That goes through the product itself, helping them by designing usable features, features that make sense and are easily consumed, so that they can actually do more with it. Nothing beats putting up Kibana and within five minutes having an amazing dashboard that is updated in real-time where you can see graphs, right? That’s one of the reasons why Kibana is so important for our company and the projects.
We don’t really, to be honest, think about competitors that much; it’s more about we’re trying to execute on our mission statement, and so far it seems to be working. Users understand what we’re trying to do; they use us, and to be honest, this is the way that I would like our company to proceed.
Alexis: Alright. So now that we’ve patted you on the back [chuckling], what’s one mistake you’d rather not repeat?
Shay: I think I talked about one of them, which was around being open and some assumptions that we made around being open. That was a mistake that we made and we keep on fixing it. I’ve been trying to make sure that it’s not only about “Oh, but everything is open,” while it’s very hard for users that are just trying to figure out what the hell is going on there to be able to understand what’s going on.
I think another one, I would say – we’re building an interesting company, at least on the company level. Our company is very, very distributed; we’re more than 100 people now. As we grow, we have different offices, but even though we have different offices across the globe, we also have people that work remotely. I would say that I think one of the reasons why we managed to innovate so quickly is because we are a distributed company and we just hire people that make sense to our company and the best in the context of what we’re trying to do.
But on the other hand, it’s not easy to build a distributed company; you want to make sure that there’s no silence, you want to make sure that there’s communication flowing, that people understand what’s going on.
And this is a continuous challenge. I wouldn’t even say that it’s a mistake or something along those lines. Obviously, we made a lot of small mistakes down the road, for example, realizing that people don’t know enough about what’s going on in the company to a level where people that work remotely feel like they’re on their own and HipChat may not be enough. We’re using HipChat, so.
When we heard that, it was very simple. We’re using Zoom for video, so we have what we call an “always on” video. Everybody can jump in and you see faces, you have interactions and things along those lines. I think that this is something that we spend a lot of time on in the company, to make sure that we manage it correctly and that people feel like it’s a single, coherent company.
Alexis: Now you might have answered this already. On the flip side, what’s one decision that you’re particularly proud of?
Shay: I think focusing on search? [Chuckling]
Alexis: Staying focused. Yeah, okay.
Shay: Yeah. If you look at it, when I was trying to explain in the early days why search is such an important aspect for people to use and why it’s so beneficial, people didn’t immediately get it. It was very surprising to me because you would look at a company like Google that built itself on the foundation of search, but what happened is that people mainly had in the back of their minds enterprise search solutions. I mean, how do I build a system that can crawl SharePoint and parse Word document version 7.1 or something like that and I can search across it?
I think just believing in search and the core aspects that it can give you – when I say search, I mean analytics. I mean free-text, structured search – all of those together – and deciding not to give up on it and building a product that can help users use it rather simply so they can see the value in it. I think that’s the biggest decision that I made on the product and company level, at the beginning at least.
Alexis: Before we get too far away from when you mentioned – your distributed team, was that a conscious decision? Maybe you admired the idea of a distributed team or was it out of necessity?
Shay: It’s sort of a combination [chuckling]. When we started, we were already distributed. I moved to Amsterdam and lived here together with Uri and Steven, but Simon is in Germany. Right from the get-go, we were distributed.
When we started to hire people, I didn’t really care that, for example, Clinton lives in Barcelona – he built the PHP client for Elasticsearch and did a lot more for it, so it doesn’t really matter. Along the same lines, we ended up just hiring people from the community and ended up being all over the place, and that’s perfectly fine. The benefit that we see from it is just tremendous.
We do know that we need offices, specifically in specific locations. I mean, our company needs an office in London; it needs an office in the Valley, needs an office in Germany. But even then, things can catch you by surprise, right? Rashid lives in Phoenix, and there are tons of talented developers in Phoenix that he knows. Now suddenly we have seven people in Phoenix, so we have an office in Phoenix. If you would’ve asked me if we will have an office in Phoenix two years ago, I would’ve said no. Some things just catch you by surprise, but those are good things that happen.
Alexis: Watch out! The startups aren’t going to migrate to Boulder; it’s going to be Phoenix. That’s the next place.
Alexis: So what’s the biggest opportunity in open source? If you weren’t working on Elasticsearch, what would you be working on?
Shay: That’s a hard question.
Alexis: That’s what I get paid for [chuckles].
Shay: Yeah, it’s hard to answer. I can’t answer specifically. The only thing that I can say is that if someone is building an open source project, there are a few important things that they need to know when they’re doing something and they want to see it through.
I think that one of the first things that people don’t realize is how much time it takes if you want to make sure that it’s going to be successful. I definitely didn’t realize it when I first released Compass. Weekends, nights – everything – IRC chats, on the mailing list, trying to answer tons of questions – those are hours and hours a day that you have to put in besides just coding features and what have you.
So this is a big decision that someone makes, and I think that it’s also a responsibility of the open source author. I mean, if people start to use it, they take a bet on you and you have a responsibility for your users to make sure that you’re there to help them out. Don’t take the decision to open source something very lightly.
The other thing is that it has to be valuable, to a level. I mean, both Compass and Elasticsearch ended up proving to be valuable quite quickly. I think the breadth of users out there that are just looking for open source projects that end up being valuable is almost unlimited. If you build a system that is open source and it starts to get traction and it starts to show that it’s valuable, then you’re on to something.
Alexis: So once you’re at that point, what’s the best way to make a project sustainable?
Shay: I think there are a few ways to do it. At the end of the day, the first few years – yeah, there are different ways that a project becomes open source. There are big companies just deciding to open source a project that they were working on internally, and that one is the simpler case because there’s already a team that’s working on it. There is already a relatively big number of people that can help out when it comes to making sure that the project will be successful, and probably the company that open sourced it pays the developers themselves to continuously work on that project.
That, I would say, is the simpler route. Something that I did, and I’m sure that – obviously there are tons of people out there just scratching an itch and going and trying to write something and then open sourcing it on their own. And that one is not easy [chuckling], as I said, and you have to be ready to invest tons of time into it if you want to make sure that it’s successful.
My advice is just be prepared to invest that time if you really want to see it through. And once you do that, and once you do it in the open and spend all of that time, try to build a very strong community that will help you drive the project forward.
In the context of Elasticsearch, there were not a lot of contributors to Elasticsearch’s code base itself in the first two years because of the complexity and the moving target that it was, but there was just a breadth of ecosystem that got built on top of it that was open source. That was mind-boggling to me. So investing a lot of time there as well and making sure that that ecosystem is successful will mean that your open source project will also be successful.
Alexis: You thought the other question was hard. Now it’s time for an even harder question. What’s your text editor of choice? [Chuckling]
Shay: That’s a simple one for me. I mean, I’ve been a Java developer for a lot of years now and I’m using IntelliJ. I think that’s the best Java IDE out there. But that’s what I mostly do; I just program in Java.
When I was doing C, then obviously it was Vim and Emacs. I did 50/50 on them. I wasn’t really that religious about them; I didn’t really care. But at least in the context of Java now, it’s all IntelliJ.
Alexis: Alright. One last question – The Big Lebowski. I am noticing a theme here and I have to admit, I haven’t watched the movie, and I should. That is a sin of the highest film-going degree. But I guess – why are you a fan of it? I mean, you own thedudeabides.com [chuckles].
Shay: Yeah, I was surprised at how I managed to get it [chuckling]. I’m a big fan of the Coen brothers and the films that they do. I’m a big fan of The Big Lebowski and The Dude Abides type mentality. I think it ends up going all the way to how I personally tried, and how we try now, in terms of Elasticsearch, to manage open source projects. There’s a lot of responsibility around that and the dude will abide [chuckling] and he will take upon himself or herself the responsibility and will try to make sure that it happens.
There’s another tagline that I use in Elasticsearch, which is You Know, For Search, which is actually even closer to my heart when it comes to Coen brothers’ films. That was funny – one of the Coen brothers’ films involved the hula-hoop. The movie was about the invention of the hula-hoop, and nobody understands what the hell it does. The guy, Tim Robbins, who invented it says, “You know, for kids” and nobody gets it.
It’s funny, right? I mean, in the early days, trying to explain to people what a search system can do, you go, “You know, for search” but nobody really gets it if you just try to explain it verbally. But once they started to use it, once they started to see everything that it can give to them, everybody started to obviously play around with it.
Alexis: And then we got the feel of it, yeah.
Alexis: Alright, so is there anything I missed that I should’ve asked, or is there something that you’d like to get across to the listeners?
Shay: No, I’m good. I send my deep appreciation for everybody – any and every user of Elasticsearch – for using our products. At the end of the day, just seeing the different use cases and all the places where Elasticsearch is being used is humbling, especially in the early days. There were a lot of users that took a bet on this young product, and I hope that we lived up to the responsibility that they took there. And thank you.
Alexis: So if you want to learn more about Elasticsearch, where should we go?
Shay: Elasticsearch.org – that’s the project’s site. It has tons of information around all the different products that we have. I just finished writing a book about Elasticsearch; it’s all free on the website. It’s on GitHub as well, so tons of information there about everything and anything that we do.
Alexis: And if we’d like to find out what you’re eating for lunch or dinner, where can we follow you on Twitter?
Shay: My handle on Twitter and almost everything else is @kimchy – K-I-M-C-H-Y. Yeah, I try to tweet [chuckling].
Alexis: And you can find us @Binpress and myself, @alexissantos. Again, thanks Shay. I appreciate you taking time out of your schedule to talk with us about Elasticsearch and reminisce a little bit about its history.
Shay: Cool, thank you very much for having me.
Alexis: Thank you.
Author: Alexis Santos