Binpress Podcast Episode 22: Sage Weil and Ross Turk of Ceph and Inktank
This week we talk with the creator of Ceph and co-founder of DreamHost, Sage Weil, as well as Ross Turk, VP of Marketing and Community at Inktank. Ceph is an open source distributed storage system and Inktank is the company created to support it, which was recently acquired by Red Hat.
Sage and Ross discuss community building, the importance of keeping an open source project’s brand separate from its supporting company, and using video chat for remote work. Sage also covers how he developed the WebRing concept, why academic projects should make the jump to open source after graduation, and much more.
Alexis: So this week we have a very special episode because it is the first time the Binpress Podcast is hosting two guests at once. We’ve got Sage Weil and Ross Turk. They’re both involved with Ceph, and we’ll get into what exactly that is in a moment. Sage, how are you doing?
Sage: Pretty good, how are you doing?
Alexis: Quite well. I’m not going to say that I’m nervous about interviewing two people at once and trying to cram everything into an hour – no. I’m a professional podcaster and I will pull this off [chuckles].
Alexis: And Ross, how are you?
Ross: I’m doing very well, thank you.
Alexis: Let’s see. Where do we start? I guess this time it makes sense explaining what Ceph is and your involvement with it, and then we’ll unravel things and work backwards. Who would like to take this?
Ross: I think that should be you, Sage. Why don’t you go first?
Sage: [Chuckles] Sure, alright. Ceph is an open source, distributed storage system. The idea is to take open source software and layer it on commodity hardware and have the same sort of storage features that you’d get out of something that would ordinarily cost a lot of money. You get the reliability from replicating data across lots of nodes, you get the performance and scalability that comes from that and you get a rich set of interfaces whether it’s object storage, block storage or file storage.
Alexis: And Ceph is supposed to be incredibly fault-tolerant, even self-healing and it’s incredibly receptive to scaling, right?
Sage: Mm-hm, that’s magic [chuckling].
Alexis: And it included a PhD thesis that we’ll hear about soon. You’re the founder of the Ceph project, and the principal architect, and the co-founder of DreamHost which we’ll hear about in a moment. And Ross is on the other side of the coin here, working with community, right?
Ross: Yeah, I was one of the folks that Sage brought in when he was building a company around Ceph to figure out how to build that company brand while still maintaining a strong community. That was my challenge for the two years that Inktank was a company.
Alexis: Alright. Well, it still is a company, but it’s now a part of Red Hat, correct?
Ross: That’s correct. Yeah, it is still a subsidiary of Red Hat. We got bought by Red Hat about I guess six months ago now, isn’t it, Sage?
Sage: Yup, time flies.
Alexis: So Sage, you have a very long history with programming and the web, lots of web projects here. How did you get your start? When did you first learn how to program?
Sage: Like most people in my generation it was Apple II something or other learning BASIC, trying to write a computer game. I think my first serious project was trying to make a CD-ROM game in high school that was a similar type of gameplay to Myst or The 7th Guest where you have nicely rendered graphics and you’re navigating through some weird universe. In this case, it was the Cydonia region of Mars. That was fun; that was sort of trial-by-fire, learning how to develop in C++ and make something that actually worked and could be shipped.
The next thing was sort of when the Internet appeared in my hometown. We got our first local ISP and you could dial up with a 28.8 baud modem and get online. The first real project there was a site called WebRing that let related websites link together into “rings,” as they’re called. It had a bit of a heyday in the late ‘90s before it was bought by GeoCities, then acquired by Yahoo! It’s still around today, I think. You don’t see them quite as much as you used to.
Occasionally they show up in xkcd comics, sort of ironically referring to the heyday – the Internet of the ‘90s. That’s always fun.
Alexis: Yeah, while I was preparing for the podcast and doing research, I was like, “Man, I’m about to talk to the developer of the WebRing idea.” I had no idea that those were your roots.
Sage: Yup. I think in most of the projects I’ve been involved in, I sort of took an idea that somebody else had suggested and built all the code and actually made it happen. It was actually somebody else who said, “We should make a CGI script to do this” and I said, “Yes.” And then I went and did it.
Alexis: So Ross, tell us a little bit about your background.
Ross: I think I come from the same generation as Sage. My first computing was on a PC though, but it was still the same basic experience. I started getting into the Internet when I was living in Denver. There was this system that the local college, the University of Denver, made public. It was just a shell machine, like an old Solaris machine. Actually it was probably SunOS at the time and anybody who wanted could get access to this system and get a shell account.
Still in my early years of high school I discovered Unix and worked all summer at a McDonald’s to get a machine that could run Linux. So similar kind of heritage as Sage there, and I decided to work instead and I got a variety of low-level sys admin and engineering jobs to sort of work my way up. As Sage instead invented and implemented marvelous things, I ended up working for other people, getting the same knowledge from others, I guess [chuckles].
Alexis: I guess this is the perfect time to segue into DreamHost. I guess this is when it fits into the picture. When did that come about?
Sage: When I went to college, one of my roommates and several other people that he was working with had started this website called New Dream Network. That was sort of the conglomeration of web developers trying to do fun things on the Internet. I ended up joining forces with them to try to deal with the early infrastructure pains of making WebRing actually work.
At some point, we realized that we had several servers that were actually tucked under somebody’s desk at their office, hooked up to their – T1 line was a big deal at the time – and realized that if we sold space on them to people who wanted to host websites, we could actually pay for the bill. So that’s how DreamHost was born, and it turns out that has consumed the last 15 years of that whole tangent and has grown to be quite successful. I think it’s one of the largest, if not the largest, independently-owned web hosts out there. I think most of the others have been bought up by AIT and GoDaddy and so forth.
Alexis: And you’re still involved, right?
Sage: Barely. Yes, I’m on the board of directors and our office is next door so I can see what’s going on but I’m mostly focused on Ceph these days.
Ross: Yeah, we share a break room with the DreamHost folks, so we’re still pretty well-connected.
Sage: Yeah, we eat their snacks.
Alexis: [Chuckles] Ceph is open source. When did you get involved with open source?
Sage: I think I’ve always been involved in open source as a consumer from the beginning. DreamHost, obviously, was all Linux hosting and we were all big open source consumers, but I never actually really produced any of it until more recently.
I became involved in Ceph when I went to graduate school at UC Santa Cruz in 2004. I joined a research group that had a grant from the Department of Energy. They were working on how to scale object storage – object-based file systems. Even then, I was figuring out the scalable metadata and all the pieces and trying to put together a prototype system. It was really only as the whole system came together that I put two and two together with our frustration as a buyer at DreamHost, trying to deploy scalable systems and having a range of extremely expensive and not particularly effective options available, and then designing something that was awesome and could solve all those problems and wanting to solve that problem, fill the gap, so to speak. It was sort of over the course of doing that work that I truly became a strong believer that the system needed to be open to really have a real impact.
Alexis: Why did you decide, “You know, I really need to get my PhD”?
Sage: Honestly, I was a little bit bored at DreamHost, implementing all the orchestration automation stuff to run the whole platform. It was challenging, but not the most exciting thing. And so going to grad school was sort of opening the door to a new chapter of fun and excitement.
Alexis: We’ve got pretty much a broad overview of Ceph and some implementation examples in terms of something that DreamHost could use, but what are some of the other things that people could build with it that you’ve maybe seen built?
Sage: A lot of stuff; there’s sort of a laundry list of things you can do. Originally, the goal was to make a scalable file system that could be used for supercomputers, so thousands of nodes and a quarter million processors, dumping files in the same directory – that sort of thing. But the way that we built the architecture ended up being much more general, so over the next several years after finishing my PhD, going back to DreamHost and hacking on the system with a small group of engineers, we realized that this low-level object layer could be used for lots of other stuff.
We built a scalable block device on it that glues into KVM and the Linux kernel and so forth. We built an object storage gateway that gives you S3-type semantics, and we built a generic API, librados, that you could build other projects on, sort of expecting that other users would find other new and interesting things to build once they have this reliable, scalable, low-level storage primitive object layer.
It hasn’t been as wildly successful as I originally hoped, but there have been some really interesting things that people have done with it. When I go to conferences, Linux conferences, it’s always great to talk to people who are building applications.
There’s one company – another web host that’s building a whole analytics platform where they’re dumping all their metrics data in there and building visualizations and stuff on top of it. There are some research groups that have done some really cool stuff with embedding runnable code in the low-level RADOS layer by taking a scripting language essentially and pushing it all the way down to the OSD so you can have a general purpose computing environment.
Somebody reimplemented the scalable distributed log thing that was recently published. I forget – I’m blanking on the name right now, but there was a recent research paper where they said, “Do the same thing on RADOS” and they went off and wrote that.
We have several customers actually who are doing Dropbox-like services and they’re using that low-level interface because they just need the scalable, reliable, consistent storage piece, and then they’re layering all of the higher level functionality on top.
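To make the layering Sage describes more concrete, here is a minimal Python sketch of a block device built on top of a flat object store. It is a toy, not Ceph code: `ObjectStore`, `BlockImage`, the 4-byte object size and the `vm-disk.NNNNNNNN` naming scheme are all assumptions invented for illustration, and there is no replication or networking. It only shows the idea of mapping a byte range on a virtual disk onto fixed-size named objects.

```python
"""Toy sketch: a block device layered on a flat object store.

Not Ceph code. ObjectStore, BlockImage and the object naming scheme are
hypothetical names invented for this illustration.
"""


class ObjectStore:
    """Stand-in for a low-level object layer: named blobs, nothing more."""

    def __init__(self):
        self._objects = {}

    def write(self, name, offset, data):
        # Grow the object with zero bytes if the write starts past its end.
        buf = bytearray(self._objects.get(name, b""))
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        self._objects[name] = bytes(buf)

    def read(self, name, offset, length):
        # Missing or short objects read back as zeros, like a sparse disk.
        chunk = self._objects.get(name, b"")[offset:offset + length]
        return chunk + b"\0" * (length - len(chunk))

    def names(self):
        return sorted(self._objects)


class BlockImage:
    """A virtual disk whose byte range is chopped into fixed-size objects."""

    def __init__(self, store, image_name, object_size=4):
        self.store = store
        self.image_name = image_name
        self.object_size = object_size

    def _extents(self, offset, length):
        # Yield (object name, offset inside object, byte count) per object.
        while length > 0:
            index, inner = divmod(offset, self.object_size)
            count = min(length, self.object_size - inner)
            yield f"{self.image_name}.{index:08d}", inner, count
            offset += count
            length -= count

    def write(self, offset, data):
        pos = 0
        for name, inner, count in self._extents(offset, len(data)):
            self.store.write(name, inner, data[pos:pos + count])
            pos += count

    def read(self, offset, length):
        return b"".join(
            self.store.read(name, inner, count)
            for name, inner, count in self._extents(offset, length)
        )


store = ObjectStore()
disk = BlockImage(store, "vm-disk", object_size=4)
disk.write(2, b"hello world")   # spans several 4-byte objects
print(disk.read(2, 11))
print(store.names())
```

The block layer never needs to know where objects live or how they are kept safe; it only needs named objects it can read and write at an offset, which is the kind of primitive Sage is describing.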
Alexis: Now you said it wasn’t as wildly successful as you’d have hoped, but $175 million sounds pretty wildly successful.
Sage: [Chuckles] Well, it’s the API. Anytime you have a new interface, a new way of talking to the storage system – it might make perfect sense to you when you think it’s like the most useful building block, but the reality is that you have decades of people using different interfaces and they don’t necessarily have the same ideas that you do.
So of the lots of people who are playing with Ceph, not very many are using that low-level interface, but we think it’s pretty cool so we’re always interested to talk to people who are finding new and interesting things they can build with that part of the system.
Alexis: This isn’t the first acquisition you’ve been through. Now that you’ve got quite a few years of experience under your belt as well as a PhD, when it came to this acquisition, what were some of the things you were happy you already knew, having been through this game before and having to deal with it again?
Sage: This acquisition was on a whole different scale than anything I’d been through before, so it was still definitely –.
Alexis: A new experience.
Sage: [Chuckles] It was still an experience, yeah. The WebRing project was acquired first by a very small company that was doing basically just WebRing, and then acquired by GeoCities, but I only vaguely remember that, because it was a long time ago. I was in college and wasn’t really driving the whole process, but it was just much smaller. This is a whole new ball game.
I think the main thing is I have a strong appreciation for the value of good lawyers in going through that sort of process, and for the amount of time that you’ll inevitably spend reading legal documents instead of doing real work when going through any sort of acquisition. But I’m happy to have that behind us, I think, and get back to work.
Ross: I think during that whole process, one of the things that I think I’m most struck with is how much time we spent talking about how to make sure the project was going to be healthy after this happens. I was surprised about that, because I would’ve thought it was all going to be business, but we spent most of our time talking about the community.
Alexis: Now this is something that I’m pretty clearly interested in, because tomorrow – and I guess for listeners of the podcast, it’ll be next week – I’m interviewing the former CEO of Unity, the gaming engine. He stepped down as CEO last month and instead put – I think it’s John Riccitiello, the former EA CEO as the new CEO of Unity. If you’re not familiar with gamers’ perceptions of EA, particularly indie developers, they’re not very happy with the company.
I was reading the announcement post on the Unity blog and David was just answering comment after comment after comment, saying, “The stories that you’ve heard about EA or how John handled this or whatever, they’ve been taken out of proportion. We trust him; he agrees with our vision and all this.” How did you handle that? I guess, more broadly, how do you handle discontent in a community?
Sage: I think we are fortunate that we didn’t have a firebrand brand or personality involved, in the way that having somebody from EA step in probably does in that case.
I think we were very concerned about what the overall reaction would be to the acquisition. I think the interesting thing, and the reason why we spent so much time talking about the project and community during this acquisition, is that we weren’t looking to sell; we were looking to raise money and continue on our merry way. Red Hat is one of the few buyers that is trustworthy and has an established reputation of being a good steward of open source projects and having the same sort of values that we were trying to engender in the company or on the project.
Alexis: They’ve got a bit of a track record.
Sage: It was an opportunity that probably might not have come along again later. We were pretty full of ourselves and didn’t expect them to be able to afford us two or three years down the line.
So then the question is, “Do we go for the business opportunity and go big or whatever, or do we do the thing that’s going to be best for the project, and is it really going to be best for the project?” There was a lot of concern that we have a lot of partners who are obviously Red Hat competitors, and about how they would react. We made it very clear that we feel very strongly about Ceph being an independent project with multivendor support and all that stuff. We were very sensitive to that in the announcement and in the subsequent communication.
I expected more skepticism than we actually got and was pleasantly surprised when that didn’t blow back on us. Although in our view Red Hat has a very good reputation and track record with open source, they’re also viewed as the big, bad incumbent in the open source industry, so it was a little bit delicate.
Ross: Yeah, I think it depends on who David is and who Goliath is, and you can change the role depending on the context. I think Red Hat is one of those brands that can be either one depending on who you’re talking to, but they were still on our shortlist of – we had a very short list of companies we would have sold to, and they were definitely on that list.
Ross: I think something that’s been really exciting for me to see afterwards – and I was hoping for this – was that a lot of the people who face Red Hat in the market, other distributions, other service companies, a lot of them have announced Ceph products after the acquisition. I think that was something that I was looking for as a metric that it wasn’t something that was extinguished in the community; it was something that was causing the community to become bigger and more vibrant and more broad. I’m really happy to have seen that, that you essentially have other people doubling down on Ceph either because of or in spite of this acquisition. I think that’s a super good indicator.
Sage: I had a few very surprising interactions with people who are directly competitive with Red Hat; they’re, on the one hand, sort of unhappy at the news of being sold to their competitor, but had zero concerns about the project itself being in good hands, which was a relief.
Alexis: It was more of a “Damn, well – oh, okay.”
Ross: “At least now we know how to work. We know how to work with this situation,” is what we were hearing from a lot of people. I think that was good.
Alexis: So continuing this community thread, what’s changed over the years when you’ve been interacting with the community as you’ve grown since 2012?
Sage: I don’t know. Ross, do you want to start?
Ross: Well I would start and say since I became involved in 2012, when we started the company, I would say the community has become more companies and fewer individuals. You have component manufacturers and appliance makers. The community has rapidly become a community of lots of other companies working on products and services around Ceph instead of individual enthusiasts and small consultants. It’s become a much more broad and diverse community, which is really interesting.
I think that’s had a bunch of effects on how we do our Ceph Developer Summits and how we put together Ceph Days. Things like that have had to evolve as the community’s become a bit more professional. I don’t know. Sage, do you see the same thing?
Sage: Yeah, actually. I did a talk recently at PDSW at Supercomputing where I was talking about this, and I had this realization while I was writing the talk that the real uptake that we’ve seen in the amount of real engineering contributions to the code – that subset of the community that’s actually helping build the project – has pretty much scaled with the success of Inktank and its ability to engage with partners.
Ceph has been out there and we’ve talked to people and enthusiastic individuals, but until you have real organizations with a business interest and skin in the game, you don’t get them paying people’s salary to go code or build products around Ceph.
In some ways, the community – Ceph’s success was really amplified or predicated on Inktank’s ability to work with partners and get products out.
Ross: Yeah, and that was always the point of Inktank, to make sure there was somebody who could make all those commercial engagements and broaden the community that way and make sure that users were comfortable knowing that if they ever went into production, they had support. That sort of thing, it’s like the company was always a catalyst to broaden the community and to broaden Ceph and to make it more than it was.
Alexis: So what’s been the hardest part in growing the community or working with them over the past few years?
Sage: I don’t know, there are a lot of things [chuckling]. I think it’s sort of coaxing people to engage freely with the developers. I guess there are two sides of it. The challenge that’s been sort of directly facing us is taking the existing engineers that are working for Inktank, and now Red Hat, and making sure that all of our development is as transparent and open as possible. We’re putting communication on IRC and email lists and doing design discussions, doing CDS and so forth so that everybody can see what we’re doing and easily join in.
But at the same time, when we actually have those people who are interested, getting them to talk to us and engage on those same channels has always been – it feels hard, getting them to come out of their shell. People have a tendency to talk about something and get an idea, and then they’ll go off and work on it in private for a while and then come out again. I think that’s the general open source social challenge to make this all work.
Ross: Yeah, and once we started getting critical mass inside the office in Los Angeles, it is absolutely less efficient to send an email to somebody than it is to stand up and walk ten feet away and have a conversation with four people. You have to constantly make these decisions of, “In this moment, I have to do something slightly less efficient so that the community can benefit from what I do.” There’s always this tension between writing the code as fast as you can and maintaining a level playing field.
Alexis: In a way, it’s almost like a day-to-day thing. When you look at your own schedule, you’ve got so many things to do but you have to really prioritize what’s going to have the most effect on a much larger scale there.
Ross: Yeah, and you have to try to prevent sub-cultures from being created. There’s the joke I say: There are two kinds of people – the people who can go to lunch with Sage and the people who can’t [chuckling]. It’s just because we’re a post-geographic team now, so you have to prevent those kinds of walls from being built. That’s a tough thing you have to do every day.
Alexis: Speaking of geography, a bit of a tangent from the community here – I find it kind of strange that you all are based in Los Angeles. The thing is, you usually hear the companies are based in the Bay Area – Mountain View, SF. What’s behind that? Have you met any kind of challenges because of where you are located?
Sage: I think you have to be careful when you say “you all.” I think it was 45 people when it was acquired, and I think 15 were in L.A., so it’s no more than a third of the company. Everyone else is just distributed everywhere globally, pretty much. North America and Europe, mostly.
The reason why we’re in L.A. is just historical; that’s where DreamHost is based. That’s where I was living. A lot of our engineers, we sort of built up that team in DreamHost – just a bunch of grads that we sort of slurped into our team, and that’s where it really began. We lucked out with Ross being based here in L.A.; I think he was pretty excited to work for a company that was actually local.
Ross: Yeah, I had done 12 years of telecommuting where I never left my apartment, unless I was going to the airport. That becomes a pretty lonely place, so I was really excited about a cool, successful project, starting an open source company right here in my backyard. And of course the irony, Sage, is that I haven’t been to the office in a couple of weeks now because I’ve just been on the road all the time.
You never quite get exactly what you think you’re going to get. I spent a lot more time on the road than I thought, but it was great being able to go into an office and have that efficiency I was talking about earlier, even though now my boss and my entire team are not in Los Angeles. We’re back to square one with that, a little bit.
Sage: One thing that we’ve done recently in the last couple of years is we started making heavy use of video chat. We used video for a while, but DreamHost uses Google Hangouts and now we’re using Blue Jeans, which is the tool that Red Hat uses. We do all of our meetings that way by default, which makes it an even playing field for people regardless of where you’re based. Just having that as a habit where you can just have a conversation with somebody face to face has been great; it makes it much easier to work from home some days or stay connected when you’re traveling and just keep the team together.
Alexis: I interviewed a game developer last, last week who works remotely in Chicago and there’s a team of about five other people in Valencia, California. He has his own monitor stacked on top of two programming books where he has a constant Google Hangout going on all day [chuckles]. Pretty funny what people are doing for contact when they work remotely.
Sage: It’s surprisingly effective, just having that awareness of somebody next to you and being able to socialize, I guess.
Ross: The big surprise for me was actually going back. We started using the video chats and I thought, “Oh this is pretty cool.” It wasn’t until we got acquired and we started seeing teams have just phone calls that I was shocked. I’m like, “What is this? This isn’t communication anymore.” It didn’t make sense any more. We’re slowly getting all the people we know at Red Hat to use the video conference now as much as we can [chuckles].
Alexis: So what advice would you give to folks with open source projects, whether big or small, when it comes to working with a community and really building one around their project?
Sage: I have a few thoughts, but I’m sure Ross has more well-articulated ones.
One of the things that we made a conscious decision to do with Ceph and Inktank was to keep the company and the project brand separate, and I think that was a perfect move, in our circumstance at least. It made it really clear what Inktank’s role and position was, and it made it much easier for partners to engage.
You see a lot of companies doing it the other way and I think that could be successful, but it can be challenging down the road. It’s easier in the beginning and harder later on. I think there’s a general, pervasive understanding that it was important to pay attention to the community and build community and foster community in order for the project and the business to be successful. Those two things would sort of come hand in hand. What do you think, Ross?
Ross: I’ll say this. When I came to DreamHost in 2012 to take a really hard look at Ceph and start to take care of its community, I was surprised at how many decisions Sage had made correctly. I was looking around and going, “Man, there’s nothing on fire, especially. It just all needs to be scaled and it all needs to be taken to the next level.”
I think that using standard infrastructure, just the basic tools everybody understands – using GitHub and accepting pull requests and being timely in accepting pull requests, accepting patches by email. Making sure that you’re working in a way that developers who are already familiar with open source projects can understand was a big one.
I think licensing was another big one. Sage, you decided in the early days of the project to go with LGPLv2 and a fragmented copyright, and that sends a really strong message to the community because it’s a very permissive but viral license and you can’t change the license without getting the approval of all of the contributors to the project since its very beginning. It’s a way of making sure that the license is controlled by the community and not by some company that chooses to invest in the project later.
I’m not saying that’s right for every project, but that sends a very clear message to the community that the technology and the community came first and it was owned by a group of people and not by a company. Going back to what Sage said earlier, that’s why it was extra important even though we had two brands – Ceph and Inktank – that you could look at either brand, Ceph.com or Inktank.com, that it was unmistakable that they had the same spirit.
Which was which? Definitely you had different color schemes and everything, but the basics of the brand were very similar. The basics of the messaging architecture were similar. What we were attempting to do there was make sure that the project had its own life and that it was governing itself, but that the company and the project were conceptually continuous. I think that was really important. It was especially important once you’ve made that really strong statement that this is community-owned technology and it will always be that way. That’s an incredibly strong thing to say.
Alexis: We’re starting to mention Inktank more and more. What are some of the monetization models that came into play when you formed Inktank around Ceph?
Ross: The first model that we chose was the simplest one possible, and that was selling support licenses for people who were using the Ceph technology. It’s the simplest business model because people were already using Ceph and they needed help, and so we would sell them a subscription to buy help. But we pivoted about a year ago to become a product company.
You see a lot of commercial open source companies are product companies as opposed to just support contract companies, and what it means is that you’re selling it as a subscription with some extra stuff as opposed to just the open source bits. It was not actually too much of a pivot for us because the pricing remained the same; it was largely – we were just talking about it in terms of a product as opposed to a subscription. But in the very beginning, it was simple. It was services and support.
Sage: Yeah. I guess it was about a year ago with the product pivot, we made this decision to add more stuff into the product box than you can get just from the open source project. Since we were going after the enterprise, that was sort of building this nice GUI on top that would manage the system. It was a hard decision for us because that part wasn’t open source, so you were getting more stuff when you bought the Inktank product, and we’re, like, open source. It was sort of viewed as a necessary evil in order to solidify Inktank’s financial position and to secure funding and customers and so forth.
There’s also this sort of recognition that the people who are interested in the stuff that wasn’t open source that was in the product are the same people that are willing to pay money, and the people who are interested in the open source bits aren’t as interested in those pieces. The organizations that are deploying the open source code at scale and managing it themselves are sort of savvy users; they wouldn’t use or care about all the slick tools and the integrations with other enterprise products, but the people who have all the money that are going to keep the company alive and pay the engineers are.
That was sort of the strategy we pursued for a while; it was still a huge relief when Red Hat bought us and we’re able to just say, “No, we’re an open source company, everything’s open source and we open source that piece as well.” It was a good day for the engineers.
Ross: Yeah, I’m not even sure if I know whether the – we never really got far enough into having an open core product before we got acquired by Red Hat and they said, “No, that doesn’t conform with the model” and we were like, “Hurray!” and open sourced it. I don’t even know if we can say whether the product drove sales or not, because we hadn’t gotten far enough into it.
But it was a really tough decision for us to go open core, and we’ve seen so many people do it badly. The problem with open core is, as Sage was saying, if you mess that equation up and you make features only available through a proprietary license that the community actually wants, then you compete with your community and you put yourself in a very awkward situation. If you go the other direction and you don’t put anything interesting in that product box, then nobody will see the value and nobody will buy it, and then perhaps the community gets hurt anyway because then you starve the project of resources. It’s a really tough thing to get right.
Alexis: One thing that I found interesting was that you say you kept pricing the same even after the pivot. It sounds like you “got it right.” Lots of people have struggled with pricing. What did you learn from it and did you think you got it right or not?
Ross: I don’t know. The pricing model was always per raw capacity with Ceph, so regardless of how many servers you have, regardless of how many disks you had it spread across, regardless of your replication level or how much of that capacity you burned getting data durability, it was the same price for raw capacity and it was sort of on a sliding scale. The effect being that if I build this giant cluster to manage three petabytes worth of data, it doesn’t matter what I use that cluster for. Object, block, how many servers it’s on – it doesn’t matter. A lot of folks would go on a per node pricing model. I don’t know. Do you think we got it right, Sage, with the pricing model?
Sage: I think it might be a little bit of an unfair question because in the beginning nobody paid list price anyway; the pricing was always one-off per customer, as I am sure is the case with most startups. So I’m not sure that whatever the starting point was, was the right number.
We did recognize some general issues: having this flat pricing for the storage system regardless of how you use it is a great story, but the reality is that you’re competing in these distinct markets where the pricing is very different. So the going rate for object storage, whatever that means to the industry, is different than block storage, is different from file.
It made it a little bit difficult for us to play on all three, or all two I guess, since we’re not really playing in the file space currently. I don’t know. It’s complicated. I’ll leave that to the business people and focus on writing the code [chuckles].
Ross: Yeah. Well, something else that was interesting is that, and I don’t know if you see it the same way, Sage, we ended up competing on pricing and on a lot of other things – messaging and everything else – with a lot of very entrenched people in the storage industry. I don’t know that we, as Inktank, were a storage industry company; I always considered us as a distributed software company, or a company that knows how to build distributed systems and not necessarily storage, because we were a bunch of software engineers.
When we talked to the market, we weren’t talking to people like a storage company. We were talking to people like an infrastructure company. I think a lot of what I learned was how to position a software solution in what is currently a hardware market. You walk in with a software solution like Ceph to a company and they say, “Well how many IOPS can you push with this?” and you say, “Well that depends on what you run it on. It depends on your disks and your hardware and your network and everything.”
A lot of the people we were competing against could walk in and they could say, “I’m going to have a forklift drop off this giant refrigerator in your data center and then you configure it with a web browser and that’s it.” It’s a totally different conversation.
The reality or the conclusion that I’ve come to at the end of this journey – well, in the beginning of the journey that has begun with Ceph at Red Hat – the conclusion that I’ve come to is that it’s a completely different kind of conversation and you have to prepare these companies. If they’re expecting an apple, you can’t deliver them an orange. You have to walk in and tell them that it’s an orange and say, “This is not the storage that you’re used to pulling. It’s totally different, and it won’t fit into the same shape that your other storage solutions necessarily fit into. Or if you try to fit it in there, you might be disappointed. Instead you should look at it as a new thing that has characteristics that are unique to it alone.”
Alexis: Now speaking of positioning the company and the product – I always feel strange when I ask these questions, “How did you do marketing? How did you spread the word?” because in many cases it feels like this is a big company. It got acquired for $175 million; of course it must have been easy for them to have had the word spread. They’re this giant staple, in a way, in the open source community, but there’s always – from the outside, you don’t see the inner workings of what’s done to spread the word and make people aware of the company itself and the services and the products. What went into that?
Ross: You might be surprised to learn – or you might not – that we probably looked a whole lot bigger from the outside than we were on the inside. We always felt like there was so much more we could be doing to spread the word, and it always felt like we were juggling chainsaws [chuckles]. It’s like, “Uh-oh, that one’s coming back down!” You know?
Ross: About 18 months into my gig at Inktank, I took over VP of Marketing as well, so towards the end I was running marketing. We didn’t do a whole lot of spending money on marketing. Not a whole lot of ad campaigns, not a whole lot of traditional, expensive marketing stuff.
What we did do is we got out there. We went to a ton of trade shows. More mature companies with mature marketing departments will measure trade shows based on number of leads and opportunities and number of closed deals and all that – we didn’t measure any of that because we weren’t big enough to, yet, but we knew we had to be out there.
I think we went to something like 43 trade shows in the two years that we were a company, just telling everybody the story, pressing the flesh, making sure everybody knew that we were there and we were available for questions. I think that did a whole lot of it, but the truth is, our marketing came from the community and that’s just how it was. People knew about Ceph because they found out about it through its Linux kernel module or through its popularity with OpenStack or through some other direction. It was almost all in-bound.
The marketing function at Inktank was largely around telling the story and measuring the results. It wasn’t your typical buying ads and stuff; it was following what was happening in the community, making sure that people were getting the right amount of excitement in the right areas, and if they weren’t, going and educating evangelists on what was cool about this technology and then just letting it happen. A lot of it happened organically and I think that that’s how we knew we had a winner is when the organic adoption started happening. At least that’s when I knew that it was something special and unusual is when –.
There was this phrase we used to always say. On my marketing team, I used to have a staff that would tell me, “This doesn’t work anywhere else. We’re scared, because one day this might stop working and then you’re going to be mad at us.” [Chuckles] Because everything was lined up and it didn’t take a whole lot of effort; it just took a couple of nudges. I don’t know if that was your experience as well, Sage, because you were with it from the very beginning and you were pushing the rock uphill for a long time. But by the time that it got to 2012, it felt unstoppable.
Alexis: The rock was on skates.
Alexis: How long did you bootstrap before getting venture capital? Or was venture capital something that was on the table from the very beginning?
Sage: It depends on how you define venture capital. We spun out of DreamHost at the beginning of 2012 and DreamHost funded us through the first year. We had a small investment from Mark Shuttleworth – I guess not small – a million dollars from Mark Shuttleworth, and then when we came around to do our first round of funding, we went out looking for venture capital. It was a little bit early as far as real adoption and the typical Sand Hill Road folks were a little bit skeptical that open source was going to have this success.
We weren’t getting traction there, but we had a strategic investor that came in to the tune of another five million, and Shuttleworth put in more money, and so that got us another year. When we were out again for the second round was when Red Hat came along.
Alexis: That’s good timing.
Sage: But none of our funders were the traditional VC establishment. It was DreamHost, strategic and Mark Shuttleworth. It was an unusual funding story, [chuckling] it made for some stress.
Alexis: Again, it’s difficult to give advice when it comes to raising money, I would assume, since every company is different. But what are some tips or lessons that you’ve learned that might apply to others?
Sage: Oh, God. Don’t do it. It sucks. No, I don’t know [chuckles]. I think you need a team that has a range of skills, so I think it’s really important to have – technical co-founders go a long way, and people are willing to accept a lot of business naiveté from them if they believe in their technical vision. You also need someone with a strong understanding of the business and the funding process. If you have those two things, then you have a winning combination. I think it’s challenging to do one without the other, in either direction.
Ross: I would say, as the person who bore most of the brunt for making the investor pitch, the actual sales deck that you give to the presenters, my advice to people who are starting something new is it’s never too early to start measuring your traction. We got to the point where we knew all this traction was happening around us, but we had no way to measure it – how many commits, how many committers, how big the community is, how much it’s growing, what the adoption is, all the different deployments around the world.
It took a lot of time to pull all that together, and I guess that would be good advice, to do that upfront. It’s never too early to start taking notes about the good story that you want to tell to your investors.
Alexis: Usually open source projects, when it comes to hiring, they just go, “Hey contributors, come on over! We’re hiring, we need your help full-time.” What qualities do you look for when hiring and was that something that you find also applied to Ceph?
Sage: You have to be careful with this strategy, honestly. If your goal is to just amass as much technical expertise in your project as possible, then absolutely go to the committer base and that’s where you have experienced people that you can just hire. But if you’re also trying to grow the community, the size of the community and breadth of the community, if you put all those people under a single roof, you’re sort of cannibalizing the future growth of that ecosystem.
We’ve done our share of hiring from the community, but we hesitate each time and decide whether this is really the right thing or not. We try to avoid doing that when possible.
It’s funny that you put it that way though. In the beginning when Ceph was first open sourced, I thought it was so simple. You just have this great idea and you put the code up on SourceForge, at the time, and then magic would happen and the commits would start rolling in. That wasn’t really how it worked especially with something that’s sort of difficult for new users to pick up on and start contributing meaningfully. It takes hard work to cultivate that level of expertise in the people who are participating.
One of the nice things actually about being acquired is now I can focus more on the project and the core technology and where it’s going and less on talking to venture capitalists. I can spend a lot more time working with the people that are our partners and other people who are participating on the project and hiring and really spend time helping them get involved and start contributing meaningfully to the system. It’s a lot of work, but it’s very rewarding.
Alexis: I have a suspicion I know the answer to the following question, but what’s a position or a kind of – what need should somebody look to hire for that’s often overlooked in the early days, when they don’t just need technical folks?
Sage: It’s hard to put your finger on what that thing is. I think our most successful hires have been people who don’t necessarily have the specific skillset for the job that you want them to do, but who just show themselves to be extremely smart, intelligent and resourceful and they’ll grow into the role that you build for them or they’ll come up to speed quickly.
I think getting people straight out of school, if you can establish that they have initiative and will go off and figure something out on their own I think is hugely valuable, especially for a startup where you don’t have a lot of cycles to train people as much. You really need them to go find problems and solve them. I don’t know what the magic thing is.
Ross: As far as what kind of role you might look to hire, documentation I think has been one of the – I think we started, in the Ceph project, really ramping up documentation in late 2011, early 2012. This has been a project since 2004, and so we had a lot of catching up to do, just writing all the various reference docs and how-tos and guides and just content, and that’s something that as an open source project might never have happened until we started thinking about how to commercialize it. But that’s a huge contribution to the project, is the documentation; you can’t succeed without it.
Alexis: You actually probably answered my very next question which was, I’m going to reference how you had mentioned that many folks have done open core badly. What are open source projects doing wrong, or what are they neglecting that they should be doing? I guess that’s probably documentation.
Ross: Well, some people get it really right. Some people get documentation really, really right. With Ceph, it’s a particular challenge because you have all these tunables that have to be documented and all of these command lines – there’s all of this reference documentation, this big collection of reference documentation. And then there’s all this surrounding documentation, how to get Ceph to work with OpenStack, and CloudStack, and Proxmox, and Ganeti, and then there’s how to get it to work with Samba, and Ganesha, and there’s just tons of surrounding documentation.
We might have had a unique challenge, that somebody like WordPress – well actually that’s another good example; that needs lots of doc too because of the extensibility. There are some open source projects where that wouldn’t be as much of a concern, but I think a lot of projects get the doc wrong, but a lot of them get it right too. I’m seeing that increasingly, that a lot of them are getting it right too.
Alexis: Since we’re winding down, a few more questions that aren’t necessarily so specific to Ceph. What do you all see as the biggest opportunity in open source? That is intentionally vague to see where you’ll take it.
Sage: I don’t know if it’s the biggest opportunity, but I have a bit of a pet peeve about one particular area, and that’s figuring out how to transition the work that is done within academia in building interesting new systems and bridging those projects and ideas into viable open source projects.
One of the experiences I had in grad school was that I had a lot of peers who were building these really interesting and innovative systems. Of course everyone’s building it on top of open source tools because that’s what’s available and free and all that. To some degree, people understood how to work with the community, but usually they didn’t. There’s no real education around open source or how communities work or what it takes to really make something go.
But by the time you’ve written your thesis and you’ve cobbled things together and made it work just well enough to generate all your graphs and do your whatever, you write your thesis, you get your degree, and then you get a job, and everybody who’s hiring is, for the most part, a proprietary shop that wants the talent and not the project. It was an unusual set of circumstances that allowed Ceph to bridge the gap from the research project, to an incubated open source project, to a business that made it successful. I think if we can figure out how to make that process work better, then we’ll have a huge amount of innovation, more innovation happening in this space.
There are a couple of key challenges – there’s a gap between what’s necessary for a research prototype for academic purposes and what’s necessary for a pre-production or production prototype before you can start monetizing something. Figuring out how to incubate projects that show promise in the interim – I think there’s a lack of education in universities for undergraduate and graduate research. Every curriculum has classes set aside to teach you about large scale software development and Scrum and Waterfall and how it all works within a traditional –.
Alexis: To be a cog in the wheel.
Sage: Yeah, a traditional closed source engineering team, but it is few and far between and rare that you find programs that actually teach you about open source communities, how they work and have students go through the process and try to make meaningful contributions upstream.
In some cases, you even have structural issues where the universities have IP licensing issues around the research that they fund and patent issues and all that stuff. I wish I never had to think about that and I mostly try not to, but I think if we can figure out what the right combination is, I think we could – or what this sort of – I don’t know what it is. I don’t know what it is, but I think there’s a huge opportunity there that deserves more attention.
Alexis: And what’s the best way to make sure open source projects stay sustainable?
Sage: You need to build a self-sustaining community. I think that means it has to solve a real problem that people need and use to fill a gap, and so you have people who are relying on it, using it and contributing to making it work. I think you have to make ample business opportunities around it as well. There are instances where projects have been successful without that, but I think that to have the biggest success, you have to make it the enabling factor not just for end users but also for businesses.
Ross: In one way or another, your community has to pay its mortgage [chuckling].
Alexis: So this goes for both of you, and it can be a joint answer if you so desire. What’s one mistake you’d rather not repeat?
Ross: Oh, God.
Sage: You go first, Ross.
Ross: I’m trying to think of one that I’m willing to say in a podcast [chuckling]. Oh, boy.
Alexis: And you think this question’s hard; wait until you hear the last one [chuckling].
Sage: I don’t think this is the biggest one but I think if I did it again, I would do it differently. I think you need to be careful drinking too much of your own Kool-Aid and going too big too fast. One of the reasons why we had trouble in the first round of funding for Inktank is we spent a lot of money in that first year, and it was a little bit premature. We brought in too many people too quickly and weren’t able to utilize them effectively.
Ross: That’s so tough to get right, too, because you also have to be aggressive. You can’t take the wait and see approach when you’re in a startup; you have to run. So that’s so tough.
Sage: And I abhor the dogma that you hear from a bunch of capitalists around the lean startup. I think it’s a little bit selfish and naïve, I guess. Not naïve, but it’s self-serving for the funder because it minimizes their risk while maximizing the leverage of their potential reward, but it doesn’t mean that’s the right combination for the project. I think you got to use your instincts and just be careful.
Alexis: And now on the flip side – wait, Ross, did we get your mistake? Because we can’t let you escape that easily [chuckling].
Ross: Oh, boy. My mistake. That is so tough! I think that if I could pull it back and make it thematic, I think the mistake I tend to repeat over and over again is making a decision too quickly and running in one direction instead of waiting and seeing what’s around. I think that that happens to me a little bit too often too, and that it comes out of the startup mentality that a decision is better than no decision.
But then again, on the flip side, it’s tough to balance that. It’s tough to make sure that you’re acting at the right times and not waiting and seeing, but there are a few times in my life that I think I would have done better by waiting and seeing.
Alexis: On the flip side, what’s one decision that you’re particularly proud of?
Ross: Going to Inktank? [Chuckling] I mean, best decision I made in my entire life!
Sage: Yeah. I guess there are a bunch of things that I could say. Deciding to open source Ceph and work on that instead of getting a job somewhere else was one; I think starting Inktank, realizing that you really needed to have this company that’s focused on enabling the project to make it truly successful. I think we got lucky. I don’t know how conscious a decision it was, but attaching ourselves to the OpenStack ecosystem, to some degree, was a strategy that paid huge dividends – figuratively and literally, I suppose.
Alexis: Particularly currying the favor of Mark Shuttleworth [chuckles].
Sage: Yeah [chuckles].
Ross: What’s funny is that the benefits of that weren’t entirely clear to me until I guess six months ago when it finally occurred to me that it’s not just that the use case matched and that the technology matched – it’s also that the community matched. People who are deploying OpenStack are used to dealing with distributed systems; they’re used to dealing with systems that have a hundred tunables, where you have to understand the intricate nature of this system. Those users are not as scared of something like Ceph as an appliance consumer would be. I think the brilliance of that move wasn’t clear to me until six months ago, when I realized it’s about the people that – we got that affinity with OpenStack, but it was also getting affinity with the community of like-minded people who were caring about the same thing. That was something that I don’t know we went into it thinking, but I certainly think it now.
Sage: Yeah. There’s that alignment of maturity, of overall vision, high-level for what data centers should look like, buying patterns – it’s everything. It was just a good match.
Alexis: Okay, now for the very tough question. When it comes to code, what’s your text editor of choice?
Sage: [Chuckles] Emacs.
Alexis: Oh! Oh I think you might be the first Emacs person here.
Ross: Oh, wow. Yeah, when I’m in a shell, I use Vim; when I’m on a Mac, I use Coda.
Ross: You should ask Sage what his email program is. That’s the real question.
Sage: [Laughs] That’s the embarrassing one. That’s where I really date myself. I still use Pine with a million colorization rules. I use Vi in a shell for random stuff; not even Vim. And then I use Emacs for coding.
Alexis: Man, you guys are hardcore, so to speak. We’ve had a lot of Sublime Text users over the past 22, 23 episodes or so. Where can listeners go if they’d like to learn more about Ceph?
Sage: That would be Ceph.com.
Ross: Yup, Ceph.com is a good place to start. Ceph.com/docs is where all the docs live, and Ceph.com/get is where you can go to download it. Nice, convenient URLs.
If you search on YouTube for intro to Ceph, you’ll see talks from Sage, you’ll see talks from me. Sage’s talks are a lot more dense and technical and mine are a lot more whiteboard-y and big shape-y. Both are good for new users.
Alexis: And if we’d like to dive into some more Inktank stuff, where can we go?
Ross: Inktank.com is the place to go for now. Eventually all those assets will be moved over to redhat.com, but for now Inktank.com is still up and operational. There’s a URL, Inktank.com/resources that has all of the white papers and data sheets and reference architectures and that sort of stuff.
Alexis: And if we’d like to get 140-character updates on your lives, where can we follow you at?
Ross: Sage, you first.
Sage: There’s the Ceph Twitter, @ceph. My personal Twitter is @liewegas – L-I-E-W-E-G-A-S – my name backwards. But probably Ceph is the place to go for Ceph stuff.
Ross: Probably Ceph. And I’m @rossturk on Twitter. There’s also a Red Hat storage handle that you can go to if you want to hear how we’re turning Ceph into a Red Hat product. You can stay up-to-date on all of that there if you’re interested. Pretty much everything that’s on Red Hat’s storage that is applied to the Ceph community will probably be retweeted on Ceph anyway.
Alexis: And for us, you can find us @Binpress and myself @alexissantos. Ross, Sage, thank you for coming on the show.
Sage: Thank you.
Ross: Thank you.
Sage: Thanks for having us.
Alexis: And most importantly for coming on here together because we somehow managed to cram all this into just over an hour, I think, so congratulations to us. We managed it.
Ross: Hurray! [Chuckles]
Sage: Well done!
Alexis: And for the listeners, we’ll catch you next week.
Author: Alexis Santos