Going from open source to enterprise open source

About

Ninety percent of enterprise companies are using open source software, and many are running mission-critical applications with it. Open source is becoming an integral component of the IT landscape, yet not all companies treat it as an enterprise application. Without the right level of enterprise OSS support, companies expose themselves to vulnerabilities, security issues, stability issues, lack of expert support, version inconsistencies, limited community support, and unnecessary costs. 

Companies adopting an enterprise approach to open source benefit from the same enterprise support they get from proprietary software and a community of developers supporting their efforts. The community will not only innovate and accelerate development but also be a great pool of talent for later hires.

Travis Oliphant, CEO at OpenTeams and Quansight, founder of Anaconda, NumFOCUS, and PyData, and creator of NumPy, SciPy, and Numba, will provide some examples of companies who have adopted enterprise open source support models and the benefits they have gained. 

Learn how to get enterprise support and extract all the benefits from your open source investments.

Travis Oliphant

Founder of OpenTeams, Quansight, Anaconda & NumFOCUS. Creator of NumPy, SciPy & Numba.

Transcript

Steve: Before our second session of these TechShares, we’re sitting down with Travis Oliphant to go over enterprise open source. Travis is an accomplished open source developer and executive who has created open source technologies that have been used by over 30,000 companies across 80% of the enterprise. Travis created NumPy, SciPy, and Numba, and also NumFOCUS, PyData, and Quansight, and most recently OpenTeams. So Travis, thank you for being here and for funding this event.

Travis: [00:00:44] Thank you, Steve. It’s a pleasure to be here.

Steve: Yeah. So we have a very interesting topic to talk to everyone about. In this session we’re going to address the topic of enterprise open source. Now, recent studies have shown that 90% of organizations are using open source for mission-critical applications. So nine out of ten companies use open source. But unlike proprietary systems, where companies can get support and continuous development, open source doesn’t always have that luxury unless it’s company backed. What we’d like to know, Travis, is this: for companies looking for enterprise support, where can they get it, and what can they expect from that support if it’s not company backed?

Travis: [00:01:32] Let’s back up a step and talk about enterprise open source. Because it’s a new term in your industry, many of you will not be aware of it. Enterprise open source is a useful concept, and the idea is really just to solidify the use of open source so that enterprises can trust it just like they might trust proprietary software. I’ve been doing this for 20 years. Ten years ago, companies would be using open source and would be thinking, Oh man, I don’t have a vendor to go to. The lawyers particularly would go, Who backs this? They’ve got the license, right? What does this mean? I’ve got to have more. I’ve got to have something here to protect me. Recently people have gotten more comfortable, realizing that community-driven open source, or open source that’s company backed, does have a lot of support. It’s not like you’re just going into the Wild West and there’s nobody to help you. There are a lot of people helping you. In fact, most of the popular approaches to software are fundamentally open source.

[00:02:24] So it’s changed people’s thinking to say, well, is there really a difference between open source and enterprise open source? The answer is yes, there still is. It’s more a question of quality, and then of how do I get the support I’m looking for? I would say the ecosystem of enterprise support for open source is still evolving, but there are now more options than ever to get the help you need to make sure open source can be a trusted feature of your enterprise deployments and of everything in your organization. No longer do you have to go to a company to get software because they’re the only ones that can support it. Today you can use any open source you like, get some advice as to which projects you should depend on, and then get support for that open source. OpenTeams offers a general open source support contract. We can support your use of open source: How do I use it? What do I do? How do I get bugs fixed? That’s a unique offering, one of the first of its kind. There’s also a company called Tidelift, for example, that offers indemnification, security, and maintenance for a wide variety of open source projects that fit under their support contract.

[00:03:38] So at least you have multiple options now. If you’re concerned about those factors, and that’s a reason you haven’t been taking advantage of the wealth of technologies around to help you save time and money in your organization, then that’s not a concern anymore. You can actually solve that problem today. I think these are important things people need to be aware of and need to start contemplating, because their competitors are doing it. People around you are doing it, and so they’re going to save time and money. They’re going to save the effort. They’re not going to be spending the money that you’re otherwise spending trying to build out and maintain the software infrastructure yourself. Everybody needs technology. Every single company out there relies on technology. Almost all of that technology has foundations in open source and open source communities. So it’s not something where you can just bury your head in the sand, forget about it, and go to your one vendor. We’re trying to fix that. We’re trying to help people navigate the large ecosystem of available expertise and give them a place to go for help.

Steve: So if there was a checklist of questions to consider as you’re looking to introduce a new open source technology into your environment, what would those be? You mentioned a few things about security. You mentioned a few about who’s behind it: Is it a foundation? Is it a company? Is it a community? What would your suggestion and guidance be to companies as they come into open source?

Travis: [00:05:12] We have what we call a business suitability score that we’ve come up with, which averages scores on six or seven dimensions. For example, dimensions we’ve looked at that are important to understand are the IP and the license. What license is it under? Is it a license that requires you, as you build on the open source, to make the stuff you build open source? Is it an open license? Are you protected from patents that might creep into that license, into the software you’re using? So that’s the IP risk. I think community health is something you have to look at, particularly the more you rely on a project. How many developers are in the community? How many maintainers? How many users? Is it popular? Are other people using it? There’s decreased risk in going with the flow. That’s not always the best idea, but at least it gives you some comfort that if you’re stranded on an island, you’ll be stranded together with a bunch of people and you can solve the problem together.

[00:06:20] Then there are things like the dependencies. That’s an important and often overlooked issue: a particular project might have a list of dependencies, and one of the dependencies is actually not well supported. So you need that score to be recursively computed to ensure that the dependencies of the project you’re leaning on are also well supported. Those are some of the factors. I think security is critical. With open source, you can have security issues with people injecting code into the open source project. That’s less of a risk these days, but it can happen, and it really can happen with projects that are maintained by one or two people. The more maintainers there are on a project, the more people that are actually contributing code and reviewing pull requests, the less that becomes an issue. But there are a lot of projects out there with not very many maintainers, and if there’s one person, it becomes a risk that you can have inappropriate code built in. It’s kind of surprising that doesn’t happen more often, perhaps, but it can happen. The bigger risk actually is when you’re downloading binaries, binary artifacts from the internet. That’s an easier vector for interfering with some of these large-scale public repositories. There are multiple kinds of problems, but a very common one is that you type the name wrong a little bit, and there’s a misspelled version of that package that’s actually a Trojan horse sitting somewhere else, and you can easily pull it in by accident. Instead of getting a typo and a syntax error, you get something you didn’t want.
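The idea of a score that is averaged over several dimensions and then computed recursively over dependencies can be sketched roughly as follows. This is only an illustration, not the OpenTeams scoring method: the project names, the dimension names, the 0 to 10 scale, and the choice to cap a project at its weakest dependency are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical projects and 0-10 dimension scores; every name and number is made up.
PROJECTS = {
    "app-framework": {
        "scores": {"ip_license": 9, "community_health": 8, "security": 8},
        "deps": ["http-client", "tiny-parser"],
    },
    "http-client": {
        "scores": {"ip_license": 9, "community_health": 9, "security": 8},
        "deps": ["tiny-parser"],
    },
    "tiny-parser": {
        "scores": {"ip_license": 8, "community_health": 2, "security": 5},
        "deps": [],
    },
}


def own_score(name):
    """Average of a project's own dimension scores."""
    return mean(list(PROJECTS[name]["scores"].values()))


def effective_score(name, seen=frozenset()):
    """Own score, capped by the weakest score anywhere in the dependency tree.

    Taking the minimum is an assumed aggregation rule chosen for this sketch.
    """
    if name in seen:  # dependency cycle: stop recursing
        return own_score(name)
    seen = seen | {name}
    dep_scores = [effective_score(dep, seen) for dep in PROJECTS[name]["deps"]]
    return min([own_score(name), *dep_scores])


for project in PROJECTS:
    print(f"{project}: own={own_score(project):.1f} "
          f"effective={effective_score(project):.1f}")
```

In this made-up data, the framework looks healthy on its own, but the poorly supported parser it depends on pulls its effective score down, which is the kind of hidden weakness a recursive computation is meant to surface.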

[00:07:54] So where are you getting libraries from, and where are you getting the binaries from, has become a critical question for a lot of people who rely on open source. You want to make sure the binary artifacts are ones you’re either building yourself or getting from a trusted vendor like Anaconda or Red Hat or somebody else. So that’s an important question: where am I getting this from? The supply chain is important. I think it’s important to have somebody review that. So a lot of organizations are creating what are called open source program offices. Big organizations can usually do that. Small organizations, and even those large organizations, should really take advantage of the vendors and communities out there that are providing help to accomplish those goals: things like the OpenTeams suitability score, and things like some of the offerings that we and other vendors have to help people manage that risk.
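The misspelled-package risk described above can be reduced with a simple name check before anything is installed. The sketch below uses Python’s standard `difflib` module; the allowlist contents, the function name `check_package_name`, and the 0.75 similarity cutoff are assumptions chosen for illustration, not part of any vendor’s tooling.

```python
import difflib

# Hypothetical allowlist of packages an organization has already vetted.
APPROVED = ["numpy", "scipy", "pandas", "numba", "requests"]


def check_package_name(name, approved=APPROVED, cutoff=0.75):
    """Classify a requested package name against the allowlist.

    Returns ("ok", name) for an exact match, ("suspicious", nearest) when the
    name is merely close to an approved one (a possible typo or typosquat),
    and ("unknown", None) otherwise.
    """
    name = name.lower()
    if name in approved:
        return ("ok", name)
    close = difflib.get_close_matches(name, approved, n=1, cutoff=cutoff)
    if close:
        return ("suspicious", close[0])
    return ("unknown", None)


for requested in ["numpy", "reqeusts", "leftpad"]:
    print(requested, check_package_name(requested))
```

A check like this only flags names for human review. It does not replace getting binaries from a trusted source or verifying what was actually downloaded.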

Steve: That’s really good. A really good checklist for companies as they’re looking at different open source components. One thing you did bring up is the interoperability of open source. In the absence of a standardized licensing body, it’s very useful for people to know which combinations of products and open source variations work together. What are the best practices to implement when you’re trying to bring in different open source components with your products and make them all work together for your project? What do you suggest there? What should they be looking for? Who can they trust?

Travis: [00:09:26] Yeah, that’s a good question. I think ultimately this is still resolving. This is one of those areas that’s going to continue to be with us: how do we make all this incredible explosion of technology innovation work together well? Fortunately, interoperability is also being pursued with the same kind of community-driven cooperation that we’ve seen in the creation of open source projects. I would be looking for whether there are emerging standards around a particular topic. If you look at databases, for example, SQL emerged as a standard for how to query databases, rather than everybody having their own language for querying databases. There was a lot of cooperation that created SQL.

[00:10:11] Now there are variations of SQL, and people are still innovating around the SQL language, but at least it became a really strong standard. In the same way, we have standards emerging around things like array computing and data frame computing. In fact, it’s one of the areas where we need a little more work right now. There are also standards emerging around binary data formats. Is how I store my data in an open format with lots of tools around it, or in some specific proprietary or underused format? How do I use the data? Interoperability starts to matter when it comes to APIs. Are the APIs open? Does the tool I’m using have a standardized open API, and does that API have software development kits, or SDKs, available in multiple languages, or is there only one way to access it? Is the way to access it modern? Is it using modern approaches like REST and JSON, or is it SOAP and XML and some of the older approaches to interacting with APIs?

[00:11:11] I can tell a little bit of an anecdote about the data APIs project, data-APIs.org. I got my start in Python by writing SciPy and starting that project with several of my friends. But NumPy is what I’m actually most known for. The whole point of NumPy was to create an interoperable array library, an array library that tried to unify the libraries that were diverging. So I’ve kind of built a career out of creating interoperability, effectively. Anaconda is an interoperable packaging approach, so you can package and pull together multiple languages. Instead of just being a package manager for one language, it’s a package manager for any language, any product, any story.

[00:11:56] So keep in mind that because people get really focused on the thing they do well, with the open source you pull in you might inadvertently be pulling in a narrower view of the world, and an interoperability problem can easily happen. It does mean that you need to get advice, get help, get expertise, get someone, an architect, to help you navigate that interoperability difficulty that could be arising, maybe unbeknownst to you. As that evolves, I think interoperability is important. There are a lot of good projects. Arrow is one that’s emerged as an interoperable binary format approach, a way to store and compute on data in an interoperable way. And the data API project is helping to define a data frame standard. It’s already defined a great array standard so that lots of projects like PyTorch and NumPy and ndarray and other array concepts use the same API to talk about array programming in Python. We need the same kind of energy across multiple other disciplines.

[00:13:00] So look for that. Look at what the energy is there. Sometimes people replace that cooperative energy with just who’s winning, and sometimes you do that too. You just go, well, where is the standard going to be? Because there’s a de facto standard emerging from the popularity of a project. You have to be careful, because typically that’s one area that’s very manipulable by venture capital money. Basically, what’s been happening is that a lot of VCs have essentially been buying influence by promoting a particular technology that isn’t actually coming from users. It’s just being pushed down from marketing spend. That’s definitely been happening. It’s fine to market something, but what I prefer, what I love to see, is when it’s kind of bubbling up from users.

[00:13:46] NumPy came to dominate Python not because anybody spent any marketing money, but because everybody used it. Pandas became the dataframe standard not because there was any money to market it, but because people adopted it. So where did that come from, and why is that happening? It’s important to understand, especially in our world today, with lots of investment money chasing the next biggest Databricks or the next Red Hat.

Steve: Great insight into that question. A follow-up to that one, because it’s important: often organizations miscalculate the total cost of ownership for their open source systems because they underestimate the amount of work required with open source and they’re influenced by that big marketing message you talked about. What are your thoughts regarding how to calculate the time, the skills, and the cost for one of these projects? How should they do that?

Travis: [00:14:46] Good question. It’s actually difficult. There are rules of thumb that you can still apply, in the sense that whatever it costs to implement a system, you’re going to have to spend some percentage of that cost on maintenance. That’s just going to be true. Those numbers range from 10% to 20%. Sometimes you can get it lower than that if your system is mostly cobbling together open source software. The less you’re changing open source, the more you’re just merging together open source projects and putting a high-level wrapper around them, the more you can reduce those maintenance costs, to 5%, for example. That’s kind of the calculus. If you go all in and you’re going to do everything yourself and modify the open source and own it, then your maintenance cost may grow. The challenge with maintenance costs is that if you don’t keep spending them and keeping your technical debt at bay, you end up with essentially a new project you have to build within five years. These days, within 4 to 5 years, you’ll have to spend that money again. I’ve talked to lots of managing directors and lots of folks in large organizations who bemoan this, and it becomes frustrating, because just as you’ve spent $50 million on a particular software rollout, you find that four years later you have to spend another $50 million. It feels like, wait, we’re just constantly on a treadmill here. And that actually can be the case: if you’re not building on standards, if you’re not building on open source, if you’re not leveraging the power of the communities that are out there, you can find yourself continually spending a lot of money on software.
But there’s another way, and that way is to leverage the open source ecosystem, support those communities with key lighthouses placed at the right places to ensure that what’s being produced is helping you, and then understand that there is a commitment you have to make long term. It’s not like, I’m just going to build this and be done. Recognize that there are going to be some maintenance costs, and work on getting them down. There are other ways to position that. Maybe it’s through supporting foundations, maybe it’s through supporting innovative approaches to supporting open source. But there’s going to be a cost. You minimize it by relying on open source components without changing them, by supporting the communities that will maintain those components, and by thinking closely about what your business value is and what you’re building. Focus on that, on the delivery of what you’re building, rather than on the whole infrastructure that’s needed for what you’re building.
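The rules of thumb above can be turned into a rough back-of-the-envelope comparison. The sketch below reuses the figures from the discussion ($50 million to build, maintenance of 5% to 20% per year, a forced rebuild roughly every four years when maintenance is skipped); the ten-year horizon, the function name `total_cost`, and the simple linear model are assumptions made for illustration, not a real total-cost-of-ownership method.

```python
def total_cost(build_cost_musd, maintenance_pct, years, rebuild_every=None):
    """Rough total spend in $M: initial build + yearly maintenance + rebuilds.

    maintenance_pct is the yearly maintenance as a percentage of the build cost.
    rebuild_every, if given, is how often (in years) the system must be rebuilt
    at full cost; only rebuilds strictly inside the horizon are counted.
    """
    maintenance = build_cost_musd * maintenance_pct * years / 100
    rebuilds = 0
    if rebuild_every:
        rebuilds = (years - 1) // rebuild_every
    return build_cost_musd + maintenance + rebuilds * build_cost_musd


# Three assumed scenarios over an assumed ten-year horizon.
scenarios = {
    "no maintenance, rebuild every 4 years": total_cost(50, 0, 10, rebuild_every=4),
    "15% yearly maintenance": total_cost(50, 15, 10),
    "5% yearly maintenance (mostly unmodified open source)": total_cost(50, 5, 10),
}
for label, cost in scenarios.items():
    print(f"{label}: ${cost:.0f}M")
```

Under these assumptions, skipping maintenance and rebuilding twice costs $150M over ten years, steady 15% maintenance costs $125M, and the 5% case costs $75M. The real numbers will differ for every organization, but the shape of the comparison is the point.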

Steve: Oh, yeah. Those are great answers and good best practices. Now, because we have 10 minutes left before the end of the session, we’re going to open it up to the audience for questions. If they don’t have any questions, I have several I would like to ask. If you would like to type your comment or question in the chat, we will answer it. As we wait, I will go ahead and ask another one, to follow up on that last best practice you shared. What can organizations do to ensure the longevity of the open source software that they’re using? Travis, you kind of mentioned it at the end. You talked about foundations. You talked about participation. It’s not just install it and use it; there is investment that they have to continue making. How should they do that to ensure that the software keeps getting developed with open source?

Travis: [00:18:29] Great question. I think I’ll start with something Meta just did. Meta just handed PyTorch over to a foundation. Big, big work. I talked to Soumith. He’s a strong open source champion who started the PyTorch project inside of Meta. He has long wanted to make sure that the open source project PyTorch became more community driven, migrating from company backed to community driven. So I think one thing you can do, as an organization with some influence, is to encourage that activity: encourage the open source software you depend on to become part of a foundation. There are actually many foundations out there that will be fiscal sponsors of your project these days. There’s NumFOCUS, there’s the Linux Foundation, there’s Software Conservancy, and so all it takes is a little bit of communication with the community leaders and community maintainers to encourage that. And it can happen. So that’s one way. Then there’s making a regular donation to that community. I think you should look for ways to sponsor community work orders, community-driven activity. These are simply projects, little projects you want to get done. For example, there’s an emerging site, Oasisgrants.com. It’s in very alpha mode right now, and it’s basically doing small development grants for open source, a place where people can actually put grants together. These small development grants are an initiative that NumFOCUS started and that other people do. It’s a way to help ensure that the features you care about get into the open source. So I think it’s just recognizing that this is part of your budget. As you put your engineering budget together, recognize that a part of it should be going to support the open source you depend on.

Steve: And as a follow-up question to that one: we’ve talked about support. We’ve talked about working with the open source community, foundations, sponsorship. 40% of companies have put together an open source program office to help manage that. But there may be a lot of people who are trying to figure out, should I do that? How will that help me? Can you share your insights a little bit about this governance structure that companies have put together to help them coordinate these efforts?

Travis: [00:20:47] Yeah, I think that’s a good movement, in the sense that there are a lot of factors to think about. My goal personally is to help those open source program offices be 1 to 2 people, where effectively they can get a lot of the support they need from the community. Maybe larger for larger organizations, because there are different factors to consider. For one, it depends on what’s driving that initiative. Sometimes it’s licensing concerns. Sometimes it’s, how do I have a general message to my organization about how to contribute to open source? A lot of companies appropriately have a concern about just turning on the lights and saying, hey, everybody here can just contribute to open source projects. There are questions around that. Wait a minute, are you speaking for my organization as you’re doing that open source contribution, or are you speaking individually? But there are answers to those questions. The Apache Foundation has done a lot of work on this to ensure that developers have a developer voice instead of a company voice.

[00:21:42] But there are questions like that. I would say iterate. Understand what your specific goals are for your open source program office, and then what the priorities are for it. There’s a long list of other things you’ll get from it. Look to see how much of that you can actually outsource, how much of that is available from an organization that you can partner with that can actually give you half of your open source program office capability without spending twice the money. How do you do that? There’s an easy way to think about that. I’m happy to talk to anybody about that question as well. That’s exactly what I love helping people do: organize their thinking around supporting open source communities and the open source that they depend on. It may depend on how much open source you’re producing. Is part of your strategy internally to have the company produce open source? So there are different factors involved in these open source program offices. Am I going to be absorbing open source and making sure the licenses are right? Am I worried about producing open source internally and using that as a developer growth and developer engagement, or engineering engagement, mechanism for my organization? What’s the purpose of this office? Is it risk mitigation? Am I trying to understand the software I’m depending on, and can I ensure that what I’m depending on is going to be sustained and supported? And how do I then think about influencing those communities to have a regularized approach? Because that can be very important. For example, if you’re an organization and you’re depending on 1,000 open source projects, or 10,000 projects, which is not hard to do, it’s very possible that your efforts to support and sustain are essentially focused on five of them. And then the other 995 or 9,995 are getting no interest from you. That’s an important thing to recognize, I think: wait a minute.
I need to make sure that we’re not just shoring up one thing and then having this massive hole in another thing, in another part of the ecosystem that I’m not aware of. So I think it’s an opportunity. There are people out there who can help you. One nice thing these days is that there are 40 years of open source engagement, and there are people who have spent decades in open source communities whom you can actually lean on to help provide advice. Maybe some of those want to come and join you as a full-time employee in your open source program office. But often maybe they don’t. Maybe they’re willing to give you some time and some consulting time to work on these problems.

[00:24:03] So take advantage of that. There’s a lot of expertise out there, but it is one of those things that takes experience. If somebody doesn’t have at least ten years of experience in the open source ecosystem, there are holes. I’ve spent 25 years in the open source ecosystem, and I still have things I’m learning. There are still things I’m trying to understand and really nail about how this works and how it helps business. So it can be challenging to find the right help. But it is out there. There are a lot of people out there who can help you.

Steve: It is out there. And I know Travis and the team at OpenTeams are willing to provide that support, too. I appreciate your time today, Travis, and I know you’re willing to give a little bit more help and guidance to people as they reach out. So we’re going to put this presentation out on the TechShares website and on the OpenTeams website, and we’ll send out an email with an opportunity for you to engage in more of a one-on-one discussion with Travis, Lalitha, and others at OpenTeams around enterprise open source support, open source program offices, and how to work with the open source community. So I really appreciate your time, Travis, and you being here. Our next session will be with Dharhas Pothina on the best Python open source dashboard tool for fintech companies. That one is coming up here in a couple of minutes. So if you need to take a break, we’ll give you a short break and then we’ll resume shortly. Thank you, Travis, for your time.

Travis: [00:25:38] Thank you, Steve.

Steve: Thank you.

Travis: [00:25:41] Take care.

Steve: Bye.