On April 30, 2020, KernelCare CEO Igor Seletskiy was interviewed on The SaaS CX Show, a podcast hosted by SaaS consultant Frank Bria. In the episode, Igor explains why his company entered the live patching market, what it’s like to do business in it, and how he plans to expand into new applications. Below is an overview of his answers and a recording of the podcast.
What led CloudLinux to create a live patching system?
Customer requests. Three years after he started CloudLinux, Oracle bought Ksplice, the original live patching system. CloudLinux customers were using Ksplice, but they didn’t want to deal with Oracle, so they asked Igor if CloudLinux could help them live-patch their Linux kernels.
He talked to his team at CloudLinux and said, “Let’s see if we can do it from scratch, using a completely different approach.” They began building a new kind of live patching system, and actually found a way to do kernel patching differently than Ksplice.
In 2014, CloudLinux released its live patching system, KernelCare.
Who is using KernelCare?
KernelCare didn’t start from a drawing-board idea; it started from customer requests, and it’s growing the same way. Igor’s guiding principle is, “If so many customers need it, we have to offer it.”
Now CloudLinux is fielding a significant number of requests from people moving to distributions such as Amazon Linux and Ubuntu. They want a live patching system that will work on these other distributions as well as Oracle Linux, so they turn to KernelCare as a solution.
KernelCare is also getting more enterprise customers these days. The enterprise market is different from its traditional hosting market, with a different way of thinking and working. This market is relatively unfamiliar to the KernelCare team, so it proceeds the same way it has in the past: by listening to the customer.
Igor describes it as an interesting business with no room for error. With most software, a bug is an inconvenience: the app might not work as well as it should. With the Linux kernel, a bug can make servers spontaneously reboot or take them down entirely, which can present big problems for an enterprise.
How does KernelCare deliver its service?
The Linux kernel is extremely complex, so the overriding concern of the KernelCare team is to not make a mistake. It’s critical to customers that it doesn’t. The team once made a mistake that crashed over 100 servers, so it changed the way it releases patches: first to a small group, then through a gradual rollout.
KernelCare has made a huge investment in testing its patches. The team might change one line in the kernel, but to make sure that the change works and won’t crash anything, it runs thousands of tests on different configurations.
What helps is that:
- KernelCare has a large customer base of hosting providers that run lots of different software configurations on hundreds of servers.
- It also has enterprise customers with large operations. They may run 30,000 servers with six generations of hardware.
The team first rolls out patches to its staging environment and tests them thoroughly. Once a patch works there, it is provided to clients. Enterprises put it in their own staging environments, and once it works there, they roll it out to all their servers.
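To make the "small group first, then gradual rollout" idea concrete, here is a minimal sketch in Python. It is not KernelCare's actual release tooling, which the interview does not describe. The function names (`plan_waves`, `staged_rollout`), the wave sizes, and the `apply_patch` and `is_healthy` callbacks are all hypothetical, chosen only to illustrate the pattern of halting before a bad patch reaches the whole fleet.

```python
from typing import Callable, List, Sequence, Tuple


def plan_waves(servers: Sequence[str], first_wave: int = 2, growth: int = 2) -> List[List[str]]:
    """Split servers into waves that start small and grow by a fixed factor.

    first_wave and growth are illustrative knobs, not values from the interview.
    """
    if first_wave < 1 or growth < 1:
        raise ValueError("first_wave and growth must be at least 1")
    waves: List[List[str]] = []
    start, size = 0, first_wave
    while start < len(servers):
        waves.append(list(servers[start:start + size]))
        start += size
        size *= growth
    return waves


def staged_rollout(
    servers: Sequence[str],
    apply_patch: Callable[[str], None],   # hypothetical: applies the patch to one server
    is_healthy: Callable[[str], bool],    # hypothetical: post-patch health check
    first_wave: int = 2,
    growth: int = 2,
) -> Tuple[List[str], bool]:
    """Patch wave by wave and stop as soon as a wave contains an unhealthy server.

    Returns (servers patched so far, whether the rollout was halted).
    """
    patched: List[str] = []
    for wave in plan_waves(servers, first_wave, growth):
        for server in wave:
            apply_patch(server)
            patched.append(server)
        # Check the whole wave before widening the rollout.
        if not all(is_healthy(server) for server in wave):
            return patched, True
    return patched, False
```

With a ten-server fleet and the default settings, the waves contain two, four, and four servers. If a health check fails in the second wave, the rollout stops after six servers and the remaining four are never touched, which is the point of releasing to a small group first.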
What has presented the steepest learning curve?
Over the past decade, dealing with people has presented the steepest learning curve for Igor: focusing on people, and learning how to work with different sorts of people.
“People skills” are important for him because kernel patching is extremely hard, and those skills enable a CEO to attract and retain the talented people who can do it. Managing people who do this difficult, specialized work requires finesse as well as technical knowledge.
What’s next for KernelCare?
KernelCare is now a “set-and-forget” product, so customers are requesting help with patching libraries such as glibc and OpenSSL. They’re turning to KernelCare to patch these libraries automatically instead of manually.
Also, Igor is now talking to vendors in the “Internet of Things” space. The IoT market is a thousand times bigger than KernelCare’s traditional hosting market, and has the same problem of scheduling downtime.
Right now, KernelCare is about two years ahead of the IoT market. For example, some ARM device manufacturers (most IoT devices use ARM-based chips) are still using hard-coded patches. In this market, efficient, effective patching isn’t a priority yet.
It soon will be, however. One of the main ways that hackers now get access to networks is by hacking IoT-enabled printers. As this becomes more of a problem, manufacturers will want rebootless kernel patching.