Overview
EP 204 - Transforming Industrial Operations: The Power of Edge Computing - David Puron, CEO, Barbara
May 28, 2024
In this episode, we spoke with David Puron, CEO of Barbara. Barbara specializes in secure edge computing platforms designed to enhance industrial operations. David shared insights on how edge computing enables real-time decision-making, improves data security, and optimizes network efficiency in various industrial applications. During our discussion, we delved into the practical strategies and real-world implementations that are shaping the future of industrial IoT through edge computing.
Transcript
Erik: David, thank you for joining me on the podcast today.

David: Yeah, no problem, Erik. I'm very glad to be with you and your audience.

Erik: Yeah, I'm glad we found this time to talk. I know our teams have been in touch for a while, and now it's the first time that we're meeting. I'm looking forward to picking your brain on trends in AI.

David: Yep. Yeah, looking forward to it.

Erik: So before we get into the topic at hand, I want to understand a little bit more about why you decided to set up and run the company Barbara. You have a very strong background in quite mature companies: Telefonica, Huawei, SEVEN Networks, Silent Circle, Bayshore Networks. It's always interesting to me when somebody has a very successful career and then decides to establish a new company. So why now, and why Barbara?

David: Well, it's a bit of a long story. As you said, I started working for large corporations like Telefonica and Huawei. But then, back in 2010, I started to work with some startups. Then we had the NSA revelations; you probably remember when all of that came out. I was quite involved with that topic, and I decided to launch my first company there. It was a company called Blackphone, and we were manufacturing highly secure smartphones. So I started to work a lot with cybersecurity. When we sold Blackphone in 2016, I realized that cybersecurity was also a gap in many other markets, like IoT. I really wanted to apply everything I had learned in my previous work on smartphone cybersecurity to a market that was really exploding, like IoT. So it was a natural career path for me, always trying to cover market gaps where I think it matters. For me, cybersecurity has always been an important topic to work on, because it really affects how the world develops. This is why I've always stayed on this topic, and this is how I started Barbara back in 2017.

Erik: Now Barbara, as it is today at least, extends quite a bit beyond cybersecurity. You set this up in 2017. At that point, of course, edge computing was quite a bit less mature than it is today. But nonetheless, you have to pick your battles, right? Edge computing could apply to pretty much any industry and many different problems. How did you define the scope of where you want to play to win?

David: Yeah, so back in 2017 we started developing a secure IoT operating system. We started from that cybersecurity focus: we developed a cybersecure operating system for IoT devices. But then, as you said, edge computing was becoming more and more mature, so some of our customers asked us, "Okay, can you also run some remote workloads on your operating system?" We said yes. So then we became a cybersecure edge computing system. Then the whole AI wave arrived a couple of years ago, which is something we will surely talk about later. In the last two years, we have become an edge AI management platform. We are the most secure edge AI management platform on the market. So cybersecurity is still at the core of our value differentiation, but in terms of value proposition, what we offer today is an edge management and orchestration tool focused on orchestrating AI models.

Erik: That's interesting. Okay. So security came first, and then you added AI processing functionality as that particular part of the tech stack matured. And I can see, looking at the industries that you're focused on (the digital grid, smart water, and also smart manufacturing), that those are certainly industries where cybersecurity is a critical determinant of their ability to adopt digital technologies, or where it may be a bottleneck. If they're not able to guarantee security, in many cases they would rather just stay unconnected. So those are the big industries. I guess manufacturing is the one that's a bit less well-defined. Within manufacturing, are there specific segments where you find that security resonates more with the customer?

David: Yeah, so before we jump into the industries, let's tackle the problem that we are addressing. As I said, Barbara is an edge management and orchestration tool focused on orchestrating AI models. That is a slightly complex statement, so for our audience, let's put it very simply. AI models and data science teams, as of today, are very used to working in the cloud. They gather the data, they put it in the cloud, and they train the models there to do predictive analysis, or whatever AI is capable of doing. And they do it in the cloud. The problem is that for many industries, like the ones you were describing, the cloud doesn't work. It doesn't work for many reasons. Sometimes it's because of data latency: they need a real-time response from the algorithms and the predictions, and the cloud cannot deliver that real-time response. Sometimes it's because of scalability and cost: if they are managing huge amounts of data, that becomes very costly in the cloud. And last but not least, again, privacy and security: sometimes they don't want their data to be hosted in an environment they don't control, like the cloud. So what we do applies more to those industries where cloud computing doesn't fit perfectly. Their common requirement is that they have very distributed assets. When assets are very distributed, cloud computing for AI doesn't work. So which industries are these? As you've said, energy. Think of energy substations, for example: an energy distributor can have 90,000 or 100,000 substations distributed all over the country. Can you imagine connecting all of them to the cloud for data-intensive AI analysis? No, it can't be done in the cloud. Also, as you said, water management: you have water treatment plants distributed all over the country. So distributed assets mean the cloud doesn't work, and then you need edge computing. For manufacturing, it's obviously not applicable to every segment, but we are more successful in manufacturing environments with distributed processing. This is more process manufacturing. It includes, for example, chemical companies, pharmaceutical companies, and those segments where the manufacturing process is more distributed rather than concentrated. Automotive manufacturing, for example, is relatively concentrated within one factory. But in process manufacturing, where you have the chemical process in one location, packaging in another location, and maybe quality in a third, it makes more sense to use edge computing and cybersecurity.

Erik: You already listed a few of the reasons why these particular industries are more suitable for edge computing and why they have requirements that the cloud cannot match. Now, edge computing has certain constraints. How do you define what is feasible within edge computing and what either cannot be done or has to be done in the cloud? I imagine you then interface with the cloud for certain workloads that need to run there. But how do you define those boundaries?

David: Yeah, so normally, when it comes to AI, for simplification we divide the AI process into three big phases. One is data collection. The second is algorithm training, or model training. The third is model deployment, understanding model deployment as the stage when the model is already running and making predictions. I think edge computing is quite useful for data gathering as well as model deployment. However, it's not that useful for model training, because model training normally requires a lot of computation power, and it's usually much more productive to train the models in the cloud. The problem with edge computing is normally resources, because these are resources that companies need to pay for upfront, and it's not as flexible as the cloud. So when you need a lot of resource consumption, as in model training, the cloud may be better. Normally, we work by helping companies collect data from factories. Then we send this data to their cloud systems, and they train the models there. Then, using our platform, they can take that trained model and deploy it to hundreds or even thousands of gateways or edge computing nodes. So our platform abstracts data scientists from the complexity of integrating with sensors and actuators for data gathering, and from integrating with hardware for model deployment.

Erik: Okay.

David: But for training, the cloud is normally used, although there are some use cases where training is done locally, at the edge.

Erik: Are there particular constraints on what types of models you're able to deploy on the edge, and which are not feasible because the hardware just can't process them? Or is that typically just a matter of cost, of how much hardware you want to put in? I guess, to some extent, you could do everything on the edge. What defines for you a case that is feasible to do at scale, at a reasonable cost, when it comes to deployment of edge computing?

David: Yeah, there are a couple of sweet spots for us. The first one relates to the type of model the company wants to use. There are normally two types of AI. One is non-real-time AI, or batch AI: the typical models where you run, for example, a prediction every night, taking all the data of the day, like financial models and so on. So there is non-real-time AI, and then there is real-time inference AI. Edge computing is more suitable for real-time inference AI. If you want to take real-time data and make a prediction, for example, if you want to take the value of a vibration sensor and decide whether a machine is going to break in the next 10 minutes, that is perfect for the edge. Because if you send the data to the cloud, and you do all the data cleaning and time-series analysis and everything in the cloud, you will probably get minutes or even hours of delay. But if you do it at the edge, then you have real time. So the edge is more suitable for real-time AI rather than batch AI. Then, when it comes to technology, there is still quite a lot of fragmentation in the AI stacks. At Barbara, for the moment, we've been focusing on two specific stacks, which are TensorFlow and Keras. But we are also building compatibility with another three of the biggest stacks throughout the year. Our objective by the end of 2024 is to be compatible with all the major AI stacks. But that is challenging, and for the moment we are a bit limited on technology. So today we are compatible with the TensorFlow and Keras stacks.
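To make the split David describes more concrete (training in the cloud, then deploying a compact model for inference at the edge), here is a minimal, hypothetical sketch using the TensorFlow/Keras stack he mentions. The model architecture, file name, and synthetic data are illustrative assumptions, not Barbara's actual pipeline.

```python
import numpy as np
import tensorflow as tf

# --- Cloud side: train a small Keras model on historical sensor data ---
# Synthetic data stands in for a real historical dataset.
X = np.random.rand(1000, 16).astype(np.float32)   # 16 features per sample
y = np.random.rand(1000, 1).astype(np.float32)    # target value to predict

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# --- Convert the trained model to a compact format for edge deployment ---
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("model.tflite", "wb") as f:
    f.write(converter.convert())

# --- Edge side: load the converted model and run a single inference ---
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

sample = np.random.rand(1, 16).astype(np.float32)  # one reading from the field
interpreter.set_tensor(inp["index"], sample)
interpreter.invoke()
print("prediction:", interpreter.get_tensor(out["index"]))
```

In this sketch, the .tflite file is the artifact that would be pushed out to the edge nodes, while the training step stays in the cloud environment.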
Erik: To stay on this point of what is possible, I think it's an interesting one, because the technology is developing very quickly, so what was possible two years ago and what will be possible two years in the future are quite different. If we look at a simple case, let's say we define a model that tracks a specific temperature variable and alerts when the temperature goes above a threshold. That's probably something that could have been done two or four years ago pretty effectively. If we're trying to model, let's say, a simulation of a chemical process with 500 different data points, that becomes much more complex to do at the edge. Where is the threshold of complexity that can be handled today? And can you share your thoughts on how quickly that is moving? What can currently only be done in the cloud that you believe will be able to migrate to the edge in one, two, three, five years? What does that roadmap look like?

David: It's complex, because the main bottleneck is always the hardware. As long as the hardware, in terms of data storage and CPU or GPU processing, allows you to process a given number of data points with a complex AI model in time, in milliseconds, then it can be done. And this depends on both the hardware and the firmware that runs on the device. Hardware capacity is growing amazingly. You have the Jetson Nano, which is a very small and very powerful device capable of processing large workloads. Now you have the Jetson Orin Nano, or the Jetson Nano Orin, I don't remember the order. Whatever it is, it's 80 times more powerful than the Jetson Nano at the same price. We are running algorithms that take 4,000 data points per second and run a predictive model, for example, for the chemical process of a water treatment plant. It takes 4,000 data points and predicts the amount of chemicals the customer needs to add to the process. This is something that, I wouldn't say two years ago, maybe even one year ago, was not possible. So this is moving really, really fast. At Barbara, we are also working on real-time responses in our operating system, meaning deterministic responses, which is something that only industrial PLCs handled in the past. We're working on being able to guarantee that the model will answer within a specific timeframe. All these developments are making the industry evolve really, really fast. I think that in three to five years, we will probably be able to train the models on site as well. In the end, the cloud will probably remain very suitable for specific industries like media or finance, where you have huge worldwide regional dispersion. But for industrial use cases, I think almost everything will be able to be done on-premise, without the cloud. So I think the edge is going to move fast. And yeah, we can see a lot of predictions about cloud repatriation and many other things that are helping the edge gain more and more market. So I'm sure that in three to five years, edge computing will cover probably over 50% of all AI use cases in industry.
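As a rough illustration of the real-time inference pattern David describes (a vibration sensor feeding a model that predicts an imminent failure), the sketch below runs a sliding window of recent samples through a previously converted model directly on the edge device. The sensor-reading function, window size, alert threshold, and model file are placeholders for illustration, not a real integration.

```python
import collections
import time
import numpy as np
import tensorflow as tf

WINDOW = 256          # number of recent samples the model looks at
THRESHOLD = 0.8       # alert when predicted failure risk exceeds this

def read_vibration_sample() -> float:
    """Placeholder for reading one sample from a real vibration sensor,
    e.g., via Modbus or an industrial I/O library."""
    return float(np.random.rand())

# Load a model previously trained in the cloud and converted for the edge
# (assumed to expect a flat window of WINDOW samples and return one risk score).
interpreter = tf.lite.Interpreter(model_path="vibration_model.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

window = collections.deque(maxlen=WINDOW)

while True:
    window.append(read_vibration_sample())
    if len(window) == WINDOW:
        x = np.array(window, dtype=np.float32).reshape(1, WINDOW)
        interpreter.set_tensor(inp["index"], x)
        interpreter.invoke()
        risk = float(interpreter.get_tensor(out["index"])[0][0])
        if risk > THRESHOLD:
            print("High failure risk predicted: raise a local alert")
    time.sleep(0.01)  # ~100 samples/second; tune to the sensor's actual rate
```

Because the loop never leaves the device, the prediction latency is bounded by local compute rather than by a round trip to the cloud, which is the point David makes about real-time inference AI.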
Erik: Yeah, it's so interesting. So if we compare cloud versus edge, traditionally the key constraint was just the ability to process at the edge. That constraint is being reduced as the hardware and firmware evolve. Then you move into security versus cost structure, and there's also a performance element. I would assume that the cost structure in the longer term would still favor the cloud to some extent, but maybe performance and security would favor the edge. Or do you see the cost structure of edge computing also reaching parity with cloud computing if we look out five or more years?

David: I don't know what the hyperscalers will do. But the reality is that, as of today, I heard a statistic, for example, that in the UK more than 20% of companies have already moved half of their workloads to the edge. And the top reason is cost. 43% of those companies are moving to the edge because of cost, and 33% are moving because of privacy and security constraints. That 43% of companies that have already moved half of their workloads to the edge are really feeling the invoices from the hyperscalers. So I think the cloud made a lot of promises 10 years ago, but it never made any promise about cost. And if you are processing lots of workloads in the cloud, like the example I was giving you with 4,000 data points per second, it's unmanageable. Unmanageable. With the edge, first of all, the hardware is quite cheap, and it's becoming cheaper. It's also CapEx: a one-time investment that then lasts its whole life. And on the OpEx side, with companies and products like Barbara, where you can operate this complex deployment from a very simple, centralized tool with a very simple UX and UI, the total cost of operation is also becoming less and less expensive. So I don't think the cloud can compete with the edge on cost in the future. It can definitely compete when you have really large one-way deployments. Take Netflix, for example. Netflix will never go to the edge, right? It will go to large data centers and so on. But again, for industrial companies, I think the edge has a lot to say versus the cloud, and that includes cost savings.
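A quick back-of-envelope calculation helps illustrate why streaming raw data to the cloud becomes unmanageable at that rate. Only the 4,000 points per second figure comes from the conversation; the payload size and the number of sites below are illustrative assumptions.

```python
# Rough, illustrative arithmetic only: the payload size and number of
# sites are assumptions, not figures from the conversation.
points_per_second = 4_000
bytes_per_point = 20            # assumed size of one timestamped reading
sites = 1_000                   # assumed number of distributed assets

bytes_per_day_per_site = points_per_second * bytes_per_point * 86_400
total_tb_per_day = bytes_per_day_per_site * sites / 1e12

print(f"{bytes_per_day_per_site / 1e9:.1f} GB/day per site")   # ~6.9 GB/day
print(f"{total_tb_per_day:.1f} TB/day across all sites")       # ~6.9 TB/day
```

Even with these modest assumptions, cloud ingestion, storage, and egress fees accumulate on several terabytes per day, whereas processing locally keeps most of that data from ever leaving the site.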
Erik: Okay. Yeah, interesting. And then there's the cost of connectivity, which would also convert to CapEx. But then again, you have to pay for connectivity anyway if you want to reach the cloud, so there's going to be a direct cost reduction there. There's also been a lot of progress recently, or at least you could argue about how much progress, in connectivity solutions at the edge that enable more wireless environments, which I think has also been a constraint in some circumstances. I know this may be a little adjacent to your core product, but how impactful is that in enabling use cases? For example, 5G enabling higher bandwidth at the edge, or some of the lighter solutions. Do they really have a significant impact on customers' ability to adopt your solutions, or are those more tangential and less important in the decision?

David: For industrial customers, I would not say that 5G has yet become a key driver. I think it definitely needs a bit more development. The 5G promises, including ultra-low latency and some others, are still to be consolidated in the industry. This is something that always happens with cellular network releases. You had 2G and then 3G, which were standards that didn't have really good performance. Then you had 4G, which was great, right? Now you have 5G, but 5G still has some interoperability challenges between vendors and so on. I think the promises of 5G, at least from what we see, haven't arrived yet in the real market and in the field. I think we are really looking forward to 6G. However, there are some other developments that have had an impact, like LoRa. We had these non-regulated network standards like LoRa and Sigfox. Sigfox, unfortunately, is becoming less and less popular. But LoRa, for example, is a wireless connectivity standard that is being used in many factories and many agricultural fields. This is something we support, and it is enabling a lot of use cases for us. Many industrial customers are still working in wired environments using Ethernet cables. But in terms of wireless, I would put LoRa and those non-regulated networks in a more impactful position than 5G. I think 5G will eventually leave the space to 6G; that will probably be the release that gets more traction.

Erik: Yeah, that's what we found as well. We did a study with Siemens on this a while back, trying to find the use cases that 5G really unlocked, and it was challenging. Because the use cases are just ones that people are not really using that much anyway, right? Things like augmented reality on the shop floor, which anyway has such low adoption. It's maybe not a gimmick, but it's very niche.

David: But the reason they are not adopting these is not because the use case doesn't make sense. It does. The reality is that the network can't keep its promise yet. Everybody talks, for example, about real-time surgery, right? A doctor operating on a patient where the patient is in Asia and the doctor is in Europe. That is great, why not? But it is very, very critical, obviously. The network cannot fail. If the network says 100 milliseconds of delay, it needs to be 100 milliseconds of delay. There are a lot of things, like network slicing and network quality of service, which on paper and in the standards look very good, but when you need to achieve interoperability between devices and vendors and across large roaming networks, it doesn't work. So the use cases make sense. However, the technology, I think, is not there yet, despite what the standards say. The real world is very different.

Erik: Coming back to the hardware topic. On your website, for your enterprise edition, you list access to custom firmware and hardware. What does that look like? Is this a custom solution you've developed, or are you working individually with each enterprise to customize based on their requirements?

David: Our product is divided into layers. The top layer is our management platform. The management platform is the one that allows you to orchestrate and manage the devices, the models, and the data applications that are distributed on the devices. The second part of the product is the firmware that goes on the device. The firmware on the device gives it the ability to be orchestrated, obviously, and also provides a layer of security. That firmware is compatible with a number of devices and architectures: Intel, ARM, and some other big names, right? With our business license, we sell a standard firmware. With our enterprise license, we allow customers to customize that firmware. We can customize it for a large enterprise with their models, their network configuration, their proxy configuration, whatever is needed, so that they can put this firmware on their devices at the factory or in their operations, and the installation and deployment time is shortened to minutes. Whereas if you don't have a customized firmware, you need to deploy and then customize, which is more prone to mistakes. This is why, for large enterprises with lots of devices, thousands of devices, we always recommend having your own customized firmware. It helps you make fewer mistakes in the provisioning process and ensures that you can provision new devices very quickly, because the firmware is already customized with what you need.

Erik: Yeah.

David: This is what we mean by customization. In terms of hardware support, we support, I would say, most of the edge computing nodes out there that are based on Intel or ARM. We also have, by the way, a virtual firmware. So if the customer has a virtual machine server on premises, we can put our firmware there, and they can do the edge AI management and orchestration on a virtual device. They don't strictly require hardware.

Erik: I guess one other way of looking at those two tiers is that the business tier, let's say the lower tier, could also be suitable for running pilots or validating a solution, and then you do the custom work. I'm curious, because I know companies have quite different perspectives on the value of a pilot project. Does it actually validate, or does it maybe give you the illusion of something that doesn't necessarily scale? What's your philosophy there? Do you find pilots to be highly effective in predicting success, or not so much?

David: We try to differentiate with customers between when they are talking about a trial and when they are talking about a pilot. A trial is when you want to try the tool on your desk, with an edge computing node sitting next to you. That is not useful at all. The problems arrive when you take that device and put it in the field. You get into the factory, and then you realize that the data you were supposed to gather, you are not gathering, because there is a firewall in the middle, and the IT people tell you they cannot open that firewall. So that is a pilot: if it is done in a real environment, you can call it a pilot. At the end of the day, it's a small deployment, and then it is useful. For trials, if customers want to try our product, that is fine. But it won't tell you whether the solution will be useful in the field. If you want to ensure it is useful in the field, take the edge computing node to a factory or to an electrical substation, and run your models there. This is something that can take one or two months; it's not that long. But if you do it and it is successful, it will answer your questions about return on investment and everything else. I would say that over 75% of the pilots we do, speaking about this type of pilot, end up in a real production environment. But we are not talking about trials; a trial is a different thing. If you take our technology with your models to the field, 75% of the time it will end up in a real use case with a large deployment and a real return on investment. But you need to do the pilot in the field. If not, it's not useful at all.

Erik: Got you. Yeah, good point. So a pilot needs to be a proper deployment in order to actually be helpful. You mentioned a time of maybe one or two months. I guess a lot of that time is collecting the data, processing the data, and assessing results. What would be the workload involved? If somebody says, "I have a concept, I want to validate this," and you don't have to necessarily share pricing here, but what would be the workload involved in terms of man-hours? Maybe give us a rough range for somebody to actually go through that one- to two-month process.

David: Yeah, it obviously depends on how mature they are in the process. There are companies that already have the data, already have the model, and have even already trained the model. The problem they are facing is that they cannot run this model in the cloud, because, for example, they cannot connect their plants to the cloud, so they are looking into deploying this model at the edge. If they have the model, the data, and everything ready, and the problem they want to tackle is just deploying this application or model to the edge, we're talking about one or two weeks of one engineer's time. So it's very short. If the company is less mature, and they need to take data from a sensor or a few sensors, or from a database or SCADA, and then train, develop, and fine-tune the model, that is what I said: maybe two months, involving one or two people. At Barbara, we have focused a lot on usability and user experience. Our tool is really simple to use, very intuitive, with one or two clicks for many of the things you have to do. Compared with other tools, where you have a command-line interface, very complex integrations with code repositories, and very complex workflows, I think Barbara can take models to production three or four times faster. With some other edge AI computing tools, it can take you six to eight months. With Barbara, it's one or two months with one or two people. And we always have a customer success team helping those customers around the clock to make their stories a success.

Erik: Okay. Let me ask one final question here, and this is something I'm feeding into every podcast this year. This is the year of the LLM. That's a completely different type of data. What I hear from our customers is that they are very hesitant to move a lot of their data to the cloud. This now involves PDFs and other kinds of data that typically would not be a factor in edge computing. But are you seeing customers using edge computing to process these types of solutions in their facilities to avoid migrating to the cloud? How do you see that?

David: No. When we say the edge, I would differentiate between the device edge and the data center edge. I think LLMs and large model processing are done more on the data center edge. You can say it is edge computing because it's not a hyperscaler; it's your own rack in a data center. So it's close to edge computing. When people say they don't want to move their data for LLMs to the cloud, they are moving it to another data center, which is the norm. But when we refer to edge computing, we're speaking more about device edge computing, and we are not seeing customers processing LLMs on that type of edge computing. Having said that, obviously, LLMs are going to be a huge market. But that is more related to server or data center edge computing, which is something Barbara is not involved in. Barbara is more involved in predictive analysis of industrial data in real time.

Erik: Yeah, got it. Great. Well, David, I think you've given us a good overview of where we are with edge computing today. Anything that we didn't touch on that you'd like to share with folks, or any last thoughts?

David: No, I enjoyed the conversation a lot, and I hope our audience does too. I offer our time and our resources to help anybody who wants to dive deep into the use cases and the technology. I think we need to build the edge computing world together.

Erik: Yeah, great. And for the folks listening, the website is barbara.tech. Is the best way for folks to reach out just to contact you via the website? Any other approach that you would suggest?

David: Yes, we have a chatbot, a little bit of AI but also humans behind it. By going to barbara.tech and using the chatbot, you will get an immediate response from our team.

Erik: Wonderful. David, thank you.

David: Thank you very much, Erik.