Sean Riley: So with all the fancy introductions out of the way, welcome to the podcast, Carol.
Carol Reiley: Hi.
Sean Riley: You gave a great presentation on AI. There were some additional questions that we weren't able to get to because of the time, so I'm going to follow up with some of those now. The first one was, where would a company start to create a generative AI transformation? And they followed it with, "I have the data, but what platform would you suggest I use?"
Carol Reiley: I think we're extremely lucky. In the past few years, we've gained access to much easier ways to build your own model with whatever data you have. As a starting point, I'm glad they have their own data. It also depends on many different factors: the size of the data, security, whether you want it private or you're fine with it being public. There are a lot of considerations, but the good news is that there are many platforms to get started on relatively quickly. You can probably have something up and running within a week, and then comes the hard, arduous part of refining it.
I think the very first step would be to pick a model architecture. Most are transformer-based, and you can go with open-source frameworks like TensorFlow, PyTorch, or Hugging Face. Those can all be downloaded and implemented very quickly. So you just want to get an LLM and run it on your own network.
And then once you train it, you can fine-tune it, and this is where you plug in different numbers and tweak it. This is also where it gets hard, because with open-source models you'll see a big jump over your older data analysis, whatever technique you had used previously. This will probably give you the highest accuracy you've ever gotten. But then comes the hard part: a lot of companies get to the high 80-percent range and can't push past that. I would just get started without too much thought. Just think about the privacy issue.
If you have your own data and it's been cleaned up and labeled, get started: grab a transformer architecture like GPT-3 or BART, or an open-source one, and you can plug it in. A lot of the nuance is in the software engineering of getting your data into a format that can be plugged in. I would say that's probably the harder part.
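The data-formatting step Carol describes can be sketched in a few lines. This is a minimal illustration, assuming the common prompt/completion JSONL layout that many fine-tuning pipelines accept; the field names and example records are illustrative, not tied to any particular platform:

```python
import json

def to_jsonl_lines(records):
    """Format (prompt, completion) pairs as JSONL: one JSON object
    per line, the layout many fine-tuning pipelines expect."""
    return [
        json.dumps({"prompt": p.strip(), "completion": c.strip()})
        for p, c in records
    ]

# Hypothetical labeled examples from a packaging inspection task
records = [
    ("Classify this defect: dented corner", "reject"),
    ("Classify this defect: clean seal", "accept"),
]

lines = to_jsonl_lines(records)
with open("train.jsonl", "w", encoding="utf-8") as f:
    f.write("\n".join(lines) + "\n")
```

The real work, as she notes, is upstream of this: cleaning and labeling the records consistently before they ever reach this step.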
Sean Riley: You had mentioned during your talk that smaller and medium-sized companies often lack the resources to implement AI effectively. So how can these companies overcome whether it's a financial or a technical barrier to adopt AI solutions to avoid falling behind?
Carol Reiley: Yeah, that one is also a fantastic question. There are a few key nuggets I keep coming back to. If you put together a data strategy, it doesn't have to be a hundred percent AI. You have to carefully look at your company and assess where you are in terms of data analysis: how much does the leadership team get behind a data strategy, and what has your team done on just simple metrics? Look for the low-hanging fruit. Where has data shown any cost savings? Those are little lead-ins where you can start to explore with AI. Go with cloud-based services; those are the easiest to get started with, and then you can figure out the long term.
But to start with, I think it's easiest just to buy a few tokens, sign on to Amazon Web Services, Microsoft Azure, or Google Cloud, and use pay-as-you-go pricing with very minimal setup. Anything that's really easy for non-technical users can be a way to get your team going. Then try to upskill your current team to do simple data analysis, look at results, and ask good questions. Wherever you see a hint toward the metrics you care about, just get started without overthinking, over-analyzing, or over-consulting a lot of different people. There are lots of platforms, from low-cost to full-service, white-glove offerings, depending on the pricing and your data size. Just jump in.
Sean Riley: It seems like a running theme is just to start to get into it and you go from there.
Carol Reiley: You learn as you go. So I think that's the biggest insight I have because I think AI is so revolutionary. It's just good to dive in, but to start small, don't get too crazy with trying to solve your company's biggest problem with AI from day one.
Sean Riley: That makes sense. So we're in an industry, packaging and processing and manufacturing, with a lot of tribal knowledge. What are some strategies for fostering a productive partnership between AI systems and the workforce, particularly with that strong tribal knowledge?
Carol Reiley: Yeah, I have a lot of respect for companies with subject matter expertise in packaging. There are a lot of nuanced things in the way a company is run, so have respect for your cultural context and the people who are very familiar with the system. One common mistake I see is deferring entirely to the data experts. It's like bringing in a genius, but they might not know all the nuances. So you really want to figure out how to make those two cultures equal.
I think data privacy and security is a huge concern for most companies, because that is what the tribal knowledge is. Get the company's team trained and working very closely with the data scientists. It's hard because it's a cultural shift that's happening, even though there have been huge strides from AI-first companies. It's a different type of DNA, different types of structures: software versus hardware versus DIY cultures. There's a lot of friction between these groups, and this is where leadership probably needs a heavy hand to set the pace.
Sean Riley: Yeah, communication's key there, I would think. As AI continues to automate more and more tasks, how do you envision upskilling employees to transition into roles that emphasize decision-making and creativity? Are there specific programs or trainings that would be most effective?
Carol Reiley: Well, we're also super lucky that there are a bunch of free online classes through Coursera, Udacity, and Databricks, and those classes vary in intensity. One could be an AI for Everyone class, which I think is probably eight hours, so everyone in the company can watch that just to get some basic vocabulary and be on the same page; at the other end are very hardcore machine learning classes. I don't think everyone in the company needs to know how to build their own LLM or their own models, but I think the AI for Everyone classes would be great.
We're now at an age where we're going to have these tools at our fingertips, and we know they're going to save time, but it's a different skill to go from doing things yourself to managing systems. Being able to step away and see a bigger strategic view would be great. So I'd have every person take some management classes, because I do think we're stepping into that age where you're putting pieces together, not just doing simple steps, whether you have robots doing them or AI doing them. I think we're about to upskill into management, and with management comes high-level conceptual thinking and being able to communicate with other managers about certain problems.
Sean Riley: Very interesting. What best practices might you have for manufacturers to improve their data collection and labeling processes, so they can enhance their AI performance a little more easily?
Carol Reiley: I think data labeling is undervalued. Most of the time it gets handed off as an interim project, but it's one of the most crucial processes, because if your data is mislabeled or inconsistent, then no matter how awesome your model is, you're never going to get there. So first, have clear labeling guidelines and double and triple checks. Make sure everything is consistent, and not on just one type of data; otherwise you'll hit the 99th percentile on that narrow set, then move to a wider view and find yourself in trouble.
So set up your data strategy early: test it, check it, run a small data set through, and see if there are any gross errors. If a human can't tell, it's going to be hard for the AI to tell. One fallacy is that maybe the AI will see patterns humans don't see, but if you can't tell whether this is a cat or not a cat, you're going to be stuck at that 70-percent mark for a long time. And be realistic about costs, because data labeling is extremely expensive.
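The "double, triple checks" Carol recommends are often implemented as an inter-annotator agreement pass: have two people label the same sample and review every disagreement. A minimal sketch, with made-up labels for illustration:

```python
def label_agreement(labels_a, labels_b):
    """Compare two annotators' labels on the same items; return the
    agreement rate and the indices they disagree on, for human review."""
    assert len(labels_a) == len(labels_b), "annotators must label the same items"
    disagreements = [
        i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b
    ]
    rate = 1 - len(disagreements) / len(labels_a)
    return rate, disagreements

# Two annotators labeling the same four images
rate, flagged = label_agreement(
    ["cat", "cat", "dog", "cat"],
    ["cat", "dog", "dog", "cat"],
)
# rate = 0.75; item 1 is flagged for review
```

Running a small labeled sample through a check like this before any training catches the gross errors she mentions while they are still cheap to fix.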
Sean Riley: Good advice. With AI playing a growing role, and we hear about it a lot with predictive maintenance and quality assurance, how do you see AI reshaping operational efficiency, this person is asking, in the next five years or so, especially in areas like generative design and fault detection?
Carol Reiley: It's extremely exciting. Generative design and fault detection are, I think, the two big buckets that AI can tackle relatively well side by side with a human. You'll probably see real-time anomaly detection, so things can be highlighted immediately for a human to go and check whether they're real. You'll have multi-sensor data, so not just cameras but LIDARs and all these views double-checking each other. I think it can also explain the reasoning behind a fault, which would be great. An example we commonly use involves emergencies: we don't like self-driving cars or planes in this human-collaboration mode, because it's really hard to go from sitting in the back seat of a car to suddenly having to jump up and take over in an emergency. It's very jarring for a human to process what has happened if they haven't been in control.
So I think what we want is the human in the driver's seat managing these AI systems much like they are interns, where they would be like, "I think this might be an error." And then the humans ultimately are the ones responsible. So if something slips through, it's the manager's fault. So I think just the accountability needs to be clear that it's on the humans.
And then for design and optimization, I think the benefits are clear. There's lower risk of anything safety-critical, so this is where humans can be extremely free and creative, and AI can supplement with ideas as a thought partner, generating things that a single brain or even a team of humans might not come up with. It's amazing, the new ideas that have been generated with AI, whether on the design side or in creating new tools or new music videos. I would say we're at the brink of human imagination right now; it's about to be unleashed in a new way. That's the area where we're going to be super-powered. I think we're going to see and experience cool new ideas that would have taken an evolutionary process tens or hundreds of years, arriving today.
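The real-time anomaly detection Carol describes, where the system highlights a reading and a human verifies it, can be sketched with something as simple as a rolling z-score on a sensor stream. This is a toy stand-in, not a production method; the window size, threshold, and sample readings are all assumptions:

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(readings, window=5, threshold=3.0):
    """Flag readings more than `threshold` standard deviations from the
    rolling mean of the previous `window` readings, surfacing them for
    a human operator to verify rather than acting on them directly."""
    recent = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), stdev(recent)
            if sigma > 0 and abs(x - mu) / sigma > threshold:
                flagged.append(i)  # highlight; the human decides
        recent.append(x)
    return flagged

# A steady sensor with one spike: index 5 gets flagged
spikes = detect_anomalies([10.0, 10.1, 9.9, 10.0, 10.2, 25.0, 10.1])
```

This mirrors her "human in the driver's seat" framing: the code only highlights candidates, and accountability for the final call stays with the person managing the system.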
Sean Riley: Very cool. This has been great. Is there a resource you recommend for staying up to date with AI advances, whether it's a blog, a podcast, or a website? Since this is changing so fast, how can people stay abreast of it?
Carol Reiley: Yeah, it's funny, because I used to go to NIPS, which has since been renamed NeurIPS, and it used to be a group of maybe 300 researchers at a conference, and once a year you would get your conference proceedings to flip through. Now it's just insane. I would say reading daily is barely enough to keep up; it's many people's full-time job now just to read. You almost need minute-to-minute views. So while I think it's still great to go to conferences, take online classes through Coursera, Udacity, or Databricks, or YouTube, which has a lot too, and read the research papers, I read TLDR for my daily AI news. There are a few resources, but Techpresso, TLDR, and The Batch, which is a weekly newsletter, are probably my three main sources.
Sean Riley: Okay, very good. That would be very helpful. I've already taken more than enough of your time, so I really appreciate you coming on here and answering some additional questions after your presentation at the annual meeting. Thank you very much, Carol, for coming on the pod.
Carol Reiley: Thank you so much for having me.