
My Progress with Local AI - Failing to Get the NPU to "Just Work"


ahmadmanga · last month · 2 min read

This is a follow-up to the previous article, in which I talked about running local AI models with Koboldcpp.

On AMD APUs, that framework mainly uses the CPU and GPU via Vulkan. I noticed that Koboldcpp doesn't use the NPU at all.

Part of the reason I even bought this machine was its NPU, so not utilizing it didn't sit right with me. I tried to look for alternatives...

NOTE: An NPU (Neural Processing Unit) is a dedicated chip inside the computer that accelerates neural-network operations, which makes it well-suited to generative AI.

My research led me to discover that the GGUF library does not use the NPU. It may in the future, but not right now. If I wanted to make use of the NPU in my system, I had to use the ONNX format instead.
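To make this concrete, here is a small sketch of how you could check whether an NPU backend is even available to ONNX Runtime. My assumption here is that AMD's Ryzen AI NPU shows up as the `VitisAIExecutionProvider`; on an unsupported machine you only see the CPU (and maybe GPU) providers.

```python
# Sketch: checking for an NPU backend in ONNX Runtime.
# Assumption: AMD's Ryzen AI NPU is exposed as "VitisAIExecutionProvider".

def has_npu_provider(providers):
    """Return True if the provider list includes AMD's NPU backend."""
    return "VitisAIExecutionProvider" in providers

# With onnxruntime installed, you would pass the real list:
#   import onnxruntime
#   has_npu_provider(onnxruntime.get_available_providers())
# Here is the same check against a typical CPU-only list:
print(has_npu_provider(["CPUExecutionProvider"]))  # False
```

If that check comes back false, no amount of configuration in the app layer will route inference to the NPU.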

The app server I found that does that was Lemonade, though when I installed it, it couldn't recognize my NPU hardware.

I realized that I needed to install Ryzen AI Software first. I installed it after installing its requirements, which include Visual Studio 2022 (I installed 2026 instead, which is backwards compatible). It wasn't easy to set up, but I finally managed to get it working.

Still, even with Ryzen AI Software, Lemonade couldn't recognize my NPU hardware... Apparently, Lemonade only works with Ryzen AI 300 series or newer.

My hardware isn't supported by Lemonade!
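You can tell roughly where you stand before installing anything by looking at the CPU name (on Windows, `platform.processor()` returns it). The naive check below is my own assumption about what "Ryzen AI 300 series" detection amounts to, not Lemonade's actual logic, and the CPU names are illustrative examples:

```python
# Naive sketch: does the CPU name look like a Ryzen AI 300-series part?
# The regex and approach are assumptions, not Lemonade's real detection code.
import re

def looks_like_ryzen_ai_300(cpu_name: str) -> bool:
    # 300-series parts are named like "Ryzen AI 9 HX 370" or "Ryzen AI 7 350".
    return re.search(r"Ryzen AI \d+(?: [A-Z]+)? 3\d\d", cpu_name) is not None

# An older Ryzen APU (example name) fails the check; a 300-series part passes:
print(looks_like_ryzen_ai_300("AMD Ryzen 7 7840HS"))     # False
print(looks_like_ryzen_ai_300("AMD Ryzen AI 9 HX 370"))  # True
```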

But I still had the Ryzen AI Software, so I ran its QuickTest Python application, and it worked... I then tried to download a small ONNX-format model, but found out they're HUGE!

I've been using the GGUF format up until now, and only quantized models at that, so I didn't realize how big ONNX models could be. What's worse, to use one I'd have to pull a whole repository from HuggingFace: with GGUF models I know exactly which file to pick and download, but ONNX models seem to need the entire folder.
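The difference can be sketched like this: with GGUF you filter a repo's file listing down to the one quantization you want, whereas an ONNX repo has to be mirrored whole. The helper and file names below are made up for illustration:

```python
# Hypothetical illustration of why GGUF downloads stay small: you grab exactly
# one quantized file from the repo's listing instead of the whole repository.

def pick_gguf_file(files, quant="Q4_K_M"):
    """Return the single .gguf file matching the wanted quantization, if any."""
    matches = [f for f in files if f.endswith(".gguf") and quant.lower() in f.lower()]
    return matches[0] if matches else None

repo_files = [
    "README.md",
    "model-Q4_K_M.gguf",   # ~4-bit quant: the one file you actually download
    "model-Q8_0.gguf",     # larger quant you can skip
]
print(pick_gguf_file(repo_files))  # model-Q4_K_M.gguf
```

In `huggingface_hub` terms, this is the difference between `hf_hub_download` (one file) and `snapshot_download` (the entire repo), which is what ONNX-format models seem to require.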

So, I'm going to stop looking into the ONNX format for now, but I'll keep RyzenAI installed just in case.

What do you think?

