AI PCs aren’t very good at AI « Pete Warden’s blog


I’ve long been a fan of Qualcomm’s NPUs, and I even collaborated with them to get experimental support for the underlying HVX DSP into TensorFlow back in 2017 (traces of it remain). That meant I was very excited when I heard they were bringing those same accelerators to Windows tablets, offering up to 45 trillion ops per second. As soon as the Microsoft Surface Pro version running on Arm was released, we bought a bunch and prepared to use them as the main platform for our instant translation app, since it needs a lot of computing power to run all the transformer models that power it.

Unfortunately I struggled to get anywhere near the advertised performance using the NPU. In fact, in my experience it was usually significantly slower than the CPU. To try to get to the bottom of these issues, I’ve open sourced a benchmark where I try to get the best possible performance on a foundational AI operation, multiplying two large matrices, and show that the NPU is slower than the CPU path. I only see 573 billion operations per second, less than 1.3% of the 45 trillion operations per second that’s listed in the specs (and roughly a quarter of the 2.16 teraops I get from the Nvidia RTX 4080 in my gaming laptop with the same benchmark).
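As a rough illustration of how figures like these are derived, here is a minimal sketch in pure Python. It is not the open-sourced benchmark itself: the function names are hypothetical, the matrices are tiny, and it assumes the common convention of counting each multiply-add as two operations, which may differ from how the benchmark or the spec sheet counts. It times a naive matrix multiply, converts the result to operations per second, and expresses a measured rate as a percentage of a quoted peak.

```python
import time


def matmul(a, b):
    """Naive matrix multiply on nested lists; a is m x k, b is k x n."""
    b_cols = list(zip(*b))  # transpose b so each column is one tuple
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]


def matmul_op_count(m, k, n):
    """Assumed convention: each of the m*n outputs costs k multiplies
    and k adds, so the total is 2*m*k*n operations."""
    return 2 * m * k * n


def measure_ops_per_second(m, k, n, repeats=3):
    """Time the multiply several times and report the best run."""
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        matmul(a, b)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return matmul_op_count(m, k, n) / max(best, 1e-9)


def utilization_percent(measured_ops, peak_ops):
    """Measured throughput as a percentage of the quoted peak."""
    return 100.0 * measured_ops / peak_ops


if __name__ == "__main__":
    # Figures quoted in the post: 573 billion ops/s measured on the NPU,
    # 45 trillion ops/s in the specs, 2.16 teraops on the RTX 4080.
    npu, peak, gpu = 573e9, 45e12, 2.16e12
    print(f"NPU utilization: {utilization_percent(npu, peak):.2f}%")
    print(f"GPU vs NPU: {gpu / npu:.1f}x")
    print(f"This interpreter: {measure_ops_per_second(64, 64, 64):,.0f} ops/s")
```

Taking the best of several runs is a deliberate choice here: it reduces noise from warm-up and scheduling, which matters when the goal is to find the highest throughput a path can reach.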

I’m used to not getting great utilization of AI acceleration hardware; getting to 10% of the theoretical peak throughput is often considered a good result, but I’m disappointed at the 1.3% we’re seeing here. It’s hard to tell where the problem lies, but I’m hoping it’s in the software stack somewhere, since I’ve seen much better performance with similar chips on Android. It could even be an issue with how I’m calling the code, though I’ve tried to follow the documentation as closely as possible. I’m guessing the ONNX Runtime, drivers, and on-chip code haven’t had enough work done on them yet, which is good news because those all should be fixable with software updates. I also miss the ability to compile and run my own operations on the DSP, since that would provide an escape hatch to these issues, but that’s apparently not allowed on Windows.

Hopefully we will get some help solving whatever issues are preventing us from achieving the performance that we’d expect. If you have ideas, please feel free to fork the code and give it a try yourself. I’d love to hear from you. I’m still hopeful that the hardware can deliver, but right now it’s very disappointing.
