April 4, 2026 - Health Ranger - Mike Adams
11:10
AI Capabilities Advancing RAPIDLY Even on the Same Hardware

Mike Adams details how AI developers rewrote complex codebases in hours using Claude Code and 14 autonomous agents, a task once taking years. He highlights Google's TurboQuant paper compressing KV caches sixfold to handle massive contexts on existing GPUs, while efficient one-billion-parameter OCR models outperform larger systems. These hardware efficiencies may reduce demand for high-bandwidth memory, threatening NVIDIA's sales strategy, especially after the company denied a warranty claim for a faulty $9,000 GPU. Ultimately, rapid AI advancements are reshaping computational economics and challenging traditional hardware dominance. [Automatically generated summary]

Transcriber: CohereLabs/cohere-transcribe-03-2026, sat-12l-sm, and large-v3-turbo

AI Reviewing Entire Code Base 00:05:04
So here's an AI update for you.
You know, I'm an AI developer and I've built several very popular platforms and I'm building more things that are going to be really fascinating.
And one of the techniques that I use now, which shows you the capabilities of artificial intelligence, is that I ask the AI.
Once I'm partway through a project, I ask the AI engine to tell me what features or improvements would make this project more successful, given the project goals.
So, in other words, I ask it to give me suggestions, and sometimes I ask it to ask me questions.
Like, you know, here I want to add this feature.
Ask me whatever you think is relevant about this feature.
Just give me a series of questions, and I'll give you the answers and give you the details.
Other times, I tell it to come up with new ideas and new features, and to review the entire code base and find anything that's missing or anything that might make processing more efficient, more redundant, more resilient to errors or disconnects, things like that.
And I found that today, the AI systems that exist, like Claude Code, for example, are incredibly capable now with this process.
They're very good at coming up with suggestions.
In fact, it was just yesterday I was working, let me back up. I've rewritten my document processing engine for classifying and cleaning documents to be used as indexed reference documents for our brightanswers.ai engine, as well as the brightlearn.ai book engine.
And this also powers the research engines for the articles at naturalnews.com, as you probably have guessed, because they're so well researched now.
You're like, how do they do all that research?
Yeah, AI agents with this incredible, massive document base.
That's how it's actually done.
But anyway, I decided to rewrite the whole thing, the entire way that document processing is handled, because I was running into some inefficiencies with the old design.
Anyway, so I rewrote the whole thing in about four hours with the help of AI.
That was a project that used to take six months; a year and a half ago, or heck, even a year ago, it would have taken months.
But I got that done in four hours because I was able to reuse a lot of the existing code, but I had to restructure the whole workflow, et cetera.
Anyway, that took about four hours.
And then after the four hours was done, I was using Claude Code for this one, so I asked Claude, I said, I want you to review the entire code base now, and here are the goals, and I want you to give me suggestions on what's missing, you know, what have I left out of this?
Now, before, let's say six months ago, it could not have reviewed the entire code base because the code base was too large.
It didn't have a large enough context window.
It couldn't keep it all in its memory at the same time.
You know, it couldn't really process the whole code base.
Now it can.
So it goes through the whole code base, which for this project, I don't know, it's a couple hundred K of Python code.
So it's not massive, but it's not tiny either, right?
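Whether a code base fits in a model's context window comes down to a rough token count. As a back-of-envelope sketch (assuming the common rule of thumb of roughly 4 characters per token, which is an approximation and not any particular model's actual tokenizer):

```python
# Rough sketch: estimate whether a codebase fits in a model's context window.
# Assumes ~4 characters per token, a common rule of thumb for English and code;
# real tokenizers vary, so treat these numbers as ballpark only.

def estimate_tokens(total_chars: int, chars_per_token: float = 4.0) -> int:
    """Ballpark token count for a body of text or code."""
    return int(total_chars / chars_per_token)

def fits_in_context(total_chars: int, context_window_tokens: int,
                    reserve_tokens: int = 20_000) -> bool:
    """Leave headroom (reserve_tokens) for the prompt and the model's reply."""
    return estimate_tokens(total_chars) + reserve_tokens <= context_window_tokens

# A "couple hundred K" of Python, say 250 KB of source:
codebase_chars = 250_000
print(estimate_tokens(codebase_chars))           # 62500
print(fits_in_context(codebase_chars, 200_000))  # True
```

On those assumptions, a 250 KB code base is on the order of 60K tokens, comfortably inside a 200K-token window, which is consistent with the point that earlier, smaller windows could not hold the whole project at once.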
So it goes through the whole code base and it comes back with 14 suggestions.
14 suggestions separated by high priority, medium priority, and low priority.
And I went through all 14.
I just, you know, reading through, like, oh, yeah, that sounds good.
I forgot about that.
Oh, yeah, the retries over here.
Yeah.
Oh, there's a file renaming collision problem over here, blah, blah, blah.
And I looked at all 14 and I said, Yes, all 14 do it, you know.
So the engine says, okay, sir, you know, we'll do all 14, and it spawns 14 agents and then it updates all the code, it adds all 14 things.
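The fan-out described above, one worker per accepted suggestion, run in parallel, results collected at the end, can be sketched in plain Python. This is illustrative only: Claude Code's real agent spawning is internal to the tool, and `apply_suggestion` here is a hypothetical stub standing in for an agent that edits code.

```python
# Illustrative sketch of the "spawn one agent per suggestion" pattern.
# apply_suggestion is a hypothetical stub; a real agent would edit files.
from concurrent.futures import ThreadPoolExecutor

suggestions = [
    {"priority": "high", "task": "add retries to the fetch step"},
    {"priority": "high", "task": "fix the file-renaming collision"},
    {"priority": "medium", "task": "batch the database writes"},
    # ... up to 14 in the episode's example
]

def apply_suggestion(s: dict) -> str:
    # Stand-in for the agent's work; here we just report the task.
    return f"[{s['priority']}] done: {s['task']}"

# One worker per suggestion, run concurrently, gather all results.
with ThreadPoolExecutor(max_workers=len(suggestions)) as pool:
    results = list(pool.map(apply_suggestion, suggestions))

for r in results:
    print(r)
```

The design point is the same as in the episode: the changes are independent enough to run in parallel, then a single review pass over all of them catches anything the workers broke.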
And then there's another step that I always do after it makes a bunch of changes like that.
I always go back and I tell it, review all the changes you just made one more time to see if you introduced any errors or if you left something out.
And occasionally it will find problems that way.
Not always.
Sometimes it just comes back and says, everything was perfect, no problem.
But every once in a while it's like, oh yeah, I found two bugs and here they are, boom.
I'm going to fix these.
So, anyway, I had to go back and check all 14 features that it added.
Everything was good.
And so then I ran the program and smooth as silk, you know, smooth as silk.
It's just like churning away.
And again, this would have taken months for a human programmer to do not that long ago.
NVIDIA Hardware Memory Limits 00:04:57
Months.
And I know this because I hired human programmers to do this, you know, a couple of years ago when I started the whole AI project and the data pipeline processing.
I was paying human programmers to do this.
Now, I just use AI and myself, and it all happens in a few hours or a few minutes in some cases.
So, anyway, this is where AI is today.
And AI advancements have not slowed down.
They have not slowed down.
There have been advancements that make the existing AI inference hardware more capable, the hardware base that's installed around the world, including all of our hardware in what I call my mini data center: 48 workstations.
Our hardware will be able to do more and more with each passing month as more innovation takes place.
For example, Google famously released a paper called TurboQuant, and TurboQuant allows about a six times compression of the KV cache, which is like the context cache that's needed for the model to process all your context.
You know, when you paste in a hundred K of text and you ask a question like, here's an entire book, or here's a PDF file, and I want you to find an answer that's somewhere in this book, right?
So you paste in the whole book.
That's a lot of context.
And it turns out that that context takes up loads of memory in the GPU.
It's the memory hog, actually.
I mean, obviously, the model itself takes up memory, but the KV cache takes up even more.
Well, it can, depending on your context size.
And remember that DeepSeek version 4, which is supposed to be coming out soon, has a 1 million token context window.
A million tokens, that's going to burn a lot of memory in your GPU, obviously.
So, anyway, Google comes out with TurboQuant and says we can reduce that by a factor of six without losing any fidelity in the model.
And through some clever math, it actually works.
And people are now demonstrating that.
So, all of a sudden, not only does the same hardware handle six times as much context, you know, theoretically, but also looking through that KV cache is much, much faster for the GPU, for the inference process.
It's faster because it's less stuff to sort through.
So it's faster and smaller.
It's still the same base model, and it's the same base hardware, but it does more.
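The memory math behind this is easy to sketch. The KV cache stores a key tensor and a value tensor per layer, per token, so its size scales linearly with context length. The model shape below (60 layers, 8 KV heads, head dimension 128, fp16) is an illustrative assumption, not the spec of any particular model, and the factor of six is the compression ratio the episode attributes to TurboQuant:

```python
# Back-of-envelope KV-cache memory, showing why million-token contexts eat
# GPU RAM and what a ~6x compression buys. Model dimensions are illustrative
# assumptions, not the specs of any real model.

def kv_cache_bytes(seq_len: int, layers: int, kv_heads: int,
                   head_dim: int, bytes_per_value: float) -> float:
    # 2x for the separate Key and Value tensors cached per layer.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value

GB = 1024 ** 3
# Hypothetical mid-size model: 60 layers, 8 KV heads, head_dim 128, fp16.
full = kv_cache_bytes(seq_len=1_000_000, layers=60, kv_heads=8,
                      head_dim=128, bytes_per_value=2.0)
compressed = full / 6  # the ~6x reduction described for TurboQuant-style quantization

print(f"fp16 KV cache at 1M tokens: {full / GB:.1f} GiB")
print(f"after ~6x compression:      {compressed / GB:.1f} GiB")
```

Under these assumptions the uncompressed cache alone is over 200 GiB at a million tokens, which is why long contexts dwarf the model weights themselves, and why a sixfold reduction changes what fits on existing cards.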
And also, in a similar fashion, we've seen some new OCR models that are very, very small, like 1 billion parameter OCR models that are incredibly good, just ridiculously good, better than models that were 20 times larger at OCR.
TTS models for text to speech are also, some of them are getting very good and very fast, et cetera.
So, a lot of improvements are coming on the same hardware.
And that's why demand for high bandwidth memory is actually suddenly, you know, falling because people are realizing that, hey, we might not actually need as much high bandwidth memory.
We can do more with smaller GPUs.
And this sucks for NVIDIA, obviously, because NVIDIA wants to sell you the largest, most expensive card possible.
And by the way, I had a bad thing with NVIDIA recently.
I had a $9,000 GPU that was faulty; it kept dying during inference.
And I went through a warranty replacement process with NVIDIA.
And at first, they were helpful and they're like, you know, send us the serial number and this and that, and give a photo and proof of purchase and all this stuff.
And they were like, you know, they were acting like we're going to replace it.
And then at the end, they said, no, we're not going to replace it.
Even though it has a three year warranty, they said, you got to talk to the retailer that sold you this.
I'm like, are you kidding me?
This is a manufacturer's warranty replacement.
So I'm still fighting with NVIDIA over that.
You know, you spend $9,000 on a card, you expect it to work.
And if it doesn't work, you expect them to replace it.
But so far, they have refused.
So NVIDIA sucks for that reason.
But I mean, I spend a lot of money with NVIDIA, you know, hundreds of thousands of dollars a year.
And you would think that they would treat me like a customer, you know, instead of a piece of trash, but whatever.
So just be careful with NVIDIA.
A lot of times, their cards don't work.
And you may have to get a warranty replacement.
Free BrightLearn AI Engines 00:01:05
Anyway, just letting you know about all of that. AI is getting a lot more technical and capable.
You can use my AI engines, they're all free.
You can use my AI engine at brightanswers.ai, for example.
That's our deep research engine.
Or you can use our book creation engine at brightlearn.ai.
And there you can also, of course, download books and the audiobooks that we have available now.
They're all completely free.
Downloadable.
That's at brightlearn.ai.
And you can follow more of my podcasts at brightvideos.com and my articles at naturalnews.com.
So, a lot of AI advancements coming, and I have more announcements coming up this year as well.
Major things are happening, major improvements.
So, just be ready for that.
Thank you for listening.
Take care.
Start your day right with our organic, hand roasted whole bean coffee.
Low acid, smooth, and bold.
Lab tested, and ethically sourced.
Taste the difference only at HealthRangerStore.com.