Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon

Rating: 4.5 (20 reviews)

A local fine-tuning solution for Gemma 4 on Apple Silicon, specifically addressing the lack of audio fine-tuning support in MLX and memory constraints for longer sequences.

Traction Score: 135

Discussions: 20

Launch Date: Apr 8, 2026


Product Positioning & Context

AI Executive Synthesis

This project delivers a local fine-tuning solution for Gemma 4 multimodal models on Apple Silicon, specifically targeting M2 Ultra Macs. It streams large audio datasets from Google Cloud Storage during training, so the full corpus never has to fit on the local machine. The developer warns that fine-tuning on longer sequences easily runs out of memory (OOM) on a 64GB Mac Studio, and cites the absence of audio fine-tuning support in MLX as the primary motivation. For B2B SaaS, the tool could enable cost-effective, privacy-preserving local fine-tuning for companies with sensitive data or limited cloud budgets, letting developers iterate on specialized multimodal models on local hardware without relying solely on cloud infrastructure.

About six months ago, I started working on a project to fine-tune Whisper locally on my M2 Ultra Mac Studio with a limited compute budget. I got into it. The problem I had at the time was I had 15,000 hours of audio data in Google Cloud Storage, and there was no way I could fit all the audio onto my local machine, so I built a system to stream data from my GCS to my machine during training.

Gemma 3n came out, so I added that. Kinda went nuts, tbh.

Then I put it on the shelf.

When Gemma 4 came out a few days ago, I dusted it off, cleaned it up, broke out the Gemma part from the Whisper fine-tuning and added support for Gemma 4.

I'm presenting it for you here today to play with, fork and improve upon.

One thing I have learned so far: It's very easy to OOM when you fine-tune on longer sequences! My local Mac Studio has 64GB RAM, so I run out of memory constantly.

Anywho, given how much interest there is in Gemma 4, and frankly, the fact that you can't really do audio fine-tuning with MLX, that's really the reason this exists (in addition to my personal interest). I would have preferred to use MLX and not have had to make this, but here we are. Welcome to my little side quest.

And so I made this. I hope you have as much fun using it as I had fun making it.

-Matt
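The post does not show how the streaming works, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes a hypothetical `fetch` callable standing in for a remote read (for example a GCS download), a hypothetical `max_samples` cap to keep long clips from blowing up memory, and a small shuffle buffer so that only a handful of clips are resident at any time. All names here are illustrative.

```python
import random
from typing import Callable, Iterable, Iterator, List


def stream_clips(
    object_names: Iterable[str],
    fetch: Callable[[str], List[float]],
    max_samples: int,
    buffer_size: int = 4,
    seed: int = 0,
) -> Iterator[List[float]]:
    """Lazily yield audio clips, holding at most `buffer_size` in memory.

    `fetch` is a hypothetical stand-in for a remote download; it is only
    called when the consumer (e.g. a training loop) asks for more data.
    """
    rng = random.Random(seed)
    buffer: List[List[float]] = []
    for name in object_names:
        # Truncate each clip so a single long recording cannot dominate memory.
        clip = fetch(name)[:max_samples]
        buffer.append(clip)
        if len(buffer) >= buffer_size:
            # Emit a random buffered clip: approximate shuffling without
            # ever materializing the full dataset locally.
            yield buffer.pop(rng.randrange(len(buffer)))
    # Drain whatever is left once the listing is exhausted.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    # In-memory stand-in for a bucket: ten clips of increasing length.
    fake_bucket = {f"clip_{i}.wav": [0.0] * (1000 * (i + 1)) for i in range(10)}
    for clip in stream_clips(fake_bucket, fake_bucket.__getitem__, max_samples=4000):
        print(len(clip))
```

Because the function is a generator, nothing is downloaded until the consumer pulls the next clip, and the peak number of clips held locally is bounded by `buffer_size`. Truncation is the bluntest way to bound sequence length; a real pipeline might chunk long recordings instead of discarding the tail.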

Deep-Dive FAQs

What is Gemma 4 Multimodal Fine-Tuner for Apple Silicon?

Our AI analysis summarizes Gemma 4 Multimodal Fine-Tuner for Apple Silicon as a local fine-tuning solution for Gemma 4 on Apple Silicon, specifically addressing the lack of audio fine-tuning support in MLX and memory constraints for longer sequences. In more detail: "This project delivers a local fine-tuning solution for Gemma 4 multimodal models on Apple Silicon, specifically targeting M2 Ultra Macs. It address..."

Where did Gemma 4 Multimodal Fine-Tuner for Apple Silicon originate?

Data for Gemma 4 Multimodal Fine-Tuner for Apple Silicon was aggregated directly from the Hacker News community ecosystem, representing raw developer and early-adopter sentiment.

When was Gemma 4 Multimodal Fine-Tuner for Apple Silicon publicly launched?

The initial public indexing or launch date for Gemma 4 Multimodal Fine-Tuner for Apple Silicon within our tracked developer communities was recorded on April 8, 2026.

How popular is Gemma 4 Multimodal Fine-Tuner for Apple Silicon?

Gemma 4 Multimodal Fine-Tuner for Apple Silicon has a traction score of 135 and 20 recorded discussions or engagements.

Which technical categories define Gemma 4 Multimodal Fine-Tuner for Apple Silicon?

Based on metadata extraction, Gemma 4 Multimodal Fine-Tuner for Apple Silicon is categorized under topics such as: Gemma 4, Multimodal Fine-Tuner, Apple Silicon, M2 Ultra Mac Studio.

Is Gemma 4 Multimodal Fine-Tuner for Apple Silicon recognized by media or academic researchers?

Not so far. The only recorded mention is a Github.com listing dated Apr 7, 2026, and GitHub is a code-hosting platform rather than a media outlet. No peer-reviewed literature has been matched to the project.

What are some commercial alternatives to Gemma 4 Multimodal Fine-Tuner for Apple Silicon?

Our semantic intelligence engine identifies potential commercial alternatives in the SaaS space, such as Investor Updates, which offers overlapping value propositions.

How does the creator describe Gemma 4 Multimodal Fine-Tuner for Apple Silicon?

The original author or development team describes the product as follows: "About six months ago, I started working on a project to fine-tune Whisper locally on my M2 Ultra Mac Studio with a limited compute budget. I got into it. The problem I had at the time was I had 15,..."

Community Voice & Feedback

mandeepj • Apr 8, 2026

> I had 15,000 hours of audio data

do you really need that much data for fine-tuning?

neonstatic • Apr 8, 2026

Just a heads up, that I found NVIDIA Parakeet to be way better than Whisper - faster, uses less compute, the output is better, and there are more options for the output. I am using parakeet-mlx from the command line. Check it out!

conception • Apr 7, 2026

I’m pretty excited about the edge gallery ios app with gemma 4 on it but it seems like they hobbled it, not giving access to intents and you have to write custom plugins for web search, etc. Does anyone have a favorite way to run these usefully? ChatMCP works pretty well but only supports models via api.

pivoshenko • Apr 7, 2026

nice!

yousifa • Apr 7, 2026

This is super cool, will definitely try it out! Nice work

LuxBennu • Apr 7, 2026

I run whisper large-v3 on an m2 max 96gb and even with just inference the memory gets tight on longer audio, can only imagine what fine-tuning looks like. Does the 64gb vs 96gb make a meaningful difference for gemma 4 fine-tuning or does it just push the oom wall back a bit? Been wanting to try local fine-tuning on apple silicon but the tooling gap has kept me on inference only so far.

craze3 • Apr 7, 2026

Nice! I've been wanting to try local audio fine-tuning. Hopefully it works with music vocals too

dsabanin • Apr 7, 2026

Thanks for doing this. Looks interesting, I'm going to check it out soon.

Discovery Source

Hacker News

Aggregated via automated community intelligence tracking.

Tech Stack Dependencies

No direct open-source NPM package mentions detected in the product documentation.

Media Tractions & Mentions

Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon
Github.com • Apr 7, 2026

Deep Research & Science

No direct peer-reviewed scientific literature matched with this product's architecture.