Show HN: WaveletLM – wavelet-based, attention-free model with O(n log n) scaling

Rating: 4.5 (1 reviews)

A novel, efficient language model architecture with O(n log n) scaling in sequence length. It reports 23.8 perplexity on WikiText-103, ahead of GPT-2 Medium (which was trained on 80× more data) and Transformer-XL Standard. Generation runs in 4-5 GB of VRAM, and training takes 16.25 hours with 20 GB of VRAM on a 5090.

Launch Date: Apr 27, 2026

Product Positioning & Context

AI Executive Synthesis

This represents a critical advancement in LLM architecture, addressing the prohibitive computational costs and scaling limitations of current attention-heavy models. The reported O(n log n) scaling in sequence length, low VRAM requirement for generation (4-5 GB), and short training time (16.25 hours with 20 GB of VRAM on a 5090) directly affect infrastructure expenditure and accessibility for AI development. Outperforming GPT-2 Medium and Transformer-XL Standard on WikiText-103 with less data and budget signals a potential shift towards more efficient, sustainable AI. This could broaden deployment of advanced models in enterprises where cost and resource optimization are paramount. The author describes the model as undertrained and underregularized, and factuality still requires scaling and downstream techniques such as RAG and instruction tuning.
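To put the scaling claim in concrete terms, here is a rough back-of-the-envelope sketch. It compares idealized operation counts only (n² pairwise interactions for full self-attention against n·log2(n) for a butterfly-style transform) and ignores constant factors, hidden dimension, and memory traffic. The function names are illustrative and do not come from the WaveletLM repository.

```python
import math


def attention_ops(n: int) -> int:
    """Idealized count of pairwise token interactions in full self-attention."""
    return n * n


def nlogn_ops(n: int) -> float:
    """Idealized n * log2(n) count for a butterfly-style transform."""
    return n * math.log2(n)


for n in (1024, 4096, 16384):
    ratio = attention_ops(n) / nlogn_ops(n)
    print(f"n={n:>6}  n^2={attention_ops(n):>12,}  "
          f"n*log2(n)={nlogn_ops(n):>10,.0f}  ratio={ratio:,.1f}x")
```

At n = 1024 the idealized gap is already about 100×, and it widens as the sequence grows. Real-world speedups depend on constants that this sketch leaves out.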

WaveletLM is a wavelet-based, attention-free architecture that replaces self-attention with learned lifting wavelet decomposition, a Fast Walsh-Hadamard Transform, per-scale gated spectral mixing with SwiGLU activation, an inverse FWHT, and wavelet reconstruction. Combined with expanded MLPs and sparse product-key memory, this yields a model with O(n log n) scaling in sequence length.

With 23.8 PPL on WikiText-103, WaveletLM beats both GPT-2 Medium, which was trained on 80× more data, and Transformer-XL Standard, which uses recurrence to extend its effective context. It is undertrained and underregularized due to budget constraints, so there is much room for development and improvement.

I invite anyone who is curious to examine the model, test it out, and extend its capabilities further. All code and weights are fully open source, and a PG-19 run will be completed in 2-3 days. Generations can be done in 4-5 GB VRAM at 28.8 tokens/second, and the model is trainable in 16.25 hours with 20 GB of VRAM, both on a 5090.

README for comparison tables, instructions, logs, and future plans:
https://github.com/ramongougis/WaveletLM

Weights:
https://huggingface.co/ragou19/WaveletLM/tree/main

Generations:
https://github.com/ramongougis/WaveletLM/blob/main/logs/wiki...

The following samples were chosen for coherence, not factual accuracy. Factuality will require scaling and downstream techniques such as RAG and instruction tuning.

> The history of the city is reflected in its architecture, which includes the historic Old Town and New Castle County Courthouse Square Historic District. The building was designed by John H. Stevens, who also designed the Albany-Fulton Celebration in 1906 and built a steel-hulled shipyard on the lake shore.

> The album was released on August 25, 2007 by Sony Music Entertainment and features several songs from the record including "Never Say Die", "The Show", "Don't Cry for Me Argentina" and a cover of "I Can Only Imagine (But You Are Not Alone)".

> The species was first described by Swedish zoologist Carl Linnaeus in 1758 as Agaricus adustus. The genus name is derived from the Latin words perma "to tie", and pous ("like") means "with a large head". In 1821, French mycologists Jean-Baptiste de Lacaille placed it in section Cricetae of the order Carnivora. He later renamed it Spongiforma punctata after the Greek kribensis.
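For readers unfamiliar with the building blocks named in the author's description, here is a minimal NumPy sketch of the transform chain: lifting decomposition, FWHT, gating, inverse FWHT, reconstruction. It is not the WaveletLM implementation. It makes three assumptions: a fixed Haar lifting step stands in for the learned one, an all-ones array stands in for the learned SwiGLU gate, and everything runs on a 1-D signal, since the description does not say which axis each transform acts on. All function names are hypothetical.

```python
import numpy as np


def fwht(x) -> np.ndarray:
    """Unnormalized Fast Walsh-Hadamard Transform, O(n log n) butterflies."""
    x = np.array(x, dtype=np.float64)  # work on a copy
    n = x.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b           # butterfly sum
            x[i + h:i + 2 * h] = a - b   # butterfly difference
        h *= 2
    return x


def ifwht(y) -> np.ndarray:
    """Inverse FWHT: the transform is its own inverse up to a factor of 1/n."""
    out = fwht(y)
    return out / out.shape[0]


def haar_lift(x: np.ndarray):
    """One Haar lifting step on an even-length signal (fixed, not learned)."""
    even, odd = x[0::2], x[1::2]
    detail = odd - even           # predict: estimate odd samples from even ones
    approx = even + 0.5 * detail  # update: approx becomes the pairwise mean
    return approx, detail


def haar_unlift(approx: np.ndarray, detail: np.ndarray) -> np.ndarray:
    """Undo haar_lift exactly by reversing the update and predict steps."""
    even = approx - 0.5 * detail
    odd = detail + even
    x = np.empty(even.shape[0] + odd.shape[0], dtype=np.float64)
    x[0::2], x[1::2] = even, odd
    return x


x = np.arange(1.0, 9.0)                 # toy signal of length 8
approx, detail = haar_lift(x)           # wavelet decomposition
spectrum = fwht(approx)                 # move to the Walsh-Hadamard domain
gate = np.ones_like(spectrum)           # stand-in for a learned gate (assumption)
mixed = ifwht(gate * spectrum)          # gated mixing, then inverse FWHT
restored = haar_unlift(mixed, detail)   # wavelet reconstruction
print(np.allclose(restored, x))
```

With the all-ones gate the chain is an exact round trip, so the sketch shows that each stage is invertible and costs at most O(n log n). A trained model would replace the fixed predict and update steps and the gate with learned parameters.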

Deep-Dive FAQs

What is WaveletLM – wavelet-based, attention-free model with O(n log n) scaling?

Our AI summarizes WaveletLM – wavelet-based, attention-free model with O(n log n) scaling as a novel, efficient language model architecture with O(n log n) scaling, outperforming GPT-2 Medium and Transformer-XL Standard on WikiText-103 with significantly less training data and budget, offering lower VRAM for generation and faster training. The synthesis begins: "This represents a critical advancement in LLM architecture, addressing the prohibitive computational costs and scaling limitations of current atten..."

Where did WaveletLM – wavelet-based, attention-free model with O(n log n) scaling originate?

Data for WaveletLM – wavelet-based, attention-free model with O(n log n) scaling was aggregated directly from the Hacker News community ecosystem, representing raw developer and early-adopter sentiment.

When was WaveletLM – wavelet-based, attention-free model with O(n log n) scaling publicly launched?

The initial public indexing or launch date for WaveletLM – wavelet-based, attention-free model with O(n log n) scaling within our tracked developer communities was recorded on April 27, 2026.

How popular is WaveletLM – wavelet-based, attention-free model with O(n log n) scaling?

WaveletLM – wavelet-based, attention-free model with O(n log n) scaling has logged a traction score of over 6 and 0 recorded discussions or engagements.

Which technical categories define WaveletLM – wavelet-based, attention-free model with O(n log n) scaling?

Based on metadata extraction, WaveletLM – wavelet-based, attention-free model with O(n log n) scaling is categorized under topics such as: wavelet-based, attention-free model, O(n log n) scaling, learned lifting wavelet decomposition.

What are some commercial alternatives to WaveletLM – wavelet-based, attention-free model with O(n log n) scaling?

Our semantic intelligence engine identifies potential commercial alternatives in the SaaS space, such as Freesolo Flash, which offers overlapping value propositions.

How does the creator describe WaveletLM – wavelet-based, attention-free model with O(n log n) scaling?

The original author or development team describes the product as follows: "WaveletLM is a wavelet-based, attention-free architecture that replaces self-attention with learned lifting wavelet decomposition, a Fast Walsh-Hadamard Transform, per-scale gated spectral mixing w..."

Community Voice & Feedback

No active discussions extracted yet.

Discovery Source

Hacker News

Aggregated via automated community intelligence tracking.

Tech Stack Dependencies

No direct open-source NPM package mentions detected in the product documentation.

Media Tractions & Mentions

No mainstream media stories specifically mentioning this product name have been intercepted yet.

Deep Research & Science

No direct peer-reviewed scientific literature matched with this product's architecture.