Limitations of Modern-Day Models
Invisible Gaps
It may be slightly out of fashion to talk about traditional machine learning models when AI has taken centre stage. However, the truth is that the vast majority of commercial problems are still better suited to traditional models than to current AI toolkits—Generative AI included—and this will remain the case until the next paradigm shift occurs.
Advanced machine learning is no longer a fairy tale, even for non-tech companies, thanks to the deep commoditisation of automated machine learning platforms and infrastructure. The days of memorising and manually implementing algorithms are long gone.
It is easy to assume organisations now have the best models and are extracting optimal value from their data assets. The reality could not be further from the truth. When modelling was a scarce craft, it was often the limiting factor. But now that modelling capability has reached abundance, the limitations have shifted. Value extraction and commercial exchange remain subpar, with the root causes more nuanced and, arguably, harder to solve than mere capability uplift.
Modelling Doesn’t Replace Product Thinking
Few people know that Dropbox popularised the concept of a “Growth Team” in Silicon Valley. Today, growth is seen as a focused way to scale products towards escape velocity. At its core, growth is about building a great product that delivers what people truly want.
One of Dropbox’s most famous moments was a comment it received on Hacker News before it even had a product. Critics said, “Why build this? You’re not creating anything new.” Had Dropbox listened to such advice and settled for mediocrity, the term “cloud storage” might still be foreign to us today.
A neglected truth in today’s world is that most products are not groundbreaking—they are solution providers, empowered by existing capabilities.
Capability building has always seemed exciting and futuristic. However, it is also the easier part. Why? Because the problem is well-defined, “product-market fit” is assumed, and there is usually a quantifiable metric to optimise for. Yet, this simplicity comes with a curse: capability-driven improvements follow the path of commoditisation, where incremental returns rapidly converge to zero.
While abundant capability means we can build anything at low cost, it does not mean anyone can build a great product. The customer-centric lens, specific to each product niche, is often missing during capability building. In this sense, the application layer of technology is an entirely different beast compared to the infrastructure layer.
At the end of the day, commercial success still lies in building something people want. The process of delivering commercial models should resemble running a start-up, but this mindset is often weak, if not absent, in the modelling workforce today.
Models or Biomarkers
Traditional machine learning models were once constrained by limited computing power. Data scientists often compromised by using smaller sample sizes, fewer variables, or simpler models. Today, ensemble models combining multiple neural network sub-models with hundreds—if not thousands—of features have become commonplace.
Yet, one neglected question remains: do we need these complex models? More importantly, do they solve the problems at hand?
In medicine, most diagnoses rely on the presence of biomarkers. For example, Hutchinson-Gilford Progeria Syndrome (HGPS), a rare disease occurring in just 1 in 4–8 million people, can be identified through the presence of progerin, the disease-causing protein that leads to premature cell death.
Consider what a typical data scientist without medical expertise might do. They would collect extensive datasets on both HGPS and non-HGPS patients and build a sophisticated model. However, for rare diseases with extreme class imbalances, such models are often less useful than a simple biomarker like progerin, which offers near-perfect recall and precision.
This example highlights a broader principle: when it comes to target identification — be it for diseases or customers — biomarkers often outperform heavy models. Knowing who to target requires business acumen, as models cannot answer the question if you do not know what you are asking. Interestingly, once you have asked the right question, you often do not need a complex model at all.
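To make the contrast concrete, here is a minimal sketch on purely synthetic data (all names, prevalences, and numbers are illustrative, not drawn from any real clinical dataset): a single near-deterministic “biomarker” flag versus a generic classifier trained on weak behavioural features under extreme class imbalance.

```python
# Synthetic illustration only: a biomarker rule vs. a generic model under extreme imbalance.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500_000
y = (rng.random(n) < 2e-4).astype(int)            # rare positive class (~1 in 5,000)

# Generic features: only weakly correlated with the outcome, as aggregated behaviour often is.
X = rng.normal(size=(n, 10)) + 0.3 * y[:, None]

# Hypothetical biomarker flag: observed for nearly every true case, almost never otherwise.
biomarker = (((y == 1) & (rng.random(n) < 0.99)) | ((y == 0) & (rng.random(n) < 1e-6))).astype(int)

X_tr, X_te, y_tr, y_te, bm_tr, bm_te = train_test_split(
    X, y, biomarker, test_size=0.5, random_state=0
)

# Generic model with a default decision threshold: under this imbalance it tends to
# predict the majority class, so recall is typically near zero.
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = model.predict(X_te)
print("model     precision/recall:",
      precision_score(y_te, pred, zero_division=0),
      recall_score(y_te, pred, zero_division=0))

# The biomarker rule needs no fitting at all.
print("biomarker precision/recall:",
      precision_score(y_te, bm_te, zero_division=0),
      recall_score(y_te, bm_te))
```

The sketch is deliberately unfair to the model—that is the point: once the right question (“is the marker present?”) has been asked, the fitting step adds little.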
Identifying customer “biomarkers” involves understanding their unique behaviours. Just as progerin indicates HGPS, customer biomarkers are found in the nuances of their journeys. If you do not invest in understanding your customers, your models will not either.
The Lost Dimensions
Although this discussion focuses on traditional machine learning, similar limitations apply to AI, particularly in the context of data dimensionality.
Alex Wang made a bold observation: the world is running out of public data to train LLMs (and synthetic data is rubbish).
Further to Alex’s observation, the AI world has been abuzz with Ilya’s recent comments about the scaling limits of LLMs.
This sounds daunting, but it follows the typical path of evolution: we start with quantity, then move to depth and quality. Quantity, it seems, has been exhausted.
The way data is currently presented to models fundamentally differs from how humans process information. Consider a savings account with a bonus rate: as humans, we intuitively understand the underlying layers:
Rule: The qualifying criteria for the bonus rate.
Intent: The goal of earning the bonus rate.
Action: Regular money transfers.
When this information is ingested into models, both the rule and intent are stripped away. The actions are condensed into aggregated figures, such as total deposits and withdrawals over a period. This compression leads to a permanent loss of context and prevents true understanding of customer behaviour.
For models to evolve beyond their current state, the underlying data must transition from 1-D feature stores to a 3-D feature universe. This universe would structurally map rules, intents, and actions, preserving the dimensions critical for understanding.
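As a rough sketch of what such a structure could look like, the snippet below contrasts a flattened feature row with a record that preserves the rule, intent, and action layers. The schema and field names are assumptions made for illustration, not a reference implementation of a “feature universe”.

```python
# Illustrative schema only: flattened aggregates vs. a layered customer record.
from dataclasses import dataclass, field
from datetime import date

# What a typical 1-D feature store row keeps: aggregates with the context stripped out.
flat_features = {"total_deposits_90d": 3_000.0, "total_withdrawals_90d": 120.0, "num_transfers_90d": 3}

@dataclass
class Rule:
    description: str            # the qualifying criteria, e.g. for a bonus rate
    min_monthly_deposit: float

@dataclass
class Action:
    when: date
    amount: float
    channel: str                # e.g. "scheduled_transfer"

@dataclass
class CustomerEpisode:
    rule: Rule                  # what the product promised
    intent: str                 # why the customer acted
    actions: list[Action] = field(default_factory=list)   # what they actually did

episode = CustomerEpisode(
    rule=Rule("Bonus rate applies if at least $1,000 is deposited each month", 1_000.0),
    intent="earn the bonus rate",
    actions=[Action(date(2024, m, 1), 1_000.0, "scheduled_transfer") for m in (1, 2, 3)],
)

# Both describe the same customer, but only the episode lets a downstream model see
# that the deposits were rule-driven rather than organic saving behaviour.
print(flat_features)
print(episode.rule.description, "|", episode.intent, "|", len(episode.actions), "actions")
```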
Conclusion
We live in an interesting time where modelling capabilities are abundant, yet the commercial value extracted from them is limited by invisible gaps in data representation and understanding. The future of machine learning (and AI) depends not just on building better models, but on asking better questions and embracing deeper dimensions of data. Until then, the actual commercial benefits of technological advances may remain subdued.