Why chase the spotlight?

When I was little, kids with attention-seeking behaviors were often dismissed and even punished by our teachers and parents. Self-promotion was considered boasting, a sign of weakness and a lack of true strength. My late father told me that small & weak dogs bark a lot more than big & strong ones.

However, if you look around on any social media platform, or even at in-person social events, there are a lot more people comfortably promoting themselves and seeking attention without hesitation. What were they told by their teachers and parents when they were little? I kept wondering while observing their behaviors.

Maybe their backgrounds (family, education, culture, etc.) are the complete opposite of mine, and they can comfortably misrepresent themselves online or in person, especially when doing so is financially rewarding.

There are probably no right or wrong answers, just the fact that people are adjusting their behaviors to the current social and economic environment. Even though I’d still feel uncomfortable, or even ashamed, constantly calling attention to myself, as a parent I probably should not dismiss or punish my own kids for their attention-seeking behaviors anymore.

[The above is human (myself) generated content]

Anyways, below is a summary from Google Gemini, which seems reasonable.

Pop culture encourages people to be in the spotlight, partly due to its celebration of individualism, success, and fame, and partly because social media platforms amplify this desire by providing tools for self-promotion and the pursuit of attention. This can be seen in the way celebrities are presented as aspirational figures and how online platforms incentivize the creation and sharing of engaging content. 

Here’s a more detailed breakdown:

  • Individualism and Achievement: Pop culture often emphasizes individual achievement, talent, and success, which can be showcased through being in the spotlight. 
  • Celebrity Culture: The constant presence of celebrities in media and entertainment normalizes the idea of being famous and admired, creating a desire for recognition and validation. 
  • Social Media Amplification: Social media platforms are designed to encourage engagement and sharing, which can translate into a desire for likes, followers, and visibility. 
  • The “Selfie” Generation: The rise of the “selfie” culture and the emphasis on personal branding have made it easier for individuals to present themselves to the world and seek attention. 
  • Fear of Missing Out (FOMO): The constant exposure to others’ curated lives on social media can lead to FOMO, making people feel the need to be visible and participate in the digital spotlight.

Too many experts on the GenAI bandwagon now?

Simply put, GenAI/LLMs are fundamentally based on tokenizing information and estimating associations and similarities between tokens (via learned weights and biases). Tokens are discrete units, but our world and human experiences are continuous. If you understand how things work inside GenAI, you are better positioned to tell the difference between the current hype and the real abilities and limitations of GenAI. It’s really fun, and it can be useful for video/audio/text generation, though.
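To make the tokenization point concrete, here is a minimal sketch in plain Python (no ML libraries). The tiny vocabulary and the hand-made “embeddings” are invented purely for illustration; real tokenizers learn subword vocabularies with tens of thousands of entries, and real models learn the vectors as weights rather than having them chosen by hand.

```python
# Minimal illustration (not a real LLM tokenizer): text -> discrete token IDs,
# then token "associations" measured as cosine similarity between vectors.
import math

# Hypothetical toy vocabulary for illustration only.
vocab = {"the": 0, "cat": 1, "dog": 2, "sat": 3, "on": 4, "mat": 5, "<unk>": 6}

def tokenize(text: str) -> list[int]:
    """Map whitespace-separated words to discrete integer token IDs."""
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

# Hand-made 3-dimensional "embeddings"; in a real model these are learned weights.
embeddings = {
    1: [0.9, 0.1, 0.3],  # cat
    2: [0.8, 0.2, 0.4],  # dog
    5: [0.1, 0.9, 0.2],  # mat
}

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

print(tokenize("The cat sat on the mat"))    # [0, 1, 3, 4, 0, 5]
print(cosine(embeddings[1], embeddings[2]))  # "cat" vs "dog": high similarity
print(cosine(embeddings[1], embeddings[5]))  # "cat" vs "mat": much lower similarity
```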

However, if you are on LinkedIn or YouTube, it’s not hard to find many self-proclaimed GenAI experts who are clearly fueling the hype instead of communicating the true abilities & limitations, whether intentionally or unintentionally. Why? Glory and fortune? But how long can this last?

[The above is human (myself) generated content]

Below is Google Gemini’s answer to a question of mine.

I find it useful.

Question: New self-proclaimed AI experts often seem to focus only on GenAI and ignore the classic machine learning use cases and operations that have been in place for decades. Why?

Answer from Google Gemini:

  1. “Wow Factor” and Media Hype:
    • GenAI is more tangible and visually impressive: Tools like ChatGPT, Midjourney, and Sora can generate human-like text, stunning images, and even videos from simple prompts. This immediate and creative output is incredibly engaging and easy to demonstrate, making for great headlines and social media content.
    • Traditional ML is often “behind the scenes”: While foundational and critical, classic ML applications (e.g., fraud detection, predictive analytics, recommendation systems, spam filters) often work silently in the background, optimizing processes or making predictions that aren’t as directly visible or “creative” to the average person. Their impact, though immense, is often less immediately “shareable” or exciting.
  2. Lower Barrier to Entry (for using GenAI):
    • User-friendly interfaces: GenAI tools are designed for broad accessibility. Anyone can open a browser, type a prompt, and get a result. You don’t need to write code, understand algorithms, or manage data pipelines to get started.
    • Traditional ML requires deeper technical skills: To implement classic ML, you typically need to understand data preprocessing, feature engineering, model selection, training, evaluation, deployment, and ongoing monitoring. This requires programming skills (Python, R), knowledge of libraries (scikit-learn, TensorFlow, PyTorch), and statistical understanding.
  3. Perception of Novelty and “The Future”:
    • GenAI feels cutting-edge: It represents a new frontier in AI, pushing boundaries in content creation and human-computer interaction. It feels like “the future” of AI.
    • Traditional ML feels “established”: While continuously evolving, the core concepts and many applications of traditional ML have been around for decades. Some might perceive it as “old news” or less innovative, even though it forms the bedrock of most current AI applications.
  4. Simpler Narrative for Non-Technical Audiences:
    • GenAI is easy to explain: “It creates new stuff!” is a simple and compelling message.
    • Traditional ML is harder to simplify: Explaining the nuances of a gradient boosting model for churn prediction or a CNN for image classification is far more complex for a general audience. The “experts” targeting business leaders or the public find it easier to discuss GenAI’s transformative potential without getting bogged down in technical details.
  5. Monetization and Consulting Opportunities:
    • Immediate business buzz: Companies are scrambling to understand and implement GenAI, creating a huge demand for “experts” who can explain it, consult on strategies, and lead workshops. This often translates to higher speaking fees, consulting rates, and job opportunities compared to the more mature and often specialized traditional ML market.
    • “Disruption” narrative: The idea that GenAI will “change everything” appeals to businesses looking for a competitive edge, leading them to seek out anyone who claims to understand this “disruptive” force.

It’s important to clarify: Generative AI is a subset of machine learning, typically utilizing deep learning architectures (like transformers for LLMs) trained on massive datasets. It builds upon decades of ML research and breakthroughs.

The danger of this GenAI-only focus by “self-proclaimed experts” is that it can lead to:

  • Misapplication of technology: Trying to force GenAI solutions onto problems where traditional ML is more efficient, accurate, or robust (e.g., using an LLM for simple classification tasks when a classical model would suffice with less computational overhead and higher reliability).
  • Ignoring existing value: Overlooking the massive, proven value that traditional ML systems already deliver across industries for critical operations like fraud detection, credit scoring, demand forecasting, quality control, and medical diagnostics.
  • Lack of foundational understanding: Without a grasp of core ML principles, it’s hard to truly understand the limitations, biases, and ethical implications of even GenAI models, which can lead to unrealistic expectations or dangerous deployments.

Ultimately, genuine AI expertise requires understanding both the novel capabilities of GenAI and the robust, long-standing power of traditional machine learning, and knowing when and how to apply each appropriately.
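To make the “misapplication of technology” point in the answer above concrete, here is a minimal sketch of a simple text-classification task handled by a classical model: TF-IDF features plus logistic regression in scikit-learn. The tiny toy dataset is invented for illustration; the point is only that a pipeline like this trains in milliseconds on a laptop, with no LLM involved.

```python
# A classical ML baseline for a simple classification task,
# sketched with scikit-learn on an invented toy dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "win a free prize now", "limited offer click here",        # spam-like
    "meeting moved to 3pm", "please review the attached doc",  # ham-like
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

# TF-IDF features + logistic regression: cheap to train, easy to evaluate and monitor.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["free prize offer", "see you at the meeting"]))  # expected: [1, 0]
```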

Study the fundamentals before formulating an opinion, please…

It’s becoming almost disturbing that so many of the GenAI experts I have seen and talked to, either online or in person, have so little understanding of the inner workings of GenAI. They have little or no knowledge of (or interest in learning) basic concepts such as transformers, tokenization, and weights and biases.

On one occasion, a GenAI expert was invited to an opening ceremony that I was sitting in on. After her crazy statement about the superintelligence that GenAI is about to achieve, I asked her a question: “Why can my three-year-old daughter identify cats and dogs after being shown only five photos in five minutes (it cost me a small chocolate), while any GenAI foundation model needs millions of samples to learn the same thing, and still makes mistakes after millions of dollars in training costs?”

Her answer was, “soon, GenAI will get there…”.

Then I knew our conversation was over.

Anyways, the content above was generated by a human (myself).

The content below was generated by Google Gemini.

Question: Why do so many people suddenly become “AI experts” on social media in almost no time?

The surge in people claiming to be AI experts or thought leaders, especially those from non-tech backgrounds or without relevant degrees, can be attributed to several factors:

  1. Accessibility and Democratization of AI Tools: The rise of user-friendly AI tools, particularly large language models (LLMs) like ChatGPT and Gemini, has significantly lowered the barrier to entry for interacting with and generating AI-powered content. Individuals can now leverage these tools to perform tasks that previously required technical expertise, leading to a perception that they “understand” or “master” AI with relatively little effort.
  2. “AI Superagency” and Productivity Gains: AI tools can greatly enhance individual productivity and creativity, even for non-technical users. This “superagency” can lead to impressive outputs in various fields (e.g., marketing, content creation, business strategy), making it seem like the user has deep AI knowledge when they are primarily skilled at prompt engineering and applying off-the-shelf solutions.
  3. Hype and FOMO (Fear of Missing Out): AI is currently a hot topic, generating immense media attention and investment. This creates a strong incentive for individuals and businesses to associate themselves with AI, often leading to exaggerated claims of expertise to capitalize on the trend or avoid being left behind.
  4. Social Media Amplification: Platforms like LinkedIn, X (formerly Twitter), and others provide a powerful megaphone for self-promotion. It’s easier than ever to share opinions, “insights,” and examples of AI tool usage, quickly building a public persona as an “expert” without rigorous peer review or credential verification. Viral content and engagement can often be mistaken for genuine expertise.
  5. Focus on Application vs. Deep Technical Understanding: Many self-proclaimed experts are focused on the application of AI in specific domains (e.g., how AI can impact marketing, law, or finance) rather than the underlying technical principles, algorithms, or ethical implications. While valuable in itself, this practical focus can be misconstrued as comprehensive AI expertise.
  6. “Credential Inflation” and Evolving Definitions of Expertise: The traditional pathways to expertise (e.g., PhDs in computer science, published research in top-tier conferences) are still critical for deep AI research and development. However, the rapid evolution of AI means that “expertise” is becoming a broader concept. Some argue that understanding how to effectively use and apply AI, even without knowing the code behind it, constitutes a new form of expertise. This can lead to a devaluation of traditional academic credentials in favor of perceived practical acumen.
  7. Demand for “AI Translators”: As AI becomes more ubiquitous, there’s a growing need for individuals who can bridge the gap between technical AI developers and non-technical business leaders or the general public. People who can effectively communicate AI concepts, explain its potential, and guide its adoption are in high demand, regardless of their technical depth. This creates an opportunity for those with strong communication and strategic skills to position themselves as “thought leaders.”
  8. Lack of Clear Standards: The field of AI is still relatively nascent and rapidly changing, making it difficult to establish universally recognized standards for what constitutes an “expert” or “thought leader.” This ambiguity allows more room for individuals to define themselves as such.

While it’s true that valuable contributions to AI can come from diverse backgrounds, especially in terms of ethical considerations, societal impact, and domain-specific applications, the sudden proliferation of “experts” from non-technical fields can sometimes dilute the meaning of true AI expertise, which often involves deep mathematical, statistical, and programming knowledge, alongside a rigorous understanding of AI algorithms and their limitations. It’s important for consumers of AI content to critically evaluate the claims and backgrounds of those presenting themselves as authorities.

Support Vector Machines vs. Deep Neural Networks – from a Math POV

While some geeks like to compare SVMs with DNNs in terms of their use cases and the tasks each is best suited to, people rarely differentiate the two from a mathematical point of view. This post documents the comparable aspects of these two classes of algorithms that I have learned about over the past 12 months, kept here for future reference.

Linear Algebra and Statistics are two fundamental branches of modern Mathematics. Both have been critical to the success of Machine Learning algorithms in recent years, even though some Mathematicians tend to dismiss Statistics as “not real math” (the same group may also consider Linear Algebra and Differential Equations “not real math”).

The objective of SVMs is to find the optimal hyperplane that separates data points of different classes. Linear algebra operations are used to find the hyperplane with the maximum margin. Data points are represented as vectors, which support operations such as addition, subtraction, and scalar multiplication. More advanced operations required by SVMs include the following (a small sketch follows the list below):

  • Dot Products: Calculating the distance and relationships between data points in a high-dimensional feature space.  
  • Linear Transformations: Using kernel functions (kernels themselves can be non-linear, but the underlying computation amounts to implicitly mapping the data into a higher-dimensional space where a linear separator can be found).  
  • Matrix Operations: Solving optimisation problems that can be formulated using matrices (e.g., finding the Lagrange multipliers).
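To make the role of these operations concrete, here is a minimal sketch (NumPy and scikit-learn, on invented 2-D points; not a full treatment): the linear kernel matrix is nothing more than pairwise dot products, and the fitted SVC exposes the support vectors and hyperplane parameters produced by the underlying matrix-based optimisation.

```python
# Sketch: a dot-product (linear) kernel matrix and a fitted SVM on toy 2-D data.
import numpy as np
from sklearn.svm import SVC

# Invented, linearly separable toy data: two classes of 2-D points.
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0],
              [6.0, 5.0], [7.0, 7.0], [8.0, 6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# Linear kernel: K[i, j] = <x_i, x_j>, i.e. a matrix of pairwise dot products.
K = X @ X.T
print(K.shape)  # (6, 6)

# The SVM solves a quadratic optimisation over this kernel (via Lagrange multipliers)
# to find the maximum-margin hyperplane; only the support vectors end up mattering.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.support_vectors_)        # the data points that define the margin
print(clf.coef_, clf.intercept_)   # w and b of the hyperplane <w, x> + b = 0
```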

Statistical learning theory provides the theoretical framework for SVMs in terms of minimising both training error and generalisation error. For instance, the Vapnik-Chervonenkis (VC) dimension is an important concept used to derive a probabilistic upper bound on the test error of a classification model. If the test error is much higher than the training error, the trained model is considered overfitted. Such statistically motivated techniques aim to improve generalisation and prevent overfitting. Moreover, it is commonly believed that a larger margin leads to better performance on data unseen by the trained model in production. (Both the margin/error trade-off and the VC bound are written out below the list.)

  • Regularisation: The “C” parameter in SVMs controls the trade-off between maximising the margin and minimising the training error, which in turn helps prevent overfitting. 
  • Probabilistic Output: Although the default SVM output is a class label, fitting a separate statistical model to the SVM’s decision function scores can produce probabilistic output. 
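For reference, the two ideas above can be written down explicitly. These are the standard textbook forms (sketched from memory, so treat them as a reminder rather than a citation): the soft-margin SVM objective, in which C trades margin width against training error, and a Vapnik-style bound tying generalisation to the VC dimension h of the model class.

```latex
% Soft-margin SVM primal: C trades off margin maximisation against training error.
\min_{w,\,b,\,\xi}\ \frac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{N}\xi_{i}
\quad \text{s.t.}\quad y_{i}\,(w^{\top}x_{i} + b) \ge 1 - \xi_{i},\qquad \xi_{i} \ge 0

% VC-style generalisation bound: with probability at least 1 - \eta, the test error
% R(f) exceeds the training error R_emp(f) by at most a term driven by the VC dimension h.
R(f) \;\le\; R_{\mathrm{emp}}(f) \;+\;
\sqrt{\frac{h\left(\ln\frac{2N}{h} + 1\right) - \ln\frac{\eta}{4}}{N}}
```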

Each layer of a DNN is essentially built upon linear algebra operations. Input data (vectors or tensors) is multiplied by a weight matrix at every layer, and a bias vector is added to the result of the matrix multiplication (a small NumPy sketch follows the list below). 

  • Tensor Operations: Operations on multi-dimensional arrays are critical to more complex architectures such as convolutional or recurrent neural nets.  
  • Gradient Calculation: The famous backpropagation algorithm relies heavily on the chain rule of calculus, applied to matrix operations, to calculate gradients of the loss function with respect to the weights. 
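A minimal NumPy sketch of that linear-algebra core (invented layer sizes and random data, for illustration only): the forward pass of one dense layer is a matrix multiplication plus a bias vector, and the backward pass is the chain rule written out as matrix products.

```python
# One dense layer, forward and backward, to show the matrix algebra at the core of DNNs.
import numpy as np

rng = np.random.default_rng(0)
batch, d_in, d_out = 4, 3, 2           # invented sizes for illustration

X = rng.normal(size=(batch, d_in))     # input batch (one vector per row)
W = rng.normal(size=(d_in, d_out))     # weight matrix
b = np.zeros(d_out)                    # bias vector

# Forward pass: matrix multiplication + bias, followed by a nonlinearity (ReLU here).
Z = X @ W + b
A = np.maximum(Z, 0.0)

# Backward pass (chain rule as matrix operations), given dL/dA from the layer above.
dA = rng.normal(size=A.shape)          # placeholder upstream gradient
dZ = dA * (Z > 0)                      # ReLU derivative
dW = X.T @ dZ                          # gradient of the loss w.r.t. the weights
db = dZ.sum(axis=0)                    # gradient w.r.t. the bias
dX = dZ @ W.T                          # gradient passed back to the previous layer

print(A.shape, dW.shape, db.shape, dX.shape)  # (4, 2) (3, 2) (2,) (4, 3)
```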

DNNs are highly complex, non-linear statistical models built for learning intricate patterns from large amounts of data. Minimising the loss function (statistical in nature, e.g. cross-entropy or MSE) is a vital part of training DNNs: it quantifies the difference between the predictions and the true labels (a short sketch follows the list below). 

  • Optimisation Algorithms: Due to their statistical properties and the computational savings of approximating the full gradient from smaller batches, Stochastic Gradient Descent (SGD) and its variants are used extensively to train DNN models on very large datasets. SGD provides a computationally cheaper way to iteratively update the weights based on the gradients calculated during backpropagation. 
  • Regularisation Techniques: Statistical overfitting is also a significant problem for DNNs. Dropout, weight decay (L1 and L2 regularisation), and batch normalisation are common techniques to mitigate it. 
  • Probabilistic Interpretation: The output of the final layer of a DNN, when performing classification tasks, is often passed through a softmax function to provide a probability distribution over the classes. 
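And a companion sketch for the statistical side (again NumPy, with invented logits and labels): softmax turns the final layer’s scores into a probability distribution, cross-entropy measures the gap to the true labels, and a single SGD step moves the parameters against the gradient.

```python
# Softmax + cross-entropy loss and one SGD-style update, on invented logits and labels.
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)       # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 1.5,  0.3]])          # final-layer scores: 2 samples, 3 classes
y = np.array([0, 1])                           # true class indices

probs = softmax(logits)                        # probability distribution over classes
loss = -np.log(probs[np.arange(len(y)), y]).mean()   # cross-entropy loss
print(float(loss))

# Gradient of the cross-entropy w.r.t. the logits (softmax output minus one-hot labels).
grad = probs.copy()
grad[np.arange(len(y)), y] -= 1.0
grad /= len(y)

# One SGD step applied directly to the logits, just to show the generic update rule
# theta <- theta - lr * gradient; in a real network the gradient is backpropagated further.
lr = 0.1
logits = logits - lr * grad
```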

Key Differences in the Application of Linear Algebra and Statistics:

| Feature | Support Vector Machines (SVMs) | Deep Learning Neural Networks (DNNs) |
|---|---|---|
| Linear Algebra | Central to finding the optimal separating hyperplane; kernel trick. | Fundamental building block at each layer (matrix multiplications, etc.); crucial for backpropagation. |
| Focus | Finding a single optimal linear boundary (potentially in higher dimensions). | Learning hierarchical representations through multiple layers of linear transformations. |
| Statistics | Rooted in statistical learning theory, margin maximisation, regularisation. | Statistical modeling, loss function minimisation, optimisation algorithms, regularisation. |
| Interpretability | Can be more interpretable (especially linear SVMs) in terms of the separating hyperplane and support vectors. | Often considered “black boxes” with limited interpretability of individual weights. |
| Complexity | Generally less complex in terms of the number of parameters. | Can have millions or even billions of parameters. |
| Data Scale | Performs well on smaller to medium-sized datasets. | Excels on large datasets where it can learn complex patterns. |

In a nutshell, while both SVMs and DNNs leverage linear algebra for their core computations, DNNs use linear algebra operations far more extensively, as the building blocks of their layered, hierarchical architectures.

Both SVMs and DNNs are grounded in statistical principles, but SVMs have stronger ties to statistical learning theory and its generalisation guarantees, while DNNs rely more on empirical risk minimisation and a variety of regularisation techniques to manage the behaviour of their models.