COSMICOUS

  • Demystifying AI/ML algorithms – Part III: Selfies, the Unsupervised.

    November 17th, 2024

    About the series

In this third part of the series on the basics of AI/ML algorithms, I deal with so-called unsupervised algorithms, which I refer to as Selfies. 'Seen it before', or supervised, algorithms were the subject of the second part (https://ai-positive.com/2024/10/20/demystifying-ai-ml-algorithms-part-ii-supervised-algorithms-2/). The series started with my treatment of Good-Old-Fashioned-AI, which gave a real start to the practical use of AI (https://ai-positive.com/2024/08/28/understanding-gofai-rules-rule-and-symbolic-reasoning-in-ai/).

Getting rid of the teacher

All the algorithms we discussed in the second part require labelled data: an outcome that the teacher, acting as trainer, relates to the other input variables in the data to identify patterns. The output of a machine learning algorithm is a trained model which, when fed new data, can predict the label to which that data belongs, and hence the outcome. Finding a good teacher is always a challenging task. What if we must automatically find the pattern in the data without explicit labels?

A child learns within the first three years of life without a real teacher! Children observe, listen, touch, taste, and smell everything they encounter, which helps them learn about their environment. Imitation and play enable a child to learn quickly. The logic of learning is already there in the child's mind. It is human nature to group things together or categorize them to make better sense of the world; we look at stars and constellations appear. Unsupervised algorithms in that sense are selfies: they find hidden patterns, structures, and relationships within the data. There are several popular unsupervised learning algorithms widely used in machine learning. Let us look at the most common ones.

K-Means Clustering: This algorithm partitions data into K distinct clusters based on the distance of each point to the centroids of those clusters. This is like separating items of a particular colour from a mixed pile of various colours. K indicates the number of clusters to be formed. Popular applications of K-Means clustering are:

    Customer Segmentation: Grouping customers based on purchasing behaviour for targeted marketing.

    Image Compression: Reducing the number of colours in an image by clustering similar colours together.

    Document Classification: Organizing documents into topics or categories based on their content.
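To make the idea concrete, here is a minimal sketch using scikit-learn's KMeans on a tiny made-up two-dimensional dataset; the points and the choice K=2 are illustrative assumptions, not taken from any real use case.

```python
import numpy as np
from sklearn.cluster import KMeans

# Six hypothetical points forming two obvious groups
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# K = 2: ask for two clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_)           # cluster index assigned to each point
print(kmeans.cluster_centers_)  # the two centroids
```

Each point ends up in the cluster whose centroid is nearest; the centroids themselves are recomputed until the assignment stabilizes.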

Hierarchical Clustering: This is a method of clustering that builds a hierarchy of clusters, which can be visualized as a dendrogram. Hierarchical clustering works something like this. Suppose you have a group of students in a class and want to form groups based on similar interests for club activities. Initially, each student is a cluster. Identify how similar each student is to every other student based on their interests. Group together the two students who have the most similar interests. Now find the similarity of this new group to the other students or groups so formed. Repeat the process until everyone is part of the hierarchy of groups.

     Typical use cases of Hierarchical Clustering are:

    Gene Expression Analysis: Grouping genes with similar expression patterns in biological research.

    Market Research: Segmenting markets based on consumer preferences and behaviours.

    Social Network Analysis: Identifying communities within social networks.
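The student-grouping walk-through can be sketched with SciPy's agglomerative linkage; the interest scores below are made up for illustration, and cutting the dendrogram at three clusters stands in for the club assignment.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Each row is a hypothetical student's interest scores: [sports, music, coding]
students = np.array([[9, 1, 2], [8, 2, 1],   # sporty pair
                     [1, 9, 2], [2, 8, 1],   # musical pair
                     [1, 2, 9], [2, 1, 8]])  # coder pair

# Agglomerative clustering: merge the most similar students/groups first.
# Z encodes the dendrogram (which clusters merged, and at what distance).
Z = linkage(students, method="average")

# Cut the dendrogram so that exactly three clubs remain
clubs = fcluster(Z, t=3, criterion="maxclust")
print(clubs)
```

Passing Z to `scipy.cluster.hierarchy.dendrogram` would draw the tree of merges described in the analogy.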

DBSCAN (Density-Based Spatial Clustering of Applications with Noise): DBSCAN groups together points that are close to each other based on their density and marks points in low-density regions as outliers/noise. It uses two parameters, a radius and a minimum number of points, to define dense regions, and expands clusters from core points.

Suppose you are at a crowded party and want to figure out who the loners are. Starting with one person, check who is close by. Anyone within close range is part of the same group. For each of the new group members, you again see who is close to them and keep adding those people to the group. If someone is not near enough to any group, they are considered an outsider, or noise. The process is repeated until everyone at the party is either part of a group or classified as noise.

    It is quite natural that DBSCAN is used for these applications:

    Anomaly Detection: Identifying outliers in datasets, such as fraudulent transactions.

    Geospatial Analysis: Detecting clusters of geographic locations, like hotspots in crime data.

    Astronomy: Clustering stars or galaxies based on their characteristics.
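The party analogy maps directly onto scikit-learn's DBSCAN: `eps` is the "close by" radius and `min_samples` is the smallest crowd that counts as a group. The coordinates below are made up, with one deliberate loner.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two tight groups of party-goers, plus one loner far away
X = np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.2], [0.2, 0.2],
              [5.0, 5.0], [5.2, 5.0], [5.0, 5.2], [5.2, 5.2],
              [10.0, 0.0]])  # the loner

# eps = radius of "close by"; min_samples = smallest dense group
db = DBSCAN(eps=0.5, min_samples=3).fit(X)

print(db.labels_)  # a label of -1 marks noise/outliers
```

Unlike K-Means, we never tell DBSCAN how many groups to find; it discovers them from the density and flags the loner with the special label -1.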

Apriori Algorithm: The Apriori algorithm is a learning method which discovers frequent itemsets in data. It then generates association rules for those itemsets. By calculating factors such as 'confidence' and 'lift', the Apriori algorithm eliminates rules which do not meet the minimum requirement and retains only those that qualify.

In the context of a supermarket, this algorithm identifies items which are frequently bought together. It first looks at individual items such as soap or shampoo and counts how often each is bought. Retaining the items that are bought frequently enough (above a particular number of times in a period such as a week), the algorithm looks at pairs of these items, such as 'soap and shampoo', to see how often they are bought together. Retaining only those pairs bought together frequently enough, the algorithm looks for larger sets of items, like soap, shampoo and possibly conditioner, and repeats the process. It keeps expanding and counting sets of items, filtering out the ones that are not bought together often enough. The process results in itemsets that are frequently bought together, which helps supermarket management understand customer behaviour and make decisions.

    Apriori Algorithm is used for:

    Market Basket Analysis: Finding frequent item sets in transactional data to understand buying patterns.

Recommendation Systems: Suggesting products to customers based on their purchase history as well as other customers' purchase histories.

    Web Usage Mining: Identifying common patterns in web navigation behaviour.
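The count-then-filter-then-expand loop described above can be sketched in a few lines of plain Python. This is a bare frequent-itemset miner, not a full Apriori with confidence and lift; the baskets are made up to echo the soap-and-shampoo example, and the `min_support` threshold of 2 baskets is an arbitrary choice.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support=2):
    """Return every itemset bought together in at least min_support baskets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: candidates are the individual items
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    while candidates:
        # Count how many baskets contain each candidate itemset
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        # Join step: combine surviving k-itemsets into (k+1)-item candidates
        keys = list(survivors)
        candidates = {a | b for a, b in combinations(keys, 2)
                      if len(a | b) == len(a) + 1}
    return frequent

baskets = [{"soap", "shampoo"}, {"soap", "shampoo", "conditioner"},
           {"soap", "bread"}, {"shampoo", "conditioner"}]
for itemset, count in sorted(frequent_itemsets(baskets).items(),
                             key=lambda kv: -kv[1]):
    print(set(itemset), count)
```

The key Apriori insight is in the join step: a larger itemset can only be frequent if all its subsets already survived, so infrequent sets like 'bread' are pruned before pairs containing them are ever counted.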

Self-Organizing Maps (SOM): SOM is used for clustering and visualization of high-dimensional data, i.e., data with many features/variables. Preserving the topological structure of the original data, it creates a lower-dimensional grid of computational units (called neurons), making it easier to identify patterns, clusters, and relationships.

SOM can be used to visualize the similarities between songs by breaking down high-dimensional features such as tempo, genre, duration, energy, danceability, loudness, musical key, acousticness, valence, instrumentalness and speechiness into a two-dimensional grid. The top-left cluster might then contain songs with high energy, fast tempo and high danceability (dance and electronic music), the bottom-right cluster songs with high acousticness, low energy and high instrumentalness (classical and acoustic music), and the centre songs with moderate energy, positive valence and high speechiness (pop and hip-hop music).

    Typical real-life applications of SOMs include:

    Speech/ Handwriting Recognition: Recognizing patterns in complex datasets, such as speech or handwriting.

Social Network Analysis: Visualizing the structure of social networks and identifying communities or influential individuals within the network.

Manufacturing Processes: Monitoring the health of machinery and detecting potential failures based on patterns in sensor data such as temperature, vibration, and acoustic emissions, identifying deviations from normal patterns that indicate potential issues.
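A toy SOM is small enough to write from scratch. The sketch below trains a 4x4 grid of neurons on made-up song features (tempo, energy, acousticness, scaled 0 to 1); the grid size, learning rate, neighbourhood width and decay schedule are all illustrative assumptions. The core loop shows the two defining steps: find the best-matching unit (BMU) for a sample, then pull that neuron and its grid neighbours toward the sample.

```python
import numpy as np

def train_som(data, grid_h=4, grid_w=4, epochs=50, lr=0.5, sigma=1.5, seed=0):
    """Minimal Self-Organizing Map: fit a small grid of neurons to the data."""
    rng = np.random.default_rng(seed)
    weights = rng.random((grid_h, grid_w, data.shape[1]))
    # Grid coordinates of each neuron, used by the neighbourhood function
    coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                                  indexing="ij"), axis=-1).astype(float)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs  # shrink learning rate and neighbourhood
        for x in data:
            # Best-matching unit: the neuron whose weights are closest to x
            bmu = np.unravel_index(
                np.argmin(np.linalg.norm(weights - x, axis=-1)),
                (grid_h, grid_w))
            # Neurons near the BMU on the grid are pulled toward the sample,
            # which is what preserves the topological structure
            grid_dist = np.linalg.norm(coords - np.array(bmu, dtype=float),
                                       axis=-1)
            influence = np.exp(-grid_dist**2 / (2 * (sigma * decay + 0.1)**2))
            weights += lr * decay * influence[..., None] * (x - weights)
    return weights

def best_matching_unit(weights, x):
    return np.unravel_index(np.argmin(np.linalg.norm(weights - x, axis=-1)),
                            weights.shape[:2])

# Hypothetical songs described by (tempo, energy, acousticness), scaled 0..1
songs = np.array([[0.90, 0.90, 0.10], [0.85, 0.95, 0.05],  # dance/electronic
                  [0.20, 0.10, 0.90], [0.15, 0.20, 0.95]])  # classical/acoustic
som = train_som(songs)
print(best_matching_unit(som, songs[0]), best_matching_unit(som, songs[2]))
```

After training, the dance tracks and the acoustic tracks land on different cells of the grid, which is exactly the two-dimensional music map described above.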

Principal Component Analysis (PCA): PCA reduces the dimensionality of data by transforming it into a new set of variables (principal components) that capture the most variance. Dimensionality refers to the number of features or variables in a dataset. In other words, PCA simplifies the data while preserving as much of the variance as possible, so that the resulting data is easier to visualize and analyse. PCA is often used as a pre-processing step to reduce the number of features before applying another machine learning algorithm to build a model.

Imagine you have a huge photo album, and each photo has several details like people, locations, activities, attires, and dates. It would be overwhelming to look through every photo to find key moments. Instead, we can identify key features or common themes, such as weddings, birthdays and vacations, that most photos share and group the photos by theme. Choose a few representative photos from each group that capture the essence of the theme and highlight the key moments and people. PCA works like this.

    Most common use cases of PCA are:

Face Recognition: Identifying the prominent features in facial images for recognition, for instance when the police match a suspect's image against databases of photographs.

    Stock Market Analysis: PCA is used to analyse and reduce the dimensionality of financial data, to identify the most key factors affecting stock prices and make informed decisions.

    Environmental Studies: Analysing environmental data such as air quality and water pollution to determine the main sources of pollution to develop strategy for environmental protection.

    Seen-it-before or Selfies – which way to go?

Selfies are best for exploratory tasks such as customer segmentation, anomaly detection and market basket analysis, when you do not have labels.

Selfies focus on exploration and discovering insights from data without pre-defined labels, but they do not offer the accuracy that can be obtained from Seen-it-before algorithms.

Selfie algorithms can be used to pre-train a model or extract features from data, which can then be fed into Seen-it-before algorithms to build models with more accurate predictions.

There is also a cross between Seen-it-before algorithms and Selfies, known as semi-supervised learning, where a small amount of labelled data is combined with a large amount of unlabelled data to improve learning accuracy iteratively.

Ensemble methods that use multiple models from both categories can be employed to arrive at the most accurate final model.

• Demystifying AI/ML algorithms – Part II: Supervised algorithms

    October 20th, 2024

    About the series

This is the second part of my series on Demystifying AI/ML algorithms. The series is intended for curious people who missed the buzz around AI/ML until GenAI captured their attention. Some of this content I shared years back, but I feel it is important to revisit it before plunging into GenAI. In the first part of the series, I traced the origin of AI/ML and discussed how Good-Old-Fashioned-AI gave a real start to AI and still remains relevant (https://ai-positive.com/2024/08/28/understanding-gofai-rules-rule-and-symbolic-reasoning-in-ai/).

    Patterns and Meaning

What makes us human is our need to search for meaning. If you want to get clarity from chaos, you try to identify patterns within the chaos. Patterns are observations organized into meaningful categories. Charles Darwin's theory of evolution and Gregor Mendel's laws of heredity are outcomes of careful observation of the natural world. Patterns can be derived from observations of numbers, people's behaviours, musical scores, and even our thoughts. We need a large number of observations to identify patterns. Collecting data from observations and eliciting patterns from it brings clarity about the real world the data represents and enables predictability. Statistics, considered by many a boring part of mathematics, provides the methods to derive patterns from data.

    Machine learning algorithms are rooted in statistics. Statistical foundations of these algorithms enable them to learn from data, adapt, and generalize:

    • Learn from Data: They identify patterns and relationships in data without needing specific ‘if-then-else’ rules.
    • Adapt and Improve: They can adapt to new data and improve their performance through training and validation.
    • Generalize: They aim to generalize from the training data to make accurate predictions on unseen data inputs.

When it comes to learning, there is usually a teacher. There are also self-learners. From this emerge two subcategories of machine learning algorithms, which I refer to as 'Seen it before' and 'Selfies.' In the literature they are classified as supervised and unsupervised algorithms.

    Seen it before Algorithms

    This category of supervised algorithms revolves around:

    • Learning to see similarities between situations and thereby inferring other similarities, like if two patients have similar symptoms, they may have the same disease.
    • The key is judging how similar two things are and which similarities to take forward and how to combine them to make new predictions.

They help solve real-world problems through:

• Regression – deriving the extent of the relationship between a set of data points reflecting the problem, to predict a new value in the problem context.
• Classification – sorting data from the problem context into distinct groups and predicting whether a new data point belongs to a particular group or not.

These algorithms need a label to group a set of data points during training, to create a model that helps predict the group for a new set of data points; this is why they are referred to as supervised algorithms.

    Linear Regression

Suppose you want to predict how relaxed you will feel after sleeping a particular number of hours on a given day. A line drawn through a good number of observations, with hours of sleep on one axis and extent of relaxation on the other, becomes the pattern that Linear Regression uses to model the relationship.

    Used for predicting continuous values, Linear Regression models the relationship between a dependent variable and one or more independent variables from among the data to elicit a pattern.

    Typical use-cases:

• Builders use it to predict the price to offer for apartments based on features like location, number of rooms and other factors.
• Businesses use it to forecast future sales based on historical sales data, marketing spend and economic indicators.
• I have used it to estimate effort for software testing based on the characteristics of the application under test.
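The sleep-and-relaxation example translates almost directly into scikit-learn; the observations below are invented for illustration, with relaxation self-rated on a 0-10 scale.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical observations: hours of sleep vs. self-rated relaxation (0-10)
hours = np.array([[4], [5], [6], [7], [8], [9]])
relaxation = np.array([3.1, 4.0, 5.2, 5.9, 7.1, 8.0])

# Fit the best-fitting straight line through the observations
model = LinearRegression().fit(hours, relaxation)

# Predict relaxation for 7.5 hours of sleep
predicted = model.predict(np.array([[7.5]]))
print(model.coef_[0], model.intercept_, predicted[0])
```

The fitted slope and intercept are the line from the analogy; prediction is simply reading a new point off that line.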

    Logistic Regression

    If you want to predict whether your favourite IPL team will win a particular match or not, logistic regression helps to determine the probability of this result happening based on factors like home advantage, strength of teams and weather conditions.

Logistic Regression uses past data to give a percentage chance of an outcome and then makes a yes/no prediction based on whether the probability crosses a threshold, typically 50%. Used for binary classification problems, it predicts the probability of a binary outcome, unlike Linear Regression, which works on continuous values.

    Typical use-cases:

    • Predicting whether a patient has a certain disease based on factors such as medical history, age, weight, and lifestyle.
    • Predicting whether a customer will buy a product or not based on past behaviours.
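A sketch of the match-prediction idea: the features (home advantage and a made-up team strength difference) and match outcomes below are hypothetical, chosen only to show the probability-then-threshold behaviour.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical past matches: [home_advantage (0/1), strength_difference]
X = np.array([[1, 2.0], [1, 1.5], [1, 0.5],
              [0, -0.5], [0, -1.5], [0, -2.0]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = win, 0 = loss

model = LogisticRegression().fit(X, y)

# Probability of a win for a home match with a clearly stronger team
prob_win = model.predict_proba(np.array([[1, 1.8]]))[0, 1]
print(round(prob_win, 3))
```

`predict_proba` exposes the percentage chance; `predict` applies the 50% threshold and returns the yes/no answer.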

    Decision Trees

Used both for regression, to predict a numerical value as in Linear Regression, and for classification, such as 'yes/no' as in Logistic Regression, decision trees split the data into branches based on the values in the data, creating a tree structure that produces an output.

Referred to as non-parametric models, decision trees make fewer assumptions about the data distribution than Linear Regression or Logistic Regression models, which make stronger distributional assumptions. While decision trees are flexible enough to adapt to the pattern of the underlying data, they are more complex and require more data to achieve satisfactory results. Decision trees are a better choice when there are complex interactions between the various fields in the data and in scenarios where interpretability of the prediction process is key.

    Typical use-cases:

• Marketing teams use them to segment customers based on purchasing behaviour, demographics, and engagement, when the data is labelled.
• Credit scoring agencies use them to identify riskier applicants based on income, credit history and employment status.
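The credit-scoring use case also makes a nice interpretability demo, since the learned tree can be printed as plain if/else rules. The applicant data below is invented for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical applicants: [annual_income (thousands), years_employed]
X = np.array([[20, 0], [25, 1], [30, 0],
              [60, 5], [80, 8], [75, 6]])
y = np.array([1, 1, 1, 0, 0, 0])  # 1 = risky, 0 = safe

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The tree's branching rules, readable as plain text
print(export_text(tree, feature_names=["income", "years_employed"]))
print(tree.predict(np.array([[22, 1], [70, 7]])))
```

`export_text` is what makes the "interpretability" point: unlike most models, the prediction path can be read off directly.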

    Support Vector Machines (SVM)

Used for both classification and regression problems like decision trees, SVMs are better when the number of features (data fields) runs into the hundreds. Suppose the problem is to make a robot sort apples from oranges based on various characteristics of each fruit: SVM identifies the best 'straight line' between them. If the classes overlap, SVM performs the 'kernel trick' to transform the data into a higher-dimensional space where they can be separated easily by a hyperplane.

While SVM algorithms manage high-dimensional spaces, decision trees are simpler and more interpretable when the data fields number in the tens rather than the hundreds. Overfitting can happen in decision trees as the number of features increases, in which case SVM is a better choice.

    Typical use-cases:

• SVM works well for problems that can be solved by classification, such as identifying objects in photos and detecting faces in images.
• Sentiment analysis of social media posts, such as detecting whether a post contains hate speech, depends largely on the text categorization capability of SVM algorithms.
• SVM algorithms are also used in speech recognition applications, as they can be used to recognize spoken words and convert them into text.
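Here is the apples-versus-oranges robot as a sketch; the fruit measurements are invented, and the RBF kernel stands in for the 'kernel trick' of implicitly lifting the data into a higher-dimensional space.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical fruit features: [weight (g), roundness score]
apples = np.array([[150, 0.80], [160, 0.85], [140, 0.75], [155, 0.82]])
oranges = np.array([[200, 0.95], [210, 0.97], [190, 0.93], [205, 0.96]])

X = np.vstack([apples, oranges])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # 0 = apple, 1 = orange

# kernel="rbf" is the kernel trick: it implicitly maps the data into a
# higher-dimensional space where a separating hyperplane exists
clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
print(clf.predict(np.array([[158, 0.80], [202, 0.95]])))
```

Swapping `kernel="rbf"` for `kernel="linear"` recovers the plain 'best straight line' version for data that is already separable.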

    k-Nearest Neighbours (kNN)

A simple, instance-based learning algorithm, k-Nearest Neighbours (kNN) can be used for both classification and regression. It classifies a data point based on the majority class among its k nearest neighbours.

Suppose there is a party for a music awards event, attended by fans of the major composers. In general, we can expect the fans of a particular composer to gather close to each other and engage in animated discussions. If a new person enters the hall and settles close to one of those groups, it is quite possible that the new person is a fan of the same composer. kNN does this in a mathematical way, finding the 'distance' between data points and using the majority vote of the nearest neighbours to make predictions.

    Typical use-cases:

    • Recommendation systems like how Netflix recommends movies based on your viewing history.
    • Anomaly detection like in ‘fraud detection and network security’ detecting unusual data points in data sets.
    • Speech recognition applications such as identifying and classifying speech patterns to activate voice-based systems use kNN.

kNN can be used for simple to moderately sized data sets. It works on the entire data set and finds the k nearest neighbours, k being the number of neighbours considered. It is less complex, and there is no training phase, but it is computationally expensive because it needs the entire data set in memory, unlike other algorithms which create a model out of the training data.
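The party analogy can be sketched directly with scikit-learn's k-nearest-neighbours classifier; the guest positions and composer labels below are made up, and k=3 is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical party guests positioned in the hall; label = favourite composer
positions = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],   # composer A fans
                      [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])  # composer B fans
fans_of = np.array([0, 0, 0, 1, 1, 1])

# k = 3: classify by majority vote among the 3 nearest guests
knn = KNeighborsClassifier(n_neighbors=3).fit(positions, fans_of)

# A newcomer settles near the second group
newcomer = np.array([[4.8, 5.1]])
print(knn.predict(newcomer))
```

Note that `fit` here merely stores the data; all the distance computation happens at prediction time, which is exactly why kNN needs the whole data set in memory.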

    Extreme Gradient Boosting (XGBoost)

    All the above algorithms handle problems that can be solved by classification and regression. Choosing any of them for a particular problem is based on the data set on hand.

Ensemble methods combine multiple models to improve prediction accuracy and robustness. It is like a committee of experts working cooperatively to arrive at a decision.

Considered a rock star among ensemble methods, XGBoost is one of the most powerful and efficient. XGBoost builds models sequentially, where each new model corrects the errors of the previous ones. It also uses smart ways to deal with missing data, reducing the need for preprocessing.

    Typical use-cases:

    • Credit scoring agencies use XGBoost to predict the probability of loan default and assess credit worthiness of loan applicants based on age, income, existing loan, previous defaults, and other details.
• Banks use XGBoost to detect fraudulent transactions by identifying unusual patterns in data such as the type, time, and location of a transaction.
    • Telecom companies use XGBoost to predict customer churn based on usage patterns and customer activities.

    Key Take-aways

    Seen-it-before algorithms

• can be used for any prediction problem that can be solved using classification or regression techniques and has sufficient underlying data from which patterns can be elicited, as in the several use-cases cited.
    • can be used individually or combined to form an ensemble to improve predictability and performance.

  • Rules Rule – GOFAI: Demystifying AI algorithms

    August 28th, 2024

In this series of articles titled 'Demystifying AI Algorithms', I would like to explore the basic nature of AI/ML algorithm categories and develop a simple-to-understand perspective on the algorithms and their applications. Consciously, I will not get deeply into technical aspects and will deal only with those applications that explicitly touch our personal and work lives. There may be some technical examples for the sake of detail, which can be skipped. In this first part, I will bring out perspectives on so-called 'Good Old-Fashioned Artificial Intelligence (GOFAI)'.

The thumb rule of the old-fashioned

We will start from where it all started: what is now referred to as 'Good Old-Fashioned Artificial Intelligence (GOFAI)', which was prominent from the mid-1950s to the mid-1990s and is still relevant. Not all young people ignore elderly, old-fashioned people, after all. Trying to mimic experts was the earliest approach to instilling intelligence in machines. Rules and rules of thumb in an expert's knowledge base get processed in their brain to produce answers to questions that often surprise ordinary souls. I refer to this category of algorithms as 'Rules Rule'.

How do Rules rule?

In this approach, knowledge is represented symbolically, and logical rules are framed to simulate human intelligence. Knowledge can also be coded into rules and represented as trees which can be searched. Inverse deduction is one of the methods used to arrive at a result based on the available rules.

To get a feel for how inverse deduction works, let's take up a simple rule set: 'Cow eats grass', 'Sheep eats grass', 'Horse eats horse grams', 'Grass is plant material', 'Horse grams are plant material', 'Herbivores eat plant material'.

From the above rules we can deduce that 'Horse is a herbivore'. The Prolog implementation below can produce a result as to whether a horse is a herbivore.

    % Facts

    eats(cow, grass).

    eats(sheep, grass).

    eats(horse, horse_grams).

    eats(deer, grass).

    eats(lion, deer).

    is_plant_material(grass).

    is_plant_material(horse_grams).

    is_animal_material(deer).

    % Rule: Herbivores eat plant material

    herbivore(X) :- eats(X, Y), is_plant_material(Y).

    % Rule: Carnivores eat animal material

    carnivore(X) :- eats(X, Y), is_animal_material(Y).

    % Query: Is Horse a herbivore?

    ?- herbivore(horse).

    % Query: Is Lion a carnivore?

?- carnivore(lion).

We define the facts about what each animal eats, and that grass and horse grams are plant material. When we run this Prolog program, it will deduce that 'Horse is a herbivore' based on the given rules. Unlike imperative programming languages such as Python or Java, where we explicitly specify control flow using if-then-else statements, Prolog is a declarative language where we specify what we want to achieve rather than how to achieve it.

    What is the nature of GOFAI?

    If your problem can be solved using a set of rules or a tree structure, and you can find the solution by following these rules or searching the tree, then it’s a good fit for the GOFAI approach.

    GOFAI is well-suited for building expert systems that emulate human expertise in specific domains. MYCIN was an early expert system developed in the 1970s by Edward Shortliffe at Stanford University. Its primary purpose was to assist doctors in diagnosing and recommending treatments for patients with blood diseases, particularly bacterial infections.

Pathfinding, game playing, and puzzle solving are some of the use cases where search techniques like breadth-first search and depth-first search can be used to navigate tree representations. Other applications include medical diagnosis, legal reasoning, financial advice, and troubleshooting complex machinery.

Problems like automatic planning, scheduling, and resource allocation, which involve searching through possible states from an initial state and executing actions to achieve a specific goal, can be handled by algorithms of the GOFAI category.

    Limitations of GOFAI

    There are several notable limitations that have influenced the shift towards other AI approaches like machine learning and neural networks:

    1. They lack the flexibility to adapt to new or unexpected scenarios.
    2. GOFAI systems do not learn from experience.
    3. As the complexity of the problem domain increases, the number of rules required can grow exponentially leading to scalability issues.

    Why should we get along with this old-fashioned guy?

GOFAI provides a foundational understanding of AI principles and techniques. You may wonder about the relevance of this old-fashioned fellow in the current AI world, where we hear a lot about LLMs like ChatGPT.

    1. GOFAI systems are often more transparent and easier to understand compared to complex machine learning models. This makes them valuable in applications where explainability is crucial, such as legal and medical decision-making.
    2. Modern AI often integrates GOFAI principles with machine learning techniques to create hybrid systems. For example, combining symbolic reasoning with neural networks can enhance the interpretability and robustness of AI models.

I have pointed out how the AI world is taking a leap back to GOFAI in my earlier blog post, 'Back to Basics: Machine Learning for Human Understanding' (https://ai-positive.com/2024/05/30/back-to-basics-machine-learning-for-human-understanding/). In the world of ever-growing AI models amidst the web of AI/ML algorithms, there is an opportunity for GOFAI, which can be investigated further.

  • Back to Basics: Machine Learning for Human Understanding!

    May 30th, 2024

Can humans understand machine learning and develop trust in it? The trustworthiness of AI systems stands on three delicate pillars: fairness, explainability and security. The objectives are to avoid biases due to social stereotypes, prevent misinformation and stop privacy leaks. In addition, a compliance framework serves as a support mechanism for the three pillars and provides better stability when evaluating the trustworthiness of AI systems.

I briefly touched upon fairness in one of my earlier posts, Ethics in Artificial Intelligence (https://cosmicouslife.wordpress.com/2024/01/15/ethics-in-artificial-intelligence/). Fairness is judgemental, as it deals with diverse discriminations. This aspect requires a much fuller treatment, later.

Compliance with regulations is nothing but an interpretation of the other three attributes of trustworthiness for the local environment in which the AI operates. This will keep policy makers busy and create plenty of business opportunities for the top consulting firms, adding to the overall cost of AI systems.

Explainability requires some explanation. As long as we get value for money, or do not bother too much about occasional small losses, we may trust black-box recommendation systems even if they keep their mouths shut on why they recommended a particular thing. I have had an interesting experience of explanations: a person with whom I interact often never simply answers a question but comes out with a chain of reasons for the answer every time, challenging my patience. Do we need explanations for everything? No, but we do for some high-impact situations. There is a cost attached to it.

The accuracy-versus-interpretability trade-off constrains the choice of algorithms used for machine learning, and hence the explainability. Linear Regression and Neural Networks are at the two ends of the spectrum, with Decision Trees, k-Nearest Neighbours, Random Forests, and Support Vector Machines in between. With neural-network Large Language Models (LLMs), interpretability is even more of a challenge. If the system comes out with smart explanations, are we smart enough to distinguish the concocted explanations that AI systems are capable of?

Explainability involves pointing to the parts of an image, like the eyes, that lead an algorithm to label the image as a frog, or building a decision tree of reasoning to provide traceability. Explanations are nothing but post-facto confessions, like 'Why I did what I did', derived from trained prediction models. Google came out with 'chain-of-thought prompting' for getting LLMs to reveal their 'thinking.' Thilo Hagendorff, a computer scientist at the University of Stuttgart in Germany, has gone to the extent of saying that psychological investigations are required to open up the machine learning black boxes. Researchers are measuring the machine equivalents of bias, reasoning, moral values, creativity, emotions, and theory of mind in AI models to evaluate their trustworthiness. Like neuroimaging scans for humans, researchers are designing lie detectors that look at the activation of specific neurons in neural network models to identify the sets of neurons that fire together and wire together. Anthropic, an AI safety and research company, published a map of the inner workings of one of its models on May 21st, 2024. This map can aid understanding of neuron-like data points, called features, that affect the output.

A refreshing approach to explainability revolves around making networks learn from explanations rather than justify their predictions. I was intrigued listening to Prof Vineeth Balasubramanian of IIT Hyderabad talking about his work on 'ante-hoc explainability via concepts' at a recent IEEE event held in Chennai. Prediction models based on supervised learning are like students memorizing everything and reproducing answers without an adequate conceptual foundation. Researchers refer to such models as 'stochastic parrots', meaning they are probabilistic combinations of patterns of text encountered during training, without understanding of the fundamental concepts. Models with concepts and rules seem to set the direction for explainability. This is counter-intuitive to conventional prediction models. Expert systems of the erstwhile era are decent implementations of explainable AI systems. Is it not like going back to the basics for human understanding of machine learning?

  • Feel Good: 10,000 hours or 10,000 steps.

    February 19th, 2024

10,000 hours of learning is Malcolm Gladwell's rule for achieving true expertise in any skill. The number was set based on research suggesting that practice is the essence of achieving genius. Walking 10,000 steps a day can help reduce the risk of common health problems such as heart disease, diabetes and high blood pressure, and of course depression and obesity too. Again, the number 10,000! Unfortunately, W. Edwards Deming, our Sadhguru of Quality, advocated eliminating numeric goals!

Is it not funny to know that the Japanese character for the number 10,000 looking like a person walking (万) is the reason behind the target of 10,000 daily steps? A clever marketing campaign ahead of the 1964 Tokyo Olympics made 万 popular; the idea caught on and the number got stuck. Priyanka Runwal, in a National Geographic article published on January 3, 2023, refers to a recent study indicating that short bursts of vigorous activity every day, such as climbing stairs, carrying a heavy load of groceries, or stepping up the pace of housework, can provide substantial health benefits. University of Sydney exercise scientist Emmanuel Stamatakis found that three one-minute bursts of intense physical activity every day can lower a person's risk of death by up to 40%. Should we spend 3 minutes, or over 100 minutes for 10,000 steps, in a day?

    Can you become an expert by doing the same task or reading the same book for 10,000 hours? What is the sanctity of the 10,000-hour goal for mastering any skill? In his book ‘Into the Impossible,’ Brian Keating asks, “while learning to fly an airplane, can logging 10,000 hours in the same plane, in the same weather conditions, on the same take-off and landing runways make one an expert pilot?” While variation matters, is the number the same for everyone regardless of innate aptitude? He later clarifies that the number is a statistical average – within the group of talented people, what separates the best from the mean is those 10,000 hours of effort! The spirit behind the number 10,000 is getting into the talent pool either by nature or nurture, and then boiling for hours in varying vessels to get distilled as an expert!

    When Deming said, “eliminate numerical goals, numerical quotas and management by objectives,” he meant, “provide the required resources and leadership to achieve goals rather than shouting a number.” We can interpret it as the difference between an ‘arbitrary number’ and a number backed by some science, possibly data science, for management insights. On an individual basis, we can see it as the difference between goals we set for ourselves and numerical goals others impose upon us.

    Tamil Nadu has the highest number of Ramsar sites in India. Last weekend I was at Karaivetti Bird Sanctuary, one among them. I could neither clock 10,000 steps that day nor identify all the species; maybe I will have to clock many more hours of bird watching to reach the 10,000-hour goal! At the end of the day, I felt good being one of the first few to visit the latest Ramsar site, declared on January 31, 2024.

    Arbitrarily, some numbers become popular periodically, and our tendencies make us go with the trends. Should we fall into the trap of ‘feel-good metrics’, ignoring the facts behind them, and feel bad later? Measure what matters.

  • When Geography becomes History

    January 24th, 2024

    BYE FROM NATIONAL GEOGRAPHIC

    The year 2024 also started with National Geographic magazine making the heartbreaking announcement that it would go off newsstands, saying goodbye to its staff and to avid lovers of the iconic magazine with its yellow-bordered cover carrying wonderful photographs and insightful articles on the environment.

    “Like one of the endangered species whose impending extinction it has chronicled, National Geographic magazine has been on a relentlessly downward path, struggling for vibrancy in an increasingly unforgiving ecosystem,” The Washington Post said in its obituary.

    ELIXIRS OF LIFE IN DANGER

    According to the Annual Global Report 2023 published by the National Centers for Environmental Information, last year was the warmest since global records began in 1850, at 1.18°C (2.12°F) above the 20th-century average of 13.9°C (57.0°F). This value is 0.15°C (0.27°F) more than the previous record set in 2016.

    WWF scientists say that if emissions continue to rise unchecked, the Arctic would be devoid of ice in a couple of decades as ocean and air temperatures continue to rise rapidly. Erratic weather and disasters prefixed with ‘natural’ will become the new normal globally.

    “Eighty-seven percent of lifeforms on this planet – from microbes to worms, human beings, and all vegetation – is sustained by an average of thirty-nine inches of topsoil, and during the last forty years, forty percent of the world’s topsoil has been lost,” observes Sadhguru, the Mystic Guru and proponent of the Conscious Planet movement to Save the Soil. Sadhguru predicts that agriculture will become history in 45 to 60 years if we don’t act now. Losing the topsoil above, sucking out the fossil fuels within and burning up the atmosphere around will together make the earth just a suffocating shell ready to crumble.

    Necessary actions to fix the soil can take care of water, another elixir of life, and there are some global actions to control pollution in the air so that later generations can enjoy some oxygen to breathe. Based on the concept of Pancha Boodha, there are two more elements, ‘fire’ and ‘space’, apart from earth, air and water, which collaborate to make life happen. Fire requires oxygen as fuel and so depends on the quality of air. Space is the fabric on which the entire ‘tamasha’ happens.

    DE-EXTINCTION TO EXTEND LIFE?

    If deterioration goes beyond a certain level, it may not be possible to revive the ecosystem because of the multiple feedback loops involved, as reported in a study by IIT Bombay. The study, published in the December 2023 issue of Nature, reveals that even though the greening of India has increased over the last two decades, carbon uptake by forests has reduced due to the impact of global warming. Will the unforgiving ecosystem make our earth history too?

    Once endangered, the Murphy radio I got hooked to during my childhood has reincarnated as FM radio on the move, providing hundreds of jobs to radio jockeys and income to businesses. They play some music, apart from dancing to the music of the market, transforming poor listeners into rich consumers. Not to mention radio jockeys becoming psychologists, coaches, sociologists, judges, and counsellors, liberally letting out all possible advice when they run out of humor.

    Evolutionary changes in technology and the business ecosystem made the radio come out of its hibernation. What about other life forms? De-extinction is the process of resurrecting extinct species. Which species will de-extinct Homo sapiens?

    UNIVERSAL HISTORY IN THE MAKING

    Our universe has survived 13.8 billion years. “It will not be until 22 billion years before the sun finishes its job and becomes a white dwarf, or 10¹⁴ years before all stars fade, or 10¹⁰⁰ years before the galactic-mass black holes too evaporate and the entire universe would be gone,” according to Richard Gott, professor of astrophysics at Princeton University. It will be a cause for concern only if we evolve into homo super sapiens and manage to exist in the 22 billionth year.

    Nature seems to have its own conscious agenda, while Homo sapiens have got onto a growing, consuming agenda. Ecologist and conscientious Tamil writer Nakeeran recollected an incident after watching the so-called ‘naturalists’ assemble in front of the Marina beach to raise “Save Nature” slogans and disperse, but not before a press meet. A young local school kid watching the scene seems to have remarked, “is it not nature that protects us; why do these people want to protect nature?” The answer seems to be in the question, which can raise further questions.

    If we believe in the idea of the multiverse, the existence of our universe is just a bubble on one of the branches of the multiverse. Homo super sapiens may find ways to diffuse from one verse to another before the end of our epoch, and universal history will be made. In the meantime, minimalist living can be the maxim for the current geography.

  • AI landscape: Aitomation to Aipps, but not yet – Aepps

    January 18th, 2024

    Advent of GenAI

    Predicting whether AI will make an impact in 2024 requires natural intelligence and a little data. Generative AI (GenAI) has caught the fancy of all generations. Two months after launch, ChatGPT broke records as the fastest-growing consumer app in history, until Meta’s Threads overtook it. ChatGPT now has about one hundred million weekly active users, while more than 92% of Fortune 500 companies are using the platform, according to OpenAI.

    Investments into AI

    Nvidia, the maker of the chips that power AI training models, saw its market value shoot up by $800 billion, making it the biggest percentage gainer among the large tech companies last year. According to AMD, Nvidia’s rival, the market for AI chips for data centers would hit $400 billion by 2027, which is equal to the pre-Covid global market size of all semiconductors. Companies like OpenAI building large language models attracted about $27 billion, as per PitchBook. Goldman Sachs observes that investment in AI is just below 0.5% of GDP and estimates it will go beyond 2.5% of GDP by 2032. Will these investments result in improved productivity and better business results for users of AI? It looks like the gold rush, when providers of mining equipment, tools and supplies made more profit than the people who managed to hit gold!

    Return on Investment

    Tech companies are the earliest gainers on employee costs, as several back-office jobs are getting replaced by AI. A study by Stanford economist Erik Brynjolfsson found that the average productivity of call center workers rose by 14% within a few weeks of using AI-infused technology. Interestingly, the gains were 35% for the lowest-skill workers. The lower the skill level, the higher the gains to the business, as those jobs become extinct. The focus now is on how best to aitomate (‘ai’-based tools to automate tasks – my own terminology) tasks that can boost productivity. It is going to be a while before we see a real killer app based on AI that can fetch several-fold returns.

    Cautious Business and Road Blocks

    Spending on GenAI this year will not be more than $20 billion, which is just one fifth of the spend on IT security, according to Gartner. Businesses will scale up spend over the next few years after a lot of trial and error, resulting in monetizable Aipps (‘AI’-based apps – again my own terminology). While the technology is progressing from Artificial Narrow Intelligence to Artificial General Intelligence, issues such as hallucination, bias, deepfakes and ethics, besides the cost, are to be controlled for AI to take over as pilot from just being co-pilot.

    A Flashback

    Recently, I came across a photograph of me speaking at a round-table conference on Artificial Intelligence in the early ’90s at the Visakhapatnam chapter of ‘The Institution of Engineers’. Three decades back, I led a team at Ramco Systems, India to develop an ‘Expert Mine Planning System’ with an ‘expert systems’ approach in its blasting module. Expert systems emulate human decision making by reasoning through a specific body of knowledge represented within the system as ‘if-then-else’ rules rather than through procedural code. A technical paper about the system was accepted at the XXIV APCOM (Application of Computers and Operations Research in the Mineral Industries) conference held during Oct 31 – Nov 3, 1993, in Montreal, Canada. Internationally well-known commercial vendors of ‘Mine Planning’ systems took serious note of our product and were surprised that such product development was happening in India, while the entire Indian IT industry was still waiting for Y2K to bring it global recognition.
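    The ‘if-then-else’ reasoning style of an expert system can be sketched as a tiny rule engine: a knowledge base of condition–conclusion pairs, scanned in priority order until one fires. The rules and facts below are purely illustrative assumptions, not the actual logic of the blasting module:

    ```python
    # Minimal expert-system sketch: knowledge lives in declarative rules,
    # not procedural code. Rules and facts are illustrative only.

    RULES = [
        # (condition on known facts, conclusion) in priority order
        (lambda f: f.get("rock_hardness") == "high" and f.get("water_present"),
         "use water-resistant emulsion explosive"),
        (lambda f: f.get("rock_hardness") == "high",
         "use high-energy explosive"),
        (lambda f: True,  # default rule: fires when nothing else matches
         "use standard ANFO charge"),
    ]

    def recommend(facts):
        """Fire the first rule whose condition matches the known facts."""
        for condition, conclusion in RULES:
            if condition(facts):
                return conclusion

    print(recommend({"rock_hardness": "high", "water_present": True}))
    # → use water-resistant emulsion explosive
    ```

    Because the knowledge base is just data, a domain expert can add or reorder rules without touching the inference loop, which is what makes such systems explainable: the fired rule is the explanation.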

    Narrow, not Super

    Developments in cloud and big data over the decades have paved the way for the current avatars of GenAI, and AI has become a household term, hallucinating between fake and real. The arrival of Aepps (Emotional-intelligence-aware artificial intelligence apps – now you know that this is my terminology too) in times to come can catapult AI into the next generation. My comments are about ‘narrow’ AI trending towards ‘general’, but not ‘super’, going by the categories of Artificial Intelligence.

  • Ethics in Artificial Intelligence

    January 15th, 2024

    She is on the 100 Brilliant Women in AI Ethics list, according to her LinkedIn profile. With the advent of GenAI, human biases become more pervasive, aided by big data that naturally reflects the human. Ethics in AI is definitely more important as intelligence becomes artificial. Machine learning models cannot be blamed for the biased data from which they learn. Human biases of color, class and creed get naturally impregnated into the learnt models, the results of which are consumed by the public at large. Intellectual property and privacy go for a toss as devices become more intrusive and omnipresent. Distinguishing fake from real is becoming a day-to-day challenge, as in an old Tamil movie song: ‘unmai ethu poi ethu onnum puriyale, namma kanna nammala namba mudiyele’. With regulatory frameworks and policy on the one hand, and the same AI technology on the other, the world is attempting to build an ethical neural layer into artificial intelligence. But the larger question is: whose ethics? Suchana Seth, who is in custody for the suspected murder of her own 4-year-old son in Goa, is one of the global top women in ethics for AI. It appears that it was ok for her to lose her son as long as her estranged husband was deprived of meeting him. An eye for an eye seems to be the ethics at play, for not being willing to go by the court arrangement for a couple who didn’t see eye to eye. ‘Who should make policy or build technology for ethics in AI’ will remain a tough question, but we do get data to answer ‘who shouldn’t’.

  • Process is in blood, saves lives

    January 14th, 2024

    2024 started with the news of the escape of all 379 people aboard a Japan Airlines (JAL) Airbus A350 that caught fire while landing, after a collision with a smaller aircraft at Tokyo’s Haneda airport on 2nd January. While the media reported the escape as miraculous, less obvious was the innate Japanese discipline of the crew, which proved that 20 minutes is more than sufficient to evacuate all the passengers. Process in the blood manifested as calm behavior and a careful action sequence ‘by the book’ during those crucial minutes to get the passengers out. That attitude results from the dignity of work, irrespective of title and level in the hierarchy. A right-weight process, prescribing the minimum to maximize purpose, was the core contributor to the successful outcome. The world keeps teaching us, if we are prepared to listen, learn and live the learning.


Blog at WordPress.com.

 
