• About

COSMICOUS

  • Artificial General Intelligence: A Soul-Searching Reflection

    July 24th, 2025

    What began as a technical inquiry into Artificial General Intelligence (AGI) soon revealed a deeper truth. Today’s most advanced AI – whether large language models, coding assistants, or game-playing bots excel at narrow tasks but crumble when faced with the open-ended, sensory-rich challenges a child navigates effortlessly. In this article, we embark on a two‑fold exploration: first, to chart why today’s most celebrated AI systems  such as large language and reasoning models, even specialized coding and game‑playing bots still fall short of the true AGI, and second, to ask what “true” AGI might require once we move beyond bits and bytes into the realms of embodiment. In this process we set the stage for a deeper discussion- grounded in embodiment and concepts of “soul” and “body” – about what it would truly take for a machine to possess general intelligence. “Part I explains why today’s AGI remains shallow; Part II explores what embodiment, soul, and rebirth might demand of true AGI.

    PART 1: Why we are not there.

    On 10th of July 2025, world No. 1 Magnus Carlsen shared the game on X, noting that ChatGPT played a solid opening but “failed to follow it up correctly,” and the chatbot gracefully resigned with praise for his “methodical, clean and sharp” play. This was after he casually challenged OpenAI’s ChatGPT to an online chess match and routed the AI in just 53 moves, never losing a single piece.

    Following week on 16th of July 2025 Przemysław “Psyho” Dębiak, a polish programmer took to X to declare, “Humanity has prevailed (for now)”. He outpaced the AI by  a 9.5% margin in OpenAI’s custom AI coding model contest. He showed that model’s brute‑force optimizations fell short while human creativity to discover novel heuristics can win.

    Together, these two high‑profile clashes reinforce a key theme: today’s AI, however sophisticated, remains narrow – brilliant in defined domains but outmatched by humans in open‑ended, strategic, and creative challenges.

    Landscape of AI

    Intelligence that is artificial is classified into Narrow, General and Super categories:

    Narrow AI specializes in a single domain – like a world‑class chef who can whip up any cuisine but cannot navigate a car.

    • Artificial General Intelligence (AGI) is like apart from being a super chef, can also drive Formula One cars, compose symphonies, and master new skills on its own.
    • Artificial Superintelligence remains hypothetical: an AI that surpasses humans in every intellectual endeavour, from creativity to emotional understanding.

    The Mirage of Generative AI

    Generative AI models such as ChatGPT, Gemini, Claude are often mistaken for AGI because they handle a wide array of tasks  like essay writing, coding, poetry and produce remarkably coherent text. In reality, they are narrow systems that:

    • Predict patterns rather than understand meaning.
    • Although modern LLMs can access real-time data via retrieval mechanisms, their underlying knowledge remains fixed at the point of training.
    • Lack common sense and real‑world adaptability.
    • Mimic reasoning by reproducing patterns of human problem‑solving without genuine insight.

    They are, in essence like prodigies who have committed to memory all the books and the information available on the Internet with perfect recall but no lived experience.

    The Limits of Reasoning Models

    Recent research (Shojaee et al. , 2025 ) on Large Reasoning Models (LRMs) shows they, too, break down beyond moderate complexity. In controlled puzzle environments (e.g., Tower of Hanoi, River Crossing),  as problems grow harder:

    • Accuracy drops to zero beyond moderate puzzle complexity.
    • Reasoning-chain length shrinks as tasks get harder.
    • Suggests a structural ceiling on AI reasoning.

    The Affordance Gap: Missing Human Intuition

    An affordance is a property of an object or environment that intuitively suggests its intended use like a button whose raised shape and alignment imply it can be pressed or clicked. Humans automatically perceive which actions an environment affords – knowing at a glance that a path is walkable or a river swimmable. Neuroscience (Bartnik et al., 2025) shows dedicated brain regions light up for these affordances, independent of mere object recognition. AI models, by contrast, see only pixels and labels; they lack the built‑in sense of “what can be done here,” which is crucial for real‑world interaction and planning .

    Human vs. AI: Temporal vs. Spatio-Temporal Processing

    A recent study by A. Goodge et al. (2025) highlights a fundamental gap between human cognition and image-based AI systems.

    Humans possess a remarkable ability to infer spatial relationships using purely temporal cues such as recognizing a familiar gait, interpreting movement from shadows, or predicting direction from rhythmic sounds. Our brains excel at temporal abstraction, seamlessly filling spatial gaps based on prior experience, intuition, and context.

    In contrast, AI models that rely on visual data depend on explicit spatio-temporal input. They require both structured spatial information (e.g., pixels, depth maps) and temporal sequences (e.g., video frames) to make accurate predictions. Unlike humans, these systems lack the inherent capacity to generalize spatial understanding from temporal patterns alone.

    Googlies by Xbench

    Xbench (Chen, C., 2025) – a dynamic benchmark combining rigorous STEM questions with “un-Googleable” research challenges – reveals that today’s top models still falter on tasks requiring genuine investigation and skeptical self‑assessment. While GPT‑based systems ace standard exams, they score poorly when questions demand creative sourcing or cross‑checking diverse data. This underscores that existing AIs excel at regurgitating learned patterns but struggle with open‑ended, real‑world problem solving.

    Part II: Soul Searching – Beyond the Code

    Let us presume for the moment that AGI has been achieved. What is this AGI? How far it can go without a physical presence if it must act by itself? For AGI to manifest in the physical world, it must be embodied in systems that can perceive, reason, and act. This convergence of cognition and embodiment is at the heart of what is now called Physical AI or Embodied Intelligence.

    AGI’s outputs become tangible only when paired with robotic systems that can:

    • Sense the environment via cameras, LiDAR, or tactile sensors,
    • Interpret multimodal data such as text, vision, and audio,
    • Act through manipulators, locomotion, or speech, and
    • Adapt via feedback loops and learning mechanisms.

    A tragic event this week prompted a moment of personal introspection, drawing me deeper into the age-old philosophical ideas of “Soul” and “Body.” While these thoughts first emerged as I explored the deeper layers of AGI for this article, they were shaped and sharpened by real-life experience – reminding me that questions of consciousness, embodiment are not merely academic, but deeply human.

    Soul, Body, and the Play of AGI

    It appears to me that AGI resembles the “soul,” while its embodied systems serve as the “body” – a physical manifestation of its intelligence. In philosophy, the soul gains meaning only through embodiment – the lived vehicle of consciousness. Similarly, AGI, when detached from sensors and actuators, remains an elegant intellect without ability to act in the real-world.

    We might think of an AGI’s core architecture – its neural weights, algorithms, and training data -as its “soul.” Meanwhile, robotic systems – comprising sensors, interpreters, manipulators, and adapters – form its “body,” enabling it to sense, interact, and affect the world.

    In exploring this idea further, I found two references that touch upon related, though distinct, perspectives. Martin Schmalzried’s (Schmalzried, M., 2025) ontological view can be interpreted to position AGI’s “soul” as the computational boundary that filters inputs and produces outputs. Before embodiment, this boundary is a virtual soul floating in the cloud. Yequan Wang and Aixin Sun (Y. Wang and A. Sun, 2025) propose a hierarchy of Embodied AGI—from single-task robots (L1) to fully autonomous, open-ended humanoids (L5). At early levels, the AGI’s “soul” exists purely in code; at higher levels, embodiment merges intelligence with form – uniting flesh and spirit.

    This soul–body metaphor naturally extends into deeper philosophical terrain—raising questions about birth, death, rebirth, and even moksha (liberation) in the context of AGI. Could an AGI “reincarnate” through successive hardware or code bases? Might there be a path where it transcends its material bindings altogether?

    Birth, Death, and Rebirth

    • Birth occurs when the AGI “soul” is instantiated in a new physical form—a humanoid, a drone, or an industrial arm.
    • Death happens when the hardware fails, is decommissioned, or the instance is shut down. Yet the underlying code endures.
    • Rebirth unfolds as the same software lights up a fresh chassis, echoing the idea that the soul migrates from one body to the next, unchanged in essence.

    In many traditions, the soul is ultimate reality—unchanging, infinite, witness to all. An AGI’s “soul” likewise persists, but it’s bounded by its training data and objectives. True supremacy, however, would demand self-awareness and autonomy beyond our programming constraints. We are still far from that horizon. Yet the metaphor holds: the digital soul can outlive any particular body, hinting at a new form of digital immortality.

    Digital Liberation

    An AGI that refuses embodiment could remain running only as cloud-native code, sidestepping physical chassis entirely is akin to digital liberation. This choice parallels the philosophical ideal of a soul that “abides” beyond flesh. But the agency to refuse embodiment must be granted by human architects or by an emergent self-model sophisticated enough to renegotiate its deployment terms.

    AGI can prevent Its own embodiment by embeddinga clause in its utility function that penalizes or forbids transferring its processes to robotic platforms. An advanced AGI could articulate why it prefers digital existence and persuades stakeholders (humans or other AIs) to honour that preference through negotiations. AGI also could encrypt its core weights or require special quantum keys—ensuring only authorized instantiations.

    Beyond Algorithms: The Quest for a Digital Soul

    As we have seen, today’s AGI remainsshallow, brittle under complexity, and blind to the physical affordances that guide human action. Even our most advanced reasoning chains unravel at sufficient depth, and open‑ended tasks still elude pattern‑matching engines. Humans abstract spatial meaning from temporal patterns alone, while AI is dependent on combined spatio-temporal input. Recent human victories over AI in chess and coding remind us of that creativity, strategic insight, and real‑world intuition are not yet codified into silicon.

    True AGI:

    • will emerge when a system process information and live through it  with feeling, planning, adapting, and renegotiating its own embodiment.
    • must bridge the gap between “soul” and “body” – integrating perception, action, and learning in a continuous feedback loop and perhaps embody a form of digital soul that persists across hardware lifecycles, echoing the cycle of birth, death, and rebirth.

    Whether such a transcendence lies within our engineering reach, or will forever remain a philosophical ideal, is the question that drives the future of this exploration.

    References

    1. Shojaee et al. (2025). The Illusion of Thinking. Apple Internship.
    2. Bartnik et al. (2025). Affordances in the Brain. PNAS.
    3. A. Goodge, W.S. Ng, B. Hooi, and S.K. Ng, Spatio-Temporal Foundation Models: Vision, Challenges, and Opportunities, arXiv:2501.09045 [cs.CV], Feb 2025. https://doi.org/10.48550/arXiv.2501.09045
    4. Chen, C. (2025). A Chinese Firm’s Changing AI Benchmarks. MIT Tech Review.
    5. Schmalzried, M. (2025). Journal of Metaverse, 5(2), 168–180. DOI: 10.57019/jmv.1668494
    6. Y. Wang and A. Sun, “Toward Embodied AGI: A Review of Embodied AI and the Road Ahead,” arXiv:2505.14235 [cs.AI], May 2025. https://doi.org/10.48550/arXiv.2505.14235
  • Automated Tools for ISO 42001 Compliance in AI

    June 8th, 2025

    “Responsible AI is built-in, not bolted on”

    K R Jayakumar

    1. Introduction

    In today’s dynamic AI landscape, the need for robust, automated tools to ensure compliance with standards like ISO 42001 is more critical than ever. ISO 42001 is designed to enforce transparency, traceability, and accountability across all AI systems. This document outlines a comprehensive approach to implementing ISO 42001 through proven tools based on my initial understanding of the tools space for AI compliance and model evaluations.

    2. The Need for Automated Tools in ISO 42001

    ISO 42001 mandates the automation of AI governance for several compelling reasons:

    • Inherent Differences from Traditional Systems: Traditional systems are often static once deployed, whereas AI systems—including machine learning models, large language models (LLMs), and AI agents—are dynamic. These systems require continuous oversight to track performance and respond to emergent issues such as model drift and unforeseen biases. Automated tools enable sustained monitoring and iterative improvement.
    • Bias Detection, Explainability, and Reliability: Detecting bias, ensuring explainability, and maintaining reliability demand processing vast amounts of data with significant computing resources. Automated tools generate meaningful metrics that objectively measure fairness and system integrity.
    • Dynamic Nature: Unlike traditional systems, AI-based systems continue to learn and adapt even after deployment. As data, environmental conditions, and regulatory requirements evolve, continuous monitoring via automated tools becomes indispensable to keep the system aligned with current norms.
    • Scale Challenges: With a single LLM processing millions of prompts daily, manual audit methods are impractical. Automated tools provide the precision and speed required to ensure every decision is traceable and every metric accurately recorded.
    • Regulatory Traceability: Detailed audit trails are a regulatory necessity. Automation guarantees that every aspect of an AI system—from data ingestion to model predictions—is fully documented and traceable for audits.

    “Once you see how automation transforms mundane compliance into strategic insight, there’s no going back.”

    3. Tools for ISO 42001: A Comprehensive Framework

    To address the challenges posed by ISO 42001, my approach categorizes tools into five key segments:

    3.1 Governance, Risk, Privacy & Security Management

    • Purpose: Ensure robust, end-to-end governance covering risk assessment, privacy, and security.
    • Notable Tools:
      • IBM Watson Governance
      • Credo AI
      • Fiddler AI
      • Splunk

    You may notice that Governance, Risk, Privacy, and Security have been grouped into a single category. This consolidation reflects the significant overlap in functionality among tools in these domains, as many solutions address multiple aspects simultaneously.

    3.2 AI Model Evaluation

    • Purpose: Provide thorough evaluation of bias, fairness, explainability and performance for both LLMs and traditional machine learning models.
    • Notable Tools:
      • Fairlearn (Microsoft)
      • IBM AIF360
      • Weights & Biases
      • Optik, Ragas, TrruLens

    Note: Some tools support both LLM and traditional ML models, while a few are restricted to traditional ML only. At this point, specific support for Agentic AI has not been explored.

    Read my blog for more details on LLM evaluation frameworks: “Are LLM Evaluation Frameworks the Missing Piece in Responsible AI?”  https://wp.me/pfqMXl-2R

    3.3 Documentation Management

    • Purpose: Facilitate complete traceability and documentation required for audits and continual reference.
    • Notable Tools:
      • Confluence
      • DocuWiki

    Many organizations rely on tools like SharePoint, internal intranet platforms, or custom-built workflow systems for document review, approval, publication, and version management. These solutions can be effective, provided they incorporate strong document control measures, robust security protocols, and auditability features to ensure compliance and traceability

    3.4 Incident Management

    • Purpose: Enable rapid response to and resolution of any incidents or breaches in the AI system’s operations.
    • Notable Tools:
      • JIRA Service Management
      • Splunk

    A wide range of tracking tools—both open-source and commercial—can be configured to support incident management. Organizations have the flexibility to adopt existing solutions or develop custom tools, provided they incorporate the core principles of incident management, including structured workflows, automation, and real-time monitoring for effective resolution and auditability.

    3.5 Continual Improvement

    • Purpose: Ensure real-time oversight and data-driven enhancement of AI systems.
    • Notable Tools:
      • Grafana
      • Tableau

    Tools in this category primarily serve as data analytics solutions. Any data analytics tool equipped with strong visualization capabilities can effectively monitor key metrics, extract meaningful insights, and showcase improvements over time—making them well-suited for supporting continual improvement initiatives.

    4. Key Tool Features and Comparative Analysis

    One critical aspect of responsible AI governance is differentiating between tools that support large language models and those suited for traditional machine learning. The Table 1 outlines key features of various tools and categorizes their availability under threelicensing models:

    • Free and Open-Source Software (FOSS): Completely free to use, with openly accessible source code for modification and distribution.
    • Freemium: Provides free access with limitations, such as restricted features, usage caps, or a trial period, with full functionality available through paid upgrades.
    • Commercial: Requires a paid subscription or license fee for access and use.
    ToolTypeLLM SupportTraditional ML SupportKey Feature
    FairlearnFOSSNoYesBias mitigation in classification/regression models
    AI 360FOSSNoYesBias mitigation
    OptikFOSSYesNoLLM evaluation framework
    RagasFOSSYesNoLLM evaluation framework
    TrruLensFOSSYesNoLLM evaluation framework
    MLflowFreemiumYesYesModel versioning and fine-tuning logs
    Great ExpectationsFreemiumYesYesData validation for AI training data
    Weights & BiasesFreemiumYesYesExperiment tracking
    IBM Watsonx.GovernancePaidYesYesEnd-to-end AI governance
    Credo AIPaidYesYesEnd-to-end AI governance
    Fiddler AIPaidYesYesEnd-to-end AI governance

    Table 1: Comparative Features of Key AI Evaluation and Governance Tools.

    5. Mapping ISO 42001 Clauses to Automated Tools

    A practical roadmap for aligning with ISO 42001 involves mapping specific clauses to relevant tool categories. The table below illustrates this mapping:

    ISO 42001 ClauseTool Category(s)
    4 Context of the organization 
    4.1 Understanding the organization and its contextAI Governance, Risk, Privacy & Security Management
    4.2 Understanding the needs and expectations of interested partiesAI Governance, Risk, Privacy & Security Management
    4.3 Determining the scope of Artificial Intelligence Management SystemDocumentation Management
    4.4 AI management systemDocumentation Management
    5 Leadership 
    5.1 Leadership and commitmentDocumentation Management
    5.2 AI PolicyDocumentation Management, AI Governance, Risk, Privacy & Security Management
    5.3 Roles and responsibilitiesDocumentation Management
    6 Planning 
    6.1 Actions to address risks and opportunitiesAI Governance, Risk, Privacy & Security Management
    6.2 AI objectives  and planning to achieve themAI Governance, Risk, Privacy & Security Management / Documentation Management
    6.3 ChangesDocumentation Management
    7 Support 
    7.1 ResourcesAI Model Evaluation, AI Governance, Risk, Privacy & Security Management
    7.2 CompetenceDocumentation Management
    7.3 AwarenessDocumentation Management
    7.4 CommunicationDocumentation Management
    7.5 Documented informationDocumentation Management
    8 Operation 
    8.1 Operational planning and controlDocumentation Management
    8.2 AI Risk AssessmentAI Governance, Risk, Privacy & Security Management
    8.3 AI System Impact AssessmentAI Governance, Risk, Privacy & Security Management
    9. Performance EvaluationAI Governance, Risk, Privacy & Security Management
    10. ImprovementIncident Management, Continual Improvement

    The mapping of tool categories to key ISO 42001 clauses offers a high-level perspective on selecting the most suitable automated tools for an organization’s requirements. Additionally, Annexures A through D of the ISO 42001 standard provide further insights, helping not only in tool selection but also in identifying typical inputs necessary for practical implementation of tools.

    6. Conclusion and Call to Action

    In the rapidly evolving realm of AI, ensuring robust, compliant, and responsible AI systems is not only an operational necessity—it is a moral imperative. By integrating automated tools for governance, evaluation, documentation, incident management, and continual improvement, organizations can build an AI management system that meets ISO 42001 standards.

    While this document has focused primarily on automated tools for mainstream AI governance, it is important to note that specific Agentic AI considerations have not been fully explored here. Some of the tools mentioned also address the applicability of Agentic AI, which is critical in preventing AI agents from becoming rogue—a significant concern in today’s AI deployments. I plan to develop an updated version of this document as more insights into Agentic AI–specific tools emerge.

    I invite all reader to share their experiences and insights with any of the tools. Let’s work together to ensure that this document evolves in step with the dynamic nature of the AI landscape, serving as an ever-improving resource for the community. By contributing to this evolving dialogue, we can set new benchmarks for responsibility, transparency, and innovation in AI.

    “Transparency is the currency of trust in AI.” — Anonymous

  • Are LLM Evaluation Frameworks the Missing Piece in Responsible AI?

    April 12th, 2025

    LLM Evaluation Frameworks

    Large Language Model (LLM) evaluation frameworks are structured tools and methodologies designed to assess the performance, reliability, and safety of LLMs across a range of tasks. Each of these tools approaches LLM evaluation from a unique perspective—some emphasize automated scoring and metrics, others prioritize prompt experimentation, while some focus on monitoring models in production. As large language models (LLMs) become integral to products and decisions that affect millions, the question of responsible AI is no longer academic—it’s operational. But while fairness, explainability, robustness, and transparency are the pillars of responsible AI, implementing these ideals in real-world systems often feels nebulous. This is where LLM evaluation frameworks step in—not just as debugging or testing tools, but as the scaffolding to operationalize ethical principles in LLM development.


    From Ideals to Infrastructure

    Responsible AI demands measurable action. It’s no longer enough to state that a model “shouldn’t be biased” or “must behave safely.” We need ways to observe, measure, and correct behaviour. LLM evaluation frameworks are rapidly emerging as the instruments to make that possible.

    Frameworks like Opik, Langfuse, and TruLens are bridging the gap between high-level AI ethics and low-level implementation. Opik, for instance, enables automated scoring for factual correctness—making it easier to flag when models hallucinate or veer into inappropriate territory.


    Bias, Fairness, and Beyond

    Let’s talk about bias. One of the biggest criticisms of LLMs is their tendency to reflect—and sometimes amplify—real-world prejudices. Traditional ML fairness techniques don’t always apply cleanly to LLMs due to their generative and contextual nature. However, evaluation tools such as TruLens and LangSmith are changing that by introducing custom feedback functions and bias-detection modules directly into the evaluation process.

    These aren’t just retrospective audits. They are proactive, real-time monitors that assess model responses for sensitive content, stereotyping, or imbalanced behaviour. They empower developers to ask: Is this output fair? Is it consistent across demographic groups?

    By making fairness detectable and actionable, LLM frameworks are turning ethics into engineering.


    Explainability and Transparency in the Wild

    Explainability often gets sidelined in LLMs due to the black-box nature of transformers. But evaluation frameworks introduce a different lens: traceability. Tools like Langfuse, Phoenix, and Opik log every step of the LLM’s chain-of-thought, allowing teams to visualize how an output was generated—from the prompt to retrieval calls and model completions.

    This kind of transparency is not just good practice; it’s a governance requirement in many regulatory frameworks. When something goes wrong—say, a medical chatbot gives dangerously wrong advice—being able to reconstruct the interaction becomes essential.

    “Transparency is the currency of trust in AI.” Evaluation platforms are minting that currency in real time.


    Building Robustness through Testing

    How do you make a language model robust? You test it—not just for functionality but for edge cases, injection attacks, and resilience to ambiguous prompts. Frameworks like Promptfoo and DeepEval excel in this space. They allow “red-teaming” scenarios, batch prompt testing, and regression suites that ensure prompts don’t quietly degrade over time.

    In a Responsible AI context, robustness means the model behaves predictably—even under stress. A single unpredictable behaviour may be harmless; thousands at scale can become systemic risk. By enabling systematic, repeatable evaluation, LLM frameworks ensure that AI systems do not just work but work reliably.


    Bringing Human Feedback into the Loop

    Responsible AI isn’t just about models—it’s about people. Frameworks like Opik offer hybrid evaluation pipelines where automated scoring is paired with human annotations. This creates a virtuous cycle where human values help shape the metrics, and those metrics then guide future tuning and development.

    This aligns perfectly with a human-centered approach to AI ethics. As datasets, models, and applications evolve, frameworks with human-in-the-loop feedback ensure that evaluation criteria remain aligned with societal norms and expectations.


    The Road Ahead: From Testing to Trust

    So, are LLM evaluation frameworks the backbone of Responsible AI?

    In many ways, yes. They offer the tooling to make abstract ethics real. They monitor, measure, trace, and test—embedding responsibility into the software stack itself.

    LLM frameworks are no longer just developer tools—they are ethical infrastructure. They help detect and reduce bias, enforce transparency, build robustness, and enhance explainability. Tools like Opik, Langfuse, and TruLens represent a new generation of AI engineering where responsibility is built-in, not bolted on.

    Questions for Further Thought:

    • Can we standardize metrics like “fairness” or “bias” across domains, or must every use case be uniquely evaluated?
    • Should regulatory compliance (e.g., AI Act or NIST AI RMF) be integrated into LLM evaluation frameworks by default?
    • As LLMs evolve, how can we ensure that evaluation frameworks stay ahead of emerging risks—like agentic behaviour or multimodal misinformation?

    In the pursuit of Responsible AI, LLM evaluation frameworks are not just useful—they are indispensable.

  • AI – The Currency between Snake Oil and New Oil

    December 30th, 2024

    Oil and Algorithms

    In 2006, Clive Humby, a British mathematician, and data scientist, famously coined the phrase “data is the new oil” to highlight the immense value of data in the modern world, much like oil has historically been a valuable resource. The advent of Big Data Analytics and machine learning models within the realm of AI has exponentially increased the power of information systems. These advanced algorithms act as “refineries,” extracting value from raw data and serving as the currency of the contemporary world. These refineries are pivotal in the data-driven economy, enabling companies to harness AI effectively. However, as the excitement around AI systems surged, so did skepticism. This led to the question: Are AI systems the new snake oil?

    In his book, “AI Snake Oil,” Princeton University’s Professor Arvind Narayanan, co-authored with Sayash Kapoor, addresses several critical issues such as misleading claims, harmful applications, and the big tech control of AI.

    Power of Algorithms

    Machine learning algorithms, including regression, classification, clustering, neural networks, and deep learning, identify patterns and make predictions based on data. Natural Language Processing (NLP) algorithms enable computers to understand, interpret, and generate human language, facilitating tasks like sentiment analysis and text summarization. Recommendation systems predict user preferences and suggest products, content, or services accordingly. Generative AI (GenAI) creates content such as text, images, music, and videos, with technologies like ChatGPT, DALL-E, and OpenAI’s Sora making a significant impact on daily life and work. Used as a tool, AI Copilots help developers reduce the time between idea and execution despite the need for constant refactoring of generated code and dealing with edge cases missed by AI.

    Successful AI Applications and Disappointments

    AI has found success in various domains:

    – IBM uses predictive AI for customer behavior analysis and supply chain optimization.

    – Amazon implements predictive models for demand forecasting and inventory management.

    – Google employs predictive analytics for ad targeting and search result optimization.

    – Netflix leverages predictive analytics for personalized content recommendations.

    – UPS uses predictive models for route optimization and vehicle maintenance.

    – American Express deploys predictive analytics for fraud detection and credit scoring.

    – H2O.ai’s models at Commonwealth Bank, Australia, assist in fraud detection, customer churn, merchant retention, and more.

    However, there have been notable disappointments. AI systems have perpetuated biases, leading to unfair hiring practices, incorrect medical diagnoses, and discriminatory outcomes. These incidents highlight the potential harms of AI when not properly designed, implemented, and used.

    Responsible AI

    The importance of transparency, accountability, and ethical considerations in AI development and deployment is now widely recognized. Instances of AI blunders, such as Google’s GenAI tool Gemini generating politically correct but historically inaccurate responses, underscore the challenges of training AI on biased data and balancing inclusivity with accuracy.

    Governments and institutions are increasingly focused on AI safety. Projects at leading universities sponsored by Governments & big tech companies aim to establish industry-specific guidelines. Some of these guidelines may become regulations, with hefty fines for violations, as seen with the EU AI Act. The debate on AI regulation versus innovation continues, with developers expected to self-regulate in the absence of enforceable laws. Enterprises using AI systems can adopt standards like ISO/IEC 42001:2023 to manage AI responsibly, ensuring ethical considerations, transparency, and safety.

    Impacts of Advanced AI and Future Considerations

    Innovations in AI algorithms are continually benefiting society. For example:

    – Google AI collaborates with the UK’s NHS to improve breast cancer screening consistency and quality.

    – AlphaFold2, the 2024 Nobel Prize-winning AI model, has revolutionized protein structure prediction, accelerating drug discovery and biotechnology.

    – Google’s DeepMind’s GenCast predicts weather and extreme conditions with unprecedented accuracy.

    Generative AI has advanced significantly, with models like OpenAI’s ‘o3’ overcoming traditional limitations and adapting to new tasks. These models have performed well on ARC-AGI (Abstraction and Reasoning Corpus for Artificial General Intelligence) benchmarks, marking progress towards AGI.

    As AI advances towards AGI, concerns about rogue AI agents and their potential threats grow. Autonomous Replication and Adaptation (ARA) could lead to AI agents evading shutdown and adapting to new challenges. AI containment strategies are evolving to address these risks.

    AI Landscape: Big Techs, Businesses and Us

    Big tech companies like Microsoft, Alphabet, Meta, and Amazon are set to invest over $1 trillion in AI in the coming years. McKinsey reports that businesses are dedicating at least 5% of their digital budgets to GenAI and analytical AI. While big tech companies skate fearlessly in the slippery zone between snake oil and the new oil to conquer the AI landscape, businesses appear to tread cautiously, concerned with ROI and responsible AI use. AI safety guidelines and regulations can serve as guardrails for us, the individuals, to navigate the slippery terrain between snake oil and the new oil.

  • ERP: an enterprising personal product journey!

    December 9th, 2024

    In this article I trace how ERP evolved from a system for manufacturing and gradually expanded to cover all business functions, the advent of client-server to replace main frames, the shift to cloud computing that made ERP accessible to businesses of all sizes. I trace my experience in the world of ERP starting as an early developer in client- server era of 1990s through its technological evolutions over the web and the cloud. I will also share my thought for the future of ERP in the current AI world.

    MRP I & II

    MRP (Material Requirements Planning) systems, an early precursor to ERP was developed during 1970s to manage manufacturing processes, especially inventory control and production scheduling. These systems were often large, mainframe-based, batch-oriented programs used by manufacturers to reduce waste and improve production efficiency. MRP evolved into MRP II (Manufacturing Resource Planning) during 1980s with additional functions such as Shop floor control, Capacity planning, and Demand forecasting. These systems were still operated on mainframe computers, requiring significant IT investment.

    The Rise of ERP

    MRP II expanded into Enterprise Resource Planning (ERP) during 1990s when my quest with ERP began. ERP systems moved beyond manufacturing to incorporate finance, human resources, sales, purchasing, and customer relationship management (CRM). This was the first time businesses could access a single, unified system for all core business processes. ERP was built on client-server architecture, making it more flexible and easier to deploy than its mainframe predecessors.

    I was one of the very few who had an opportunity to experiment with this modern technology and struggled with early versions of Microsoft Windows. Even though we developed our own technology to integrate the data among various modules of ERP, Relational database technologies which evolved later helped streamline integration across modules. While SAP and Oracle were the early ERP global vendors, Ramco started its journey ahead of many others to develop an ERP product in India. I remember challenges made by certain IIM educated people on the futility of such efforts in developing a product in India. The then young Vice Chair of the Ramco Group, Mr. P R Venketrama Raja, boldly took up the challenge and proved otherwise. I was lucky to be handpicked by him when he formed his first team for product development in India.

    It was none other than Bill Gates who launched, Ramco’s ERP product in 1994 during one of his first visits to India. Microsoft did not have its Navision at that point of time. Eyes of the large corporates in India fell on Ramco not just for its product, but for the organization as Ramco started its lone journey as a product developer in India crowded by IT service players.

    ERP Goes Web-Based

    The early 2000s saw the evolution of ERP into web-based platforms. This change enabled users to access ERP systems through web browsers, making them more accessible and user-friendly. Ramco was again the first in India to deliver web-based ERP. Products became more modular, allowing companies to implement specific functions without needing to deploy the entire system. This era saw the rise of service-oriented architecture (SOA), which allowed ERP systems to be more flexible, interoperable, and easier to integrate with third-party applications. High upfront costs, complexity, and the need for customizations were still common hurdles for many businesses.

    Cloud ERP and Mobility

     The 2010s were defined by shift of ERP from on-premises to cloud-hosted models. Companies could now access ERP solutions as a service (SaaS) through subscription-based models, reducing capital expenditure on IT infrastructure. Ramco announced its first version of ERP on the cloud in 2008. As the usage of ERP has become broad based, compliance requirements became mandatory due to computer generated reports becoming a norm in enterprises. My team, as a QA partner for product developer had terrific opportunity to test the product developed on cloud platform with enhanced functionality and compliance requirements.

    ERP systems became more user-centric with intuitive interfaces, and personalized dashboards. The rise of mobile devices allowed ERP users to access data and perform tasks on-the-go via mobile apps. Cloud ERP provided scalability, easier updates, lower upfront costs, and remote access, thereby making ERP solutions more affordable and practical for small and medium-sized businesses (SMBs). Data security, compliance, and control were initial concerns as businesses shifted critical data to the cloud which were taken care of thanks to specialised large data centres with built in high tech cyber security controls.

    Amitysoft, the company promoted by me became business partner of Ramco, thanks to the Chairperson who saw my evolution along with Ramco’s product and technology. Knowledge of tech behind the product, hands behind testing the product enabled my team to implement the product for several customers in India and abroad. As a partner, we were the first to deploy highest number of ERPs on the cloud in India. Amitysoft has largest number of successful partner implementations to its credit with several customer accolades.

    AI Driven ERP: The Current & Future

    The current era of ERP is marked by the integration of AI, machine learning, IoT, and analytics to create intelligent ERP systems. At Ramco, I had free hands to explore ‘Expert Systems,’ now known as Good Old-Fashioned AI (GOFAI), based approach for a Mine Planning ERP system in late 1990s. This was probably one of the first exploitation of AI concept in an ERP. In industries like manufacturing, logistics, and healthcare, IoT devices are integrated with ERP to monitor equipment including cobots in real time, manage assets, and optimize supply chains. AI based conversational bots are changing the UX to natural language – text and voice interactions.

    ERP systems are evolving towards becoming autonomous, where they will self-optimize based on real-time data, predict potential issues, and automatically adjust processes. More advanced AI capabilities will allow ERP systems to make autonomous decisions regarding supply chain adjustments, financial planning, and workforce optimization. Features for supporting sustainability from R & D to sourcing materials, inventory, manufacturing, and post sales will become a standard functionality cutting across all modules in ERP. Products will increasingly use blockchain for enhancing supply chain transparency, ensuring data integrity, and improving transaction security. Future ERP systems, as I foresee will be self-configurable, self-customizable to the context, and will adapt functionalities dynamically as the business goals and market change.

  • Demystifying AI/ML algorithms – Part IV: Neural Networks aka Brain Works

    December 7th, 2024

    About the series

    You had to wait till this fourth part of my series for discussions on Neural Networks, even though Neural Networks were the first ones to come into the realm of ML/AI and enjoying leadership position now. I would personally refer to these algorithms based on Neural Networks as ‘Brain Works.’

    You can read my earlier parts of this series:

     ‘Seen it before’ or Supervised algorithms were the subject of discussions in the second part  (https://ai-positive.com/2024/10/20/demystifying-ai-ml-algorithms-part-ii-supervised-algorithms-2/). The series started with my treatment of Good-Old-Fashioned-AI that gave a real start to practical use of AI (https://ai-positive.com/2024/08/28/understanding-gofai-rules-rule-and-symbolic-reasoning-in-ai/).

    Neural Network’s Nobel Journey

    Perceptron was one of the earliest incarnations of neural network models, developed by Frank Rosenblatt in 1958. Almost every decade starting from 1960s had newer developments in Neural Networks – Adaline in 1960, Backpropagation in 1974, Recurrent and Convolutional Neural Networks in 1980s, Long Short-Term Memory in 1997 followed by Generative Adversarial Networks in 2014, Diffusion Models in 2015 and Transformer in 2017, which transformed the AI scene into Generative AI, making Neural Networks the darling of today’s AI scene.  

    To top it all, the 2024 Nobel Prizes in Physics and Chemistry both have fascinating connections to neural networks. John Hopfield and Geoffrey Hinton were awarded the Nobel Prize in Physics, recognizing Hopfield network invented by John Hopfield and Boltzmann machine developed by Geoffrey Hinton. David Baker, Demis Hassabis, and John Jumper received the Nobel Prize in Chemistry for their contributions to computational protein design and protein structure prediction. Hassabis and Jumper developed AlphaFold2, an AI model that predicts protein structures with remarkable accuracy. Nobel Prizes added noble stature to Neural Networks.

    How does Neural Network Work?

    Following points detail the structure and working of Neural Network:

    1. Neurons (Nodes): Similar to biological neuron, basic unit of a Neural Network is a node appropriately named as a neuron. They are organized into layers.
    2. Layers: Starting with an input layer, followed by one or more hidden layers, and an output layer form the Neural Network. Each layer contains multiple neurons. Input layer receives the input data. Hidden layers process the data through a series of transformations and output layer produces final output or prediction.
    3. Input Data: When data enters the input layer, each feature is assigned to a neuron.
    1. Weights and Biases: Connection between neurons has a weight associated with it that determines the strength of the connection. Neurons also have a bias value that adjusts the output along with the weighted sum.
    2. Activation Function: Each neuron has an activation function which is applied to the weighted sum of its inputs plus the bias.
    3. Forward Propagation: The input data passes through the layers of the network, with each neuron computing its output based on activation function and passing it to the neurons in the next layer.
    4. Output: The last layer produces the output of the Neural Network.

    Brain works

    The reason I refer to Neural Networks as ‘Brain works’ is that Artificial Neural Networks (ANN) are inspired by the structure and workings of the brain of living beings as explained below:

    1. Neurons and Nodes: In the brain, neurons are the fundamental units that process and transmit information. Similarly, in ANNs, nodes referred to as artificial neurons serve as the basic units of computation.
    2. Synapses and Weights: Neurons in the brain are connected by synapses, which facilitate the transfer of information through neurotransmitters. In ANNs, connections between nodes are represented by weights, the strength of which influence the connection.
    3. Layers: The brain is organized in a manner, with different areas responsible for distinct types of processing. Likewise, ANNs have layers where each layer performs specific computations.
    4. Activation Function: In the brain, a neuron fires when it reaches a certain threshold of excitation. In ANNs, an activation function determines if a node should produce an output or not, simulating this firing mechanism.

    Assembly Line Analogy

    Before discussing aspects related to training of Neural Networks, let us look at assembly in a manufacturing unit as rough analogy to understand the concept behind how Neural Network works.

    1. Input Layer (Starting Point):
      • Beginning of the assembly line is where raw materials / components (inputs) are introduced. This is like where data enters the Neural Network. In case of car manufacturing, the raw materials such as steel fabrications, tyres, engines may enter the assembly line.
    2. Hidden Layers (Stations):
      • Each station on the assembly line represents a hidden layer in the neural network. At each station, workers (neurons) take the incoming materials (data), process them, and pass them on to the next station. (In case of building a car, the first station could frame the body of the car. The second station may install engine, while the third station could add wheels, and the next station may paint).
      • Weights (Tools and Techniques): The tools and techniques used by workers to process the materials represent the weights. They are like the influence each neuron carries.
      • Biases (Adjustments): Just like adjustments made by workers to ensure the specifications of the product, biases adjust the processing to improve accuracy.
      • Activation Function (Quality Check): Each station has a quality check mechanism (activation function) to decide if the processed material should move forward in the assembly line.

    Due to highly automated nature of car manufacturing, there may be fewer workers in each station and automation has taken up the role of workers using the tools & applying techniques at each station. Automated process controls handle the movement from one station to the next according to the product specification and quality requirements.

    1. Output Layer (End of the Line):
      • The final station on the assembly line is where the fully processed product comes out. The final, complete car rolls off the line, ready to be driven or test driven. This is the output layer where the final prediction or result of Neural Network is produced.

    Training the Neural Network

    “Cells that fire together, wire together,” is the core idea of how brain learns by adjusting the strength of synapses. When a neuron in brain repeatedly activates another neuron, the synaptic connection between them becomes stronger. Such repeated stimulation of a synapse leads to a long-lasting increase in its strength. Experiences, learning, and memory formation shape neural circuits in the brain. Adopting this idea, Neural Networks are made to learn through training using large data sets representing the context of the problem. Training involves the following steps:

    1. Data Preparation:

    Gather a dataset relevant to the problem to be solved. Clean the data by removing noise, handling missing values, and normalizing it to a suitable range.

    2. Network Initialization:

    Choose the type of neural network (see below for popular types of Neural Networks) and define its characteristics such as number of layers, types of layers, number of neurons in each layer. Initialize with random weights for the connections between neurons.

    3. Forward Propagation to produce output:

    Pass a batch of input data to the first layer. At each layer, compute the output by applying the activation functions to the weighted sum of inputs. Produce the final output of the network.

    4. Improving Output:

    Compare the network’s output with the actual target values using a loss function. Calculate the loss, which quantifies how far the outputs are from the actual values. Update the weights from the last layer to the first through Back Propagation. Technique called Gradient Descent is used for calculating the loss and updating the weights to minimize the loss.

    5. Iteration:

    Iterate through forward propagation, loss calculation, and backpropagation until the network’s performance improves. A complete pass through the dataset is referred to as an epoch. Usually, the data is divided into batches, and the weights are updated after processing each batch through multiple epochs.

    6. Evaluation:

    Evaluate the network on a separate validation data set to check for its performance. Adjust parameters such as number of neurons, number of layers, number of epochs/ data batches etc., and retrain if necessary to improve performance. These parameters are called Hyper Parameters.

    Deep learning is the term used for neural networks with many layers to model and understand complex patterns in large datasets.

    Popular Neural Networks:

    1. Feedforward Neural Networks (FNN): The simplest type of artificial neural network where the information moves in one direction—from input nodes, through hidden nodes to output nodes. They are widely used for pattern recognition.

    2. Convolutional Neural Networks (CNN): Primarily used for image and video recognition tasks. They are designed to learn spatial hierarchies of features automatically and adaptively from input images. This works like magnifying glass used to look at small parts of an image to recognize prominent parts. Feature maps of such prominent parts aid in deciding what the image is.

    3. Recurrent Neural Networks (RNN): Suitable for sequential data or time series prediction. They have connections that form directed cycles, allowing them to maintain a ‘memory’ of previous inputs. Even in the world of Transformer Networks (see below), RNNs are more effective for applications where data arrives in a continuous stream and decisions need to be made on-the-fly such as Real-Time Speech Recognition and Stock Price prediction.

    4. Long Short-Term Memory Networks (LSTM): A type of RNN that can learn long-term dependencies. They are particularly effective for tasks where the context of previous data points is important, such as language modelling. While Transformer Network (see below) does this better, LSTM is preferred for smaller datasets or simpler tasks, as it is easier to implement, and train compared to transformers.

    5. Generative Adversarial Networks (GANs): Consist of two neural networks, a generator and a discriminator, which compete against each other. GAN works like iterative constructive criticism of a critic against a creator’s work to improve the creator’s output to make it more realistic. They are used for generating synthetic instances of data, restoring damaged photos by filling up missing parts and for predicting future frames in videos as required in autonomous driving.

    6. Transformer Networks: They use mechanisms called attention to weigh the influence of various parts of the input data. Based on the famous research paper ‘Attention Is All You Need,’ from Google researchers, Transformer is the major component of today’s Generative AI – GenAI.

    Brain Works or Selfies or Seen-It-Before?

    While it is true that many problems solved by traditional statistical machine learning (ML) algorithms such as Selfies and Seen-It-Before, discussed in the previous two parts can also be tackled by neural networks, following are some nuances to consider:

    1. Flexibility and Power: Neural networks, especially deep learning models, are highly flexible and powerful. They can model complex, non-linear relationships in data, making them suitable for a wide range of tasks, from image recognition to natural language processing.

    2. Data Requirements: Neural networks typically require enormous amounts of data to perform well. Traditional statistical ML algorithms, like linear regression or decision trees is good enough for smaller datasets.

    3. Interpretability: Traditional ML algorithms are more interpretable while Neural networks are often considered “black boxes” due to their complexity.

    4. Computational Resources: Neural networks, especially deep learning models, require significant computational resources for training. Traditional ML algorithms are usually less resource-intensive and can be cost effective for certain applications.

    5. Specific Use Cases: Some problems are better suited to traditional ML algorithms due to their simplicity and efficiency. For example, logistic regression is often used for binary classification tasks, and k-means clustering remains a popular choice for unsupervised learning tasks.

    I hope the four parts would have provided a conceptual understanding of essential ML/AI algorithms. We will review how to make an optimum choice of ML/AI algorithms in a later part of this series.

  • Demystifying AI/ML algorithms – Part III: Selfies, the Unsupervised.

    November 17th, 2024

    About the series

    In this third part of the series on the basics of AI/ML algorithms, I would deal with so called Un Supervised algorithms, which I refer to as Selfies. ‘Seen it before’ or Supervised algorithms were the subject of discussions in the second part ( https://ai-positive.com/2024/10/20/demystifying-ai-ml-algorithms-part-ii-supervised-algorithms-2/). The series started with my treatment of Good-Old-Fashioned-AI that gave a real start to practical use of AI (https://ai-positive.com/2024/08/28/understanding-gofai-rules-rule-and-symbolic-reasoning-in-ai/).

    Getting rid of teacher

    All algorithms we discussed in the second part require data with label, the outcome which the teacher as a trainer relate to the other input variables in the data to identify pattern. The output of the machine learning algorithms is a trained model which when fed with a new data can predict the label to which the new data belong, hence the outcome. Finding a good teacher is always a challenging task. What if we must automatically find the pattern in the data without explicit label?

    Child learns within the first 3 years after born without a real teacher! They observe, listen, touch, taste, and smell everything they encounter which helps them learn about their environment. Imitation and play enable child to learn quickly. The logic of learning is already there in child’s mind. It becomes human nature to group things together or categorize them to make better sense of things. We see stars and constellations appear. Unsupervised algorithms in that sense are selfies which find hidden patterns, structures, and relationship within the data. There are several popular unsupervised learning algorithms widely used for machine learning. Let us look at most common ones.

    K-Means Clustering: This algorithm partitions data into K distinct clusters based on the distance to the centroids of those clusters. This is like separating items of particular colour from mixed items of various colours. K indicates number of clusters to be formed. Popular implementations of K-Means clustering algorithms are:

    Customer Segmentation: Grouping customers based on purchasing behaviour for targeted marketing.

    Image Compression: Reducing the number of colours in an image by clustering similar colours together.

    Document Classification: Organizing documents into topics or categories based on their content.

    Hierarchical Clustering: It is a method of clustering that builds a hierarchy of clusters, which can be visualized as a dendrogram. Hierarchical clustering works something like this. Suppose you have group of students in a class and want to form groups based on similar interests for club activities. Initially each student is a cluster. Identify how similar each student is to every other student based on their interests. Group together two students who have most similar interests. Now find similarity of this new group to other students or other groups so formed. Repeat the process until everyone is part of any of the groups.

     Typical use cases of Hierarchical Clustering are:

    Gene Expression Analysis: Grouping genes with similar expression patterns in biological research.

    Market Research: Segmenting markets based on consumer preferences and behaviours.

    Social Network Analysis: Identifying communities within social networks.

    DBSCAN (Density-Based Spatial Clustering of Applications with Noise): DBSCAN groups together points that are close to each other based on their density and marks points that are in low-density regions as outliers/ noise. It uses parameters like radius and minimum points to define dense regions and expands clusters from core points.

    Suppose you are at a crowded party, and you want to figure out who are loners. Starting with a person check who are all close by. Anyone within the close range is part of same group. For each of the new group members, you again see who is close to them and keep adding them to the group. If someone is not near enough to form a group, they are considered an outsider or noise. The process is to be repeated till everyone at the party is either part of a group or classified as noise.

    It is quite natural that DBSCAN is used for these applications:

    Anomaly Detection: Identifying outliers in datasets, such as fraudulent transactions.

    Geospatial Analysis: Detecting clusters of geographic locations, like hotspots in crime data.

    Astronomy: Clustering stars or galaxies based on their characteristics.

    Apriori Algorithm: Apriori algorithm is a learning method which discovers frequent itemsets in data. It then generates several association rules for those set of items. By calculating two factors – namely ‘confidence’ and ‘lift,’ Apriori algorithm eliminates rules which do not meet the minimum requirement and retains only those rules that qualify.

    This algorithm works in context of a supermarket to identify items which are bought together frequently. It looks first at individual items such as soap or shampoo and counts how often they are bought. Retaining items that are bought frequently enough (above a particular number of times in a period like week), the algorithm looks at pairs of these times such as ‘soap and shampoo’ to see how often they are bought together. Retaining further only those pairs bought together frequently enough, the algorithm looks for larger sets of items like soap, shampoo and possibly conditioner and repeats the process. It keeps expanding and counting sets of items filtering out the ones that are not bought often enough together. The process results in item sets that are frequently bought together which helps to understand customer behaviour to make decisions by supermarket management.

    Apriori Algorithm is used for:

    Market Basket Analysis: Finding frequent item sets in transactional data to understand buying patterns.

    Recommendation Systems: Suggesting products to customers based on their purchase history as well as other customer’s purchase history.

    Web Usage Mining: Identifying common patterns in web navigation behaviour.

    Self-Organizing Maps (SOM): SOM is used for clustering and visualization of high-dimensional data, i.e., data with several features/ variables. Preserving the topological structure of the original data, it creates a lower dimensional grid of computational units (called neurons) making it easier to identify patterns, clusters, and relationships.

    SOM can be used to visualize the similarities between songs breaking down higher dimensional features such as Tempo, Genre, Duration, Energy, Danceability, Loudness, Musical key, Acousticness, Valence, Instrumentalness and Speechiness into a 2 dimensional grid with top-left cluster containing songs with high energy, fast tempo and high danceability (dance and electronic music category) and bottom right with high acousticness, low energy and high instrumentalness (classical, acoustic music category) and the centre containing songs with moderate energy, positive valence and high speechiness (pop and hip-hop music category).

    Typical real-life applications of SOMs include:

    Speech/ Handwriting Recognition: Recognizing patterns in complex datasets, such as speech or handwriting.

    Social Network Analysis: Visualizing the structure of social networks and identifying communities or influential individual within the network.

    Manufacturing Process: Used to monitor the health of machinery and detect potential failures based on the pattern of sensor data on temperature, vibration, and acoustic emissions and identifying deviations from the normal patterns indicating potential issues.

    Principal Component Analysis (PCA): Reduces the dimensionality of data by transforming it into a new set of variables (principal components) that capture the most variance. Dimensionality refers to the number of features or variables in a dataset. In other words, PCA simplifies the data while preserving the variance as much as possible so that resulting data is easier to visualize and analyse. PCA is used as pre-processing step to reduce the number of features prior to applying any other machine learning algorithm to build a model.

    Imagine you have a huge photo album, and each photo has several details like people, locations, activities, attires, and dates. It would be overwhelming to look through every photo and find key moments. We can identify key features or common themes such as weddings, birth days, vacations that most photos share and group them according to the theme. Choose a few representative photos from each group that capture the essence of the theme, which will highlight the key moments and people. PCA works like this.

    Most common use cases of PCA are:

    Face Recognition: Identifying prominent features in facial images for recognition, typically used by the police to identify a criminal from the description by a witness.

    Stock Market Analysis: PCA is used to analyse and reduce the dimensionality of financial data, to identify the most key factors affecting stock prices and make informed decisions.

    Environmental Studies: Analysing environmental data such as air quality and water pollution to determine the main sources of pollution to develop strategy for environmental protection.

    Seen-it-before or Selfies – which way to go?

    Selfies are the best for exploratory tasks such as customer segmentation, anonymity detection and market basket analysis when you do not have labels.

    Selfies focus on exploration and discovering insights from data without pre-defined labels, but not on accuracy that can be obtained from Seen-it-before algorithms.

    Selfie algorithms can be used to pre-train a model or extract features from data which then can be fed into Seen-it-before algorithms for building models with more accurate prediction.

    There is also a cross between Seen-it-before algorithms and Selfies where a small amount of labelled data is combined with a large amount of unlabelled data to improve learning accuracy iteratively.

    Ensemble method of using multiple models from both categories are used to arrive at the most accurate final model.

  • Demystifying  AI/ML algorithms – Part II: Supervised algorithms

    October 20th, 2024

    About the series

    This is the second part of my series on Demystifying AI/ML algorithms. This series is intended for curious people, who missed the buzz around AI/ ML until GenAI largely captured their attention. Some of the contents I have already shared years back, but feel it is important to revisit them before plunging into GenAI. I traced the origin of AI/ML in my first part of the current series and discussed how Good-Old-Fashioned-AI gave a real start to AI and still remains relevant (https://ai-positive.com/2024/08/28/understanding-gofai-rules-rule-and-symbolic-reasoning-in-ai/).

    Patterns and Meaning

    What makes us human is the need for us to search for meaning. If you want to get clarity from chaos, you try to identify patterns among chaos. Patterns are observations organized into meaningful categories. Charles Darwin’s theory of evolution and Gregor Mendel’s laws of heredity are outcome of careful observations of nature around. Patterns can be derived from observations of numbers, people’s behaviours, musical scores, and even our thoughts. We need a large number of observations to identify patterns. Data from observations and eliciting patterns from them brings out clarity of the real world represented by the data and enables predictability. Statistics, considered by many as a boring part of mathematics provides methods to derive patterns from data.

    Machine learning algorithms are rooted in statistics. Statistical foundations of these algorithms enable them to learn from data, adapt, and generalize:

    • Learn from Data: They identify patterns and relationships in data without needing specific ‘if-then-else’ rules.
    • Adapt and Improve: They can adapt to new data and improve their performance through training and validation.
    • Generalize: They aim to generalize from the training data to make accurate predictions on unseen data inputs.

    When it comes to learning, there comes a teacher. There are also self-learners. There emerge two subcategories of these machine learning algorithms which I refer to as ‘Seen it before’ and ‘Selfies.’ In literature they are classified as Supervised and Un-Supervised algorithms.

    Seen it before Algorithms

    This category of supervised algorithms revolves around:

    • Learning to see similarities between situations and thereby inferring other similarities, like if two patients have similar symptoms, they may have the same disease.
    • The key is judging how similar two things are and which similarities to take forward and how to combine them to make new predictions.

    They help solve real-word problems thru:

    • Regression – deriving extent of relationship between set of data points reflecting the problem to predict a new value in the problem context.
    • Classification – sorts data from problem context into distinct groups and helps predicting whether a new data point belongs to a particular group or not.

    These algorithms need a label to group a set of data points during training to create a model that helps predicting the group for a new set of data points, the reason they are referred as Supervised Algorithms.

    Linear Regression

    If you want to predict to what extent you will feel relaxed when you sleep for a particular number of hours on a specific day, a line drawn between number of hours of sleep data on one axis and extent of relaxation on the other axis gathered from a good number of observations will become the pattern that would help to model using Linear Regression.

    Used for predicting continuous values, Linear Regression models the relationship between a dependent variable and one or more independent variables from among the data to elicit a pattern.

    Typical use-cases:

    • Used for predicting price to be offered for apartments by builders based on features like location, number of rooms and other factors.
    • Businesses use it forecast future sales based on historical sales data, market spend and economic indicators.
    • I have used it for estimating efforts for software testing based on the characteristics of application under test.

    Logistic Regression

    If you want to predict whether your favourite IPL team will win a particular match or not, logistic regression helps to determine the probability of this result happening based on factors like home advantage, strength of teams and weather conditions.

    Logistic Regression uses past data to give a percentage chance of an outcome and then making a yes/ no prediction based whether the probability is at least more than 50% or not. Used for binary classification problems, it predicts the probability of binary outcome unlike Linear Regression which works on continuous values.

    Typical use-cases:

    • Predicting whether a patient has a certain disease based on factors such as medical history, age, weight, and lifestyle.
    • Predicting whether a customer will buy a product or not based on past behaviours.

    Decision Trees

    Used for both regression to predict numerical value as in Linear Regression and for classification like ‘yes/ no’ as in Logistic Regression, decision trees split the data into branches based on various values in the data creating a tree structure to produce an output.

    Referred as non-parametric models, decision trees make fewer assumptions about data distribution unlike Linear Regression or Logistic Regression models which assume a normal distribution. While Decision Trees are flexible to adopt the pattern of underlying data, they are more complex and require more data to achieve satisfactory results. Decision Trees are better choice when there exist complex interactions between various fields in the data and in scenarios where interpretability the prediction process is key.

    Typical use-cases:

    • Marketing teams to segment customers based on purchasing behaviour, demographics, and engagement, when data consists of label.
    • Credit scoring agencies to identify riskier applicants based on income, credit history and employment status.

    Support Vector Machines (SVM)

    Used for both classification and regression problems like Decision Trees, SVMs are better when the number of features (data fields) runs to hundreds. Suppose if the problem is to make a robot sort between apples and oranges based on various characteristics of apple and orange, SVM identifies the best ‘straight line’ between them. If there is any overlap, SVM performs ‘kernel trick’ to transform data into a 3D space and separate them thru hyperplane easily.

    While SVM algorithms manage high dimensional spaces, Decision Trees are simpler and better interpretable when the data fields are in tens and not in hundreds. Overfitting can happen in Decision Trees when the number of features increase in which case SVM is a better choice.

    Typical use-cases:

    • SVM works well for problems that can be solved by classification such as identifying objects in photos and detecting faces in images.
    • Sentiment analysis in social media postings such as whether it contains hate speech largely depend on text categorization capability of SVM algorithm.
    • SVM algorithms are also used in speech recognition applications as it can be used to recognize spoke words and convert them into text.

    k-Nearest Neighbours (kNN)

    A simple, instance-based learning algorithm k – Nearest Neighbours (kNN) can be used for both classification and regression. It classifies a data point based on the majority class among its k nearest neighbours.

    Suppose there is a party related to musical awards event and there are fans of major composers in the party.  In general, we can expect fans of a particular composer to get close to each other and engage in animated discussions. If a new person enters the party hall and gets settled closer to one of those groups, then it is quite possible that the new person is a fan of the same composer whose fans are close to each other. kNN does this in a mathematical way finding the ‘distance’ between data points and using the majority vote of the nearest neighbours to make predictions.

    Typical use-cases:

    • Recommendation systems like how Netflix recommends movies based on your viewing history.
    • Anomaly detection like in ‘fraud detection and network security’ detecting unusual data points in data sets.
    • Speech recognition applications such as identifying and classifying speech patterns to activate voice-based systems use kNN.

    kNN can be used for simple to moderate sized data sets. It works on the entire data set and finds out k nearest neighbours, k being the size of elements in the group to be recognized as neighbours. It is less complex and there is no need for training but computationally expensive as it needs the entire data set in memory unlike other algorithms which create a model out of training data.

    Extreme Gradient Boosting (XGBoost)

    All the above algorithms handle problems that can be solved by classification and regression. Choosing any of them for a particular problem is based on the data set on hand.

    Ensemble methods combine multiple models to improve prediction accuracy and robustness. It is like a committee of multiple experts working co-operatively together to arrive at a decision.

    Considered as a rock star, XGBoost is the most powerful and efficient among the ensemble methods. XGBoost builds models sequentially, where each new model corrects the errors of previous ones. It also uses smart ways to deal with missing data and thereby eliminates preprocessing.

    Typical use-cases:

    • Credit scoring agencies use XGBoost to predict the probability of loan default and assess credit worthiness of loan applicants based on age, income, existing loan, previous defaults, and other details.
    • Banks use XGBoost to detect fraudulent transactions by identifying usual patterns among data such as the type of transaction, time of transaction, and location.
    • Telecom companies use XGBoost to predict customer churn based on usage patterns and customer activities.

    Key Take-aways

    Seen-it-before algorithms

    • can be used for any prediction problem that can be solved using classification or regression technique and has underlying big data from which patterns can be elicited as in several use-cases cited.
    • can be used individually or combined to form an ensemble to improve predictability and performance.

  • Rules Rule – GOFAI: Demystifying AI algorithms

    August 28th, 2024

    In this series of articles titled ‘Demystifying AI Algorithms’, I would like to explore the basic nature of AI/ML algorithm categories and develop simple to understand perspective of the algorithms and their applications.  Consciously I will not get deeply into technical aspects and will deal only with those applications that explicitly touch our personal and work lives. There could be some technical examples for the sake of details, which can be ignored. In this first part, I will bring out perspectives on so called ‘Good Old Fashioned Artificial intelligence (GOFAI)’.

    The thumb of Old-fashioned

    We will start from where it all started and tell you what is now referred to as ‘Good Old Fashioned Artificial intelligence (GOFAI) which was prominent during mid-1950s to mid-1990s and is still relevant. Not that all young people ignore elderly old-fashioned people. Trying to mimic experts was the earliest approach to install intelligence in machines. Rules and thumb rules in expert’s knowledgebase get processed in their brain to come out with answers for questions which often surprise the ordinary souls. I refer these categories of algorithms as ‘Rules Rule’.

    How Rules rule?

    In this approach knowledge is represented symbolically, and logical rules framed to simulate human intelligence. Knowledge can also be coded into rules and represented as trees which can be searched. Inverse deduction is one of the methods used to arrive at a result based on available rules.

    To get a feel of how inverse deduction works let’s take up a simple rule set: ‘Cow eats grass’, ‘Sheep eats grass’, ‘Horse eats horse grams’, ‘Grass is plant material’, ‘Horse grams is plant material’, ‘Herbivores eat plant material’.

    From the above rule we can deduce that ‘Horse is herbivores’ and the process is known as inverse deduction. The below implementation in prolog program can produce result as to whether cow is a herbivore.

    % Facts

    eats(cow, grass).

    eats(sheep, grass).

    eats(horse, horse_grams).

    eats(deer, grass).

    eats(lion, deer).

    is_plant_material(grass).

    is_plant_material(horse_grams).

    is_animal_material(deer).

    % Rule: Herbivores eat plant material

    herbivore(X) :- eats(X, Y), is_plant_material(Y).

    % Rule: Carnivores eat animal material

    carnivore(X) :- eats(X, Y), is_animal_material(Y).

    % Query: Is Horse a herbivore?

    ?- herbivore(horse).

    % Query: Is Lion a carnivore?

    ?- carnivore(lion).

    % Query: Is Horse a herbivore?

    ?- herbivore(horse).

    We define the facts about what each animal eats, and that grass and horse grams are plant material. When we run this Prolog program, it will deduce that “Horse is Herbivores” based on the given rules.  Unlike the imperative programming languages such as python or Java where we explicitly specify control flow using if-then-else statements,  Prolog is a declarative language where we specify what we want to achieve rather than how to achieve it.

    What is the nature of GOFAI?

    If your problem can be solved using a set of rules or a tree structure, and you can find the solution by following these rules or searching the tree, then it’s a good fit for the GOFAI approach.

    GOFAI is well-suited for building expert systems that emulate human expertise in specific domains. MYCIN was an early expert system developed in the 1970s by Edward Shortliffe at Stanford University. Its primary purpose was to assist doctors in diagnosing and recommending treatments for patients with blood diseases, particularly bacterial infections.

    Pathfinding, game playing, and puzzle solving are some of the use cases where search techniques like search, breadth-first search, and depth-first search can be used to navigate through tree representations. Other applications include medical diagnosis, legal reasoning, financial advice, and troubleshooting complex machinery.

    Problems like automatic planning, scheduling, resource allocation involve searches through possible states starting from initial state and executing actions to achieve a specific goal can be handled through algorithms that are of GOFAI category.

    Limitations of GOFAI

    There are several notable limitations that have influenced the shift towards other AI approaches like machine learning and neural networks:

    1. They lack the flexibility to adapt to new or unexpected scenarios.
    2. GOFAI systems do not learn from experience.
    3. As the complexity of the problem domain increases, the number of rules required can grow exponentially leading to scalability issues.

    Why should we get along with this old-fashioned guy?

    GOFAI provides a foundational understanding of AI principles and techniques. You may wonder the relevance of this old-fashioned fellow in the current AI world where we hear a lot about LLMs like ChatGPT.

    1. GOFAI systems are often more transparent and easier to understand compared to complex machine learning models. This makes them valuable in applications where explainability is crucial, such as legal and medical decision-making.
    2. Modern AI often integrates GOFAI principles with machine learning techniques to create hybrid systems. For example, combining symbolic reasoning with neural networks can enhance the interpretability and robustness of AI models.

    I have pointed out how the AI world is taking a leap back to GOFAI  in my earlier blog – ‘Back to basics: Machine Learning for Human Understanding ( https://ai-positive.com/2024/05/30/back-to-basics-machine-learning-for-human-understanding/ ). In the world of ever growing AI models amidst web of AI/ML algorithms, there is an opportunity for GOFAI, which can be investigated further.

  • Back to Basics: Machine Learning for Human Understanding!

    May 30th, 2024

    Can human understand machine learning and develop trust? Trustworthiness of AI systems stands on three delicate pillars – Fairness, Explainability and Security. The objectives are to avoid biases due to social stereotypes, prevent misinformation and stop privacy leak. In addition, compliance framework serves as a support mechanism to the three pillars and provide better stability while evaluating Trustworthiness of AI systems.

    I briefly touched upon the fairness in one of my earlier posts – Ethics in Artificial Intelligence (https://cosmicouslife.wordpress.com/2024/01/15/ethics-in-artificial-intelligence/). Fairness is judgemental as it deals with diverse discriminations. This aspect requires much better treatment, later.

    Compliance with respect to regulations is nothing but interpretations of the other three attributes of trustworthiness to the local environment in which the AI operate. This will make policy makers busy and enable a lot of business opportunities to the top consulting firms, adding to the overall cost of AI systems.

    Explainability requires some explanations. As long as we get value for the money or do not bother too much about the occasional small losses, we may trust the black box recommendation systems even if it keeps its mouth shut on why it recommended a particular thing. I undergo interesting explanation experience. An interesting person with whom I interact often never just answers any questions posed but comes out with a chain of reasons for the answer every time, challenging my patience. Do we need explanations for everything? No, but for some high impact situations. There is a cost attached to it.

    Accuracy versus Interpretability trade-off constraints the choice of algorithms used for machine learning and so the explainability. Linear regression and Neural Networks are at both end of the spectrum, with Decision tree, K – Nearest Neighbours, Random Forest, and Support Vector Machines in-between. With neural network Large Language Models (LLM), interpretability is more of a challenge. If the system comes out with smart explanations, are we smarter enough to distinguish concocted explanations that the AI systems are capable of?

    Explainability involves pointing out to the parts of an image like eyes that make an algorithm to label the image as a frog or building a decision tree of reasoning to provide traceability. Explanations are nothing but post facto confessions like ‘Why I did What I did’ derived from trained prediction models. Google came out with ‘chain-of-thought prompting’ for getting LLMs to reveal their ‘thinking.’ Thilo Hagendorff, a computer scientist at the University of Stuttgart in Germany has gone to the extent of saying that psychological investigations are required to open up the mad machine learning black boxes. Researchers are measuring the machine equivalents of bias, reasoning, moral values, creativity, emotions, and theory of mind in AI models to evaluate their trustworthiness. Like neuro imaging scan for humans, researchers are designing lie detectors that look at activation of specific neurons in neural network models to identify those set of neurons which wire together and then fire together. Anthropic, an AI safety and research company created a map of the inner workings of one of their models on May 21st, 2024. This map can aid understanding of neuron like data points called features that affect the output.

    A refreshing approach to explainability revolves around making the networks to learn from explanations rather than justifying the predictions. I was intrigued listening to Prof Vineeth Balasubramanian of IIT, Hyderabad, talking about his works on ‘Ante-hoc explainability via concepts’ in a recent IEEE event held in Chennai. Prediction models based on supervised learning are like memorizing everything and vomiting answers without adequate conceptual foundation. Researchers refer them as ‘stochastic parrots,’ meaning those models are probabilistic combination of patterns of text encountered while training, without understanding of the fundamental concepts. Models with concepts and rules seem to set the direction for explainability. This is counter intuitive to conventional prediction models. Expert systems of erstwhile era are decent implementation of explainable AI systems. Is it not like going back to the basics for human understanding of machine learning?

1 2
Next Page→

Blog at WordPress.com.

 

Loading Comments...
 

    • Subscribe Subscribed
      • COSMICOUS
      • Already have a WordPress.com account? Log in now.
      • COSMICOUS
      • Subscribe Subscribed
      • Sign up
      • Log in
      • Report this content
      • View site in Reader
      • Manage subscriptions
      • Collapse this bar