ARJUN NAGENDRAN
  • Portfolio
    • Mursion
  • Relativ
  • Robotics
  • Blog

All things AI, VR, Entrepreneurial, Academic, and Fun!

Why Generative AI will positively disrupt Career Readiness preparation.

4/15/2024


 

In my first blog post about Generative AI and Large Language Models (LLMs), I explained their inner workings and why they are well-suited for adoption in both academic and business contexts. I'm building on that introduction by diving deeper into how these models behave in specific contexts. In this second blog, I will focus on the characteristics of LLMs that make them well-suited to career readiness preparation, from the grassroots stage (high schools, colleges) to later in your career, when you're ready to land your dream role at your favorite company. We've already seen that LLMs are revolutionizing various fields, including interview coaching to find your first job, college admissions preparation, and preparing for job interviews while navigating various careers (experience a demo). Their unique properties enable personalized guidance based on individual performance, making them an invaluable tool in refining the human skills needed to succeed in any career. Let's look at some of these properties in more detail, explore the science behind "why" LLMs are so well-suited to providing personalized guidance, and examine what constraints must be put in place to ensure reliable output from these models.

Advanced Natural Language Understanding

LLMs possess advanced natural language understanding capabilities, allowing them to accurately interpret interview responses. They can comprehend nuances in language, identifying strengths and weaknesses in communication skills. A good reference for this capability is the paper "Improving Language Understanding by Generative Pre-Training" by Radford et al. This paper introduces the popular decoder-style architecture used in LLMs, focusing on pretraining via next-word prediction, which enables these models to comprehend nuances in language and exhibit advanced natural language understanding capabilities. To take advantage of this capability, it is important to provide the A.I. model with information and context specific to your needs. There are several ways to do this, ranging from few-shot learning to fine-tuning, a full treatment of which is beyond the scope of this blog. Courses such as these from DeepLearning.ai can be extremely useful if you are new to this field.
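To make the idea of few-shot learning concrete, here is a minimal sketch of how worked examples can be packed into a prompt so the model infers the grading rubric before seeing a new answer. The questions, answers, and feedback strings are hypothetical illustrations, not output from any real coaching system.

```python
# A minimal few-shot prompting sketch: a couple of scored example answers
# teach the model the rubric before it grades a new response.
# All example text below is hypothetical.

EXAMPLES = [
    ("Tell me about a time you led a team.",
     "I organized my group's schedule and we shipped on time.",
     "Strength: concrete outcome. Weakness: little detail on leadership style."),
    ("Why do you want this role?",
     "I just need a job.",
     "Weakness: no connection to the role; lacks motivation."),
]

def build_few_shot_prompt(question, answer):
    """Assemble a few-shot prompt from worked examples plus the new response."""
    parts = ["You are an interview coach. Give feedback on each answer.\n"]
    for q, a, feedback in EXAMPLES:
        parts.append(f"Question: {q}\nAnswer: {a}\nFeedback: {feedback}\n")
    # The model is left to complete the final "Feedback:" line itself.
    parts.append(f"Question: {question}\nAnswer: {answer}\nFeedback:")
    return "\n".join(parts)

prompt = build_few_shot_prompt("What is your greatest strength?", "I work hard.")
print(prompt)
```

The same prompt skeleton works whether the examples are hand-written or drawn from your own curated data; the key is that the format of the examples matches the format you want back.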

Adaptive Feedback Mechanisms

These models utilize adaptive feedback mechanisms to tailor coaching based on individual needs. By analyzing interview responses, LLMs can provide targeted feedback, focusing on areas requiring improvement while reinforcing strengths. To better understand the science of LLMs and their adaptive feedback mechanisms to tailor coaching based on individual needs, you can refer to the following articles:
  • AdaPlanner: Adaptive Planning from Feedback with Language Models: This paper introduces a closed-loop approach that allows LLM agents to adaptively refine their plans based on environmental feedback. It discusses in-plan and out-of-plan refinement strategies, a code-style LLM prompt structure, and a skill discovery mechanism. AdaPlanner has shown superior performance in various environments, particularly with few-shot learning.
  • AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback: This work presents a framework designed to enhance the synergy between LLMs and Reinforcement Learning (RL) feedback. AdaRefiner includes a lightweight Adapter Language Model that refines task comprehension based on RL agents' feedback, without altering the generalization abilities of LLMs, while improving decision-making capabilities in an adaptable and efficient manner.
These two papers provide valuable insights into how LLMs can effectively utilize adaptive feedback mechanisms for tailored coaching based on individual needs.
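In the spirit of the closed-loop refinement these papers describe, here is a toy sketch: a "model" proposes a plan, the environment returns feedback, and the plan is revised until it is accepted. Both functions are hypothetical stand-ins for real LLM and environment calls, kept deliberately tiny to show only the loop structure.

```python
# A toy closed loop in the spirit of AdaPlanner: propose, get feedback,
# refine, repeat. The two functions are hypothetical stand-ins, not LLM calls.

def propose_plan(goal, feedback):
    """Draft a plan; revise it if earlier feedback asked for an example."""
    plan = ["draft answer", "deliver answer"]
    if feedback and "example" in feedback:
        plan.insert(1, "add a concrete example")  # out-of-plan refinement
    return plan

def environment_feedback(plan):
    """Accept the plan (None) or explain what is missing."""
    if "add a concrete example" not in plan:
        return "answer is too abstract; add an example"
    return None

feedback = None
for _ in range(3):  # bounded retries, so the loop always terminates
    plan = propose_plan("answer a behavioral question", feedback)
    feedback = environment_feedback(plan)
    if feedback is None:  # plan accepted
        break

print(plan)
```

In a real system, the feedback string would come from a grader model or the task environment, and the refinement step would be another LLM call conditioned on that feedback.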

Attention Mechanisms & Transfer Learning

LLMs leverage vast datasets to generate data-driven insights into interview performance. By comparing responses to successful interview patterns, they can offer actionable advice to enhance performance. One such dataset is the MIT Interview Dataset, which comprises 138 audio-visual recordings of mock interviews with internship-seeking students at the Massachusetts Institute of Technology (MIT). This dataset was used to predict hiring decisions and other interview-specific traits by extracting features related to non-verbal behavioral cues, linguistic skills, speaking rates, facial expressions, and head gestures. Modern-day LLMs have some unique properties that allow them to exhibit similar capabilities given contextual information and new data, even if that data is not as comprehensive as the dataset described above. I briefly highlight two of these properties below:
  • Attention Mechanisms: LLMs employ attention mechanisms, such as self-attention or multi-head attention, to weigh the importance of different words or phrases in interview responses. This allows them to focus on relevant information and provide more accurate feedback. In simpler terms, they consider all the words in a sentence as context and weigh the importance of each word when processing another word, allowing them to discern meaning. You can read more about this in Google's landmark paper here.
  • Transfer Learning: LLMs often utilize transfer learning, where they are pre-trained on large-scale datasets for general language understanding tasks before fine-tuning on interview-specific data. This transfer of knowledge enables LLMs to leverage previously learned patterns and features, enhancing their ability to provide effective coaching by tapping into their general language understanding capability.
The result is that LLMs can analyze the conversational exchanges and responses of an end user without massive amounts of training data, provided you give the model sufficient contextual information on what to look for, in a manner that the LLM can comprehend (e.g. via fine-tuning).
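The self-attention idea described above can be sketched in a few lines: each position's vector is updated as a softmax-weighted average of every position, with weights derived from pairwise similarity. This toy version uses the raw vectors as queries, keys, and values; real models apply learned projections and many attention heads.

```python
# Minimal scaled dot-product self-attention over toy 2-d "word vectors".
# Here Q = K = V = X for simplicity; real transformers use learned projections.
import numpy as np

def self_attention(X):
    """X has shape (seq_len, d); returns one updated vector per position."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # pairwise similarity, scaled
    # Row-wise softmax turns similarities into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ X  # each output is a weighted mix of all positions

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # three toy word vectors
out = self_attention(X)
print(out.shape)  # one 2-d vector per input position
```

The important property is visible even at this scale: every output vector depends on every input vector, which is what lets the model treat a whole sentence as context for each word.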

Real-Time Computation

One of the key strengths of LLMs is their ability to provide real-time analysis given structured data. This instantaneous feedback enables candidates to adjust their approach on the fly, improving their performance as they go. The architecture of LLMs, which are essentially complex neural networks, is optimized for efficient computation. These neural networks have already learnt a mapping between the input and output based on billions of parameters and are utilizing these learnt weights to perform mathematical computations at a rapid pace. This allows them to analyze conversational information such as interview responses in real-time, providing immediate feedback to users during these interactive sessions. Based on analysis and feedback, LLMs can help create personalized learning paths for career readiness preparation. They can even be tuned to recommend specific resources or exercises tailored to target areas for improvement, thereby maximizing their effectiveness.
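As a sketch of how analysis can feed a personalized learning path, the snippet below maps skill scores (hypothetically produced by an LLM's analysis of a conversation) to practice resources. The skill names, thresholds, and resources are all made up for illustration.

```python
# Turning per-skill scores into a personalized learning path.
# Skill names, thresholds, and resources are hypothetical.

RESOURCES = {
    "clarity": "exercise: summarize your answer in two sentences",
    "structure": "exercise: practice the STAR format",
    "confidence": "exercise: record and review a mock interview",
}

def learning_path(scores, threshold=0.7):
    """Recommend a resource for every skill scored below the threshold."""
    return [RESOURCES[skill]
            for skill, score in sorted(scores.items())
            if score < threshold]

path = learning_path({"clarity": 0.9, "structure": 0.5, "confidence": 0.6})
print(path)  # recommendations for the two weaker skills
```

In practice the scores would be produced per exchange, so the path can be recomputed in real time as the candidate's responses improve.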

How can Generative AI help my organization with career readiness?

In conclusion, the properties of large language models make them exceptionally well-suited for providing personalized career readiness preparation. Their advanced natural language understanding, adaptive learning mechanisms, data-driven insights and real-time computation capabilities offer invaluable support to individuals navigating the complexities of pursuing their career goals. As LLMs continue to evolve, they hold the potential to revolutionize career development, empowering individuals to achieve their professional goals with confidence and competence. The reliability and consistency of the output from these LLMs is, however, heavily dependent on the quality of your input data and the precise definition of context that you can provide. At Relativ, we help organizations gather input data and create guidelines with sufficient fidelity for their A.I. models to infer "what good looks like". We help them experiment with their own data and understand how an LLM works with different contextual information, so they can expand these capabilities and begin to measure various skills that individuals exhibit during their conversational exchanges. When tailored to specific job descriptions, these customized A.I. models can give end users a competitive advantage by not only identifying the skills they require, but also helping them improve their performance on those skills through personalized feedback. Head over to relativ.ai or reach out to us to learn how we can help you deploy your own AI models, infused with psychology and linguistics, to empower your organization and end users with the career readiness skills they require to meet the challenges of the future of work.

"Why" AI is poised to drive business outcomes.

4/2/2024


 
With the explosion of Generative Artificial Intelligence (GenAI) and the widespread adoption of Large Language Models (LLMs), there are abundant opportunities for organizations and individuals to augment their existing functions with AI. As with the introduction of any new technology, I have seen opinions ranging from fascination, to resistance to change, to attempts to disprove that the technology is actually beneficial by trying every possible way to make it fail. Worse, we constantly highlight one-sided dangers of the technology, with prejudice and bias, without taking the time to understand why or how certain technologies can be beneficial, what their strengths are, and what their limitations may be. Public perception is often the biggest threat to innovation.

Like many people, I have spent a lot of time experimenting with AI, but I hadn't found a compelling explanation of "why" AI can help drive business outcomes. One of the key aspects I've been focused on is the use of my own curated and contextual data, which has positively disrupted the outputs from these fantastic AI models. As I continue my learning journey through my career and through life, I've decided to create a short series of blogs about my findings and my experiments in the space of GenAI and LLMs. While it will primarily serve as a reminder of my own career pathway, I intend to make this information relevant and helpful to anyone taking the time to read it. Throughout my blog posts, I will include relevant resources where a reader can find further information, should they wish to dive deeper into a certain topic. I hope you enjoy reading this series as much as I enjoyed writing it.
In my first blog post about this exciting area of research and application, I will aim to explain the inner workings of Large Language Models, and some of their characteristics, in a way that allows us to use a data-driven approach to decision making. The hope is that this information will allow the adoption of these AI models in our workplaces (and our personal lives) to augment our efforts and help us be more productive and efficient in the long run.

What are Large Language Models?

Almost everyone who has heard of artificial intelligence has probably also heard the term "Neural Network". A neural network is a specific architecture that allows computers to learn the relationship between input samples and output samples by mimicking how neurons in the brain signal each other. The first neural network, called a Perceptron, developed in 1957, had one layer of neurons between the inputs and the outputs, with weights and thresholds that could be adjusted. A fantastic introduction to this topic can be found here. Large Language Models (LLMs) are a type of neural network. In contrast to the Perceptron from 1957, some of these LLMs are vastly more complex, with nearly a hundred layers and 175 billion or more parameters.
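For the curious, the 1957-style perceptron described above fits in a few lines: a single neuron with adjustable weights and a threshold (bias), trained here to learn the logical AND function.

```python
# A single-neuron perceptron: weighted sum plus threshold, with the classic
# error-driven update rule. Trained here on the logical AND function.

def predict(weights, bias, x):
    """Fire (1) if the weighted sum of inputs crosses the threshold."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train(samples, epochs=10, lr=0.1):
    """Nudge weights toward each target whenever the prediction is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(AND)
print([predict(weights, bias, x) for x, _ in AND])  # learns [0, 0, 0, 1]
```

Everything an LLM does is, at heart, this same mechanism repeated at enormous scale: many layers of weighted sums with adjustable parameters.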

What do Large Language Models do?

One of the first practical applications of neural networks was to recognize binary patterns. Given a series of streaming bits (1s and 0s) as input, the network was designed to predict the next bit in the sequence (output). Similarly, in very simple terms, Large Language Models can predict the next word given a sequence of words as input. If that is all they can do, why are they so powerful, and why do they appear incredibly intelligent? The scope of this goes well beyond a blog post, but here is an incredible resource that will help you understand the inner workings of large language models.
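To see how simple the underlying task is, here is a bare-bones "predict the next word" model built from bigram counts over a toy corpus. Real LLMs learn vastly richer statistics with deep neural networks, but the prediction objective itself is exactly this.

```python
# Next-word prediction in miniature: count which word follows which in a
# toy corpus, then predict the most frequent follower.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

followers = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    followers[word][nxt] += 1  # tally each observed bigram

def predict_next(word):
    """Return the word that most often follows `word` in the corpus."""
    return followers[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```

Where this toy model only looks one word back, an LLM conditions its prediction on thousands of preceding words, which is where the apparent intelligence comes from.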

Characteristics of Large Language Models

In the final subsection of this blog, I will attempt to highlight some properties or characteristics of Large Language Models. I will refer to these characteristics in various future blog posts, so the utility of these properties and their application areas can be better understood. It is these characteristics and properties that contribute to the ability of LLMs to appear to understand and generate human-like language.

  • Word Vectors: Computers represent all information as bits. LLMs, specifically, represent words as numerical vectors. If you read through the link in the previous subsection on what LLMs do, you will already be aware that by storing words as vectors, LLMs can capture semantic relationships (or meaning) between words. More importantly, LLMs operate in such a high-dimensional space that they can compute the meaning of entire paragraphs and sections of text comprising thousands of words. This is what allows them to understand human requests and generate responses.
  • Transformers: The transformer architecture is one of the key components that enables LLMs to actively exchange information with humans. It allows parallel processing of information while also propagating information between sequential layers, allowing the model to learn complex relationships and patterns in language (i.e. the relationships between words).
  • Training Data: The top LLMs in use today have benefited from vast amounts of training data. It must be noted that these LLMs are trained on unlabeled text using self-supervised or semi-supervised learning methods. In the first stage, the model is trained on all the textual information that was available on the internet (sourced using crawlers or similar). With this data, the neural network learns to predict the next word in the sequence. In the next stage, the same model is given labeled data of a much higher quality, but on a much smaller scale. This process is referred to as fine-tuning and is used to get contextually relevant content out of LLMs, in a format that matters to the end user. Andrej Karpathy has an excellent podcast on how LLMs are trained, and offers valuable insights into how prediction and compression (e.g. zip files) can be shown to have a close mathematical relationship. It follows that the larger the amount of data, and the larger the model, the better its ability to predict the next word in the sequence. Using this base model to perform fine-tuning generally yields better results (subject, of course, to the dataset and labels used in the fine-tuning phase).
  • Fine-Tuning and Prompt Engineering: As I previously mentioned, fine-tuning is essential for LLMs to perform specific tasks in a consistent and effective manner. Fine-tuning helps optimize the model's performance for a particular task, such as analyzing sentiment or generating summaries in a specific format. When used in combination with other techniques such as prompt engineering, the ability of a fine-tuned model to adapt to a new task and context can be quite magical and have huge utilitarian value. A simple online search will yield several great resources on prompt engineering, which is a highly sought-after skill and an emerging field. I strongly recommend that anyone interested in using GenAI for personal or business use experiment with these techniques to solve relevant challenges that you encounter on a daily basis.
  • The Black Box Problem and Emergent Behavior: So far, we’ve focused on the things that we truly understand about LLMs. But it is also important to recognize that the sheer size and complexity of LLMs pose a significant challenge to our understanding of their true inner workings. The “black box” problem draws attention to the fact that the logic behind the decision-making process of these large neural networks is not easily traceable, and hence, not well-understood. This article in Nature highlights the black box problem and offers a good balance in perspectives on why this problem matters, and how we can continue to work around it while receiving the benefits that AI offers. On a related note, LLMs exhibit something known as “Emergent Behavior” i.e. as the size of the models gets bigger, they exhibit capabilities that they were not trained to perform. It is this characteristic that makes experimenting with LLMs to solve business challenges a particularly exciting area of interest.
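To ground the word-vector idea from the list above, here is a miniature example with hand-assigned 3-dimensional vectors (real models learn hundreds or thousands of dimensions from data), where geometric closeness stands in for similarity of meaning.

```python
# Word vectors in miniature: hand-assigned toy embeddings where related words
# sit close together, measured by cosine similarity.
import math

VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

royal = cosine(VECTORS["king"], VECTORS["queen"])
fruit = cosine(VECTORS["king"], VECTORS["apple"])
print(royal > fruit)  # "king" is closer to "queen" than to "apple"
```

In a real LLM these geometric relationships are learned from the training data rather than assigned by hand, which is what lets the model capture meaning it was never explicitly taught.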

What can Large Language Models help with?

At Relativ, we have been experimenting with Deep Learning, Psychology, Linguistics, and Large Language Models, qualitatively assessing the generated text across thousands of interactions and continuously instruction-tuning these models to quantify their output. We have learned that when LLMs are combined with proprietary algorithms and curated data, their output can be transformative and insightful. When designed thoughtfully around the end-user experience, as well as the intended business or academic outcome, we are starting to see some very promising results. We are already beta-testing these models in recruiting, sales, learning and development, retrospectives, and career readiness, where early adopters are reaping the rewards of experimenting early and gaining a competitive advantage from the learning that occurs.
In the next series of blogs, I will attempt to describe how Relativ's models are being used in each of the above fields, and why they can be a game changer in the long run. I will refer to the characteristics of LLMs highlighted in this blog entry, for continuity and chain-of-thought (pun intended), throughout this series. In the meantime, head over to relativ.ai or reach out to us to learn how we can help you deploy your own AI models, infused with psychology and linguistics, to help you drive business outcomes.

    About

    Arjun is an entrepreneur, technologist, and researcher, working at the intersection of machine learning, robotics, human psychology, and learning sciences.  His passion lies in combining technological advancements in remote-operation, virtual reality, and control system theory to create high-impact products and applications.


© 2024 Arjun Nagendran. Deep Learning | Generative AI | Robotics to solve tomorrow's hardest challenges. All rights reserved.