With the explosion of Generative Artificial Intelligence (GenAI), and the widespread adoption of Large Language Models (LLMs), there are widespread opportunities for organizations and individuals to augment their existing functions with AI. As with the introduction of any new technology, I have seen varied opinions from people ranging from being fascinated, to resisting change, and even attempting to disprove that the technology is actually beneficial, by trying every possible way to make the technology fail. Worse, we are constantly highlighting one-sided dangers of the technology, with prejudice and bias, without taking the time to understand why or how certain technologies can be beneficial, what its strengths are, and what its limitations may be. Public perception is often the biggest threat to innovation. Like many people, I have spent a lot of time experimenting with AI but haven't found a compelling explanation of "why" AI can help drive business outcomes. One of the key aspects that I've been focused on is the use of my own curated and contextual data, which has positively disrupted the outputs from these fantastic AI models. As I continue my learning journey through my career and through life, I’ve decided to create a short series of blogs about my findings and my experiments in the space of GenAI and LLMs. While it will primarily serve as a reminder of my own career pathway, I intend to make this information relevant and helpful to anyone taking the time to read it. Throughout my blog posts, I will include relevant resources to where a reader can find further information, should they wish to dive deeper into a certain topic. I hope you enjoy reading this series, as much as I enjoyed writing it. In my first blog post about this exciting area of research and application, I will aim to explain the inner workings of Large Language Models, and some of their characteristics, in a way that allows us to use a data-driven approach to decision making. The hope is that this information will allow the adoption of these AI models in our workplaces (and our personal lives), to augment our efforts, and help us be more productive, and efficient in the long run.
What are Large Language Models? Almost everyone that has heard of Artificial intelligence, has also probably heard of the term “Neural Network”. They are a specific architecture which allows computers to learn the relationship between input samples and output samples, by mimicking how neurons in the brain signal each other. The first Neural Network, called a Perceptron, developed in 1957, had one layer of neurons with weights and thresholds that could be adjusted in between the inputs and the outputs. A fantastic introduction to this topic can be found here. Large Language Models (LLMs) are a type of Neural Network. In contrast to the Perceptron from 1957, some of these LLMs are infinitely more complex, with nearly a hundred layers and 175 billion or more neurons. What do Large Language Models do? One of the first practical applications of neural networks was to recognize binary patterns. Given a series of streaming bits (1s and 0s) as input, the network was designed to predict the next bit in the sequence (output). Similarly, in very simple terms, Large Language Models can predict the next word given a sequence of words as input. If that is all that they can do, why are they so powerful and appear incredibly intelligent? The scope of this goes well beyond a blog post, but here is an incredible resource that will help you understand the inner workings of large language models. Characteristics of Large Language Models In the final sub section of this blog, I will attempt to highlight some properties or characteristics of Large Language Models. I will refer to these characteristics in various future blog posts, so the utility of these properties and their application areas can be better understood. It is these characteristics and properties that contribute to the ability of LLMs to appear to understand and generate human-like language.
What can Large Language Models help with? At Relativ, we have been experimenting with Deep Learning, Psychology, Linguistics, and Large Language Models, by qualitatively assessing the generated text across thousands of interactions, and continuously instruction-tuning these models to quantize their output. We have learned that when LLMs are combined with proprietary algorithms and curated data, their output can be transformative and insightful. When used in thoughtful conjunction around the end-user experience, as well as the intended business or academic outcome, we are starting to see some very promising results. We are already beta-testing these models in recruiting, sales, learning and development, retrospectives, and career readiness where early adopters are reaping the rewards of experimenting early, and gaining a competitive advantage from the learning that occurs. In the next series of blogs, I will attempt to describe how Relativ's models are being used in each of the above fields, and why they can be a game changer in the long run. I will refer to the characteristics of LLMs highlighted in this blog entry, for continuity and chain-of-thought (pun-intended), throughout this series. In the meantime, head over to relativ.ai or reach out to us to learn how we can help you deploy your own AI models, infused with psychology, and linguistics, to help you drive business outcomes.
0 Comments
Your comment will be posted after it is approved.
Leave a Reply. |
AboutArjun is an entrepreneur, technologist, and researcher, working at the intersection of machine learning, robotics, human psychology, and learning sciences. His passion lies in combining technological advancements in remote-operation, virtual reality, and control system theory to create high-impact products and applications. Archives
December 2024
Categories
All
|