Showing posts from May, 2024

On the Necessity and Challenges of Safety Guardrails for Deep Learning Models

In recent years, deep learning models, especially transformer and diffusion models, have become the powerhouses of the AI world. They have demonstrated superhuman or near-superhuman performance in tasks such as natural-language processing and image generation. Yet that same expressive capacity also brings trouble: it makes them more dangerous and more entangled with our reality. In this article we address safety guardrails for deep learning models and the unique explainability challenges they pose.

My favorite Human Analogy is Understanding Minds and Brains

You and I often arrive at decisions (you deciding how to behave, and I predicting how you might behave) when we each have very little insight into the other’s inner control loops. And we’re often highly successful in gaining trust and safety that way, thanks to societal conventions, laws, and governance: the ‘rules of the game’ that create robust norms for behaviour. The same dynamic applies to machine learni

Does Fine-Tuning Cause More Hallucinations?

In the past few years, the capability gains of large language models (LLMs) have come from pre-training on vast text corpora, which encodes factual knowledge in the model’s parameters, followed by supervised fine-tuning that deliberately shapes the model towards particular behaviors. Fine-tuning often relies on a ‘soft gold standard’: the model is trained on outputs from human annotators or from other language models that did not themselves have access to the same knowledge and can ‘hallucinate’ new facts. This raises the question: how does an LLM integrate new (extrapolated) facts beyond the knowledge it has ‘seen’ during pre-training, and what impact does this have on hallucinations? The study “Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?” explores the implications of fine-tuning LLMs on new factual knowledge. Researchers employed a novel method, Sampling-based Categorization of Knowledge (S

Revisiting the Five Dysfunctions of a Team!

About 20 years ago, I was considering a move from an individual contributor role to a management position at the company where I worked at the time. I picked up Patrick Lencioni's book and, as time passed, put what I learned from it into practice.

Introduction

After a brief piece of prose that defines organizational health and explains why it is so often neglected despite being ‘an imperative for any business that wants to succeed’, Lencioni tells his readers that the book they are about to read is in fact a fable: it is fiction. Still, it deals with the challenges that real teams face.

Underachievement

The story follows Kathryn Petersen, the new CEO of the American technology company DecisionTech, who inherits a team of talented but dysfunctional people. Lencioni introduces his ensemble cast of characters and sets up the team dynamic, including a few early warning signs of dysfunction.

Lighting the Fire

Kathryn decides