
Major Breakthroughs in AI Memory: To Infinity and Beyond

Imagine never needing to remind ChatGPT what you said last week—those days are almost behind us. A major challenge for AI has been retaining context over extended interactions or vast datasets. Current Large Language Models (LLMs) struggle with memory and attention, often forgetting past exchanges or losing track over long sequences. However, emerging capabilities are set to change this, allowing AI to remember and use vast information effortlessly, making interactions smoother and more intelligent.

Microsoft's Advancements in Infinite Memory

Microsoft, in collaboration with OpenAI, is developing AI models with near-infinite memory, expected to be released in 2025. This innovation will allow AI systems to retain and recall all previous interactions across sessions, eliminating the need for users to remind the AI of past preferences or decisions. This will create richer, more consistent dialogues.

Key features include continuous dialogue maintenance, allowing AI to build upon all past conversations, providing a highly personalised experience. With enhanced contextual understanding, the AI will be able to draw on historical data for more accurate responses and improved decision-making.

Microsoft's AI Chief, Mustafa Suleyman, has confirmed that prototypes with these capabilities already exist. While the familiar challenges of computational cost and infrastructure remain, the technology is on track for a public release.

Google's Advancements in Infinite Attention

Google is taking a different approach, developing Infini-Attention, a technique allowing LLMs to process virtually unlimited text in a single prompt. This overcomes traditional context limitations using a compressive memory system that optimises data storage by retaining and compressing past segments.

Following in the footsteps of ChatGPT, Google has now introduced a memory feature for its Gemini AI, which remembers information shared by users, enhancing personalisation. The feature is currently being tested, with a broader release expected soon.

Infini-Attention works by transferring older segments into compressed memory once the local context window reaches capacity, efficiently freeing space while retaining important context. It then combines long-term retrieval from that memory with local attention, allowing the model to track relationships across the full input while still managing the immediate context.
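A minimal numpy sketch of this compress-and-retrieve idea, assuming the ELU+1 feature map commonly used in linear attention (the actual Infini-Attention internals may differ in detail):

```python
import numpy as np

def elu_plus_one(x):
    # Non-negative feature map used in linear attention (ELU + 1).
    return np.where(x > 0, x + 1.0, np.exp(x))

def update_memory(M, z, K, V):
    """Fold an evicted segment's keys/values into the compressive memory.

    M : (d_key, d_value) running associative memory matrix
    z : (d_key,)         running normaliser
    K : (seg_len, d_key) keys of the segment leaving the local window
    V : (seg_len, d_value) values of that segment
    """
    sK = elu_plus_one(K)
    M = M + sK.T @ V          # accumulate key-value associations
    z = z + sK.sum(axis=0)    # accumulate the normaliser
    return M, z

def retrieve(M, z, Q):
    # Read old context back out of the compressed memory for new queries.
    sQ = elu_plus_one(Q)                            # (n, d_key)
    return (sQ @ M) / ((sQ @ z)[:, None] + 1e-6)    # (n, d_value)
```

The memory stays a fixed-size matrix no matter how many segments are folded in, which is what lets the input grow without the storage growing with it.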

These advancements enable Google's models to process vast text inputs, useful for tasks like book summarisation and long-context modelling. They also enhance LLMs' ability to perform complex reasoning and planning, essential for nuanced information handling.

What’s inside?

The core components of these AI capabilities include compressive memory systems, long-term linear attention, and local masked attention. Compressive memory systems store and compress key-value states, reducing storage needs while retaining access to important historical data. Long-term linear attention allows AI to connect information across extended inputs, maintaining coherent responses during long conversations. Local masked attention manages immediate context, ensuring the model stays relevant in real-time while also working on broader analyses.
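To make the "local masked attention" component concrete, here is a hedged numpy sketch of windowed causal attention, where each position sees only itself and a few recent neighbours (window size and shapes are illustrative, not Google's actual configuration):

```python
import numpy as np

def local_causal_attention(Q, K, V, window=4):
    """Each position attends only to itself and the previous window-1
    positions, so per-token cost stays fixed as the sequence grows."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    # Allow position i to see positions j with i - window < j <= i.
    idx = np.arange(n)
    mask = (idx[None, :] <= idx[:, None]) & (idx[None, :] > idx[:, None] - window)
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over each row's visible positions.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V
```

In the full scheme, the output of this local pass would be gated together with a retrieval from the compressive memory, giving the model both recency and history.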

Google combines these attention methods within Transformer blocks, using state-sharing and compression to manage memory efficiently. This enables both historical insight and real-time processing.

Comparing Microsoft's and Google's Approaches

Microsoft and Google have distinct approaches to AI memory. Microsoft focuses on near-infinite memory, retaining user interactions to enhance personalised engagement and continuous dialogue. Their prototypes are geared towards making AI interactions more natural and adaptive over time.

Google, by contrast, focuses on efficiently processing unlimited input using compressive memory, an approach well suited to handling large volumes of text and overcoming context-window limitations. While Microsoft emphasises long-term user engagement, Google aims for efficiency across extensive data sequences.

Together, these complementary approaches are unlocking new capabilities and marking a significant leap in AI development. By tackling different aspects of AI memory, companies like Microsoft and Google are driving us towards a future where AI systems are not only highly personal and adaptive but also technically sophisticated, pushing the boundaries of what AI can achieve. Memory is just the start…

How Infinite Memory and Attention Will Improve LLMs

Infinite memory and attention will significantly enhance LLMs. By using dual attention mechanisms, models can understand user needs comprehensively, following complex arguments and connecting ideas across extensive inputs. AI will be capable of generating responses that consider the entire scope of conversations.

These advanced memory systems will also improve the AI's ability to learn continuously from interactions, adapting without forgetting prior details. This capability for continuous learning will allow AI to handle intricate reasoning and plan responses effectively, especially when long-term context is required.

Persistent memory will also improve user engagement, making experiences more personalised and less repetitive. AI assistants will evolve into digital companions, understanding preferences similarly to human assistants.

Impact on Project Delivery

Infinite memory and attention capabilities could enhance project management. AI will be able to maintain comprehensive records of project decisions, discussions, and documentation, ensuring continuity across phases. This historical context will lead to more informed decisions and better planning.

AI will recall relevant past information, link related content, and identify potential issues by analysing long-term data, which is especially useful for complex, multi-phase projects. Moreover, AI will suggest resource allocation based on team expertise and track progress to improve efficiency. As these capabilities scale, we'd expect LLMs to support not just whole projects but programmes and, potentially, entire portfolios.

For client relations, AI with infinite memory will ensure consistent communication and accurately recall requirements, strengthening client relationships and providing reliable support throughout a project's lifecycle.

To sum up

The advancements in near-infinite memory and attention by Microsoft, OpenAI, and Google represent a major milestone in AI development. By overcoming context retention and data processing limitations, these technologies are set to revolutionise daily life and industry applications. As AI evolves to maintain long-term interactions and process vast data, we can expect more intuitive, personalised, and effective solutions to assist in complex and meaningful ways.

The rabbit hole