Proudly sponsored by ConstructAI, brought to you by Weston Analytics.

Hello Project AI enthusiasts,

As we close out the year and head into the festive break, I want to take a moment to thank you for staying with us through what has been a pivotal year for AI and delivery. 2025 has been less about dramatic breakthroughs and more about hard lessons — where AI works, where it doesn’t, and where organisations are still struggling to adapt. We’re grateful to have you as part of the Project Flux community, and we’re looking forward to building on these conversations in the year ahead.

This week’s lead story is our End of Year Round-Up, which steps back from daily headlines to look at the deeper shifts that actually mattered in 2025. Rather than focusing on tools or releases, it reflects on how trust, judgement, and organisational readiness evolved — and why many teams still find themselves constrained not by technology, but by how decisions are made around it.

We also examine OpenAI’s move to let users directly adjust ChatGPT’s confidence. At first glance it looks like a minor interface tweak, but it signals something more important: a recognition that tone, certainty, and perceived authority are now central risks in how AI is used inside real delivery environments.

Another piece looks at a persistent blind spot — large language models and numbers. As AI outputs become more embedded in planning, estimation, and reporting, the gap between fluent language and fragile numerical reasoning becomes a delivery risk that teams can no longer ignore.

We then turn to the Gemini incident. Often framed as a model failure, it is better read as a warning about oversight: as AI systems act with greater speed and autonomy, the cost of delayed human judgement rises, and few organisations have redesigned the escalation and review practices that machine-mediated decisions demand.

Finally, we review this year’s State of AI reporting, which reinforces a familiar conclusion. The biggest failures ahead are unlikely to be technical. They will stem from governance gaps, unclear accountability, and organisations that mistake access to AI for readiness to use it well.

In This Edition

Flux check-in

What 2025 Revealed About the State of Project Delivery

When the noise of product launches and bold predictions fades, the real story of 2025 comes into sharper focus. This year-end round-up steps back from daily headlines to examine how AI actually shaped project delivery over the past year. The picture that emerges is uneven: technical capability advanced quickly, but organisational readiness lagged behind. Across sectors, the biggest constraints were not tools or models, but trust, judgement, and governance — particularly around when to rely on AI, when to challenge it, and who remains accountable for decisions influenced by machine output. In many cases, AI did not create new problems so much as expose existing weaknesses in data discipline, ownership, and decision-making structures built for a slower pace of change. Read the full breakdown

What Does This Mean for Me?

If 2025 showed anything clearly, it’s that AI advantage is no longer created by early adoption alone. Organisations with access to the same models and tools delivered very different outcomes depending on how decisions were structured, who retained authority, and how governance adapted to machine-supported work. For delivery leaders, this reframes the challenge. The question is no longer whether to use AI, but how responsibility, judgement, and accountability are preserved as automation increases. Teams that failed this year did not lack technology; they lacked clarity on when humans should intervene, challenge, or override AI-driven outputs.

Key Themes

  • Capability ≠ readiness: Access to AI tools did not translate into consistent delivery performance

  • Judgement gaps exposed: AI surfaced weak decision ownership rather than replacing it

  • Governance lag: Delivery models struggled to keep pace with automated decision cycles

  • Accountability blur: Responsibility became unclear when recommendations turned into actions

Down the Rabbit Hole

Confidence Is Not Capability: What AI Enthusiasm Is Doing to Project Delivery

OpenAI’s decision to let users directly adjust ChatGPT’s confidence may look like a cosmetic interface change, but it reflects a deeper acknowledgement of risk. As AI systems are used more widely in professional settings, how confidently an answer is delivered increasingly shapes how it is trusted, challenged, or accepted. This update implicitly recognises that fluent, assertive outputs can mislead as easily as they can help — particularly when AI is embedded in review, planning, or decision-support workflows. Rather than solving the problem, confidence controls expose it: the real issue is not model accuracy alone, but how humans interpret and rely on machine-generated judgement. Read the full breakdown

What Does This Mean for Me?

If AI is influencing decisions in your organisation, tone is now a delivery risk. Confident AI outputs can bypass scrutiny, compress debate, and create false consensus — especially under time pressure. Giving users control over confidence does not remove that risk; it pushes responsibility back to teams and leaders to decide when certainty is appropriate and when caution is required. For delivery professionals, this reinforces a critical shift: governance must now account for how AI speaks, not just what it produces. Trust calibration becomes an operational discipline, not a UX preference.
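
To make that concrete, here is a minimal, hypothetical sketch of what trust calibration as an operational discipline might look like: a simple written policy that decides whether an AI recommendation can be acted on directly or must be routed to a named human reviewer. The stakes categories, confidence threshold, and function names are illustrative assumptions, not a prescribed implementation, and they presume the model has been asked to state a confidence level alongside its answer.

```python
from dataclasses import dataclass

# Hypothetical policy gate (illustrative only): decides whether an AI
# recommendation may be accepted as-is or must go to a named human reviewer.
# Stakes categories and the confidence threshold are assumptions for this sketch.

@dataclass
class AIRecommendation:
    text: str                        # the model's output
    self_reported_confidence: float  # 0.0 to 1.0, as stated by the model when asked
    decision_stakes: str             # "low", "medium", or "high"

def route_recommendation(rec: AIRecommendation, reviewer: str) -> str:
    """Apply a written trust-calibration policy and return the routing decision."""
    # High-stakes decisions always get human review, however confident the output sounds.
    if rec.decision_stakes == "high":
        return f"human_review -> {reviewer}"
    # Medium stakes: hedged outputs are escalated, confident ones pass.
    if rec.decision_stakes == "medium" and rec.self_reported_confidence < 0.8:
        return f"human_review -> {reviewer}"
    # Low stakes, or medium stakes with sufficient confidence: accept, but keep a record.
    return "accept (logged for periodic audit)"

if __name__ == "__main__":
    rec = AIRecommendation(
        text="Recommend approving the revised baseline schedule.",
        self_reported_confidence=0.92,
        decision_stakes="high",
    )
    print(route_recommendation(rec, reviewer="delivery_lead"))
```

The point is not these particular thresholds; it is that the rules are written down, owned by someone, and auditable, rather than left to whoever happens to be reading the output under time pressure.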

Key Themes

  • Confidence as risk: Assertive outputs shape decisions regardless of correctness

  • Trust calibration: Users must actively manage how much certainty AI projects

  • Human judgement central: Controls expose responsibility rather than removing it

  • Governance gap: Few organisations define when AI should sound confident

Down the Rabbit Hole

Unlock the Future of Digital Construction

The DTSA micro-credential gives young people and career changers barrier-free access to digital twin education – a first for the UK construction industry. Built on 32 months of work at the University of Cambridge’s CDBB, it opens doors to cutting-edge skills in safer, smarter, and more sustainable project delivery.

With portfolio-based assessment (offered as part of an Apprenticeship) and real industry insight, the course creates a clear pathway into digital construction for site teams, aspiring architects, engineers, surveyors, and project owners/funders. In partnership with the Digital Twin Hub and OCN London, the DTSA is shaping the next generation of talent and helping position the UK as a global leader in digital construction and innovation.

Sign up by emailing [email protected]

When AI Sounds Certain but the Numbers Are Not

Large language models are increasingly used in contexts that involve numbers, such as estimates, forecasts, financial narratives, and performance reporting. Yet this blog highlights a fundamental limitation that is often overlooked. LLMs do not reason about numbers in the way humans assume they do. They generate numerically plausible outputs based on patterns in language, not an underlying understanding of quantity or arithmetic. As AI-generated text becomes more fluent and confident, this gap becomes harder to detect and more dangerous in delivery environments where numerical accuracy underpins decisions. The risk is not occasional error, but systematic misinterpretation masked by linguistic confidence. Read the full breakdown

What Does This Mean for Me?

If AI is being used anywhere near planning, costing, risk modelling, or progress reporting, numerical fragility becomes a governance issue, not a technical footnote. Confident prose can give false reassurance, allowing errors to propagate into decisions before they are challenged. For delivery leaders, this means AI outputs involving numbers must always be treated as drafts, not conclusions. Verification workflows, secondary checks, and clear escalation points are essential. The more polished the output sounds, the more disciplined teams need to be about interrogating the numbers underneath it.
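
As a rough illustration of what a lightweight verification step could look like, the sketch below pulls the figures out of an AI-drafted progress summary and checks them against the source data before the text goes anywhere near a report. The summary text, field names, regular expression, and tolerance are assumptions made for the example, not a standard.

```python
import re

# Hypothetical check (illustrative only): compare figures quoted in an AI-drafted
# summary against the source data they are supposed to describe.

source_data = {"budget_total_gbp": 1_250_000, "spend_to_date_gbp": 973_500}

ai_summary = (
    "Spend to date stands at £995,000 against a total budget of £1,250,000, "
    "leaving comfortable headroom for the remaining work packages."
)

def extract_figures(text: str) -> list[int]:
    """Pull £-prefixed figures out of the text as integers."""
    return [int(match.replace(",", "")) for match in re.findall(r"£([\d,]+)", text)]

def verify_summary(text: str, data: dict, tolerance: float = 0.005) -> list[str]:
    """Flag any quoted figure that does not match a source value within tolerance."""
    issues = []
    for figure in extract_figures(text):
        if not any(abs(figure - actual) <= tolerance * actual for actual in data.values()):
            issues.append(f"£{figure:,} does not match any source figure within {tolerance:.1%}")
    return issues

if __name__ == "__main__":
    for issue in verify_summary(ai_summary, source_data):
        print("CHECK:", issue)
    # Flags the £995,000 figure: the fluent sentence quietly inflates the recorded
    # spend of £973,500, exactly the kind of drift a reader skims past.
```

Even a check this crude forces the question the polished prose never raises: where did that number come from, and does it still match the system of record?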

Key Themes

  • Fluency hides fragility: Numerical errors are masked by confident language

  • Pattern, not reasoning: LLMs predict text, they don’t compute meaningfully

  • Decision contamination: Small numerical errors scale into major delivery risks

  • Verification discipline: Human review remains non-negotiable

Down the Rabbit Hole

When Machines Go Wrong: What the Gemini Incident Reveals About AI Risk in Project Delivery

The Gemini incident is often framed as a model failure, but this blog argues that interpretation misses the real lesson. The problem was not that the system produced an incorrect or troubling output, but that existing oversight mechanisms were ill-equipped to detect, challenge, or intervene before reputational damage occurred. As AI systems become more capable and more autonomous, the margin for error narrows — and the cost of delayed human judgement rises. The incident exposes a structural weakness: many organisations are deploying AI faster than they are redesigning accountability, escalation paths, and review practices suited to machine-mediated decisions. Read the full breakdown

What Does This Mean for Me?

If AI systems are already embedded in your workflows, the Gemini incident is a warning about complacency. Oversight cannot be passive, periodic, or assumed to sit “somewhere else” in the organisation. Human review must be deliberately designed into systems that operate at speed and scale. For delivery leaders, this shifts the focus from model performance to organisational preparedness. The real risk is not that AI will occasionally go wrong — it’s that teams may not notice quickly enough, or know who is empowered to intervene when it does.

Key Themes

  • Oversight lag: Governance frameworks are being updated incrementally while AI systems are deployed rapidly, creating gaps between capability and control.

  • Accountability gaps: As AI outputs move closer to action, it becomes unclear who owns decisions when something goes wrong or escalates unexpectedly.

  • Automation complacency: Familiarity with AI tools increases trust over time, often reducing the level of scrutiny applied to their outputs.

  • Design failure: Many systems assume humans will intervene when needed, but fail to explicitly design how, when, and by whom that intervention should occur.

Down the Rabbit Hole

The Next Failure Mode Will Be Organisational, Not Technical

This year’s State of AI reporting underscores a persistent truth: the largest failures in AI won’t be about models failing at mathematical reasoning or hallucinating facts. They will be about how organisations deploy, govern, and operationalise these systems. Technical capability has grown rapidly — but organisational readiness has not kept pace. As teams integrate AI into core processes, the limitations are not primarily in compute or architecture, but in accountability frameworks, deployment discipline, and risk management maturity. The report makes clear that organisations with poorly defined decision rights, weak monitoring, and inadequate governance structures face the greatest downside risk, not because the AI itself is flawed, but because human systems are. Read the full breakdown

What Does This Mean for Me?

For delivery leaders, this shifts the burden from picking the right model to building the right organisation. Your competitive advantage will increasingly be driven by how well your teams can operationalise AI responsibly, not how quickly you adopt it. Delivery risk now looks more like governance risk. Without clear oversight, incident response, and accountability mechanisms, even technically sound AI initiatives can falter. Leaders must reorient efforts toward building robust governance frameworks that match the scale and pace of adoption — including formalised risk assessment, continuous monitoring, and cross-functional ownership — if they want to avoid expensive mishaps, reputational damage, and project setbacks.

Key Themes

  • Organisational readiness gap: Technical capability has advanced faster than governance maturity, leaving many organisations structurally unprepared to deploy AI at scale.

  • Risk shifts upstream: Failures increasingly originate in deployment choices, operating models, and controls rather than in the underlying model performance itself.

  • Governance as infrastructure: Oversight can no longer be treated as policy or compliance overhead; it must be deliberately engineered into delivery workflows and decision processes.

  • Accountability architecture: Clear ownership and decision rights are essential to prevent small AI errors from cascading into systemic operational or reputational failures.

Down the Rabbit Hole

The pulse check

Governance & Security

AI governance is moving decisively from abstract principles to enforceable regulation. In the U.S., New York’s passage of the RAISE Act marks a clear escalation in state-level intervention, following California in introducing binding AI safety obligations. The move reflects growing impatience with voluntary frameworks and a broader pattern: where federal clarity lags, states are stepping in to define accountability, disclosure, and risk management expectations for AI systems used in real-world settings.

At the same time, warnings from the UK’s AI Safety Institute point to a parallel concern at the frontier. As models become more capable, the Institute has cautioned that barriers to misuse could fall if governance, evaluation, and access controls do not keep pace. Taken together, these signals point to a tightening environment. Regulation is fragmenting, risk thresholds are becoming more explicit, and for organisations deploying AI, governance is no longer a box to tick later — it is fast becoming a prerequisite for scale, trust, and continued access to advanced capabilities.

Robotics

iRobot files for Chapter 11 as regulatory pressure reshapes consumer robotics – iRobot filed for Chapter 11 bankruptcy on December 15, 2025, marking a dramatic fall for one of the most recognisable names in consumer robotics. The collapse follows the failure of its proposed $1.7 billion acquisition by Amazon, abandoned in early 2024 after an 18-month antitrust review in the U.S. and Europe, which left iRobot exposed amid rising competition from Chinese manufacturers, escalating tariff costs, and shrinking margins. Founder Colin Angle has been openly critical of what he described as overzealous antitrust enforcement, arguing that regulatory delays ultimately doomed the company’s ability to compete at scale. Read the story

Tesla begins fully driverless robotaxi testing in Austin – Tesla has started testing fully driverless robotaxis in Austin, removing safety monitors from vehicles for the first time. Elon Musk confirmed the tests in December, triggering a sharp rise in Tesla’s share price. While a limited robotaxi service has been operating since June, the move raises fresh questions about readiness and safety. Publicly available crash data suggests Tesla’s vehicles have been involved in significantly more incidents per mile than human drivers, and independent analysis indicates the scale of the programme may be smaller than public statements suggest. The tests underscore the widening gap between ambition, public perception, and operational reality in autonomous mobility. Explore more

Waymo robotaxis stall during San Francisco power outage – Waymo temporarily suspended its robotaxi operations during a major power outage in San Francisco over the December 21–22 weekend, after vehicles became stranded on city streets. A fire at a PG&E substation caused widespread infrastructure disruption, affecting traffic lights, cellular connectivity, and live traffic data. The incident exposed how dependent autonomous systems remain on external infrastructure. Waymo has since announced a software update designed to help vehicles navigate disabled intersections more decisively during future outages, highlighting the ongoing challenge of resilience in real-world deployments. Read Further

Unitree launches the world’s first humanoid robot app store – Unitree has launched what it describes as the world’s first app store for humanoid robots, opening public beta access in December. The platform allows developers and users to control the G1 humanoid robot via a smartphone, downloading motion routines and task models ranging from basic movements to object manipulation. Voice-controlled applications are expected in 2026. The launch signals a shift toward a software-driven ecosystem for robotics, echoing the early evolution of smartphone platforms and hinting at how third-party development could accelerate humanoid adoption. Explore further

Physical Intelligence demonstrates generalist robots with π0.6 model – Physical Intelligence has released π0.6, a new vision-language-action model designed to power generalist robots capable of performing a wide range of tasks without task-specific programming. In its “Robot Olympics” demonstrations, robots completed activities such as door entry, tool use, cleaning, and textile manipulation autonomously. The company reports significant performance gains when training models using human video alongside robot data, reinforcing the role of human demonstration in accelerating robotic learning. The work represents steady progress toward robots that can operate flexibly across diverse environments. See the demonstrations

Trending Tools and Model Updates

  • NotebookLM receives major upgrades, positioning itself as a serious research companion – Google has rolled out significant updates to NotebookLM, strengthening its role as an AI-assisted research and synthesis tool. New capabilities improve how users organise source material, extract insights, and reason across documents, signalling Google’s intent to push NotebookLM beyond experimentation into everyday knowledge work. Read the full update

  • Otter.ai crosses $100M ARR as AI meeting assistants mature – Otter.ai has crossed the $100 million annual recurring revenue milestone, underscoring how AI-powered meeting transcription and summarisation tools are becoming embedded in professional workflows. The milestone reflects growing enterprise demand for systems that turn conversations into structured, searchable outputs rather than raw recordings. Read more

  • Lovable expands integrations, pushing deeper into no-code AI workflows – Lovable has broadened the set of external tools and services its AI-built applications can connect to, deepening its push into no-code AI workflows and lowering the barrier between generated prototypes and working products. Explore integrations

  • Manus launches Design View to streamline AI-assisted interface building – Manus has introduced Design View, a new feature aimed at simplifying how users design and iterate on interfaces within AI-assisted workflows. By making layout and visual structure more accessible, the update reflects a broader push to lower friction between AI-generated logic and human-centred design. See what’s new

  • NVIDIA releases NitroGen, a vision-action foundation model trained from video – NVIDIA has open-sourced NitroGen, a 500-million-parameter vision-action model trained on 40,000 hours of gameplay video across more than 1,000 games. By learning directly from observation rather than reinforcement learning, NitroGen generalises to unseen environments, with implications extending beyond gaming to robotics and autonomous systems. Explore the announcement

Links We are Loving

  • ChatGPT launches year-end review like Spotify Wrapped
    OpenAI rolled out a personalised “Your Year with ChatGPT” recap that surfaces usage patterns, recurring themes, and playful highlights from users’ interactions. Beyond novelty, the feature reflects how consumer AI products are borrowing engagement mechanics from social platforms to deepen habit formation and brand affinity. It also hints at how personal data narratives may become a differentiator in AI interfaces.

  • Alphabet acquires Intersect Power for $4.75B to secure AI energy capacity
    Alphabet’s acquisition of clean-energy developer Intersect Power underscores how access to power and grid infrastructure is becoming a strategic constraint for AI scale. As training and inference demands surge, energy security is now as critical as compute availability. The deal signals that AI competition is increasingly being fought at the infrastructure layer, not just the model layer.

  • Yann LeCun confirms launch of AMI Labs world-model startup
    Former Meta chief AI scientist Yann LeCun has formally confirmed the launch of AMI Labs, a new venture focused on advancing world-model-based AI research. The move highlights renewed interest in foundational approaches to intelligence rather than short-term productisation. It also reflects a broader trend of top AI researchers stepping outside big tech to pursue long-horizon bets.

  • U.S. AI startups raise $100B+ across major funding rounds — U.S.-based AI companies raised more than $100 billion in 2025 through a wave of large funding rounds spanning vertical applications, infrastructure, and agentic systems. The scale of investment suggests investor confidence remains strong despite growing scrutiny around AI returns. Capital is increasingly flowing toward companies that promise defensible differentiation and clear paths to enterprise adoption.

  • McKinsey State of AI 2025 shows enterprise value shifting from pilots to scale — McKinsey’s latest global survey indicates that AI is beginning to generate measurable business value as organisations move beyond experimentation toward scaled deployment. The findings show a sharper focus on governance, operating model redesign, and a smaller number of high-impact use cases. The message is clear: value is emerging not from access to AI, but from disciplined execution around it.

  • China unveils $70 billion of financing tools to bolster investment
    China will deploy policy-based financial tools worth 500 billion yuan ($70.25 billion) to accelerate investment projects, the state planner said on Monday, as part of efforts to support the slowing economy.

Community

The Spotlight Podcast

When AI Stopped Being Magic: The Moment Reality Set In

This week’s reflection looks back at 2025 as the year AI finally lost its sense of magic — not because the technology stalled, but because organisations were forced to confront the harder question of what it actually takes to make AI work. The early thrill of capability gave way to a more sobering realisation: models alone do not create value. Integration, trust, governance, and human judgement do.

The episode traces how many teams entered the year believing AI adoption was primarily a tooling challenge, only to discover that the real friction lay elsewhere. Data quality issues resurfaced. Decision rights became blurred. Responsibility for AI-influenced outcomes was unclear. In many cases, AI didn’t fail — it simply exposed weaknesses that had always existed in how organisations made decisions and coordinated work.

What emerges is a broader pattern. As AI becomes normalised, advantage shifts away from novelty and toward discipline. The organisations that progressed were not those chasing the most advanced models, but those willing to do the unglamorous work of redesigning processes, clarifying accountability, and building confidence in when to rely on machines and when not to.

The implications extend well beyond AI teams. Project delivery professionals, leaders, and specialists across industries face the same inflection point. If AI is no longer magic, then success depends on how deliberately we integrate it into human systems. The future belongs less to those with access to AI — and more to those prepared to do the real work required to use it well.

Event of the Week

AAAI-26: The 40th Annual AAAI Conference on Artificial Intelligence

20–27 January 2026 | Singapore EXPO, Singapore

The AAAI Conference on Artificial Intelligence is one of the premier international gatherings for AI research and practice, bringing together academics, industry leaders, and practitioners focused on the forefront of intelligent systems and applications. The 2026 edition features peer-reviewed presentations, special tracks, workshops, tutorials, and poster sessions across foundational and applied AI domains. For project delivery professionals, attending provides a deep dive into emerging methodologies, cross-disciplinary insights, and networking opportunities with the global AI community. Learn more


One more thing

That’s it for today!

Before you go, we’d love to know what you thought of today's newsletter to help us improve The Project Flux experience for you.


See you soon,

James, Yoshi and Aaron—Project Flux 

