by | 01 Apr 2024 | business, random

The AI Agentic Workflow Explosion

I am Awe-Inspired and Terrified (to quote a friend)

I have said that I had an AI moment a few weeks ago (here). It was like I got hit by a lightning bolt.

It was such a simple thing – I saw Matt Wolfe embed a very targeted GPT request into a workflow. It's a five-month-old video.

That was it. My world changed. I could see that using a GPT iteratively for inference-based micro tasks would change everything.

Since then I have watched what must be nearing a hundred hours of video, read voraciously, and experimented with deploying AI in Python and enabling local AI in large parts of our business platform. It has felt like I am living in science fiction.

The Current Dominant Consumer State of AI

I think many people have been using ChatGPT as a kind of advanced search (hallucinations and all). As they become more sophisticated, they might start using it as an advisor and coder, for example for software configuration. The combination of the two can then be used for diagnosis and correction.

This extended use can shortcut work or resolve unresolved errors. That is a big impact on productivity.

AI as a Workflow Step

With my new interest, my Google Assistant newsfeed showed me Sam Altman discussing how GPT-5 will include the ability to launch agents. I think this is where productivity really begins to explode, where multiple tasks in multiple apps on multiple servers can be chained together in a free-form fashion, as opposed to a deterministic workflow.

Altman says GPT-5, or something like a 4.5, will release in the US summer. So that is soon.

I think that will cement OpenAI not just as a chatbot and LLM but as the front end to an ecosystem of models, orchestration and agents. The ChatGPT web and mobile interfaces already appear to be moving towards this.

While interacting with ChatGPT and watching various videos about AI, it is clear that ChatGPT is an interface to more than the large language model. It is already calling agents for certain tasks. For example, when we ask ChatGPT to draw a graph, it spawns a Python shell and generates the graph using that shell – a combination of the LLM, which is good at inference, and a Python agent, which is good at deterministic output. Perhaps the future will combine these capabilities in a way that is imperceptible to us, where they leverage and check one another.
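The LLM-plus-Python-shell pattern above can be sketched in a few lines. This is a toy illustration, not OpenAI's actual mechanism: `fake_llm` is a hypothetical stand-in for a real model call, and the "shell" is a bare `eval` with no sandboxing. The point is the division of labour – inference chooses what to compute, deterministic code computes it exactly.

```python
def fake_llm(prompt: str) -> dict:
    """Stand-in for an LLM call: infers intent and emits a tool request."""
    if "average" in prompt.lower():
        return {"tool": "python", "code": "sum(values) / len(values)"}
    return {"tool": "none", "answer": "I can answer this directly."}

def python_tool(code: str, values: list[float]) -> float:
    """Deterministic execution step, analogous to ChatGPT's Python shell."""
    return eval(code, {"values": values})  # sandboxing omitted in this sketch

def answer(prompt: str, values: list[float]):
    plan = fake_llm(prompt)                       # inference: choose the tool
    if plan["tool"] == "python":
        return python_tool(plan["code"], values)  # determinism: run the code
    return plan["answer"]

print(answer("What is the average of these values?", [2.0, 4.0, 6.0]))  # 4.0
```

A real implementation would replace `fake_llm` with an API call and run the generated code in an isolated interpreter rather than `eval`.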

Agent Visionaries

I am a bit late to this. Andrew Ng has been talking about agentic workflows for a while. At Sequoia's recent AI open day (excellent – watch the whole day here), Andrew showed the impact of agentic workflows on the performance of underlying LLMs. It is staggering. It elevates lower-capability LLMs to higher-level performance – e.g. GPT-3.5 in an agentic workflow outperforming zero-shot GPT-4.

Source: Andrew Ng – Sequoia Capital AI Ascent Day
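The agentic pattern Ng describes – draft, critique, revise, rather than one zero-shot call – can be sketched as a loop. The `llm` function here is a hypothetical stand-in whose canned replies stand for a real model acting in different roles; the loop structure is the point.

```python
def llm(role: str, text: str) -> str:
    """Stand-in for one model prompted in different roles (illustrative only)."""
    if role == "draft":
        return "rough answer"
    if role == "critique":
        return "too vague" if "rough" in text else "OK"
    if role == "revise":
        return text.replace("rough", "refined")
    raise ValueError(role)

def agentic_answer(question: str, max_rounds: int = 3) -> str:
    draft = llm("draft", question)
    for _ in range(max_rounds):
        critique = llm("critique", draft)
        if critique == "OK":          # the critic is satisfied: stop iterating
            break
        draft = llm("revise", draft)  # otherwise revise and loop again
    return draft
```

With a real LLM behind each role, the extra inference calls are what buy the quality jump – which is why fast, cheap token generation matters so much for agentic workflows.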

I think what we’re going to see is the emergence of an AI agent economy and ecosystem.

In what seemed simultaneous with my epiphany, Devin launched. It shook the software development industry. Suddenly, macro-level tasks that were outside the realm of GPT prompts – due to their limited context window and lack of real-time content – were accessible. Devin can independently execute an entire software development project. Within days, OpenDevin launched – an open-source version. And then the open-source Devika.

Devin is already working on tasks listed on Upwork – solving them and earning the bounty.

Devin represents a system using an overall orchestration platform for software development together with the ability to spawn agents and iteratively interact with LLMs.

I think that concept could be extended into other domains like research.

David Ondrej is a big fan of Crew.ai, and Andrew Ng drew attention to the Python LangChain library (whose creator was another presenter at the Sequoia open day). Both allow the use of agents in Python. This is foundational to the explosion of further agentic applications like Devin.

Things accelerated. Yesterday Microsoft launched AutoDev – its agentic software development tool.

This will move exponentially faster now. I believe we will see agentic tools for provisioning IT servers and end-users in the next few weeks – it is an obvious candidate.

I think the above also will result in the atomization and extension of AI at multiple levels.

Agents as a Solution to Yann LeCun’s Hierarchical Planning Challenge

I wrote about Yann LeCun’s (Meta Chief AI scientist) criticism of the current AI models as a path to AGI here. He highlighted the inability of current LLMs to cross from the language domain and the complexity of hierarchical planning – his example: a trip from New York to London.

I think that the use of LLMs for discrete inferential tasks within broader workflows challenges that. Perhaps what we will see is a disaggregated model of AI performing these complex tasks and freedom from the context window limitations of current LLMs.

In my naive thinking, I imagine this to be like the specialist areas of the brain (e.g. the visual cortex) processing small discrete specialist tasks within a broader process.

Software Development Project Execution as a Model for Other Domains

Software development is similar to strategy consulting in that it requires a mix of hierarchical planning, with lots of complexity, and then execution. The idea that AI replaces a consultant is perhaps a long way off, but executing discrete tasks exceptionally well is a current opportunity. The combination of AI within consulting workflows – and perhaps agents in GPT-5 – is very exciting.

Researchers working with BCG (here) found that application of AI in task completion led to the following improvements:

  1. Increased Productivity: Consultants using AI completed significantly more tasks than those without AI – 12.2% more tasks on average.
  2. Enhanced Quality of Work: The quality of work, as measured by human graders, was significantly higher for consultants who used AI – more than 40% higher than the control group without AI.
  3. Faster Task Completion: Consultants using AI completed tasks 25.1% more quickly, a substantial increase in efficiency.
  4. Benefit Across Skill Levels: AI augmentation benefited consultants across the skills distribution. Consultants below the average performance threshold saw a 43% increase in their scores, while those above it saw a 17% increase relative to their own baseline scores without AI.

These are early days. However, let’s consider that summary again:

“For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities, consultants using AI were significantly more productive (they completed 12.2% more tasks on average, and completed tasks 25.1% more quickly), and produced significantly higher quality results (more than 40% higher quality compared to a control group).”

I think the challenge will be to structure consulting tasks into defined workflows so that discrete tasks suitable for AI and agents can be identified and automated. Then again, few would have thought software development – hardly a less complex domain – could be challenged so quickly.

The Atomization of AI

We are already seeing “mixture of experts” LLM models – like the recent open-source release of DBRX. These split the LLM into expert domains and then choose combinations of experts to best answer a query. This has the benefit of smaller, less costly LLMs and very fast tokens-per-second output. This is critical for agentic workflows that need to rapidly and iteratively execute calls to LLMs.
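The routing idea can be sketched in miniature. This is a toy: real mixture-of-experts models use a learned gating network over token embeddings inside the transformer, whereas here a crude keyword score picks the top-2 "experts" so that only a fraction of the model would run per token.

```python
def gate(token: str, experts: dict) -> list[str]:
    """Score each expert for the input; a real MoE uses a learned router."""
    scores = {name: sum(kw in token for kw in kws)
              for name, (kws, _) in experts.items()}
    # Top-2 routing: only the two best-scoring experts are activated.
    return sorted(scores, key=scores.get, reverse=True)[:2]

# Hypothetical experts: (trigger keywords, expert function).
experts = {
    "code":  (["def", "class"],      lambda t: f"code-expert({t})"),
    "math":  (["sum", "integral"],   lambda t: f"math-expert({t})"),
    "prose": (["the", "a"],          lambda t: f"prose-expert({t})"),
}

chosen = gate("def sum_list", experts)
outputs = [experts[name][1]("def sum_list") for name in chosen]
```

The cost saving comes from the experts *not* chosen: in a real MoE, their parameters are never touched for that token, which is what makes the model cheap and fast at inference.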

Perplexity.ai provides a free and paid-for front end to multiple LLMs (e.g. OpenAI, Mistral, etc). This allows some of the atomization via aggregation at the front end. In the last few days, an open-source app that accomplishes the same has appeared.

Workflow, integration and agent tools like WSO2, Zapier, Make, Huginn and n8n clearly become infinitely more powerful using AI agents for embedded inferential tasks. Scanning unstructured text and returning a structured JSON string is trivial for an inferential agent.
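The unstructured-text-to-JSON step can be sketched as a workflow node. `llm_extract` is a hypothetical stand-in that fakes the model's reply; in a real pipeline it would be an LLM API call prompted to answer with JSON only, and the surrounding code would validate the reply before passing it downstream.

```python
import json

def llm_extract(text: str) -> str:
    """Stand-in for an LLM prompted: 'Extract name and email; reply with JSON only.'"""
    return '{"name": "Jane Doe", "email": "jane@example.com"}'

def extract_contact(text: str) -> dict:
    raw = llm_extract(text)
    contact = json.loads(raw)            # validate the reply is actual JSON
    missing = {"name", "email"} - contact.keys()
    if missing:                          # guard against incomplete LLM output
        raise ValueError(f"LLM reply missing fields: {missing}")
    return contact

email_body = "Hi, Jane Doe here, reach me on jane@example.com any time."
print(extract_contact(email_body))
```

The validation step matters: an LLM occasionally returns malformed or incomplete JSON, so a workflow tool like n8n or Zapier would retry or route to a fallback rather than pass bad data on.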

We saw the combination of tools a long time ago – AlphaGo in 2016. Demis Hassabis believes that LLMs plus tree search is the fastest way to AGI.

There is quite a lot of talk that ultimately this will breed an AI OS – an OS of AI with agents and orchestration. And that will sit on top of AI optimised hardware (for native neural networking capability). This is already the direction Nvidia is going.

The Rise of Agent Ecosystems

I think that we’ll see agent ecosystems around particular software and tasks. For example, take the WordPress plugin library. Perhaps instead of marketplaces of plugins, we will see functionality requests being satisfied through AI.

In areas such as R and Python, the library ecosystems will become dynamic, with functionality extended by agents and AI. Libraries might represent a static view of functionality at a point in time before being further improved by agents – much as they represent a static version today before further human improvement.

The End of Application Marketplaces as a Monetization Mechanism?

Applications such as WordPress, SuiteCRM or others combine open-source software with marketplaces for extended functionality. This could be a method to monetize functionalities that are in high demand but not part of the free core system. It also presents an opportunity to bring in more developers.

On-demand agent-driven software development could destroy a current means of monetizing core open-source software.

In many cases, the existing plugin libraries attract huge subscription fees for extended functionality, particularly in business niches.

Data as an AI Building Block

Agent ecosystems could lead to interaction with broader application areas like finance, and underpinning those application areas will be data – not just training data, but real-time data.

Perhaps instead of LLMs being trained on custom data sets, we will move to an interaction between capable LLMs and dynamic data sets – document libraries, knowledge bases, financial data, etc. LLMs and agents will then be used in real time to interact with up-to-date data. In a very basic way, I think we are already seeing something like that with ChatGPT’s ability to search the web and feed results into its responses. Retrieval Augmented Generation (RAG) is already part of PrivateGPT – a means of locally implementing LLMs to interact with private data.
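The RAG pattern can be sketched end to end in a few lines. This is a deliberately crude illustration: real systems use learned embedding models and vector databases, whereas here bag-of-words vectors and cosine similarity retrieve the closest document, which is then prepended to the prompt so the LLM answers from current data rather than stale training data.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Crude bag-of-words 'embedding'; a real system uses an embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical private "knowledge base" the LLM was never trained on.
docs = [
    "invoice payment terms are 30 days from receipt",
    "the office is closed on public holidays",
]

def retrieve(query: str) -> str:
    """Retrieval step: return the document most similar to the query."""
    return max(docs, key=lambda d: cosine(embed(query), embed(d)))

def rag_prompt(query: str) -> str:
    """Augmentation step: prepend retrieved context to the LLM prompt."""
    return f"Context: {retrieve(query)}\nQuestion: {query}"

print(rag_prompt("what are the payment terms?"))
```

Because only the retrieval index needs updating when the data changes, the LLM itself stays untouched – which is exactly the shift from custom training to dynamic data interaction described above.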

Winners and Losers

The medium-term AI short trade is surely obvious now.

I can see the losers in the new world – those performing repetitive tasks with unstructured data. For example (at a very discrete level), isolating contact details from an email is a surprisingly difficult deterministic task via parsing rules. It becomes trivial with AI. There are hosts of admin-intensive tasks in this domain. Repetitive tasks such as Search Engine Optimisation change completely.

IT system administration, helpdesks, etc – they are a huge target. Software development as indicated above. System transformation and integration. AI can untangle logic in legacy systems and rebuild it better in modern platforms.

Call centers are already under threat. Klarna released outputs showing that customers received better (or no-worse) service from automated chatbots. Those chatbots handled two-thirds of customer service chats in their first month.

The creator economy is hugely impacted by AI. I have seen people claiming to generate multiple (10+) videos via AI within an hour, launch them on YouTube, and monetise via AI tuning for exponential (viral) growth. Given that AI’s strength right now is generative, I think an explosion of content is likely. At some point this makes stored content overwhelming and meaningless (the “dead internet”). Ultimately, any content can be generated in real time. That blows up the creator model and its associated monetization. That feels like a very current coming change.

What happens to the Google goldmine – search with advertising? I think search is crucial in combination with AI: LLMs are static, and search provides context and currency. Most people misinterpret AI as a replacement for search. That is poor use, but what people actually do will determine the outcome. A mitigating step is that AI summaries are already present in Google and Bing. Perhaps the broader implication is the rise of subscriptions, or on-demand paid opt-outs from advertising-impregnated results. Currently, top-end AI is only meaningfully available through paid subscription (e.g. GPT-4) or behind a paid service (e.g. Microsoft Copilot). So hybridisation is probably the outcome.

Software firms could be major losers as functionality is generated and customised for end users. Microsoft is moving exceptionally quickly. An AI operating system with ecosystem integration must be a medium to long-term prospect.

Law is changing immediately. Combining LLMs with libraries utilising embeddings and knowledge vectors will allow reliable searching for precedents and referencing. I think we will see AI with the ability to present referenced arguments in the near term.

There are lots of opportunities in manufacturing, supply chain and logistics. This was an early target for Andrew Ng – but he does talk of the difficulties in using AI in fields like quality control, etc. Not a slam dunk and probably lots of opportunities on small tasks, particularly related to integration with IoT.

The potential disruption in security, law enforcement, etc is enormous. AI is already being used for monitoring camera footage for alarm conditions – this happened a long time ago. China has built a surveillance state on the back of AI-driven surveillance. I believe most of these were based on custom and tailored development. Agentic workflow extends customised opportunities to a huge market.

Foundational layers win for the foreseeable future – computing power and energy – so yes, obviously NVIDIA.

However, AI could obsolete IP – i.e. it might design better software, drugs or even a GPU. Hinton and others are talking of a cure for cancer; an optimal GPU that spawns another NVIDIA seems almost trivial by comparison. For scale: it takes a PhD student 5 years to map one protein, and DeepMind mapped 200 million proteins in comparatively no time. AI potentially improves exponentially as it is applied to the full engineering stack – from materials to hardware to software.

There are already players redesigning for AI from the silicon up – Groq designed their LPU chips for inference from the ground up, allowing 500 tokens per second with Mixtral. This means that multiple LLM responses can be processed with no delay to the user. Beyond AI agents performing faster, this allows LLM agents to check one another in real time to stop hallucinations being presented to the user.

Elon Musk has mentioned a few times that beyond chips, the near-term constraint (within the next year) becomes the simple availability of other components, such as the step-down transformers that power computing processors.

If you go more foundational than that, then silicon and other materials win no matter what.

But there must be something more clever than that. Probably the clever stuff is which fields take off due to being changed completely – health and pharmaceuticals is obviously one area.

One of the most hopeful outcomes (beyond improved healthcare) is improved education. It is a short step to a chatbot for every poverty-stricken child on a smartphone (probably with an LLM on the phone itself) – with the whole of the internet as training data. That kid gets a (soon-to-be) perfect teacher instead of a poorly trained one in a badly resourced classroom. That also speaks to the extension of AI to a huge target base, as opposed to Google search effectively serving a more affluent segment of the population. Monetisation becomes a big issue – but badly spent money could be displaced into subscriptions, e.g. from education budgets.

Larry Summers believes AI is coming for the cognitive class. Fields like asset management and wealth management will surely be transformed. AI supports algorithmic trading and investing, removing effects of bias and performing arbitrage to the extent it still exists. Perhaps it scales to far bigger effects than this.

I think we can realistically expect to analyse all standard financial measures via AI – e.g. dump investment due-diligence data into AI and get an analytical output. Then that can be scaled to all publicly available data. Anthropic’s Claude is already better than ChatGPT at this. Private data access becomes a game changer.

Managing AI hallucinations and testing will be absolutely critical. Perhaps, just as transformers turned out to be one of the biggest AI breakthroughs, checking within an atomized AI architecture – with agents performing the task – will be another. This is already happening in the software development tools mentioned above.
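The checking pattern can be sketched as a generator agent whose answer must pass an independent checker before reaching the user. Both "agents" here are hypothetical stand-ins with canned behaviour; in practice each would be a separate LLM call, ideally against different models or data sources.

```python
def generator(question: str) -> str:
    """Stand-in for the answering agent (may hallucinate in real life)."""
    return "Paris" if "France" in question else "unknown"

def checker(question: str, answer: str) -> bool:
    """Stand-in for an independent verifier with its own knowledge source."""
    facts = {"capital of france": "Paris"}
    expected = next((v for k, v in facts.items()
                     if k in question.lower()), None)
    return expected is None or expected == answer

def guarded_answer(question: str) -> str:
    answer = generator(question)
    if not checker(question, answer):
        return "unable to verify, withholding answer"  # block the bad output
    return answer
```

This is where fast inference hardware matters: doubling or tripling the LLM calls per user request is only viable when each call returns in milliseconds.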

There is surely an anti-AI reaction – a bit like people going back to LPs and film photography. This exists at many levels, and perhaps face-to-face and relationships become more sought after.

The old big spender is what it always was – defence. Hinton mentions US defence now and again. AI clearly has immediate application to misinformation bots or monitoring. China has an enormous AI-driven surveillance state.

There are lots of conspiracy theories about OpenAI cracking encryption (as the reason Altman was temporarily fired). Apparently Ilya Sutskever is now only focused on AI ethics and security. I am not sure inference can break encryption. Maybe I am just scared of the consequence. And AI engines are becoming better at maths. JP Morgan has had a team working on defence against AI encryption-cracking for a few years. If AI were close, the genie would be out of the bottle. I imagine that Russia and China will be racing to leverage all the open-domain and spied AI progress to get ahead. Clearly the world will pay anything to defend against this – if it is defendable.

Broader Implications

There must be room for game changing combinations of technology. Clearly energy and AI supercomputing is one area. Microsoft and Amazon have both pursued adjacency to nuclear power sources to cater for the massive energy needs of current AI. Sam Altman talks of the potential of nuclear fusion and perhaps fission. The combination of AI with quantum computing is an exciting or scary prospect. These combinations become nearer term when perhaps AI is deployed to solve the road-blocks to progress in those domains.

For me, the scariest part of the pace of development is that while dominant actors such as OpenAI, Anthropic, Amazon, Microsoft and Google might be scrambling to manage the potential for a bad AI outcome, this genie is well and truly out of the bottle. While rigorous testing, LLM ablation and other techniques might mitigate bad outcomes, jailbreaks are still a problem. The bigger problem, surely, is that developers are feeding off one another at a rate of knots. Previously assumed roadblocks to AI development, such as diminishing returns on LLM scaling, have proven false and encouraged others to rapidly intensify their efforts. The scope for bad actors such as hacker groups to use localised open-source LLMs as the inference engine in a hacking agentic workflow seems obvious and immediate. Or to increase the effectiveness of spam and cyberscams. Or to launch massive paid misinformation campaigns. At a nation-state level, restrictions baked into the massive LLMs are irrelevant if researchers build their own.

But more immediate to most of us: the application of AI to discrete inferential tasks is happening now, and agentic workflows are changing the world today.