We typically think of an agent as an LLM that has access to tools, knows how to use them, and can decide when to do so based on the inputs, the intermediate outputs, and the surrounding context. Rather than having the sequence of actions hard-coded, the agent decides which action to take next.
However, the idea of using tools is not exclusive to agents: chains can connect to tools too. The key difference lies in the notion of state. An agent has explicit states that it transitions between, and on each call we ask the language model to perform those transitions implicitly. In other words, we ask the model to be aware of whether something can be done in one shot or whether it requires a plan.
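The contrast can be sketched in a few lines. This is a toy illustration, not any particular framework's API: `fake_llm` and `lookup` are hypothetical stand-ins for a real model call and a real tool.

```python
def fake_llm(prompt: str) -> str:
    # Stand-in for a real LLM call: requests a tool until it has an observation.
    if "capital of France" in prompt and "Paris" not in prompt:
        return "ACTION: lookup('capital of France')"
    return "FINAL: Paris"

def lookup(query: str) -> str:
    # Stand-in tool (e.g. a search API).
    return {"capital of France": "Paris"}.get(query, "unknown")

# Chain: the sequence of steps is fixed in code.
def chain(question: str) -> str:
    context = lookup(question)                          # step 1, always runs
    return fake_llm(f"{question} Context: {context}")   # step 2, always runs

# Agent: the model decides, call by call, whether to act or to finish.
def agent(question: str, max_steps: int = 3) -> str:
    prompt = question
    for _ in range(max_steps):
        reply = fake_llm(prompt)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        # The model asked for a tool; run it and feed the result back.
        query = reply.split("('")[1].split("')")[0]
        prompt = f"{question} Observation: {lookup(query)}"
    return "gave up"

print(agent("What is the capital of France?"))
```

Note that the chain's control flow lives in the Python code, while the agent's control flow lives in the model's replies, which is exactly the implicit state transitioning described above.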
Agents are not trivial to implement and maintain. All the pieces the system uses to resolve a given state and transition are what we tend to call “a cognitive architecture”. Multiple cognitive architectures have emerged lately. Most readers may be familiar with the Reason and Act (ReAct) architecture, but there are other options such as Plan-and-Solve, Reflexion, and Dialog-Enabled Resolving Agents (DERA).
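As one concrete example, the ReAct pattern interleaves a Thought, an Action, and an Observation on every turn, with the growing transcript fed back into the model. A minimal sketch, where `scripted_llm` is a hypothetical stand-in for a real model:

```python
# Toy tool registry: one "search" tool backed by a tiny dictionary.
TOOLS = {"search": lambda q: {"Python creator": "Guido van Rossum"}.get(q, "no result")}

def scripted_llm(transcript: str) -> str:
    # Stand-in model: reasons and acts first, answers once it has an observation.
    if "Observation:" not in transcript:
        return "Thought: I should look this up.\nAction: search[Python creator]"
    return "Thought: I have the answer.\nFinal Answer: Guido van Rossum"

def react(question: str, max_turns: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_turns):
        turn = scripted_llm(transcript)
        transcript += "\n" + turn
        if "Final Answer:" in turn:
            return turn.split("Final Answer:")[1].strip()
        # Parse the Action line, run the named tool, append the Observation.
        action_line = [l for l in turn.splitlines() if l.startswith("Action:")][0]
        name, arg = action_line.removeprefix("Action: ").rstrip("]").split("[")
        transcript += f"\nObservation: {TOOLS[name](arg)}"
    return "gave up"

print(react("Who created Python?"))
```

Even in this toy version, the maintenance burdens show up: parsing the model's free-text actions, bounding the loop, and handling tools that return nothing useful.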
In this talk, we will recap the different cognitive architectures for agents from a practical point of view, the main challenges they face when it comes to implementation, and what the landscape looks like.