Building an Agentic Workflow Prototyping Platform

What can we actually do with systems of AI Agents? I wanted to find out by doing.

Jun 10, 2025

I previously wrote about how difficult I have found it to navigate AI Signal vs Noise, and that the best way to discern what is actually possible is to learn by doing. So I plan to build out a bunch of different agentic systems to see what useful things I can get them to do reliably, and to see what surprises me along the way.

Learning By Doing

I’ve decided to build my own low-code platform to quickly prototype, visualize, and get quick feedback on various agentic applications to see what I can actually do with them.

Why build? There are a million low-code SaaS platforms out there today that allow you to orchestrate AI agents. However, I decided to build because:

1. I’m a software engineer. Of course I decided to build :)
2. I want full control over the prompts, responses, control flow…etc. Again this is mainly for research. I don’t want some framework hiding the “magic” from me.
3. Full introspection. Again like #2 - I want to be able to inspect everything that is happening with full control. I want to see where I hit the limits. I don’t want to have to ask: “is this a limitation of the LLM or of the platform?”
4. This is for exploration. I will learn more if I have to build the plumbing myself rather than have a SaaS product abstract that away from me.
5. I definitely foresee needing some custom functionality that is not offered by the existing platforms, and I want to be able to quickly make changes to fit my unique needs.

So what do I need to build?

The system I am building consists of three things:

Datastores
Workflows
Content

Datastores

As a backbone to my system I want to be able to quickly create, read from, and write to persistent datastores, and use them in my agentic workflows. Requirements:

Quickly create a new datastore and define its schema in the UI
Should allow for flexible schemas (ie JSON columns)
Should support vector embedding so that I can perform semantic searches / RAG.

Ultimately I chose Postgres as the backbone for this, which supports all of the above requirements and allows me to quickly create datastores via the UI.
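As a rough sketch of what "create a datastore from the UI" could translate to, the snippet below turns a small schema description into a Postgres `CREATE TABLE` statement. The `Column` class, the `PG_TYPES` mapping, and the function names are all hypothetical, and the vector column assumes the pgvector extension is installed (the post does not say which vector mechanism is actually used).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from UI column types to Postgres types.
PG_TYPES = {
    "text": "TEXT",
    "integer": "INTEGER",
    "json": "JSONB",  # flexible, schema-free payloads
    "timestamp": "TIMESTAMPTZ",
}

# Only allow simple lowercase identifiers so names can be embedded in DDL.
IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


@dataclass(frozen=True)
class Column:
    name: str
    type: str  # a key of PG_TYPES, or "vector"
    dimensions: Optional[int] = None  # required when type == "vector"


def column_sql(col: Column) -> str:
    """Render one column definition, validating its name and type."""
    if not IDENTIFIER.match(col.name):
        raise ValueError(f"invalid column name: {col.name!r}")
    if col.type == "vector":
        if not col.dimensions or col.dimensions < 1:
            raise ValueError("vector columns need a positive dimension count")
        # Assumption: `vector(n)` is the column type provided by pgvector.
        return f"{col.name} vector({col.dimensions})"
    if col.type not in PG_TYPES:
        raise ValueError(f"unknown column type: {col.type!r}")
    return f"{col.name} {PG_TYPES[col.type]}"


def create_table_sql(table: str, columns: list) -> str:
    """Build a CREATE TABLE statement for a datastore definition."""
    if not IDENTIFIER.match(table):
        raise ValueError(f"invalid table name: {table!r}")
    lines = ["id BIGSERIAL PRIMARY KEY"] + [column_sql(c) for c in columns]
    return f"CREATE TABLE {table} (\n  " + ",\n  ".join(lines) + "\n);"


print(create_table_sql("qa_pairs", [
    Column("question", "text"),
    Column("metadata", "json"),
    Column("embedding", "vector", 384),
]))
```

With pgvector, a semantic search over such a table would typically be an `ORDER BY` on a distance operator such as `<=>` (cosine distance) with a `LIMIT`.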

Workflow Builder

I will need to be able to compose deterministic and non-deterministic (ie LLM calls) nodes together into workflows. I think I can get pretty far and keep it simple (at least for now) with just a few node types:

Node Types

Start Node - to trigger the workflow via an API call or timer/cron
Query Node - query a datastore
Write Node - write to a datastore
Condition Node - allow me to branch logic in the workflow
LLM Node - call an AI model with a given prompt
Goto Node - change the flow execution and jump to a specified node
HTTP Node - call external APIs
Timer Node - wait for a specified amount of time before proceeding
Event Node - wait for a specific event before proceeding

An example Question & Answer workflow consisting of a Start Node, LLM Nodes, Query Nodes, a Condition Node, a Goto Node, and a Write Node.
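To make the node types concrete, here is a minimal sketch of a workflow interpreter, not the platform's actual implementation. It collapses Start, Query, LLM, and Write nodes into a generic `Step` and models Condition and Goto nodes separately. All class and function names are hypothetical, and the LLM call is replaced by a fake function that only returns a good answer on its second attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """A node that does work (query, LLM call, write) and stores its result."""
    name: str
    run: Callable[[dict], object]
    sets: Optional[str] = None  # definition name to store the result under
    next: Optional[str] = None  # name of the following node, None to stop


@dataclass
class Condition:
    """Branch to one of two nodes based on the current definitions."""
    name: str
    predicate: Callable[[dict], bool]
    if_true: str
    if_false: str


@dataclass
class Goto:
    """Jump unconditionally to another node."""
    name: str
    target: str


def execute(nodes: list, start: str, max_steps: int = 100) -> dict:
    """Walk the graph from `start`, returning the definitions that were set."""
    by_name = {node.name: node for node in nodes}
    definitions: dict = {}
    current: Optional[str] = start
    steps = 0
    while current is not None:
        if steps >= max_steps:
            raise RuntimeError("max_steps exceeded; possible Goto loop")
        steps += 1
        node = by_name[current]
        if isinstance(node, Step):
            result = node.run(definitions)
            if node.sets:
                definitions[node.sets] = result
            current = node.next
        elif isinstance(node, Condition):
            current = node.if_true if node.predicate(definitions) else node.if_false
        else:  # Goto
            current = node.target
    return definitions


def make_fake_llm() -> Callable[[dict], str]:
    """Stand-in for an LLM call: returns a draft first, then a final answer."""
    calls = {"n": 0}

    def fake_llm(definitions: dict) -> str:
        calls["n"] += 1
        prefix = "final" if calls["n"] >= 2 else "draft"
        return f"{prefix}: answer to {definitions['question']}"

    return fake_llm


def build_qa_workflow() -> list:
    return [
        Step("start", lambda d: "What is durable execution?", sets="question", next="answer"),
        Step("answer", make_fake_llm(), sets="answer", next="check"),
        Condition("check", lambda d: d["answer"].startswith("final"), "write", "retry"),
        Goto("retry", target="answer"),
        Step("write", lambda d: {"question": d["question"], "answer": d["answer"]}, sets="written"),
    ]


print(execute(build_qa_workflow(), start="start"))
```

The `max_steps` guard is one simple way to keep a Goto loop from running forever. A real system would likely want a per-node retry budget instead.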

Node Properties

Nodes will share some common properties, namely:

Name: we should be able to name nodes, so that we can visualize our workflows and debug node outputs easily.
Definitions: a node can set the value of a definition that can be used in subsequent steps of a workflow. Ie a Query Node may set the results of a query to a definition called searchResults that is then used in the prompt of a subsequent LLM Node step. (Definitions are green tags in the workflow picture above).
Typing & Autocompletes: All fields and inputs will be typed, with their schemas inferred from their source datastore. All available definitions and fields will be auto-populated and selectable from autocompletes. I don’t want to run a long-running workflow just to see it fail 10 minutes later because of a typo in a variable name…etc.

Example of writing to a data store in a Write Node. All of the fields and values are typed and selected via an autocomplete, giving a “Datadog metric selection” type of experience, leaving no room for typos or type mismatches.
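The "fail before running, not 10 minutes in" idea can be sketched as a static check over definitions. The example below assumes a hypothetical `{{name}}` placeholder syntax and a strictly linear workflow (branches are ignored). It reports any node that references a definition no earlier node has set, and shows how a valid template would be filled in at run time.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical placeholder syntax: {{definitionName}}
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


@dataclass
class NodeSpec:
    name: str
    template: str  # prompt or query text that may reference definitions
    sets: Optional[str] = None  # definition this node produces, if any


def referenced(template: str) -> set:
    """Names of all definitions a template refers to."""
    return set(PLACEHOLDER.findall(template))


def find_undefined(nodes: list) -> list:
    """Return one error per reference to a definition that is not yet set."""
    defined: set = set()
    errors = []
    for node in nodes:
        for ref in sorted(referenced(node.template) - defined):
            errors.append(f"{node.name}: unknown definition '{ref}'")
        if node.sets:
            defined.add(node.sets)
    return errors


def render(template: str, definitions: dict) -> str:
    """Substitute definition values into a template at run time."""
    return PLACEHOLDER.sub(lambda m: str(definitions[m.group(1)]), template)


workflow = [
    NodeSpec("start", "", sets="question"),
    NodeSpec("search", "find documents about {{question}}", sets="searchResults"),
    NodeSpec("answer", "Answer {{question}} using {{searchResult}}"),  # typo
]
print(find_undefined(workflow))
```

An autocomplete-driven UI would prevent the typo from being entered at all. A check like this is the fallback for anything that arrives through an API or an import.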

Nodes will also have some custom properties based on the node type. For example an LLM node type will have properties for setting its System Prompt, Prompt, Output Format…etc.

Properties for an LLM Node: System Prompt, Prompt, and Output Format.

Previewing & Inspecting Results of Each Stage

I should be able to run the workflows right in the workflow UI in a “preview mode” and inspect the output of each stage. Preview mode should mean that any mutation events are mocked (ie writing to a datastore).

Seeing the raw output of each stage of the workflow during a preview run
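One way to mock mutations in preview mode is to route every write through a single function that checks a flag. The sketch below is an assumption about how this could work rather than the platform's code. `Datastore`, `RunLog`, and `write_node` are hypothetical names, and the datastore is an in-memory list standing in for a Postgres table.

```python
from dataclasses import dataclass, field


@dataclass
class Datastore:
    """In-memory stand-in for a persistent datastore."""
    rows: list = field(default_factory=list)

    def insert(self, row: dict) -> None:
        self.rows.append(row)


@dataclass
class RunLog:
    """Per-run record of what each node produced, for inspection in the UI."""
    entries: list = field(default_factory=list)


def write_node(store: Datastore, row: dict, preview: bool, log: RunLog) -> None:
    """Write a row, or only record the intended write when previewing."""
    if preview:
        log.entries.append({"node": "write", "mocked": True, "row": row})
        return
    store.insert(row)
    log.entries.append({"node": "write", "mocked": False, "row": row})


store, log = Datastore(), RunLog()
write_node(store, {"question": "Q1", "answer": "A1"}, preview=True, log=log)
print(store.rows)   # nothing was persisted
print(log.entries)  # but the intended write is visible for inspection
```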

Durable Asynchronous Execution

I plan on running some long-running workflows with this system. Therefore I want to enable asynchronous processing and ensure durable execution. In other words, workflows should execute in the background, and a failure of a node shouldn’t mean that I have to start the entire workflow over again.

To achieve this, I built on top of Temporal, which takes care of a lot of the heavy lifting regarding scheduling, durability, and execution.
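To illustrate what "durable" buys you, here is a standard-library sketch of the underlying idea. It does not use Temporal's API and says nothing about how Temporal implements it. Each completed step's result is saved to a checkpoint file, so a rerun after a failure skips the steps that already finished. `run_durably` and the step functions are hypothetical, and results are assumed to be JSON-serializable.

```python
import json
import tempfile
from pathlib import Path


def run_durably(steps: list, checkpoint: Path) -> dict:
    """Run (name, fn) steps in order, persisting each result as it completes."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for name, fn in steps:
        if name in done:
            continue  # finished in an earlier attempt; reuse the saved result
        done[name] = fn(done)  # may raise; earlier results are already saved
        checkpoint.write_text(json.dumps(done))
    return done


# Demo: the second step fails once, then succeeds on the retry.
calls = []
flaky = {"fail": True}


def fetch(done: dict) -> list:
    calls.append("fetch")
    return [1, 2, 3]


def summarize(done: dict) -> int:
    calls.append("summarize")
    if flaky["fail"]:
        raise RuntimeError("transient failure")
    return sum(done["fetch"])


steps = [("fetch", fetch), ("summarize", summarize)]
with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "run.json"
    try:
        run_durably(steps, checkpoint)
    except RuntimeError:
        flaky["fail"] = False  # pretend the outage is over
    result = run_durably(steps, checkpoint)

print(result)
print(calls)
```

In the demo, `fetch` runs once and `summarize` runs twice: the retry picks up from the failed step instead of starting over.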

Content

The last component of the system is Content. I want to be able to create rich content in the platform for two main reasons:

Creating, organizing, and iterating on System Prompts for LLM Nodes.
Composing rich documents that display the output of various stages of the agentic workflows I will be creating, for monitoring and review purposes.

Rich Content

I want to compose documents of rich components in a “Notion-like” interface. This will allow me to organize criteria in my system prompts more quickly and clearly (ie putting evaluation criteria in Tables, tool call examples in code blocks…etc).

Using headers, tables, code blocks…etc to organize content for System Prompts.

Data Input

I want to be able to interpolate and populate components with data from my datastores. This will enable me to quickly build living documents on top of data written to datastores in workflows. This can help me visualize and monitor the output of a workflow or of various nodes in a workflow.

Should be able to “Configure Section Data” to populate components with data from datastores written to in workflows.

Example of populating a table with input question, LLM derived answer, and LLM evaluation of a Question & Answer pair, one row per workflow run.

More interestingly, I can include datastore-derived components in documents that serve as Prompts to LLM nodes, providing them with evolving prompts upon each workflow run, which is a good segue to the next two items.

Usable in Workflows

Content should be usable in Workflows. Ie a document can be the system prompt to an LLM node.
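A document that mixes static text with datastore-backed components can be sketched as a list of sections rendered to a string at run time. Everything here is hypothetical: the `Text` and `Table` classes, the pipe-delimited table format, and the `load_rows` callable that stands in for a datastore query. Because the table is re-queried on every render, the resulting system prompt changes as workflows write new rows.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Text:
    body: str


@dataclass
class Table:
    columns: list
    load_rows: Callable[[], list]  # stand-in for a datastore query


def render_table(columns: list, rows: list) -> str:
    """Render rows as a pipe-delimited table, one line per row."""
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    body = [
        "| " + " | ".join(str(row.get(col, "")) for col in columns) + " |"
        for row in rows
    ]
    return "\n".join([header, divider, *body])


def render_document(sections: list) -> str:
    """Render a document to plain text, e.g. for use as a system prompt."""
    parts = []
    for section in sections:
        if isinstance(section, Text):
            parts.append(section.body)
        else:
            parts.append(render_table(section.columns, section.load_rows()))
    return "\n\n".join(parts)


evaluations = [{"question": "Q1", "answer": "A1", "score": 4}]
document = [
    Text("Past evaluations of your answers:"),
    Table(["question", "score"], lambda: evaluations),
]

print(render_document(document))
evaluations.append({"question": "Q2", "answer": "A2", "score": 2})
print(render_document(document))  # same document, now with one more row
```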

Editable by API

Content should also be editable via an API call. This could allow stages of a workflow to be able to edit or compose documents themselves. This can enable workflows where the LLMs could [potentially] “self-improve” by editing their own system prompts.

Versioning

Content should be publishable / versionable. Ie we can iterate on system prompt documents and test in preview mode before changing the system prompt of running workflows.
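A minimal sketch of the publish/version split, under the assumption that each document has one mutable draft and an append-only list of published versions. The `VersionedDocument` class and its methods are hypothetical. Edits, whether from the UI or from an API call made by a workflow, only touch the draft. Running workflows read the latest published version, and preview runs read the draft.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDocument:
    draft: str = ""
    published: list = field(default_factory=list)  # oldest first, append-only

    def edit(self, text: str) -> None:
        """Change the draft only; running workflows are unaffected."""
        self.draft = text

    def publish(self) -> int:
        """Snapshot the draft as a new version and return its 1-based number."""
        self.published.append(self.draft)
        return len(self.published)

    def resolve(self, preview: bool = False) -> str:
        """Return the text a workflow run should use."""
        if preview:
            return self.draft
        if not self.published:
            raise LookupError("document has no published version")
        return self.published[-1]


prompt = VersionedDocument()
prompt.edit("You are a helpful assistant.")
prompt.publish()
prompt.edit("You are a helpful assistant. Cite your sources.")

print(prompt.resolve())              # live workflows still see version 1
print(prompt.resolve(preview=True))  # preview runs see the unpublished edit
```

Keeping self-edits in the draft until something publishes them also gives a natural review point if LLM nodes are allowed to rewrite their own prompts.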

And That’s It

While I imagine I will continue to add more and more components to the Content Editor (ie different types of graphs or visualization tools…etc), the basics are there and I can start experimenting with some potential agentic workflows.

So let’s get started with our first one: creating a Knowledge Base Question & Answer Agent.

Joe DiVita's Substack
