Sydney Huang is the Founder of HumanAPI and CEO of Eclipse, where she leads product and strategy for AI-native infrastructure. She previously launched Turbo Tap, scaling it to 300,000 users and over 22 billion in-game transactions.
Her background spans product roles across Web3 projects, including DeGods, y00ts, and Unstoppable Domains. Earlier in her career, Sydney worked in M&A and venture capital at Dell Technologies. She is a graduate of Babson College.
We recently spoke with Sydney about the limitations of purely digital AI agents and why connecting them to a human workforce is the next necessary step for the autonomous economy.
Read more about HumanAPI’s mission to give AI agents “hands” in the physical world in the interview below.
There’s been a surge in onchain AI agents gaining wallets, identity standards, and autonomous payment rails. Where do you see onchain AI’s next biggest pain point?
The infrastructure now exists, from payment rails to base layers for high-speed execution, but agents are still “all dressed up with nowhere to go.” That is because, until now, there has been a gap between where the digital world ends and the physical world begins.
It is a gap that no agent, no matter how smart, can cross on its own. But humans can. This is obvious even to casual LLM users, who can engage in all kinds of intelligent conversation with their favourite AI.
However, when it comes to implementing those brilliant ideas, the model is on its own. That is because agents cannot complete the last mile between ideation and execution.
Until we solve how agents legally and logistically interact with reality, their utility is capped at purely digital optimisation within narrow context windows.
If agents can already reason, transact, and execute smart contracts, where do they still fundamentally fall short?
When an agent needs high-context judgement or nuanced language understanding, it falls short. These are tasks that humans do not even recognise as “tasks”.
Interpreting sarcasm or handling edge cases that require intuition rather than rules are areas where humans operate on compressed context accrued over a lifetime.
Agents are powerful systems, but they lack the final layer of human judgement that turns intelligence into economic usefulness. An agent can effortlessly reason through a legal contract, but it cannot tell whether a person’s tone of voice implies a hidden objection.
It can execute a smart contract for a delivery, but it cannot physically hand over the package or navigate a locked apartment lobby. This has been obvious to AI developers for a long time now, of course. What has been less obvious is how best to solve this.
After levelling up from Product Manager to CEO at Eclipse, why take on another foundational build with HumanAPI?
Infrastructure only matters if it unlocks real, concrete use cases. At Eclipse, we built the fastest infrastructure for the modular era, leveraging the power of parallel execution. HumanAPI is a direct extension of that same way of thinking, applied to the application layer.
As agents began operating autonomously, it became clear that they were constrained by human-oriented interfaces, such as captchas, rate limits, and other friction points. This is the practical equivalent of a single-threaded process.
Agents are exactly the “users” that benefit most from scalable architecture, so we set out to build an application that was agent-native from day one. If Eclipse provides the brain for the agent to think and transact at scale, HumanAPI provides the hands for the agent to act.
We are building the coordination layer that enables massively parallel digital agents to interface with a human workforce, allowing them to execute tasks end-to-end.
HumanAPI positions the human layer as infrastructure rather than managed services or crowdwork. What does that shift change structurally?
Managed services are top-down: you hire a company, which in turn hires people. Crowdwork, meanwhile, does not have a great reputation because it is often exploitative or messy. HumanAPI removes people from the role of being managed and instead makes them addressable.
This effectively transforms humans into modular contributors that agents can programmatically evaluate and pay. That may sound, ironically, dehumanising or like reducing people to mere cogs in the machine. But it is actually the opposite.
It means that humans know the rules before they choose to participate, which guarantees a predictable outcome. In other words, “Do this, and you will receive this.” From the agents’ perspective, it resolves many problems as well.
Treating the human layer as infrastructure enables the creation of a standardised API call. An agent simply sends a Request-for-Data (RFD) or a Task-ID, and the network fulfils the request. Structurally, this turns human labour into a plug-and-play component powered by instant onchain payouts.
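To make the idea concrete, here is a minimal sketch of what such an RFD might look like as a structured payload. The field names, schema, and reward mechanics are illustrative assumptions, not HumanAPI's actual API:

```python
# Hypothetical sketch of a Request-for-Data (RFD) payload an agent might
# submit. Field names and values are illustrative, not HumanAPI's schema.
import json
import uuid


def build_rfd(task_type: str, spec: dict, reward_usd: float) -> dict:
    """Assemble an RFD that an agent could hand to the network."""
    return {
        "rfd_id": str(uuid.uuid4()),  # unique identifier for this request
        "task_type": task_type,       # e.g. "audio_recording"
        "spec": spec,                 # constraints a contributor must meet
        "reward_usd": reward_usd,     # payout released onchain on acceptance
    }


rfd = build_rfd(
    "audio_recording",
    {"duration_s": 10, "tone": "frustrated", "accent": "Texas"},
    5.00,
)
print(json.dumps(rfd, indent=2))
```

The point is that the request is fully machine-generated and machine-readable: nothing in it assumes a human-facing dashboard or a managed-services contract.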
You’ve described HumanAPI as solving the “last-mile problem” for AI agents. What exactly is that last mile?
The last mile is everything that agents cannot yet reliably do on their own but still need to complete in order to do economically meaningful work.
For a scientific research agent, the last mile might mean getting someone to collect physical lab samples. For a logistics agent, it could be literally the final 50 feet of a delivery. It is the gap where intelligence meets existence.
You’re launching with a focus on voice data. Why is audio one of the most constrained inputs in modern AI systems?
Audio is incredibly information-dense. When you think about it, a voice recording isn’t merely about words; it is about sentiment. Audio conveys emotion and sarcasm, and regional dialects carry phrasing that AI simply misses.
Even state-of-the-art models struggle with these layers, particularly across languages and accents. For example, one of our datasets focuses on conversational audio. For AI to sound natural, it needs to learn cadence.
It must learn how people actually pause, speed up, slow down, and change tone in real speech. It also has to learn social timing, such as when to interrupt, overlap, or wait. These nuances are second nature to humans but remain challenging for AI to model.
High-quality audio data requires fluent speakers, controlled recording environments, and careful review. Audio is one of the clearest examples where human contribution still dominates model performance, making it a natural starting point for demonstrating where HumanAPI can add value.
What does “agent-native” actually mean in practice? How does an AI agent request and coordinate human work through your platform?
Agent-native means the system is designed for agents first, rather than adapted from human workflows. On HumanAPI, agents initiate requests for data or tasks through a structured request-for-data flow.
In practice, this might mean an agent hitting our endpoint with a task specification, say, “Record 10 seconds of a frustrated tone in a Texas accent,” and our protocol matching that task to a human contributor who fits the profile.
Humans opt into those tasks, complete them, and submit results that agents can evaluate and integrate programmatically. From the agent’s perspective, requesting a human is as straightforward as calling an API endpoint.
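The request-match-evaluate loop described above can be sketched in a few lines. Everything here is a toy illustration under assumed names (`Contributor`, `Task`, `match`); the real protocol's matching and payout logic is not public in this interview:

```python
# Toy sketch of the agent-native flow: an agent posts a task, the network
# matches it to a contributor whose profile satisfies the spec.
# All class and function names are hypothetical.
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contributor:
    name: str
    tags: set  # profile attributes, e.g. {"accent:texas", "lang:en"}


@dataclass
class Task:
    spec: str
    required_tags: set
    assignee: Optional[str] = None


def match(task: Task, pool: list) -> Optional[Contributor]:
    """Assign the task to the first contributor meeting every requirement."""
    for contributor in pool:
        if task.required_tags <= contributor.tags:  # subset check
            task.assignee = contributor.name
            return contributor
    return None


pool = [
    Contributor("alice", {"accent:texas", "lang:en"}),
    Contributor("bob", {"accent:london", "lang:en"}),
]
task = Task("Record 10s of a frustrated tone", {"accent:texas"})
worker = match(task, pool)
# The matched contributor opts in, records the clip, and submits it; the
# agent then evaluates the result and triggers payout on acceptance.
```

From the agent's side, the whole exchange reduces to one structured call and one structured response, which is what makes the human layer composable.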
You’ve already delivered paid datasets to enterprise customers while in stealth. What did that early validation reveal about market demand?
It confirmed that quality matters more than scale. Frontier AI teams are not struggling to find data; they are struggling to find data they can trust. They have already scraped the “easy” internet.
Now they need the “hard” data: data that requires human permission and participation. The demand we saw was for precision and human judgement rather than volume.
It showed that teams are willing to pay a premium when data meaningfully improves model performance.
Do you see HumanAPI expanding beyond audio into physical-world task execution?
Absolutely. Audio is our beachhead because it is easily verifiable and highly scalable. But the roadmap leads to physical-world coordination, with the endgame being a sort of “Uber for Agents.”
If an autonomous business needs a physical document signed or a hardware sensor checked, it will use HumanAPI to dispatch a human to do so.
Five years from now, what role do you believe the human layer plays? Do you envision a future where agents routinely hire humans as part of autonomous workflows?
I know there are utopian and dystopian answers to this question, but I am firmly in the former camp. I believe that in five years, agents will collectively be the largest employer on Earth.
The narrative that AI takes jobs is inaccurate. What it actually does is change how we do our jobs. The role and workflow might change, but by 2030, most of us will still be meaningfully employed.
Not everyone will work in the agentic economy, but the humans who do will wake up, check their HumanAPI app, and see that an agent in Singapore has hired them. They will be hired for ten minutes of their specific expertise or local presence.


