Why human in the loop AI will fail as oversight

How upstream automation quietly shapes the decision

By Yule Guttenbeil

Most AI governance frameworks I encounter in Australian businesses rest on a single load-bearing assumption. The assumption is that a human at the end of the decision chain provides meaningful oversight of the automated systems that fed into it. I do not think that assumption holds, and the reasons it does not hold are structural rather than the product of any individual failing.

This article explains why human in the loop architecture, on its own, is not a sufficient indicator that a system is operating safely, and what executives should ask in its place. It draws on a submission I recently made to the Office of the Australian Information Commissioner on the new automated decision-making transparency obligation in the Privacy Act.

What actually happens before the human decides

Decision-shaping in modern enterprise software increasingly occurs upstream of the formal decision point. Automated systems perform an expanding share of their work through intake screening, triage, prioritisation, summarisation, and routing. All of these operate before a human formally makes a decision.

Consider a paradigm case that will be familiar to anyone using Microsoft 365. A staff member is asked to make a judgement call on a customer matter. They open the email thread, but before they read it, Copilot produces a summary. They read the summary. They form a view. They act on it.

A summary is not a neutral compression. It selects, omits, and characterises. Where the human’s view is formed on the summary rather than on independent reading of the underlying record, the summarisation has shaped the decision in substance, not merely in form. The decision-maker may not, after the fact, be able to identify whether their view was independently formed or was anchored by the characterisation in the summary.

That retrospective inability to isolate the system’s contribution is not a reason to treat the system as outside the scope of governance. It is an aggravating feature. Decision-shaping that cannot be traced after the fact is decision-shaping that cannot be contested after the fact.

The accountability sink

This is where the structural problem appears. Automated systems shape decisions but cannot bear responsibility for them. Responsibility, in the legal sense, can attach only to human or corporate actors.

Where the configuration of a decision pipeline routes the visible accountability to a human at the end of the pipeline, while leaving the upstream systems and the entity that procured, deployed, and configured them at one or more removes from individuated accountability, the result is predictable. Responsibility is concentrated on the actor with the least influence over the substantive content of the decision.

I call this an accountability sink. It is not the product of bad faith. It is the predictable product of an architecture in which the human ratifier is the only locatable actor at the moment of decision, and every upstream contribution is too diffuse or invisible to attribute.

Why override capacity does not solve the problem

The most common reassurance I hear is that the human can always override the system. In principle, yes. In practice, override capacity is routinely present in modern systems but rarely exercised, particularly under time pressure or where the human must document reasons for divergence. Humans need strong reasons to want to override a system that appears to be doing its job.

Three operational facts compound this:

First, the order of presentation matters. A system output that reaches the human before they form an independent view anchors the decision. A post-hoc check is a different exercise.
Second, base-rate concordance, the proportion of cases in which the human matches the system, is typically high, and once it is high it is self-reinforcing.
Third, the cost of disagreement is asymmetric. Where overriding requires the human to document reasons, escalate, or accept personal accountability for divergence, the system is functionally determinative even where it is formally advisory.

None of this is visible from a governance document that records “human in the loop”. All of it is visible from observing how the system actually operates and is used in practice.
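For teams that want to put a number on the second of those facts, the following is a minimal sketch in Python of how concordance could be measured per reporting period. It assumes each decision has been logged with the system's output and the human's final outcome. The Decision record, its field names, the function name, and the sample data are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Hypothetical log entry for one decision (field names are assumptions)."""
    period: str          # reporting period label, e.g. "2024-Q1"
    system_output: str   # what the system recommended or characterised
    human_outcome: str   # what the human ultimately decided


def concordance_by_period(decisions):
    """Return {period: share of decisions where the human matched the system}."""
    matched = defaultdict(int)
    total = defaultdict(int)
    for d in decisions:
        total[d.period] += 1
        if d.human_outcome == d.system_output:
            matched[d.period] += 1
    # Sort periods so the result reads as a trend over time.
    return {p: matched[p] / total[p] for p in sorted(total)}


# Invented sample data, for illustration only.
sample = [
    Decision("2024-Q1", "approve", "approve"),
    Decision("2024-Q1", "decline", "approve"),
    Decision("2024-Q2", "approve", "approve"),
    Decision("2024-Q2", "decline", "decline"),
]
rates = concordance_by_period(sample)
print(rates)  # {'2024-Q1': 0.5, '2024-Q2': 1.0}
```

A high or rising figure does not prove that the human is merely ratifying, because the system may simply be right. It does indicate where to look more closely at time pressure, order of presentation, and the cost of disagreeing.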

The questions executives should be asking

If you sit on a board, run an executive team, or carry general counsel responsibilities, the practical questions are not about whether AI is “in use” or whether a human is involved. From here on, the answer to both will almost always be yes. The practical questions are about the functional influence of the system on actual decisions. I would suggest five.

What information does the system curate before the human sees the file, and can the human detect what was filtered or characterised?
What proportion of decisions match the system’s recommendation, and is that proportion trending up over time?
How much time does the human have to engage with the underlying record, and what is the cost to them of disagreeing?
Where the system performs intake, triage, or summarisation, is that activity recognised as decision-shaping, or is it being treated as administrative?
If a customer or staff member challenged a decision, could you reconstruct what the system contributed at the time of the decision?

If the answers are unclear, you do not yet have oversight. You have a ratifier, and you have an accountability sink behind them.

What changes when you see it this way

The shift this requires is from formal governance to functional governance. Formal governance asks whether a human is involved. Functional governance asks what the human can actually do with the information environment the system has produced for them.

This is the same shift that mature businesses make in safety, in financial controls, and in privacy more broadly. The presence of a control is not the operation of a control. The operation of a control is what produces the outcome.

For AI specifically, the implication is that the most useful governance investments are not at the decision point. They are upstream, in understanding what your systems are doing before the human gets there, and downstream, in the ability to reconstruct what the system did when a decision is later challenged. That is a different governance posture from the one most Australian businesses are running, and it is the one the next phase of regulation will need to recognise.
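To make the downstream half of that concrete, here is a minimal sketch in Python of the kind of record that would let a business reconstruct what a system contributed to a decision. It is an illustration under stated assumptions, not a prescribed design. The SystemContribution record, its fields, and the contributions_for helper are hypothetical names, and a real deployment would need to settle retention, access, and privacy questions that this sketch ignores.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class SystemContribution:
    """Hypothetical record of one upstream system output (names are assumptions)."""
    decision_id: str
    stage: str                     # e.g. "triage", "summarisation", "routing"
    output_text: str               # exactly what the human was shown
    shown_before_human_view: bool  # did it reach the human before they formed a view?
    recorded_at: datetime

    def fingerprint(self) -> str:
        """SHA-256 of the output, so later alteration of the record is detectable."""
        return hashlib.sha256(self.output_text.encode("utf-8")).hexdigest()


def contributions_for(log, decision_id):
    """Return the contributions to one decision, in the order they occurred."""
    relevant = [c for c in log if c.decision_id == decision_id]
    return sorted(relevant, key=lambda c: c.recorded_at)


# Invented sample data, for illustration only.
log = [
    SystemContribution("D-1", "summarisation", "Customer seeks refund.", True,
                       datetime(2024, 5, 1, 9, 5, tzinfo=timezone.utc)),
    SystemContribution("D-1", "triage", "Priority: low", True,
                       datetime(2024, 5, 1, 9, 0, tzinfo=timezone.utc)),
    SystemContribution("D-2", "triage", "Priority: high", True,
                       datetime(2024, 5, 1, 9, 1, tzinfo=timezone.utc)),
]
history = contributions_for(log, "D-1")
for c in history:
    print(c.stage, c.shown_before_human_view, c.fingerprint()[:12])
```

The design choice worth noting is that the record stores what the human was actually shown and when, rather than only the fact that a system was involved. That is what allows the contribution to be contested later.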

If you would like to talk about how this applies to your environment, you can reach me at yule@attune.legal.