Who Is Responsible For Risk In AI Warfare?
2024-11-20


What Lavender, The IDF, Palantir and Amazon Tell Us About Risk, Responsibility and Power

A column of smoke resulting from the Israeli bombing of the Gaza Strip // Mohammed Ibrahim, March 5, 2023

The phrase “AI warfare” conjures images of killer robot dogs, not recommender systems, but our present reality resembles the latter more than the former. So what do Netflix recommendations and the military-industrial complex have in common? The answer lies in a conflict that has become a proving ground for AI tools - the war in Gaza.

The Israel Defense Forces (IDF) have been using two AI-powered tools - “The Gospel” and “Lavender” - to generate targets for their military operations and bombing raids. These tools essentially function as recommendation systems, suggesting military targets and assigning them strategic value. Data such as daily habits, locations, family members, and associates are processed to produce probabilistic lists of locations and people, often linking the two. Outputs include estimates of the likelihood of collateral damage relative to target value, and even suggested weapon types. Estimating the likelihood of collateral damage is crucial to these systems, because the IDF must demonstrate compliance with International Humanitarian Law (IHL) as it weighs the risk of collateral damage against military objectives. These calculations provide a useful veneer of objectivity for such justifications, especially as criticism and legal action mount. This raises an interesting question - can the risk of collateral damage be realistically quantified in warfare? How we answer it will have important implications as increasingly automated AI systems reach the battlefield.

How Is Risk Defined & Justified?

Ortwin Renn defines risk as “the possibility that human actions or events lead to consequences that affect aspects of what humans value.” Renn notes there are important questions embedded in this definition: 

  • What is a bad outcome (and who decides)?

  • How do we quantify the likelihood of a bad outcome?

  • How do we prioritize some outcomes over others?

In some circumstances we rely heavily on values to assess risk; in others, we prioritize numbers. The first and third questions are highly subjective - our values should drive how we decide what constitutes ‘bad’ and how we rank ‘bad’ against ‘worse’. In warfare, those values are embodied in IHL. The principle of proportionality in IHL prohibits civilian injury and death that is excessive in relation to the direct military advantage anticipated, and it relies on quantification to justify strikes. Interestingly, IHL thereby situates risk as a cost-benefit analysis - a framing which Renn notes privileges probabilities over values, and which is normally reserved for risk analysis with much lower stakes. It shifts the discussion toward a more technical, value-free understanding of risk, making quantification the central tool.
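To see why this framing privileges probabilities over values, it helps to write the cost-benefit test out as arithmetic. The sketch below is a hypothetical formalization for illustration only: the harm probabilities, harm weights, “excessiveness” threshold, and advantage term are assumed symbols, not terms drawn from IHL doctrine or from any actual targeting model.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% An illustrative formalization of proportionality as a cost-benefit test.
% Every symbol is an assumption made for the sake of the sketch, not a
% reconstruction of IHL doctrine or of any actual targeting model.
\[
  \underbrace{\textstyle\sum_{i} p_i \, c_i}_{\text{expected civilian harm}}
  \;\le\;
  \underbrace{\tau}_{\text{``excessiveness'' threshold}}
  \cdot
  \underbrace{A}_{\text{anticipated military advantage}}
\]

% $p_i$: estimated probability of harmful outcome $i$; $c_i$: the weight assigned
% to that outcome; $\tau$ and $A$: how much harm counts as excessive relative to
% the advantage claimed. The inequality reads as arithmetic, but choosing $c_i$,
% $\tau$, and $A$ is a value judgment; the numbers only make that judgment look
% objective.

\end{document}
```

Every quantity on both sides of that inequality is a value judgment dressed as a number - which is exactly what the next section explores.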

Quantifying Risks in a Networked Battlefield

As we’ve discussed, quantifying risk serves an important purpose in the case of collateral damage: justification in the face of international law and public perception. But values still shape target selection, and quantifying aspects of that selection does not change this. Lavender may estimate a 99% probability that a target is in an apartment building, but if the target is a child, how meaningful is that probability?

Lucy Suchman points out that a long-running project of the IDF has been to solve the problem of ‘situational awareness’ - the idea that soldiers (and militaries) operate in battle with incomplete or imperfect data. Their solution is a more fully networked battlefield - one fed with data from the “objectively existing world” that can be transformed directly into actionable, quantifiable outputs. Given Israel’s investment in AI infrastructure and its long history of surveillance in Gaza, the IDF’s models are well equipped to generate such outputs.

The only problem, Suchman notes, is that the methods of collection, storage, and transformation all occur in a subjective context. This means the outputs of “The Gospel” and “Lavender” are at the mercy of the biases intentionally and unintentionally imprinted into the data. Suchman elsewhere argues that these claims of accuracy “are based on a systematic conflation of the relation between weapon and its designated target on one hand, and the definition of what constitutes a (legitimate) target on the other.” In other words, these models may give precise probabilities that a target exists in space, but they cannot explain what constitutes a target in the first place, rendering the output functionally useless for justifying proportionality. In the case of the IDF, the quantified probabilities their models produce lend legitimacy to their decisions, obscure the values that went into the calculations, and bypass the values that IHL attempts to enforce.

Responsibility in the Face of Obscurity

So we have established that subjective data makes quantified risk questionable, yet these calculations are still treated as objective. What does this mean for the power dynamics of warfare? For the soldiers interacting with these systems, responsibility is almost entirely offloaded onto the “objectivity” of their models. Soldiers are technically in control, but they can effectively say “computer says yes” when handed a calculation and move on to the next target without assuming real responsibility.

So the model absorbs the responsibility - but what is the model? It consists of a variety of technologies that collect, store, transform, and operationalize data - all of which, as we have seen, bias outputs in various ways. Major elements are developed by the IDF, but significant input and infrastructure are provided by American tech companies. This is an international, multi-stakeholder, public-private ecosystem. So when soldiers offload their responsibility, where exactly does it land? On the IDF’s Unit 8200? Palantir? Amazon?

Sheila Jasanoff notes that, historically, the “[responsibility] for the management of complex technological systems is distributed in ways that limit accountability.” AI technologies are so entangled that responsibility is difficult to pin down, but this does not diminish their power. This obscurity becomes an important function of the system itself - a feature, not a bug. If we cannot point to the soldier, and we cannot define who is responsible for the model, then at both the lowest and the highest level there is limited accountability for what becomes of the model’s outputs. As we move toward a future in which AI is more deeply embedded in military decisions, this will not cut it. We need better criteria for assessing the accuracy of the risks AI models calculate, and for assigning responsibility for their outcomes.
