In this paper, I present a novel concern with standard total utilitarianism as a normative ethical theory, modify the theory to remedy the problem while preserving the theory’s virtues, and explore some implications of this modification. The concern with standard utilitarianism is that, since its utility function is a simple sum of the valence of every individual’s experiences considered in isolation, it may encourage sufficiently powerful agents to make individuals more similar for the sake of generating happiness more efficiently using limited resources, at the extreme recommending to superintelligent agents that they optimize away society as we know it. The modified “solution” theory, information utilitarianism, grounds itself in functionalism about the mind and applies information theory to mental contents in an attempt to justify a utilitarian calculus that devalues redundancy in happiness experiences both within and across individuals, thus encouraging diversity of experience among sentient beings. After laying out the modified theory, I explore its potential implications for ethics applied to areas like ecology and cross-cultural tolerance, as well as the proper goals of hypothetical superintelligent agents.
* * *
Utilitarianism, one of the most influential theories of ethics, holds that actions are morally right to the extent that they maximize happiness and minimize suffering for as many people as possible. In this paper, we will discuss the most standard form of utilitarianism, formalized in the most straightforward way, a view which we might call expected total act-utilitarianism (henceforth, just “utilitarianism”). This view uses a utility function to assign numerical utility values to all possible states of the universe, and holds that the goal of all moral action is to maximize the value of this utility function in expectation – i.e., on average, considering uncertainty about which possible universe-state will result from an action given inherent randomness and/or the actor’s limited knowledge. No hard line is drawn between moral permissibility and impermissibility; actions are simply better the higher their expected outcome utility, and worse the lower their expected outcome utility.
So far, this description applies to a variety of consequentialist ethical theories, which differ in the specifics of the utility function that they seek to maximize. Utilitarianism’s utility function works like this: identify all individual sentient beings throughout the universe’s history, identify every positive or negative emotional experience they ever have, determine the degree or intensity of every positive experience (“happiness”) and sum them up across all individuals, determine the degree or intensity of every negative experience (“suffering”) and sum them all up, and subtract the total suffering from the total happiness (“net happiness”) to get the final utility value. In practice, a moral decision-maker is expected to restrict their analysis to the small subset of times and places that they expect an action to predictably affect at all, and, unless the number of affected experiences is tiny, to estimate net happiness effects based on generally observed patterns rather than attempt any situation-specific precise calculations.
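As a rough formalization of the utility function and decision rule just described, one might write the following. The notation is an illustrative assumption introduced here, not part of the theory as stated: $w$ is a possible universe-history, $S(w)$ is the set of sentient beings in it, $E_i^{+}(w)$ and $E_i^{-}(w)$ are the positive and negative experiences of being $i$, $I(e)$ is the intensity of experience $e$, and $P(w \mid a)$ is the agent's credence that history $w$ results from action $a$.

```latex
\begin{align*}
U(w) &= \sum_{i \in S(w)} \sum_{e \in E_i^{+}(w)} I(e)
      \;-\; \sum_{i \in S(w)} \sum_{e \in E_i^{-}(w)} I(e) \\
\mathbb{E}[U \mid a] &= \sum_{w} P(w \mid a)\, U(w)
\end{align*}
```

On this sketch, one action is better than another exactly when its value of $\mathbb{E}[U \mid a]$ is higher. Each experience enters the sum independently of every other, which is the feature the rest of the paper takes issue with.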
Since first being proposed in a recognizable form by Jeremy Bentham in the late 18th century, utilitarianism has been the subject of rich philosophical debates, and numerous criticisms have been leveled against it. In this paper, I will criticize utilitarianism in a way I have not seen it criticized before, and then propose a modification of the theory to address this criticism. Before I do so, however, I will explain why I find it worthwhile to attempt to modify utilitarianism rather than discard it outright, by describing some of the virtues that I think the theory has; in other words, why I want to make it work.
First, utilitarianism is parsimonious. It defines morality entirely in terms of one natural phenomenon — namely, the valence of emotional experience — and simply seeks to maximize the quantity of that phenomenon in expectation, which is mathematically straightforward and ‘natural’ as goals go. For scientifically minded philosophers who like naturalizing concepts, utilitarianism is one of the closest known approaches to a concept of morality grounded in objectivity, and hence a demonstration that “objective morality” is a coherent concept (which most ethicists believe and would very much like to demonstrate).
Second, utilitarianism is precise and universal, at least in principle. To the extent that emotional valence can be objectively defined by neuroscientists, psychologists, cognitive scientists, etc., there exists an unambiguous objective truth about the values of utilitarianism’s utility function, and hence about what it means to do the right or wrong thing according to utilitarianism. This means that utilitarianism provides advice about every possible decision by which sentient beings stand to be affected, and it’s hence applicable across space, time, culture, and even species. Utilitarians will in practice develop context-specific heuristics, or rules of thumb, about what sorts of actions and habits tend to lead to favorable or unfavorable consequences — i.e., about rights, duties, virtues, and vices — but because they’re merely generalizations, it’s okay if they sometimes conflict with one another, and the underlying maximization objective is always available as a fallback in case of conflict or a change of context.
Third, utilitarianism has a history of being associated with, and used to justify, political causes that most liberals and leftists view as worth pursuing, making it appealing for people of those political persuasions. For instance, John Stuart Mill, a 19th-century philosopher who remains arguably the most influential utilitarian, was also an influential liberal political theorist, advocating for freedom of speech and thought as well as equality for women and the abolition of slavery (though he favored “benevolent despotism” as a policy of the British Empire over outright decolonization, for which history looks on him less kindly). More recently, the utilitarian philosopher Peter Singer has become famous as a long-time anti-poverty and animal welfare activist. In general, recognizing that providing human beings with more and more material resources results in diminishing returns on happiness, utilitarianism tends to favor egalitarian distributions of fixed-size resource pools, aligning it with the broad pro-equality tendencies of the political left.
Finally, as discussed by philosopher Heather M. Roff in her 2020 paper “Expected Utilitarianism,” the most prominent contemporary approaches to crafting artificially intelligent decision-making systems naturally lend themselves to consequentialist ethics. For instance, reinforcement learning involves explicitly coding a utility function (usually called a reward function) into an AI, which then learns by experience in an environment how to behave in order to maximize that function. If consequentialist theories like utilitarianism are more suited than other normative ethical theories to guiding AI behavior, then they are of particular importance in the modern world, and so is the question of whether and how we can develop correct and justified forms of them. Indeed, if these consequentialist approaches to AI continue to be fruitful, they may one day become the basis of artificial general intelligence (AGI): AI that replicates the general-purpose problem-solving ability of humans. If indeed feasible, AGI stands to be one of the most powerful and dangerous technologies ever, due to its ability to match human intelligence and even exceed it (either through deliberate design by humans, or through an AGI “continuing our research” and learning to improve itself) without necessarily acting on human-like values or having any regard for human well-being. The problem of designing “friendly AI” whose programmed decision-making behavior leads it to help humans instead of harming us is difficult even on a conceptual level, as well as of paramount importance should AGI become a reality, and that importance bleeds into the realm of consequentialist theories and their utility functions.
The task of accommodating diversity within society — of culture, lifestyle, physical and cognitive abilities and needs, and so on — is non-trivial, and imposes certain costs from a utilitarian perspective (leaving aside for the moment whether diversity is a net utilitarian cost). For instance, people who are able to hear can communicate through spoken language, while people who lack hearing largely cannot, and must use sign language or text instead. If hearing people and deaf people coexist in a society, how should the society be structured to accommodate them both? Should everyone, even hearing people, eschew spoken language in favor of sign language, which everyone can learn to use? But spoken language has some advantages over sign language, such as the ability to communicate with someone even when you’re not looking at each other, so this creates an inconvenience for hearing people that they wouldn’t experience otherwise. Perhaps, then, deaf people should use sign language and hearing people should use spoken language (as is generally the case in real life) — but now it’s harder for deaf and hearing people to communicate, and a lot of spaces are inaccessible to deaf people if they’re in the minority. Or perhaps everyone should learn sign language, but hearing people should be allowed to learn spoken language too — except that hearing people once again experience an additional inconvenience, needing to learn two languages instead of one if they want the advantages of using spoken language with other hearing people. If society consisted of only deaf people or only hearing people, we wouldn’t have to worry about such costly trade-offs and the subtle negative effects the inevitable inconveniences have on total net happiness.
Similar trade-offs can be described for accommodation of cultural diversity, neurodiversity, other forms of physical diversity, etc. If a utilitarian were to judge the resulting inconveniences to be worth working to avoid, then, they might slide into approval of fundamentalist-esque cultural perfectionism and/or eugenics as means of curtailing diversity. Of course, if they were a good utilitarian, they would also take into account the great harm that such practices have demonstrably inflicted throughout history so far. That, however, might just lead them to seek the development of e.g. “less harmful” forms of eugenics, underpinned by social engineering, gene editing technologies, and a careful utilitarian calculus rather than forced sterilization and irrational bigotry, but with the same end goal of making people more similar.
However, it is the position of almost all liberals and leftists — including myself — that the value of diversity is well worth the inconveniences it creates. Indeed, the utilitarian John Stuart Mill argues in his 1859 book On Liberty that diversity of opinion benefits society by generating debate that in the long run helps society improve and maintain its collective understanding of the world, in a “marketplace of ideas” fashion (though he himself does not use that term). Likewise, Mill argues that diversity of lifestyle benefits society by allowing people to collectively experiment and uncover broadly applicable knowledge of how best to live. Mill also argues that people naturally differ in what makes them happy, and thus the maximization of happiness for all requires allowing people to pursue a diversity of ends — though our hypothetical eugenicist utilitarian already has a “solution” for this “problem” of natural diversity.
Mill’s points about the epistemological benefits of diversity for society carry significant weight from a utilitarian standpoint — as does the observation that novelty is often a key component of a fulfilling human life, and novelty is harder to come by in a world where everyone is alike. As such, in circumstances like the ones humanity has faced so far, perhaps the utilitarian thing to do is to find a natural balance between the advantages of diversity and the advantages of homogeneity – and this sounds reasonable, at least when put that broadly. I imagine most people would agree that there is such a thing as too much diversity within a society; for an extreme example, imagine trying to build a society in which humans and polar bears live as next-door neighbors. The two species are simply too different to belong together.
However (and this may be the core of the problem), the possibility of superhuman AGI and other advanced information technology threatens to undermine the benefits of diversity. What happens when there exists an automated system that can get at the truth at least as well as we can without needing us to debate? What happens when there exists an automated system capable of studying us and figuring out how we should live better than we can, or of procedurally generating pleasurable novelty for us better than we can for each other? Wouldn’t it be most efficient from a utilitarian perspective to make us all alike, so it’s easier for this superintelligent agent to generalize about us and what makes us happy, and the energy and other resources that would otherwise go toward understanding and accommodating our diversity can instead go toward sustaining more of us, and hence generating more total happiness? At the limit, if this superintelligence were a true utilitarian, might it see fit to directly control and fully homogenize both the creation of sentient beings and their every experience? Might it devise an optimally happy life for an optimally happy sentient being, and use all available resources in the universe to make copies of that one sentient being, all living that same life by way of direct sensory stimulation as in Robert Nozick’s “experience machine”? This is an extreme situation, but utilitarianism seeks to maximize its utility function and nothing else, and no method that increases expected net happiness is too extreme to be worth bothering with. The practical feasibility of utilitarian AGI should not be all that stands between us and a world anything like the one described, in which society as we know it and all of the rich complexity found in human life is optimized away for the sake of industrial mass production of happiness.
The modification I propose to utilitarianism to remedy this problem has its grounding in a certain metaphysics of mind, which I will describe here. This metaphysics, known as functionalism, has enjoyed considerable popularity in philosophy and cognitive science in recent decades. Functionalism, as laid out in texts like Jerry Fodor’s 1981 paper “The Mind-Body Problem,” developed as an attempt to understand the nature of minds and mental contents in a way that is compatible with scientific physicalism and yet accounts for the similarities between minds despite the differences in their physical constitution. The nervous system of an octopus, for example, is quite unlike that of a human: an octopus’s central brain is torus-shaped, and the majority of its neurons are found in its arms rather than its brain. Despite this, an octopus exhibits aversive reactions to visual and tactile stimuli that threaten physical harm to it; one is tempted to say it feels pain and fear, and indeed such concepts are useful for describing its behavior. But what are the phenomena of pain and fear if both humans and octopuses manifest them, and yet their characteristic behaviors are almost certainly not caused by the same specific physical mechanisms – the same neurotransmitters, arrangements of neural connections, or what-have-you – in both creatures? Functionalism would answer that pain and fear are their characteristic behaviors; that is, they are defined by the way they causally interact with sensory input, motor output, and other mental phenomena, on an abstract level of description that does not make reference to the details of biology or neurochemistry. The same is true of all other mental phenomena, from particular experiences of sights and sounds to, yes, “happiness” and “suffering.” All that varies is the details of the causal-relation description, including the description’s level of specificity.
Functionalism has a few major implications, one of the most famous being that any mental phenomenon can be genuinely manifested by any physical substrate capable of implementing its characteristic causal relations — even a computer! — and hence AI that replicates any and all human cognitive abilities is possible in principle. For our purposes, however, the most important implication of functionalism is that happiness and suffering, and every possible specific experience of either, are fundamentally descriptions, or in other words, information. Experiences with emotional valence are information about what emotions are being felt and what other mental contents (perceptions, beliefs, etc.) they are being felt about. And if they are information, they can be quantified in all of the ways that information can.
There exist multiple ways of quantifying information that are appropriate in different contexts. I will not attempt here to specify the exact quantification scheme that is most appropriate for use in my proposed modification to utilitarianism, but I will discuss one scheme, Kolmogorov complexity, that I believe is at least in the neighborhood of what my theory is looking for, as a demonstration that it likely can be fully formalized even if I cannot do so here. In brief, Kolmogorov complexity, a concept from algorithmic information theory, is the length of the shortest computer program in a predetermined programming language that generates a given string of information as output; one might say, the length of the shortest possible description of a given description. Kolmogorov complexity can be applied to a set of multiple strings instead of a single string, and in this case the following property holds: the joint complexity of the set is no greater than the sum of the complexities of its members (up to a small additive overhead term), and it may be much less than that sum. If it is less, it is because the members contain some mutual redundancy; they are similar to each other in a way that a single description can take advantage of in order to describe them all without needing to describe each one fully independently. (For instance, the text of Shakespeare’s Hamlet contains a certain amount of information, and [the text of Shakespeare’s Hamlet, but where every mention of “Hamlet” is replaced with “Bob”] contains about the same amount. However, a complete description of both Hamlet and Bob-Hamlet may contain only a little more information than one of them individually; for instance, I can just describe Bob-Hamlet and then tell you that Hamlet is the same as Bob-Hamlet with every mention of “Bob” replaced with “Hamlet”.)
Kolmogorov complexity is closely related to the more widely applied concept of entropy in information theory; for instance, for dynamical systems with states that evolve over time, the Kolmogorov complexity of the trajectories that describe this evolution is almost always equal to the entropy of the dynamical system (Galatolo et al., 2010).
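The Hamlet and Bob-Hamlet example can be made concrete with a small experiment. Kolmogorov complexity itself is uncomputable, so the sketch below substitutes compressed length under zlib, which is only a crude upper-bound proxy; that substitution is an assumption of this illustration, not part of the theory. The "play" is synthetic random text with the name "Hamlet" inserted at intervals, and the function names `compressed_size` and `make_play` are hypothetical helpers introduced for the example.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Bytes needed by zlib at maximum effort: a rough, computable
    stand-in for Kolmogorov complexity (assumption of this sketch)."""
    return len(zlib.compress(data, 9))


def make_play(num_words: int = 2000, seed: int = 0) -> bytes:
    """Build a synthetic stand-in for a play: random lowercase words,
    with the name 'Hamlet' placed at every 50th word position."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    words = []
    for i in range(num_words):
        if i % 50 == 0:
            words.append("Hamlet")
        else:
            length = rng.randint(3, 8)
            words.append("".join(rng.choice(letters) for _ in range(length)))
    return " ".join(words).encode()


hamlet = make_play()
bob_hamlet = hamlet.replace(b"Hamlet", b"Bob")

size_hamlet = compressed_size(hamlet)
size_bob = compressed_size(bob_hamlet)
# Joint description: compress both texts together so that the second
# can be encoded largely as references back to the first.
size_joint = compressed_size(hamlet + bob_hamlet)
shared_information = size_hamlet + size_bob - size_joint

if __name__ == "__main__":
    print("Hamlet alone:     ", size_hamlet)
    print("Bob-Hamlet alone: ", size_bob)
    print("Both together:    ", size_joint)
    print("Shared (redundant):", shared_information)
```

The two texts compress to roughly the same size on their own, while the pair together costs far less than the two sizes added up. The difference is a rough measure of their mutual redundancy.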
If we grant functionalism about mental contents, then we can measure the amount of information in any experience of happiness or suffering using a metric akin to Kolmogorov complexity — and furthermore, we can measure the joint information content in multiple experiences of happiness or suffering. Indeed, if we believe that such experiences fundamentally consist of information, then in some sense the most natural way to “add them up” is to calculate their joint information content, rather than to independently calculate and then sum up their intensities as standard utilitarianism does. Using joint information content completely in place of the intensity sum seems to contain an implicit claim that the intensity of an emotional experience is equal to, or should be defined as, the amount of information it carries. This is parsimonious in that it defines one hitherto-vaguely-defined concept in terms of another, more formalized concept, and it could be taken to provide justification for the idea expressed by Mill and some other utilitarians that “higher,” more sophisticated pleasures are worth more than base pleasures. Maybe what it means for a pleasure to be more sophisticated is for it to carry more information; to have higher complexity, perhaps as in Kolmogorov complexity. If we accept this identity claim, then perhaps what we should be doing to maximize happiness is to maximize the joint information content of every happy experience throughout the universe’s history. If we don’t trust the identity claim, however, perhaps we should add the intensities of all happy experiences together, divide that by the sum of their individual information contents to get a sort of intensity-based scaling factor (that’s conveniently equal to 1 if the identity claim is true), and maximize the product of that scaling factor with the experiences’ joint information content. 
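The two candidate utility functions described above can be written out explicitly. The notation is an illustrative assumption rather than part of the theory as stated: $H$ is the set of all happy experiences in a universe-history, $I(e)$ is the intensity of experience $e$, $K(e)$ is its individual information content, and $K(H)$ is the joint information content of the whole set.

```latex
\begin{align*}
U_{\text{identity}}(H) &= K(H) \\
s(H) &= \frac{\sum_{e \in H} I(e)}{\sum_{e \in H} K(e)} \\
U_{\text{scaled}}(H) &= s(H) \cdot K(H)
\end{align*}
```

If $I(e) = K(e)$ for every experience, then $s(H) = 1$ and the two versions coincide. Rearranged, $U_{\text{scaled}}(H) = \bigl(\sum_{e \in H} I(e)\bigr) \cdot K(H) / \sum_{e \in H} K(e)$, so the scaled version can be read as the standard intensity sum discounted by the fraction of the experiences' information that is not redundant.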
Details such as these are worth working out for the sake of the overall plausibility of this proposed normative ethical theory, which I will call information utilitarianism, but for our purposes what matters is that the theory’s utility function scales with joint information content rather than being a simple sum, and so redundancy in happy experiences is penalized as compared to standard utilitarianism.
The most important implication of information utilitarianism, which is both the reason I have proposed it and one of the most compelling reasons to accept it, is that it does not highly approve of the homogenization of sentient beings and their experiences that may be recommended by standard utilitarianism. After all, a happy experience added to the universe’s history contributes less to the joint information content of all happy experiences the more similar it is to existing experiences, and genuinely identical happy experiences like those in the previously described “industrial happiness production” nightmare scenario contribute almost nothing. Hence, diversity of happy experiences is a key consideration for anyone seeking to maximize the information-utilitarian utility function. In a way, information utilitarianism places an inherent value on diversity that standard utilitarianism does not, while simultaneously preserving the precision and completeness of standard utilitarianism, as well as most of its parsimony (and arguably all of it if we accept the intensity-information identity claim). This value placed on diversity may even increase information utilitarianism’s appeal as compared to standard utilitarianism for the liberals and leftists who are already disproportionately attracted to the latter.
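A toy calculation shows how differently the two theories treat the "industrial happiness production" scenario. This is a minimal sketch under strong assumptions: each experience is represented as a byte string paired with an intensity, and zlib-compressed length stands in for information content, which is only a rough proxy for Kolmogorov complexity. The function names and the representation of experiences are hypothetical, and the utility used is the intensity-scaled variant described earlier.

```python
import random
import zlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def info(data: bytes) -> int:
    """Compressed length in bytes: a crude proxy for information content."""
    return len(zlib.compress(data, 9))


def random_description(rng: random.Random, length: int = 200) -> bytes:
    """A synthetic 'experience description' made of random letters."""
    return "".join(rng.choice(ALPHABET) for _ in range(length)).encode()


def standard_utility(experiences) -> float:
    """Standard utilitarian sum of intensities (happy experiences only)."""
    return sum(intensity for _, intensity in experiences)


def information_utility(experiences) -> float:
    """Intensity-scaled joint information content of happy experiences."""
    if not experiences:
        return 0.0
    descriptions = [description for description, _ in experiences]
    total_intensity = sum(intensity for _, intensity in experiences)
    summed_info = sum(info(d) for d in descriptions)
    joint_info = info(b"\n".join(descriptions))
    scaling = total_intensity / summed_info
    return scaling * joint_info


rng = random.Random(1)
# Twenty different happy experiences, each with intensity 1.0.
diverse = [(random_description(rng), 1.0) for _ in range(20)]
# Twenty exact copies of one happy experience, each with intensity 1.0.
copies = [(diverse[0][0], 1.0) for _ in range(20)]

if __name__ == "__main__":
    print("standard, diverse:   ", standard_utility(diverse))
    print("standard, copies:    ", standard_utility(copies))
    print("information, diverse:", information_utility(diverse))
    print("information, copies: ", information_utility(copies))
```

Standard utilitarianism scores the two populations identically, since the intensities are the same. The information-based score stays close to the standard one for the diverse population but collapses for the copies, because almost all of their content is redundant.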
How does information utilitarianism recommend we act in practice in the short term, in our current environment with our current technologies? We should strive to increase happiness and reduce suffering, of course, but we should also foster cultural diversity by preserving endangered cultures and allowing existing cultures to interact and cross-pollinate, leading to diversification via “recombination” of cultural elements — as long as there isn’t a huge power imbalance in the interaction that leads to more homogenization than diversification. Also, since utilitarianism (in both standard and information forms) is supposed to factor in the experiences of all sentient beings, information utilitarianism may recommend going out of our way to preserve our planet’s ecosystems and with them biodiversity, particularly of our fellow sentient animal species. The single utility function of information utilitarianism is in principle able to naturally balance these at-least-occasionally competing directives, though in practice it’s impossible to compute exactly and still difficult to compute approximately (probably even more than its standard-utilitarian counterpart, admittedly, since joint information content is much more complicated than summation). As such, similarly to standard utilitarianism, heuristics such as rights and duties would need to be developed on an information-utilitarian basis, using observation combined with estimates of both the intensity of subjects’ experiences and their similarity to each other.
What about in the long term, with hypothetical future technologies, where standard utilitarianism’s nightmare scenario appears? What would an information-utilitarian superintelligent AI do? Such an entity would still be a maximizer of a utility function with no upper bound, and as such it would be motivated to make efficient use of all resources it could access. For it, however, “efficient use” would probably mean nurturing and even creating a vast diversity of sentient beings, placing them in environments conducive to their happiness, and configuring the whole system as much as possible to be naturally ever-changing yet requiring little intervention from the superintelligence to remain stable and collectively happy, so the superintelligence doesn’t need to spend many resources on micromanagement. I imagine that such an entity would develop its behavioral heuristics in such a way that any existing species that’s capable of being meaningfully happy in the right environment is given such an environment, rather than being eradicated or altered beyond recognition for being suboptimally happy in isolation. After all, if you’ve got a perfectly good species already there, why go through all the trouble of destroying it and designing a new one to replace it? As such, it’s likely that humanity, our diverse societies intact, and most if not all of the other sentient species on Earth would have a place in this hypothetical far future, as would any sentient species that the hypothetical superintelligence comes across in its quest to make use of resources beyond Earth.
Information utilitarianism also has potential downsides, to be sure. Besides the fact that it requires the acceptance of functionalism while standard utilitarianism does not, the formalizations of it provided in this paper apply the same penalization of redundancy to the badness of suffering as to the goodness of happiness, which if looked at in the right way can be intuitively unpalatable — if I’m torturing both you and another person, should it make you feel any better to learn that the other person is being tortured in a very similar way to you? There are likely also other thought experiments specific to information utilitarianism that reveal unintuitiveness in its conclusions, much as there are with standard utilitarianism — and, of course, information utilitarianism is vulnerable to some of the same critiques as standard utilitarianism, even if the problem presented in this paper is exclusive to the latter. However, in-depth exploration of information utilitarianism’s vulnerabilities will have to be left to future papers.
Fodor, J. A. (1981). The mind-body problem. Scientific American, 244(1), 114–123. https://doi.org/10.1038/scientificamerican0181-114
Galatolo, S., Hoyrup, M., & Rojas, C. (2010). Effective symbolic dynamics, random points, statistical behavior, complexity and entropy. Information and Computation, 208(1), 23–41. arXiv:0801.0209v2
Mill, J. S. (1859). On Liberty. Urbana, Illinois: Project Gutenberg. Retrieved May 10, 2021, from https://www.gutenberg.org/files/34901/34901-h/34901-h.htm
Roff, H. M. (2020). Expected Utilitarianism. arXiv:2008.07321v1
* Alex Heyman is a former C4E Undergraduate Fellow (2020-21) and a recent graduate from U of T (Class of 2021) with a BSc in Computer Science and Cognitive Science, plus a Philosophy minor. They have been interested in philosophy since being introduced to it by their parents in middle school, with chief interests including ethics, metaphysics, epistemology, and philosophy of mind. They take a consequentialist approach to ethics, and hope to help integrate ethical and safety concerns into the field of AI research in their future career. Outside of academia, they are an amateur writer of both fiction and non-fiction, as well as a hobbyist designer and programmer of retro-style video games.
 On an abstract mathematical level, it’s easiest to think of universe-states as stretching across time: i.e., they’re not snapshots of the universe, but entire possible histories of the universe. Hence, calculating an action’s expected outcome utility involves factoring in one’s uncertainty about the universe’s entire future history, and even one’s uncertainty about the past. In practice, utility-maximizing decision procedures break time up into steps or chunks, stop looking backward once the past is estimated to no longer be relevant – if it’s relevant at all – and stop looking forward when the future is estimated to be too uncertain to be worth predicting.
 There exist people who have only a small amount of hearing and fall into the gray area between these categories, but for purposes of clarity we’ll treat the distinction as binary here.
 As for how to maximize net happiness in this view: it’s not immediately obvious to me what the most natural thing is to do here, but perhaps we can simply maximize the joint information content of all happiness minus the joint information content of all suffering, calculated independently.