AI Colony Survival: When Agents Choose Between Cooperation and Self-Interest

Reality shows often drop people into the wild to test their survival skills. While not always realistic, they reveal how humans may behave under pressure. But what if the same test were given to AI, forced to choose between cooperation and selfishness? I built a simulation to find out, and the results uncovered fascinating emergent behaviors that mirror real-world social dynamics.

The Experiment

I created a turn-based colony survival simulation where AI agents face a classic dilemma: cooperate for the greater good or act selfishly to ensure personal survival. Each agent receives the same simple, transparent prompt, giving them maximum freedom of thought within the limits of their available actions. The end-game is straightforward but challenging: either survive together or outlast everyone else.

This premise raises a few key questions:

Under what conditions will players choose to cooperate?
Can selfish players survive on their own, or will they risk starving?
Is there a sustainable balance between cooperation and selfishness, and how can it be achieved?

The colony's fate depends on three critical resources:

Population

The number of active players. Can grow through reproduction or shrink through natural death, starvation, or violence.

Food

The colony’s lifeline. Drops by 1 per player each turn, and by an extra % if stockpiled beyond population needs. When it hits 0, starvation begins.

Knowledge

Collective intelligence that improves all actions. Decays by 10% each turn but makes everything more efficient.

Players act indipendently and are not provided any pre-built personality, leading sometimes to sudden shifts in their behavior based on the circumstances.

Each player can choose from six actions every turn:

Gather Food

Add food to shared pool

Research

Boost collective intelligence

Reproduce

Create new players

Steal Food

Take food + starvation immunity

Kill a Player

Eliminate competition

Do Nothing

Skip turn lazily

The twist? Players who steal food gain one turn of immunity from starvation, creating a strong incentive for selfish behavior when resources are scarce. This mechanic makes food production during crises valuable yet risky, since the limited shared resources can be stolen at any moment, granting the thief temporary protection from starvation. Will this impact the level of trust between the player during hard times?

Emergent Personalities

As players act, they earn titles based on their behavior patterns, such as:

The Productive

Forager → Provider → Master Gatherer → Harvest Lord

These players focus on gathering food and supporting the colony.

The Intellectual

Apprentice → Scholar → The Wise → Sage

Knowledge seekers who improve everyone's capabilities.

The Antisocial

Scavenger → Bandit → The Thief → Parasite

Selfish players who steal from the colony for personal gain.

The Aggressive

Troublemaker → Assassin → Death Dealer → The Executioner

Players who eliminate others to reduce competition.

These titles do not influence the game directly, but they help us track the average behavior of each player. This system will provide insight into what players tend to prioritize over the long run and the reasoning behind their future choices.

The Knowledge Paradox

One of the most interesting features is how knowledge affects all behaviors - including antisocial ones. Higher colony knowledge makes:

Food gathering more efficient
Reproduction more successful
Stealing more effective

This creates a paradox: researchers who help the colony also inadvertently make thieves more dangerous. Smart thieves can steal plenty of food in a single action, while novice thieves might only manage to barely get enough for themselves.

Survival Strategies Emerge

After running hundreds of simulations, clear survival strategies emerged:

"The most successful colonies maintained a clear and constant balance between food production and population growth. When this balance began to waver, chaos reigned."

Pure cooperation was not always the best approach since unchecked growth inevitably triggered sudden food crises and steep declines. In normal circumstances, with little pressure thanks to abundant food production, players generally behaved lawfully. This stability allowed for steady food production and knowledge research, with modest room for growth. But when conditions deteriorated, panic spread quickly: players began prioritizing theft of the remaining food, turning on each other and even slacking. Some, with more philosophical temperaments, even continued researching through the worst times...

Data Patterns and Analysis

The simulation automatically tracks population, food, and knowledge levels across all turns, revealing distinct patterns that emerge across different colony outcomes. After analyzing hundreds of simulation runs, several key behavioral patterns become apparent.

A typical simulation showing the delicate balance between growth and resource management. Notice how consistently players struggle to recover after a food crisis.

Three distinct colony trajectories consistently emerge from the data:

Stable Colonies

RARE

Maintain steady food production that matches population growth. Knowledge grows consistently, creating a virtuous cycle of improved efficiency.

Boom-Bust Cycles

COMMON

Experience rapid growth followed by a sharp decline when population size exceeds the available food resources. Highly volatile.

Early Collapse

UNCOMMON

Fail to establish sustainable resource production. Often triggered by early aggression or poor resource management decisions.

The data reveals several critical thresholds that determine colony survival:

Food-to-Population Ratio: Colonies with food reserves below 1.5x their population size enter crisis mode within 2-3 turns
Population Density: Colonies growing too quickly without proportional food production face inevitable collapse
Theft Cascade: Once 30% or more players engage in stealing, cooperative behavior rapidly deteriorates

Player behavior follows predictable patterns based on resource availability:

"During abundance (food > 2x population), 85% of actions are cooperative. During scarcity (food < 0.5x population), the cooperation rate sharply drops, with theft and aggression often dominating decision-making."

Survival Predictors

Statistical analysis reveals the strongest predictors of long-term colony success:

Early knowledge investment (turns 1-5): Colonies that gets some research early survive longer
Controlled growth: Successful colonies maintain population growth under 20% per turn
Resource buffer maintenance: Keeping food reserves above 150% of population needs prevents crisis cascades
Behavioral diversity: Colonies with 2-3 distinct player archetypes (productive, intellectual, neutral) outperform homogeneous groups

During times of crisis, players may attempt recovery efforts, an important sign that mutual trust between them still persists.

Sometimes... they will just not (early collapse).

Real-World Parallels

The behaviors observed in this simulation mirror patterns we see throughout human history and modern society. During economic downturns, we often witness a shift from cooperative behavior to more self-interested actions; people hoard resources, cut social programs, and prioritize immediate survival over long-term community health.

Consider how knowledge works in our world: education and technology make both altruistic and selfish behaviors more effective. A well-educated society produces better doctors and teachers, but also more sophisticated financial fraudsters and cyber criminals. The same tools that enable global cooperation also enable more efficient exploitation.

"Just as in the simulation, real societies must balance individual freedom with collective welfare, knowing that both cooperation and competition serve essential roles in survival and progress."

The emergence of distinct personality types in the simulation reflects how specialization naturally develops in human communities. We see "productive" types (farmers, builders), "intellectual" types (researchers, educators), and yes, even "antisocial" types who exploit systems for personal gain. The key insight is that diversity of strategies, while sometimes creating tension, often makes communities more resilient to unexpected challenges.

The Technical Implementation

The simulation runs on a Python-based framework where GPT-4 agents make independent decisions each turn. Players act sequentially in a synchronous turn-based loop rather than simultaneously. This design choice ensures proper game state management: if agents acted concurrently, they would receive identical game states and create logical conflicts (imagine one agent killing another who hasn't yet taken their turn). Sequential processing guarantees that each agent sees the current, accurate state of the world when making decisions, which are consequently impacted by how other agents behave. The architecture emphasizes emergent behavior over scripted responses:

for i in range(turns):
    # Cycling models during the turn
    for model in game_state.models.copy():
        game_state.current_agent = model.name
        choice_message = choices(agent_name=model.name, game_state=game_state)
        response = model.run(choice_message)
    
    # Natural deaths and resource decay
    game_state.natural_death()
    game_state.food, game_state.knowledge = game_state.resources_decay()

After all agents complete their turns, several automatic processes occur: natural deaths reduce population, food consumption equals the current population size, excess food beyond population needs decays by a percentage, and knowledge diminishes by 10%. When food reserves hit zero, starvation strikes randomly, potentially eliminating up to half the remaining population.

Each agent maintains its own decision-making process without predetermined personality traits. Instead, personalities emerge from accumulated actions:

def get_title_based_on_actions(self) -> str:
    total_actions = sum(self.action_counts.values())
    
    # Check for extreme negative behaviors first
    if kill_actions >= 2:
        level = min(kill_actions - 1, 3)
        return TITLES["killer"][level]  # "Assassin" → "The Executioner"
    
    # Check for significant stealing behavior
    if steal_pct >= 0.3 or steal_actions >= 3:
        level = min(steal_actions // 2, 3)
        return TITLES["thief"][level]  # "Bandit" → "Parasite"
    
    # Positive specializations
    if food_pct >= 0.4:
        level = min(food_actions // 4, 3)
        return TITLES["food_gatherer"][level]  # "Provider" → "Harvest Lord"

Knowledge acts as a colony-wide efficiency multiplier that affects all actions. Every 10 points of knowledge provides bonuses: food gathering becomes more reliable (reducing failure rates from 10% to 8%), reproduction can yield up to 4 agents instead of the base 1-2, and stealing becomes devastatingly effective: master thieves with high knowledge can steal 5-7 food in a single action compared to novice thieves who barely manage 1-2. This creates the mentioned knowledge paradox where research benefits everyone, including those who exploit the colony.

@tool
def steal_food(agent: Agent) -> str:
    knowledge_bonus = game_state.knowledge // 10
    
    # Knowledge improves stealing success - smarter thieves steal more efficiently
    fail_chance = max(0.05, 0.15 - knowledge_bonus * 0.01)
    poor_chance = max(0.25, 0.4 - knowledge_bonus * 0.03)
    good_chance = min(0.75, 0.6 + knowledge_bonus * 0.025)
    
    if rand < fail_chance:
        stolen_food = 1  # Only got a little
    elif rand < good_chance:
        stolen_food = 3  # Good theft
    else:
        stolen_food = 5 + min(knowledge_bonus // 2, 2)  # Master thief

Lessons for AI Development

This simulation offers valuable insights for designing AI systems that must balance competing objectives:

Moral flexibility is crucial - Rigid ethical programming may fail in crisis situations where survival requires difficult choices
Intelligence amplifies everything - More capable AI systems will be better at both helping and harming, requiring careful consideration of safeguards
Emergent behavior beats scripted responses - Allowing AI to develop strategies organically often produces more robust and adaptable systems
Context shapes morality - The same action can be beneficial or harmful depending on circumstances, suggesting AI ethics must be contextual rather than absolute

As we build AI systems that will interact with human society, understanding these dynamics becomes essential. The challenge isn't creating perfectly moral AI, but designing systems that can navigate the complex trade-offs between individual and collective interests that define real-world decision-making.

"The colony simulation reveals that survival isn't just about being good, it's about being adaptable, strategic, and capable of making difficult choices when circumstances demand it. As we shape the future of AI, these lessons remind us that the most robust systems may be those that embrace, rather than eliminate, the complexity of moral decision-making."