Skeptic's Play: August 2013

Thursday, August 29, 2013

Extortionate strategies in Prisoner's Dilemma

In an earlier post, I programmed a simple simulation of the evolution of prisoner's dilemma strategies. I promised to read a bit of the established literature on the subject, and follow up. In particular, I will write reports on the following two papers:

Press, W. & Dyson, F. J. Iterated Prisoners’ Dilemma contains strategies that dominate any evolutionary opponent. Proc. Natl Acad. Sci. USA 109, 10409–10413 (2012).

Adami, C. & Hintze, A. Evolutionary instability of zero-determinant strategies demonstrates that winning is not everything. Nat. Commun. 4, 2193 (2013).

The first paper demonstrates that in the Iterated Prisoner's Dilemma game, there exist extortionate strategies. If you use such a strategy, your opponent can slightly improve their own score, but only by greatly benefiting you. While this seems like a pretty powerful strategy, the second paper demonstrates that it is not evolutionarily stable. This post will focus on the first paper.

The strategic landscape

In the previous post, I already explained the Iterated Prisoner's Dilemma. I proposed a player who is described by three numbers: x, y, and z. The three numbers describe the probability that the player will cooperate, given what their opponent did in the last round. This player is known as a "memory-one" player, because they only remember the outcome of one previous round. But I oversimplified it a bit. In this paper, a "memory-one" player doesn't just remember what their opponent did in the previous round, they also remember what they themselves did in the previous round. Thus you really need to describe the player with four numbers, each describing the probability of cooperating, given each of the four outcomes in the previous round.¹

You could also have a "memory-two" player who remembers the outcomes of the previous two rounds. Or a player of arbitrarily large memory. You'd think that more memory makes for a better player. However, the paper proves that there is a sense in which this is not true.

Suppose that player X is a memory-one player, and chooses some particular strategy. And suppose player Y is a memory-two player, who chooses the best strategy to respond to X's strategy. But it turns out that Y's best memory-two strategy fares no better against X than does the best memory-one strategy. Note that this is a precise statement, and it's not quite the same as saying that more memory is useless.² But it provides some justification for focusing solely on memory-one strategies.

In summary, each player is described by four probabilities. Player X is described by p1, p2, p3, and p4 (for the outcomes cc, cd, dc, and dd respectively, where c is cooperate, and d is defect). Player Y is described by q1, q2, q3, q4. Since these are all probabilities, they must be between zero and 1.

Zero-determinant strategies

From this point on, there will be a lot of math. Skip to "extortionate ZD strategies" to get past the math.

When I calculated scores in my simulation, I used what's called the Markov matrix. It looks like this:

Fig 2A of paper

The Markov matrix tells you how to transform the probabilities of one iteration to the probabilities of the next iteration. Each iteration is associated with four probabilities describing the likelihood of each of the four outcomes. The four probabilities can be written as four components of a vector. By multiplying the vector by the Markov matrix, you get the vector corresponding to the next iteration. By multiplying the Markov matrix many times, you can get the probabilities after many iterations.
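Here's a minimal Python sketch of that multiplication. It is not the paper's code: the names `markov_matrix` and `step` are made up for illustration, and I'm assuming the outcomes are listed in the order cc, cd, dc, dd with X's move first.

```python
def markov_matrix(p, q):
    """Build the 4x4 transition matrix for memory-one players X and Y.

    p and q hold each player's cooperation probabilities after cc, cd, dc, dd,
    written from that player's own point of view. Rows are the previous
    outcome and columns the next outcome, both from X's point of view.
    """
    # X's "cd" (X cooperated, Y defected) looks like "dc" to Y, so swap q2 and q3.
    q_seen = (q[0], q[2], q[1], q[3])
    return [
        [px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
        for px, qy in zip(p, q_seen)
    ]


def step(v, m):
    """Advance the outcome probabilities by one iteration (row vector times matrix)."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]


tit_for_tat = (1, 0, 1, 0)
always_defect = (0, 0, 0, 0)
m = markov_matrix(tit_for_tat, always_defect)

v = [1, 0, 0, 0]   # suppose the previous round was mutual cooperation
v = step(v, m)     # X cooperates again while Y defects: all weight moves to cd
v = step(v, m)     # X retaliates: all weight moves to dd
print(v)
```

Each row of the matrix sums to one, because from any outcome the next round has to land on exactly one of the four outcomes.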

But if you multiply the Markov matrix many many times, the outcome should approach some limiting state, called the stationary state.³ I didn't calculate the stationary state in my simulation, because I wasn't really sure how. But the authors of the paper are a little bit more knowledgeable, and can calculate the stationary state. In my simulation, I calculated the average outcome of the first 10 iterations, but here the authors effectively calculate the average outcome of an infinite number of iterations.
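One simple way to approximate the stationary state is to keep multiplying until the vector stops changing. This is only a stand-in, not what the paper does, and it assumes the well-behaved case where the probabilities actually settle down. The names `markov_matrix` and `stationary`, and the two example strategies, are hypothetical.

```python
def markov_matrix(p, q):
    """Transition matrix over outcomes cc, cd, dc, dd (X's move listed first)."""
    q_seen = (q[0], q[2], q[1], q[3])  # Y sees cd and dc swapped
    return [
        [px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
        for px, qy in zip(p, q_seen)
    ]


def stationary(p, q, steps=1000):
    """Approximate the stationary vector by repeated multiplication."""
    m = markov_matrix(p, q)
    v = [0.25, 0.25, 0.25, 0.25]  # any starting vector that sums to 1
    for _ in range(steps):
        v = [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
    return v


# Two arbitrary strategies, chosen only for illustration.
p = (0.9, 0.2, 0.7, 0.1)
q = (0.8, 0.3, 0.6, 0.2)
v = stationary(p, q)
print([round(x, 4) for x in v])
```

The defining property is that one more multiplication leaves `v` unchanged. For strategies made only of zeros and ones the vector can keep cycling instead, which is the pathological case from footnote 3.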

Once you calculate the stationary state, you just need to calculate the scores. To do this, you do a dot product between the probability vector and the score vector. The score vector is (3,0,5,1) for the first player and (3,5,0,1) for the second player. It turns out that this dot product is equivalent to the determinant of a special matrix:

Figure 2B of paper. v is the stationary vector, while f is the score vector.
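To check the claim numerically, the sketch below computes X's and Y's scores twice: once as a dot product with an approximate stationary vector, and once as a ratio of determinants. The matrix inside `D` is my transcription of the one in the figure, so treat its exact layout as an assumption; the helper names and the two example strategies are also made up.

```python
SX = (3, 0, 5, 1)  # X's payoff for outcomes cc, cd, dc, dd
SY = (3, 5, 0, 1)  # Y's payoff for the same outcomes
ONES = (1, 1, 1, 1)


def markov_matrix(p, q):
    q_seen = (q[0], q[2], q[1], q[3])  # Y sees cd and dc swapped
    return [[px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
            for px, qy in zip(p, q_seen)]


def stationary(p, q, steps=1000):
    """Approximate stationary vector by repeated multiplication."""
    m = markov_matrix(p, q)
    v = [0.25] * 4
    for _ in range(steps):
        v = [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
    return v


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def D(p, q, f):
    """Determinant whose 2nd column depends only on p and 3rd only on q."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return det([
        [-1 + p1 * q1, -1 + p1, -1 + q1, f[0]],
        [p2 * q3, -1 + p2, q3, f[1]],
        [p3 * q2, p3, -1 + q2, f[2]],
        [p4 * q4, p4, q4, f[3]],
    ])


p = (0.9, 0.2, 0.7, 0.1)  # arbitrary example strategies
q = (0.8, 0.3, 0.6, 0.2)

v = stationary(p, q)
dot_x = sum(a * b for a, b in zip(v, SX))
dot_y = sum(a * b for a, b in zip(v, SY))
det_x = D(p, q, SX) / D(p, q, ONES)  # dividing by D(..., ONES) normalizes v
det_y = D(p, q, SY) / D(p, q, ONES)
print(dot_x, det_x)
print(dot_y, det_y)
```

The division by `D(p, q, ONES)` is there because the determinant gives the dot product with an unnormalized stationary vector. It breaks down when that denominator is zero, which happens when the stationary state is not unique.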

Why is this matrix so important? Because player X has complete control over the second column, and player Y has complete control over the third column. This allows for "zero-determinant" strategies (henceforth ZD strategies), where the player forces the determinant of the above matrix to be zero. All you need to do is choose probabilities such that your column is equal to the last column, or to a constant multiple of it. If one column of a matrix is a multiple of another, its determinant is zero.
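A small exact check of the column trick, using Python's `fractions` so there is no rounding. Everything here is an illustrative assumption: the matrix layout in `D` is my transcription of the figure, and the example strategy (chosen so that X's column is one tenth of the vector S_X - S_Y) is hypothetical. A column that is a constant multiple of another is enough to zero the determinant.

```python
from fractions import Fraction as F

SX = (3, 0, 5, 1)  # X's payoff for outcomes cc, cd, dc, dd
SY = (3, 5, 0, 1)  # Y's payoff for the same outcomes


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def D(p, q, f):
    """Determinant whose 2nd column depends only on p and 3rd only on q."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return det([
        [-1 + p1 * q1, -1 + p1, -1 + q1, f[0]],
        [p2 * q3, -1 + p2, q3, f[1]],
        [p3 * q2, p3, -1 + q2, f[2]],
        [p4 * q4, p4, q4, f[3]],
    ])


# Last column: the difference of the two score vectors, (0, -5, 5, 0).
f = tuple(a - b for a, b in zip(SX, SY))

# Choose p so that X's column (p1 - 1, p2 - 1, p3, p4) equals phi * f.
phi = F(1, 10)
p = (1 + phi * f[0], 1 + phi * f[1], phi * f[2], phi * f[3])  # (1, 1/2, 1/2, 0)

opponents = [
    (1, 1, 1, 1),                            # always cooperate
    (0, 0, 0, 0),                            # always defect
    (F(9, 10), F(1, 2), F(3, 10), F(1, 5)),  # an arbitrary mixed strategy
]
values = [D(p, q, f) for q in opponents]
print(values)
```

Whatever Y puts in the third column, the determinant with last column S_X - S_Y comes out to zero, so X's score minus Y's score is pinned at zero.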

You might wonder, why would you ever want to set the determinant to zero? Doesn't that just set someone's score to zero? In fact, it does more than that. You could, for instance, set the sum of your score and the opponent's score to zero. Instead of using (3,0,5,1) or (3,5,0,1) for the score vector, all you have to do is use the sum of these two vectors, (6,5,5,2). In general, you can enforce equations of the following form:

Equation 7 from the paper: αs_X + βs_Y + γ = 0. s_X is X's score, and s_Y is Y's score, while α, β, and γ are parameters chosen by the player using the ZD strategy.

This equation should make you skeptical. Can X choose a strategy such that s_X - 10 = 0, thus causing their average score to be 10? Clearly this is impossible, since the maximum score is only 5. If you try such a ZD strategy, you must set p1, p2, p3, and p4 outside their allowed range between zero and one. This is the catch of the ZD strategy: not every ZD strategy is possible.
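Here's a rough way to see the catch in code. The sketch builds the strategy that would enforce αs_X + βs_Y + γ = 0 by making X's column a multiple φ of the last column, then checks whether the four probabilities land in the allowed range. The function names, the scale factor `phi`, and the grid of values scanned are all assumptions for illustration; a finite scan is a demonstration, not a proof.

```python
from fractions import Fraction as F

SX = (3, 0, 5, 1)  # X's payoff for outcomes cc, cd, dc, dd
SY = (3, 5, 0, 1)  # Y's payoff for the same outcomes


def zd_strategy(alpha, beta, gamma, phi):
    """Probabilities whose column (p1 - 1, p2 - 1, p3, p4) equals phi * f."""
    f = [alpha * sx + beta * sy + gamma for sx, sy in zip(SX, SY)]
    return (1 + phi * f[0], 1 + phi * f[1], phi * f[2], phi * f[3])


def is_valid(p):
    """True when every entry is a usable probability."""
    return all(0 <= x <= 1 for x in p)


def feasible_phis(alpha, beta, gamma, grid):
    """Scale factors from the grid that give a playable strategy."""
    return [phi for phi in grid
            if is_valid(zd_strategy(alpha, beta, gamma, phi))]


# phi = 0 is skipped: it makes the column all zeros, which enforces nothing.
grid = [F(k, 100) for k in range(-100, 101) if k != 0]

impossible = feasible_phis(1, 0, -10, grid)  # s_X - 10 = 0
extortion = feasible_phis(1, -6, 5, grid)    # (s_X - 1) = 6 * (s_Y - 1)
print(impossible)
print(extortion)
print(zd_strategy(1, -6, 5, F(1, 25)))
```

For s_X - 10 = 0 the reason is easy to read off: p3 = -5φ needs φ to be negative or zero, while p1 = 1 - 7φ needs it to be positive or zero. The second relation is the extortionate one discussed further down, and φ = 1/25 reproduces the strategy in footnote 4.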

Extortionate ZD strategies

So far, I've shown that a player using a ZD strategy can enforce a linear relationship between their own score and their opponent's score. However, not all linear relationships are possible. What linear relationships are possible?

One thing you can do is unilaterally choose your opponent's score. In a game where the outcome scores are 5/3/1/0, you may set your opponent's score anywhere between 1 and 3. However, you can't unilaterally choose your own score using ZD. After all, it would be a contradiction if both you and your opponent had complete control over your score.
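As a concrete check, the sketch below picks p1 and p4 freely and fills in p2 and p3 so that X's column is a multiple of S_Y minus a constant. The formulas in `pin_opponent` are what I get by working this out for the 5/3/1/0 payoffs; the function names, the matrix layout in `D`, and the example numbers are assumptions, not taken from the paper.

```python
from fractions import Fraction as F

SY = (3, 5, 0, 1)  # Y's payoff for outcomes cc, cd, dc, dd
ONES = (1, 1, 1, 1)


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j, a in enumerate(m[0]))


def D(p, q, f):
    """Determinant whose 2nd column depends only on p and 3rd only on q."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return det([
        [-1 + p1 * q1, -1 + p1, -1 + q1, f[0]],
        [p2 * q3, -1 + p2, q3, f[1]],
        [p3 * q2, p3, -1 + q2, f[2]],
        [p4 * q4, p4, q4, f[3]],
    ])


def pin_opponent(p1, p4):
    """Return (strategy, score X forces on Y) for payoffs 5/3/1/0.

    Not every (p1, p4) gives p2 and p3 inside [0, 1]; the caller must check.
    """
    p2 = 2 * p1 - 1 - p4
    p3 = (1 - p1 + 3 * p4) / 2
    target = (1 - p1 + 3 * p4) / (1 - p1 + p4)
    return (p1, p2, p3, p4), target


p, target = pin_opponent(F(2, 3), F(1, 3))  # gives (2/3, 0, 2/3, 1/3)

opponents = [
    (1, 1, 1, 1),                            # always cooperate
    (0, 0, 0, 0),                            # always defect
    (F(9, 10), F(1, 2), F(3, 10), F(1, 5)),  # an arbitrary mixed strategy
]
results = [D(p, q, SY) / D(p, q, ONES) for q in opponents]
print(p, target, results)
```

With these numbers Y's long-run score comes out the same against all three opponents, whether Y always cooperates, always defects, or does something in between.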

Another thing you can do is enforce a positive correlation between your score and your opponent's score. Thus, the more your opponent helps you, the better it is for your opponent. One such ZD strategy is known as the "tit for tat" strategy, where you simply copy the action of your opponent in the last iteration. In the tit for tat strategy, your opponent's average score will exactly equal your own average score.

But there are more extortionate ZD strategies, where you benefit far more than your opponent does. For example, player X could enforce the rule (s_X-1) = 6*(s_Y-1), meaning that if Y tries to improve their own score, then X's score improves six times more than Y's does.⁴ In general, you can be as extortionate as you want, although at some point your opponent loses incentive to bother trying for such meager rewards. Or if your opponent is smart, they may give up on those rewards to spite you.
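To see the rule in action, here is a quick numerical check using the strategy from footnote 4 against a handful of opponents. The stationary state is approximated by repeated multiplication, which is my shortcut rather than the paper's method, and the helper names and the list of opponents are made up for illustration.

```python
SX = (3, 0, 5, 1)  # X's payoff for outcomes cc, cd, dc, dd
SY = (3, 5, 0, 1)  # Y's payoff for the same outcomes


def markov_matrix(p, q):
    q_seen = (q[0], q[2], q[1], q[3])  # Y sees cd and dc swapped
    return [[px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
            for px, qy in zip(p, q_seen)]


def stationary(p, q, steps=2000):
    """Approximate stationary vector by repeated multiplication."""
    m = markov_matrix(p, q)
    v = [0.25] * 4
    for _ in range(steps):
        v = [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
    return v


def scores(p, q):
    """Long-run average scores (s_X, s_Y) as dot products."""
    v = stationary(p, q)
    return (sum(a * b for a, b in zip(v, SX)),
            sum(a * b for a, b in zip(v, SY)))


extort = (0.6, 0.0, 0.4, 0.0)  # (3/5, 0, 2/5, 0) from footnote 4

opponents = [
    (1.0, 1.0, 1.0, 1.0),  # always cooperate
    (0.0, 0.0, 0.0, 0.0),  # always defect
    (0.9, 0.5, 0.3, 0.2),  # arbitrary mixed strategies
    (0.5, 0.5, 0.5, 0.5),
]
for q in opponents:
    sx, sy = scores(extort, q)
    # The last column should be zero if (s_X - 1) = 6 * (s_Y - 1) holds.
    print(q, round(sx, 4), round(sy, 4), round((sx - 1) - 6 * (sy - 1), 9))
```

The opponent's choices move both scores up and down, but only along the line fixed by X.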

Consequences for evolution of altruism

It might seem like the existence of extortionate strategies is bad news for the evolution of altruism. In fact, it's more complicated than that. You can use extortionate strategies to hurt your opponent, but hurting your opponent doesn't necessarily help yourself. The mere existence of an extortionate strategy doesn't lead to selfishness. However, ZD-extortionate strategies have an additional property that I think was not properly emphasized in the paper. The more extortionate you are, the greater your own rewards when your opponent cooperates.

For example, the least extortionate strategy is the tit for tat strategy. If you persuade your opponent to fully cooperate, your average score will be 3. But if you instead use the extortionate strategy above, with (s_X-1) = 6*(s_Y-1), and if your opponent fully cooperates, then your average score will be 4. The problem with ZD strategies is not that they're extortionate, but that they benefit more extortionate players.
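The 3-versus-4 comparison can be worked out exactly. In the sketch below, `chi` is the extortion factor in (s_X - 1) = chi * (s_Y - 1), and the strategy is built the same way as the one in footnote 4, with the scale chosen so that p2 = 0. That construction, the function names, and the shortcut for a fully cooperating opponent are my own working for the 5/3/1/0 payoffs, so treat them as assumptions.

```python
from fractions import Fraction as F


def extortion_strategy(chi):
    """Strategy enforcing (s_X - 1) = chi * (s_Y - 1), scaled so that p2 = 0."""
    phi = F(1, 4 * chi + 1)
    return (1 - 2 * phi * (chi - 1),  # p1
            1 - phi * (4 * chi + 1),  # p2, zero by the choice of phi
            phi * (chi + 4),          # p3
            F(0))                     # p4


def scores_vs_cooperator(p):
    """Long-run (s_X, s_Y) when Y always cooperates.

    Only cc and dc can occur, and in the stationary state the flow between
    them balances: v_cc * (1 - p1) = v_dc * p3.
    """
    p1, _, p3, _ = p
    v_cc = p3 / (1 - p1 + p3)
    v_dc = 1 - v_cc
    return 3 * v_cc + 5 * v_dc, 3 * v_cc


for chi in (1, 2, 6, 20):
    p = extortion_strategy(chi)
    sx, sy = scores_vs_cooperator(p)
    print(chi, p, sx, sy)
```

With chi = 1 this reduces to tit for tat and both players get 3. With chi = 6 it is the footnote 4 strategy, and X gets 4 while Y gets 1.5. Under this construction X's score keeps rising with chi but levels off at 13/3, so there is a ceiling on how much extortion pays.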

So imagine that you are playing against an evolutionary opponent. You can choose a ZD-extortionate strategy, and watch your opponent evolve towards a more cooperative strategy. When they cooperate fully, you can use a more and more extortionate strategy to reap greater rewards (while also hurting your opponent).

Of course, in this picture, we're imagining that one player is intelligently choosing a ZD strategy, and the other player is blindly evolving. The paper also briefly discusses the case where both players are using ZD strategies, or if one player has a "theory of mind". I am a bit unsure about the applicability to evolution. Even the most intelligent agents are not going to choose a strategy based on its eventual evolutionary outcome, many generations down the road. If agents are not intelligent, I'm not sure if ZD would ever evolve. Does a ZD strategy really give you an advantage if you're mostly playing against siblings and cousins who are using similar strategies?

I suspect that the next paper, which shows that ZD strategies are not evolutionarily stable, will bring up one or more of these issues. I will read the paper and see.

-----------------------------------

1. Actually, a memory-one player requires five numbers. The first four numbers specify what to do in the case of the four possible outcomes of the previous game. The fifth number specifies what to do in the very first game. However, in the long run, the first game doesn't really matter (except in the scenario mentioned in footnote 3). And this paper uses some special calculations that don't use the outcome of the first iteration.

2. The key thing is that if the low-memory player chooses a strategy first, and the high-memory player responds, then there is no advantage to having high memory. However, if the high-memory player chooses a strategy first, and the low-memory player responds, there is a disadvantage to having low memory.

3. There's also the possibility that the game-state will circle around the stationary state without ever reaching it, and I think it's possible for there to be many stationary states. The authors don't really address these possibilities, but they're smart so I'm sure they know what they're doing. I think we can safely ignore these possibilities because they only occur for pathological cases.

4. To get this outcome, try (p1,p2,p3,p4)=(3/5,0,2/5,0). Tit for tat is (1,0,1,0).

This is part of a miniseries on the evolution of Prisoner's Dilemma strategies.
1. Evolution of prisoner's dilemma strategies
2. Extortionate strategies in Prisoner's dilemma
3. Prisoner's Dilemma and evolutionary stability
4. A modified prisoner's dilemma simulation

Monday, August 26, 2013

A relaxed community

This post was cross-posted on The Asexual Agenda.

I've seen many arguments over demisexuality over the years, and I've seen many people say that demisexuals aren't an oppressed group. Defenders of demisexuality generally use two strategies. The first is to talk about a few problems experienced by demisexuals. The second is to argue that even if demisexuals are not oppressed, it is still a useful label. The second strategy was seen in SlightlyMetaphysical's post earlier, but can be seen in much older articles as well.

------------------------------------------

At my last queer conference, I attended a caucus for queers in STEM (science, technology, engineering, mathematics). One person I talked to there just couldn't think of a single issue he personally dealt with as a queer in STEM. That's okay! There are some intersecting identities that just don't have that many issues, and there are some individuals who for whatever reason avoid the issues experienced by others in the same minority group. But a queer STEM community can still be useful, because it gathers people with similar interests. I think communities are great. Why do we even need an excuse to create one?

------------------------------------------

I'm trying to draw these threads together, and apply them to ace identity.

I know that it is distasteful to compare the marginalization experienced by different groups of people, because this often results in dismissing smaller problems, as if only large problems were worth solving. But we all basically know that aces tend to experience less marginalization than, say, transgender people. This clearly doesn't imply that we should ignore ace problems, but what does it imply?

I think it means we can relax a bit. We don't need to focus so much of our discussion on the problems we face, or how to solve those problems. We can also talk about our favorite cakes, exchange our favorite webcomics, geek out over our favorite sexuality models. We can discuss the ups and downs of social networks, organize outings, find partners.

We can spend some time helping other people, especially other sexual minorities. I've heard many a complaint from queer folk that "allies" are too demanding and unhelpful. It's an unfortunate consequence of allies not having much personal investment in queer issues. But if you're ace and identify as queer, the personal investment is there. You spent all that time researching and educating yourself about the ins and outs of asexuality; you can do the same for other sexual minorities. I consider it damn near a moral obligation.

Another advantage is that we can be ourselves! We often talk about the "gold star asexual", who has all their personal characteristics arranged in such a way that no one can deny their asexuality. We all feel pressure to conform to that gold star image, lest we cause people to doubt asexuality and hurt our fellow aces. But if we recognize that the stakes are low, that takes some of the load off. I'd like more people to recognize and affirm ace identity, but I feel confident that we will eventually achieve widespread public awareness. So while we're at it, we might as well be ourselves, and not pressure ourselves to conform to a perfect image.

Of course, I don't mean to dismiss or minimize the problems faced by many aces. Having your identity denied and erased is not a trivial thing. It leads to some very real consequences. But let's recognize that one of the negative consequences is that sometimes it makes us tense, as a community. It is okay to relax as well. There are situations where we as a community don't face many problems, and there are individuals who for whatever reason actually have it pretty good. When you have problems, recognize them and fight them! When you don't have problems, enjoy it for what it is.

Regardless of how marginalized we are, this community is still useful, and still meaningful. And the great thing is, when people do face problems, we can instantly convert to a support group as necessary.

Saturday, August 24, 2013

Evolution of prisoner's dilemma strategies

While I was in France, I couldn't spend all my time going to the beach and talking about physics. I mean, sometimes the beach had too many jellyfish. And we had no internet for a time.

Our arch-nemesis, the jellyfish

So I might have spent an hour or two programming an evolutionary simulation of Prisoner's Dilemma strategies. The inspiration was that my boyfriend described to me an awful article on BBC news which completely distorted a paper (update: corrected link) on evolutionarily stable strategies for the iterated prisoner's dilemma.

This is all based on multiple hearsay, since I have read neither the news article nor the paper. So my evolutionary simulation may be completely off the mark. It's okay, this is just for fun. Later I will look at the paper and compare what they did to what I did.

Introduction

I assume readers are familiar with the prisoner's dilemma. The prisoner's dilemma contains the basic problem of altruism. If we all cooperate, we all get better results. But as far as the individual is concerned, it's better not to cooperate (i.e., to defect). It gets more complicated when two players play multiple games of prisoner's dilemma in a row. In this case, even a purely selfish individual may wish to cooperate, or else in later iterations of the game, their opponent might punish them for defecting.

The iterated prisoner's dilemma can be analyzed to death with economics and philosophy. But here I am interested in analyzing it from an evolutionary perspective. We imagine that individuals, rather than intelligently choosing whether to cooperate or defect, blindly follow a strategy determined by their genes. I parametrize each person's genes with three numbers:

x: The probability of cooperating if in the previous iteration, the opponent cooperated.
y: The probability of cooperating if in the previous iteration, the opponent defected.
z: The probability of cooperating in the first iteration

I believe everything is better with visual representation, so here is a visual representation of the genespace:

Each square represents an individual. Each individual is described by three numbers x, y, and z, which are visually represented by the position and color of the square. The major strategies of the iterated prisoner's dilemma are associated with different quadrants of the space. In the upper right quadrant is the cooperative strategy. In the lower left quadrant is the defecting strategy. In the lower right is the "tit for tat" strategy, wherein you cooperate if your opponent cooperates, and defect if your opponent defects. The upper left quadrant... I don't think there's a name for it.

And of course, there are hybrid strategies. For example, between cooperation and "Tit for Tat", there's "Tit for Tat with forgiveness". In the evolutionary algorithm, the only way to get from one strategy to another is by accumulating mutations from generation to generation, so the hybrid strategies are important.

The simulation

I tried to perform the evolutionary simulation in the simplest way possible. Start with some number of individuals (here I use 20 individuals). Have each pair of individuals play an iterated prisoner's dilemma (here I used 10 iterations).¹ Afterwards, I total up all the scores, and remove the individual with the worst score while duplicating the one with the best score. Then every individual mutates by randomly making small changes in x, y, and z.
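The steps above can be sketched roughly as follows. This is a reconstruction, not the original program: the function names, the Gaussian mutation with size `sigma`, the random seed, and the way expected scores are computed in `play` are all assumptions. The payoffs are set to 4/3/1/0, the values this post ends up using.

```python
import random

T, R, P, S = 4, 3, 1, 0  # temptation, reward, punishment, sucker's payoff
ROUNDS = 10


def play(a, b, rounds=ROUNDS):
    """Expected total scores when genes a = (x, y, z) and b meet.

    No dice are rolled; the code tracks each player's chance of cooperating.
    The two moves in a round are treated as independent, which holds in this
    model because each player reacts only to the other's previous move.
    """
    ca, cb = a[2], b[2]  # z: chance of cooperating in the first round
    total_a = total_b = 0.0
    for _ in range(rounds):
        both, a_only = ca * cb, ca * (1 - cb)
        b_only, neither = (1 - ca) * cb, (1 - ca) * (1 - cb)
        total_a += R * both + S * a_only + T * b_only + P * neither
        total_b += R * both + T * a_only + S * b_only + P * neither
        # x applies if the opponent cooperated last round, y if they defected.
        ca, cb = (a[0] * cb + a[1] * (1 - cb),
                  b[0] * ca + b[1] * (1 - ca))
    return total_a, total_b


def mutate(gene, rng, sigma=0.02):
    """Nudge x, y, z by small random amounts, keeping them in [0, 1]."""
    return tuple(min(1.0, max(0.0, g + rng.gauss(0, sigma))) for g in gene)


def generation(pop, rng):
    """Every pair plays; the worst is replaced by the best; everyone mutates."""
    totals = [0.0] * len(pop)
    for i in range(len(pop)):
        for j in range(i + 1, len(pop)):
            si, sj = play(pop[i], pop[j])
            totals[i] += si
            totals[j] += sj
    worst = min(range(len(pop)), key=totals.__getitem__)
    best = max(range(len(pop)), key=totals.__getitem__)
    pop = list(pop)
    pop[worst] = pop[best]
    return [mutate(g, rng) for g in pop]


rng = random.Random(0)
pop = [(rng.random(), rng.random(), rng.random()) for _ in range(20)]
for _ in range(50):
    pop = generation(pop, rng)
print(sum(g[0] for g in pop) / len(pop), sum(g[1] for g in pop) / len(pop))
```

The final line prints the population's average x and y, which is enough to tell which quadrant of the genespace the population has drifted into.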

Upon testing, I found that this simulation favors defection far too much. The trouble is that it's really a zero sum game. There is no absolute standard for what sort of scores are good or bad, it's just a comparison between the scores of different individuals. Even if defecting doesn't benefit you directly, it's still beneficial to hurt your opponents so that you come out ahead.

I'm not really sure how to solve this problem, and I'm very interested to see how researchers solve it. I tried several things, and settled for a simple modification. Rather than playing against every other player, each individual will play against two random individuals.² It's still beneficial to hurt your opponents, but not that beneficial since you are only hurting a few of them. It's also possible that an individual will play against itself, and by defecting hurt itself.
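A small sketch of the modified matchmaking, mainly to check the bookkeeping. The name `draw_games` is made up, and I'm assuming opponents are drawn uniformly at random with repeats allowed, including the individual itself.

```python
import random
from collections import Counter


def draw_games(n, per_player=2, rng=random):
    """Each individual picks `per_player` opponents uniformly at random.

    An individual may draw itself, and may draw the same opponent twice.
    """
    return [(i, rng.randrange(n)) for i in range(n) for _ in range(per_player)]


rng = random.Random(1)
games = draw_games(20, rng=rng)

# Count how often each individual shows up, as chooser or as chosen.
appearances = Counter()
for chooser, chosen in games:
    appearances[chooser] += 1
    appearances[chosen] += 1

print(len(games), sum(appearances.values()) / 20)
```

Every individual appears at least twice, as the chooser, and the average number of appearances is exactly four when a game against yourself is counted as two appearances.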

I believe that researchers typically use the following payoffs: mutual cooperation is 3 points, mutual defection is 1 point, and if only one player defects then they get 5 points while the cooperating player gets 0. However, I wanted to find some interesting behavior, so I adjusted the parameters to better balance the cooperation and defection strategies. I eventually settled on 4/3/1/0 instead of 5/3/1/0.

Results

Interestingly, rather than there being a single preferred strategy, multiple strategies can be evolutionarily stable. Actually, this may not be so surprising, since I basically adjusted the parameters of the simulation until I saw interesting behavior. But it's still fun to watch.

Often the population will settle into the defection strategy and not move for a thousand generations. But sometimes there is a shift to a cooperative tit-for-tat hybrid strategy. This hybrid strategy is not very stable, and always seems to eventually fall back to the defection strategy. Here I show an example of this happening:

Why does this happen? I'm not quite sure. But I think that when everyone is defecting, there's no real disadvantage to taking a tit-for-tat strategy. So genetic drift will sometimes cause the population to shift towards tit-for-tat. Once there is a critical mass of tit-for-tat, it becomes advantageous to cooperate, since you will get back what you paid for. Soon everyone is cooperating or using tit-for-tat. But then, perhaps by genetic drift, we lose too many tit-for-tat individuals, and it becomes advantageous to defect again. And so begins the cycle again.

Here I also show the average scores of the population over time:

If the population is defecting, then the average score will be 1. If it is cooperating, the average score will be 3. Here you see a punctuated equilibrium behavior.

Of course, you cannot draw any real conclusions from this simulation, since I did a lot of fine-tuning of the parameters, and because I'm not really interacting with the established literature. The conclusion is that this is an interesting topic, and now I feel motivated to read some papers. When I do, I'll report back.

--------------------------------
1. Actually I don't have individuals really play an iterated prisoner's dilemma against each other. Instead, I calculate the average result of an iterated prisoner's dilemma between those two players. There is no chance involved in this calculation.
2. This means that each individual plays four games on average. Two games where the opponent is chosen randomly, and two games where that individual was chosen as the opponent.

This is part of a miniseries on the evolution of Prisoner's Dilemma strategies.
1. Evolution of prisoner's dilemma strategies
2. Extortionate strategies in Prisoner's dilemma
3. Prisoner's Dilemma and evolutionary stability
4. A modified prisoner's dilemma simulation

Thursday, August 22, 2013

Chelsea Manning

From The Guardian: Chelsea Manning (formerly Bradley) publicly stated that she is a woman, and intends to undergo hormone replacement therapy. This comes right after she was sentenced to 35 years in prison.

So I guess this settles the question of why Chelsea Manning was not identifying as transgender. I was never sure if it was because of the court case, because of personal reasons, or because she had decided that she was not mtf after all (instead being one of the many other things in the trans umbrella).

I'm interested to see how the Free Bradley Manning people will react. They march every year in SF Pride, because I guess they considered her a hero who was also a gay man. I'm vaguely optimistic that their next action will be to call for a presidential pardon of Chelsea Manning. It could be a good show of trans inclusion in gay activism.

I don't think I really agree with the Free Bradley Manning people. They regard her as a whistleblower, but what did the public actually learn from the leaks? It's not like learning that the NSA is spying on us. Nonetheless, I regard Chelsea with compassion, whether her convictions were just or not.

Tuesday, August 20, 2013

On accusations against skeptical leaders

While I was in France, I lost internet entirely for a week, and then this stuff happened. Short version: there was a cascade of accusations against several skeptical leaders for sexual harassment, assault, and rape. People named include Ben Radford, Lawrence Krauss, and Michael Shermer.

I don't have much to say about this, just an expression of shock.

I have a lot of respect for Michael Shermer, despite a few political differences. I first got excited about skepticism through reading his columns. I have an autographed copy of his book, Why People Believe Weird Things. I don't want to think so poorly of him. But yeah... I can't let my respect for Shermer bias my conclusions of his sexual conduct.

Sometimes I think the reason that sexual harassment and assault inspire such high emotions is not so much because of the sexual aspect, but because it occurs between friends. If Shermer were instead accused of stealing millions by wire fraud (as happened with Brian Dunning), I would lose respect for him, but I could still keep that separate from his skeptical work. But this is sexual assault and rape of people in the skeptical movement. I don't know anything about the victims, but in a way those women are my friends, just as much as Shermer is a friend. Why must friends be so awful to one another?

Monday, August 19, 2013

Patterned icosahedron, and symmetric origami colorings

Meenakshi Mukerji's "Patterned Icosahedron", from Ornamental Origami. I was in desperate need of a d20, so I put a number of dots on each face. Yes, I use my statmech textbook as a folding surface.

One of the things I will talk about repeatedly in this origami series is coloring. After all, coloring is one thing I can easily change without changing the form or function of the design. So even if I exactly follow the recipe out of a book, the colors still offer some creative choice.

This icosahedron is made of thirty units (also called modules). They are called "edge units" because each unit corresponds to an edge of the icosahedron. I only used three colors: red, green, and brown. The colors are not placed haphazardly. I tried to place them in a symmetrical pattern.

But what does it mean for a coloring to be symmetrical?

I don't think there is any standard definition, but one math blogger offered the following definition:

The group of rotations of the polyhedron should act transitively on the set of colors, and the stabilizer of each color should act transitively on the edges of that color.

Here is my translation. A symmetrical coloring has two conditions:
1. For every pair of colors, it is possible to rotate the first color exactly onto the second. That is, the set of locations of the second color before rotation is exactly equal to the set of locations of the first color after rotation.
2. It is possible to rotate any unit onto any other unit of the same color, without changing the complete set of locations occupied by that color.

It's not possible to fulfill both of these conditions for an icosahedron with three colors. However, I fulfilled the first condition. Here is a sketch of my coloring:

With apologies to the colorblind

This is also called a "proper" coloring, because no color appears more than once on each face.

Figuring out symmetric colorings is very fun, although that's not always how I choose colors. I find that in practice, symmetric colorings are pretty subtle, and aren't really obvious in a 2D photo.

Sunday, August 4, 2013

Possible blogging break

For the next two weeks, I will be in France. It's not a vacation; I will be at a conference. I may or may not post anything while I am there.

Saturday, August 3, 2013

Vatican damage control

Some time ago, Pope Benedict said condoms were permitted under certain circumstances. Of course, Catholic sources were quick to do damage control, saying that, in context, the pope's comments were not quite so enlightened as praise-givers claimed.

Later, Pope Francis said that even atheists are redeemed. Yes, even atheists. But just as everyone was praising his papal condescension, the Vatican did damage control. It turns out that redemption doesn't mean you go to heaven or anything like that; that would be "salvation". Only Catholics get salvation.

More recently, Pope Francis said of gay priests, "Who am I to judge them?" Other Catholic leaders did damage control, reassuring us all that the Church still disapproves of same-sex sex.

Every time it happens, it's practically a story out of The Onion. No really, this is an article from The Onion: Vatican Quickly Performs Damage Control on Pope's Tolerant Remarks.

What can we conclude? The media really wants to say true and positive things about the modern Catholic Church, but is hilariously unable. People just don't understand that the Catholic Church's beliefs are already laid out in the Catechism in their full regressive glory, and thus not really newsworthy. The Catholic Church's "love the sinner, hate the sin" stance on homosexuality makes so little sense that the media gets confused every time it hears it.

Thursday, August 1, 2013

Character vs thematic representation

I was thinking about fictional representation, partly because there was an FtBCon panel on atheists in pop culture, and partly because there's been some recent bloggy discussion about ace characters. Actually there's some similarity between ace and atheist representation: many characters seem like they could be ace or atheist by default, simply because they're not shown practicing religion or expressing interest in sex. But here I focus on atheist representation, just as an example.

Different people seem to want different things in their fictional representation. Some people just want characters who are mentioned or implied to be atheist, and to not be horrible stereotypes. The story doesn't have to focus on their atheism. But sometimes, some of us would like stories that deal directly with atheism, rather than simply having atheist characters.

Perhaps we should be separating out these two desires entirely. They both have to do with fiction and representation, but they're asking for different things, and have different motivations.

On the one hand, you want people to understand that atheists are just ordinary people. They can be perfectly intelligent and sociable, or not, as the case may be. They have flaws, but ordinary ones (depending on the style of fiction) that don't necessarily line up with stereotypes. If you feel alone as an atheist, it's also nice to see a character you can relate to, or which legitimizes your experiences.

On the other hand, you may be interested in atheist issues, like skepticism or the dangers of religions. It would be nice to explore these topics through fiction. And you know, the fiction doesn't even have to have any atheist characters in it! You can simply depict a religious institution, or some other institution that acts as a metaphor. Or you can depict superstitions.

Interestingly, the balance of the two desires varies from identity to identity. For example, when people ask for women in fiction, they usually aren't asking for fiction that deals with women's issues so much as fiction that includes women as important characters. Similarly when people ask for POC representation, they just want characters, not necessarily entire stories dealing with race. When people look for LGBT fiction, they're often looking for stories that deal directly with LGBT issues, thus the countless stories about coming out. But my coblogger Queenie expressed interest in seeing more stories about "space wizard boyfriends", so there's also demand for characters who are incidentally LGBT.

When it comes to atheism, people would like to see both incidental atheist characters, and also stories with atheist themes. You could say that this reflects the tension between atheism as a personal identity and atheism as a quasi-political cause.

What would I personally prefer? I like it when there are atheist characters out there in pop culture, but I don't have much of a personal interest in them. I already know atheists are ordinary people! But I am interested in seeing more atheist thematic content. Or at least, that's what I'd like to see in principle. In practice, I tend not to like fiction that expresses an atheist slant, because it's too ham-handed. These days I like subtle, serious, "snooty" literature. Instead of literature that advances certain ideas (that I probably disagree with anyways), I'd rather see literature advance ideas that are subsequently torn down. I haven't really seen much atheist literature that does that.