Monday, March 16, 2009

Reward versus Punishment

In this latest paper, which I skimmed rather than read word for word to save time, I found a mathematical construct for displaying these games: a simplex S4 structure, which represents the points of a saddle-like surface and quantifies how a given game evolves. I didn't work through all the math, but at least I now have a way to present data if I continue down this path.

The concluding paragraph was most helpful, however, in that it mentioned that punishment seems to be more effective than reward at motivating cooperation. I suppose I will need more support before taking that stance, but perhaps, at the outset, we as humans are more willing to avoid punishment than to seek rewards. I've discussed this idea with many others, and it fits with my own theory of the levels/reasons of obedience.

Lastly, reputation was essential for either reward or punishment to be effective. I agree. Without some knowledge of each other's biases, feelings, history, and record, there is no way to deal with one another in any sort of rational way. I suppose many quotes would be pertinent right now about how one's reputation really is one of the most valuable things one can have.

Reference:
Sigmund, K., Hauert, C. & Nowak, M. (2001). "Reward and Punishment." Proceedings of the National Academy of Sciences of the United States of America, 98, 10757-10762.


Thursday, March 12, 2009

Fairness versus Reason in UG

I found an article that mentions that reputation does matter. It seems that if more people know what sorts of offers you accept, they will make offers to you at that level. So, if you are known to take low offers in the Ultimatum Game, that is what you will be offered. If you reject low offers, you of course lose out on that deal, but you have bought a reputation for accepting only higher offers.
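To play with this idea, here is a minimal toy model of my own (not from the paper): a proposer consults a public record of what a responder has accepted, probing downward until a rejection marks the floor. All the names and numbers here are my own assumptions.

```python
def responder(offer, threshold):
    """This responder accepts any offer at or above a fixed threshold."""
    return offer >= threshold

def proposer_offer(history, levels):
    """Offer the lowest amount the responder is known to accept,
    probing one step below it until a rejection marks the floor."""
    accepted = [o for o, ok in history if ok]
    rejected = [o for o, ok in history if not ok]
    if not accepted:
        return max(levels)  # no reputation yet: play it safe
    floor = max(rejected) if rejected else 0
    probes = [l for l in levels if floor < l < min(accepted)]
    return max(probes) if probes else min(accepted)

levels = [1, 2, 3, 4, 5]   # possible offers out of a pie of 10
threshold = 4              # this responder rejects anything below 4
history = []               # the public record: (offer, accepted?)

for _ in range(10):
    offer = proposer_offer(history, levels)
    history.append((offer, responder(offer, threshold)))

print(history[-1])  # (4, True): offers settle exactly at the reputation level
```

The one rejected probe (an offer of 3) is the price the responder "pays" to teach proposers where its floor is, which is exactly the trade-off in the question below.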

So, the question is: how many times can someone reject offers, waiting for that big offer, before it is too late?

Reference:
Nowak, M. A., Page, K. M. & Sigmund, K. (2000). "Fairness Versus Reason in the Ultimatum Game." Science, 289(5485), 1773-1775.

Continuous Strategy Space

One of the limitations of IPD is that there are really only two strategies the actors can take. Obviously, in real life there are many more, and the paper I read today begins to address this. The authors set up a large network of actors who played the Ultimatum Game (which I will talk about some day) many times with neighbors in their network. Furthermore, the actors could adjust their strategies so as to converge on and maximize their payoffs, thus learning how to deal with other individuals. The strategy space is continuous in that offers can be made or accepted not just at discrete levels, as in IPD (defect or cooperate), but over a range of values between the minimum and maximum offers allowed.
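As a toy illustration (my own sketch, not the authors' algorithm), two agents with continuous strategies can grind toward agreement with a simple concede-on-failure, exploit-on-success rule:

```python
import random

random.seed(1)  # make the run repeatable

# Each agent holds one continuous parameter in [0, 1]:
offer = 0.2    # proposer's offered fraction of the pie
accept = 0.6   # responder's minimum acceptable fraction
rate = 0.1     # learning step size

for _ in range(200):
    if offer >= accept:
        # Deal struck: each side tries to claim a little more next time.
        offer -= rate * random.random()
        accept += rate * random.random()
    else:
        # No deal: each side concedes a little.
        offer += rate * random.random()
        accept -= rate * random.random()
    offer = min(max(offer, 0.0), 1.0)
    accept = min(max(accept, 0.0), 1.0)

# The strategies end up circling a meeting point somewhere between the
# opening positions, rather than sitting at discrete extremes.
print(round(offer, 2), round(accept, 2))
```

The point is only that the two parameters chase each other over a continuum; nothing here is calibrated to the paper's learning rule or network structure.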

The authors also investigate how to model fair decisions, which humans seem to make quite quickly. They also downplayed the impact of reputation a little, but I think it still comes into play. Humans are much more emotional than computer agents, so I'll have to find another paper discussing that issue in more depth.

Reference:
"Learning to Reach Agreement in a Continuous Ultimatum Game"
de Jong, Steven et al.
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH Volume: 33 Issue: Pages: 551 Published: 2008

Wednesday, March 11, 2009

Steven Tangen Defense

I think I'll post some thoughts about the defenses I attend as well. I won't necessarily dissect them very much, and I'll try to keep it positive, focusing on what I would like to do at my own defense or during the entire process.

  • for the defense, spend more time on data analysis and less on the background work; for the proposal, vice versa
  • explain the axes of the figures
  • test the layout on the COVE wall before the day of
  • have someone take notes on the reactions and/or questions of students and committee members
  • don't be afraid to say that something is outside the scope of the problem
  • be assertive when answering questions; it is your work, after all!
  • prepare the audience for the next slide or figure
  • have backup slides to answer questions about the details
  • be prepared to explain the small steps if people ask about them
  • don't insult the audience's intelligence with overly simple explanations
Well, that's enough for now. A lot of those are general, but they came to mind just now. I'm sure there will be more, and I'll try to get into the habit of listing them as soon as I hear them.

Tuesday, March 10, 2009

Part two of PD

Another paper by Axelrod. In this one he summarizes the data and observations from his Prisoner's Dilemma tournament. TIT FOR TAT wins handily. In essence, it is a very forgiving strategy: it rewards quickly when the other party cooperates but still punishes if the other party defects first. It got me thinking about God's mercy and willingness to forgive. I suppose He is even more forgiving than this strategy, in that He will, like TFT, forgive as soon as possible, as soon as we repent and turn to Him (cooperate), but beyond even that, He will sometimes cooperate when we defect! However, His "punishment," if one can call it that, is probably a withholding of blessings and not really a penalty of sorts. It is likely deferred anyway, so as to give us as much time as possible to correct our ways. But perhaps sin is its own punishment in many regards and in some circumstances, although sin likely also keeps us from the blessings of having the Spirit, and thus that form of punishment is also likely evident.

Anyway, the interesting thing is that the author actually models this unbalanced forgiveness strategy, similar to what God may be like: it defects (punishes) only after two defections in a row from the other actor. Likely, God will tolerate even more "defections" before He retaliates, but He also won't stand idly by and let us continue in wickedness forever. Clearly, punishments have been meted out to certain groups or individuals who are wicked.
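Here is a quick sketch of the two strategies, using the standard payoffs (temptation 5, reward 3, punishment 1, sucker 0). The "sneaky" opponent is my own invention, there only to show the echo effect that the extra forgiveness breaks:

```python
# Standard Prisoner's Dilemma payoffs: (my move, their move) -> (me, them).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else 'C'

def tit_for_two_tats(opponent_history):
    """Retaliate only after two defections in a row."""
    return 'D' if opponent_history[-2:] == ['D', 'D'] else 'C'

def sneaky(opponent_history):
    """Defect once at the start, then mirror like TIT FOR TAT."""
    return 'D' if not opponent_history else opponent_history[-1]

def play(strat_a, strat_b, rounds=10):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

# TFT gets locked into an echo of alternating retaliation after the single
# opening defection; the more forgiving strategy absorbs it and restores
# mutual cooperation.
print(play(tit_for_tat, sneaky))       # (25, 25)
print(play(tit_for_two_tats, sneaky))  # (27, 32)
```

The forgiving strategy even out-scores TFT here (27 vs. 25), which is the sense in which a little extra mercy can be self-interested too.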

The reference:
Axelrod, R. (1980). "Effective Choice in the Prisoner's Dilemma." The Journal of Conflict Resolution, 24(1), 3-25.

Monday, March 9, 2009

Prisoner's Dilemma

Robert Axelrod has quite a sound argument for how cooperative strategies can move into a population that was previously entirely greed-driven. I suppose it is still greed-driven afterwards, but on average everyone does better.

There were some gospel equivalents to this that came to mind. Clearly, some people think that the Law of Consecration is akin to socialism, in that everyone contributes and everyone withdraws. I suppose it depends on one's perspective, but I see vast differences. The LC seeks to aggrandize (glorify) every single individual as much as possible, through cooperation. We want to be better, know more, and have more power, and we can do that more quickly by cooperating and sharing: learning from each other, serving each other, and trading skills, which lets us gain intelligence more quickly since we can use each other's ideas to free up time to learn. (A concrete example: had one person invented the dishwasher and given it to everyone to use, we could all have more time to learn calculus.) Thus, each individual wants to contribute out of love, yes, to give more to others, but also to aggrandize himself through the expected reciprocity of blessings. Socialism seems to focus on finding the lowest level of equality. If one person has 91 units of X and nine other people each have 1 unit, socialism seems, at least to me, to give each of the ten people 10 units once some organized system gets involved. Wouldn't it be better if the person with 91 units taught, showed, and helped the other nine people to get 91 units themselves? That is another level of equality, significantly higher, and no one has been reduced.

Anyway, Axelrod discusses this Theory of Cooperation in other ways as well, including robustness, viability, and stability. Cooperation seems to win out over time; it can withstand attacks and is more stable. Everything, in my mind, that a group of individuals would want. So why is it that we aren't on board with it? Biology seems to accept it. It must be that we violate the one assumption that makes it effective: that there is no known, finite number of encounters. With a finite horizon, it would be in my best interest to kill someone and take the 50 dollars from their wallet if I never see them again. But if I do see them again, would it be wise to gain more by letting them live, work, and earn more, and then kill them? At that point the analysis is the same, and one should thus let them live indefinitely! Okay, but what about my payoff? Do I need to kill him to get any money? I don't think so. Surely there is something he wants (perhaps even repeatedly) that you can trade with him over multiple encounters. (I suppose he wants to keep his life each time, so even that "commodity" would work.)

Anyway, I have some issues with Axelrod's analysis and assumptions. At times the payoff matrix is not constant, and so strategies break down. Also, the payoff values have to satisfy certain criteria to be valid, and at times there may be no difference in value between strategies A and B; if that is the case, a random decision, or at least one based on other criteria, is more realistic. Still, the foundational problem and arguments really get one thinking about why humans are so "cooperation-averse." Do we really feel that we have nothing to offer someone else? Or do we really feel that no one else has something to offer us?
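For the record, the criteria I mean are the usual ones: with temptation T, reward R, punishment P, and sucker's payoff S, a valid Prisoner's Dilemma needs T > R > P > S, plus 2R > T + S so that mutual cooperation beats taking turns exploiting each other. A one-line check:

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check the standard PD ordering and the no-alternation condition."""
    return T > R > P > S and 2 * R > T + S

print(is_prisoners_dilemma(5, 3, 1, 0))  # True: the classic tournament values
print(is_prisoners_dilemma(6, 3, 1, 0))  # False: alternating C/D pays as well as mutual cooperation
```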

I suppose this gives new meaning to the principle that we cannot be saved without our dead and they cannot be saved without us. It's all or nothing. (Because mathematically it is too, in a way!)

Here's the reference... (but no link)

Axelrod, R. & Hamilton, W. D. (1981). "The Evolution of Cooperation." Science, 211(4489), 1390-1396. DOI: 10.1126/science.7466396