### Why I capitalize distribution names

There are two ways to think about what a parameterized statistical distribution is.

$\def\Re{{\mathbb R}}$ As a single point: Here, the Normal distribution is a mapping of the form $f:(x, \mu, \sigma) \to \Re^+$. More specifically, it is $f(x, \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi} } \exp(-\frac{(x-\mu)^2}{2\sigma^2})$. Within the infinite space of functions, this is a single point. We often fix certain parameters, and get a function of fewer dimensions, like $f(x, \mu=0, \sigma=1) = \frac{1}{\sqrt{2\pi} } \exp(-\frac{x^2}{2})$.

As a family: under this perspective, when we fix, say, $\mu=2$, $\sigma=1$, we get a Normal Distribution. When we fix $\mu=3$, $\sigma=1$, we get a different Normal Distribution. Here, there is a meta-function of the form $N:(\mu, \sigma) \to (f:x\to\Re^+)$, which defines a family of functions, and produces a series of Normal distribution functions depending on the values to which $\mu$ and $\sigma$ have been fixed.
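The two readings translate directly into code: the single-point interpretation is one function of three arguments, and the family interpretation is a meta-function that fixes $(\mu, \sigma)$ and hands back a one-argument density—in programming terms, a closure. A minimal sketch in Python (the names are mine, not standard):

```python
import math

# Single-point interpretation: one function over (x, mu, sigma).
def f(x, mu, sigma):
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

# Family interpretation: the meta-function N fixes (mu, sigma) and
# returns a one-argument density function.
def N(mu, sigma):
    def density(x):
        return f(x, mu, sigma)
    return density

standard = N(0, 1)   # one member of the family: f(x, mu=0, sigma=1)
```

Calling `standard(0)` and calling `f(0, 0, 1)` give the same number; the two interpretations differ only in when the parameters get bound.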

Both of these approaches are coherent, and if you go with either, I respect you fully. Almost any Wikipedia page about a distribution will jump back and forth between these two interpretations, so by the end it's impossible to say whether a Normal distribution has the form $f(x, \mu, \sigma)$ or $f(x)$. Any given Wikipedia page is edited by several people, though, so finding anacoluthons on Wikipedia is something of a fish-in-a-barrel exercise. My web analytics software tells me that a large percentage of the readers of this blog are individual human beings; if that's you, I recommend picking one interpretation or the other and sticking with it.

I prefer the single-point characterization over the family. At the least, the meta-function is confusing, and implies the two-step estimation process of fixing the parameters, then grabbing a data point. This is a certain type of workflow that may or may not be what we want.

Of course, this gets into the Bayesian versus Frequentist debate. The stereotypical Frequentist believes that there is a true value of $(\mu, \sigma)$, and our job is to find it. This more closely aligns with a search for a single Normal distribution in the family of Normals. The stereotypical Bayesian doesn't know what to believe, and thinks that reality may even be an amalgam of many different values of $(\mu, \sigma)$. Either perspective works under either the single-point or family interpretation—as they say, mathematics is invariant under changes in notation—but the Frequentist approach more closely aligns with the two-step estimation process of the family interpretation, and the Bayesian approach is much easier to express under the single-point interpretation. My earlier post about Bayesian updating, with its frequent integrals of $f(x, \mu, \sigma)$ over parameters, certainly would have been more awkward via the family interpretation.

Grammatically, this has a clear implication. If the Normal Distribution is the name for that single expression up there, then its name should be capitalized as a proper noun, like London or Jacob Bernoulli, which are also unique entities. If a normal distribution is one of a family of functions, then it is a class of entities, like cities or people, and should be lower case.

By the way, I used to write “Normal Distribution,” but none of the style books would be
OK with that. The *c* in the city of London is not capitalized; same with the *d* in Normal
distribution.

There's a bonus of consistency, because so many statistical models are capitalized anyway:

- Gaussian
- Poisson
- OLS
- F distribution
- Normal distribution [because *the normal distribution* can easily confuse the reader (and I prefer it over *Gaussian* because I'll always choose descriptive over appellative).]

### The formalization of the conversation in a social network

Last time, I ran with the definition of an academic field as a social
network built around an accepted set of methods. The intent there was to counter all
the dichotomies that are iffy or even detrimental (typically of a form pioneered by
Richard Pryor: "Our people do it like *this*, but their people do it like *this*".)

This time, I'm going to discuss peer-reviewed journals from this perspective, to clarify all the things journals aren't. The short version: if journals are the formalized discussion of a social network built around a certain set of methods, then we can expect that the choice of what gets published will be based partly on relatively objective quality evaluation and partly on social issues. It's important to acknowledge both.

Originally, journals were literally the formalized discussions within a social network. Peer review was (and still is) a group of peers in a social network deciding whether a piece of formalized discussion is going to be useful and appropriate to the group.

An idea that exists only in my head is worthless—somebody somewhere has to hear it, understand it, and think about using it. Because a journal is a hub for the social network built around a known set of tools, I have a reasonable idea of which journal to pick given the methods I used, and what tools readers will be familiar with; readers who prefer certain methods know where to look to learn new things about those methods. So journals curate and set social norms, both of which are important to the process of communicating research.

###### Factual validity

Something that is incorrect will be useless or worse; work that is sloppily done is unlikely to be useful. So an evaluation of utility to the social network requires evaluating basic validity.

Among non-academics, I get the sense that this is what the peer review process is perceived to be about: that a paper that is peer reviewed is valid; one that isn't is up for debate.

If you think about this for a few seconds, this is *prima facie* absurd. The reviewers
are one or two volunteers who will only put a few hours into this. Peer reviewers do not
visit the lab of the paper author and check that all the phosphate
was cleaned out of the test tubes. They rarely double-code the statistics work to make
sure that there are no bugs in the code. If there is a theorem with a four-page proof
in the appendix, you've got low odds that any reviewer read it. I have on at least one
occasion directly stated in a review that I did not have time to check the proof in the appendix,
and this has never seemed to affect the editors' decisions either way.

The most you can expect from a few hours of peer review is a (nontrivial and important) verification that the author hasn't missed anything that a person having ordinary skill in the art would catch. Deeper validity comes from a much deeper inquiry that is more likely to happen outside the formalized discussion of a journal.

###### Prestige

If a journal is the formalized discussion of a social network built around a certain set of methods, we see why journal publications are the gold standard in tenure reviews and other such very important affairs. Academics don't get hired for their ability to discover Beautiful Truths; they get hired for their ability to convince grant-making bodies to give grants, to convince grad students and potential new hires to join the department, and so on. These things require doing good work that has social sway. Each journal publication is a statement that there is a well-defined group of peers who think of your work positively, and publications in more far-reaching journals indicate a more far-reaching network of peers.

###### Choice of inquiry

Sorry if that sounds cynical, but even in mathematics, whose infinite expanse exists outside of human society, the question of which concepts are most salient and which discoveries are truly important is decided by people based on what other people also find salient.

Maybe you're familiar with the Beauty Contest, a story Keynes made up to explain how stock markets work: the newspaper publishes photos of a set of gals, and readers mail in their vote, not for the one who is most beautiful, but for the one who they expect will win the contest. Who you like doesn't matter—it's about who you think others will like. No wait, that isn't it either: what's important is who you think other people will think other people will like. Infinite regress ensues.

When you're chatting with a circle of friends, you don't pick topics that are objectively interesting—that's meaningless. You pick topics of conversation that you expect will be of interest to your friends. Now let's say that you know that after the meeting, your friends will go to RateMyFriends.com and vote on how interesting you would be to other potential friends. Then you will need to pick topics that your friends think will be of interest to other potential friends. You're well on your way to the Beauty Contest (depending on the rating strategy used by raters on RateMyFriends).

The Beauty Contest easily leads to bland least-common-denominator output. You're going to pick the most conventionally attractive gal out of the newspaper, and you're going to avoid conversation topics that most would find quirky or odd.

What if day-glo '80s leggings are trendy this year? You might pick the gal in fluorescent lime green not because her attire is objectively attractive (a view which I really can't endorse), but because the setup of the Beauty Contest pushes you to select contestants who follow the current trends. It's not hard to find examples, especially in the social sciences, where a subject takes on a life of its own, as this quarter's edition publishes papers that respond to last quarter's papers, which were primarily a response to the quarter before.

###### Diversity

Even the part where we get a fresh pair of eyes to notice the things the author missed or easy-to-spot blunders is limited, because we're still asking peers. If you ask an anthropologist to read an Econ paper, the anthropologist will tear apart the fundamental assumptions; if you ask an economist to read an Anthro paper, she'll tear apart the fundamental assumptions.

But because journals are the formalized discussions of already-formed social networks, we can't expect a lot of cross-paradigm discussion in the journals or in-depth critiques of the social network's fundamental assumptions.

In the software development industry (which often refers to itself as “the tech industry”), you'll find more than enough long essays about the myth of meritocracy. To summarize: even in an industry that is clearly knowledge-heavy and where there are reasonably objective measures of ability, homophily is still a common and relevant factor. Given that fact of life, promoting the network as a meritocracy does a disservice, implying that whoever won out must have done so because they are the best here in this, the best of all possible worlds. If a person didn't get hired, or their code didn't get used, then it must be because the person or the code didn't have as much merit as the winner. The possibility that the person who wasn't picked does better work but wasn't as good a cultural fit as the person who got picked is downplayed.

Academics, in my subjective opinion, are much more likely to be on guard against creeping demographic uniformity. But an academic field is a social network built around an accepted set of tools, and this definition directly constrains the breadth of methodological diversity. Journals will necessarily reflect this.

The fiction of journals as absolute meritocracy still exists, especially among non-academics who have never submitted to a journal and read an actual peer review, and it has the same implications, that if a work doesn't sparkle to the right peers in the right social network, it must be wrong. And it's especially untrue in the present day, when more good work is being done than there is space in traditional paper journals to print it all.

###### Conclusion segment

I do think that there is much meritocracy behind a journal. A journal editor is the social hub of a network, so you could perhaps socialize your way into such a job, but you're going to kill the journal if you can't hold technical conversations with any author about any aspect of the field. As a journal reviewer, I have seen a good number of papers that can be established as fatally flawed even after a quick skim. But I would certainly like to see a world where the part about improving the quality of inquiry and the part about gaining approval by a predefined set of peers is more separated than it is now.

Social networks aren't going away, so journals supporting them won't go away. But there are many efforts being made to offer alternatives. It's a long list, but the standouts to me are the Arxiv and the SSRN (Social Science Research Network). These are sometimes described as preprint networks, implying that they are just a step along the way to actual peer-reviewed publication, but if the approval of a social network is not essential for your work, then maybe it's not necessary to take that step. Especially in the social sciences, where review times can sometimes be measured in years, these preprint networks are increasingly cited as the primary source. Even the Royal Society, who started this whole journal thing when it was a homophilic society in the 1600s, has an open journal that “...will allow the Society to publish all the high-quality work it receives without the usual restrictions on scope, length or [peer expectations of] impact.”

PS: Did you know I contribute to another blog on social science and public policy? In this entry and its follow-up I discuss other aspects of the journal system. I wrote it during last year's government shutdown, when I had a lot of free time.

### The difference between Statistics and Data Science

An academic field is a social network built around an accepted set of methods.

Economics has grown into the study of human decision making in all sorts of aspects. At this point, nobody finds it weird that some of the most heavily-cited papers in the Economics journals are about the decision to commit crimes or even suicide. These papers use methods accepted by economists, writing down utility functions and using certain mathematical techniques to extract information from these utility functions. Anthropologists also study suicide and crime, but using entirely different methods. So do sociologists, using another set of tools. To which journal you submit your paper on crime depends on matching the methods you use to the methods readers will be familiar with, not on the subject.

A notational digression: I hate the term “data science”. First, there's a general rule (that has exceptions) that anybody who describes what they're doing as “science” is not a scientist—self-labelling like that is just trying too hard. And are we implying that other scientists don't use data? Is it the data or the science that statisticians are lacking? Names are just labels, but I'll hide this one under an acronym for the rest of this. I try to do the same with the United States DHS.

I push the point that the distinction is about the set of salient tools because I think
it's important to reject other means of cleaving apart the Statistics and DS networks. Some just
don't work well, and some are as odious as any other *our people do it like this,
but the other people do it like this* kind of generalization. These are claims about how statisticians
are too interested in theory and too willing to assume a spherical cow, or that
DSers are too obsessed with hype and aren't careful with hypothesis testing.
Hadley
explains that “...there is little work [in Statistics] on developing
good questions, thinking about the shape of data, communicating results or
building data products,” which is a broad statement about the ecosystem
that a lot of statisticians would dispute, and a bit odd given that he is best
known for building tools to help statisticians build data products. It's not hard
to find people who say that DS is more applied than Stats, which is an environment
issue that is hard to quantify and prone to observation bias. From the
comment
thread of this level-headed post: “I think the key differentiator between a
Data Scientist and a Statistician is in terms of accountability and commitment.”

Whatever.

We can instead focus on characterizing the two sets of tools. What is common knowledge among readers of a Stats journal and what is common knowledge among readers of a DS journal?

It's a subjective call, but I think it's uncontroversial to say that the abstract methods chosen by the DSers rely more heavily on modern computing technique than commonly-accepted stats methods, which tend to top out in computational sophistication around Markov Chain Monte Carlo.

One author went to the extreme of basically defining DS as the practical problems of data shunting and building Hadoop clusters. I dispute that any DSer would really accept such a definition, and even the same author effectively retracted his comment a week later after somebody gave him an actual DS textbook.

If you want to talk about tools in the sense of using R versus using Apache Hive, the conversation won't be very interesting to me but will at least be a consistent comparison on the same level. If we want to talk about generalized linear models versus support vector machines, that's also consistent and closer to what the journals really care about.

The basic asymmetry that the price of admission for using DS techniques is greater computational sophistication will indeed have an effect on the people involved. If we threw a random bunch of people at these fields, those who are more comfortable with computing will sort themselves into DS and those less comfortable into Stats. We wind up with two overlapping bell curves of computing ability, such that it is not challenging to find a statistician-DSer pair where the statistician is a better programmer, but in expectation a randomly drawn DSer writes better code than a randomly drawn statistician. So there's one direct corollary of the two accepted sets of methods.
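A toy simulation of that sorting story (every number here is invented for illustration, not measured):

```python
import random

random.seed(7)

# Invented calibration: after self-sorting, DSers' coding ability is
# N(1, 1) and statisticians' is N(0, 1) -- two overlapping bell curves.
dsers = [random.gauss(1, 1) for _ in range(10_000)]
stats = [random.gauss(0, 1) for _ in range(10_000)]

pairs = list(zip(dsers, stats))

# In expectation, a randomly drawn DSer out-codes a randomly drawn
# statistician (the exact probability here is Phi(1/sqrt(2)), about 0.76)...
p_ds_wins = sum(d > s for d, s in pairs) / len(pairs)

# ...yet pairs where the statistician is the better programmer are
# not at all hard to find.
n_reversed = sum(s > d for d, s in pairs)
```

With these made-up parameters the DSer wins about three draws in four, while roughly a quarter of the pairs go the other way—overlapping curves, shifted means.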

Three Presidents of the ASA wrote on the Stats vs DS thing, and eventually faced the same technical asymmetry:

> Ideally, statistics and statisticians should be the leaders of the Big Data and data science movement. Realistically, we must take a different view. While our discipline is certainly central to any data analysis context, the scope of Big Data and data science goes far beyond our traditional activities.

This technical asymmetry is a real problem for the working statistician, and statisticians are increasingly fretting about losing funding—and for good reason. Methods we learned in Econ 101 tell us that an unconstrained set leads to an unambiguously (weakly) better outcome than a constrained set.

If you're a statistician who is feeling threatened, the policy implications are obvious:
learn Python. Heck, learn C—it's not that hard, especially if
you're using
my C textbook, whose second edition was just released (or *Modeling with Data*, which this blog is ostensibly based on). If you have the
grey matter to understand how the F statistic relates to SSE and SSR, a reasonable level
of computing technique is well within your reach. It won't directly score you publications (DSers can
be as snobby about how writing code is a “mere clerical function” as the statisticians
and US Federal Circuit can be), but you'll have available a less constrained set of abstract tools.

If you are in the DS social network, an unconstrained set of tools is still an unambiguous improvement over a constrained set, so it's worth studying what the other social network takes as given. Some techniques from the 1900s are best left in the history books, but now and then you find ones that are exactly what you need—you won't know until you look.

By focusing on a field as a social network built around commonly accepted tools, we see that Stats and DS have more in common than differences, and can (please) throw out all of the bigotry that comes with searching for differences among the people or whatever environment is prevalent this week. What the social networks will look like and what the labels will be a decade from now is not something we can write a policy for (though, srsly, we can do better than “data science”). But as individuals we can strive to be maximally inclusive by becoming conversant in the techniques that the other social networks are excited by.

Next time, I'll have more commentary derived from the above definition of academic fields, then it'll be back to the usual pedantry about modeling technique.

### Bayes v Kolmogorov

$\def\Re{{\mathbb R}} \def\datas{{\mathbb D}} \def\params{{\mathbb P}} \def\models{{\mathbb M}} \def\mod#1{M_{#1}}$

We have a likelihood function that takes two inputs, which we will name the data and the parameter, and which gives the nonnegative likelihood of that combination, $L: d, p \to \Re^+$. [I wrote a lot of apropos things about this function in an early blog post, by the way.]

The two inputs are symmetric in the sense that we could slice the function either way. Fixing $p=\rho$ defines a one-parameter function $L_\rho: d\to \Re^+$; fixing $d=\delta$ defines a one-parameter function $L_\delta: p \to \Re^+$.

But the inputs are not symmetric in a key way, which I will call the unitary axiom (it doesn't seem to have a standard name). It's one of Kolmogorov's axioms for constructing probability measures. The axiom states that, given a fixed parameter, some value of $d$ will be observed with probability one. That is, \begin{equation} \int_{\forall \delta} L_\rho(\delta) d\delta = 1, \forall \rho. \end{equation} In plain language, when we live in a world where there is one fixed underlying parameter, one data point or another will be observed with probability one.

This is a strong statement, because we read the total density as an indication of the likelihood of the parameter taking on the given value. I tell you that $p=3$, and we check the likelihood and see that the total density on that state of the world is one. Then you tell me that, no, $p=4$, and we refer to $L(d, 4)$, and see that it integrates to one as well.

Somebody else comes along and points out that this may work for discrete-valued $p$, but a one-dimensional slice isn't the right way to read a continuous density, insisting that we consider only ranges of parameters, such as $p\in[2.75,3.25]$ or $p \in [3.75,4.25]$. But if the integral over a single slice is always one, then the double integral is easy: $\int_{\rho\in[2.75,3.25]}\int_{\forall \delta} L(\delta, \rho) d\delta d\rho$ $=\int_{\rho\in[2.75,3.25]} 1 d\rho$ $=.5$, and the same holds for $p \in [3.75,4.25]$. We're in the same bind, unable to use the likelihood function to put more density on one set of parameters compared to any other of the same size.

This rule is asymmetric, by the way, because if we had all the parameters in the universe, whatever that might mean, and a fixed data set $\delta$, then $\int_{\forall \rho} L_\delta(\rho) d\rho$ could be anything.
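A quick numerical check of the asymmetry, with an Exponential likelihood standing in for $L$ (my choice of example; the asymmetry is easier to see here than with a Normal mean, whose symmetry in $d$ and $p$ happens to hide it):

```python
import math

# Exponential likelihood: L(d, lam) = lam * exp(-lam * d) for d >= 0,
# with lam playing the role of the parameter p.
def L(d, lam):
    return lam * math.exp(-lam * d)

# Midpoint Riemann sum over [lo, hi]; crude, but good enough here.
def integrate(f, lo, hi, n=200_000):
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

# Unitary axiom: fix the parameter, integrate over all data -> one,
# no matter which lam we fix.
over_data = integrate(lambda d: L(d, 1.5), 0, 40)

# No such axiom the other way: fix the data at d = 2, integrate over
# the parameter -> 1/d^2 = 0.25 here, and in general it could be anything.
over_params = integrate(lambda lam: L(2.0, lam), 0, 40)
```

Swapping in a different fixed `lam` leaves `over_data` at one; swapping in a different fixed `d` moves `over_params` wherever it likes.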

Of course, we don't have all the data in the universe. Instead, we gather a finite quantity of data, and find the most likely parameter given that subset of the data. For example, we might observe the data set $\Delta=\{2, 3, 4\}$ and use that to say something about a parameter $\mu$. I don't want to get into specific functional forms, but for the sake of discussion, say that $L(\Delta, 2)=.1$; $L(\Delta, 3)=.15$; $L(\Delta, 4)=.1$. We conclude that three is the most likely value of $\mu$.

What if we lived in an alternate universe where the unitary axiom didn't hold? Given a likelihood function $L(d, p)$ that conforms to the unitary axiom, let $$L'(d, p)\equiv L(d, p)\cdot f(p),$$ where $f(p)$ is nonnegative and finite but otherwise anything. Then the total density on $\rho$ given all the data in the universe is $\int_{\forall \delta} L_{\rho}(\delta)f(\rho) d\delta = f(\rho)$.

For the sake of discussion, let $f(2)=.1$, $f(3)=.2$, $f(4)=.4$. Now, when we observe $\Delta=\{2, 3, 4\}$, $L'(\Delta, 2)=.01$, $L'(\Delta, 3)=.03$, $L'(\Delta, 4)=.04$, and we conclude that $\mu=4$ is the most likely value of $p$.
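The arithmetic of that worked example, spelled out (all the numbers come from the text above):

```python
# Likelihoods L(Delta, p) from the example, for the observed Delta = {2, 3, 4}.
L = {2: 0.10, 3: 0.15, 4: 0.10}

# The reweighting function f(p) from the example.
f = {2: 0.1, 3: 0.2, 4: 0.4}

# In the unitary world, mu = 3 is the most likely value...
plain_winner = max(L, key=L.get)

# ...but under L'(d, p) = L(d, p) * f(p), the products are
# {2: 0.01, 3: 0.03, 4: 0.04}, so mu = 4 wins.
Lprime = {p: L[p] * f[p] for p in L}
reweighted_winner = max(Lprime, key=Lprime.get)
```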

Bayesian updating is typically characterized as a composition of two functions,
customarily named the *prior* and the *likelihood*. In the notation here, these are $f(p)$
and $L(d, p)$. Without updating, all values of $p$ are equally likely in the world
described by $L$, until data is gathered. The prior breaks the unitary axiom, and
specifies that, even without gathering data, some values of $p$ are more likely than
others. When we do gather data, our prior belief that some values of $p$ are more
likely than others advises our beliefs.

Our belief about the relative preferability of one value of $p$ over another could be summarized into a proper distribution, but once again, there is no unitary axiom requiring that a distribution over the full parameter space integrate to one. For example, the bridge from the Bayesian-updated story to the just-a-likelihood story is the function $f(\rho)=1, \forall \rho$. This is an improper distribution, but it does express that each value of $p$ has the same relative weight.

In orthodox practice, everything we write down about the data follows the unitary axiom. For a given observation, $L'(\delta, p)$ is a function of one variable, sidestepping any issues about integrating over the space of $d$. We may require that this univariate function integrate to one, or just stop after stating that $L'(\delta, p) \propto f(p)L(\delta, p)$, because we usually only care about ratios of the form $L'(\delta, \rho_1)/L'(\delta, \rho_2)$, in which case rescaling is a waste of time.

In a world where all parameters are observable and fixed, the unitary axiom makes so much sense it's hard to imagine not having it. But in a meta-world where the parameter has different values in different worlds, the unitary axiom implies that all worlds have an equal slice of the likelihood's density. We usually don't believe this implication, and Bayesian updating is our way of side-stepping it.

### Microsimulation games, table top games

I wrote a game. It's called *Bamboo Harvest*, and you can see the rules at this link. You can play it with a standard deck of cards and
some counters, though it's much closer to the sort of strategic games I discuss below
than poker or bridge. I've played it with others and watched others play it enough
to say it's playable and pretty engaging. Ms NGA of Baltimore, MD gets really emotional
when she plays, which I take as a very good sign.

Why am I writing about a game on a web page about statistical analysis and microsimulation? I will leave to others the topic of Probability theory in table top games, but there is also a lot that we who write economic models and microsimulations of populations can learn from game authors. After all, the designers of both board games and agent-based models (ABMs) have the same problem: design a set of rules such that the players in the system experience an interesting outcome.

Over the last few decades, the emergent trend among board games has been so-called
*Eurogames*, which are aimed at an adult audience, seek greater interaction among
players, and typically include an extensive set of rules regarding resource trading
and development. That is, the trend has been toward exactly the sort of considerations
that are typical to agent-based models.

A game that has resource exchange rules that are too complex, or is simple enough to be easily “solved”, will not have much success in the market. In most games, the optimal move in any given situation could theoretically be solved for by a hyperrational player. But the fact that players find them challenging demonstrates that the designers have found the right level of rule complexity for a rational but not hyperrational adult. We seek a similar complexity sweet spot in a good ABM. Readers can't get lost in all the moving parts, but if the model is so simple that readers know what your model will do before it is run—if there's no surprise—then it isn't worth running.

Of course, we are unconcerned as to whether our *in silico* agents are having any
fun or not. Also, we get to kill our agents at will.

Simulation designers sometimes have a sky's-the-limit attitude, because processor time is cheap, but game designers are forced by human constraints to abide by the KISSWEA principle (keep it simple, stupid, without extraneous additions). It's interesting to see what game designers come up with to resolve issues of simultaneity, information provision and hiding, and other details of implementation, when the players have only counters and pencil and paper.

###### Market and supply chain

*Settlers of Catan* is as popular
as this genre of games get—I saw it at a low-end department store the other day on
the same shelf as *Monopoly* and *Jenga*. It is a trading game. Each round a few
random resources—not random players—are productive, which causes gluts and droughts
for certain resources, affecting market prices. The mechanics of the market for goods
are very simple. Each player has a turn, and they can offer trades to other players
(or all players) on their turn. This already creates interesting market dynamics, without the need
for a full open-outcry marketplace or bid-ask book, which would be much more difficult
to implement at the table or in code. How an agent decides to trade can also be coded
into an artificial player, as demonstrated by the fact that there are versions of
*Settlers* you can play against the computer.

Some games, like *Puerto Rico*, *Race for the Galaxy*, *Bootleggers*, and
*Settlers* again, are supply chain games. To produce a victory point in Puerto Rico, you have
to get fields, then get little brown immigrants to work the fields (I am not making
this up), then get a factory to process the crops, then sell the final product or ship
it to the Old World. There may be multiple supply chains (corn, coffee, tobacco). The
game play is basically about deciding which supply chains to focus on and where in
the supply chain to put more resources this round. The game design is about selecting
a series of relative prices so that the cost (in time and previous supply-chain items)
makes nothing stand out as a clear win.

One could program simple artificial agents to play simple strategies, and if one is a runaway winner with a strategy (produce only corn!) then that is proof that a relative price needs to be adjusted and the simulation redone. That is, the search over the space of relative prices maximizes an objective function regarding interestingness and balance. ABMers will be able to immediately relate, because I think we've all spent time trying to get a simple model to not run away with too many agents playing the same strategy.
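A sketch of that design loop, with an invented toy payoff rule (the prices, the noise level, and the 0.5 runaway threshold are all mine): committed single-strategy agents pick whichever resource looks best, and the designer cuts the price of any strategy that runs away.

```python
import random

random.seed(1)

# Invented starting prices; corn is deliberately overvalued.
prices = {"corn": 3.0, "coffee": 2.0, "tobacco": 2.0}

def winner_share(prices, n_agents=300):
    """Fraction of agents who pile onto the most popular resource."""
    counts = dict.fromkeys(prices, 0)
    for _ in range(n_agents):
        # Each agent picks the best-looking price, with some noise.
        pick = max(prices, key=lambda r: prices[r] + random.gauss(0, 0.5))
        counts[pick] += 1
    return max(counts.values()) / n_agents

# Design loop: while one strategy runs away, cut its price and rerun.
for _ in range(50):
    if winner_share(prices) < 0.5:   # no runaway strategy: call it balanced
        break
    runaway = max(prices, key=prices.get)
    prices[runaway] *= 0.9
```

The loop is exactly the redo-the-simulation search described above, with "interestingness" operationalized as no strategy capturing a majority of agents.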

I'm not talking much about war games, which seem to be out of fashion. The central
mechanism of a war game is an attack, wherein one player declares that a certain set of
resources will try to eliminate or displace a defending resource, and the defender then declares what resources
will be brought to defense. By this definition, *Illuminati* is very much a war game;
*Diplomacy* barely is. Design here is also heavily about relative prices, because so much
of the game is about which resources will be effective when allocated to which battles.

###### Timing

How does simultaneous action happen when true simultaneity is impossible? The game
designers have an easy answer to simultaneously picking cards: both sides pick a card at a
leisurely pace, put the card on the table, and when all the cards are on the table,
everybody reveals. There are much more complicated means of resolving simultaneous
action in an agent-based model, but are they necessary?
*Diplomacy* has a similar simultaneous-move arrangement: everybody picks a move,
and an arbitration step uses all information to resolve conflicting moves.

*Puerto Rico*, *San Juan*, and *Race for the Galaxy* have a clever thing
where players select the step in the production chain to execute this round, so the
interactive element is largely in picking production chain steps that benefit you but
not opponents. Setting aside the part where agents select steps, the pseudocode would
look like this:

```
for each rôle:
    for each player:
        player executes rôle
```

Typical program designs make it really easy to apply a rôle function to an array of players.
Josh Tokle implements a hawk-and-dove game in Clojure. His code has a `game-step` in
which all the birds play a single hawk-and-dove game from Game Theory 101, followed
by all executing the `death-and-birth-step`, followed by all taking a `move-step`.

It's interesting that *Puerto Rico* and *Race for the Galaxy* have this form, because it's not how games usually
run. The usual procedure is that each player takes a full turn, executing all phases:

```
for each player:
    for each rôle:
        player executes rôle
```

I'd be interested to see cases where the difference in loop order matters or doesn't.
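Here is one toy case where it does matter, sketched in Python (the harvest/sell rôles and the demand curve are my own inventions, not taken from any of these games): if goods sell at a price that falls with total supply, then whether everyone harvests before anyone sells changes each player's payoff.

```python
def price(total_grain):
    # Hypothetical demand curve: more supply in play, lower price.
    return max(1, 5 - total_grain)

def harvest(p, players):
    p["grain"] += 2

def sell(p, players):
    p["money"] += p["grain"] * price(sum(q["grain"] for q in players))
    p["grain"] = 0

def play(loop_order):
    players = [{"grain": 0, "money": 0} for _ in range(2)]
    roles = [harvest, sell]
    if loop_order == "role-major":    # for each rôle: for each player
        for role in roles:
            for p in players:
                role(p, players)
    else:                             # for each player: for each rôle
        for p in players:
            for role in roles:
                role(p, players)
    return [p["money"] for p in players]

print(play("role-major"))    # [2, 6]: the last seller faces a thinner market
print(play("player-major"))  # [6, 6]: each player sells before the next harvests
```

Note that under rôle-major order the players within a rôle are no longer symmetric (the last to sell gets the better price), which is exactly the kind of artifact a game designer or ABMer wants to know about.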

###### Topology

One short definition of *topology* is that it is the study of what is adjacent to what.

The Eurogamers seem to refer to the games with very simple topologies as
*abstracts*—think *Go* or *Chess*. Even on a grid, the center is more valuable in
Chess (a center square is adjacent to more squares than an edge square) and the corners
are more valuable in Go (being adjacent to fewer squares $\Rightarrow$ easier to secure).
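The adjacency counts behind that claim are easy to verify with a quick sketch (the board sizes and coordinates are just illustrative):

```python
def neighbors(x, y, n, diagonal):
    # Count on-board neighbors of square (x, y) on an n-by-n board.
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if diagonal:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    return sum(0 <= x + dx < n and 0 <= y + dy < n for dx, dy in steps)

# Chess (king-move adjacency): a center square touches 8 squares, a corner only 3.
print(neighbors(3, 3, 8, diagonal=True), neighbors(0, 0, 8, diagonal=True))      # 8 3
# Go (orthogonal adjacency): a center point has 4 neighbors, a corner only 2.
print(neighbors(9, 9, 19, diagonal=False), neighbors(0, 0, 19, diagonal=False))  # 4 2
```

The counts cut in opposite directions: in Chess the high-adjacency center is the strong real estate, while in Go the low-adjacency corner is the easiest to secure.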

Other games with a board assign differential value to areas via other means. War games
typically have maps drawn with bottlenecks, so that some land is more valuable than
others. *Small World* has a host of races, and each region is a valuable target
for some subset of races.

I'm a fan of tile games, where the map may grow over time (check out *Carcassonne*), or
what is adjacent to what changes over the course of the game (*Infinite City* or
*Illuminati*).

Other games have a network topology; see *Ticket to Ride*, where the objective is
to draw long edges on a fixed graph.

War games often extol complexity for the sake of complexity in every aspect of the
game, so I'm going to set those aside. But the current crop of Eurogames tend to
focus on one aspect (topology or resource management or attack dynamics) and leave
the other aspects to a barebones minimum of complicatedness. *Settlers* has an
interesting topology and bidding rules, and the rest of the game is basically just
mechanics. *Carcassonne* has the most complex (and endogenous) topology of anything I'm discussing
here, so the resource management is limited to counting how many identical counters
you have left. *Race for the Galaxy*, *Puerto Rico*, and *Dominion* have crazy
long lists of goods and relative prices, so there is no topology and very limited
player interaction rules—they are almost parallel solitaire. A lot of card games
have a complete topology, where every element can affect every other.

###### An example: Monopoly

Back up for a second to pure race games, like *Pachisi* (I believe *Sorry!* is a
rebrand of a Pachisi variant). Some have an interactive element, like blocking other
opponents. Others, aimed at pre-literate children, like *Chutes and Ladders* or
*Candyland*, are simply a random
walk. Ideally, they are played without parental
involvement, because adults find watching a pure random walk to be supremely dull.
Adults who want to ride a random walk they have no control over can invest in the
stock market.

Monopoly is a parallel supply chain game: you select assets to buy, which are bundled into sets, and choose which sets you want to build up with houses and hotels. On top of this is a Chutes and Ladders sort of topology, where you go around a board in a generally circular way at random speed, but Chance cards and a Go to Jail square may cause you to jump position.

The original patent has an explanation for some of these details—recall that Monopoly was originally a simulation of capital accumulation in the early 20th century:

> Mother earth: Each time a player goes around the board he is supposed to have performed so much labor upon mother earth, for which after passing the beginning-point he receives his wages, one hundred dollars[...].

> Poorhouse: If at any time a player has no money with which to meet expenses and has no property upon which he can borrow, he must go to the poorhouse and remain there until he makes such throws as will enable him to finish the round.

You have first refusal on unowned properties that your token lands on (then they go up for auction, according to the official rules that a lot of people ignore), and you owe rent when your token lands on owned properties, and Mother earth periodically pays you \$200. All of these cash-related events are tied to the board movement, which is not the easiest or most coherent way to cause these events to occur. E.g., how would the game be different if you had a 40-sided die and randomly landed on squares all around the board? Would the game be more focused if every player had a turn consisting of [income, bid on available land, pay rent to sleep somewhere] phases?

The confounding of a supply chain game with randomization via arbitrary movement is what makes it successful, because the Chutes and Ladders part can appeal to children (the box says it's for 8-year-olds and up), while the asset-building aspects are a reasonable subgame for adults (although it is unbalanced: a competent early leader can pull unsurpassably ahead). But it is also the death of Monopoly as a game for adults, because there are too many arbitrary moving parts about going around an arbitrary track.

I can't picture a modern game designer putting together this sort of combination of elements. I sometimes wonder if the same sort of question could be asked of many spatial ABMs (including ones I've written): is the grid a key feature of the game, or just a mechanism to induce random interactions with a nice visualization?

###### Conclusion

Microsimulation designers and for-fun game designers face very similar problems,
and if you're writing microsimulations, it is often reasonable to ask *how would a
board game designer solve this problem?*. I discussed several choices for turn order,
trading, topology, and other facets, and in each case different choices can have a real
effect on outcomes. In games engaging enough to sell well, the designers allowed
themselves a nontrivial choice for only one or two facets, which become the core of
the game; the other facets are left to the simplest possible mechanism, to save the
players' cognitive effort.

Also, now that you've read all that, I can tell you that Bamboo Harvest focuses on a shifting-tiles topology, with a relatively simple supply chain. We decided against marketplace/trading rules.

### Intercoder agreement: the R script and its tests

Here, I will present an R script to calculate an information-based measure of intercoder agreement.

The short version: we have two people putting the same items into bins, and want to know how often they are in agreement about the bins. It should be complexity-adjusted, because with only two bins, binning at random achieves 50% agreement, while with 100 bins binning at random produces 1% agreement. We can use mutual information as a sensible measure of the complexity-adjusted agreement rate. A few more steps of logic, and we have this paper in the Journal of Official Statistics describing $P_i$, a measure of intercoder agreement via information in agreement. I also blogged this paper in a previous episode.
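The chance-agreement baseline is easy to check by simulation. A quick sketch (the trial count and seed are arbitrary):

```python
import random

def random_agreement(bins, trials=100_000, seed=7):
    # Two raters independently throw each item into one of `bins` bins
    # uniformly at random; return the fraction of items where they agree.
    rng = random.Random(seed)
    return sum(rng.randrange(bins) == rng.randrange(bins)
               for _ in range(trials)) / trials

print(random_agreement(2))    # close to 1/2
print(random_agreement(100))  # close to 1/100
```

Any raw agreement rate therefore has to be adjusted before it can be compared across coding schemes with different bin counts.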

There are two features of the paper that are especially notable for our purposes here. The first is that I said that the code is available upon request. Somebody called me out on that, so I sent him the code below. Second, the paper has several examples, each with two raters and a list of their choices, and a carefully verified calculation of $P_i$. That means that the tests are already written.

The code below has two functions. We could turn it into a package, but it's not even worth
it: just `source("p_i.R")` and you've got the two defined functions. The `p_i` function does the
actual calculation, and `test_p_i` runs tests on it. As in the paper, some of the tests
are extreme cases like full agreement or disagreement, and others are average tests that
I verified several times over the course of writing the paper.

Could it be better? Sure: I don't do a very good job of testing the code for really
pathological cases, like null inputs or something else that isn't a matrix. But the tests
give me a lot of confidence that the `p_i` function does the correct thing given
well-formed inputs. It's not mathematically impossible for a somehow incorrect function to
give correct answers for all six tests, but with each additional test the odds diminish.

Here is the code. Feel free to paste it into your projects, or fork it from Github and improve upon it—I'll accept pull requests with improvements.

```
p_i <- function(dataset, col1=1, col2=2){
    # Entropy of a vector of probabilities, in bits.
    entropy <- function(inlist){
        -sum(sapply(inlist, function(x){log2(x)*x}), na.rm=TRUE)
    }

    # Information in agreement: sum over the diagonal of the crosstab.
    information_in_agreement <- function(diagonal, margin1, margin2){
        sum <- 0
        for (i in 1:length(diagonal))
            if (diagonal[i] != 0)
                sum <- sum + diagonal[i]*log2(diagonal[i]/(margin1[i]*margin2[i]))
        return (sum)
    }

    dataset <- as.data.frame(dataset)  # in case the user provided a matrix
    crosstab <- table(as.data.frame(cbind(dataset[,col1], dataset[,col2])))
    d1tab <- table(dataset[,col1])
    d2tab <- table(dataset[,col2])

    # Normalize the counts to probabilities.
    d1tab <- d1tab/sum(d1tab)
    d2tab <- d2tab/sum(d2tab)
    crosstab <- crosstab/sum(crosstab)

    entropy_1 <- entropy(d1tab)
    entropy_2 <- entropy(d2tab)
    ia <- information_in_agreement(diag(crosstab), d1tab, d2tab)
    return (2*ia/(entropy_1+entropy_2))
}

test_p_i <- function(){
    # The raters fully agree: P_i should be exactly 1.
    fullagreement <- matrix(
        c(1,1,1,1,2,2,2,2,3,3,
          1,1,1,1,2,2,2,2,3,3),
        ncol=2, byrow=FALSE
    )
    stopifnot(p_i(fullagreement)==1)

    # The raters never agree: P_i should be exactly 0.
    noagreement <- matrix(
        c(1,2,1,2,1,2,3,1,3,2,
          2,1,3,1,2,3,2,2,1,3),
        ncol=2, byrow=FALSE
    )
    stopifnot(p_i(noagreement)==0)

    # One rater puts everything in one bin: no information in agreement.
    constant <- matrix(
        c(1,1,1,1,1,1,
          1,1,2,2,2,3),
        ncol=2, byrow=FALSE
    )
    stopifnot(p_i(constant)==0)

    # Agreement below the chance rate gives a negative P_i.
    neg_corr <- matrix(
        c(1,1,1,1,1,2,2,2,2,2,
          1,2,2,2,2,1,1,1,1,2),
        ncol=2, byrow=FALSE
    )
    stopifnot(abs(p_i(neg_corr) - -.2643856) < 1e-6)

    # Two average cases, with values verified while writing the paper.
    rare_agreement <- matrix(
        c(1,1,1,2,1,2,2,2,3,3,
          1,1,1,1,2,2,2,2,3,3),
        ncol=2, byrow=FALSE
    )
    stopifnot(abs(p_i(rare_agreement) - .6626594) < 1e-6)

    common_agreement <- matrix(
        c(1,1,1,1,2,2,2,3,2,3,
          1,1,1,1,2,2,2,2,3,3),
        ncol=2, byrow=FALSE
    )
    stopifnot(abs(p_i(common_agreement) - 0.6130587) < 1e-6)
}
```