So the first thing I want to show you is this thing called a Galton board.
Now, this was worked out by Francis Galton, who's actually quite an interesting person.
If you hear about Francis Galton, you'll often hear that he was the first person who tried to measure human intelligence, but that's actually not true.
What Galton tried to do was measure human eminence.
And he thought of eminence as, I suppose, something like high position in the cultural dominance hierarchy.
And this was back in Victorian England in the late 1800s, and so eminence would have been intellectual achievement, financial achievement, cultural achievement, achievement broadly speaking, but really considered within the confines of a dominance hierarchy.
So at that point in time, the English regarded themselves as the prime nation and ethnicity on the planet, and the aristocratic Englishmen regarded themselves as the prime human beings in that prime nation and ethnicity.
It was really an idea more associated with uppermost position in a dominance hierarchy than with anything we would conceptualize as intelligence today.
And Galton was a polymath, a genius in many, many fields, and he got very interested in the idea of measuring human differences, things like reaction time and height and a number of other attributes I can't remember,
and he wanted to see if any of those things could be used to predict eminence.
It turned out that they couldn't, partly because eminence is not the kind of category that you can make into a scientific category.
And partly because his measurement instruments in many ways weren't sensitive enough.
He didn't make enough measurements and the statistics weren't sufficiently sophisticated.
But the idea that you could measure elementary attributes and that you could use them to predict something important was still an important idea.
And it's one of those examples in science, sort of like phrenology.
I mean, everyone makes fun of phrenology now.
You know, the idea that you could read someone's character by mapping out the protrusions and dips and so forth on their skull sounds ridiculous to modern ears.
But there was, again, there was some idea behind it, and that was the idea that cortical functions could be localized and that cognitive functions could be differentiated into separate functions.
We still do that.
We think of emotions and motivations and personality traits and intelligences, even though intelligences is not a very good idea, and the notion that you can differentiate human psychological function, and that those functions would be related in some way to the underlying neurological and physical architecture, that's not a stupid idea.
So an idea can be very intelligent at one level of analysis and not so intelligent at another.
So I would say the problem with the phrenologists was actually a problem of operationalization.
They had the theory right, in some sense, but they didn't have the measures right.
Now, one of the things Galton came up with was this thing called a Galton board, and a Galton board demonstrates how a normal distribution is produced.
Now, the normal distribution is an axiom of modern statistics, and every system has to have its axioms. The axiom of the normal distribution is basically the idea that around any measurement there are going to be variations in measurement, and those variations are basically going to be random.
And then another axiom is that extreme outlying measurements are going to be rare, and small outlying measurements are going to be more common.
So you can imagine, if I measured any one of you a hundred times, let's say with a tape measure, I'm going to get a set of values that are not always exactly the same.
And so there'll be an average, and then there'll be some deviation around that average, and if I looked at the hundred measurements that I took of one of you, I would get a normal distribution around that average.
And that's sort of the fuzziness of the mean.
Now that might be partly because my measurement instrument isn't very accurate.
It might be partly because at some point, when you're being measured for the 50th time, you're slouching a little bit, or maybe other times you're standing up a bit straighter, or maybe I measure you in the morning sometimes and in the evening at other times, and you're actually taller in the morning than you are in the evening because your spine has a chance to decompress at night, so maybe that adds half an inch to your height in the morning.
And so there's going to be shifting and movement around the central tendency, around the average.
Now, it turns out if I measure all of you, the same thing's going to happen.
What we're going to get is an average height, and then a distribution around it: a few of you who are much taller than the average, a few who are much shorter, and most of you who are just somewhat taller or somewhat smaller than the average.
Now, generally, the idea of a normal distribution is predicated on the idea that the variation around the average is actually random.
And that's an important thing, because the variation around an average is not always random.
Now, psychologists will tell you, and so will most social scientists who use classical statistics, that the normal distribution is the norm, or maybe they'll say more than that, is that it's the standard case that whatever set of variables you measure is going to come out in a normal distribution,
and that means that you can apply all of the stats that you're going to learn, to decompose the world and reconstruct it, to your data sets, because the data sets will fit the assumptions of the statistics.
The problem with that is that it's often wrong.
Now, I asked you guys to buy The Black Swan.
And we haven't talked about The Black Swan much yet, and I'm not going to test you on it, but I would recommend that you read it, especially if you're interested in psychology, because you've got to watch your axioms. The idea that the normal distribution might not always be correct, and certainly might not be a correct description of your data, is a fundamentally important idea, because if it's wrong, if your data isn't normally distributed, then the phenomenon that you're looking at isn't what you think it is, and the statistics that you're going to use aren't going to work.
Now this came as quite a puzzle to me at one point, because it turned out that when we produced the Creative Achievement Questionnaire (and you'd think I would have known better by this time, because it wasn't that long ago, it was like fifteen years ago),
the data never came back normally distributed.
It came back distributed in what they call a Pareto distribution, where everyone stacks up on the left-hand side and the curve drifts off to the right; it's one of those curves where almost everyone has zero or one and a small minority of people have very high scores.
Now when you get a Pareto distribution there's something going on that's not random.
And that doesn't mean you can ignore it.
Now one of the things that psychologists will do sometimes, if they get a distribution that's not normal, is they'll do a mathematical transformation, like a logarithmic transformation, to pull in the outliers and to try to make the data fit the normal distribution again, assuming that there's some kind of measurement error or that the scale's got a logarithmic function.
It isn't always obvious that that's a useful and appropriate thing to do, because it's the case in many situations that the extremeness of the distribution is actually an accurate representation of the way that phenomenon behaves in nature.
Now, I want to show you how a normal distribution works.
Have any of you ever seen a demonstration of why a normal distribution is normal and why it takes the shape it takes?
Has anybody ever seen that?
No, it's weird, eh?
Because you'd think that given its unbelievable importance, especially in psychology, where everything we do is measured and has an average and a standard deviation, that there'd be some investigation into why we make that presupposition.
There's a great book, one book I would recommend for those of you who might be interested in psychology as a career, there's a book called A History of Statistical Thinking.
Now you'd think, Jesus, if there's anything more boring than statistics, it's got to be the history of statistics.
But it turns out that that's really wrong.
I mean, the history of statistics is actually the history of the social sciences, and not only of the social sciences, because statistics was actually initially invented by astronomers who were measuring planetary positions, and then the statistical processes that have been used to underpin modern science have been tossed back and forth,
weirdly enough, between social scientists and physicists over about the last 200 years, because the physicists ended up having to describe the world from a statistical perspective, right?
Because if you go down into the realm of atomic and subatomic particles, what you find is that things behave statistically down there.
They don't behave deterministically.
And it's the same at the level of the human scale.
You know, we behave statistically, not deterministically.
So, you know, sometimes you hear this old idea that psychology has physics envy or something like that.
It turned out that for much of the history of the development of statistical ideas, physics had psychology envy, which I think is quite funny.
So, anyways, if you look at this book on the history of statistical thinking, you can see how the idea developed that populations could have behavior, that populations could be measured, that states, political states, could be measured, and that human behavior could be measured.
I didn't realize until I read this book how revolutionary the idea of statistics actually was because, of course, it's part of the idea of measurement.
I had this very interesting client at one point.
He was an old guy, and he'd been a psychologist and a financier and a statistician, and he was in love with mathematics.
He was one of these guys for whom mathematical equations had this immediate glimmer of beauty, and he made this very interesting little gold sculpture that I have up at my office that's a representation of what people claim to be the most perfect mathematical equation ever constructed.
So he made this little gadget.
It's like a religious icon almost, and that's really what he thought of it.
And I'm afraid, foolishly enough, that I can't tell you the name of the equation.
Some of you might know it.
It relates i, e, and pi.
Is there anybody in here who—yes, what is that?
Yes, and what is that equation?
Okay, do you know why it's such a remarkable equation?
In one equation.
Right, so he thought that in some sense this equation summed up the magnificence of the mathematical universe, and it was so funny, because he also made little pins of this equation, also out of gold, that people could wear, and there was another person in our department at that point, and I was wearing this pin around, and she told me what the equation was.
She pointed at that, and she got all excited because it was this perfect equation that related all these fundamental constants to one and zero.
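For reference, the equation being described here is almost certainly Euler's identity, which ties e, i, and pi together with one and zero in a single statement:

$$ e^{i\pi} + 1 = 0 $$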
So, anyways, he taught me a lot of this.
Well, he was a client of mine.
He was very much obsessed with mathematical ideas.
He couldn't get them out of his head.
It was a true obsession and a useful one.
But he taught me a lot of things about statistics that I just never knew at all.
One of them was the Pareto distribution, and it just staggered me that I didn't know it.
I only learned this stuff about five or six years ago, and it made me feel like a complete moron, because I'd, you know, gone through an immense amount of psychological training, and I was measuring things like prefrontal ability and intelligence and personality and creativity, and then in the creativity measurement I stumbled into these Pareto distributions.
I thought they were a bloody mistake.
I didn't know what the hell they were.
You know, and I've also found economists who didn't know about the Pareto distribution, which is a pretty bloody bizarre thing.
Like, it really is a strange thing.
So, anyways.
One of the things he showed me is how the normal distribution comes about.
I had a student who did a PhD thesis on Galton, so I knew something about Galton at this point.
He was doing part of his thesis on the history of the measurement of intelligence, and Galton was a key figure in the establishment of that sort of measurement.
But this is one of the things Galton invented, so let's take a look at it here.
So let me show you a picture of a Galton board so you get a better sense of exactly what it's doing.
So there's a good one right there.
Let's see if we can make that a little bigger.
Yeah, so basically all that happens with a Galton board is that you have a bunch of pegs on a board.
The board is vertical.
And you drop balls down it, marbles or whatever it is, and the marbles come in at one place, and then they bounce down the pegs, and they distribute themselves in a normal distribution.
Now, the reason they do that is because there are far more ways of getting down the middle than there are of getting to either side.
Now you see, because to get all the way to the right side, say, the ball has to go right, right, right, right, right, right, right, right, right, and then fall into the little cup.
And then on the left it has to do exactly the same thing.
There's only one way it can do that.
But to get into the middle it can go left, right, left, right, left, right, right, left, etc.
There's a far greater number of pathways for the ball to go from the top down to the middle than there are for the ball to end up at the sides.
And so all that means fundamentally is that the probability that a given ball is going to land in the middle, or near the middle, is much higher than the probability that a ball is going to land in the extremes.
And so it's just a description of how random processes lay themselves out around a mean.
Okay, so that's a normal distribution.
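A minimal sketch of that process, just to make the mechanism concrete: each ball makes a series of independent left-or-right bounces, and tallying where the balls land reproduces the bell shape. This is illustrative code, not anything Galton built; the number of rows and balls is arbitrary.

```python
import random
from collections import Counter

ROWS = 12        # pegs each ball bounces off on the way down
N_BALLS = 10_000

def drop_ball(rows: int) -> int:
    # At each peg the ball goes right with probability 1/2;
    # its final bin is simply how many times it went right.
    return sum(random.random() < 0.5 for _ in range(rows))

counts = Counter(drop_ball(ROWS) for _ in range(N_BALLS))

for bin_index in range(ROWS + 1):
    # Middle bins collect far more balls than the extreme bins,
    # because far more left/right sequences end up there.
    print(f"bin {bin_index:2d}: {'#' * (counts[bin_index] // 100)}")
```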
Now, if you have normal distribution one, and you have normal distribution two, you can overlap them, right?
And if they overlap perfectly, so that the mean and the standard deviation are the same, then there's nothing different about them.
Because one thing you want to ask is, well, how do you know if two means are different?
And the answer is you don't know unless you know what the means are measuring, and you don't know what the means are measuring unless you know the variation of the measurement.
And so you have to know the mean and the standard deviation.
So the standard deviation is like the width of the mean in some sense.
And all statistics do, generally speaking, is take one normal distribution and another, slap them on top of each other, measure how far apart they are and how much they overlap, and correct that for the sample size, and then you can tell if the two distributions are significantly different.
So what you would do is run a whole bunch of normal distribution processes by chance, and you're going to get a distribution of distributions, in some sense, and then you can tell what the probability is that the normal distribution that you drew for group A is the same as the normal distribution that you drew for group B. And that's all there is, basically, to standard group comparison statistics.
And that's how we decide when we have a significant effect, whether there's a significant difference between two groups. Or better still, you randomly assign people to two groups, and then you do an experimental manipulation, and then you measure the outcome, the means, and then you can test to see whether your experimental manipulation produced an effect that would be greater than chance.
And then you infer that, well, there's a low probability, one in twenty, that you would produce that effect if you were just running random simulations, and therefore you can say with some certainty that there's an actual causal effect.
That's with an experimental model.
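To make that logic concrete, here is a minimal sketch with made-up numbers: to decide whether two group means differ by more than chance would allow, shuffle the group labels many times and count how often a purely random split produces a gap at least as large as the one you observed.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def permutation_p_value(group_a, group_b, n_shuffles=10_000):
    """Estimate how often a difference in means at least this large
    would show up if the group labels meant nothing (pure chance)."""
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_shuffles):
        random.shuffle(pooled)
        fake_a, fake_b = pooled[:len(group_a)], pooled[len(group_a):]
        if abs(mean(fake_a) - mean(fake_b)) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical data: a treated group and a control group.
treated = [5.1, 6.0, 5.8, 6.3, 5.5, 6.1]
control = [4.9, 5.2, 5.0, 5.4, 4.8, 5.3]
print(permutation_p_value(treated, control))  # small value -> unlikely to be chance
```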
Now, you know, why do you think—all right, let's say—let me give you a conundrum.
So I used to study people who were sons of male alcoholics.
Okay, now there was a reason for that.
We didn't have women.
Why might that be?
We didn't use women.
Why?
Why?
Why would they be confounding, apart from the fact that they're women?
That's supposed to be a joke, you know?
Jesus.
It's a better joke than that.
Okay.
Well, it could be that there's some sort of, like, emotional coping mechanism that differs between the way that men deal with...
Okay, so that's not a bad answer, but it's not the answer that was appropriate for our research, and I'll tell you why, because you couldn't infer it from the question.
We were interested in what factors made alcoholism hereditary, because it does seem to have a strong hereditary component, and what we basically concluded, after doing a tremendous amount of research, was that people who are prone to alcoholism, at least one type of person who's prone to one type of alcoholism, got a very,
very powerful stimulant effect from the alcohol during the time that their blood alcohol level was ascending, in the ten or fifteen minutes after they took a drink, especially if they took a large drink fast, or multiple large drinks fast.
You can probably tell, by the way, if you're one of these people. If you want to go do this in the bar the next time you go, go on an empty stomach, take your pulse, write it down, drink three or four shots fast, wait ten minutes, take your pulse again.
If it's gone up ten or fifteen beats a minute, look out.
Because that means alcohol is working as a psychomotor stimulant for you.
And we found that for many of these people that was an opiate effect.
What seemed to happen was that when they drank alcohol fast, they probably produced beta-endorphin, although we were never sure.
It can be blocked with naltrexone, which is an opiate antagonist.
Anyways, the other characteristic of that pattern of alcohol consumption is that the real kick only occurs when you're on the ascending limb of the blood alcohol curve.
So, you know, first of all your blood alcohol goes up and then it goes down.
And generally when it goes down it's not pleasant.
That's when you start to feel hungover.
And hangover is actually alcohol withdrawal, by the way.
So it's like heroin withdrawal except it's alcohol withdrawal.
And it's generally not pleasant for people so they usually sleep through it.
Or it puts them to sleep.
But if you're one of these people who get a real kick on the ascending limb of the blood alcohol curve, then you can just keep pounding back the alcohol and it'll keep hitting you and keep you in that position where you're, you know, on the ascending part of the blood alcohol curve.
And you can probably tell if you're one of these people if you can't stop once you get started.
You know, so if it's like you have four drinks, quick, and it's like, man, you're gone until the alcohol runs out, or until it's four in the morning, or until you've spent all your money, or you've been at the last bar in town, or you're sitting on your friend's bed after everybody's gone home from the party and you're still drinking, you might suspect that you're one of those people.
And if you are one of those people, well then you should watch the hell out, because alcohol is a vicious drug, and it gets people in its grasp hard, and it's hard for them to escape once it does.
People also drink to quell anxiety.
So now the problem—because we were looking at the genetics of alcoholism, it wasn't easy to study offspring of alcoholic mothers, and the reason for that is they might have consumed alcohol during pregnancy.
And that's a bad idea, because there are certain key times in pregnancy where even a few drinks are not good.
And if I remember correctly, those turn out to be the times when the fetus is producing the bulk of its hippocampal tissue.
Anyway, so if it turned out that daughters or sons of female alcoholics were markedly different from the general population in some manner, we wouldn't be able to tell if that was a consequence of alcohol consumption during pregnancy or if it was a genetic reflection.
So what we wanted to do was study sons of male alcoholics, and so their mothers actually couldn't be alcoholic.
And we wanted their fathers—the best subjects had an alcoholic father and an alcoholic grandfather and at least one or more alcoholic first or second degree male relatives.
And they couldn't be alcoholic themselves, and they had to be young, because obviously if they were forty and they still weren't alcoholic then they probably weren't going to be alcoholic.
So we wanted to catch them young, starting at eighteen, which was the drinking age in Quebec. That was even a little late, probably, but, you know, that's the best we could do from an ethical perspective, so we used to bring these guys into the lab and get them quite drunk.
The National Institute on Alcohol Abuse and Alcoholism pretty much put a stop to that research, because we used to bring them in and, you know, we'd get their blood alcohol level up to 0.12 or 0.10.
It had to be pretty high.
It actually looked like the real physiological effects seemed to kick in when people hit legal intoxication.
So you don't really get the opiate effect until you pop yourself up to about 0.08, which was the legal limit for driving at that point.
So some of these guys were pretty big, they'd come in, they were maybe 230-pound guys, and to get them up to, you know, 0.10 or 0.12, you had to give them quite a whack of alcohol.
And then what we usually do is we let them sober up till 0.06, about that, and then we'd send them home in a cab.
Well, when the NIAAA got all ethical on the whole situation, they wouldn't let us send them home until they hit 0.02.
It was like, well, if you're 240 pounds and we've just nailed you with enough alcohol so that your blood alcohol level hit 0.12, you're going to be sitting in our bloody lab, bored to death, feeling horrible, for like six or seven hours, and you'd be pretty damn irritated about it.
It's like, it wasn't obvious how we were supposed to keep the people there.
It's like, well, can I leave?
No.
I'm not paying you if you leave.
It's like, that's going to produce real positive outcomes with, like, drunk people in the lab.
That's gonna work really well.
Of course, then they'd never come back either because it was such a bloody awful experience.
So I stopped doing that research partly because it became impossible.
Anyways, we did find out a lot.
We found out that there was this one particular pattern of alcohol abuse that seemed to be hereditary.
So, and we couldn't study it in women.
Anyways, so now there was a problem with this line of research, and the problem was, well, alcoholism comes along with other problems.
So this was correlational research, right?
We'd pick a group of people, and we'd match them—the sons of male alcoholics—we'd match them with people who weren't sons of male alcoholics.
So they were still sons, they were still the same age, but their fathers weren't alcoholic.
Now, here's the problem.
What should we match them on?
Age.
Gender.
Well, it's not obvious that you can match them on number of drinks, because you want the people in the one group, the sons of the alcoholics, not to be alcoholic themselves, and you don't want alcoholics in the second group either. So do you match them on number of drinks?
Well, if you don't, then you don't know if the effect that you're measuring is a consequence of the genetic difference or of the number of drinks per week or drinks per occasion that the people had, right?
That would be a confounding variable.
Do you match them for antisocial personality?
Or antisocial personality symptoms?
Because lots of people who are alcoholic, they tilt towards the antisocial side of the spectrum.
So do you control for that?
Well, you don't know, because you don't know if antisocial personality is part of alcoholism, like it's part of the same underlying genetic problem, or if it's a secondary manifestation, or if it's a consequence of drinking.
You don't know any of that.
Do you match them on depressive symptoms?
Do you match them on schooling?
Do you match them on education?
Do you match them on personality?
Do you match them on other forms of psychopathology?
Do you match them on what they drink?
Do you match them on how many drinks per occasion they drink, when they drink, etc?
Well, the answer to that is: you don't know.
And so what that means is there's actually an infinite number of potential covariates, because you don't know which of the differences between the two groups are the differences that are relevant to the question at hand.
Now, that's actually one of the big problems with psychiatric research.
In fact, it might be a problem with psychiatric research that's so serious that it cannot be solved.
You know, so if you take kids who have attention deficit disorder, say, which is a horrible diagnostic category, and you match them with kids who don't, and then you look to see what makes the ADHD kids different, what do you match them on?
Well, you don't know.
So usually what happens is the people who do psychiatric research finesse this a bit.
They match them on the important variables—age, physical health maybe, education—but the problem is you actually have no idea what the important variables are, and there are an unlimited number of them.
And so that's actually why random assignation to groups is so important.
Now, if I take all of you and I say, well, let's look at the effects of alcohol.
So what I would do is I'd say, you'd come towards me and I'd do a Galton-board sorting.
You go to the left, you go to the right, you go to the left, you go to the right.
This is a non-biased separation into two populations.
And then maybe I'd give one group two ounces of alcohol in water, or in Coke, say, and in the other group I'd put a few drops of alcohol on the top of a glass of Coke so it smelled like alcohol and tasted like alcohol when you had your first drink, and then I'd put you through a whole battery of neuropsychological and personality tests, which I did, by the way, to a bunch of people; I think it was the first publication I had, back in about 1985 or something like that.
Random assignation.
Gets rid of the necessity for the infinite number of covariates, right?
Because you're all different in a whole bunch of important ways, but we could assume, as long as I threw half of you in this group randomly and half of you in that group randomly, that all your various idiosyncrasies, no matter how many there are, would cancel out.
And that's the massive advantage to random assignation.
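A small illustrative sketch of why that works; the covariates here (IQ, anxiety, drinks per week) are invented stand-ins for the endless list of things you never measured, and the point is just that a random split leaves them roughly balanced between the two groups.

```python
import random

random.seed(1)

# Hypothetical subjects, each with several idiosyncrasies we never measure.
subjects = [
    {"iq": random.gauss(100, 15),
     "anxiety": random.gauss(50, 10),
     "drinks_per_week": random.expovariate(1 / 5)}
    for _ in range(1_000)
]

random.shuffle(subjects)                      # random assignment
group_a, group_b = subjects[:500], subjects[500:]

def avg(group, key):
    return sum(s[key] for s in group) / len(group)

for key in ("iq", "anxiety", "drinks_per_week"):
    # With enough subjects, the unmeasured differences roughly cancel out.
    print(key, round(avg(group_a, key), 2), round(avg(group_b, key), 2))
```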
Now, one of the things you're going to notice when you go through psychology, especially if you do psychopathological work, is that the studies are almost always correlational.
And they'll control for relevant covariates, that's what it'll say in the paper.
But the problem is, you don't bloody well know what the relevant covariates are.
If you did, you'd already understand the condition, you wouldn't have to do the damn research.
So you can make a pretty strong case that all psychiatric research that studies psychopathological groups compared to a control is not interpretable.
And it turns out in that kind of research that who the control should be is the killer.
Because you don't know.
Picking the psychopathological population is easy.
You just pick them according to whatever diagnostic checklist you happen to use.
But when you figure out who to control them against, it's like, siblings?
I mean, what do you do?
How do you match a population?
The answer is you can't.
Anyways, the reason that we didn't include women was because of the uncontrollable potential confounding effects.
And the only way that you can deal with that is to assign randomly.
And that's an important thing to remember.
It's why experimental designs are way more powerful than correlational designs.
And the problem is, frequently in psychology, what you see are correlational designs.
Now, I think if you're careful and you dig around and, you know, you're obsessively careful, you can extract information out of correlational studies, but it's no simple thing to do.
You see this problem come up, too, when you hear about studies on diet, you know?
It's like, well, you should eat a low-fat diet or you should eat a high-fat diet.
Usually what happens is they track people across time who hypothetically have been having one diet or the other, but the problem is you don't know what the hell else makes those people different.
And the problem with that is there is an infinite number of potential things that make them different.
And that's a big problem.
The probability that you've picked the one thing that makes people who have a high-fat diet different from other people, and that it happens to be that they have a high-fat diet, and that's the only difference.
It's like, yeah, no.
Not at all.
Definitely not.
Okay, so anyways.
Random assignation gets rid of the problem of the infinite number of covariates, and that's worth knowing.
And then the other thing that we've just figured out is that, you know, if you make a measurement, you're going to get a distribution around the measurement, and if you do that in two groups, you can check out the overlap between the two groups, and you can determine what the probability is that that overlap is there as a consequence of chance.
Now you might say, why not set your probability level to, like, 1 in 10,000, so you could be absolutely damn sure that the two groups don't overlap.
So why wouldn't you do that?
Why wouldn't you set p = .0001 instead of p = .05?
So that would be 1 in 10,000 instead of 1 in 20.
Yes?
That's exactly right.
So you're basically damned if you do and damned if you don't, which is a very important thing to remember about life.
Because generally when you make a choice of any sort, there's error and risk on both sides of the choice, right?
And it's really, really useful to remember that because people always act like their current situation is risk-free, which is never, ever the case.
People were trying to figure out, well, how do we balance the risk of finding something that doesn't exist against the risk of not finding something that exists?
And the reason—it's one in twenty.
And why is that?
It's because someone made that guess forty years ago or fifty years ago and it's just stuck.
There's no reason.
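Here's a rough simulation of the trade-off being described, with invented numbers: tightening the threshold from .05 to .0001 buys you fewer false alarms, but it also makes you miss a real but modest effect far more often.

```python
import random
from statistics import NormalDist, mean, stdev

def two_sample_p(a, b):
    """Approximate two-sided p-value for a difference in means (normal approximation)."""
    se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
    z = abs(mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(z))

def detection_rate(true_effect, alpha, n_experiments=2_000, n_per_group=20):
    """How often a real but modest effect clears the significance bar."""
    found = 0
    for _ in range(n_experiments):
        a = [random.gauss(0, 1) for _ in range(n_per_group)]
        b = [random.gauss(true_effect, 1) for _ in range(n_per_group)]
        found += two_sample_p(a, b) <= alpha
    return found / n_experiments

for alpha in (0.05, 0.0001):
    # Stricter alpha: fewer false alarms, but many more real effects missed.
    print(alpha, detection_rate(true_effect=0.5, alpha=alpha))
```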
And so the other thing to notice is that you want to be careful about the p = .05 phenomenon, because one of the things you'll see is that people treat any experimental result they get like it's significant if the probability is .05 or less, and like it's not if it's .06 or greater.
And that's not smart.
Because the cutoff is arbitrary.
You need to know three things to interpret an experimental result.
It's like you can't calculate the area of a triangle without knowing three things.
I think it's three things anyways.
You need to know the number of people in the study.
You need to know the size of the effect, so that would be the relationship between two or more variables that you're interested in looking at, and then you need to know the probability that you would find that effect size among a population of that size by chance.
And you cannot interpret one of those numbers without the other two.
It's not possible.
And now, psychologists, statisticians among psychologists who have a clue, have been jumping up and down for fifty years trying to get psychologists to report all three of those every time they report anything.
Effect size, which seems logical, right?
The effect size is the difference between the two means divided by the standard deviation of the pooled group.
Well, you need both of those because it's the standard deviation that tells you how damn big the numbers are.
Because what does 70 mean?
Well, it doesn't mean anything.
Seventy.
It's like if I just came out here and said that, I could say 70 and 40.
Are those different?
Well, what the hell does that mean?
It doesn't mean anything.
You need to know what the units are, and the distribution gives you the units.
And so once you know what the units are, you can say, well, here's how big the difference is between these two groups, and here's the probability that that would be acquired by chance, and so that's how confident you can be that it's an actual difference.
And so you want to always report effect sizes.
And most of you are going to be taught to look at the damn probability.
And that's stupid.
It's like, when you want to know how much difference there is in height between two people, you want to know how much difference there is in height.
You also might want to know to what degree that's there because of chance.
But it's the effect size that's the critical variable.
Now, you can't understand the effect size without understanding how many subjects were in the experiment, And also the probability that you would acquire that by chance.
So you want to keep that in mind.
And when you're reading scientific papers, you want to be looking at the effect size.
How large an effect size is this?
Now, all the effect sizes can be transformed from one into another.
So you have correlation coefficients which go up to one.
You have the square of the correlation coefficient, which is the amount of variance that's accounted for, and that also goes up to one.
And you have your basic correlation coefficients, or sorry, your standard deviation effect sizes, which is mean one minus mean two over the standard deviation of the entire group.
And so, large effect sizes in psychology—I've talked to you about this before—are correlation coefficients of about 0.5 or above, or r squareds of 25% of the variance, which is 0.5 squared, or mean 1 minus mean 2 of, say, half a standard deviation or greater.
And you can derive the conclusion that those are relatively large effect sizes by looking at the distribution of the effect sizes across the published literature.
And most of you will be told estimates for effect sizes that are actually way too large, because those were guesses too.
How big's a large effect size?
Well, some statistician guessed fifty years ago.
So, it's just like the.05 rule.
It's arbitrary.
That doesn't mean it's stupid, but it's arbitrary, and you don't want to get stuck on it like it's some sort of fact.
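Since effect size came up repeatedly here, a small illustrative sketch of the three quantities just described: the sample size, the effect size (the standardized mean difference, Cohen's d), and one standard way to convert that into a correlation. The data are made up, and the d-to-r conversion below assumes equal group sizes.

```python
from statistics import mean, stdev

def cohens_d(group_1, group_2):
    """Difference between the two means in units of the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    pooled_var = ((n1 - 1) * stdev(group_1) ** 2 + (n2 - 1) * stdev(group_2) ** 2) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / pooled_var ** 0.5

def d_to_r(d):
    """Standard conversion to a correlation coefficient (assumes equal-sized groups)."""
    return d / (d ** 2 + 4) ** 0.5

# Hypothetical example: two small groups of scores.
g1 = [105, 110, 98, 120, 115, 108]
g2 = [100, 95, 102, 99, 97, 104]

d = cohens_d(g1, g2)
r = d_to_r(d)
print(f"N = {len(g1) + len(g2)}, d = {d:.2f}, r = {r:.2f}, r^2 = {r**2:.2f}")
```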
Okay, so.
Now let's look at Pareto distributions. That'll do.
Okay, the first thing you're going to see about a Pareto distribution is it's definitely not normal.
Now, one of the things that I mentioned to you before, but I want to hammer this home because it's unbelievably important for determining how to understand the way the world lays itself out and how to interpret the way that people distribute themselves in terms of their success across time.
Now, this is a fundamental law, and I'll show you how it works.
Okay, so the law is, most of you get nothing, and a few of you get everything.
Okay, so that's the law.
Now, you might say, well, that's because of—you might attribute that to various things.
So, for example, if you're a left-winger you attribute it to the inequities of the social structure, and if you're a right-winger you basically attribute it to the fact that, well, most people are, you know, not that good at anything, so it's no wonder they end up with nothing.
So they're not very smart and they're not very hard-working so they don't get very much of whatever it is that they're after.
Okay, so we'll take that apart a little bit but before we do that I want to show you an animation.
Now, you've got to watch this one carefully.
Oh yes, there it is.
Okay, yeah, this thing moves very, very quickly so I'll go back to the beginning and...
All right, now.
So here's the deal.
Each of you gets ten dollars.
Okay, so that's a non-random starting place, right?
You're all starting in the same place.
That's definitely not random.
Okay, now here's how you play this game.
You flip a coin.
You turn to your partner and you flip a coin, and if it comes up heads, then you win and they lose.
And if it comes up tails, then you lose and they win.
And the winner gets a dollar from the loser.
Okay, and so you can imagine that this is a simulation.
So basically, it's like you're apes trading bananas.
If you give away a banana, then you're done with the banana.
So one of you will walk away with eleven, and one of you will walk away with nine.
And then you turn to someone else, you just wander around the room and randomly trade.
Okay, so what happens?
Well, the first thing that happens is this.
Okay, so we started the graph with everyone at ten.
Now, you've done a bunch of trades.
We don't know how many trades you've done, but a fair number.
And so what happens—what you see happening is some winners are starting to emerge, right, on the right-hand side.
Those are the people who've won every single trade.
Maybe they've traded ten times, so now they have twenty.
And then there's the people on the other side who've lost nine more trades than they've won.
Now, what happens when you lose nine?
What's the big problem that you have?
You only have one dollar left, so what happens if you lose another trade?
That's right, you hit zero.
And zero is not a number like any other number.
Zero is like the black hole of numbers.
You fall into the zero hole, and that's it.
You're not in the game anymore.
And poverty is like that.
It's like that.
It's a kind of a trap, and it's very, very difficult to get out of.
So, and it seems to be in part because once you fall off the charts enough, a bunch of things start conspiring against you.
So, for example, you don't have enough money to buy a large amount of decent food cheaply.
So you have to buy expensive junk food in the short term.
So that might be one possibility.
Or let's say you end up on the street.
Well, you can't even get a job then because you don't have an address.
And you can't get social security because you don't have an address.
Like, things start to conspire against you very badly.
Or maybe you're unemployed and you've been unemployed for a year and a half because it's been a prolonged downturn in the economy.
Well, if you're 17, who the hell cares?
But if you're 50, That might be that for you, right?
You've hit zero, you've been out of the market for eighteen months, nobody's going to hire you.
And so you've hit zero.
And the problem is, we don't know what the hell to do when people hit zero.
It's difficult to pry them out of zero and throw them back in the game.
Now, you could say, well, what if you just rearm them with money?
And that would actually work if the game was truly random.
But it isn't obvious that the game is truly random, and that's where things get weird.
Okay, so, anyways, by this point in the game, there's some winners piling up and there's some losers piling up.
What's the difference between the people who have nineteen and the people who have one?
Well, we know the people sitting at one dollar, they're about to be wiped out.
They've got a fifty percent chance of hitting zero on the next trade.
What about the people at nineteen?
They've got a much better chance of not hitting zero, because they'd have to lose many more times in a row.
So they're in this place where they're basically sitting pretty.
They can fail nine times and that just puts them back to where they were to begin with.
So okay, so we keep playing.
Well,
so you see what's happening is that with repeated random play, the normal distribution turns into a Pareto distribution and most of the population stacks up at zero.
Hmm.
Okay.
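Here's a minimal sketch of the trading game being described, with arbitrary sizes (a thousand players, a thousand rounds): everyone starts at ten dollars, random pairs flip a coin for a dollar, and zero is an absorbing state. Run it and the initial spike at ten smears out into a highly skewed, Pareto-like pile-up at zero, with a small number of players holding a great deal.

```python
import random
from collections import Counter

N_PLAYERS = 1_000
STARTING_MONEY = 10
N_ROUNDS = 1_000

wealth = [STARTING_MONEY] * N_PLAYERS

for _ in range(N_ROUNDS):
    # Only players with money left keep trading; zero is an absorbing state.
    active = [i for i, w in enumerate(wealth) if w > 0]
    random.shuffle(active)
    # Pair players off and flip a coin in each pair: winner takes $1 from loser.
    for a, b in zip(active[::2], active[1::2]):
        winner, loser = (a, b) if random.random() < 0.5 else (b, a)
        wealth[winner] += 1
        wealth[loser] -= 1

histogram = Counter(wealth)
for amount in sorted(histogram):
    print(f"${amount:3d}: {'#' * (histogram[amount] // 5)}")
# Most players end up at or near zero; a handful end up holding most of the money.
```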
This old guy I told you about, he had a theory of social structure that was predicated on this.
There's a bit of a Marxist twist to it.
He thought that if you ran a game like this long enough, most people would stack up at zero and a few people would have everything.
Okay, so let's say now you're down at zero.
What's your best strategy?
Well, you might say, well, we should reset the game.
Because if you're at zero, and it's basically a random game, if you wipe out that game and you put another one in place, there's some probability that when the next game starts playing, you're not going to end up at zero.
And that was his theory of revolutions, fundamentally.
Once the game had played itself out until resources were maximally distributed, it didn't cost the people at the bottom anything to be revolutionaries because they had already hit zero.
And so one of his hypotheses was that one of the things that political and economic systems have to do, that they have to figure out how to do, is to make sure that the people who end up on the zero side of the distribution don't end up with nothing to lose.
Because if you have nothing to lose, God only knows what you're going to do next.
Now, it's a big problem because merely shoveling money down there is not likely to change the outcome very much.
Yes?
Yes?
I was about to say, that's a little bit like the gambler's fallacy, though.
Like, once you reset the game, your probability of ending up at zero is exactly the same as it was the first time.
Yeah, the probability that any given person will end up at zero is the same.
But the probability for you, once you've already hit zero, isn't the same.
Because at the beginning of the game you have just as much probability of moving up to the top as you do of moving down to the bottom.
Once you're at zero, though, you don't get to play anymore.
So at zero your probability of moving up is zero, whereas at the beginning your probability of moving up is fifty percent.
Now, the gambler's fallacy is merely the idea that if you keep dumping good money after bad you'll win it back; you know, because let's say you've lost ten times in a row, you think, well, I've lost ten times in a row, it's fifty-fifty, I must have virtually a hundred percent chance of winning the next time.
It's like, well, no you don't because probability doesn't have any memory.
That's the gambler's fallacy.
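A quick way to convince yourself of that: simulate a fair coin, condition on having just seen ten losses in a row, and look at the very next flip; it still wins about half the time. The numbers here are arbitrary.

```python
import random

def next_flip_after_losing_streak(streak_len=10, n_trials=2_000_000):
    """Among moments where the last `streak_len` flips all lost,
    how often does the very next flip win? (Answer: about 50%.)"""
    streak, wins_after, opportunities = 0, 0, 0
    for _ in range(n_trials):
        win = random.random() < 0.5
        if streak >= streak_len:
            opportunities += 1
            wins_after += win
        streak = 0 if win else streak + 1
    return wins_after / opportunities

print(next_flip_after_losing_streak())  # hovers around 0.5: probability has no memory
```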
All right.
Now, one of the things we might ask ourselves is this, and it's something that I discussed with him at length, because he really thought of this as a random process.
And, you know, because he thought—here, it's very, very interesting and very complicated.
It's like, we know, for example, that the IQ distribution in this room is approximately normal.
And so we could assume that it's a consequence of random factors.
Okay, so what do we mean, random factors?
There have been random genetic things going on in your lineage since life began.
And here you are, you know, you're smarter than him, perhaps.
Why?
Well, we're going to eliminate the environmental effects for now.
Just forget about them.
We're going to assume that everybody's been raised in an environment where they had enough to eat and where they have enough resources, informational resources, so that their intellect can capitalize on them.
So none of you people have been starved for information or food.
So I would say in many ways the important variation in the environment has already been ironed out for most of you.
Not completely.
Okay, so it's random occurrences in the past, in the evolutionary history, that's put you wherever the hell you are in the IQ distribution.
So you're the beneficiary of random forces.
So he thought of the problem as being random all the way down to the bottom.
Now, the problem I have with that is, wait a second, we have evidence that some things predict where you're going to end up in the distribution.
And so what are those things?
Well, IQ, conscientiousness, emotional stability, openness, if it's a creative product, and then some other smattering of personality features depending on the particular domain.
So, for example, if you want to be a salesperson, some extraversion is useful.
If you want to be a manager, it tends to be better if you're disagreeable to some degree rather than if you're agreeable.
If you want to be a nurse or a caretaker, then agreeableness is useful.
But the big performance predictors across time are intelligence and conscientiousness.
Now, if the damn game is random, which the statistics seem to indicate, or at least you can model it using processes that model random processes, why in the world do IQ and conscientiousness (we'll just stick with those two for now), why in the world do they predict success?
So, does anybody have any ideas about that?
Let's start with IQ. I mean, for example, apply your IQ to the stock market: if I distributed a bunch of money to all of you, and I measured your IQ, and I said, okay, put together a portfolio, you get to pick thirty stocks, you can't sell them for a year, you get to sell them at the end of the year, what's the probability that the high-IQ people would pick a better set of stocks than the low-IQ people?
The answer to that, as far as I can tell, is zero.
It's not—it's not gonna—because you can't pick stocks, as it turns out.
All the information is already eaten up.
And the stock market's an interesting analogy of the environment, right?
You know what I mean?
Because the stock market is basically an index of the continually transforming human, economic, and social environment.
It's basically random.
You can't predict the damn thing even if you're smart.
So why is it that smart people win across time?
Then again, why is it that hardworking people win across time?
If we imagine each peg on the Galton board as a choice point in life, then I think the people with a higher IQ are more likely to choose the choice point that will leave them...
No, but that's only a restatement of what I just said.
It's not a causal account.
You're basically saying that the high IQ people make better choices.
Yeah, but that's—if it's a random environment, how can they?
I'm thinking, instead of money, we can view it as opportunities and genetics which make somebody adaptable in multiple environments.
And each coin flip as each subsequent generation of that lineage.
So that there is an increase, like the more good adaptable genetics, emotional stability and situational wealth you have, the more likely your next generation has, you know, like they have to lose a lot more to get to the zero.
Okay, so what about a given individual, though?
Forget about it playing across time.
And that's also a weird thing, eh?
Because some of the single-celled organisms that were your ancestors three billion years ago are still single-celled organisms.
Whereas here you are.
You know, same environment, roughly speaking.
So there's this tremendous branching of possibility across time.
And you might say, well, IQ is adaptive, which is a terrible word, adaptive.
It's like, yeah, okay, so how do you account for all the single-celled organisms?
They're just not that bright, but there they are.
There's more of them by weight than there are of us people by weight.
So I puzzled this out a bunch of different ways, and you can think about this and see what you think.
I mean, the first thing that seems to me to be the case is that you don't have to play one game.
You know, like, imagine that you're sitting around with your friends, and there are 50 board games going on, and you can keep a hand in each of them.
Well then, at some point in one of the board games, you're going to start to amass some success, right?
Just by chance.
If you're playing 50 board games, in 25 of them, after playing for some time, you're going to be at the top of the—you know, you're going to be moving towards the top 50% rather than the bottom.
We might say, okay, play 50 games.
Play twenty turns, throw away the ones that you're doing the worst in.
The five percent that you're doing the worst in.
And then just keep doing that until you end up with the one game that you win.
And maybe you can do that better if you have high IQ because you're faster.
So maybe all that happens, if you're smart, is that you can play many games faster.
And as a consequence of that, you can choose the ones that you seem to be winning and stick with those.
And then maybe what happens with conscientiousness is that you actually stick to them.
So that means that you actually don't have to predict the future in order to master it.
It means that you have to simulate multiple futures, keep your options open, and sort of play dynamically as the environment unfolds.
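A toy version of that idea, with made-up parameters: every game below is pure coin-flipping, but the player who samples fifty games and then commits to the one they happen to be ahead in ends up systematically better off than the player locked into a single game, with no skill involved anywhere.

```python
import random

def play(rounds):
    """Score from one purely random game: +1 or -1 per round."""
    return sum(random.choice((-1, 1)) for _ in range(rounds))

def single_game_player():
    # Locked into one game for 100 rounds.
    return play(100)

def many_games_player(n_games=50, preview=20, remainder=80):
    # Sample many games briefly, then commit to the one you happen to be winning.
    previews = [play(preview) for _ in range(n_games)]
    best_start = max(previews)
    return best_start + play(remainder)

n_players = 2_000
single = sum(single_game_player() for _ in range(n_players)) / n_players
picky = sum(many_games_player() for _ in range(n_players)) / n_players
print(f"locked into one game: {single:+.1f}   sampling many games: {picky:+.1f}")
# Even with no skill at all, the option to abandon losing games
# gives the many-games player a systematically better outcome.
```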
It might be more complicated than that because it's also possible that in some environments, at least for local periods of time, the game is actually not random.
You know, so it's funny.
Like, do you think—is Monopoly random?
What do you think?
Everybody here has played Monopoly, I presume, right?
Is there anybody who hasn't?
Okay, so what do you think?
Do the smart people win Monopoly more often?
I think it's the people that play ahead.
So you would say it's the conscientious people that win more often.
Yeah, well it's pretty clear that you can lose stupidly in Monopoly, right?
Although I'm not sure if you can win intelligently.
You know, you can avoid obviously bad choices, like one choice would be don't buy any property and just hold on to your money.
That seems to be a losing strategy, right?
So that'll wipe you out.
So maybe another thing that smart people do is that they don't do things that self-evidently make it certain that they'll lose.
It might be something like that.
So, alright, so one of the things we might ask ourselves too is, given that IQ and conscientiousness do predict success, how much success do they predict across time?
And the answer to that is, they predict a substantial amount.
If you combine IQ and personality, I'll show you the equations.
These were formulated in the 1990s.
So we're going to look at this here.
Let me walk you through it a bit.
So this is a spreadsheet I put together quite a long time ago.
Get this a little bigger so you can see it.
So here's the elements of the equation.
So you need to know—first element in the equation.
How many people—so we're going to say, well, how powerfully can you predict the performance outcome of people across time?
And the value we're going to use is dollar value.
And dollars are basically—they're sort of like a—obviously they're the universal currency because you can trade money in for virtually anything.
So we use dollars as our standard of value.
The first question might be, well, how many people do you want to predict for?
So let's say we're going to take ten, so we put that in this part of the equation here.
Ten people.
Then you might say, well, over what period of time do you want to calculate their success?
And one decent answer is five years, because people seem to stay in the same job position or career position for about five years, although that's shortening.
Okay, so that's another thing that you guys should keep in mind.
You're probably not going to be in the same position in your life for more than about five years at a time.
And so you have to plan and plot for dynamic transformations in your career, and you have to do that in a way that works to your advantage, which partly means you have to keep options open and you have to be able to say no.
Okay, the next thing you might ask yourself is, well, how high a predictor?
What's the R for your predictor?
What's the correlation coefficient?
And we put together a neuropsychological battery.
I published the results of this with someone named Daniel Higgins.
Quite a long time ago.
We used a neuropsychological battery looking at dorsolateral prefrontal ability, which we thought of as something potentially separate from fluid intelligence, but which probably just turned out to be quite a good measure of fluid intelligence.
And we also used conscientiousness, and we validated that in a factory, where we looked at the performance of the administrators and managers, because they had relatively complex jobs.
And we had access to seven years of their performance records, which someone else had gathered independently of us.
And what we found was that for those people who had performance records that were more than four years old, we could predict their performance at about 0.59.
About 0.6.
Tremendously powerful prediction.
Now, one of the things we found was that the degree to which our predictions were accurate increased with the increase in the number of years of performance data we had.
So, for example, say you enter a new job, and I want to predict how you'll do next year,
and I get a performance index after your first year.
I'm not going to be able to predict that very well from your intelligence and your conscientiousness, because the measure of your performance actually won't be very good.
Because it turns out that you can't even figure out how well someone is performing in a job until about three years in.
So it turns out that if you take on a complex job, you get better and better at it over three years, if you're going to get better and better at it at all.
And so we can't tell how well you're doing until about that period of time.
So that's another thing that's useful to know, by the way, when you guys go off to find your next complex role in life.
You can't really expect to be good at what you're going to do until about three years.
After about four years, additional experience doesn't seem to matter.
So that's also worth knowing.
So you're going to feel like a bloody moron for the first bit of your new job, and that's because you are.
But if you keep at it, you'll accrue experience and expertise quite rapidly over a three-year period.
And that's probably the right amount of time over which to evaluate your performance.
Because you need to know that, right?
Should you be freaking out if after six months you're not doing a good job?
Well, you've got to see how you're doing compared to your peers.
Hopefully you're not doing a worse job than them, but if it's a complex job, it's going to take you a long time to master it.
So, okay, so anyways, our predictor was 0.59, which we were pretty damn thrilled about.
And we're comparing that, we're going to compare that to a comparison predictor of zero, for now.
Zero is random.
So if I was going to predict your trajectory through life, let's say I want to predict your industrial productivity, I'll take ten of you, I'm going to predict your industrial productivity over the next five years.
That'll be the goal.
Or we can do this, we'll do this a slightly different way.
Let's see.
So that's the selection ratio.
The last one is the selection ratio.
So let's say I wanted to select you guys for a position, well, let's say you're going to be managers in a relatively complex corporate organization.
There's about a hundred and ten of you or something like that in here now, maybe.
Let's say a hundred.
Say I'm going to pick ten of you.
That means my selection ratio is right there.
It's one in ten.
You can transform that into a standard score of 1.76.
Now, it's important to know how many people you get to choose from because if you're going to hire one person and you only have one applicant, there's not a lot of sense doing any selection.
Because you only have one applicant, so I don't care how much you know about that person, it's not going to be helpful if you have to hire them.
But in this situation, we're going to assume that I can screen all 100 of you and only recommend ten.
Okay?
Then we're going to put in a variable for job complexity, because the relationship between intelligence and conscientiousness and predictive power increases as the job becomes more complicated, which is exactly what you'd expect, right?
So if you can do a job by rote, all that IQ and conscientiousness will predict is how fast you can learn it.
But if it's a dynamic job where you have to make decisions on a non-stop basis, then IQ and conscientiousness are going to predict your performance over the long run.
So we're going to assume that you're in a complex job.
We're going to assume that you have an annual salary of $75,000.
So, then the question is, armed with that knowledge, How much money would the company that's using my products obtain by using the selection compared to if they just picked ten of you randomly?
And the answer is right there.
In five years, they would have made four million dollars more than they would have if they would have just picked randomly, and that would be a productivity increase of 104%.
So, that's graphed here.
So, with random selection, 50% of the people you hire are going to be above average and 50% are going to be below.
And using a good psychometric battery, you can get that to about 80-20.
And the consequence of that is four million dollars in increased productivity.
So now we're going to turn that into one person.
So over five years, if you use selection processes properly, you make $400,000 more by hiring one better person.
Right, $400,000 over five years, what is that?
It's $80,000 a year.
It's actually more than you're paying them in salary.
And now the reason for that—there's a bunch of reasons.
The first reason is that there are tremendous differences in individual productivity.
Now one of the things that's weird about these formulas is this formula is predicated on the idea that productivity is actually normally distributed.
But actually productivity is Pareto distributed.
So what that means is that the basic consequence of using this sort of prediction is probably higher than this formula indicates.
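For what it's worth, here is a sketch of the kind of utility calculation the spreadsheet appears to implement, in the spirit of the Brogden-Cronbach-Gleser model: number hired, times years, times validity, times the dollar value of one standard deviation of performance, times the average standardized score of the selected group. The dollar SD of performance isn't stated in the lecture, so the figure below is an assumption, and the output won't exactly reproduce the four-million-dollar number.

```python
from statistics import NormalDist

def selection_utility(n_hired, years, validity, sd_performance_dollars, selection_ratio):
    """Brogden-Cronbach-Gleser-style utility: expected dollar gain over random hiring."""
    # Mean standardized predictor score of the selected group:
    # ordinate of the normal curve at the cutoff, divided by the selection ratio.
    cutoff = NormalDist().inv_cdf(1 - selection_ratio)
    mean_z_selected = NormalDist().pdf(cutoff) / selection_ratio   # ~1.76 for a 1-in-10 ratio
    return n_hired * years * validity * mean_z_selected * sd_performance_dollars

gain = selection_utility(
    n_hired=10,
    years=5,
    validity=0.59,                    # the predictor reported in the lecture
    sd_performance_dollars=40_000,    # assumed: dollar value of 1 SD of performance
    selection_ratio=0.10,             # pick 10 out of 100 applicants
)
print(f"expected gain over random selection: ${gain:,.0f}")
```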
Now there's a variety of reasons to know this.
One is obviously the practical idea.
The practical idea is, of what use is it knowing something about the people that you're going to hire, or work with?
And the answer to that is it depends on what you know.
But if the things that you know are valid, and they include personality and intelligence, it's not only valuable to know it, it's virtually vital.
Because the success of your enterprise will depend on the people that you hire and associate with.
The second thing is, I think, that knowing this should change the way that you look at the world.
If you understand that the outcomes in life are distributed on a Pareto distribution, and that there are inbuilt temperamental factors that play an important role in determining that, one of the problems that you have before you as modern people is, what the hell do you do about that from a conceptual, social, political, and economic perspective?
Because nobody who's currently considering how societies are structured pays any attention to this sort of thing.
They don't assume that there are differences between people with regards to their life outcome chances, not of this sort of magnitude.
And so none of the social policies that we have in place reflect any of this.
And so I would say, like our psychometrics are 21st century and our political and economic theories are basically 17th century.
And it's not a good thing.
And you guys are going to suffer for this or benefit from it because as society becomes more technological and as it transforms more and more rapidly, the degree to which the Pareto distribution is going to kick in is going to increase.
Now you already see that.
That's why the separation between rich and poor in industrialized countries is becoming increasingly severe.
Now, that's modified to some degree by the fact that the middle class worldwide is growing like mad, and so that's a really good thing.
But still, the long-term play is the Pareto distribution, and it's partly because an intelligent person is one thing, but an intelligent person with infinite computational power, that's a completely different thing.
And that's where we're headed.
So, I don't know exactly what to make of all this sort of thing politically, but I know that to the degree that we ignore it, we're going to have very unhappy and unstable societies.