This conversation has so much more potential; it feels like some finding or understanding is still missing. Is there any more recent discussion?
Thank you for sharing this dialogue. It's a wonderful illustration of how constructive disagreement remains possible and valuable, especially when individuals are open to understanding both sides of an argument. Regarding implicit bias, what particularly troubles me—beyond the academic debate about what insights certain tests offer or what implicit bias truly signifies—is the unwavering confidence with which corporations wield tools like the IAT. It seems as though they believe administering such tests is akin to undergoing multiple sessions on a psychiatrist's couch. They then use these results to recommend (or mandate) specific diversity trainings for employees, purportedly to recognize and address their guilt, racism, misogyny, and so forth.
Progressives are very clever: they want to talk about implicit bias because then there is less time to discuss IQ.
I found this article very interesting, but when it really comes down to it, I couldn’t help thinking that a vast amount of time and resources has gone into trying to prove that disparities between groups are due to racism.
When explicit racism had significantly declined over the decades, the reflex was to pivot to trying to prove that racism still exists somewhere in our subconscious which must be the cause of the disparities.
Maybe the time has come to give up on this theory and spend more time and resources looking elsewhere? Things have only gotten worse with regard to race relations and the disparity problem is no closer to being solved in the least.
Have you heard about the Hungry Judge Effect?
https://en.wikipedia.org/wiki/Hungry_judge_effect
If the scheduling hypothesis is false, and it really is hunger that is causing the difference, then this would say to me that there is no point in trying to measure implicit bias -- the signal is so noisy you can never be sure you have found anything.
What a refreshing example of how two professionals with differing viewpoints should interact with each other.
I would think of ham and cheese as a bias.
I'm unclear why the issue doesn't boil down -- ultimately -- to a question of predictive validity. The theory of implicit bias is that an individual may claim not to be biased toward members of a particular group, but may behave differently toward (or at least in some way respond differently to) members of that group than members of a different group in ways that are predicted by scores on the IAT. For example, if people stand further from an African American male than a white male on an elevator, while still claiming, and believing, that they hold no negative attitudes about African American males, and if that behavior is predicted by IAT scores, that would seem to validate the concept. No? At the same time, in the absence of clear predictive power, why should anyone care about how people score on the IAT at all?
To take your example further, is said African American man dressed like Barack Obama in a sharp suit, or like Fiddy Cent (circa 2005), as if he just stepped off a rap video set? Are you reacting to the clothing choices or the person? If it was a white guy dressed like a rap video extra, would you also give him extra space? Are people reacting to other signals, like class as mediated by clothing choices, or to skin colour, and how the heck do you parse out the difference?
Even predictive validity is trickier than it seems. Let's say Fred conducts an IAT study and shows that it predicts racial discrimination to the tune of a correlation of r=.20, which is in the ballpark of what the meta-analyses find. There are at least three ways this can occur:
1. People with high positive IAT scores ("more implicit bias") discriminated against Black people; people with scores near 0 were egalitarian; and people with negative IAT scores discriminated against White people.
2. People with scores of near zero up to high positive scores discriminated against Black people and the negative scores were egalitarian.
3. People with scores near 0 discriminated against White people and high positive scores (presumably high implicit bias) were egalitarian.
All 3 will produce predictive validity. Precious little of the research using the IAT has even attempted to map this out. The two papers that did found results mostly consistent with point 3.
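To make the three cases concrete, here is a minimal Python sketch on simulated data. Everything in it is an assumption chosen for illustration (the slope, the noise, the sample size, the three intercepts, and the sign convention for the outcome); none of it comes from a real IAT study. It keeps the slope and the noise fixed and changes only the intercept, which moves the IAT score that lines up with egalitarian behavior.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation computed from first principles."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
N = 2000

# Hypothetical IAT D scores and behavioral noise; this is not real data.
iat = [rng.uniform(-0.5, 1.0) for _ in range(N)]
noise = [rng.gauss(0, 1) for _ in range(N)]
SLOPE = 0.5  # assumed slope, identical in all three scenarios

# Outcome convention (an assumption): positive = discrimination against
# Black people, negative = against White people, 0 = egalitarian.
# Only the intercept differs between scenarios.
scenarios = {
    "1 (egalitarian at IAT = 0)": 0.0,
    "2 (egalitarian at IAT = -0.5)": 0.25,
    "3 (egalitarian at IAT = 1.0)": -0.5,
}

results = {}
for label, intercept in scenarios.items():
    outcome = [intercept + SLOPE * x + e for x, e in zip(iat, noise)]
    results[label] = pearson_r(iat, outcome)
    print(f"scenario {label}: r = {results[label]:.3f}")
```

Because a correlation does not change when a constant is added to the outcome, all three scenarios print the same r, even though they put the egalitarian point at very different IAT scores. The correlation alone cannot tell the three cases apart.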
Got it :)
Hi Lee, I’ll admit I’m not the best statistician, but it seems like you’re pointing out that a slope of IAT scores on behavior could simply have different intercepts. To take your examples: case 1 would have a Y-intercept around 0, case 2 would have a Y-intercept quite high, and case 3 would have a Y-intercept quite low.
Could this whole question not be resolved simply by looking at the regression tables produced in IAT research to identify whether the Y-intercept is sufficiently low, high, or moderate to produce any of your cases? Perhaps that’s already been done and I’m not aware of it.
(As a side note, I wish Substack let us include images in comments because this would be way easier to explain if I could draw the lines to demonstrate what I mean)
Heh. Yes, absolutely true as written. The core problem is that I have never seen an empirical IAT article that actually reports the regression equation. So one cannot actually point to different intercepts in the empirical lit, because the empirical lit (almost?) never reports the intercepts! If you find one that did, would you ping it here? I mean, there are so many, I have to stop short of saying "It's never been done" because I can't say I know every one of the thousands of articles on it. But, at best, it is almost never done.
OTOH, one can use the regression equation, IF it were to be reported, to do more than just say "Look, different intercepts." One can use the regression equation to identify the precise IAT score corresponding to egalitarianism in that particular study. This is an excerpt from our review chapter invited for Nelson's Handbook of Stereotypes, Prejudice, and Discrimination:
"If the IAT has no stable score reflecting egalitarianism, one cannot be in the business of declaring what proportion of people show racial preferences and whether any particular IAT score constitutes a “weak” or a “strong” implicit racial preference (as was once common, see also the reviews by the advocates cited above). And if egalitarianism usually corresponds to scores well above zero, but conclusions of “preference” are based on proportion of scores above 0 or even some low cutoff (such as D = .1), these will be overestimates, perhaps extreme ones. Claims about how many people show “implicit preferences” for one group or another – which typically refer to how many claims are above some arbitrary cutoff – need not be taken seriously and are almost certainly overestimates.
"A related major failure of this line of research is that it is not normative for researchers to report the benchmarked (against their other variables) point of egalitarianism for their IAT scores. This is not hard to do. The simple bivariate regression prediction equation is:
Y = C + b(X).
Y is the outcome, C is the regression constant, X is the predictor, and b is the coefficient relating X to Y.
"When Y is some outcome (let’s say, discrimination), and X is the IAT, the equation becomes:
Discrimination = C + b(IAT).
"Anyone can now solve for the IAT score that corresponds to zero discrimination:
0 = C + b(IAT), so
0 - C = b(IAT), so
- C = b(IAT), so
-C/b = IAT
"-C/b is the IAT score that corresponds to egalitarian behavior (i.e., zero discrimination). Anyone with an elementary understanding of regression can do this. At best, it is not common to do so, though it should be. At worst, and as far as we know, no one conducting original research using the IAT has ever reported this. If they did, the field could get much clearer information regarding what IAT scores correspond to egalitarian judgments and behavior. Thus, one recommendation for future research emerging from this review is that researchers start routinely reporting this."
P.S. While here... The review chapter is titled:
"Limitations, Contestations, Failures, and Falsification of Dramatic Claims in Intergroup Relations"
The chapter:
1. Reviews Merton's Norms of Science, with an emphasis on Organized Skepticism
2. Identifies common known threats to the validity of the published lit (replication crisis, allegiance bias, persuasive communication devices, propaganda scholarship, social pressure/censorship)
3. Develops a scientifically principled approach to judging when common claims in a literature should be taken as credible
4. Applies those principles to common claims in work on implicit bias, microaggressions, gender-based job discrimination and racial job discrimination and concludes that common claims in only the last of these are well-justified (although even then, they do not mean what many people seem to think they mean -- but that, Dr. Significance, is a blog for another day).
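The -C/b calculation from the excerpt above can be sketched in a few lines of Python. This is only an illustration: the helper names `fit_line` and `egalitarian_iat` are hypothetical, and the numbers are made up to lie exactly on a line, not taken from any study.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = c + b * x; returns (c, b)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    c = my - b * mx
    return c, b


def egalitarian_iat(c, b):
    """IAT score at which predicted discrimination is zero, i.e. -C/b."""
    if b == 0:
        raise ValueError("slope is zero; no IAT score predicts zero discrimination")
    return -c / b


# Made-up illustrative numbers (assumption), generated from
# discrimination = -0.25 + 0.5 * IAT with no noise.
iat_scores = [-0.2, 0.0, 0.2, 0.4, 0.6, 0.8]
discrimination = [-0.35, -0.25, -0.15, -0.05, 0.05, 0.15]

c, b = fit_line(iat_scores, discrimination)
zero_point = egalitarian_iat(c, b)
print(f"C = {c:.3f}, b = {b:.3f}, egalitarian IAT = {zero_point:.3f}")
```

In this made-up data the fitted constant is -0.25 and the slope is 0.5, so the egalitarian point lands at an IAT score of 0.5 rather than 0. With real data the estimate would carry sampling error, so a reported zero point would presumably need an interval around it, and it should be read with caution if it falls outside the range of observed scores.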
Interesting! Glad to know I’m on the right track. I don’t read much IAT research so I wouldn’t know if this is standard practice. In my own field of marketing, it’s pretty common (though not universal) to present a regression table including the intercept, so I thought this was way more resolvable than it actually is.
I hope someone does some sort of meta-analysis / systematic review on this in the future, because it’s an important interpretive question.
Right. And this would be a controversy INCREDIBLY FUCKING EASY TO RESOLVE if only people would report basic stats. It is so easy to resolve that it is hard not to see it as motivated blindness. Which it may be, but I think it was Heinlein who said something like, "Never attribute to malice that which is equally explained by incompetence."
This is a helpful description of the problem with interpreting the results of the test, Dr. Jussim.
Seems to me that one issue is whether reaction time is a valid measure of implicit bias (and if so, not a very reliable one), or was it unconscious bias? But before that can be established, how valid is that concept (or those concepts)?