Racial Disparities in School Discipline

Volume 3 Number 3
June 2001

There are among us persons of so refined and delicate a nature that they cannot bear the guilt even of crimes they have not committed. Their shame is so great that they turn their considerable talents to serve the demagogues of bias. In this essay we analyze their efforts to document racial discrimination in school discipline, and humbly offer advice on how to improve their methods.

The Bias Hunters
In September 1999, six black high school students were expelled for brawling violently with other students at a Decatur, Ill. football game. Grabbing the spotlight, the Reverend Jesse Jackson made a house call. His presence in Decatur propelled a local event to the nation's front pages. Sleepy sociologists awoke to the call. One team of investigators from the Applied Research Center of Oakland interrupted a study to rush forward with preliminary results. They found that black students are suspended and expelled at disproportionate rates. "The figures are astounding," declared Rev. Jackson as he, the Oakland group and the press joined in mutual exploitation of the moment.

A year later, in the December 2000 issue of La Griffe du Lion, we asked:

Suppose a half-white, half-black school system suspends 1 percent of its students for disruptive behavior. What is the most probable racial makeup of the suspended group?

   The question began the section, "Games with Fuzzy Variables," but December's game is June's analytical scalpel. With it we will peel back layers of misinformation to reveal race and school discipline in a greater social context.

   The year 1975 witnessed the first widely disseminated report on racial disparities in school discipline. Issued by the Children's Defense Fund, the report found that black students at all levels were suspended at 2 to 3 times the rate of whites. The bias hunters had arrived. Hundreds of studies and millions of dollars later, they would learn little more.

   Powerful statistical techniques - analysis of variance, χ², regression, correlation analysis and more - have been brought to bear on school discipline and race. Each new study adds to the annals of redundant findings. The essential result is now well established: Referrals, suspensions and expulsions are distributed asymmetrically with respect to race and ethnicity. So what new can La Griffe bring to an area so thoroughly picked over? Surprisingly, quite a bit.

   The groundwork for our analysis was laid in December in the essay, "Aggressiveness, Criminality and Sex Drive by Race, Gender and Ethnicity." There we developed the method of thresholds, a procedure that takes rates at which different groups cross behavioral thresholds (like committing assault), and from them constructs yardsticks for "fuzzy" variables that are otherwise hard to define (like aggressiveness). The method lends itself well to the analysis of race and school discipline.

   In December, using assault as a threshold for aggressiveness, we found a black-white aggressiveness gap of 0.37 standard deviations (SD), blacks being the more aggressive. Data came from two sources. From INTERPOL we obtained rates of "serious assault" in 64 African and European countries. From the Justice Department's National Crime Victimization Survey (NCVS) we obtained US assault data. Each source independently produced the same 0.37 SD aggressiveness gap. Even allowing for some degree of fortuity, the black-white aggressiveness gap was found to be invariant to cultural shifts across continents, compelling evidence that aggressiveness distributions differ among races and are intrinsic to them.

   The threshold for "serious assault" falls way out on the aggressiveness scale, 2.86 SD from the black mean. The NCVS, by contrast, counts assaults at every level, from simple assault on up, so its threshold is much lower: 1.64 SD from the black mean. That aside, both data sets yielded the same black-white gap. From the standpoint of the method of thresholds, expulsion and suspension from school are simply new thresholds to be placed on the aggressiveness axis.

Discipline and Group Differences
   Bias hunters who look for racial disparities usually find them. Their papers, though sullied by subjectivity, are valuable for the data they include. Such is the case of a much-referenced paper that appeared a few years ago. The paper, "Office referrals and suspension: Disciplinary intervention in middle schools," Skiba, R. J., Peterson, R. L., and Williams, T., Education and Treatment of Children, 20, 295-315, (1997), examined disciplinary records in a large Midwestern urban school district for the 1994-95 school year. The large-scale study included all 11,001 middle-school students in the district, 98 percent of whom were either black (56%) or white (42%). By analysis of variance Skiba et al. convincingly demonstrated that African American students were suspended and expelled at rates disproportionate to their population in the schools. More important, the paper included raw data uncompromised by expectation or prejudice.

   Skiba et al. found suspension rates of 17.1 percent for whites and 27.0 percent for blacks. We applied the method of thresholds (see Appendix) to these rates. It revealed a black-white aggressiveness gap of 0.34 SD, a value differing insignificantly from the assault data value of 0.37 SD. On the aggressiveness axis, suspension fell at 0.61 SD from the black mean, much lower than assault (1.64 SD), and lower still than "serious assault" (2.86 SD).

   Figure 1 shows distributions of aggressiveness for blacks and whites. Drawn to scale, the curves are separated by a mean difference, Δ, of 0.37 SD. The thresholds for suspension (Skiba data), assault and "serious assault" are marked on the aggressiveness axis.

Figure 1. Aggressiveness distributions. The curves for blacks and whites are separated by a mean difference, Δ = 0.37 SD. Thresholds for suspension, assault and "serious assault" are designated λ₁, λ₂ and λ₃, respectively.

In general, schools (or school systems) have different tolerances for disagreeable behavior. Irrespective of where the boundaries of acceptable conduct are set, in the absence of bias, discipline rates should be consistent with the 0.37 SD aggressiveness gap. In Griffian terms, standards of tolerance correspond to points on the aggressiveness axis, beyond which punishment is triggered. By constraining the proportions of suspended (or expelled) students to be consistent with the 0.37 SD aggressiveness gap, we can predict how discipline rates should break down by race in the absence of racial discrimination.

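In brief, if N_B and N_W are the numbers of black and white students enrolled in a particular school system, and N_S is the number of students suspended (or expelled), then we may write:

$$N_B \int_{\lambda}^{\infty} P(x)\,dx + N_W \int_{\lambda}^{\infty} P(x+\Delta)\,dx = N_S \qquad (1)$$
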
where P is the normalized aggressiveness distribution for blacks, Δ is the black-white mean difference (0.37 SD) and λ is the threshold of aggressiveness beyond which suspension (or expulsion) kicks in. (The quantity λ is school or system dependent.)

Equation 1 describes how discipline would break down by race if students received punishment in rank order of aggressiveness. The first term on the left side of (1) represents the number of blacks who would be disciplined, the second term the number of whites. Solution of (1) yields the threshold, λ, from which the individual terms may be evaluated.

In the Skiba study, N_B = 6161 and N_W = 4620, with N_S = 2461 for suspension and N_S = 42 for expulsion. Solution of (1) yielded the results shown in Table 1.

	Black (n = 6161)		White (n = 4620)
	actual	predicted	actual	predicted	threshold (λ)
suspended	1696	1692	765	769	0.60 SD
expelled	35	34	7	8	2.54
Table 1. Predictions based on group differences in aggressiveness.
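
The calculation behind Table 1 can be sketched in a few lines of Python. This is only an illustration: it assumes P is a standard Gaussian (as the Appendix does), takes Δ = 0.37 SD, and solves Equation 1 for λ by simple bisection. The names `tail`, `solve_threshold` and `predict` are ours and do not appear in the original.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: P is a standard Gaussian


def tail(z):
    """Fraction of a standard normal population lying beyond z."""
    return 1.0 - STD_NORMAL.cdf(z)


def solve_threshold(n_b, n_w, n_s, delta=0.37, lo=-10.0, hi=10.0):
    """Solve n_b*tail(lam) + n_w*tail(lam + delta) = n_s for lam by bisection."""
    def excess(lam):
        return n_b * tail(lam) + n_w * tail(lam + delta) - n_s

    for _ in range(200):
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid  # too many students beyond the threshold: raise it
        else:
            hi = mid  # too few: lower it
    return (lo + hi) / 2.0


def predict(n_b, n_w, n_s, delta=0.37):
    """Return (threshold, predicted black count, predicted white count)."""
    lam = solve_threshold(n_b, n_w, n_s, delta)
    return lam, n_b * tail(lam), n_w * tail(lam + delta)


for label, n_s in (("suspended", 2461), ("expelled", 42)):
    lam, black, white = predict(6161, 4620, n_s)
    print(f"{label}: lambda = {lam:.2f} SD, black = {black:.1f}, white = {white:.1f}")
```

Under these assumptions the printed thresholds and counts should land close to the predicted columns of Table 1; small differences in the last digit reflect rounding.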

   Close agreement between prediction and observation indicates that punishment was dispensed without prejudice. For diehard activists who require more proof, we develop additional criteria below.

A Test for Bias
   When school discipline is administered in an evenhanded manner without regard to race, blacks will be punished out of proportion to their numbers. That is the expected norm. Disproportionate discipline therefore cannot serve as a criterion for racial bias. On the other hand, we cannot dismiss the possibility that bias (racial or otherwise) can creep into the discipline process. In fact, given the number of schools in the US and what we know of human nature, it would be a statistical miracle if it did not. In this regard, a test for bias would be useful.

   An existing test, known as the "ten percent rule," has been in use for years. It is employed by bias hunters to judge whether a racial group has been the victim of discrimination. The rule states that if a group's representation among the ranks of the punished exceeds its representation in the school population by more than 10 percent, the group has been unfairly punished.

   In the Skiba study African Americans made up 69 percent of those suspended and 56 percent of those enrolled. Because 69 exceeds 56 by 23 percent (comfortably more than 10 percent), the ten percent rule found blacks to be casualties of discrimination.
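
The arithmetic of the ten percent rule can be checked directly. The sketch below reads "exceeds by more than 10 percent" as a relative excess, which is the reading the 23 percent figure implies; the function name `relative_excess` is hypothetical.

```python
def relative_excess(share_punished, share_enrolled):
    """Relative amount by which a group's share of the punished
    exceeds its share of enrollment."""
    return share_punished / share_enrolled - 1.0


excess = relative_excess(0.69, 0.56)
flagged = excess > 0.10  # the ten percent rule's criterion
print(f"relative excess = {excess:.0%}, flagged = {flagged}")
```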

   A rule like the ten percent rule fails for two reasons: 1) it is arbitrary, and 2) it rests on the ill-founded notion that all peoples are alike. We propose a test for bias that is neither arbitrary nor based on false assumptions. Like the ten percent rule, it looks at how the numbers of disciplined students break down by race, but the similarity ends there. The test can sniff out bias wherever it hides. And if it finds it, you can bet your last dollar it is there.

   We have seen how to predict the most probable numbers of students, black and white, who in the absence of bias will be suspended or expelled. The actual numbers will vary because of statistical fluctuations.

   Imagine a large (infinite) ensemble of school districts, each a replica of the Skiba district. Each has the same numbers of black and white students and applies the same criteria for meting out discipline. Within the ensemble students may be exchanged between districts, but always a black for a black or a white for a white.

   Because intra-ensemble exchange is allowed, the districts of the ensemble will have different numbers of suspended and expelled students. If we sample the districts in the ensemble, the central limit theorem assures us that we will find the numbers of suspended and expelled students distributed normally about their ensemble averages. We postulate that the ensemble averages are those predicted by (1). Thus, if n* is the predicted number of, say, suspended blacks, its standard error, σ, is given by

$$\sigma = \sqrt{n^{*}\left(1 - \frac{n^{*}}{N_S}\right)} \qquad (2)$$

where again N_S is the total number of suspended students.

Absent bias, there is only a 5 percent chance that the observed number of suspended or expelled students from a group will fall outside 2 standard errors of n*. Thus, a test for bias may be constructed as follows:

When the number of suspended or expelled students from a given (racial) group differs from that predicted by Equation 1 by two or more standard errors (computed from Equation 2), there is reason to suspect extrinsic factors including (racial) bias as a cause.

In the Skiba example, n* = 1692, and the total number of suspended students, N_S = 2461, giving a standard error of 23. The difference between the predicted and actual numbers of suspended blacks is 0.17 σ. For expulsion, the difference is 0.48 σ. Plainly, bias was not a factor.
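
The test can be sketched numerically. The code below assumes the standard error takes the binomial form sqrt(n* (1 - n*/N_S)), an assumption that reproduces the quoted value of 23; the function names are ours.

```python
import math


def standard_error(n_pred, n_total):
    """Standard error of a group's predicted count, assuming the
    binomial form sqrt(n* (1 - n*/N_S))."""
    return math.sqrt(n_pred * (1.0 - n_pred / n_total))


def deviation_in_se(n_actual, n_pred, n_total):
    """Distance between observed and predicted counts, in standard errors."""
    return abs(n_actual - n_pred) / standard_error(n_pred, n_total)


se = standard_error(1692, 2461)          # suspended blacks, Skiba data
z = deviation_in_se(1696, 1692, 2461)
suspect = z >= 2.0                       # the two-standard-error criterion
print(f"standard error = {se:.1f}, deviation = {z:.2f} sigma, suspect = {suspect}")
```

The same two functions apply to expulsions by substituting the expulsion counts from Table 1, though the rounded table entries will not reproduce the quoted 0.48 σ exactly.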

Closing the Behavior Gap
What human attribute is most responsible for the advancement of civilization? What capability lets us build an internal combustion engine, or produce a Chateau Margaux 1982? If forced to choose but one, we would name the ability to solve problems. Unacceptable group behavior is just another problem to be solved.

   Recently, a remarkable paper appeared in the Journal of the American Medical Association that bears upon this issue. The article, "Long-term Effects of an Early Childhood Intervention on Educational Achievement and Juvenile Arrest," A. J. Reynolds, J. A. Temple, D. L. Robertson and E. A. Mann, JAMA, 285, 2339-2346 (2001), is a 15-year follow-up of urban, mostly black, low-income children who matriculated through an early intervention program in the Chicago public schools. The authors describe impressive long-term gains made by the children. We put their data through the Griffian engine and thus obtained an unexpected and extraordinary result.

   The intervention strategy described in the paper is that of the Chicago Child-Parent Center Program (CCPCP). According to Reynolds et al., it incorporates "comprehensive education, family, and health services and includes half-day preschool at ages 3 to 4 years, half- or full-day kindergarten, and school-age services in linked elementary schools at ages 6 to 9 years."

   Outcomes for the 989 children who went through the program were compared with those of a 550-child matched cohort. The comparison group participated in alternative programs such as kindergarten with or without additional services, but without early preschool. Outcomes for both groups are summarized in Table 2.

	CCPCP group outcomes	comparison group outcomes
HS completion	49.7%	38.5%
Juvenile arrests	16.9	25.1
Violent arrests	9.0	15.3
School dropout	46.7	55.0
Grade retention	21.9	32.3
Special Ed.	13.5	20.7
Table 2. Outcomes of Early Childhood Intervention.

Impressive achievements aside, there is more to these data than is immediately apparent. All six outcomes listed are for binary achievements. That is, a student either completes high school or doesn't, gets left back or doesn't. The two groups have different rates of success in each category. If we regard each item as a threshold for a fuzzy variable, the method of thresholds will reveal the fuzzy variable gap between the two groups in each category. Table 3 displays the calculated gaps. The result was unanticipated.

HS completion	0.28 SD
Juvenile arrests	0.29
Violent arrests	0.32
School dropout	0.21
Grade retention	0.32
Special Ed.	0.29
Table 3. Gaps between CCPCP children and comparison group.
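
The gaps in Table 3 can be recomputed from the rates in Table 2. This sketch assumes each underlying variable is Gaussian with the same spread in both groups, so that the gap is the difference between the two inverse-normal values of the rates; the function name `threshold_gap` and the dictionary layout are ours.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: Gaussian with unit SD in both groups


def threshold_gap(rate_a, rate_b):
    """Gap between two group means, in SD, implied by the rates at which
    each group crosses a common threshold."""
    return abs(STD_NORMAL.inv_cdf(rate_a) - STD_NORMAL.inv_cdf(rate_b))


# (CCPCP group %, comparison group %), taken from Table 2
outcomes = {
    "HS completion": (49.7, 38.5),
    "Juvenile arrests": (16.9, 25.1),
    "Violent arrests": (9.0, 15.3),
    "School dropout": (46.7, 55.0),
    "Grade retention": (21.9, 32.3),
    "Special Ed.": (13.5, 20.7),
}

gaps = {name: threshold_gap(a / 100.0, b / 100.0) for name, (a, b) in outcomes.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:.2f} SD")
```

Each computed gap should agree with the corresponding entry of Table 3 to within rounding.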

The gaps created by the early intervention program are almost constant. We are probably looking not at six fuzzy variables but at one: a single human attribute that was modified by early intervention, and whose change persisted through young adulthood. A shift in the distribution of this variable by about 0.3 SD accounts quantitatively for all the results of Table 2. We might call the variable "socialization," and its shift to higher values "getting civilized." There is yet hope.

###

APPENDIX: The Method of Thresholds in Brief
Let F_B and F_W be the fractions of black and white students, respectively, who are suspended by a given school. Let λ be the point on the aggressiveness axis beyond which suspension kicks in. That is, λ is the threshold for suspension. Its value will vary from school to school because tolerance for disagreeable behavior varies.

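We may write,

$$F_B = \int_{\lambda}^{\infty} P(x)\,dx \qquad (A.1)$$

$$F_W = \int_{\lambda}^{\infty} P(x+\Delta)\,dx \qquad (A.2)$$
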
where P is the normalized distribution of aggressiveness for blacks, and Δ is the black-white mean difference in aggressiveness.

In Skiba et al., the fractions F_B and F_W are 0.270 and 0.171 respectively. Using these values (A.1) and (A.2) may be solved simultaneously for λ and Δ. Taking P to be Gaussian, the result is Δ = 0.34 and λ = 0.61.
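
Under the Gaussian assumption the solution can be written in closed form. As a sketch, let Φ denote the standard normal cumulative distribution function (a symbol introduced here, not in the original). The suspended fractions are then F_B = 1 - Φ(λ) and F_W = 1 - Φ(λ + Δ), so that

$$\lambda = \Phi^{-1}(1 - F_B), \qquad \Delta = \Phi^{-1}(1 - F_W) - \lambda$$

With F_B = 0.270 and F_W = 0.171:

$$\lambda = \Phi^{-1}(0.730) \approx 0.61, \qquad \Delta = \Phi^{-1}(0.829) - \lambda \approx 0.95 - 0.61 = 0.34$$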

###
