Guidelines for Using Racial and Ethnic Groups in Data Analyses
Updated: July 2003
Purpose
Background and Context
Definition of Minimum Categories
Guidelines
Implementation Schedule
Data Collection
Data Tabulation
Data Presentation
Additional Information
Special Issues
Estimates of "More than One Race" in Washington State
Bridging Ratios for Trend Analysis
Bridging Methods
Sample Use of Statistical Methods
Recommended Method
Additional Information
Converting Multiple to Single Race: Washington State Department of Health Discussion Paper
Purpose
The Assessment Operations Group in the Washington
State Department of Health is coordinating the
development of guidelines related to data
development and use in order to promote good
professional practice among staff involved in
assessment activities within the Washington State
Department of Health and in Local Health
Jurisdictions in Washington. While the guidelines
are intended for an audience of differing levels of
training related to data development and use, they
assume a basic knowledge of epidemiology and
biostatistics. They are not intended to recreate
basic texts and other sources of information related
to the topics covered by the guidelines, but rather
they focus on common issues encountered in public
health practice and where applicable, on issues
unique to Washington state.
Background and Context
Epidemiology is the study of the distribution and
determinants of disease frequency in the human
population. Epidemiologists often examine this
distribution by race or ethnic group. The concepts
of race and ethnic group and the meaning assigned to
these concepts have changed considerably over time.
Current research on the human genome shows that
genetic variation within a particular
"race" group is much larger than the
variation between groups. (See, for example,
Rosenberg NA, et al. Science, 2002, 298: 2381-2384;
Garte S. Public Health Reports, 2002, 117:421-425;
Kaplan MS and Bennett T. JAMA, 2003, 289:2709-2716.)
Thus, while sickle-cell anemia is often thought of
as an "African-specific" genetic
condition, people from certain areas of Greece,
Italy and the Arabian Peninsula, generally
classified as "white," are also affected.
Similarly, malignant melanoma is most common in
"whites," but affects people of all
"races" and skin types.
Most scientists do not believe race is a valid
biological construct. (See discussion in 'The Meaning of
Race in Science - Considerations for Cancer Research,'
Report of the President's Cancer Panel, National
Institutes of Health, April 7, 1997.) Researchers such
as Camara Jones propose that "race is only a rough
proxy for socioeconomic status, culture, and genes, but
it precisely captures the social classification of
people in a race-conscious society such as the United
States. … That is, the variable 'race' is not a
biological construct that reflects innate differences,
but a social construct that precisely captures the
impacts of racism" (Jones CP, AJPH, 2000,
90:1212-1215).
Thus, in most public health assessment, race and
ethnic group should be viewed as capturing the effects
of complex social, cultural, economic and political
factors on human health, and these factors must be
addressed in interpreting health data. For example, good
birth outcomes among Mexican-American women are thought
to be related to socio-cultural practices supportive of
healthy lifestyle choices during pregnancy.
Discrimination and racism may affect the quality of
medical care, leading to poorer health outcomes among
African Americans and other race and ethnic groups.
(See, for example, Smedley B et al. Unequal Treatment:
Confronting Racial and Ethnic Disparities in Health
Care, Institute of Medicine, 2003.) Additionally, race
may sometimes act as a proxy for socioeconomic factors.
For example, people "belonging to" a
specific racial or ethnic group may, as a group, have
fewer material resources than those in other groups.
Differences in health status caused by lack of access to
material goods may appear as differences in health
status among racial and ethnic groups, although the root
cause is not race or ethnicity. Epidemiologic analysis
can be used to assess these relationships and guide
interventions.
The distribution of health care in the community may
also be a reason for measuring health data by race and
ethnic group. Some health care providers and other
social and health service organizations serve people
primarily from one or several racial or ethnic groups.
These providers and organizations want to know the
health status of the groups they serve and one method of
delineating this is to analyze health data by race and
ethnic group.
Concepts of race and ethnic group have
changed over time. During the early 1990s, the United
States Office of Management and Budget (OMB) embarked on
a nationwide review of the federal guidelines for
reporting race and ethnicity that had been in effect
since 1977 (Statistical Policy Directive No. 15). As a
result of that review, OMB issued a revised standard in
1997 that was used in the United States Census in 2000,
in birth and fetal death certificates in 2003, and in
death certificates in 2004. The three major changes in
the 1997 standard are:
- People can be identified by more than one
racial category
- Pacific Islanders will no longer be classified
with Asians
- The question on Hispanic/Latino ethnicity
will be asked before the race question.
Definition of Minimum Categories
The minimum categories established in the 1997
OMB Standard for Federal Data on Race and Ethnicity
are:
- Race
- American Indian or
Alaska Native (AIAN): A person having
origins in any of the original peoples of
North and South America (including Central
America), and who maintains tribal
affiliation or community attachment.
- Asian: A person having origins in
any of the original peoples of the Far
East, Southeast Asia, or the Indian
subcontinent, including, for example,
Cambodia, China, India, Japan, Korea,
Malaysia, Pakistan, the Philippine
Islands, Thailand and Vietnam.
- Black or African
American: A person having origins in
any of the black racial groups of Africa.
Terms such as "Haitian" or
"Negro" can be used in addition
to "Black or African American."
- Native Hawaiian or
Other Pacific Islander (NHOPI): A
person having origins in any of the
original peoples of Hawaii, Guam, Samoa,
or other Pacific Islands.
- White: A person having origins in
any of the original peoples of Europe, the
Middle East, or North Africa.
- Ethnic Group
- Hispanic or
Latino: A person of Cuban, Mexican,
Puerto Rican, South or Central
American or other Spanish culture or
origin, regardless of race. The term
"Spanish origin" can be used
in addition to "Hispanic or
Latino."
Guidelines: Implementation Schedule
- Continue to use the OMB 1977 Standards for
Race and Ethnicity for all data pertaining to
years through 1999, regardless of date of
publication.
- Adopt the OMB 1997 Standards for Race and
Ethnicity for collecting data by January 1,
2005.
- Adopt the OMB 1997 Standards for Race and
Ethnicity for tabulating any data collected
using the 1997 standard.
- January 1, 2000 - December 31, 2004 is
called the transition period. Some programs
will use the 1997 standard and some programs
will use the 1977 standard. Bridging estimates
may have to be used for data comparability.
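The schedule above can be read as a lookup from data year to the applicable standard. The following is a minimal sketch under that reading; the function name `omb_standard_for_year` and the returned labels are illustrative assumptions, not part of the guidelines.

```python
def omb_standard_for_year(data_year: int) -> str:
    """Return which OMB race/ethnicity standard applies to data for a year.

    Illustrative only: the returned labels are hypothetical names.
    """
    if data_year <= 1999:
        # Data pertaining to years through 1999 stay on the 1977 standard,
        # regardless of when they are published.
        return "OMB 1977"
    if data_year <= 2004:
        # Transition period (January 1, 2000 - December 31, 2004): a program
        # may be on either standard, so bridging may be needed.
        return "transition (1977 or 1997, depending on program)"
    # Data collection is to follow the 1997 standard by January 1, 2005.
    return "OMB 1997"
```

During the transition period the caller still has to check which standard a given program actually used; the function only flags that the year falls in that window.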
Guidelines: Data Collection
- Respondents and informants may identify with
more than one race. Data on multiple race
should be collected as multiple responses to a
single question rather than from a separate
"multiracial" category. Recommended
question wording is "Mark one or
more...," "Select one or
more...," or "Race - enter one or
more."
- For ethnic group, data should be collected
on whether or not a person is of Hispanic or
Latino culture or origin, but the OMB
standards do not permit multiple responses
(e.g., both Hispanic and non-Hispanic
heritage). To help provide more complete race
data for Hispanic respondents, an instruction
to answer both the Hispanic or Latino question
and the race question may be useful,
especially for mail surveys or
self-administered questionnaires.
- Data should be collected separately for race
and ethnicity, with ethnicity collected first.
- If a combined race/ethnicity format must be
used, a person should be allowed to select
more than one racial/ethnic category.
- The specific terminology for racial and
ethnic categories as described above should be
used whenever the categories are
listed.
- Data collection is not limited to the
categories described above. In fact, the
collection of subgroup detail is
encouraged.
- Subgroup detail could be collected
through write-in entries or follow-up
questions asked by the interviewer.
- If more detailed categories are used,
then the additional categories must be
organized so that they can be aggregated
into the minimum categories described
above.
- Additional categories should be mutually
exclusive.
- Additional categories should be
consistent with available denominator data
if one intends to calculate rates.
- Additional categories need to be
meaningful to the populations about whom
data are being collected. If possible,
involve the communities in developing
additional categories to help assure this.
- Mode of administration should be considered
when designing questions and instructions.
- For face-to-face surveys, a flashcard
with the categories may be useful.
- For telephone surveys, fewer response
categories could be used, with follow-up
questions to provide more detail. The way
response options are read is important. To
avoid confusion, pause between categories
(e.g., "White (pause) Black or African
American (pause)," etc.) so that
respondents don't think they have to
choose between Black and African American.
- For self-administered forms with a check
box format, definitions for the minimum
race categories may be needed.
- Use self-reporting rather than observer
identification whenever possible. If
self-reporting is not possible (e.g., for a
deceased person), attempt to obtain proxy
responses from family or friends before using
observer identification.
- Use translated data collection forms to
ensure inclusion of people from diverse
backgrounds whenever possible.
DATA COLLECTION EXAMPLE FOR SELF-ADMINISTERED
QUESTIONNAIRE
NOTE: Please answer both questions 5 and 6.
5. Are you Spanish/Hispanic/Latino? Mark [X] the
"No" box if not Spanish/Hispanic/Latino.
- No, not Spanish/Hispanic/Latino
- Yes, Mexican, Mexican Am, Chicano
- Yes, Puerto Rican
- Yes, Cuban
- Yes, other Spanish/Hispanic/Latino (print group)
6. What is your race? Mark [X] one or more races to indicate what this
person considers himself/herself to be.
- White
- Black, African Am., or Negro
- American Indian or Alaska Native (print name of enrolled or principal
tribe)
- Asian Indian
- Chinese
- Filipino
- Japanese
- Korean
- Vietnamese
- Native Hawaiian
- Guamanian or Chamorro
- Samoan
- Other Asian (print race)
- Other Pacific Islander (print race)
- Some other race (print race)
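The requirement that detailed categories aggregate into the minimum categories can be sketched as a lookup table. The example below is a hypothetical illustration using the check boxes from the example questionnaire; the names `DETAILED_TO_MINIMUM` and `to_minimum_categories`, and the choice to map "Some other race" to an "Other" category, are assumptions made for this sketch.

```python
# Hypothetical mapping from the detailed check boxes in the example
# questionnaire to the minimum race categories. Each detailed category maps
# to exactly one minimum category, which keeps the detail aggregable.
DETAILED_TO_MINIMUM = {
    "White": "White",
    "Black, African Am., or Negro": "Black or African American",
    "American Indian or Alaska Native": "American Indian or Alaska Native",
    "Asian Indian": "Asian",
    "Chinese": "Asian",
    "Filipino": "Asian",
    "Japanese": "Asian",
    "Korean": "Asian",
    "Vietnamese": "Asian",
    "Other Asian": "Asian",
    "Native Hawaiian": "Native Hawaiian or Other Pacific Islander",
    "Guamanian or Chamorro": "Native Hawaiian or Other Pacific Islander",
    "Samoan": "Native Hawaiian or Other Pacific Islander",
    "Other Pacific Islander": "Native Hawaiian or Other Pacific Islander",
    "Some other race": "Other",  # assumption: tabulated as "Other"
}


def to_minimum_categories(marked_boxes):
    """Collapse one respondent's marked boxes into minimum categories.

    Returns a sorted list of distinct minimum categories. Raises ValueError
    if a box has no defined minimum category, so gaps in the scheme are
    noticed rather than silently dropped.
    """
    unknown = [box for box in marked_boxes if box not in DETAILED_TO_MINIMUM]
    if unknown:
        raise ValueError(f"No minimum category defined for: {unknown}")
    return sorted({DETAILED_TO_MINIMUM[box] for box in marked_boxes})
```

A respondent who marks both "Chinese" and "Japanese" collapses to the single minimum category "Asian", while one who marks "White" and "Samoan" keeps two minimum categories.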
Guidelines: Data Tabulation
- Tabulate data so as to accurately represent
each person's choice of racial and ethnic
identity.
- Include as much detail on race and ethnicity
as possible without compromising data quality
or confidentiality. More detail can usually be
published for population totals than for
attributes (e.g., income, education, or health
outcomes).
- Aggregate data by racial/ethnic group only
when such an aggregation provides meaningful
categories (e.g., "nonwhite" or
"other" is not usually a meaningful
category).
- In addition to the minimum racial and ethnic
categories (and
subgroup detail where possible), data tables may include three other
categories:
- Other Race: Use
for responses that do not match any of the standard racial categories.
- Race Not Reported: Use when
information on race was not provided. If
data are available, this category can be
subdivided according to the reason that
information on race was not obtained:
refusal, don't know, and not ascertained.
- Not Tabulated Above: Use to
aggregate responses for any racial
categories (e.g., a single race or
responses to more than one category) that
do not contain enough people to be
published separately because of data
quality or confidentiality concerns.
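The tabulation categories above can be sketched as a small counting routine. This is a minimal illustration, not a prescribed method: the function name `tabulate_race`, the input layout, and the `min_cell_size` threshold are all assumptions, and the actual threshold should come from the applicable data quality and confidentiality standards.

```python
from collections import Counter


def tabulate_race(records, min_cell_size=10):
    """Count respondents by race category, pooling small cells.

    records: iterable of lists of minimum race categories, one list per
             person (an empty list means race was not reported).
    min_cell_size: hypothetical suppression threshold.
    """
    counts = Counter()
    for races in records:
        distinct = set(races)
        if not distinct:
            counts["Race Not Reported"] += 1
        elif len(distinct) == 1:
            counts[next(iter(distinct))] += 1
        else:
            counts["More than one race"] += 1

    table = Counter()
    for category, n in counts.items():
        # Assumption: "Race Not Reported" is always shown separately; any
        # other cell below the threshold is pooled instead of published.
        if n < min_cell_size and category != "Race Not Reported":
            table["Not Tabulated Above"] += n
        else:
            table[category] += n
    return dict(table)
```

A production version would also need to confirm that the pooled "Not Tabulated Above" cell itself meets the threshold, since pooling a single small category does not protect it.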
Guidelines: Data Presentation
The following tables show some options for presenting
data. On all tables the five race categories (plus 'Other') are listed
alphabetically.
- Follow this example when sample sizes or population counts do not
permit greater detail.
Table 1. Minimum Presentation of Data on Race
Race | Number | Percent |
Total | | |
AIAN¹ | | |
Asian | | |
Black | | |
NHOPI¹ | | |
Other | | |
White | | |
More than one race | | |
Race Not Reported | | |
¹ AIAN = American Indian or Alaska Native;
NHOPI = Native Hawaiian or Other Pacific Islander
If sample or population sizes for any of the
racial groups are too small to meet data quality or confidentiality
standards, then combine these racial groups into a category labeled 'Not
Tabulated Above' and present data only for the larger racial
categories.
- Follow this example to report data containing the minimum racial
groupings plus additional detail (e.g., under "Asian"
and under "More than one race").
Table 2. Detailed Presentation of Data on Race
Race | Number | Percent |
Total | | |
AIAN¹ | | |
Asian | | |
  Asian Indian | | |
  Chinese | | |
  Filipino | | |
  Japanese | | |
  Korean | | |
  Vietnamese | | |
Black | | |
NHOPI¹ | | |
Other | | |
White | | |
More than one race | | |
  AIAN/White | | |
  Asian/White | | |
  Black/White | | |
  Other/White | | |
Race Not Reported | | |
¹ AIAN = American Indian or Alaska Native;
NHOPI = Native Hawaiian or Other Pacific Islander
- Report as much detail under the category
"More than one race" as needed to
represent the responses provided by
individuals (without violating confidentiality
or data analysis standards) and such that the
minimum set of racial categories can be
recreated. The detail reported under 'More
than one race' in this table represents the
most frequently reported combinations of
racial categories based on National Health
Interview Survey data. For other populations,
the most frequent combinations may be
different.
- When data systems collect more detail on
subgroups of the main categories, some persons
may indicate that they belong to more than one
subgroup. For example, in the Asian category,
respondents might indicate both Chinese and
Japanese heritage. These respondents should be
included in the single race total for Asians,
not in the "More than one race"
category. If sample size permits, they can be
tabulated separately as an Asian subgroup
"More than one Asian race."
- Follow this example to reflect the
complexity of reporting race and to present
inclusive categories. (Note: persons may be
counted in more than one category in the table
below.)
Table 3. Detailed Presentation of Data on Race
and the All Inclusive Distributions
Race | Number | Percent |
Total | | |
AIAN¹ | | |
Asian | | |
  Asian Indian | | |
  Chinese | | |
  Filipino | | |
  Japanese | | |
  Korean | | |
  Vietnamese | | |
Black | | |
NHOPI¹ | | |
Other | | |
White | | |
More than one race | | |
  AIAN/White | | |
  Asian/White | | |
  Black/White | | |
Race Not Reported | | |
AIAN all inclusive | | |
AIAN and other race(s) | | |
Asian all inclusive | | |
Asian and other race(s) | | |
(etc. for other categories) | | |
¹ AIAN = American Indian or Alaska Native;
NHOPI = Native Hawaiian or Other Pacific Islander
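The "all inclusive" rows of Table 3 count a person once for every race they report, so the rows can sum to more than the total number of people. A minimal sketch of that counting follows; the function name `all_inclusive_counts` and the input layout are assumptions made for illustration.

```python
from collections import Counter


def all_inclusive_counts(records):
    """Count each race reported alone or in combination with other race(s).

    records: iterable of lists of minimum race categories, one per person.
    Returns {race: {"alone": n, "and other race(s)": n, "all inclusive": n}}.
    """
    alone = Counter()
    combined = Counter()
    for races in records:
        distinct = set(races)
        if len(distinct) == 1:
            alone[next(iter(distinct))] += 1
        else:
            # A multiple-race respondent is counted under every race
            # reported (an empty response contributes nothing here).
            for race in distinct:
                combined[race] += 1
    return {
        race: {
            "alone": alone[race],
            "and other race(s)": combined[race],
            "all inclusive": alone[race] + combined[race],
        }
        for race in sorted(set(alone) | set(combined))
    }
```

Because of the double counting, all-inclusive percentages should not be expected to add to 100.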
- Use the 'all inclusive' headings to
represent persons who report a particular race
either alone or in combination with other
race(s). The distributions of individuals
under the all-inclusive category may provide
information on groups that are not of
sufficient size in the sample or population to
be included in basic tabulations.
- Present data on Hispanic or Latino ethnicity
using the format shown in Table 4. The format is
similar to the race tabulations, except that no
provision is made for reporting more than one
subgroup or both Hispanic and non-Hispanic ethnicity. The
subgroups used will be a function of the
sample size and the population composition
where the data are collected.
Table 4. Hispanic or Latino Ethnicity with
Detail
Ethnicity | Number | Percent |
Total | | |
Hispanic/Latino | | |
  Cuban | | |
  Mexican | | |
  Puerto Rican | | |
Not Hispanic/Latino | | |
Ethnicity Not Reported | | |
- Report race by ethnicity whenever
possible without violating data quality or
confidentiality standards. Thus, Tables 1-3
could be further subdivided by tabulating data
in each category for "Hispanic or
Latino," "Not Hispanic or
Latino," and "Ethnicity Not
Reported." Table 5 shows Table 2
subdivided in this way. Follow this example to
report race by ethnicity.
Table 5. Detailed Presentation of Data on Race
and Hispanic or Latino Ethnicity
Race | Number | Percent |
Total | | |
Hispanic or Latino | | |
  AIAN¹ | | |
  Asian | | |
  Black | | |
  NHOPI¹ | | |
  Other | | |
  White | | |
  More than one race | | |
  Race Not Reported | | |
Not Hispanic or Latino | | |
  AIAN | | |
  Asian | | |
    Asian Indian | | |
    Chinese | | |
    Filipino | | |
    Japanese | | |
    Korean | | |
    Vietnamese | | |
  Black | | |
  NHOPI | | |
  Other | | |
  White | | |
  More than one race | | |
    AIAN/White | | |
    Asian/White | | |
    Black/White | | |
  Race Not Reported | | |
Ethnicity Not Reported | | |
  White | | |
  Race Not Reported | | |
¹ AIAN = American Indian or Alaska Native;
NHOPI = Native Hawaiian or Other Pacific Islander
Note: Not all categories are included due
to small cell sizes. Values for these cells
are included in the total category and
appropriate subcategories; therefore,
subcategories may not add to total.
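A race-by-ethnicity layout like Table 5 amounts to a two-way count. The sketch below shows one way to build it; the function name `race_by_ethnicity`, the input layout, and the use of `None` for an unreported ethnicity are assumptions made for illustration.

```python
ETHNICITY_ROWS = [
    "Hispanic or Latino",
    "Not Hispanic or Latino",
    "Ethnicity Not Reported",
]


def race_by_ethnicity(records):
    """Cross-tabulate race category within ethnicity.

    records: iterable of (ethnicity, race_category) pairs, one per person;
             ethnicity is None when it was not reported.
    Returns {ethnicity: {race_category: count}} with all three ethnicity
    rows present, even when empty.
    """
    table = {row: {} for row in ETHNICITY_ROWS}
    for ethnicity, race in records:
        row = "Ethnicity Not Reported" if ethnicity is None else ethnicity
        if row not in table:
            raise ValueError(f"Unexpected ethnicity value: {ethnicity!r}")
        table[row][race] = table[row].get(race, 0) + 1
    return table
```

Cells that fall below data quality or confidentiality thresholds would still need to be withheld before publication, as the note to Table 5 describes.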
- If possible, collect data under both the
1977 and the 1997 OMB standards for a
sufficient period of time to create an
estimate of the characteristics of the
population under consideration under both
methods for reporting race and ethnicity.
Use these data to calculate a bridging
ratio for use in trend analysis.
- If it is not possible to calculate a
bridging ratio that is directly applicable to
a specific data collection system, then use a
bridging ratio available from another source
that most closely approximates the population
under consideration (e.g., state or county
survey).
- If it is impractical to use a bridging ratio
or this approach is not suitable, then clearly
indicate that there is a break in the data
series between data collected before and after
the implementation of the 1997 OMB standard on
race and ethnicity. Depending on the nature of
the analysis, use appropriate means for
conveying the change in the data series,
including separate tables or graphs, clearly
demarcated breaks in trend lines, footnotes,
technical notes, and explanations within the
text.
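The guidelines above do not give a formula for the bridging ratio, so the following is only a sketch under a stated assumption: for a category that exists under both standards, the ratio is taken as the count classified in that category under the 1977 standard divided by the count reporting that race alone under the 1997 standard, using records collected under both. The function names and the input layout are hypothetical.

```python
def bridging_ratios(dual_records):
    """Estimate bridging ratios from records classified under both standards.

    dual_records: iterable of (race_1977, race_1997) pairs for the same
    people, where race_1997 is a single-race category or
    "More than one race".
    Assumed definition (not stated in the guidelines): for each category,
    ratio = count under the 1977 standard / count in the same-named
    single-race category under the 1997 standard.
    """
    n_1977, n_1997 = {}, {}
    for race_1977, race_1997 in dual_records:
        n_1977[race_1977] = n_1977.get(race_1977, 0) + 1
        n_1997[race_1997] = n_1997.get(race_1997, 0) + 1
    return {
        race: n_1977[race] / n_1997[race]
        for race in n_1977
        if n_1997.get(race, 0) > 0
    }


def bridge_counts(counts_1997, ratios):
    """Scale 1997-standard single-race counts to 1977-comparable estimates."""
    return {
        race: count * ratios[race]
        for race, count in counts_1997.items()
        if race in ratios
    }
```

Categories that were redefined between the standards (for example, the separation of Pacific Islanders from Asians) have no same-named counterpart and would need an explicit crosswalk before this kind of ratio could be applied.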
We recommend following the suggested guidelines
described in "Use of Race and Ethnicity in
Biomedical Publication" (Kaplan MS and Bennett T.
JAMA, 2003, 289:2709-2716) when presenting data by race
and ethnicity.
- Due to the complexity of the issues, analysts
should consider why they are analyzing health data
by race and ethnic group and should articulate their
reasons in their reports and presentations.
- Analysts should specify how race and ethnicity
were collected (e.g., self-report using check boxes
or assignment by someone else using an open-ended
question). (See Guidelines:
Data Collection for specific recommendations on
collecting data on race and ethnic group.)
- Race and ethnicity should not be used as a proxy
for genetic variation in the absence of firmly
grounded genetic evidence. Similarly, in discussing
differences among racial and ethnic groups, analysts
should avoid attributing differences to
inherent underlying traits without clear evidence of
such traits.
- In interpreting differences among race and ethnic
groups, analysts should consider all conceptually
relevant factors, such as socioeconomic factors,
racism, and discrimination. Since lack of adjustment
for socioeconomic status may be an important source
of bias, analysts should try to adjust for these
factors.
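One common way to adjust for a factor such as socioeconomic status is direct standardization, which is used here purely as an illustration; the guidelines do not prescribe a method. The sketch compares a crude rate with a rate standardized to a common distribution of socioeconomic strata. The function names, strata labels, and all counts are hypothetical.

```python
def crude_rate(strata):
    """strata: {stratum: (cases, population)}; returns total cases / total population."""
    cases = sum(c for c, _ in strata.values())
    population = sum(p for _, p in strata.values())
    return cases / population


def standardized_rate(strata, standard_weights):
    """Directly standardized rate: weighted average of stratum-specific rates.

    standard_weights: {stratum: share of the standard population}; the
    shares are expected to sum to 1.
    """
    return sum(
        standard_weights[stratum] * (cases / population)
        for stratum, (cases, population) in strata.items()
    )


# Hypothetical counts: both groups have identical stratum-specific rates
# (0.10 in the low stratum, 0.05 in the high stratum) but different
# distributions across strata.
group_a = {"low SES": (30, 300), "high SES": (5, 100)}
group_b = {"low SES": (10, 100), "high SES": (15, 300)}
weights = {"low SES": 0.5, "high SES": 0.5}

crude_gap = crude_rate(group_a) - crude_rate(group_b)
adjusted_gap = standardized_rate(group_a, weights) - standardized_rate(group_b, weights)
```

In this made-up example the crude rates differ between the two groups, while the standardized rates are essentially equal, which is the pattern expected when an apparent difference between groups is explained by the distribution of socioeconomic status rather than by group membership itself.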