Evaluation & Research

Glossary of Testing & Measurement Terminology

 

A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

ABCs: The ABCs of Public Education is North Carolina’s comprehensive plan to improve public schools. The plan is based on three goals: 1) Strong Accountability; 2) Mastery of Basic Skills; and 3) Localized Control. The model uses End-of-Grade tests in grades 3-8 and End-of-Course tests in grades 6-12. Components for dropout, enrollment in the college/university courses of study, and competency are included in grades 9-12. The model includes standards for the performance and growth of student achievement.

ABCs Expected Growth: Met when a school’s average of academic change scores for students is 0 or greater. Academic change measures how close individual students come to their predicted performance.

ABCs Growth Model: In this model, students are expected to do at least as well this year as they have in the past, compared to other NC students who took the same statewide test in the year standards were set (usually the first year the test was given). Schools meet “expected growth” if students, on average, show a year of growth in a year’s time. Schools meet “high growth” if they meet expected growth and at least 60% of students in the school meet their individual growth targets.

ABCs High Growth: Met when expected growth is met and 60% of students meet their growth expectation.
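The two growth rules above can be sketched in a few lines of Python. This is only an illustration, not the state's official calculation: the function name and inputs are hypothetical, each student's "met target" flag is assumed to have been determined elsewhere, and the 60% rule is read here as "at least 60%".

```python
from statistics import mean

def abcs_growth_status(academic_change, met_target, high_growth_share=0.60):
    """Classify a school's ABCs growth from student-level data.

    academic_change: one academic change (AC) score per student.
    met_target: one boolean per student, True if the growth target was met.
    """
    expected = mean(academic_change) >= 0           # average AC is 0 or greater
    share_met = sum(met_target) / len(met_target)   # fraction meeting their target
    high = expected and share_met >= high_growth_share
    return {"expected_growth": expected, "high_growth": high, "share_met": share_met}

# Made-up data for five students.
ac_scores = [0.4, -0.1, 0.3, 0.2, -0.2]
met = [True, False, True, True, False]
status = abcs_growth_status(ac_scores, met)
print(status)
```

A school with a negative average academic change cannot reach high growth in this sketch, no matter how many individual students met their targets.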

ABCs Performance Composite: The percentage of students' scores at or above grade level. The composite reflects the percentage of scores at grade level across all tested subjects. All students tested are included in analyses.

ABCs Standards -- Awards & Recognitions: ABCs Awards recognize school excellence based on both student growth and performance. Several categories exist, the highest of which are Honor School of Excellence and School of Excellence. Both recognize schools that show Expected growth and have a performance composite of 90% or higher (that is, 90% or more of their scores at or above grade level). "Honor" is added for schools that also meet federal AYP standards. School recognitions are based on student performance, and staff bonuses are based on growth.

Ability: A characteristic indicative of an individual’s competence in a particular field. The word "ability" is frequently used interchangeably with aptitude, although many psychologists use "ability" to include what others term "aptitude" and "achievement." (See Aptitude)

Accountability: A means of judging policies and programs by measuring their outcomes or results against agreed upon standards. An accountability system provides the framework for measuring outcomes - not merely processes or workloads.


 

Academic Aptitude (See Scholastic Aptitude)

Academic Change (AC): The metric used in the state's ABCs growth model to determine whether a student has made "a year's worth of growth in a year's worth of time". The calculation is done student by student, and then aggregated up to the school level to determine whether a school makes growth under the ABCs program.

Accommodations: Modifications in the way assessments are designed or administered so that students with disabilities (SWD), limited English proficient (LEP) students or other test takers who are unable to take the original test under standard test conditions can be included in the assessment. Assessment accommodations or adaptations might include Braille forms for blind students or tests in native languages for students whose primary language is other than English.

Achievement Data: Any data that reflect students' academic attainments. Formative data are collected throughout the year and are used primarily to drive instruction; summative data are collected at the end of a cycle and are generally used to assess overall outcomes and examine areas for improvement.

Achievement Levels for EOG/EOC: State test scores are divided into four levels to designate proficiency. These are descriptions of the test taker’s competency in a particular area of knowledge. Student performance on North Carolina’s End-of-Grade and End-of-Course tests is reported by achievement level. There are four achievement levels reflecting student mastery of knowledge and skills in this subject or course area:

Achievement Test: An objective examination that measures educationally relevant skills or knowledge about such subjects as reading, spelling, or mathematics.

ACT®: A college entrance exam that assesses high school students' general educational development and their ability to complete college-level work.

Adequate Yearly Progress (AYP): Required under the federal No Child Left Behind law, AYP provides another way to measure school performance. To meet AYP, a school must meet target goals for each group of students that numbers 40 or more. Target goals are set annually by the state for reading and mathematics at grades 3-8 and 10, and for attendance rates or graduation rates as well. AYP is an all-or-nothing model. If a school misses one target, it does not make AYP. The long-term goal of AYP is to have every school at 100 percent student proficiency by 2013-14.

Admissions Test: A test of a student's ability to participate in special programs or advanced learning situations. For example, an honors-level class or a magnet school may require the attainment of high scores on an assessment for admission.


Advanced Placement (AP): A program that enables high school students to complete college-level courses while in high school. Students can take an AP exam for college placement and/or credit.

Age Norms: The distribution of test scores by age of test takers. For example, a norms table may be provided for 9 year olds. This age-norms table would present such information as the percentage of 9 year olds who score below each raw score on the test. (See Norms)

Alternate Assessments: Ways to assess students academically other than through standard test administration. Alternate assessments are used for some students with disabilities and some English language learners. NCExtend I and NCExtend II are examples from state EOG and EOC testing.

Alignment: The process of linking content and performance standards to assessment, instruction, and learning in classrooms. One typical alignment strategy is the step-by-step development of (a) content standards, (b) performance standards, (c) assessments, and (d) instruction for classroom learning. Ideally, each step is informed by the previous step or steps, and the sequential process is represented as follows: Content Standards - Performance Standards - Assessments - Instruction for Learning. In practice, the steps of the alignment process will overlap. The crucial question is whether classroom teaching and learning activities support the standards and assessments. System alignment also includes the link between other school, district, and state resources. Alignment supports the goals of the standards, i.e., whether professional development priorities and instructional materials are linked to what is necessary to achieve the standards.

Analytic Scoring: Evaluating student work across multiple dimensions of performance rather than from an overall impression (holistic scoring). In analytic scoring, individual scores for each dimension are scored and reported. For example, analytic scoring of a history essay might include scores of the following dimensions: use of prior knowledge, application of principles, use of original source material to support a point of view, and composition. An overall impression of quality may be included in analytic scoring. (See Holistic Scoring)

Anchor(s): A sample of student work that exemplifies a specific level of performance. Raters use anchors to score student work, usually comparing the student performance to the anchor. For example, if student work was being scored on a scale of 1-5, there would typically be anchors (previously scored student work), exemplifying each point on the scale.

Anecdotal Data: Data obtained from a description of a specific incident in an individual’s behavior (an anecdotal record). The report should be an objective account of behavior considered significant for the understanding of the individual.

Annual Measurable Achievement Objectives (AMAO): Objectives for limited English proficient (LEP) students, mandated by Title III of NCLB. They are measured by three goals: Progress (measured by the IPT), Proficiency (measured by the IPT), and district AYP for the LEP subgroup (measured by EOGs and EOCs).

Aptitude: A combination of characteristics, whether native or acquired, that is indicative of an individual’s ability to learn or to develop proficiency in some particular area if appropriate education or training is provided.

Aptitude Test: A test consisting of items selected and standardized so that the test predicts a person's future performance on tasks not obviously similar to those in the test. Aptitude tests may or may not differ in content from achievement tests, but they do differ in purpose. An aptitude test might consist of items that predict future learning or performance; achievement tests consist of items that sample the adequacy of past learning. Aptitude tests include those of general academic (scholastic) ability; those of special abilities, such as verbal, numerical, mechanical, or musical; tests assessing "readiness" for learning; and tests that measure both ability and previous learning, and are used to predict future performance—usually in a specific field, such as foreign language, shorthand, or nursing.

Assessment: Any systematic method of obtaining information from tests and other sources, used to draw inferences about characteristics of people, objects, or programs; the process of gathering, describing, or quantifying information about performance; an exercise—such as a written test, portfolio, or experiment—that seeks to measure a student's skills or knowledge in a subject area.

Assessment System: The combination of multiple assessments into a comprehensive reporting format that produces comprehensive, credible, dependable information upon which important decisions can be made about students, schools, districts, or states. An assessment system may consist of a norm-referenced or criterion-referenced assessment, an alternative assessment system and classroom assessments.

Asymptote: An item statistic that describes the proportion of examinees that endorsed a question correctly but did poorly on the overall test. Asymptote for a theoretical four-choice item is 0.25 but can vary somewhat by test. (For mathematics it is generally 0.15 and for social studies it is generally 0.22).

Authentic Assessment: An assessment that measures a student's performance on tasks and situations that occur in real life. This type of assessment is closely aligned with, and models, what students do in the classroom.

Average: A statistic that indicates the central tendency or most typical score of a group of scores. Most often average refers to the sum of a set of scores divided by the number of scores in the set.

AYP: (See Adequate Yearly Progress)

AYP Subgroups: In each school, every group of students that numbers 40 or more has a target goal. Target goals are set annually by the state for reading and mathematics at grades 3-8 and 10, and for attendance rates or graduation rates as well. (See Adequate Yearly Progress)

 


Baseline: Level of behavior associated with a subject before an experiment or intervention begins.

Battery: A test battery is a set of several tests designed to be administered as a unit. Individual subject-area tests measure different areas of content and may be scored separately; scores from the subtests may also be combined into a single score.

Benchmark: A detailed description of a specific level of student performance expected of students at particular ages, grades, or development levels. Benchmarks are often represented by samples of student work. A set of benchmarks can be used as "checkpoints" to monitor progress toward meeting performance goals within and across grade levels.

Benchmark Assessments: Given to students periodically throughout the year or course to determine how much learning has taken place up to a particular point in time and to track progress toward meeting curriculum goals and objectives.

Bias: A situation that occurs in testing when items systematically measure differently for different ethnic, gender, or age groups. Test developers reduce bias by analyzing item data separately for each group, then identifying and discarding items that appear to be biased.

Biserial Correlation: The relationship between an item score (right or wrong) and the total test score.
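As an illustration of relating a right/wrong item score to the total test score, the sketch below computes the point-biserial correlation, a closely related statistic. The biserial coefficient proper additionally assumes a normally distributed trait underlying the item score, and the glossary does not say which form is used in practice. The function name and sample data are made up.

```python
from math import sqrt
from statistics import mean, pstdev

def point_biserial(item_correct, total_scores):
    """Point-biserial correlation between a 0/1 item score and the total score."""
    right = [t for i, t in zip(item_correct, total_scores) if i == 1]
    wrong = [t for i, t in zip(item_correct, total_scores) if i == 0]
    p = len(right) / len(item_correct)   # proportion answering the item correctly
    q = 1 - p                            # proportion answering incorrectly
    sd = pstdev(total_scores)            # population SD of the total scores
    return (mean(right) - mean(wrong)) / sd * sqrt(p * q)

# Made-up data: examinees who got the item right also have higher totals.
item = [1, 1, 1, 0, 0]
totals = [9, 8, 7, 4, 2]
r_pb = point_biserial(item, totals)
print(round(r_pb, 3))
```

A value near +1 means examinees who answered the item correctly tended to score high overall; a value near 0 means the item tells little about overall performance.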

Blue Diamond (BD): A management system for housing formative assessments, scoring, and generating reports.

 


Ceiling: The upper limit of performance that can be measured effectively by a test. Individuals are said to have reached the ceiling of a test when they perform at the top of the range in which the test can make reliable discriminations. If an individual or group scores at the ceiling of a test, the next higher level of the test should be administered, if available.

Checklist: An assessment that is based on the examiner observing an individual or group and indicating whether or not the assessed behavior is demonstrated.

Classroom Assessment: An assessment developed, administered, and scored by a teacher or set of teachers with the purpose of evaluating individual or classroom student performance on a topic. Classroom assessments may be aligned into an assessment system that includes alternative assessments and either a norm-referenced or criterion-referenced assessment. Ideally, the results of a classroom assessment are used to inform and influence instruction that helps students reach high standards.

Closed-Ended Questions: Questions which have a clear and apparent focus and a clearly called-for answer.

Composite Score: A single score used to express the combination, by averaging or summation, of the scores on several different tests.

Competency: A group of characteristics, native or acquired, which indicate an individual's ability to acquire skills in a given area.

Competency-based assessment (criterion-referenced assessment): Measures an individual's performance against a predetermined standard of acceptable performance. Progress is based on actual performance rather than on how well learners perform in comparison to others; usually given under classroom conditions.

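Confidence Interval (CI): A range of values around an estimate, defined by a margin of error, that is expected to contain the true value at a stated level of confidence (for example, 95%).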
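A minimal sketch of a confidence interval for a mean score follows. It assumes a normal approximation with z = 1.96 for roughly 95% confidence; with small samples a t-based multiplier is usually more appropriate. The function name and scores are made up.

```python
from math import sqrt
from statistics import mean, stdev

def mean_confidence_interval(scores, z=1.96):
    """Return (low, high) bounds around the mean of the scores."""
    m = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    margin = z * standard_error   # the margin of error
    return m - margin, m + margin

# Made-up scale scores.
low, high = mean_confidence_interval([250, 252, 248, 251, 249])
print(round(low, 2), round(high, 2))
```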

Construct Validity (See Validity).

Content Validity (See Validity).

Constructed-Response Item: An assessment unit with directions, a question, or a problem that elicits a written, pictorial, or graphic response from a student. Sometimes called an "open-ended" item.

Content standards: Broadly stated expectations of what students should know and be able to do in particular subjects and (grade) levels. Content standards define for teachers, schools, students, and the community not only the expected student skills and knowledge, but what programs should teach.

Conversion Tables: Tables used to convert a student's test scores from scale score units to grade equivalents, percentile ranks, stanines, or other types of scores.

Correlation: The degree of relationship between two sets of scores. A correlation of 0.00 denotes a complete absence of relationship. A correlation of plus or minus 1.00 indicates a perfect (positive or negative) relationship. Correlation coefficients are used in estimating test reliability and validity.
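The most common correlation coefficient, Pearson's r, can be computed as in the sketch below. The function name and data are illustrative only.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

perfect_positive = pearson_r([1, 2, 3], [2, 4, 6])   # scores rise together
perfect_negative = pearson_r([1, 2, 3], [6, 4, 2])   # one rises as the other falls
no_relationship = pearson_r([1, 2, 3], [1, 2, 1])    # no linear relationship
print(perfect_positive, perfect_negative, round(no_relationship, 2))
```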

Criteria: Guidelines, rules, characteristics, or dimensions that are used to judge the quality of student performance. Criteria indicate what we value in student responses, products or performances. They may be holistic, analytic, general, or specific. Scoring rubrics are based on criteria and define what the criteria mean and how they are used.

Criterion-Related Validity: See Validity

Cumulative Percent: See Percentile Rank

Curriculum: All of the instruction, services, and activities provided for students through formal schooling including but not limited to: content, teaching methods and practices, instructional materials and guides, the physical learning environment, assessment and evaluation, time organization, leadership, and controls. Curriculum includes planned, overt topics of instruction as well as unseen elements such as norms and values taught by the school and through classroom interactions between the teacher and learner, hidden social messages imbedded in the curriculum materials themselves, and the material that is not included in the overt or planned curriculum.

Curriculum Framework: A curriculum framework is a document outlining content strands and learning standards for a given subject area. The specific knowledge and skills taught in the classroom are based on student needs and objectives as identified by the teacher and students. By providing examples of learning activities and successful instructional strategies, the frameworks link statewide learning standards found within the framework to educational practices developed at the classroom level.

Curriculum Map: A matrix showing the coverage of each program learning outcome in each course.

Cut score: A specified point on a score scale, such that scores at or above that point are interpreted or acted upon differently from scores below that point. (See also Performance Standards.)

 


Data: A record of an observation or an event such as a test score, a grade in a mathematics class, or response time.

Data Analysis: Identifying patterns/trends among multiple pieces of data.

Data Capture: The electronic survey process used to collect students' end-of-school-year status on K-5 assessments.

Data Displays: Graphs, charts, or tables used to display data trends.

Demographic Data: Descriptive information about school population.

Developmental Scale Scores: The number of questions a student answers correctly is called a raw score. For the grade 3 pretest, the raw score is converted to a developmental scale score. The developmental scale score may be used to measure growth in a specific subject from year to year. Together, the developmental scale score on the grade 3 pretest, given during the first three weeks of school, and the developmental scale score on the end-of-grade test, given during the last three weeks of school, allow parents and teachers to measure a student’s growth in a particular subject.

Diagnostic Test: A test used to "diagnose" or analyze; that is, to locate an individual’s specific areas of weakness or strength, to determine the nature of his or her weaknesses or deficiencies, and, if possible, to suggest their cause.

Difference Score Reliability: Reliability of the distribution of differences between two sets of scores. These scores could be on two different subtests or on a pre- and posttest, where the difference score is typically called a gain score. The meaning of the term "reliability" is the same for a set of difference scores as for a distribution of regular test scores. (See Reliability) However, since difference scores are derived from two somewhat unreliable scores, difference scores are often quite unreliable. This must be kept in mind when interpreting difference scores.
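One classical formula makes the point concrete. It is offered here as an illustration under the assumption that the two sets of scores have equal variances, and it is not taken from this glossary. With r_XX and r_YY the reliabilities of the two scores and r_XY the correlation between them:

```latex
r_{DD} = \frac{\tfrac{1}{2}\,(r_{XX} + r_{YY}) - r_{XY}}{1 - r_{XY}}
% Example: r_{XX} = r_{YY} = 0.80 and r_{XY} = 0.60 give
% r_{DD} = (0.80 - 0.60) / (1 - 0.60) = 0.50
```

In the example, two tests that are each fairly reliable (0.80) produce a difference score with a reliability of only 0.50, because the tests are correlated with each other.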

Difficulty Index: The percent of students who answer an item correctly, designated as p. (At times defined as the percent who respond incorrectly, designated as q.)
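A small sketch of the calculation, with item responses coded 1 for correct and 0 for incorrect (the function name and data are illustrative):

```python
def difficulty_index(item_scores):
    """Return (p, q): the proportions answering correctly and incorrectly."""
    p = sum(item_scores) / len(item_scores)
    return p, 1 - p

# Three of four examinees answered correctly.
p, q = difficulty_index([1, 1, 1, 0])
print(p, q)
```

Multiplying p or q by 100 gives the percentages described in the definition.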

Dimensionality: The extent to which a test item measures more than one ability.

Disaggregated Data: Breaking down overall results into smaller groupings. For example, test results are sorted by students who are economically disadvantaged, by racial and ethnic groups, by disability, or English proficiency.
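As a rough illustration, the sketch below breaks an overall proficiency rate down by one grouping field. The record layout and field names ("frl", "lep", "proficient") are hypothetical and do not describe any actual WCPSS data file.

```python
from collections import defaultdict

def percent_proficient_by_group(records, group_field):
    """Return {group: percent proficient} for one grouping field."""
    tallies = defaultdict(lambda: [0, 0])   # group -> [proficient, tested]
    for record in records:
        tally = tallies[record[group_field]]
        tally[0] += int(record["proficient"])
        tally[1] += 1
    return {group: round(100 * proficient / tested, 1)
            for group, (proficient, tested) in tallies.items()}

# Made-up student records.
students = [
    {"frl": "yes", "lep": "no", "proficient": True},
    {"frl": "yes", "lep": "yes", "proficient": False},
    {"frl": "no", "lep": "no", "proficient": True},
    {"frl": "no", "lep": "no", "proficient": True},
]
by_frl = percent_proficient_by_group(students, "frl")
print(by_frl)
```

The overall rate for these four students is 75%, which hides the gap between the two groups that the disaggregated result makes visible.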

Discipline Data: Any data that reflect student behavior.

Discrimination Parameter: The property that indicates how accurately an item distinguishes between examinees of high ability and those of low ability on the trait being measured. An item that can be answered equally well by examinees of low and high ability does not discriminate well and does not give any information about relative levels of performance.

Discrimination Index: The extent to which an item differentiates between high-scoring and low-scoring examinees. Discrimination indices generally can range from -1.00 to +1.00. Other things being equal, the higher the discrimination index, the better the item is considered to be. Items with negative discrimination indices are generally items in need of rewriting.
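One common way to compute a discrimination index is the upper-lower method: compare the proportion answering correctly in the highest-scoring group with the proportion in the lowest-scoring group. The sketch below assumes groups of about 27% of examinees each, a frequent convention; the glossary does not state which method or group size is used, and the function name is made up.

```python
def discrimination_index(item_correct, total_scores, group_fraction=0.27):
    """Upper-lower discrimination index for one item (ranges from -1 to +1)."""
    ranked = sorted(zip(total_scores, item_correct), key=lambda pair: pair[0])
    k = max(1, round(len(ranked) * group_fraction))   # size of each group
    lower = [item for _, item in ranked[:k]]          # lowest total scores
    upper = [item for _, item in ranked[-k:]]         # highest total scores
    return sum(upper) / k - sum(lower) / k

# Made-up data: ten examinees with total scores 1 through 10.
totals = list(range(1, 11))
item = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
d = discrimination_index(item, totals)
print(d)
```

Reversing the item scores, so that only low scorers answer correctly, yields a negative index, the pattern that flags an item for rewriting.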

Distracters: The incorrect answer choices in a multiple-choice test question.

District Improvement: The No Child Left Behind (NCLB) legislation determines a school district to be in District Improvement when the district does not meet the AYP goals for two consecutive years.

 


Economically Disadvantaged: Students eligible for free or reduced-price lunch (FRL).

Educational Research Services (ERS): A searchable site, of which WCPSS is a member, where you can query, download, and print research documents and summaries. It also provides many printed reports and literature reviews on educational topics at low cost.

Effectiveness Index: A measure of how students performed on EOG and EOC tests compared to similar students in WCPSS. This index serves primarily as a tool for identifying areas for improvement, with value for accountability and identifying effective teachers and schools within WCPSS.

Elementary and Secondary Education Act (ESEA): The principal federal law affecting K-12 education. When the ESEA of 1965 was reauthorized and amended in 2002, it was renamed the No Child Left Behind (NCLB) Act.

Embedded Test Model: Using an operational test to field test new items or sections. The new items or sections are “embedded” in the operational test and appear to examinees to be indistinguishable from the operational items.

End of Course (EOC): The EOC tests assess the competencies defined by the NC Standard Course of Study (SCOS) for Algebra I, Algebra II, English I, Biology, Chemistry, Geometry, Physical Science, Physics, Civics and Economics, and US History. Tests are taken during the last 5-10 days of school (calendar dependent) or the equivalent for alternative schedules. (See ABCs Standards)

End of Grade Testing (EOG): EOG tests in reading and mathematics are taken by students in grades 3-8, with science taken in grades 5 and 8. EOGs are given during the last three weeks of the school year. (See ABCs Standards)

Equivalent Forms: Any of two or more forms of a test that are closely parallel with respect to content and the number and difficulty of the items included. Equivalent forms should also yield very similar average scores and measures of variability for a given group, so that any differences between forms are statistically insignificant. Also called parallel or alternate forms.

Error of Measurement: The amount by which the score actually received (an observed score) differs from a hypothetical true score. (See Standard Error of Measurement)
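In the notation of classical test theory (the symbols are conventional and are not defined in this glossary), with X the observed score, T the hypothetical true score, and E the error of measurement:

```latex
X = T + E \qquad\Longleftrightarrow\qquad E = X - T
```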

Evaluation: When used for most educational settings, evaluation means to measure, compare, and judge the quality of student work, schools, or a specific educational program.

Evaluation and Research (E&R): The WCPSS Evaluation and Research Department.

 


Face Validity: See Validity

Field Test: A collection of items to approximate how a test form will work. Statistics produced will be used in interpreting item behavior/performance and allow for the calibration of item parameters used in equating tests.

Floor: The lowest limit of performance that can be measured effectively by a test. Individuals are said to have reached the floor of a test when they perform at the bottom of the range in which the test can make reliable discriminations.

Focus Group: A qualitative data-collection method that relies on facilitated discussions with participants who are asked a series of carefully constructed open-ended questions about their attitudes, beliefs, and experiences.

Foil Counts: Number of examinees that endorse each foil (e.g., number who answer “A”, number who answer “B”, etc.).

Foils: The possible answer choices presented in a multiple-choice question.
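Foil counts for a single item can be tallied as in the sketch below. The four-choice A-D layout and the function name are assumptions made for illustration.

```python
from collections import Counter

def foil_counts(responses, foils=("A", "B", "C", "D")):
    """Count how many examinees chose each foil, including foils nobody chose."""
    counts = Counter(responses)
    return {foil: counts.get(foil, 0) for foil in foils}

# Made-up responses from eight examinees to one item.
counts = foil_counts(["A", "B", "A", "C", "D", "A", "A", "B"])
print(counts)
```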

Formative Assessments: A process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to improve intended instructional outcomes. The purpose of formative assessments is to assist teachers in identifying where necessary adjustments to instruction are needed to help students achieve the intended instructional outcomes.

Free or Reduced-Price Lunch (FRL): Students are eligible to receive free or reduced-price lunch, based upon parent or guardian financial status through a federal governmental program.

Frequency: The number of times a given score (or a set of scores in an interval grouping) occurs in a distribution or set of scores.

Frequency Distribution: A tabulation of scores from low to high or high to low showing the number of individuals who obtain each score or fall within each score interval.

Frequency Reports: EOG and EOC test reports that list each scale score and the number and percentage of students receiving it.
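A frequency report of this kind is a frequency distribution with a percentage column added. A minimal sketch, using made-up scale scores and a hypothetical function name:

```python
from collections import Counter

def frequency_report(scale_scores):
    """Return rows of (scale score, number of students, percent of students)."""
    counts = Counter(scale_scores)
    n = len(scale_scores)
    return [(score, count, round(100 * count / n, 1))
            for score, count in sorted(counts.items())]

rows = frequency_report([340, 342, 340, 345])
for score, count, percent in rows:
    print(score, count, percent)
```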

Full Academic Year (FAY): A policy stating that students who count in some ABCs and AYP calculations must accrue at least 140 days in membership at their current school.

 


Gain Score: Difference between a posttest score and a pretest score.

Goal Summary Report: A summary of EOG and EOC performance on each test by NCSCOS curriculum goal or category. Lists percent of items on a test by goal or category and a comparison of school to state percentage correct. School summaries are more reliable and helpful than teacher summaries.

Grade Norms: The average test score obtained by students classified at a given grade placement. (See Age Norms and Norms.)

Grading: The process of evaluating students, ranking them, and distributing each student's value across a scale. Typically, grading is done at the course level.

Growth Target: The EOG or EOC scale score a student must achieve in order to demonstrate "a year's worth of growth in a year's worth of time" in the state's ABCs growth model.

Guessing Parameter: The probability that a student with very low ability on the trait being measured will answer a test item correctly. There is always some chance of guessing the answer to a multiple-choice item, and this probability can vary among items. The guessing parameter enables a model to measure and account for these factors.

 


High Growth Ratio: The ABCs High Growth Ratio is computed by dividing the number of students who met their individual growth target by the number of students who did not. A value of 1.5 or higher indicates High Growth was met for that test.
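The ratio is simple to compute, as the sketch below shows. The function name is made up, and returning infinity when every student met the target is an assumption added to avoid dividing by zero.

```python
def high_growth_ratio(n_met, n_not_met):
    """Students who met their growth target divided by students who did not."""
    if n_not_met == 0:
        return float("inf")   # assumption: every student met the target
    return n_met / n_not_met

ratio = high_growth_ratio(60, 40)
met_high_growth = ratio >= 1.5
print(ratio, met_high_growth)
```

A ratio of 1.5 corresponds to 60 of every 100 students meeting their targets, which matches the 60% figure in the ABCs High Growth entry.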

High-stakes test: A test used to provide results that have important, direct consequences for examinees, programs, or institutions involved in the testing.

Holistic Scoring: A scoring procedure yielding a single score based on overall student performance rather than on an accumulation of points. Holistic scoring uses rubrics to evaluate student performance. (See Analytic Scoring)

Hypothesis: A specific statement of prediction.

 


IDEA (Individuals with Disabilities Education Act): A federal law designed to ensure that all children with disabilities have available to them a free and appropriate public education that emphasizes special education and related services designed to meet their unique needs and prepare them for further education, employment and independent living.

IEP (Individualized Education Plan): A written statement for a student with a disability that is developed, at least annually, by the parent and a team of professionals knowledgeable about the student. The IEP is required by federal law for all exceptional children and must include specific information about how the student will be served and what goals he or she should be meeting.

IPT: The IPT is the state-identified English language proficiency test. Federal and state policies require that all students identified as limited English proficient (LEP) be annually administered the state-identified English language proficiency test in grades K–12. (See LEP)

Indicators: Measures used to track performance over time. These can often be classified as input indicators (provide information about the capacity of the system and its programs); process indicators (track participation in programs to see whether different educational approaches produce different results); output indicators (short-term measures of results); or outcome indicators (long-term measures of outcomes and impacts).

Insufficient Data: This term is found on AYP results from state reports, particularly if there were not enough students to identify that student group as a subgroup for accountability purposes.

Inter-rater reliability: The consistency with which two or more judges rate the work or performance of test takers.

Item: An individual question or exercise in a test or evaluative instrument.

Item Analysis: The process of examining students’ responses to test items. The difficulty and discrimination indices are frequently used in this process. (See Difficulty Index and Discrimination Index.)

Item Difficulty: (See Difficulty Index)

Item Discrimination: (See Discrimination Index)

Item Response Theory (IRT): A method for scaling individual items for difficulty in such a way that an item has a known probability of being correctly completed by a respondent of a given ability level. The North Carolina Department of Public Instruction (NCDPI) uses the 3-parameter model, which provides slope, threshold, and asymptote.
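The 3-parameter logistic model can be sketched as follows, using the glossary's own names for the parameters: slope (discrimination), threshold (location), and asymptote (guessing). This is the standard textbook form, not NCDPI's implementation; some formulations also multiply the slope by a scaling constant of about 1.7, which is omitted here.

```python
from math import exp

def three_pl(theta, slope, threshold, asymptote):
    """Probability that an examinee of ability theta answers the item correctly."""
    return asymptote + (1 - asymptote) / (1 + exp(-slope * (theta - threshold)))

# Made-up item: slope 1.2, threshold 0.0, asymptote 0.25 (four-choice item).
at_threshold = three_pl(0.0, 1.2, 0.0, 0.25)   # halfway between asymptote and 1
very_low = three_pl(-10.0, 1.2, 0.0, 0.25)     # approaches the asymptote
very_high = three_pl(10.0, 1.2, 0.0, 0.25)     # approaches 1
print(at_threshold, round(very_low, 4), round(very_high, 4))
```

For an examinee of very low ability the probability approaches the asymptote rather than zero, which is how the model accounts for guessing.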

Item Tryout: A collection of a limited number of items of a new type, a new format, or a new curriculum. Only a few forms are assembled to determine the performance of new items and not all objectives are tested.

 


K-5 Assessments: A collection of assessments used by teachers to assess students' skills in literacy (reading and writing) and mathematics throughout the year. Results of these assessments are recorded on profile cards or electronically. Each spring, students' status is collected centrally and summarized for WCPSS and each school. (See Data Capture)

Kindergarten Initial Assessment (KIA): Assessments given to kindergarten students during the first weeks of school to determine strengths and needs in literacy, mathematics, physical, and personal/social skills. The report indicates how frequently each question was answered so that a teacher can see specifically where a class is struggling.

 


Learning Outcomes: Learning outcomes describe the learning mastered in behavioral terms at specific levels. In other words, what the learner will be able to do.

Learning Standards: Learning standards define in a general sense the skills and abilities to be mastered by students in each strand at clearly articulated levels of proficiency.

Level of Significance: The Type I error rate or the probability that a null hypothesis will be rejected when it is actually true. (See Significance Level)

Likert Scale: A method of scaling in which items are assigned interval-level scale values and the responses are gathered using an interval level response format.

Limited English Proficiency (LEP): The identification given to students who score below Superior in at least one domain on the state-mandated English language proficiency test, which is currently the IPT. (See IPT)

Location Parameter: A statistic from item response theory that pinpoints the ability level at which an item discriminates, or measures best.

Longitudinal: A study that takes place over time.

 


Mainframe: Data warehouse for demographic student information and reports which include academically gifted (AG), limited English proficient (LEP), free or reduced-price lunch (FRL), and students with disabilities (SWD) groups.

Mantel-Haenszel: A statistical procedure that examines the differential item functioning (DIF) or the relationship between a score on an item and the different groups answering the item (e.g., gender, race). This procedure is used to identify individual items for further bias review.

Measurement: Process of quantifying any human attribute pertinent to education without necessarily making judgments or interpretations as to its meaning.

Mean: The arithmetic average of a set of scores. It is found by adding all the scores in the distribution and dividing by the total number of scores.

Median (Md): The middle score in a distribution or set of ranked scores; the point (score) that divides a group into two equal parts; the 50th percentile. Half the scores are below the median, and half are above it.

Meta-analysis: A procedure that allows for the examination of trends and patterns that may exist across many different studies.

Mode: The score or value that occurs most frequently in a distribution or set of scores.
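
The mean, median, and mode defined in this section can be compared on one small set of scores. The scores below are invented for illustration, and Python's standard statistics module is used here only as a convenient calculator.

```python
import statistics

scores = [70, 75, 75, 80, 90]  # hypothetical test scores

mean_score = statistics.mean(scores)      # sum of the scores divided by their count
median_score = statistics.median(scores)  # middle score of the ranked list
mode_score = statistics.mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)
```

For these scores the mean is 78 while the median and mode are both 75, showing that the three measures need not agree.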

Multiple Measures: Assessments that measure student performance in a variety of ways. Multiple measures may include standardized tests, teacher observations, classroom performance assessments, and portfolios.

Multiple-Choice Item: A question, problem, or statement (called a "stem") which appears on a test, followed by two or more answer choices, called alternatives or response choices. The incorrect choices, called distracters, usually reflect common errors. The examinee's task is to choose, from among the alternatives provided, the best answer to the question posed in the stem. These are also called "selected-response items." (See Selected-Response Item)

Multiple Regression: A statistical technique where several variables are used to predict another.

Educational Multirisk: A student within at least 2 of these groups: free or reduced-price lunch (FRL), limited English proficient (LEP), and students with disabilities (SWD).

 



N: The symbol commonly used to represent the total number of cases in a group.

n: The symbol that represents the sample size, or the number of cases in a statistical sample.

NAEP (National Assessment of Educational Progress): The NAEP, also known as the “Nation’s Report Card”, assesses the educational achievement of elementary and secondary students in various subject areas. It provides data for comparing the performance of students in North Carolina to that of their peers in the nation.

NCEXTEND1: The North Carolina EXTEND1 is an alternate assessment designed to measure the performance of students with significant cognitive disabilities using alternate achievement standards.

NCEXTEND2: The North Carolina EXTEND2 is an alternate assessment designed to measure grade-level competencies of students with disabilities using modified achievement standards in a simplified multiple choice format.

NC School Report Card: A snapshot of status information about individual schools produced annually by North Carolina’s Department of Public Instruction (DPI).

NC WISE (North Carolina Window of Information on Student Education): This web-based system provides student and school information management capabilities and captures essential data about students throughout their careers in North Carolina public schools.

No Child Left Behind (NCLB): The NCLB Act of 2001 focuses on greater accountability at the school level for student achievement and staff quality, sanctions for not performing, and parental involvement. NCLB requires all students to be proficient in reading and mathematics by 2013-2014 and teachers to be "highly qualified" by 2006. NCLB is sometimes referred to as “Nickelby”, related to the acronym.

Normal Curve Equivalents (NCEs): Normalized standard scores with a mean of 50 and a standard deviation of 21.06. (See Standard Score.) The standard deviation of 21.06 was chosen so that NCEs of 1 and 99 are equivalent to percentiles of 1 and 99. There are approximately 11 NCEs to each stanine. (See Stanine.)
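
The relationship between percentiles and NCEs can be sketched as follows. This is a minimal illustration that assumes scores are normally distributed; the function name is hypothetical, and the conversion simply rescales the normalized z-score to a mean of 50 and a standard deviation of 21.06.

```python
from statistics import NormalDist

def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile_rank / 100.0)  # normalized z-score
    return 50.0 + 21.06 * z

for pr in (1, 50, 99):
    print(pr, round(percentile_to_nce(pr), 1))
```

Under this assumption, percentile ranks of 1, 50, and 99 correspond to NCEs of approximately 1, 50, and 99, while percentile ranks in between do not match their NCEs exactly.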

Normal Distribution: A distribution of scores or other measures that, in graphic form, have a distinctive bell-shaped appearance. In a normal distribution, the measures are distributed symmetrically around the mean. Cases are concentrated near the mean and decrease in frequency, according to a precise mathematical equation, the farther one departs from the mean. The assumption that many mental and psychological characteristics are distributed normally underlies the utility of this mathematical distribution.

The figure below is a normal distribution. It shows the percentage of cases between different scores as expressed in standard deviation units. For example, about 34% of the scores fall between the mean and one standard deviation above (or below) the mean.
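
The percentages quoted for the normal distribution can also be checked numerically. The sketch below uses the standard normal distribution (mean 0, standard deviation 1) from Python's statistics module; the variable names are chosen for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of cases between the mean and one standard deviation above it
mean_to_one_sd = standard_normal.cdf(1.0) - standard_normal.cdf(0.0)

# Share of cases within one standard deviation on either side of the mean
within_one_sd = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

print(round(mean_to_one_sd * 100, 1), round(within_one_sd * 100, 1))
```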

Norming: The process of constructing norms. Typically, norming studies are conducted to construct conversion tables so that an individual’s raw score can be compared to the scores of other individuals in a relevant reference group, the norm group.

Norm-referenced assessment: An assessment where student performance or performances are compared to that or those of a larger group. Usually the larger group or "norm group" is a sample representing a wide and diverse cross-section of students. Students, schools, districts, and even states are compared or rank-ordered in relation to the norm group. The purpose of a norm-referenced assessment is usually to sort students and not to measure achievement towards some criterion of performance.

Norms: The distribution of test scores of some specified or reference group called the norm group. For example, this may be a national sample of all fourth graders, a national sample of all fourth-grade male students, or perhaps all fourth graders in some local district. Usually norms are determined by testing a representative group and then calculating the group’s test performance.

Norms vs. Standards: Norms are not standards. Norms are indicators of what students of similar characteristics did when confronted with the same test items as those taken by students in the norms group. Standards, on the other hand, are arbitrary judgments of what students should be able to do, given a set of test items.

Null Hypothesis: A statement of equality between a set of variables.

 


Objective: A desired educational outcome such as "constructing meaning" or "adding whole numbers." Usually several different objectives are measured in one subtest.

Open-Ended Questions: Questions that provide a broad opportunity for the participant to respond.

Operational Test: A test that is administered statewide with uniform procedures, full reporting of scores, and stakes for examinees and schools.

Other Academic Indicator (OAI): This indicator is used to help determine Adequate Yearly Progress (AYP) Status for public schools and includes attendance (grades K-8) and graduation rates for grades 9-12. (See Adequate Yearly Progress).

Out-of-Level Testing: The activity of administering a test level that is different from the one designated for a student of a particular age or in a particular grade. For example, a fourth grader might be given a test level designated for use in Grade 2. Out-of-level testing is used so that students can be tested on the content appropriate to their current level of functioning; that is, above or below their grade placement or age.

Outliers: Those scores in a distribution that are noticeably more extreme than the majority of scores. Exactly which score might be an outlier is usually an arbitrary decision made by the researcher.

 


Parallel Forms: Test forms that cover the same curricular material as other forms. (See Equivalent Forms)

p-Value: The proportion of people in a group who answer a test item correctly; usually referred to as the difficulty index. (See Difficulty Index.)
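
A minimal sketch of the calculation, using invented responses to a single item (1 for a correct answer, 0 for an incorrect one):

```python
# Hypothetical responses of ten examinees to one item: 1 = correct, 0 = incorrect
responses = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

# p-value (difficulty index): proportion of the group answering correctly
p_value = sum(responses) / len(responses)

print(p_value)
```

Here seven of the ten examinees answered correctly, so the p-value is 0.7; higher values indicate easier items.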

Percent Proficient: The percentage of students who scored at or above grade level for each test.

Percent Tested: Percent of eligible students tested for each test.

Percentile: A point on a distribution below which a certain percentage of the scores fall. For example, if 70% of the scores fall below a score of 56, then the score of 56 is at the 70th percentile.

Percentile Rank: The percentage of scores falling below a certain point on a score distribution. (Percentile and percentile rank are sometimes used interchangeably.)
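
The percentile example above can be reproduced with a short calculation. The score list is invented so that 70% of the scores fall below 56, and the function name is hypothetical.

```python
def percentile_rank(scores, point):
    """Percentage of scores that fall below the given point."""
    below = sum(1 for s in scores if s < point)
    return 100.0 * below / len(scores)

scores = [40, 45, 48, 50, 52, 53, 55, 56, 60, 65]  # hypothetical scores
print(percentile_rank(scores, 56))
```

Seven of the ten scores fall below 56, so a score of 56 is at the 70th percentile for this group.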

Performance assessment (alternative assessment, authentic assessment, and participatory assessment): Performance assessment is a form of testing that requires students to perform a task rather than select an answer from a ready-made list. Performance assessment is an activity that requires students to construct a response, create a product, or perform a demonstration. Usually there are multiple ways that an examinee can approach a performance assessment and more than one correct answer.

ABCs Performance Composite: The Performance Composite (PC) reflects the percentage of test scores in the school at or above grade level (state-defined achievement levels III and IV).

Performance Standards: Performance standards are defined in the following two ways:

  1. A statement or description of a set of operational tasks exemplifying a level of performance associated with a more general content standard; the statement may be used to guide judgments about the location of a cut score on a score scale; the term often implies a desired level of performance.
  2. Explicit definitions of what students must do to demonstrate proficiency at a specific level on the content standards.

Performance Task: A carefully planned activity that requires learners to address all the components of a performance standard in a way that is meaningful and authentic. Performance tasks can be used for both instructional and assessment purposes.

Pilot Test: A test that is administered as if it were “the real thing” but has limited associated reporting or stakes for examinees or schools.

Portfolio Assessment: A portfolio is a collection of work, usually drawn from students' classroom work. A portfolio becomes a portfolio assessment when (1) the assessment purpose is defined; (2) criteria or methods are made clear for determining what is put into the portfolio, by whom, and when; and (3) criteria for assessing either the collection or individual pieces of work are identified and used to make judgments about performance. Portfolios can be designed to assess student progress, effort, and/or achievement, and encourage students to reflect on their learning.

Predictive Validity: The ability of a score on one test to forecast a student's probable performance on another test of similar skills. Predictive validity is determined by mathematically relating scores on the two different tests.

Profile: A graphic presentation of several scores expressed in comparable units of measurement for an individual or a group. This method of presentation permits easy identification of relative strengths or weaknesses across different tests or subtests.

PSAT/NMSQT: The PSAT/NMSQT stands for Preliminary SAT/National Merit Scholarship Qualifying Test. It is a standardized test that provides firsthand practice for the SAT Reasoning Test™. It also gives students a chance to enter National Merit Scholarship Corporation (NMSC) scholarship programs. The PSAT/NMSQT measures critical reading skills, mathematics problem-solving skills and writing skills.

 


Qualitative Measurement: Collecting information that is not numeric in nature. Qualitative data typically consist of text while quantitative data consist of numbers. Some sources of qualitative data may include written documents (e.g., student assignments), interviews (e.g., focus groups), case studies (e.g., portfolios) and open-ended survey questions and/or questionnaires.

Quality Tools: A set of strategies that help groups use the “Plan, Do, Study, Act” cycle effectively for continuous improvement.

Quantitative Measurement: Collecting information that is numeric in nature. Quantitative data are those in which the values of a variable differ in amount (in numeric terms) rather than in kind (in descriptive terms). These data can be analyzed using quantitative methods and possibly generalized to a larger population.

Quartile: One of three points that divides the scores in a distribution into four groups of equal size. The first quartile or 25th percentile, separates the lowest fourth of the group; the middle quartile, the 50th percentile or median, divides the second fourth of the cases from the third; and the third quartile, the 75th percentile, separates the top quarter.
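
The three quartile points can be computed from a set of scores as sketched below. The scores are invented, and statistical packages use slightly different interpolation rules, so results from other software may differ a little for the same data.

```python
import statistics

# Hypothetical scores for eleven students
scores = [52, 55, 58, 60, 61, 63, 66, 70, 72, 75, 80]

# n=4 returns the three cut points that split the scores into four groups
quartiles = statistics.quantiles(scores, n=4)
first_quartile, middle_quartile, third_quartile = quartiles

print(quartiles)
```

The middle quartile returned here equals the median of the scores.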

Quasi-equated: Item statistics are available for items that have been through item tryouts (although they could change after revisions); and field test forms are developed using this information to maintain similar difficulty levels to the extent possible.

 


Rater: A person who evaluates or judges student performance on an assessment against specific criteria.

Rater Training: The process of educating raters to evaluate student work and produce dependable scores. Typically, this process uses anchors to acquaint raters with criteria and scoring rubrics. Open discussions between raters and the trainer help to clarify scoring criteria and performance standards, and provide opportunities for raters to practice applying the rubric to student work. Rater training often includes an assessment of rater reliability that raters must pass in order to score actual student work.

Rating Scales: Subjective assessments made on predetermined criteria in the form of a scale. Rating scales include numerical scales or descriptive scales.

Raw Data: Data that are unorganized and unaggregated.

Raw Score: A person’s observed score on a test (e.g., the number of items answered correctly). While raw scores do have some usefulness, they should not be used to make comparisons between performances on different tests unless other information about the characteristics of the tests is known. For example, if a student answered 24 items correctly on a reading test and 40 items correctly on a mathematics test, we should not assume that he or she did better on the mathematics test than on the reading test. Perhaps the reading test consisted of 35 items and the mathematics test consisted of 80 items. Given this additional information, we might conclude that the student did better on the reading test (24/35 as compared with 40/80). How well did the student do in relation to other students who took the reading test? We cannot address this question until we know how well the class as a whole did on the reading test. Twenty-four items answered correctly may seem impressive, but if the average (mean) score attained by the class was 33, the student’s score of 24 takes on a different meaning.
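
The comparison in this example can be worked through by converting each raw score to a proportion of items answered correctly. The helper name is hypothetical; the numbers are those given in the entry.

```python
def proportion_correct(items_correct, items_total):
    """Raw score expressed as a proportion of the items on the test."""
    return items_correct / items_total

reading = proportion_correct(24, 35)      # reading test: 24 of 35 items
mathematics = proportion_correct(40, 80)  # mathematics test: 40 of 80 items

print(round(reading * 100, 1), round(mathematics * 100, 1))
```

The lower raw score (24) corresponds to the higher proportion correct, about 68.6% compared with 50%.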

Readiness Test: A measure of the extent to which an individual has achieved the degree of maturity, or has acquired certain skills or information, needed to undertake some new learning activity successfully. For example, a reading readiness test indicates whether a child has reached a developmental stage at which he or she may profitably begin formal reading instruction.

Regression to the Mean: Tendency of a posttest score (or a predicted score) to be closer to the mean of its distribution than the pretest score is to the mean of its distribution. Because of the effects of regression to the mean, students obtaining extremely high or extremely low scores on a pretest tend to obtain less extreme scores on a second assessment of the same test (or on some similar measure).

Reliability: The degree to which the results of an assessment are dependable and consistently measure particular student knowledge and/or skills. Reliability is an indication of the consistency of scores across raters, over time, or across different tasks or items that measure the same thing. Thus, reliability may be expressed as (a) the relationship between test items intended to measure the same skill or knowledge (item reliability), (b) the relationship between two administrations of the same test to the same student or students (test/retest reliability), or (c) the degree of agreement between two or more raters (rater reliability). An unreliable assessment cannot be valid.

Reliability Coefficients: Statistical estimates of the reliability of a test or measurement.

Reliability of Difference Scores: (See Difference Score Reliability)

Rubrics: Specific sets of criteria that clearly define for both student and teacher what a range of acceptable and unacceptable performance looks like. Criteria define descriptors of ability at each level of performance and assign values to each level.

 


Safe Harbor (SH): Safe harbor is the first provisional status calculation applied (for AYP) if a student group meets the 95 percent participation rate but does not meet the proficiency target. That student group can meet its proficiency target with a safe harbor provision if: a) the student group has reduced the percent of students not proficient by at least 10 percent from the previous year for that subject area; and b) the group shows progress on the Other Academic Indicator. (See AYP)
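
The two safe harbor conditions can be sketched as a simple check. This is an illustration only: the function and parameter names are hypothetical, and the sketch assumes that "at least 10 percent" means a relative reduction from the previous year's percentage, which the definition above does not state explicitly.

```python
def meets_safe_harbor(prev_pct_not_proficient, curr_pct_not_proficient,
                      participation_pct, oai_progress):
    """Return True if a student group satisfies the safe harbor conditions."""
    met_participation = participation_pct >= 95.0
    # Assumption: the 10 percent reduction is relative to last year's percentage
    target_pct = prev_pct_not_proficient * 0.90
    reduced_enough = curr_pct_not_proficient <= target_pct
    return met_participation and reduced_enough and oai_progress

# Hypothetical group: 40% not proficient last year, 35% this year
print(meets_safe_harbor(40.0, 35.0, participation_pct=97.0, oai_progress=True))
```

Under this reading, a group with 40% of students not proficient last year would need to reach 36% or lower this year.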

Sample: A subset of a population.

SAT: The SAT tests students' knowledge of subjects that are necessary for college success: reading, writing, and mathematics. The SAT assesses the critical thinking skills students need for academic success in college—skills that students typically learn in high school.

Scales of Measurement: Different ways of categorizing measurement outcomes. There are four types:

  1. Nominal – A scale in which each outcome fits into one and only one class or category, and the categories are mutually exclusive (e.g., gender, ethnicity, political affiliation).
  2. Ordinal – A scale that orders the things being measured (e.g., students can be ordered according to their class rank).
  3. Interval – A scale of measurement that is characterized by equal distances between points on some underlying continuum. An interval scale is marked off in units of equal size such that the distance between units is the same at each point along the scale.
  4. Ratio – A scale of measurement that is characterized by an absolute zero.

Scale score: A Scale Score is a score derived from an assessment which places the result of that assessment on a predetermined “number line” which allows it to be compared to other results from that same assessment (either comparing one test taker who takes the assessment multiple times, or comparing one taker’s result to that of another).

Scatterplot: A plot of paired data points which indicates the relationship between two variables.

School Assistance Module (SAM): A centralized database of student information that assists schools in management tasks. It is populated by NC WISE data and has multiple components within the module.

Scattergram: See Scatterplot

Scholastic Aptitude: The combination of native and acquired abilities that are needed for school learning; the likelihood of success in mastering academic work as estimated from measures of the necessary abilities.

School-Wide Information System (SWIS): A Web-based information system designed to help school personnel to use office referral data to design school-wide and individual student interventions.

Screening: A fast and efficient measurement to identify individuals from a large population who may deviate in a specified area, such as the incidence of maladjustment or readiness for academic work.

Selected-Response Item: A question or incomplete statement that is followed by answer choices, one of which is the correct or best answer. It is also referred to as a "multiple-choice" item. (See Multiple-Choice Item)

Significance Level: The likelihood that a statistical test will identify a relationship between variables when, in reality, that relationship does not exist. (See Level of Significance)

Slope: The ability of a test item to distinguish between examinees of high and low ability.


 

SMART Goals: Improvement goals that are strategic, measurable, attainable, results-oriented, and time-bound.

Standard Deviation (SD): A measure of the variability or dispersion of a distribution of scores. The more the scores cluster around the mean, the smaller the standard deviation. In a normal distribution of scores, about 68.3% of the scores are within the range of one SD below the mean to one SD above the mean. Computation of the SD is based upon the square of the deviation of each score from the mean. (See Normal Distribution)
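
The computation described above can be followed step by step on a small, invented set of scores. This sketch divides by the number of scores (the population formula); the sample formula divides by one less than the number of scores and gives a slightly larger value.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

mean = sum(scores) / len(scores)
squared_deviations = [(s - mean) ** 2 for s in scores]  # squared distance from the mean
variance = sum(squared_deviations) / len(scores)        # average squared deviation
sd = math.sqrt(variance)                                # standard deviation

print(mean, sd)
```

For these scores the mean is 5 and the standard deviation is 2.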

Standard Error of Measurement (SEM): The amount an observed score is expected to fluctuate around the true score. For example, the obtained score will not differ by more than plus or minus one standard error from the true score about 68% of the time. About 95% of the time, the obtained score will differ by less than plus or minus two standard errors from the true score. It is a measure of the uncertainty inherent in measuring something with a single assessment. The SEM is frequently used to obtain an idea of the consistency of a person’s score or to set a band around a score. Suppose a person scores 110 on a test where the SEM is 6. We would thus say we are 68% confident the person’s true score was between (110–1 SEM) and (110+1 SEM), or between 104 and 116.
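
The score band in this example can be computed directly. The function name is hypothetical; the values are the observed score of 110 and SEM of 6 from the entry.

```python
def score_band(observed_score, sem, n_sems=1):
    """Return the (low, high) band of n_sems standard errors around an observed score."""
    margin = n_sems * sem
    return observed_score - margin, observed_score + margin

print(score_band(110, 6))     # band of one SEM, about 68% confidence
print(score_band(110, 6, 2))  # band of two SEMs, about 95% confidence
```

One SEM gives the band 104 to 116 described above; two SEMs widen it to 98 to 122.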

Standard Score: A general term referring to scores that have been "transformed" for reasons of convenience, comparability, ease of interpretation, etc. The basic type of standard score, known as a z-Score, is an expression of the deviation of a score from the mean score of the group in relation to the standard deviation of the scores of the group. Most other standard scores are linear transformations of z-Scores, with different means and standard deviations. (See z-Score)
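
As a small illustration of these transformations, the sketch below converts a raw score to a z-Score and then to a T-Score, one of the linear transformations defined elsewhere in this glossary (mean of 50, standard deviation of 10). The function names and the group mean and standard deviation are assumptions chosen for the example.

```python
def z_score(raw_score, group_mean, group_sd):
    """Deviation of a score from the group mean, in standard deviation units."""
    return (raw_score - group_mean) / group_sd

def t_score(z):
    """Linear transformation of a z-Score to a scale with mean 50 and SD 10."""
    return 50 + 10 * z

# Hypothetical group: mean 80, standard deviation 10; a student scores 90
z = z_score(90, 80, 10)
print(z, t_score(z))
```

A raw score one standard deviation above the mean has a z-Score of 1 and a T-Score of 60.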

Standardization: A consistent set of procedures for designing, administering, and scoring an assessment. The purpose of standardization is to ensure that all students are assessed under the same conditions so that their scores have the same meaning and are not influenced by differing conditions. Standardized procedures are very important when scores will be used to compare individuals or groups.

Standardization (or Norming) Sample: That part of the population that is used in the norming of a test, i.e., the reference population. The sample should represent the population in essential characteristics, some of which may be geographical location, age, or grade for K-12 students.

Standardized Testing: A test designed to be given under specified, standard conditions to obtain a sample of learner behavior that can be used to make inferences about the learner's ability. Standardized testing allows results to be compared statistically to a standard such as a norm or criterion. If the test is not administered according to the standard conditions, the results are generally considered invalid.

Standards: The broadest of a family of terms referring to statements of expectations for student learning, including content standards, performance standards, and benchmarks. (See also Norms vs. Standards)

Stanine: A unit of a standard score scale that divides the norm population into nine groups with the mean at stanine 5. The word stanine draws its name from the fact that it is a STAndard score on a scale of NINE units.

Comparison Table of Stanines and Percentiles

Stanine                      Approximate Percentiles    Percentage of Examinees
9 Highest Level              96-99                      4%
8 High Level                 90-95                      7%
7 Well above average         78-89                      12%
6 Slightly above average     60-77                      17%
5 Average                    41-59                      20%
4 Slightly below average     23-40                      17%
3 Well below average         11-22                      12%
2 Low Level                  5-10                       7%
1 Lowest Level               1-4                        4%
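
The comparison table can be turned into a simple lookup. The sketch below is an illustration that uses the approximate percentile ranges from the table; the function and constant names are hypothetical.

```python
from bisect import bisect_right

# Lowest percentile of stanines 2 through 9, taken from the comparison table
STANINE_LOWER_BOUNDS = [5, 11, 23, 41, 60, 78, 90, 96]

def percentile_to_stanine(percentile):
    """Map a whole-number percentile (1-99) to its approximate stanine."""
    return 1 + bisect_right(STANINE_LOWER_BOUNDS, percentile)

for pct in (4, 50, 96):
    print(pct, percentile_to_stanine(pct))
```

A percentile of 50 falls in stanine 5, the middle of the scale, while percentiles of 4 and 96 fall in stanines 1 and 9.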


Statistical Significance: See Significance Level

Statistics: Analytic tools concerned with the collection, analysis, interpretation, explanation and/or presentation of data in an effort to help describe and make inferences about a sample and the population from which it was drawn, or about the relationship between variables.

Stem: The actual test question in a multiple-choice format is referred to as the stem.

Student Achievement: See Academic Achievement

Student Learning Outcome: A statement of what students will be able to think, know, do, or feel because of a given educational experience.

Student Residual Score: A student residual is a measure of how a student performed on a test compared to other WCPSS students like themselves. The residual score is the difference in scale score points between a student’s actual score and the score predicted for that student by a statistical method called multiple regression.

Students With Disabilities (SWD): A broadly defined group of students with physical and/or mental impairments such as blindness or learning disabilities that might make it more difficult for them to learn through the same methods or at the same rates as other students. Students identified as SWD should always have a current IEP. This student group is also one of the adequate yearly progress (AYP) subgroups.

Subgroup: A subset of a population who all share some characteristic.

Summary Goal Report: See Goal Summary Report

Summative Assessments: Assessments that determine how much a student knows at a given point in time (e.g., end of a quarter, semester, year, etc.).

Survey Research: The process of acquiring information about one or more groups of people by asking them questions and tabulating their answers.

 


Teacher Residual Average: A teacher residual average is a measure of how the teacher’s students performed on a test compared to other students like them in Wake County Public Schools. The teacher residual average is the average of all the students’ residuals for a particular test and roster of the teacher. (See Student Residual Score)
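
The arithmetic behind student residuals and the teacher residual average can be sketched as follows. The roster and all scores are invented; in practice the predicted scores would come from the multiple regression described under Student Residual Score, which is not reproduced here.

```python
def residual(actual_score, predicted_score):
    """Difference in scale score points between actual and predicted scores."""
    return actual_score - predicted_score

# Hypothetical roster: (actual scale score, predicted scale score) for each student
roster = [(352, 349), (345, 347), (360, 355), (350, 350)]

residuals = [residual(actual, predicted) for actual, predicted in roster]
teacher_residual_average = sum(residuals) / len(residuals)

print(residuals, teacher_residual_average)
```

A positive average, such as the 1.5 scale score points in this example, means the roster as a whole scored above its predicted level.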

Test Battery: See Battery

Test Blueprint: The testing plan, which includes numbers of items from each objective to appear on a test, and the arrangement of objectives.

Test Item: See Item

Test Objective: See Objective

Test of Statistical Significance: The application of a statistical procedure to determine whether observed differences exceed the critical value, indicating that chance is not a likely explanation for the results.

Threshold: The point on the ability scale where the probability of a correct response is fifty percent. Threshold for an item of average difficulty is 0.00.

Title I: Federally funded programs that provide intervention services for students achieving below grade level. School funding is based on the number of low-income children, generally those eligible for the free or reduced-price lunch program. Many of the major requirements in the NCLB federal law are outlined in Title I.

Title III: The section of the NCLB federal law that provides funding and addresses English language acquisition and standards and accountability requirements for limited English proficient students.

T-Score: A standard score with a mean of 50 and a standard deviation of 10. Thus a T-Score of 60 represents a score one standard deviation above the mean.

Triangulation: A process of combining evidence or data from multiple sources in order to provide support for a conclusion. The use of triangulation supports a central finding and overcomes the weakness associated with a single methodology.

True Score: The actual score that someone would obtain on a test if their ability were measured with 100% accuracy.

Type I Error: Same as the level of statistical significance – the level of risk you are willing to take that the null hypothesis is rejected when it is true.

Type II Error: The likelihood that a study fails to detect a relationship between two variables that actually exists; that is, the null hypothesis is not rejected when it is false.

 


Validity: The extent to which something measures what it purports to measure. Validity indicates the degree of accuracy of either predictions or inferences based upon a test score or other measure. The term validity has different connotations and, therefore, different kinds of validity evidence are appropriate for various measurement situations.

  1. Content validity: For achievement tests, content validity is the extent to which the content of the test represents a balanced and adequate sampling of the outcomes (domain) about which inferences are to be made.
  2. Criterion-related validity: The extent to which scores on the test are in agreement with other measurements taken in the present (concurrent) or future (predictive).
  3. Construct validity: The extent to which a test measures some relatively abstract psychological trait or construct; applicable in evaluating the validity of tests that have been constructed on the basis of an analysis of the trait and its manifestation.
  4. Face validity: An estimate of whether a test appears to measure a certain criterion; however, it does not guarantee that the test actually measures what it purports to measure. Face validity relates to whether a test appears to be a good measure or not. This judgment is made on the "face" of the test, thus it can also be judged by an amateur.

Variability or Variance: A measure of statistical dispersion of scores.

 


Weighting: The process of assigning different emphasis to different scores in making some final decision.

WINSCAN Program: Proprietary computer program that contains the test answer keys and files necessary to scan and score multiple-choice tests. Student scores and local reports can be generated immediately using the program.

z-Score: A type of standard score with a mean of zero and a standard deviation of one. It is a statistical measure of how far a data point (in standard deviations) is from the average. (See Standard Score)

 


 

References


CTB McGraw-Hill. (2008). Glossary of assessment terms. Retrieved on December 16, 2008, from http://www.ctb.com/index.jsp

ERIC Digest. (1989). A glossary of measurement terms. Retrieved on December 16, 2008, from eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/1f/e0/b3.pdf

ERIC Digest. (1994). Questions to ask when evaluating tests. Retrieved on December 16, 2008, from eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/14/1b/61.pdf

Joint Committee on Standards for Educational and Psychological Testing. (1999). Standards for Educational and Psychological Testing. Washington, DC: AERA/APA/NCME. Retrieved on December 16, 2008, from: http://www.ipacweb.org/conf/01/kaiser.pdf

Leedy, P. & Ormrod, J. (2005). Practical research. Upper Saddle River, NJ: Pearson Prentice Hall.

NC Department of Public Instruction. (2008). The ABCs of public education: 2007-08 Growth and Performance of NC Public Schools. Retrieved on December 16, 2008, from http://www.ncpublicschools.org/docs/accountability/reporting/ABC_results/0811completeabcspacket.pdf

Salkind, N. (2005). Statistics for people who think they hate statistics. Thousand Oaks, CA: Sage.

Salkind, N. (2006). Exploring research. Upper Saddle River, NJ: Pearson Prentice Hall.

Trochim, W. (2001). The research methods knowledge base. Cincinnati, OH: Atomic Dog Publishing.

UCLA Graduate School of Education & Information. (n.d.). CRESST Glossary. Retrieved December 16, 2008, from http://www.cse.ucla.edu/products/glossary.html

Wake County Public Schools – OCIPD (2007). Data glossary. Retrieved December 16, 2008, from http://www2.wcpss.net/departments/ocipd/sip/downloads/glossary.pdf (This is an Intranet link available only from a computer within a WCPSS building.)

 
