A different set of essay and performance task (PT) questions is used for each administration of the California Bar Examination. The questions asked on one administration may, as a group, be more difficult than those asked on another administration. Similarly, the graders who grade the answers on one administration may, as a group, be more lenient than the graders who grade the answers on another administration. This potential variation in question difficulty and grader leniency between administrations could unduly affect an applicant’s chances of passing.
Scaling is a statistical procedure that adjusts the scores assigned by the graders. This adjustment ensures that an applicant’s likelihood of passing is not affected by variation in the difficulty of the written section (which includes both the essay and PT questions) across administrations. Almost all United States jurisdictions scale their written scores.
Scaling involves converting the sum of all scores assigned by the Graders to the same units of measurement as those used for the Multistate Bar Examination (MBE). This is analogous to converting a temperature measured in degrees Fahrenheit to degrees Celsius. On the bar examination, this conversion (or "scaling") essentially involves assigning the highest total written score earned by any applicant in a state the same value as the highest MBE score earned by any applicant in that state. The second highest written score is assigned the same value as the second highest MBE score, and so on. This process is done separately for each administration of the examination. An applicant’s written score is not affected by that applicant’s own MBE score (or by the MBE or written scores of applicants in other states).
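To make this rank-matching idea concrete, the short Python sketch below pairs a hypothetical set of written totals with a hypothetical set of MBE scores, highest with highest, second highest with second highest, and so on. The score values are invented for illustration only; the actual conversion uses the formula described next.

```python
# Illustration only: the rank-matching idea behind scaling.
# Hypothetical score lists; the actual conversion uses the formula given below.
written_totals = [55, 62, 71, 48, 66]              # Grader-assigned written totals
mbe_scores = [131.2, 144.8, 158.3, 120.5, 150.1]   # equated MBE scores

# Pair the highest written total with the highest MBE score,
# the second highest with the second highest, and so on.
rank_map = dict(zip(sorted(written_totals, reverse=True),
                    sorted(mbe_scores, reverse=True)))

for total in sorted(written_totals, reverse=True):
    print(f"written total {total} -> scaled value {rank_map[total]}")
```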
To improve the accuracy of the scaling process, a formula is used to make the conversion from the Grader-assigned written scores to scale scores. This formula is shown below, where: A = the sum of the applicant’s Grader-assigned scores across all the essay and PT questions on the test, B = the mean of these scores across all applicants, C = the standard deviation (i.e., spread) of these scores for all applicants, D = the standard deviation of all applicants’ MBE scores, and E = the mean of their MBE scores. A simpler, but algebraically equivalent, version of this formula is used to report results.
Written Scale Score = [((A − B) / C) × D] + E
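The minimal Python sketch below applies this formula to hypothetical numbers. The function name, the sample scores, and the use of the population standard deviation are assumptions made for illustration only.

```python
import statistics

def written_scale_score(applicant_total, written_totals, mbe_scores):
    """Convert a Grader-assigned written total (A) to the MBE scale.

    written_totals -- every applicant's written total (used for B and C)
    mbe_scores     -- every applicant's equated MBE score (used for D and E)
    """
    B = statistics.mean(written_totals)    # mean written total
    C = statistics.pstdev(written_totals)  # spread of written totals
    D = statistics.pstdev(mbe_scores)      # spread of MBE scores
    E = statistics.mean(mbe_scores)        # mean MBE score
    return ((applicant_total - B) / C) * D + E

# Hypothetical example: five applicants' written totals and MBE scores.
written_totals = [380, 400, 420, 440, 460]
mbe_scores = [125.0, 132.0, 140.0, 148.0, 155.0]
print(written_scale_score(420, written_totals, mbe_scores))  # -> 140.0
```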
The MBE has 200 multiple-choice questions. A different set of MBE questions is used for each administration of the examination. An applicant’s MBE “raw” score is the number of questions answered correctly. Because the questions asked on one administration may, as a group, be more difficult than those asked on another administration, a raw score earned on one administration may not signify the same degree of proficiency as the same score earned on another administration. This problem is addressed by a process called “equating”; equated scores are often labeled as MBE “scale” scores in reports of the results.
Equating adjusts for differences in the difficulty of the MBE’s questions across administrations. It does this by calibrating the scores on each new version of the examination so that a given equated MBE score indicates the same degree of proficiency regardless of the examination on which it was earned. The equating process involves inserting into each new version of the MBE a set of questions that have been used before and whose difficulty is known. Equating adjusts the raw scores on the basis of whether the applicants taking the current version of the examination earn higher or lower scores on these repeated questions than did the applicants who answered the same questions on a previous administration. As a result of this adjustment, which is based on essentially all MBE takers nationally, an equated MBE score earned on one administration corresponds to the same level of proficiency as the same scale score earned on another administration. The National Conference of Bar Examiners (NCBE), the organization that owns and produces the MBE, uses a method of equating known as Item Response Theory (IRT) equating. Equating is used on virtually all large-scale multiple-choice tests (such as the LSAT, College Board exams, etc.).
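The highly simplified Python sketch below illustrates how the repeated ("anchor") questions allow raw scores to be adjusted. It uses a toy mean-shift adjustment rather than the IRT method the NCBE actually uses, and all scores shown are hypothetical.

```python
from statistics import mean

# Hypothetical data for two administrations. Each applicant has a raw total
# score and a score on the anchor (repeated) questions.
prev_raw = [128, 135, 142, 150]
prev_anchor = [18, 19, 21, 22]

curr_raw = [120, 126, 133, 140]
curr_anchor = [18, 19, 20, 22]

# The anchor questions are identical across administrations, so a change in
# average anchor performance is read as a change in group ability.
ability_shift = mean(curr_anchor) - mean(prev_anchor)      # -0.25 here

# The remaining change in average raw score is attributed to the new
# questions being harder or easier, and that part is removed.
# (A real anchor-test design rescales ability to the total-score metric;
# this toy version treats anchor points and total points one-for-one.)
difficulty_shift = (mean(curr_raw) - mean(prev_raw)) - ability_shift

def equated_score(raw_score):
    """Adjust a current-administration raw score onto the previous form's scale."""
    return raw_score - difficulty_shift

print(equated_score(133))  # 133 + 8.75 = 141.75 on the previous form's scale
```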
Equating cannot be used for the bar examination’s written sections because essay and PT questions are published following each administration and therefore cannot appropriately be repeated. However, there is a strong underlying relationship between written and MBE scores. As a result of this relationship, an increase or decrease in average MBE scores between administrations signals a corresponding change in average applicant ability. Because of this relationship and the equating process, MBE “scale” scores provide the best way to monitor differences in average applicant ability across administrations of the examination. Scaling written scores to the MBE therefore results in a given written scale score indicating about the same level of proficiency regardless of the examination on which that written scale score was earned.