A poster on a prominent foreign language teachers’ list recently commented, in the context of a second poster’s question about research comparing methodologies:

…while it is difficult to do true scientific research where we isolate one variable, we can assess our students’ proficiency based on the ACTFL proficiency scale using instruments such as the OPIc, STAMP, or AAPPL. If we can gather data from as many programs as possible we can get a picture of the efficacy of foreign language education in the United States.  In programs where students consistently meet proficiency targets we can look at the methods used to produce those results.  We may even find that students are achieving roughly the same proficiency regardless of the method, that there are other variables at play.  I realize that those types of tests must be paid for but even a small sampling is better than none.

Yes, these sorts of measures would be useful. We are also acknowledging that it is impossible to design a head-to-head study comparing two methodologies, at least within the scope of a study that is doable and fundable in today’s world. It is simply impossible to control the variables that would need to be controlled for those numbers to mean anything.

We do, however, have to look at what is being measured, even in a large-scale study.

First off, even if we ignore the problem of having different classroom teachers (which is diluted considerably if we imagine a nationwide study), it is very difficult these days to find students who have gained their foreign language proficiency exclusively through CI-based instruction. Most students, of course, are still being taught “traditionally”, so there is no shortage of “traditional instruction only” students to fill that cell in the statistical table. The problem is that those who have had CI-based instruction are often in schools or programs that do not use CI exclusively. In some cases, CI is programmed sensibly into the beginning levels, where its rapid building of proficiency can do the most good, and more traditional methods are phased in later, when a higher degree of analysis is desired, perhaps in response to the demands of future college coursework in the language, or where the focus shifts to textual analysis.

In other schools, however, CI is done when it is possible, or even furtively, amid a department stiff with disapproval for someone who does not fit into the comfortable, controllable lockstep of chapter tests and worksheets. So clearly, saying a student has had 2 years of CI-based instruction and 2 years of “traditional” instruction is not meaningful in terms of determining best practices and maximum proficiency in minimum time, unless we can figure out what order of those two methods gives the best result, and for how long each one should optimally be applied.

The other impediment to this testing scenario is who is being counted. If we test only those who make it to Spanish IV, for example, then there is some pretty heavy self-selection going on. Given that most classes in the US are taught “traditionally” (again, for lack of a better word; what I mean is using methods that do not emphasize CI as the driver of acquisition), we would expect to see what is considered to be “normal” attrition from foreign language classes — the effect where, although a hundred students start out in Spanish I, there are only fifteen left in Spanish IV. For other languages, this effect can be even more marked.

If we are truly going to think about a large-scale study to find out which method works better, or which combination of methods is optimal (both of which are worthy questions, especially IMO the second), we need to base our numbers on the number of students who began to learn a second language. We are trying to find the way to get every child to acquire a second language. If we accept the bell curve and a high percentage of “washout”, we are automatically biasing our results away from whatever might have worked for those lost students, because they are simply no longer there to be counted by year 4 in most cases.

So, if there were ever to be a large-scale data collection effort, one that would overcome most of the objections in terms of differences in socioeconomics, individual teachers and so on, the numbers for the simplest measure (CI vs. traditional teaching) would have to be based on a larger denominator. The numbers that truly speak to reaching all students are these:

Success rate = (number of students reaching proficiency in Year 4) / (number of students STARTING OUT to acquire that language)

NOT (number of students reaching proficiency in Year 4) / (number of students tested in Year 4)

This is just common sense. The point is not which year you choose to do the measurement; the point is that if we are not looking at what happens to all the students who start out to acquire a language, we are missing a huge part of the picture.
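To see how much the choice of denominator can matter, here is a minimal sketch in Python. The cohort figures are purely hypothetical, invented for illustration and not drawn from any real program. The 100 starters and 15 Year 4 survivors echo the attrition example above, while the second program and both proficiency counts are assumptions. The function name `success_rates` is likewise just a label for this example.

```python
def success_rates(started: int, tested: int, proficient: int) -> tuple[float, float]:
    """Return (rate over all starters, rate over those tested in the final year)."""
    if started <= 0 or tested <= 0:
        raise ValueError("started and tested must both be positive")
    if not (0 <= proficient <= tested <= started):
        raise ValueError("expected 0 <= proficient <= tested <= started")
    return proficient / started, proficient / tested


# Hypothetical program A: heavy attrition, strong survivors.
a_all, a_tested = success_rates(started=100, tested=15, proficient=12)

# Hypothetical program B: most students stay, results are more mixed.
b_all, b_tested = success_rates(started=100, tested=60, proficient=36)

print(f"A: {a_tested:.0%} of those tested, {a_all:.0%} of those who started")
print(f"B: {b_tested:.0%} of those tested, {b_all:.0%} of those who started")
```

With these made-up numbers, program A looks stronger if we count only the students tested in Year 4 (12 of 15, or 80%, against 36 of 60, or 60%). Counted against everyone who started, program B reaches three times as many students (36% against 12%). The ranking flips depending on which denominator is used, which is exactly the bias described above.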