Tuesday, June 4, 2013

Problems with GPA and a Possible Solution

GPA should provide a total ordering of all the students being considered based holistically on their performance throughout their academic careers. That's how employers use it when deciding who to interview and that's how colleges use it to assign honors at graduation. In spite of this, however, there are some major problems with GPA that make it an extremely inadequate measure of relative student performance:
  1. Some courses are graded more leniently than others. This distorts student incentives, causing students to pad their schedules with easier courses despite the fact that taking more difficult courses provides more net benefit in the long run.
  2. Some departments grade differently. Arbitrary curve differences across concentrations often exist and cause students to unnecessarily shy away from one concentration over another in an attempt to achieve a higher GPA upon graduation and, thus, in their minds, better employment opportunities. This is undesirable because curve differences often do not reflect actual differences in difficulty, and thus shouldn't factor into a student's decision on what to major in. Comparative Literature, for example, is arguably just as difficult as Theoretical Physics, and yet lighter curves in the former can prevent students from majoring in the latter for no good reason.
  3. GPA discourages students from taking courses with intelligent students. Taking courses with smart students generally lowers one's chances of getting a good grade, since courses are usually graded on a curve. As such, a GPA system can cause students to shy away from upper-level courses because they are inadequately compensated for the increase in difficulty. This is undesirable because students can have much more intellectual conversations when they have more peers at their level.

Various hacks can be imposed upon the traditional GPA system to try and solve these problems. Princeton, for example, uses grade deflation (restricting the number of A's to 30% across all courses) to try and prevent problems (1) and (2) from occurring. But, obviously, by doing so, they exacerbate problem (3), while making the environment feel somewhat draconian, with students competing for A's that will be doled out based largely on noise in the easier courses.

Having thought about the problem of fixing GPA at some length, I reached the following conclusion: GPA does not factor in enough global information to make it a useful measure of relative student performance. That is, in order for a ranking system to be truly robust to all the problems listed above and more, it must be designed to heavily factor in deep inter-student relationships from the ground up.

Having reached this conclusion, I thought of the best global ranking algorithm I know, PageRank, which ranks websites, and thought about how it could potentially apply to the problem of ranking students. After all, all PageRank does is compute the limiting distribution for a strongly connected graph, so why not convert the problem of ranking students into a graph problem.  Below, I describe a scheme that I think would work very well as a ranking system for students. It requires the transcripts of all the students at a particular institution, a tough batch of information to gather, but we'll worry about practicality at the end.

How the algorithm works:

  • Construct a graph where each student corresponds to a vertex in the graph.
  • Go through every student's transcript and do the following:
    • If two students, A and B, took a course together and A scored higher than B, add a directed edge pointing from B to A. This effectively represents an endorsement of student A by student B that occurs as a result of A performing better than B.
    • If two students got the same grade, add two directed edges, one pointing from B to A and one pointing from A to B.
  • Use the graph to construct a (student x student) transition matrix, M, as follows:
    • For each pair of students, (i, j)
      • Let Mij = a/n + (1-a) (c/k) where a is some constant less than one (say .1), n is the total number of students, c is the number of edges leaving i and going into j, corresponding to courses in which student j outperformed student i, and k is the total number of edges leaving student i's node in the graph.
  • At this point, the matrix represents a fully-connected graph, so compute the limiting distribution, which is just the left eigenvector associated with eigenvalue one. The result is a vector of size equal to the number of students.
  • Each entry in the limiting distribution vector corresponds to a student's PageRank score. Since a higher score broadly implies that the student was endorsed more often by students with high PageRanks themselves, the entry for student i in the resulting vector can be used as a proxy for student i's rank.
  • Order the students by their PageRank scores and return this ordering.
This algorithm above (essentially PageRank) treats course outcomes as relative orderings and assembles all of these relative orderings into a single total ordering. I now claim that using the algorithm above to rank students would eliminate the three hazards mentioned at the beginning. Here's how:
  1. Some courses are graded more leniently than others. This wouldn't matter since we are treating course grades as relative orderings. That is, a course that grades leniently will provide less information to the algorithm (since many students will have the same grade) but will not otherwise unfairly affect the resulting ordering. In fact, no matter how leniently a course grades, as long as it grades fairly (a reasonable assumption), it will be no different for students than a course that grades more rigorously.
  2. Some departments grade differently. Again, this is not a problem as long as departments grade students fairly. That is, as long as a student who gets an A did legitimately better than a student who got a B or some other lower grade, departments that as a whole give more A's than others will not adversely affect the ranking of students. In particular, taking a course in a department that grades easier will not affect the student, since his performance will be globally compared to all of his schoolmates anyway.
  3. GPA discourages students from taking courses with intelligent students. This problem is eliminated because of what the limiting distribution represents. In the constructed graph, it is advantageous for one to have more inward edges in general, but it is more advantageous to have inward edges that come from students who themselves have many inward edges. This can be achieved by taking courses with many intelligent students, and so computing an ordering in this way significantly improves one's incentive to participate in higher-level courses.

More broadly, I think PageRank's general ability to factor in global information comes in very handy when considering how to rank students, a problem that I think is a fundamentally global if approached correctly. If you want to learn more about ranking this type of thing globally, check out the "Feedback Arc Set" problem, which PageRank solves heuristically (though not exactly).

Now, however, we run into problems of practicality. Unfortunately, I know of something called FERPA, which prevents crazy nerds like me from just mining all transcripts. But maybe it could be used by companies like Google that get tons of transcripts from students anyway. If Bitcoin has taught me anything, it's that providing the right incentives to people can instantly fix many practical problems (think about how quickly mining power grew with bitcoin compared to fold@home), so maybe actually paying students to provide transcripts, even if they're not applying, could be viable for a large enough firm. Perhaps this could be a particularly tantalizing opportunity for LinkedIn, since it already has all of the infrastructure in place to just slurp up people's transcripts. Like so many ideas in computer science, this one could die because of a lack of data and business model, but if one could figure out a way to get the transcripts and make money off of generating a ranking, it could be pretty cool to see what happens.

In general, I genuinely think that GPA is a poor ranking system for the 21st century, carried over from a time when computing anything more than an unweighted mean was "hard". Even if PageRank isn't the way to go about solving the problem of ranking students, I think something should be thought of to replace GPA, and I think the approach of treating course results as relative orderings and aggregating these rankings into one large ranking of all the students is a reasonable first stab.

2 comments:

  1. So I didn't mention this in the original post because I wanted to keep things simple but the scheme trivially generalizes to ranking students in a single department, say all math majors. The way you do it is restrict the graph to only contain the students in the math department and add connections between them only if those connections are due to math classes. What you get, in this case, is a ranking of just the students in the department where the only classes that affect the ranking are math classes.

    ReplyDelete
  2. This is a brilliant idea and one that I would love to see actually implemented. People don't realize how insidious problem #3 is above. From a pure growth perspective, it is far better to be the least smart than the most smart in a class. An idea like this would correct the flawed GPA incentives and possibly encourage people to maximize the opportunity available to them in college. The only drawback is that it would make it really easy to figure out who the truly smart people are in the world.

    ReplyDelete