A message I didn’t give my students this semester

I was looking at old course websites as I plan for the fall, and I ran across this post from the Aural Skills I website from last semester. I’m so glad that with the new grading system I didn’t have to post something like this this semester!

For the last three assignments, there have been a total of 27 zeroes! (one-third of the two sections) That is unacceptable, and it has knocked a few of you below the passing mark for the first time in the semester. It is too late to submit the Chapter 14 listening assignment or the compound-meter notation assignment for credit. However, since the listening assignment from Chapter 16 was due in our last class, you can still submit it tomorrow for a 20% late penalty. It will be a zero if you submit it on Thursday. Please turn that in if you haven’t already.

Keep in mind that homework and in-class evaluations are worth 55% of the final grade, which means that each homework assignment is worth about 2.5% of your final grade. The median for the class is 2 missing assignments—half of a letter grade—and half of you have more than that missing. In fact, everyone who is below the 70% mark has at least 4 zeroes—an entire letter grade, and in some cases the only reason you are failing. For those of you who have just recently started to miss assignments, please get the remaining homework in on time. Do not fail at the last second simply because you are too lazy to do the homework!


First semester with criterion-referenced grading, general recap

All in all, I’m very happy with the way things turned out this semester incorporating the new grading system. There are a few tweaks I’d like to make next year, but I’m definitely not going back to the old system of holistic, average-based grading. In this post, I’ll offer a brief summary of what I did differently this semester (including a couple of tweaks to the system that I made mid-semester), what my students and I thought went well, and what we thought needed improvement, with a few thoughts on how those specific things might be improved.

What it looked like

In the past, I have used a traditional grading system where each assignment is given a single holistic letter grade, those grades are grouped according to assignment type (homework, quiz, test, etc.) and averaged, and the final grade is determined by giving those averages appropriate weights and averaging them.

For each class this past semester, I created a list of criteria that the students needed to meet in order to pass the class. This list was exactly the list of course objectives. These criteria were essentially topical categories like keyboard-style voice-leading, functional-bass analysis, sight-singing – rhythm, or use of music notation software. Each assignment was graded according to each criterion covered in the assignment. For instance, a voice-leading assignment typically was given grades for five categories: voice-leading, functional bass, harmonic syntax, cadences, and notation software. A sight-singing exam would receive grades for rhythm, diatonic pitch, solfege syllables, and—if applicable—chromatic pitch and/or modulation.

At the beginning of the semester, I told students that they must achieve a passing grade (3 or 4 on a 4-point scale) for each criterion on the list of course objectives to pass the class. The final grade would be determined by the number of 3s and 4s they had.

(More details on what we did this semester, and on criterion-referenced grading in general—also called standards-based grading, particularly in the sciences—can be found by clicking on the “criterion-referenced grading” tag to the right of this post.)

The main change that I made mid-semester was to allow students to have non-passing grades in up to two categories (out of ten) in Music Theory II and Aural Skills II. I did not do this for Theory and Aural Skills IV, mainly because I didn’t need to. But I also made this change because next year’s classes will allow this year’s Theory and Aural Skills II students more chances to develop their skills in these areas. If they are passing almost all of them, they don’t need another semester to prepare for level III; they just need another pass at the material in a new context, or maybe some time brushing up over the summer. If that’s the case, they shouldn’t fail the class. By the end of the course sequence, however, I expect them all to get there, and so I didn’t allow any non-passing grades for any of the course objectives in level IV.

I was purposefully vague about any formulas for calculating the final grade in my syllabus, since I hadn’t done this before. I didn’t want to back myself into a corner. What I wanted was a system where students who fulfill the course objectives and are ready to go on to the next level would get a passing grade, students who do not fulfill the objectives and are not ready to go on would get a failing grade, and where mid-semester grades would give students information about where they are, where they need to be, and where to focus their efforts most intently. In other words, all but the final grade should be formative rather than summative assessments.

In the end, I calculated final grades for Theory and Aural Skills II (where I had the most students and needed the most objective system to ensure consistency) as follows. Students who passed 10 out of 10 categories were given an A or a B+. (CSU does not give minuses, and only gives pluses for B and C. Don’t ask why; I don’t know.) Students who passed 9 out of 10 categories were given a C+, B, or B+. Students who passed 8 out of 10 were given a C or C+. Anything else was given a D or an F because they did not meet the requirements for passing the class (only A–C is passing).

The specific grades within those ranges were obtained using a weighted median of their categorical grades. In essence, I made a list of their category grades, repeating each grade according to its weight. So, for example, I may have listed voice-leading 6 times, but notation software only 3 times, and functional bass 4 times. Then I took the median grade on that list. All passing students had a median of 3, 3.5, or 4. That median placed their grade within the ranges. (This weighted median prevented me from giving a student with 4s only in categories that were easy, or which accounted for only a small amount of work during the semester, the same grade as a student with the same number of 4s but in “harder” or more substantial categories.)

So a student that passed all 10 categories with a weighted median of 4 got an A. A student that passed 9 categories with a weighted median of 4 got a B+, as did a student who passed 10 but had a weighted median of 3. A student who just barely fulfilled the requirements with 8 passing categories and a weighted median of 3 got a C, the lowest passing grade.
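The weighted median and the grade ranges described above can be sketched in a few lines of Python. This is only an illustration: the category names and weights are the examples mentioned in the text, the grades and function names are made up, and `grade_range` assumes a course with ten categories. The step from a median to a specific grade within each range is left out, because only a few of those cases are spelled out here.

```python
from statistics import median

PASSING = 3  # a 3 or 4 on the 4-point scale counts as passing


def weighted_median(grades, weights):
    """Repeat each category grade by its weight, then take the median."""
    expanded = []
    for category, grade in grades.items():
        expanded.extend([grade] * weights[category])
    return median(expanded)


def grade_range(grades):
    """Possible final grades, given how many of ten categories were passed."""
    passed = sum(1 for grade in grades.values() if grade >= PASSING)
    ranges = {10: ("A", "B+"), 9: ("B+", "B", "C+"), 8: ("C+", "C")}
    return ranges.get(passed, ("D", "F"))


# Example weights from the text; the grades are hypothetical.
weights = {"voice-leading": 6, "functional bass": 4, "notation software": 3}
grades = {"voice-leading": 3, "functional bass": 4, "notation software": 4}
print(weighted_median(grades, weights))
```

Because the list is built by repetition, a heavily weighted category such as voice-leading pulls the median more than a lightly weighted one such as notation software.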

What went well

First, a greater portion of students passed this semester, even though I held them to a higher standard of mastery. There were also far fewer mid-semester withdrawals. Now, the grading system was not the only potential contributing factor. My students and I knew each other much better, and they trusted me more this semester than my first. We also incorporated less lecture and more in-class work. There were several other changes as well. However, my impression (and that of my students) was that the grading scheme made a positive impact here. (See my earlier post with the results of my student survey, where they overwhelmingly state that the new system helped them achieve the course objectives and direct their efforts.)

Second, students who passed this semester do not have any glaring holes in their knowledge and skills. Some are still working through some things, but they are much better prepared for Musicianship III than they were for Theory and Aural Skills II.

Third, students understood the reasoning behind the assignments better (see survey for data). If I can sell them on the course objectives, and they can see the relationship of the assignments to those objectives, it’s easier (if not always easy) to sell them on the work. It also keeps me honest as I make assignments!

Fourth, struggling students were both challenged not to let problem areas alone, and encouraged by what they could do well. Several students made comments in the survey along these lines:

It helped to understand what I was doing well (I used to think I was doing nothing well in aural skills…which can be a bit defeating) and also helped me to understand where I needed to focus my attention.

Fifth, because students who finished well were not penalized for getting material late in the semester, some who were B/C students in the past were motivated to really push ahead and go for full mastery. In one case, a student who just barely passed Theory I and who started off struggling in Theory II finished each unit very strong and ended the semester with a well-earned A. I think that’s a fairer way to assign final grades, and it motivates a significant number of students to work hard. Few students are ever so far from an A (or from passing) that it’s not worth their effort to try for it.

Lastly, this helped students start to take ownership of their education and their development as musicians. Interestingly, the most common critique of the system in the survey is that it let them be lazy. But that’s good! Better to learn to overcome that your first year in college than your first year on the job! And seeing how much better the grades were for the class as a whole and for most individuals, it seems that the bulk of those who found themselves prone to laziness this semester learned their lesson before it was too late.

What needs improvement

First, and most significantly, the system still needs more specificity. Knowing whether one’s mastery of voice-leading is passing or failing is more helpful information than a C+ average in the class, or getting Bs or Cs on voice-leading assignments that also include harmonic analysis. However, I can do better in giving them specific guidance through the grades. One mid-semester improvement along these lines was to give a specific list of criteria that would lead to passing grades for the categories that covered our final topical unit (pop/rock music). Next semester, I want to make those specific criteria the new course objectives. I won’t have to think “have they demonstrated mastery of the law of the shortest way, doublings, chord realization, dissonance resolution, and register on two or more assignments in a row?” before assigning a voice-leading grade. I can simply treat each of those as its own criterion. A good precedent for this kind of system comes from Andy Rundquist’s physics courses.

Second, there could be more transparency—or better yet, student ownership—about progress toward the final goal. A few students will read this and think it means “knowing how I’m progressing toward the final grade” or “knowing my current ‘average’” (requests from the student survey). This could be clearer, as I was purposefully vague about this early on. However, I really would like to see the students think more in terms of their own musicianship than their grades. This semester moved a lot of them at least a step or two in that direction, but not all the way. I think that a longer list of more specific objectives that they keep track of themselves may help more.

Still more helpful along these lines would be making the classes pass/fail—especially for the first course or two in the sequence. If there ceases to be differentiation between grades and a significant pull on the GPA (ever present on the minds of scholarship students who don’t get straight As and Bs), perhaps the shift from grade goals to musical goals can be made more profoundly.

In a moment of finals-week annoyance I quipped on Twitter, “Can we abolish final letter grades in favor of pass/fail classes and recommendation letters?” The more I think about it, the more I like that idea. As the only teacher most of these students see every day for the first two years of their studies, I end up writing many of them letters anyway. Why can’t that be the means of differentiating them, rather than letter grades? Anyway, that’s another topic. . . .

Lastly, some re-incorporation of deadlines, at least for first drafts of work, could help a few students stay on track, or at least get started. I like Andy Rundquist’s “two-week rule,” where students have two weeks after a criterion is “opened” to submit a first attempt. If a student fails to do that, they get 0 for the criterion. If they get it in, they have unlimited redoes to demonstrate mastery by the end of the course. I think some version of that may be helpful to incorporate next time.

All things considered, the new grading system went about as well as I could have hoped. It definitely helped the students accomplish what they needed to accomplish; I don’t have any reservations about sending those who passed on to the next level; and those who finished well got high marks. I’ll be making some changes for its implementation next year, but I’m definitely not going back.


Ligeti, the Serialist (part 1)

In preparation for my visit to the Paul Sacher Stiftung in Basel, Switzerland, to study materials in the György Ligeti Collection, I am revisiting some of my dissertation research. This is the first post in what will probably be a series devoted to the question of Ligeti and his relationship to the “serialist” composers of Darmstadt and Köln.

György Ligeti is often presented as an outsider to the serialist movement who infiltrated the ranks. However, my research has led me to a very different conclusion. Ligeti, for all of his originality, was a serialist composer. He worked alongside other serialists, engaged serialist questions, and worked within the framework of serialist ideology. He composed music in line with serialist aesthetics, and he wrote articles in serialist journals advancing serialist issues. Even his claims to be a non-serialist, a composer who hates all ideologies and shuns traditions, were typical serialist statements. In this post, based on part of the final chapter of my dissertation, I argue that several of Ligeti’s most popular “middle-period” pieces are not the work of a rogue composer who just happened to work in Darmstadt and Köln. Rather, they are indeed serialist works.

Ligeti’s writings in the first years after his emigration to the West, many of which can be found in Die Reihe or the Darmstädter Beiträge zur neuen Musik, deal to a significant extent with the problems facing serialist composers at the time. One of these problems is that of the nature of composition after the discovery of the “‘nature’ of the elements” of music (i.e., the parameters of pitch, duration, and loudness in relation to single tones) was made, by necessity, in the electronic studio (cf. Eimert 1957/59, p. 9; Ligeti 1958/60, p. 62). Another is the inherent contradiction of the dictum to avoid repeating the past, which leads to an aversion both to directionally oriented functional tonality and periodic (and thus to a large measure, static) figures such as ostinati (cf. Ligeti 1960/65, p. 18; Ligeti 1966; Adorno 1966/2008, pp. 204–05). Still another is the problem of octaves forming at the intersection of serial lines (Ligeti 1958/60; Ligeti 1960/65, pp. 7–8). And lastly, there is the general question of musical form post-Schoenberg, post-Webern, and post-early “integral” serialism (again, cf. Ligeti 1966; Adorno 1966/2008). In fact, Ligeti even claims in “Metamorphoses of Musical Form” (1960/66) that Apparitions and Artikulation are attempts to work out solutions to some of these serialist problems (pp. 14–15).

Indeed, analyses of Apparitions, Artikulation, and other works do bear out the idea that some of Ligeti’s works and compositional devices post-1956 are part of an attempt to “solve” these problems that arise in the history of serialist and post-twelve-tone music. For instance, in “Metamorphoses,” Ligeti describes one of the problems of serializing durations, such that twelve different note durations each occur an equal number of times in a work: “The problem is, that the longer a duration-interval is, the more dominating its effect, for in the series and all structures proceeding out of it only one shortest duration is available to counter-balance the longest. . . . The fact that the longer durations dominate destroys the ‘non-hierarchicalism’ that serial organization is trying to establish” (p. 13). Ligeti follows this articulation of the problem by explaining several approaches taken by composers to correct the accidental hierarchy of duration, all of which, he writes, “inescapably result in the destruction of the original fixed pre-determination of the durations,” focusing on “higher-order control systems” (p. 14). Ligeti suggests, instead, that “Irregular distribution of the elements on a statistical basis could then take the place of fixed series” (ibid.). An exemplar of this is his Apparitions, in which “the product of each duration-value and the number of times that it occurred in the whole structure was constant. . . . [T]he shorter a particular duration-interval, the more frequently it appeared in the context, and so many short durations were used for every long one that the sum of the short ones equalled that of the long” (ibid.).
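The constant-product idea in that last quotation can be illustrated with a small Python sketch. The duration values below are hypothetical and in arbitrary units (they are not taken from the score of Apparitions), and the function name is my own; the point is only that when duration times count is constant, short durations occur proportionally more often than long ones.

```python
from functools import reduce
from math import gcd


def counts_for_constant_product(durations):
    """Return {duration: count} so that duration * count is equal for all."""
    # Least common multiple of all durations, used as the constant product.
    common = reduce(lambda a, b: a * b // gcd(a, b), durations)
    return {d: common // d for d in durations}


# Hypothetical duration values in arbitrary units.
counts = counts_for_constant_product([1, 2, 3, 4, 6, 12])
for duration, count in counts.items():
    print(duration, count, duration * count)
```

With these made-up values, the shortest duration appears twelve times for every single appearance of the longest, so each duration value occupies the same total span of time.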

Ligeti then goes out of his way to make sure his readers interpret his technique as one that is employed in service of saving serialism, rather than subverting it. “It must be admitted, however, despite the fact that by this means we have succeeded in excluding another rudimentary trace of a hierarchical system, that the essential nature of the serial principle itself has here been called in question. But, as mentioned earlier, the serial principle itself has already been called into question” (ibid.).

The next two paragraphs of “Metamorphoses” similarly place Artikulation firmly within serialist tradition, if not in a salvific role. Just as Stockhausen’s Gruppen distinguishes between multiple “aggregate-conditions”—textures of contrasting timbre and various degrees of density—and those different conditions mix with one another and transform into each other in a way that articulates the form of the piece, so does Artikulation. “In my electronic piece Artikulation the aspect that occupied me most was the composition of the mutual effects exercised by these ‘aggregate-conditions’ on one another. . . . The serial ordering of such behaviour-characteristics served as a basis for the erection of the form” (p. 15).

A final example can be seen in Ligeti’s description of a compositional technique employed by Henri Pousseur. “[I]n Pousseur’s Quintet for clarinet, bass clarinet, piano, violin, and cello, the basic 12-note series—borrowed from Webern’s Saxophone quartet Op 22 in homage—is shorn of its function simply by filling out each interval chromatically. The pitch-series has been transformed into a series of densities” (1960/66, p. 7). Ligeti describes this as a stage on the way towards a complete abandonment of the “pre-formation of pitches” “in favor of serial depositions of a higher order” (ibid.). Compare this with Bernard’s conclusion from his (1994) article, “Voice Leading as a Spatial Function in the Music of Ligeti”:

[Ligeti’s] reaction to European serialism of the 1950s led him to eliminate, as far as was possible, pitch (especially pitch-class) and interval functions, substituting for them, respectively, a simple distinction between high and low and a scale of larger or smaller “bandwidths” (that is, intervals as vertical spans of absolute size) filled more or less densely (p. 249).

Bernard’s analysis leads him (rightly, I think) to conclude that “bandwidths”—or intervals between upper and lower boundary pitches—and density are important functional aspects of Ligeti’s music of the 1960s. However, while Bernard takes this as a “reaction” to serialism, it is clear from Ligeti’s description of some distinctly serial problems in 1960 and recent attempted solutions to those problems that the functional properties of bandwidth and density in Ligeti’s music of the 1960s are further attempts on Ligeti’s part to address these same problems, and there is a close connection between his bandwidth technique and Pousseur’s use of the series in his Quintet. We cannot speak of “a reaction to the modernist impulse found in [Ligeti’s] works of the 1950s and 1960s” (Drott 2003, p. 286) without significant qualification. In a very real sense, Apparitions, Artikulation, Lontano, and other works of the late 1950s and 1960s are serialist works, and Ligeti is a serialist composer.


Student feedback on criterion-referenced grading

At the end of this semester, I asked my students to fill out a few short surveys about some of the changes I introduced this semester, and some of the resources we made use of this year. One of those surveys was on the change from a holistic, average-based grading system to a criterion-referenced grading system. Just under half of my students who finished courses with me this semester responded (a few more than filled out CSU course evaluations!). Here are the questions and responses.

1. Did the new grading system help you achieve the course objectives better?

2. How did the new grading system help guide your study and direct your efforts towards achieving the course objectives throughout the semester?

3. How did the new grading system help guide your efforts *at the end of the semester*?

4. How did having multiple scores per assignment affect your ability to achieve the course objectives?

5. Did the new grading system make more apparent the reasoning behind the assignments given?

6. How did the absence of late penalties for assignments affect your coursework? (did you do less work or the same work? was it freeing? or did it make it easier for you to fall behind?)

i still did the same work

It was very freeing. I felt like I could do more work done than I could before because I had more time.

It definitely made it easier to fall behind but that was my own fault. Although it was easier to fall behind, it was very freeing. I liked that I could turn things in and catch up.

I did less work, so it was freeing, but it also made it easy to get lazy and fall behind.

It was definitely freeing, because there was one time that i just had so much music to be learning and work in other classes that i was able to hold off on the theory assignment and finish it later, but still be evaluated on my knowledge of the assignment

I did the same amount of work and tried to submit everything in a timely manner.

I thought it was great because it allowed you to take your time to make sure assignments were correct instead of rushing to get stuff done. It helped balance the amount of coursework as a music major. Thank you!

It did not affect my work. I may have been less worried about getting it in by 9:00 am, but I didn’t miss any assignments at all. This system made me feel like I needed to do all the work to grasp the concepts…I felt like it was freeing (see above and the part about aural skills creating an abundance of anxiety for me in previous semesters)

It was nice knowing that, after several hours of homework, I could go to bed without stressing about that one extra Theory or Aural Skills assignment.

7. How did the ability to resubmit assignments affect your ability to achieve the course objectives?

8. What would you suggest that I do in future semesters?

Obviously, the students liked the new system. In fact, some of them liked it so much that they don’t want me to improve it! (question 8) It took some of them some getting used to. However, no one wants to go back to the old system. And they believe (as do I) that the criterion-referenced system helped guide their efforts more than the old traditional system—and take more ownership of their education. Most of all, the ability to resubmit assignments was universally appreciated. And, as my previous post demonstrated, it seems to contribute more significantly to overall student improvement than simply doing more new assignments.

I do have some improvements that I want to make, which I’ll discuss in future posts, and at that point I’ll discuss some of the written comments provided by students in response to these questions. However, given student performance this semester and student response to the new system, I’m definitely not going back to the old system of holistic, average-based grading. And I strongly encourage any instructors who read this to give some form of criterion-referenced grading a try—even just for a single course.


Data on student engagement with criterion-referenced grading

One objection that often comes up in response to a criterion-referenced or standards-based grading system is that it does not sufficiently motivate students to engage coursework throughout the semester. If students only need to demonstrate mastery by the end of the semester, they can skip homework assignments with impunity (or so they think). And if they don’t do the work, they won’t engage the material sufficiently to succeed in mastering the material by the end of the semester. This point was raised in response to several of my early posts on criterion-referenced grading.

There are two claims here: first, that students are more likely to do assignments that count toward their final grade (as they do in traditional average-based grading); and second, that students are more likely to master the material and accomplish the course objectives if they do more of the assignments.

I now have some data that speaks to both of those claims.

(Though, I should note that I don’t have much data. So while my data may suggest something about the veracity and validity of these claims, I want to be clear up front that my data is not sufficient basis for any conclusions. If others out there have data of their own to contribute, perhaps a collaborative effort may lead to claims of substance.)

Grades and student incentive to work

I will leave philosophical matters aside here about whether grades should reflect mastery, behavior, or some combination thereof. (I tend toward mastery rather than behavior, as you might guess.) Though that issue is an important one not to overlook in discussions of grading methods, I simply want to focus on the two claims presented above in this post. The first question, then, is this: do students in an average-based grading system, where each assignment counts towards their final grades, do more coursework than students in a criterion-referenced grading system?

To explore this matter, I took data from my gradebook for the fall semester’s Music Theory I class (MUSI 131, holistic and average-based grading) and this semester’s Music Theory II class (MUSI 134, criterion-referenced with multiple categorical grades for most assignments). I have data for other courses I taught this year, but these classes are my largest, they have the most assignments, and they have the most similar kinds of assignments. That said, it was still difficult to get an apples-to-apples comparison. The distinction between homework and test became fuzzier in Theory II, for one thing. Once you stop using a grading system that counts homework and tests differently, but groups assessments by course objective, that distinction is bound to diminish. Incorporating inverted-classroom techniques does the same thing. Theory II still had homework, quizzes, and tests, but the distinction between in-class practice work not for a grade, in-class practice work for a grade, quizzes, tests, graded practice tests, homework, and take-home tests/projects becomes messy.

The data-related result is that the quizzes in Theory I have no direct correlate in Theory II. The tests (in-class and take-home) in Theory I are sometimes comparable to Theory II homework and sometimes comparable to Theory II tests. And I don’t have data on how many students did in-class assignments in Theory II that were looked at but not graded. (I could compare class activity notes to attendance records, but those activities have no correlate in Theory I, and I’m lazy.) So I decided to count all graded homework and tests for both classes, but not ungraded work or in-class quizzes (since the latter existed in only one class). I think this is a fair comparison, and where any unfairness exists, it will tip things in favor of the claim I hoped was false, so I’m okay with that. Take it for what you will.

Okay, now the data.

For each course, I took the number of assignments submitted (including redoes) and tests taken for each student, and divided that by the number of assignments and tests given to obtain a percentage of work performed. For Theory I (average-based grading, one shot per assignment), the mean amount of work submitted by students that finished the semester was 92%; the 95% confidence range was 86%–98%. For Theory II (criterion-referenced grading, redoes allowed), the mean amount of work submitted by students that finished the semester was 87%; the 95% confidence range was 78%–96%.
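For readers who want to run the same kind of calculation on their own gradebook, here is a minimal Python sketch. The submission counts are made up, the function names are hypothetical, and the 95% range uses a normal approximation, which is an assumption on my part. For a class-sized sample, a t-based range would be somewhat wider.

```python
from statistics import NormalDist, mean, stdev


def percent_submitted(submitted, given):
    """Work submitted (redoes included) as a percentage of work assigned."""
    return 100 * submitted / given


def confidence_range(values, level=0.95):
    """Mean and (low, high) range for the mean, via a normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(values)
    half_width = z * stdev(values) / len(values) ** 0.5
    return centre, (centre - half_width, centre + half_width)


# Made-up submission counts for five students, out of 20 assignments and tests.
work = [percent_submitted(n, 20) for n in (20, 19, 17, 18, 16)]
centre, (low, high) = confidence_range(work)
print(f"mean {centre:.0f}%, 95% range {low:.0f}%-{high:.0f}%")
```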

The 95% confidence ranges (represented by error bars on the chart below) are important. They demonstrate that though there is a difference in the mean value for each semester, regular variance from student to student and from semester to semester cannot be ruled out as the cause of that difference. In other words, there is no statistically significant effect of grading system on the amount of work done by students in these courses.

Now some will object that I counted redoes for the criterion-referenced grading system, stacking the deck in my favor. Well, I’ll tell you that there is a statistically significant difference between assignments submitted in Theory I and first attempts submitted in Theory II. The mean value there is 69%, with a 95% confidence range of 63%–75%.
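The comparisons in the last two paragraphs amount to checking whether two confidence ranges overlap. A rough Python sketch of that reading, using the ranges reported above, looks like this. The function and variable names are hypothetical, and an overlap check is only a quick heuristic, not a formal significance test.

```python
def ranges_overlap(a, b):
    """True when two (low, high) ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


theory1_all = (86, 98)    # Theory I, all work submitted
theory2_all = (78, 96)    # Theory II, first attempts plus redoes
theory2_first = (63, 75)  # Theory II, first attempts only

print(ranges_overlap(theory1_all, theory2_all))    # True
print(ranges_overlap(theory1_all, theory2_first))  # False
```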

However, I included redoes for two reasons. First, the claim to which I am responding is not that average-based grading motivates students to do a large number of assignments and that completing more distinct assignments leads to better understanding. The objection raised is that average-based grading motivates students to engage with the material more regularly and that regular engagement leads to better understanding. Redo assignments certainly contribute to student learning. In fact, that leads to my second reason: as I will argue shortly with data, I believe that redo assignments (following helpful instructor comments—and receiving multiple categorical grades for each assignment is part of that) contribute more to student learning than first-attempt assignments. So let’s move on to that data.

What kind of work helps students learn best?

Assignments where students are not penalized for failure but instead are encouraged to take comments from the instructor and apply them on another attempt at the same assignment before moving on, without any final grade penalty for that first attempt.

Here’s the data.

I analyzed a number of parameters for 17 students who took both Music Theory I and Music Theory II: Theory I final grades, Theory II final grades, the difference between those grades (each letter/+/– combination was given an integer rank, and the difference was calculated), the percentage of assignments & tests submitted in Theory I, same for Theory II, percentage of first-attempt assignments (number of first-attempt submissions / number of assignments given—no tests or projects were included) in Theory II, and percentage of redo submissions (number of redo submissions / number of assignments given—no tests or projects were included) in Theory II.
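As a sketch of how these per-student values can be derived, the Python below turns letter grades into integer ranks and computes the grade difference and the redo percentage. The specific rank numbers are an assumption for illustration (only the ordering matters for a rank correlation), and the function names are hypothetical.

```python
# Hypothetical integer ranks for the letter grades, lowest to highest.
GRADE_RANK = {"F": 0, "D": 1, "C": 2, "C+": 3, "B": 4, "B+": 5, "A": 6}


def improvement(theory1_grade, theory2_grade):
    """Difference in rank between the Theory II and Theory I final grades."""
    return GRADE_RANK[theory2_grade] - GRADE_RANK[theory1_grade]


def redo_percentage(redo_submissions, assignments_given):
    """Redo submissions as a percentage of assignments given."""
    return 100 * redo_submissions / assignments_given


print(improvement("C", "A"))   # a student who moved from a C to an A
print(redo_percentage(5, 20))  # a student who redid a quarter of the work
```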

I used Spearman rank correlation to test which of the pieces of information about student engagement best predict student success (Theory II final grade) and student improvement (difference between Theory I and II). I used the scipy.stats.spearmanr() function in Python (I’m told that the p values may be unreliable for data sets of fewer than 500 elements, so take them as you wish).
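For anyone curious what the statistic is doing under the hood, here is a minimal standard-library sketch of Spearman's rho: rank each variable (tied values share the mean of their ranks), then take the Pearson correlation of the ranks. The function names and the sample grade ranks are hypothetical, not the data from this study, and the sketch does not compute a p value (the SciPy function returns both rho and p).

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical integer grade ranks for six students (illustration only).
theory1 = [11, 8, 5, 9, 3, 7]
theory2 = [10, 9, 6, 7, 5, 8]
rho = spearman_rho(theory1, theory2)
print(round(rho, 2))
```

Working on ranks rather than raw values is what makes the measure suitable for ordinal data like letter grades, where the distance between adjacent grades is not well defined.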

What is the best predictor of student grades for Theory II? Theory I grades. The Spearman coefficient of rank correlation (rho) between Theory I grades and Theory II grades is 0.53, p < 0.03. So a moderate correlation that (as far as I can tell using this tool) is statistically significant. And that makes sense. Good students do well no matter what the grading scheme; poor students poorly; etc. Nothing else had anything near a statistically significant correlation with Theory II grades (and the rho values were low or low and negative anyway). So in this group of students, the amount of work done (overall, redoes, or first attempts) had no significant effect on their final grade.

But what is the best predictor of student improvement from Theory I to Theory II? The best predictor of student improvement was a low score in Theory I! Theory I grades and grade improvement correlated at rho = -0.61, p < 0.01. Of course, that could just be regression to the mean. Students with As in Theory I can only hold steady or go down, and vice versa.

There were no other significant correlations, but there was one other that was very close (and remember that the p values may not be reliable for such a small sample). Student improvement correlated moderately (rho = 0.46, p < 0.06) with number of redo submissions. Not overall work, not number of first-attempts, but redoes.

This makes sense, and I can think of specific students where this was huge—in one case a student submitting redoes on about a quarter of the assignments improved from a C in Theory I to an A in Theory II. Quality and specificity of instructor comments are important, though, as is student attention to those comments, for redo assignments to effect improvement. I'll save comments on that for another post.

Conclusion

Again, the data I collected here is minimal, and I don’t want to make any strong claims based on it. However, this data suggests that—at least for this group of students—having an assignment count toward the final average does not have a significant effect on student motivation to do the work. This data also suggests that percentage of assigned work submitted during a semester does not predict student outcomes. However, this data does suggest that percentage of assignments re-submitted by students (especially students who struggled previously) may have a significant effect on student improvement.

It should be noted that though my specific take on criterion-referenced grading this semester showed no significant difference in amount of work submitted, that may not always be the case for criterion-referenced grading more generally. I gave my students 23 assessments, for 10 categorical grades, and students were required to pass a minimum of 8 to get a C, 9 to get a B, and 10 to get an A (in addition to weighing the actual scores). Many assignments, tests, and projects counted toward multiple categories, but that didn’t leave many assignments students could skip and still cover all the categories. And a few early slackers found that out as we got closer to midterm. We also started many assignments in class, while I walked around the room catching errors and answering questions (office hours for everyone, essentially), which got the ball rolling and got most of them over any initial freeze that might happen were they to start it on their own late the night before the due date. Had we not done those things, results may have been different. So I’m not suggesting that criterion-referenced grading will never lead to students not turning in work until it’s too late. But the way I did it this semester did not.
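The category thresholds described above (8 passed for a C, 9 for a B, 10 for an A) can be sketched roughly as follows. This is only an illustration of the pass-count rule: the function name is hypothetical, the weighing of actual scores is deliberately left out, and the post does not say what happens below 8 categories, so the "below C" label is an assumption.

```python
def highest_available_letter(categories_passed, total_categories=10):
    """Highest letter the pass count alone allows, before scores are weighed."""
    if not 0 <= categories_passed <= total_categories:
        raise ValueError("categories_passed must be between 0 and total_categories")
    if categories_passed >= 10:
        return "A"
    if categories_passed >= 9:
        return "B"
    if categories_passed >= 8:
        return "C"
    # Assumption: the post does not specify the outcome below 8 categories.
    return "below C"


for passed in (10, 9, 8, 7):
    print(passed, highest_available_letter(passed))
```

With only 10 categories and a floor of 8 for a C, skipping the work for even three categories rules out a C, which is why there was little room to skip assignments.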

I hope this post has been helpful for some of you thinking through things for next semester. And if you have data to add to mine and are interested in a larger quantitative exploration of the claims discussed here, please let me know. It would be an interesting and valuable project, I think.


Open-access publishing and music scholarship

A couple very interesting threads have come up lately on the SMT-Talk mailing list (for the Society for Music Theory). One is on textbook publishing, and another on open-access publishing. (Actually, the latter was my fork of the former.) Some exciting things are happening, like the forthcoming creation of a new website hosting contributed, heretofore unpublished pedagogical materials for music theory. So far, though, a lot of the discussion on open access has centered around making material free to download on the web. But that isn’t the whole picture when it comes to open, free access. Here’s my recent email to the list. I think that some from other disciplines or subdisciplines may be interested to see what’s being discussed. And I hope that a few of you may have thoughts to contribute from outside the membership of the Society for Music Theory.

Here’s the message:

Dear Colleagues,
So many interesting threads right now, but alas it is finals week here. Probably for the best, since I could write far more than any of you would like to read on these matters! However, I have a few more thoughts on open access that I’d like to share.

First, thanks to Steve and Jennifer for the reminder about the JMTP site. It sounds like it has potential to do some great things, and I hope as well that there will be a substantial “community effort.” Also, thanks to Yonatan for the list of open access music journals. Very helpful, and I hope to see the list grow as more journals move their material to the open web.

Putting vetted material on the web for free download, though, is only one facet of open access. The open-source software (OSS) community often contrasts “free as in beer” with “free as in speech” or “free as in freedom” when discussing what it means for software to be free. Free to download and use as is within certain terms of use is a very good thing (free as in beer). But free to manipulate, re-create, enhance, break, rebuild, and contribute brings a whole other set of benefits (free as in freedom). For one, open-source software problematizes the developer/user divide and puts forward more of a community of developers, users, and developer-users.

While the developer/user (author/reader, perhaps) model has its place in academia, I think the model of a community of developer-users (or simply “researchers” or “scholars” or “pedagogues”) makes more sense of much of what we do as music scholars and teachers. Academic research and teaching often necessitate manipulation, re-creation, breaking, rebuilding, etc. But free-as-in-beer publishing doesn’t fully promote that. Developer-users must still think about the difference between infringement and fair use, derivative v. transformative use, etc., and the vetting process means that new information cannot get out quickly.

However, there are a couple things we can do in the sense of “free as in freedom” publishing that can foster more of a developer-user community, to the benefit of our scholarship and teaching:

1) A small thing we can do is to license our materials (in print and especially on the web), whenever possible, with a Creative Commons BY license. Here’s a great, short article on why that particular license makes good sense for humanities scholarship: http://nowviskie.org/2011/why-oh-why-cc-by/. (I hope the new JMTP site will use this license or something close to it in order to ensure the greatest possible potential for use of the materials hosted there.)

2) A more substantial thing we could do (which gets more toward what I believe Bob was suggesting) is to aggregate and “curate” as a society the material available on the open web relating to music theory (or along with our colleagues from other societies, music scholarship more generally considered). A great model for this comes from Digital Humanities Now and the Journal of Digital Humanities. Details can be found on their respective websites, but in a nutshell, DHN aggregates content on the open web relating to the field of DH; editors pick the most substantial original work and label it “Editors’ Choice”; followers of DHN are driven to those articles, where they provide comments, criticism, and questions for the authors; authors make revisions based on those comments; and the best revised articles are chosen by the editors for placement in the JDH (which is available for free in HTML, PDF, and ePub).

The combination of these two things means that new scholarship can get out fast, and the quality can be quite high (due to the high number of potential reviewers and the small percentage of submissions that ultimately make it into the journal, at least in the case of JDH). The peer review process is also entirely transparent, and the articles that are finalists but not chosen for the journal are still widely publicized and receive substantial review.

I hope these two specific ideas that are taking root in other fields will take off in music scholarship, as well. I’m planning on using CC-BY with my work whenever possible. I’d also love to be involved in laying the groundwork for something like DHN/JDH for music, if there is enough community interest and some potential collaborators.

Happy finals week!

Kris Shaffer, Ph.D.
Assistant Professor of Music Theory
Charleston Southern University


Mahler 5, Chicago Symphony, Barenboim

I played along with part of this in my office today. Getting in shape for this weekend’s Mahler 2 concert with the Charleston Symphony.

I’ve found a great way to practice a symphony like this is to download the part from IMSLP and load that up on one side of the computer screen, then open up a YouTube video of a great orchestra/conductor playing it on the other side of the screen. Lots of fun.

Of course, when I started playing along with the CSO at full volume, the practice rooms down my hallway were full. When I finished, they were completely empty! Must mean my fff was almost loud enough!


“A future full of badges”?

UC-Davis has created a new system of academic credentials known as “badges,” which Kevin Carey explains at The Chronicle of Higher Education. These badges, not unlike the Boy and Girl Scouts’ merit badges, provide detailed information about a person’s skills and knowledge in specific areas. This contrasts with the traditional academic transcript in two chief ways. First, the skill information is more specific than a letter grade in a particular course (which could cover any number of topics; and as I’ve discussed previously in this blog, the course may have a grading system that masks serious gaps in student understanding when averaged with other topics in the same course on which they excel). Second, badges are open and reflect knowledge obtained anywhere, rather than knowledge gained at a single institution.

UC-Davis is not unique in using badges. Carey notes the work that the Mozilla Foundation and the MacArthur Foundation have put toward supporting the creation of such systems. And, of course, Khan Academy uses a badge system to reflect knowledge gained on the site.

The badge system has some clear points in its favor. For one, it is similar to a criterion-referenced or standards-based grading system in that it diminishes the likelihood of student weaknesses being “averaged out.” It also allows for individual-paced learning and the possibility of full-time employees knocking off some basic “coursework” at their own pace on nights and weekends (or on the job, for that matter) before jumping into full- or part-time university study to tackle what can’t be done readily on one’s own.

And there are some clear kinks to be worked out. For instance, will there be one central body determining the criteria for badge awarding? Will instructors need to be certified? Will each field have their own? Or will there be competing certification/accreditation boards? And as one commenter on Carey’s article asked, how long will these badges last? A degree from Yale has pretty long lasting significance (at least, that’s what I’m banking on!). Will these badges just disappear in a few years?

Rather than jump whole hog into that debate, I want to direct my thoughts toward a particular issue that Carey brought up:

Compared with the new open badge systems, the standard college transcript looks like a sad and archaic thing. Its considerable value is not based on the information it provides, which is paltry. What does a letter grade in a course often described only by the combination of a generic department label and an arbitrary number (e.g. Econ 302) really mean? Nobody knows, which is why accredited colleges often don’t trust that information for the purposes of credit transfer, even when it comes from other accredited colleges. [emphasis added]

This is a real problem. I’ve had several students this year come in with AP Music Theory credits or transfer credits who were placed into the level where they left off in their previous study. All of them were missing some important skill or piece of knowledge that my class covered the previous semester. Now, some of them rose to the challenge and have done brilliantly. But some have not. And it’s not entirely their fault. If CSU had better information about their background (or a better policy for acting on that information), we could place them better. It need not mean taking an extra prerequisite course; it may simply mean knowing ahead of time that they need to read up on x or practice doing y over the semester break, or audit the last few weeks of the previous course, before jumping in. But if we don’t know there’s a disconnect, they certainly won’t.

However, I’m not convinced that badges are the way to solve this problem. Instead, professors and universities can simply be more careful about testing incoming students before placing them, perhaps even granting them credit for courses where they have already mastered the material outside of an accredited course.

That’s the change we made at CSU to work around our AP and transfer problem. Starting next year, we will require a placement exam of anyone bringing in music theory/aural skills/musicianship credits. If they demonstrate knowledge and mastery consistent with the course objectives of Musicianship II, they are placed in Musicianship III (whether they took one semester or four prior to transfer). Looking at a letter grade, a course description (which probably isn’t up-to-date), and an institution’s reputation isn’t enough to tell us if a student will succeed at level x. So we’ll test them ourselves and put them where we think they best belong, regardless of how many credits they have. (Our registrar counts them as general credits until they pass our placement test or take our course in its place.)

Of course, you may say that evaluating students against a list of course objectives for each level of study is basically what UC-Davis’s badges do. And I’d agree. However, there is a movement to make badges universal, and that’s where I see the problem. As Carey writes:

The top-flight educators at UC-Davis may develop the first widely used badge system for sustainable agriculture, but they won’t, in the long run, control it. Over time, farmers, students, civic groups, companies, professional organizations, and individual scholars will all contribute to a continuing process of helping people organize critical information about their lives.

I love the idea of a degree or a transcript saying (in a fair amount of detail), the faculty at this institution agree that this graduate/student knows this material and possesses these skills at this level rather than a list of grades for a list of cryptically titled classes (which cover different material every year). But I cringe at the idea of a single list of badges (whether controlled by a central authority or an “open” “community”) and what it will do for employment practices and curriculum design, not to mention the increase in “grade grubbing” (“badge grubbing”?) it is sure to bring. Like a Windows software update, I’m afraid that an open, standardized badge system would introduce more problems than it fixes.

All in all, I think the answer to discrepancies between institutions and to the opportunities for individuals to learn outside the university is not to create a national or international standard metric of achievement. Rather, each individual institution (whose faculty know the curriculum and the current students, have a particular vision for the outcome of the students’ education, and will be personally investing themselves in these new students’ lives and scholarly development for the next few years) should take care to assess incoming and outgoing students in the most meaningful ways, in the best service of their education. Invested faculty with clear goals, keen awareness of student progress, and a willingness to adjust methods (and institutional policies) in service of bringing those students to those goals will always beat a standardized system. I applaud UC-Davis for seeking clarity, transparency, and specificity in the way they assess students, and for their willingness to redesign the curriculum from the ground up around a department consensus of knowledge and skills necessary for graduates of their program. But I shudder at the thought of making a single, universal map of knowledge.


Python scripts for a criterion-referenced grade book

When using a criterion-referenced or standards-based grading system, one big question is always how to manage grade records. The issues are that a single assignment may contribute to assessment for a number of different course objectives, a given course objective/grading criterion will reflect multiple assignments, and students may resubmit assignments at least once, sometimes several times. Thus, a traditional electronic grade book either will not fit the system, or it will be huge, slow, and clunky.

I decided that rather than force my model into an unwieldy spreadsheet, I would write some Python scripts to do the job for me (download the scripts and a sample course grade list here). I wrote three scripts: one for grade entry (which I then modified into a version for each class that contained the specific grading categories for that class), one to generate a report for a given assignment (listing student name and score by category), and one to generate a report for a given student (listing assignment title and score by category). The grades go into a CSV file, one per course.

This has worked pretty well for my system. Every week or so, I go in and generate a report for each student in each class. Looking at that report, I can see their progress in each category over time, and I assign them a current score in that category based on their most recent work and progress. Those current scores go in the grade book on the course website, so they can log in and see their current progress in the dozen or so categories and know where they most need to focus their efforts.
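For a rough idea of what a per-student report over such a CSV file might look like, here is a minimal sketch. It is not the downloadable scripts themselves: the column names, sample rows, and function names are hypothetical, and taking the latest score per category is a simplification of the judgment call described above, which also considers progress over time.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout and data; the real scripts' columns are not shown here.
SAMPLE = """student,assignment,category,score
Ada,HW1,voice-leading,2
Ada,HW1,notation,4
Ada,HW1 redo,voice-leading,4
Ben,HW1,voice-leading,3
"""


def student_report(csv_text, student):
    """Return {category: [(assignment, score), ...]} in file order for one student."""
    report = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["student"] == student:
            report[row["category"]].append((row["assignment"], int(row["score"])))
    return dict(report)


def latest_scores(report):
    """Most recent score per category, assuming rows are in chronological order."""
    return {category: entries[-1][1] for category, entries in report.items()}


report = student_report(SAMPLE, "Ada")
current = latest_scores(report)
for category, entries in report.items():
    print(category, entries, "current:", current[category])
```

Storing one row per assignment-and-category pair is what lets a single assignment feed several categories and lets a redo simply append a new row instead of overwriting the first attempt.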

If this sounds like it might work for your system, or if you know Python well enough to modify it for your system, feel free to download the scripts and play around with them. Below is a screencast video of me demonstrating how they work. Nothing particularly fancy, but it may give you an idea how I use it and how it might work for you.


Video grading for transcription & arranging projects

I’m really enjoying using Camtasia to make screencasts as I walk through student transcription and arranging projects. I think that video is an even more useful tool for assessment than it is for generating class resources.

For transcription projects, I have students submit a PDF or a native file from their notation application (MuseScore for most of them) on the course website. When I grade, if there are only a couple small errors I either just email them or record a video of me pointing at things with the mouse and telling them what needs to be revisited. If there are more than a couple small errors, I may load up the audio track in iTunes and alternate talking and pointing with playing the audio for them, explaining what’s wrong. (Camtasia has a plugin called Soundflower that allows simultaneous recording of both a mic and system audio, including MIDI, though that takes a little setup.)

For arrangements, I really like going through the music notation file on video, since I can not only point to good/bad things, but I can change the notes and move voices around in the software while they watch, and then use MIDI playback to demonstrate the difference between what they produced and what I suggest.

I’d love to post a sample video of me grading, but the students’ names are on the files, and I usually address them personally at the beginning. Maybe I’ll make a fake one sometime for an example for those interested.

So far, the response has been positive. The students claim to understand my comments better, and when they submit redoes, the quality difference between first and second attempts seems to be bigger than when I provided only written comments.

I can imagine the big question, though, is how long this takes. I’ve found that the grading itself takes less time, or maybe the same amount of time, but I can communicate more or better during that time. And on campus, where we have fast upload (though slow download), putting these videos online doesn’t take any longer than it does to give the next assignment a quick look-over before making the next video. (I use Screencast.com, the site that TechSmith, Camtasia’s creator, has for hosting videos uploaded directly from Camtasia, and which supports private, authenticated folders perfect for maintaining student privacy.)

However, if I do this at home (fast download, slow upload), there is a good time gap in between. If I have written quizzes or dictation assignments to grade in between, that’s fine. But if all I’m doing is video grading, that’s annoying. I’ve tried to do my grading on campus, then, and if there’s work that needs doing while at home, it can be something for which location doesn’t matter. Next year, though, I’m thinking about using Dropbox instead of Screencast.com for several reasons, and that will make it easier to do all the video making at once, and all the uploading at once (while watching Netflix, perhaps?).

So overall, I think this is a net time-saver, and at worst a net time-maintainer. But the increase in quality of communication with the students and their ability to understand and apply my comments is a big plus. It’s definitely something I’ll continue to do, and hopefully it will allow me to include more model composition and more substantial arranging and transcription projects in the future.

Camtasia comes with a 30-day free trial, so if you’re at all intrigued by the idea, try it out.
