Points Per Jam: Roller Derby’s Default Difficulty

It should be difficult for a roller derby team to score points. So why does it often seem so easy for them to be scored instead?

Derby scoring has seemingly been getting easier and easier over the last several years, with point totals climbing higher and higher. This year’s rules updates appear to have addressed this trend, sure. But even taking games played in 2014, it is still not abnormal for many of them to end with a combined total score of 300, 400, 500, 600 (!), or even more than 700 (!!!) points.

Press me for 5 points. And again. And again…

Press me for 200 points.

Whether point totals of such magnitude were reached in a close game or a blowout, if it is possible for two teams to together score that often in a 60-minute game, any individual pass for a point, let alone the non-scoring initial passes, must be relatively easy to accomplish.

If it were really that difficult to get points, there wouldn’t be so many of them scored in the first place!

Very high-scoring games still happen quite a lot in the WFTDA and MRDA, especially during mismatches. Scoreboard-spinners can also show up in other derby variants, like in USARS, MADE, or the RDCL.

However, games where the scoreboard hits perilously high totals are less frequent in non-WFTDA forms of roller derby, and of a lesser magnitude when they do happen. This is in part because scoring points is appreciably harder to do, on the average, in these versions of the game.

As a result, these games can often be much more competitive.

Over the last year, examples of just how dramatically scoring rates differ between the different styles of roller derby have started to surface. A taste of this difference comes by way of Penn-Jersey Roller Derby, the long-standing co-ed flat and banked track league from Philadelphia.

The PJRD men’s team, the Hooligans, has recently become a member of the MRDA. To achieve this, it began playing under WFTDA rules for the first time in its nearly decade-long history, having played by MADE rules and their forebears during that time.

Adopting WFTDA rules, however, has not changed the Penn-Jersey philosophy on roller derby. It has played MADE since the inception of that rule set, and continues to do so; started playing RDCL on its banked track last year, entering its ladies team into the last two Battle on the Bank tournaments; and has expressed interest in trying USARS next season.

That open, bi-traxual, poly-setual nature has opened up interesting opportunities for their would-be opponents. Namely, the chance for the same teams, with the same rosters, to play MADE- and WFTDA-rules games in the same weekend.

The very first WFTDA-rules game the Hooligans ever played was earlier this year against Mass Maelstrom, at the time ranked #4 in the MRDA. That is a hell of an opponent to have for an initiation test, and it comes as no surprise that the differences between the two teams showed up on the scoreboard and penalty sheets.


Click to watch this game on YouTube.

This was a pedestrian game for the likes of the Massachusetts men, who had no trouble lighting up the scoreboard against a comparatively weak opponent.

However, something interesting happened when these same two teams took to the banked track the day before to play a regulation MADE-rules game. Scoring in big bunches suddenly became much more difficult for the Maelstrom—about six times more difficult—despite clearly being the superior team.

Click to watch this game on YouTube.


In both of these 60-minute games, one team was playing by a new rule set for the first time. You might suspect this would contribute to extremely high point totals or a severe blowout in both cases. Yet that does not appear to be the case here.

After the Hooligans had picked up a few more games of experience under WFTDA rules, they repeated the weekend rule set double a few months later, this time against an MRDA team that was closer to their level, the Carolina Wreckingballs.


This was still a blowout by WFTDA standards, but one nowhere near the level of the Hooligans’ first go at it.

As it turns out, there was also a blowout when the Balls took their first shot at a different way to play the night before. However, it was the kind of lopsided result more expected in a sport like basketball or football, one where points are never easy to come by even when a strong team is playing against a comparatively weak one.


The Wreckingballs won the MRDA game easily, since they were much more in tune with playing by WFTDA rules. The Hooligans won the MADE game for likely the same reason, just with MADE rules.

The winner of each contest is not of concern in this comparison, however. What is, is the overall difficulty of scoring points between the different versions of roller derby.

These games had the same teams, the same rosters, the same 60:00 of play. But not the same level of competition. Taking these four games at face value, one style of roller derby appears to be much more competitive than the other.

What could be causing this difference?

Aside from the banked track (never mind that MADE is dual-surface and plays nearly identically on flat track), the only plausible explanation for this massive change in scoring scale, and the much more competitive games it wrought, is the different rules in play.

If you generally compare MADE games to WFTDA games, the difference in scoring rates will fall within the 2x~4x range that these two example games did.

In fact, if you compare play among the three major alternative derby rule sets to that commonly found in the WFTDA and MRDA, the numbers show that WFTDA teams, including the elite ones, have had things very easy in comparison.

How this can be and why this matters is a critical insight that roller derby needs to understand as it thinks about what it needs to accomplish to sustain itself, in the short term and for whatever step is coming up next in the growth of the sport.

As will be demonstrated, how competitive roller derby can potentially be has less to do with how talented or how equally matched two teams are than you may think.

When you consider what it actually means for a contest to be “competitive,” then widen the focus to look at all games—among the elite, between the average, and with newbies; in the close games, the slight mismatches, and the big blowouts—there is a much more prevailing factor at work.

To discover what this factor is, a point first needs to be made about scoring rates in different roller derby rule sets.

Points of Reference

This analysis will compare score data of games played in different roller derby variations, and the game strategies behind the numbers which might help explain them. In doing so, we can find out how rules differences affect how competitive jams and games will generally be, independent of the skill level or skill gap of teams playing each other.

Four rule sets will be going under the microscope: WFTDA, USARS, MADE, and RDCL.

All data referenced here is from tournaments played in the 2013 roller derby rules season. Old data is being used because it is currently the only year where parallel data exists in the four rules environments. Besides ensuring an equal comparison, this will be useful to see how far along everyone has been in developing their respective rule sets as of the end of last year.

However, the ultimate purpose of the following data analysis is the thought process that drives it more than the numbers that are produced by it. The conclusion reached will introduce a concept that transcends rule sets or annual rules revisions.

Once this concept and its corresponding data has been derived, we can use it as a universal point of reference that will help set up the second page of this analysis, where we start talking about the Xs and Os differences between the styles of roller derby that might help explain the numbers.

Data and diagrams from different rules variations will be juggled from now on, so note the coloring system in use across the website to avoid confusion. Any data table figure, track diagram, or rules text shaded in pink will always refer to some version of WFTDA rules, as noted. USARS is in blue. MADE is in green. RDCL is in orange. If it needs to be differentiated from the WFTDA, MRDA data will be called out in red.

The source tournaments for this data are as follows:

WFTDA – 2013 Division 1 Playoffs and Champs (80 games, 40 teams, women, flat)
RDCL – Battle on the Bank VI (12 games, 9 teams, women, banked)
USARS – 2013 Regionals and Nationals (38 games, 17 teams, men and women, flat)
MADE – 2013 Derby Ink Invitational (15 games, 11 teams, men and women, banked)

There are different amounts of game data for each set of rules, and each tournament was played under different conditions. This will not affect the analysis significantly, however. All of these events had a really wide variety of teams taking part, from the really great to the relatively bad. The proportion of huge mismatches to equal match-ups was also roughly the same.

All told, the overall range we are working with is similar across all game environments. We can see this in our first chart comparing the final score differentials in the four tournaments played under the four rule sets.


Close games can be very close in any rule set. Blowouts can be very large in any rule set. (Some more than others.) It makes no difference that these games were played under different rules or on different surfaces, played among a lot of teams or very few. No matter what kind of game someone went to see last year, there was a roughly 50% chance, give or take, that they would have seen a triple-digit blowout.

However, this analysis is not particularly interested in how close or far apart the final score between two teams may ultimately be. What we’re after is a measurement of how easy or how hard it is to score points, regardless of the game result.

To help find this out, we can look at the total number of points scored, both teams combined, averaged across all games within the four 2013 tournament data sets.

The logic behind this is that if fewer points are available for teams to compete for in 60 minutes of play, the fight to earn a single point will necessarily be much harder. Lower supply leads to higher demand, and higher demand leads to harder competition to obtain the limited supply.

With this next chart, we can begin to see the differences in how much supply was available in 2013.


Unsurprisingly, the four highest-scoring games referenced here all ended in point-differential blowouts of the magnitude cited in the first chart. On the opposite end of the scale, the lowest-scoring games in the RDCL, USARS, and MADE ended up being in dispute until the final jam.

The lowest-scoring 2013 WFTDA game, a 282-point contest that ended in a 128-point playoff blowout (Angel City 205, New Hampshire 77), was the odd one out. However, the second lowest WFTDA game total, 283 points, was indeed a last-jam thriller: Rocky Mountain 150, Windy City 133.

It kind of makes sense that a lower-scoring game environment will generally be more competitive than a higher-scoring one. If it is hard to score points, teams of similar skill would be unlikely to score or give up a lot of points very quickly. It would also make it at least somewhat more difficult for a good team to grossly blow out a lesser opponent.

Then again, the two WFTDA bouts just cited show you can have a low-scoring blowout and a low-scoring close game. All the same, you could have a high-scoring close game next to a high-scoring blowout.

Still, the general relationship between lower scoring games and more competitive ones is a concept that merits further investigation.

To see if it holds water, we need to factor out which teams are playing against each other, thereby disregarding relative ability levels or final score differences. Without these variables, we can directly observe the constant: the “default” level of competitiveness, as measured in an average game of roller derby between two average teams.

A thought experiment will let us discover this constant.

Hypothetically Ideal Roller Derby

Imagine an “ideal” game of roller derby where players and teams are of ideally equal abilities.

One team’s defensive prowess would be equaled by the other team’s offensive mastery, and vice versa. (That is to say, equal offense and equal defense should cancel each other out.) Equal strategies would be available to all, at all times. Penalties would be relatively low, or at least equal between the teams.

This is the ideal that many in derby today see as the ultimate. Eventually, everyone will improve to the point where there is very little between teams no matter who is playing. Mismatches and blowouts would be rare among top or near-ranked opponents, and it’s likely that a major chunk of games among them would be highly competitive.

What might one of these games look like?


Imagine a game between Gotham and an exact duplicate of Gotham, and you’re halfway there.

If equal teams had an equal chance of scoring points (or stopping points; offense and defense at the same time and all that) it would undoubtedly be a very close game from start to finish. It’s not a stretch to imagine that every single point would be highly contested and hard fought, on every single jam.

Let’s make a quick definition here. For a point to be contested, teams must have an equal opportunity to get that point. Hence the word contest: if everyone who enters a contest doesn’t have the same chance to win, it’s not a fair contest!

In this case, we are talking about a roller derby contest for points, one where both teams should have the same chance to score (win) a point as they do to defend one. Whatever one team does on offense, the other team must be given an equivalent chance to counter with some sort of defense. And vice versa.

In a literal equal match-up of teams, both would have offensive opportunities to find ways to get their jammers clear of the initial pass, no matter what their opponent tried defensively. Both should have a fairly equal chance of getting their scoring players out on the same scoring pass on every jam, therefore.

At the highest level of idealism between two equal teams, both teams would get their jammer out of the pack at the same time. Every jam would almost have to end with a 1-0 (or 0-0!) score due to the constantly close initial passes and neck-and-neck jammer races that would inevitably follow.

To simplify, let’s agree that two realistically equal teams would ideally have a very difficult time getting in more than one scoring pass at a time. This would necessarily limit jam scores to 1-0, 1-1, 2-0, 2-1, 3-0, 3-1, 4-0, 4-1, or other combinations of contested points scoring.

Assuming a low penalty count would result in the majority of jams happening in full packs, and that penalties are ideally balanced (e.g., jammer penalties and blocker penalties have equal effect), full-pack, lower-scoring jams should account for the vast majority of scoring. Enough so that scoring spikes in lopsided, penalty-heavy jams would be rare and would not adversely affect scoring in the long run.

From this thought experiment, we can make a reasonable deduction.

When roller derby is played at an ideal level of competitiveness, the average rate of scoring on a per-jam basis must average out to a number around four points or less, the number of total points expected to be scored in a jam that is equally contested between both teams.

It doesn’t matter which team (or teams) score these points. If the number of total combined points scored divided by the number of jams played calculates out to a number less than 4.00 points per jam (PPJ), every average jam must have been competitive on a fundamental scale.
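The arithmetic behind this figure is simple enough to sketch in a few lines of Python. This is only an illustration: the function name, the `IDEAL_PPJ` constant, and the sample game are assumptions made up for the example, not data from any game cited here.

```python
IDEAL_PPJ = 4.00  # the threshold proposed by the thought experiment above


def points_per_jam(team_a_points, team_b_points, jams_played):
    """Combined points of both teams divided by the number of jams played."""
    if jams_played <= 0:
        raise ValueError("jams_played must be positive")
    return (team_a_points + team_b_points) / jams_played


# Hypothetical game: a 90-70 final score spread over 45 jams.
ppj = points_per_jam(90, 70, 45)
print(f"{ppj:.2f} PPJ, at or under the ideal: {ppj <= IDEAL_PPJ}")
```

Note that the winner never enters the calculation; only the combined total and the jam count do.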

This is our ideal: As far as competitive balance goes, you can’t do much better when both teams have a real, direct shot to score points on every jam.

To put this points per jam “ideal” idea to the test, we can use real roller derby game data from potentially less-than-ideal gameplay.

A Note on Jam Scores and Jam Length

The PPJ measurement we are about to examine is different than the typical jammer scoring stats you may be used to. The PPJ we will be talking about here does not directly reflect what one player or one team scores in a jam or game, meaning it is different than the points per jam stat you might see in Rinxter data.

Instead, the PPJ we’ll be using here is literal: It is how many points are scored in a jam, both teams combined. That is, a jam ending with a 4-0 score and a jam ending with a 2-2 score are equivalent, because four players were lapped for points in each of those jams. Hence, 4.00 PPJ.

This is a universal concept that can go across different rulesets, even though each rule set has different jam lengths. (WFTDA – 2 minutes; MADE/USARS – 90 seconds; RDCL – 60 seconds) PPJ treats them all as “one jam,” because competitive jams usually don’t go the distance. Hit-it-and-quit-it jams will cut short a jam well before its natural conclusion, generally rendering jam length irrelevant.

If not called off early, however, two different situations may play out. One, you could get a lengthy jam if it is difficult for both teams to battle through on the initial and finally get around to score one hard-earned pass, like a 4-0. Two, you might see both jammers circulating and both teams scoring multiple passes, perhaps leading to something like a 14-10 jam score.

As far as points differential is concerned, these are both the same: A 4-point jam win. But again, we don’t care about jam score differentials. From the PPJ perspective, 4-0 (4.00 PPJ) and 14-10 (24.00 PPJ) are very different. Instead, it is the 14-10 and 24-0 (24.00 PPJ) jam scores that are equal.
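To make the distinction concrete, here is a small Python sketch that runs the three jam scores mentioned above through both measurements. The helper names are illustrative only.

```python
def jam_differential(score_a, score_b):
    """How much one team won the jam by."""
    return abs(score_a - score_b)


def jam_total(score_a, score_b):
    """Points scored in the jam, both teams combined (the PPJ view)."""
    return score_a + score_b


# The three jam scores discussed in the text.
for a, b in [(4, 0), (14, 10), (24, 0)]:
    print(f"{a}-{b}: differential {jam_differential(a, b)}, "
          f"combined total {jam_total(a, b)}")
```

By differential, 4-0 and 14-10 pair up. By combined total, 14-10 and 24-0 do.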

This difference is important. Harboring the potential for lopsided, high-scoring jams (such as a 24-0) can overpower the smaller score differentials of many other competitive jams in such a way where the small jam wins (including the 14-10 jams) become less relevant to the final game result.

Higher average PPJ values therefore make any single point less important overall across the entire competitive spectrum of games played. If a large chunk of points can easily be scored in any given jam, there is less of a chance that smaller, harder victories will be meaningful in the long run. When there are fewer jams in a game, creating less opportunity to cobble together enough minor points to counter the effect of a few big scores, then only the big scores really matter.

The opposite is true for lower PPJs. If it is extremely likely that two teams are equally contesting points, no matter their skill level or relative skill difference, minor points are the only points generally available and every single one of them becomes important. Many more of the smaller, harder victories will be meaningful, because they all carry equal weight against the final outcome of a game.
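A made-up example may help show the effect. In the sketch below, Team A wins six straight jams 2-1, then Team B wins a single jam. The jam scores are entirely hypothetical; the only thing that changes between the two games is how big that one Team B jam is allowed to get.

```python
def summarize(jams):
    """Return (team_a_total, team_b_total, combined_ppj) for (a, b) jam scores."""
    team_a = sum(a for a, _ in jams)
    team_b = sum(b for _, b in jams)
    return team_a, team_b, (team_a + team_b) / len(jams)


small_wins = [(2, 1)] * 6                # six narrow jam wins for Team A
high_ppj_game = small_wins + [(0, 24)]   # Team B answers with one 24-0 jam
low_ppj_game = small_wins + [(0, 4)]     # Team B answers with one 4-0 jam

for label, jams in (("high", high_ppj_game), ("low", low_ppj_game)):
    a, b, ppj = summarize(jams)
    print(f"{label}: Team A {a}, Team B {b}, {ppj:.2f} PPJ")
```

In the first game Team A wins six of seven jams and still trails 12-30 at 6.00 PPJ. In the second, the same six small wins keep it ahead 12-10 at roughly 3.14 PPJ.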

This is what PPJ measures: The value of a single point within a roller derby game environment.

Something more valuable is something worth working harder for—supply and demand, remember. We can use PPJ as a proxy to measure the difficulty a team must overcome, on offense and defense, in order to earn that single point over their opponent’s attempt to do the same thing at the same time.

The Universal Constant: Points Per Jam

In examining the average per-jam scoring rates across all games played at the 2013 championship/final tournaments in each of the four rules environments, we can see how close to the hypothetically ideal scoring rate each of them is. From this we can figure out which yielded the most difficult and most competitive fight for roller derby points, when averaged out across all games played.

Remember, 4.00 PPJ is our ideal figure. It is the target number that indicates every point during a scoring pass is being contested by both teams, making every point more difficult for a team to score and that much more valuable to secure.

In looking at the real-world data, however, such an ideal may not be hypothetical after all.


There’s a simple way to picture what these numbers mean in the context of actual gameplay, using the 9.18 PPJ WFTDA average as an example. We know the effect penalties, power jams, passive offense, and an unproven ranking system had on WFTDA playoff games in 2013.

With PPJ, we can both quantify and describe just how significant the effect was.

Penalty-heavy, multi-pass jams overwhelmed WFTDA tournament games so much, it was as if there was nothing that the best teams in the world could do to prevent getting a grand slam (and more!) scored on them in every single jam of every single game at Championships last year. Every single jam! And there were over 500 of them at Champs last year!

The other forms of roller derby have a much more interesting story to tell. Despite not having as many good teams or as deep a field as the WFTDA, the numbers suggest, paradoxically, that all non-WFTDA forms of the game last year were significantly more competitive on a per-jam basis.

USARS and MADE games had average scoring rates at a level under 4.00 points per jam, indicating they are already working in an ideally competitive environment. When averaged out, every pass was contested across all tournament games. Big scoring chances were hard to come by, suggesting no positive point differentials came easily.

RDCL scoring last year was 6.02 PPJ, about two points higher than the target number. However, an average in this range suggests that several RDCL games clocked in under the ideal scoring rate, balancing out those that were well above it, and that a significant number of ideally competitive jams were played there as well.

Don’t forget that all of the tournaments this data is coming from had median final score differentials of 100 points, give or take. Each of the four events had big mismatches and a lot of bad blowouts.

However, this does not necessarily mean these mismatches and blowouts led to hopelessly uncompetitive games. On a fundamental level, points were contested similarly to the well-matched, close games.

PPJ explains why.

Look at the non-championship MADE-rules tournament from last year, the Derby Ink Invitational. It had the largest average adjusted score differential of 131 points, showing its games were primarily lopsided blowouts. (Much of this was due to a poorly-structured tournament and ill-seeded bracket.) However, its average scoring rate of 3.43 PPJ demonstrates that even if final scores were not competitive, jam scores were. All of the time.

Points were highly contested on every jam. During mismatches, the superior team won those contested points way more often. Even so, the inferior team still got a fair chance to also contest those same points. It didn’t actually score the points, obviously, explaining the blowouts. But at least it had the opportunity on offense and/or defense to limit the damage to one scoring pass against on a majority of jams.

It was always going to lose the war. But by playing in a low PPJ environment, an outmatched team knew it always had a chance to put up a fair fight in each little battle.

PPJ averages work best when used with large samples. With enough games to measure, it becomes a strong indicator of how fair the competition for points is, independent of what teams may be playing.

For example, below is a table of PPJ averages from other breakdowns from the 2013 WFTDA Division 1 tournament season. It shows an interesting pattern.

Focus on sets of games that exclude extreme blowouts (like the 509-64 Gotham massacre of Ohio at 2013 Championships). Or only calculate the extreme closest games, those that finished with the lowest score differentials. Sample divisional games between just the top teams, those that played among each other for a ticket to Championships. Or just look at the two best teams in all of roller derby last year, Gotham and Texas, in their battle for the Hydra.

All of these games had good, well-matched opponents. You might assume that having mostly equal, more-ideally matched teams would lead to lower per-jam scoring rates, when averaged out, since their offensive and defensive abilities would be better at cancelling each other out on any given jam.

But actually…


It makes sense that PPJ would be higher and scoring easier during the most extreme mismatches, and that shows in the tournament games that were the most extreme blowouts. Yet the opposite did not apply in the WFTDA last year.

Compared to the others, the values were not significantly lower in the extreme closest games played during the 2013 playoff season. Nor during the most competitive games at 2013 Championships, where converging brackets would be expected to make matchups more equal and scoring more competitive.

In fact, if you take a points per jam average of the whole 80-game 2013 WFTDA D1 tournament season, you get a number that’s exactly the same as the number seen at Championships: Just under 9.2 PPJ.

This is the fascinating thing about the PPJ concept. What cross section of games you look at doesn’t really matter. With enough data, it will always average out to around the same value.

Take a group of good teams playing close games. A set of average teams in typically random bouts. Contrast men’s teams to women’s teams. Take data from a seeded tournament or a multi-bout event.

It makes no difference. In the end, the PPJ numbers will always be similar.

We can see this is the case with games played during other events in 2013, for which per-jam data is available.


Some events here have lower scoring rates, MRDA Championships 2013 (7.05 PPJ) in particular. But those are balanced with the ones with higher scoring rates, like Midwest Brewhaha (10.17 PPJ).

A high concentration of elite teams playing one another (Golden Bowl III, 8.28 PPJ) does not make jam points any more or less contested than in games among a conglomeration of different teams of various skill levels (ECDX, 8.28 PPJ).

Packing seeded teams very close together (D2 tournament, 8.94 PPJ) did not produce a significantly different scoring rate than having a broader spectrum of ranked opponents facing each other (D1 tournament, 9.17 PPJ).

Comparing men’s play and women’s play, the scoring rates aren’t significantly different when everything averages out over several games and events. Spring Roll 2013 is a clear example of this, with just a two-tenths point average difference between the genders.

The previously cited PPJ numbers from the USARS (3.70 PPJ) and MADE (3.43 PPJ) tournaments, which combined games from parallel competitions among men and women, produce a similar equality when the genders are separated. Less than 1.00 PPJ difference for both, in fact, although a scant number of men’s games means the separated figures may not be completely reliable.

To address the issue of small sample size and attempt to arrive at a definitive proof of the PPJ concept, let’s look at the biggest set of data possible.

Thanks to Flat Track Stats and its meticulous attention to recording game statistics, jam-by-jam data is available for over 700 WFTDA and MRDA games played between February 2013 and January 2014.

We’re talking WFTDA sanctioned, MRDA sanctioned, and unsanctioned; close bouts, typical games, and blowouts; between elite teams, average teams, and bad teams; played in single games, multi-bout weekends, or seeded tournaments. All of it.

Get this: The calculated average scoring rate for all cited games in the FTS database is 8.92 points per jam.

For all intents and purposes, this is identical to the 9.17 PPJ scoring rate seen during the whole of the 2013 WFTDA playoffs and the 9.18 PPJ average at Champs 2013.
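For a rough sense of how close these three figures sit, the sketch below compares them in Python. The PPJ values are the ones quoted in this article; the unweighted mean of the three is a simplifying assumption for illustration, since the samples differ greatly in size and overlap one another.

```python
IDEAL_PPJ = 4.00  # hypothetical ideal from the thought experiment

# Averages quoted in this article for WFTDA-style play in 2013.
wftda_style = {
    "Flat Track Stats database": 8.92,
    "D1 playoff season": 9.17,
    "Championships": 9.18,
}


def spread(values):
    """Gap between the highest and lowest average in a collection."""
    values = list(values)
    return max(values) - min(values)


gap = spread(wftda_style.values())
# Unweighted mean: an assumption, not how the article derives its figures.
mean_ppj = sum(wftda_style.values()) / len(wftda_style)

print(f"gap between samples: {gap:.2f} PPJ")
print(f"mean of the three averages: {mean_ppj:.2f} PPJ")
print(f"mean relative to the ideal: {mean_ppj / IDEAL_PPJ:.2f}x")
```

The gap between the largest and smallest sample is about a quarter of a point per jam, while all three sit more than twice as high as the 4.00 PPJ ideal.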

Despite the vastly different circumstances they happened under, WFTDA-style games played last year had a 9.00 PPJ scoring rate in common. On the track, this translates to an uncontested grand slam and a contested full scoring pass in every single jam of every single WFTDA and MRDA game across all of 2013.

No matter how good or equal teams were, this many points were expected to be scored every time the start whistle blew. Over 30,000 jams. Nine points scored in every single one, when averaged out.

Like, woah.

Mind: Blown.


Who is playing a game, or how close it is, is completely irrelevant to the absolute measurement of point-by-point competitiveness that is points per jam.

The reason for this spooky commonality between so many different games? It is because of the only other thing all these games have in common:

The rules the games were played by.

A direct correlation between the overall “default” level of competition (for points) and the rule set in use is the only possible conclusion that can be reached here. It couldn’t be a skill or strategy problem, because the scoring rate is the same between teams of low skill or high skill. It’s not a ranking or seeding problem, because moderate mismatches and relatively close games also produced roughly the same PPJ figures.

It has got to be WFTDA rules that last year caused so many WFTDA games to have many jams that were like mini-blowouts, even in games between well-matched teams—to say nothing of the teams that were not so well-matched.

– – – – – – – – –

With this revelation, we can start making some deductions about the other derby rule sets in this analysis.

First, we can posit that RDCL rules are the reason why an average RDCL game saw around 6 points per jam during Battle on the Bank last year across all its games. Boiling this down to one example average jam, this would be like a bad (“bad”) team always having enough opportunity on offense to free their jammer just after the lead jammer gets a grand slam. Sometimes, later than that. But often well before.

We can also say that USARS and MADE rules are the reason why every single scoring pass, on the average, was equally contested between all the teams in their respective 2013 tournaments. Even if you include in the average the games where jam scores and final scores were terribly lopsided, you still wind up with an overall game environment where every team has a real chance to score points and a real chance to prevent them from being scored.


Many thanks to Flat Track Stats for providing raw WFTDA data spreadsheets, which were invaluable in preparing this analysis.

Further, we can theorize that in the RDCL, MADE, and USARS, a majority of common, non-tournament games played under their 2013 rule sets featured much more competitive derby on every jam. Their tournament PPJ numbers strongly suggest this.

The scoring rate sticks to the same number whether it happens inside or outside of tournament play, as Flat Track Stats WFTDA data has proven. There’s no reason to suspect it would not do the same for other forms of roller derby.

Which leads us back to the idea of “competitive” roller derby.

In a roller derby contest, teams are competing for points. To make this contest for points fair, you must guarantee that teams have an equal chance to get them every time the start whistle blows. (Otherwise, it’s not really a contest!)

Ranking systems or seeded tournament formats cannot always install this guarantee of fairness. How equal two similarly-ranked teams really are is still a big variable at this stage, so just putting them together and hoping a competitive game will result is an unreliable strategy, especially outside of tournament play.

The only foolproof way to make sure teams and players can always compete for roller derby points, and have an equal opportunity to play offense and defense at the same time, is to guarantee it through game rules.

The rule book in use is the only thing that is consistent across all contests. Just like how basketball’s shot clock rules guarantee both teams a chance to score (and force them to play offense and defense, whether they want to or not), there are roller derby rules that do the same.

Of course, the difference with roller derby is that all five players on both teams are made to play offense and defense simultaneously. Certain rules and their derived gameplay elements are necessary to make sure roller derby is always being played—whether a team wants it to be played or not.

As we will next explore, there are multiple ways rules can achieve this.


20 responses to this post.

  1. As a WFTDA rules skater, I really like some of the aspects of the other three rules sets such as the verbal warning for cutting and always having to be doing offence and defence at the same time. But watching the Your Mom v Oly game, it did feel at times like I was just watching a load of men race each other around the track!
    I like fast derby but I think slow derby can be awesome to watch and take part in. I’m not sure how the two could be combined but I agree that, as a spectator, massive blow outs and the feeling of “Well, we all know how this jam is going to pan out” is not good and will potentially drive fans away.
    There are a lot of problems with WFTDA and it seems very fashionable to bash it at the moment. But I’m not sure everyone switching to USARS is the answer. I would be interested to have a go at playing under one of the other rule sets however.
    Good article but it did take me a LONG time to read! :D


    • You’re right about the guys just racing around the track. Last year, USARS teams killed a power jam by speedskating away once they got around the (bad) defense of the other team. This year it is a much more rare occurrence because USARS added the requirements that jammers and pivots start every jam, among other tweaks. This means a team that just endlessly skates in circles is effectively throwing away a chance to score points. Running away from your own jammer is a dumb, dumb strategy, and no successful team will ever do this except in end-game situations.

      Your desire to “combine” fast derby and slow derby is exactly what other rulesets do: Both teams have an equal chance to play the game at the speed they want to play it at. Want to speed things up? Get around the other team and earn the forward positioning necessary to do that. Want to slow things down? Get around the other team and earn the forward positioning necessary to do that. WFTDA teams can always slow things down, but they can’t always speed things up (see: power jams). In truth, there is no such thing as “fast” derby or “slow” derby. Only fair derby.

      Anyways, this isn’t about roller derby “switching” to a different rule set. It’s about making sure everyone that plays roller derby, is actually playing roller derby. If you haven’t seen my seminar from RollerCon last year, I highly recommend you check it out, here. It’s long (75 minutes) but it should explain what I mean by that.


  2. Thanks so much for the excellent article!


  3. Am I missing something obvious here, or are the 2013 MADE tournament stats really claiming an average combined score of 199, with an average score gap of 131? That’s saying the winning team wins by almost 5:1 on average, even without scaling.


    • No, that’s correct. You can view the stats on my Google Drive, here.

      Having personally attended this tournament, I can tell you that the actual roller derby that happened on the track was fantastic. Everything else about the tournament, however, was not. A very, very poorly structured bracket (the organizers actually asked ME for suggestions on what could be done to change it—halfway through the tournament), a venue that was terrible, a promoter that did not make good on the promised prize monies, and a lot of angry players at the end of it.

      But still, 3.43 PPJ. I’d see it again in a heartbeat.
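The "almost 5:1" figure in the question above can be checked with a little arithmetic. The sketch below assumes the two quoted averages (199 combined, 131 gap) can be treated as the sum and difference of an average winning and losing score; the function name is made up for illustration, and a ratio of averages is not necessarily the same as the average of each game's ratio.

```python
def split_scores(combined, gap):
    """Recover winner and loser scores from their sum and their difference."""
    winner = (combined + gap) / 2
    loser = (combined - gap) / 2
    return winner, loser


# Figures quoted in the comment: average combined score 199, average gap 131.
winner, loser = split_scores(199, 131)
ratio = winner / loser

print(f"winner {winner:.0f}, loser {loser:.0f}, ratio {ratio:.2f}:1")
# prints "winner 165, loser 34, ratio 4.85:1"
```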


  4. Posted by Captain Lou El Bammo on 2 September 2014 at 11:35 am

    Your maths are wrong for the analysis you wish to make. As is your definition of competitiveness.

    Using your own numbers, the WFTDA model provides for much more competitive games over the other rulesets.

    You should be using the PPJ numbers to bring all the rulesets into comparable tables so that you can see which rules actually create the biggest gaps/blowouts/discrepancies.

    Both MADE and USARS trail far behind the WFTDA on the competitiveness scale when you compare apples to apples. This is an analysis you could have done yet it does not fit your agenda. Sadly, for you, the facts and numbers don’t lie. The WFTDA numbers are superior for the datasets that you provided.


    • Posted by Costa Ladeas on 23 October 2014 at 3:56 pm

      Yea cause “everybody” is playing WFTDA rules because WFTDA is the “cool kids” club. “Everybody” wants to hang out with the “cool kids” regardless whether the product is good or bad which should tell you a lot about people generally speaking.


    • Posted by theoriginaldonald on 23 October 2014 at 6:50 pm

      Call me November 3rd when the FIVE TIME! FIVE TIME! FIVE TIME Jennifer Wilson Memorial Trophy winner Gotham Girls Roller Derby crush all three of their opponents by AT LEAST 100 points each


  5. Posted by broei on 3 September 2014 at 1:00 pm

    In MADE, the Jammer or Pivot can take off, uncontestedly, on a scoring run during a No Pack situation. This makes it imperative that you also keep two opposing Pack Players behind you. I do not see this reflected in the corresponding diagrams (Minimum defense required…). I think this also could explain the lower PPJ rate in MADE compared to USARS.

    Relatedly, creating a No Pack under MADE rules is not only possible but also goes unpunished, as this recent bout shows (at the ends of clips 29, 39 & 45):

    Of course, no Jammer will score points during a No Pack, but if you’re still on your initial pass, that doesn’t matter.


  6. Posted by Mike on 23 September 2014 at 10:57 am

    The link between PPJ and competitiveness is weak at best.

    I can make a game have any typical number of points. The teams are always looking to have more than half of them. If that means they need 2 points or 200 points, they are going to work just as hard to get over half of the points and win the game.

    Raising or lowering PPJ won’t do anything to make a game more or less competitive.

    In a competitive game, the teams split the available points fairly evenly. Whether that is a lot of points or very few doesn’t matter, how evenly those points are divided does.


    • I can make a game have any typical number of points. The teams are always looking to have more than half of them. If that means they need 2 points or 200 points, they are going to work just as hard to get over half of the points and win the game.

      Well, yes, but you’re talking about how hard teams work relative to each other, and the ratio of points between them. I’m talking about how hard a team must collectively work to score points in the style of game being played before adjusting that difficulty level up or down based on the skill level of the opposition.

      Let’s think about what you said here, then take it a step further:

      In a competitive game, the teams split the available points fairly evenly. Whether that is a lot of points or very few doesn’t matter, how evenly those points are divided does.

      In a competitive jam, the teams will wind up with fairly even jam scores more often than not. Game scores can be close whether they are low or high, but the key is that jams can only typically be close if they are low-scoring, due to the team with lead calling off the jam early. Naturally, the WFTDA is the exception to this, with full 2-minute jams where both teams score a lot of points that cancel each other out equally.

      If the two teams were really equal, jam scores would more often reflect that. But you’re saying a (simplifying) 24-0 jam followed by a 0-24 jam would result in a competitive game, since the scores cancel out 24-24.

      But then, wouldn’t a 14-10 followed by a 10-14 be even more competitive? In this kind of jam, both teams would be dueling on offense and defense simultaneously. The scores would reflect that one team was slightly better in one jam, and the second team slightly better in the other.

      This distribution of scoring would be much more in line with the equality level of teams playing against each other, in the context of the game being played. However, if you have a style of game where 14-10 jams and 24-0 jams are possible (24.00 PPJ), you run the risk of one uncompetitive 24-0 jam wiping out 6 competitive jams’ worth of 14-10s.
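To make the "one 24-0 jam wipes out six 14-10 jams" arithmetic concrete, here is a minimal sketch. The jam scores are the hypothetical ones used in this comment, not data from a real game, and the helper name is invented for illustration.

```python
def jam_margin(score_a, score_b):
    """Net points Team A gains on Team B in a single jam."""
    return score_a - score_b


# Team A wins six competitive jams 14-10, then loses one blowout jam 0-24.
close_jams = [(14, 10)] * 6
blowout_jam = (0, 24)

margin_after_close = sum(jam_margin(a, b) for a, b in close_jams)
margin_after_blowout = margin_after_close + jam_margin(*blowout_jam)

print(margin_after_close)    # 24: six jams each won by 4 points
print(margin_after_blowout)  # 0: the single blowout erases all of it
```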

      The reason why low(er) PPJs help keep games more competitive overall is because the competition level of each individual jam more accurately resembles how competitive the teams actually are against each other. Two very equal, very competitive teams should be partaking in very equal, very competitive (and therefore, low-scoring) jams, in such a way that “big” scoring jams should be hard to get.

      Hard enough, so that if a big score does happen, it doesn’t render the competitive jams obsolete. I argue that this “default” level of scoring difficulty is something around 4.00 PPJ, because two equal teams shouldn’t be scoring multiple passes on each other if they were really that equal.

      This way, two equal teams and two slightly mismatched teams would still be dropping single-pass scores on each other, it’s just that the bad team in a mismatch has more tools and more time to put a competitive effort in to try and score points, checking against blowouts and making sure a team that loses a few “big” jams can make up that deficit with a realistic number of “small” jams.

      Roller derby played in a high PPJ environment doesn’t show how good two teams are in a duel against each other, but rather which one scored better when it was their turn to pop shots at a shooting gallery. Big difference.


      • Posted by mike on 25 September 2014 at 11:04 am

        We’re going to end up disagreeing on some points as what we like in a sport is different. And I would say that is a statement that multiple rule sets existing is good as it allows us to both have games going on that we really like.

        So on to some disagreement/suggestions. :)

        There are some features of WFTDA play that make PPJ less informative.

        Take a look at Berlin v Rideau Valley at the Kitchener Tournament. I picked this one because it has a very high PPJ and a very low score differential. So this is a game we would disagree on the competitiveness of.

        483 combined points (243-240). 47 jams. PPJ = 10.3
        A high PPJ which you would consider to be a lot of noncompetitive jams.

        However, since WFTDA play often has jams where both teams score points, the PPJ is inflated.

        If I only consider jams where one team scores, what you indicate is what we should see in a competitive jam, the PPJ drops to 4.59.

        The high PPJ is coming from play where there are scoring passes for both teams in a single jam.

        If I replace PPJ with the average of the difference in points scored per jam, I get a score difference per jam of 5.30. Roughly half of the overall PPJ.

        The high PPJ isn’t reflective of lots of blowout jams and wild scoring swings. It’s indicative of a style of play where teams regularly make multiple scoring passes whether or not their opponent is also on a scoring pass. Having one jammer score 8 and another score 4 only shifts the score difference by 4 points, but it pushes the PPJ up significantly.

        The last jam of Rideau Valley v Berlin was a 38 point jam. 20 points for Berlin and 18 points for Rideau Valley. That jam raised the PPJ by 0.7 all by itself. But it was a jam where both teams competed for points and the score differential for the jam was only 2. I propose that you should view that jam as nearly perfect competition since both teams scored nearly equal points even though it was 7.9% of the total game score in a single jam.

        I would suggest that score difference per jam is a better stat for what you are trying to argue since that allows for play where both teams are successfully competing for points.

        (Numbers pulled from rinxter.net and FTS)
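The two stats compared in this comment can be sketched as follows. This is only an illustration: the function names are invented, the game totals (243-240 over 47 jams) and the 20-18 jam are the ones quoted above, and the 8-4 jam is the hypothetical example from the same comment. The full jam-by-jam data needed to reproduce the 4.59 and 5.30 figures is not included here.

```python
def points_per_jam(combined_points, jams):
    """Average combined points scored per jam (PPJ)."""
    return combined_points / jams


def mean_jam_difference(jam_scores):
    """Average absolute score difference per jam."""
    return sum(abs(a - b) for a, b in jam_scores) / len(jam_scores)


# Whole-game figures quoted for Berlin v Rideau Valley.
combined = 243 + 240
game_ppj = points_per_jam(combined, 47)
last_jam_share = (20 + 18) / combined

print(round(game_ppj, 1))              # 10.3
print(round(100 * last_jam_share, 1))  # 7.9 (percent of the game's points)

# Two jams where both teams score: high PPJ, small difference.
two_jams = [(20, 18), (8, 4)]
print(points_per_jam(sum(a + b for a, b in two_jams), len(two_jams)))  # 25.0
print(mean_jam_difference(two_jams))                                   # 3.0
```

In the two-jam example the PPJ is 25 while the average difference is only 3, which is the gap between the two measures that the comment is pointing at.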

      • However, since WFTDA play often has jams where both teams score points, the PPJ is inflated.

        The high PPJ is coming from play where there are scoring passes for both teams in a single jam.

        I realize that WFTDA’s PPJ is higher in part due to the 2-minute high-scoring jam. I said as much in the analysis and in my previous comment. However, you can’t ignore that high PPJ also comes from play where there are scoring passes for only one team in a single jam.

        The high-scoring but competitive 20-18 jam from the Berlin/Rideau game you cited also had high-scoring and uncompetitive jams of 27-4 and 23-0. Both types of jams happen often across all WFTDA games played.

        I’d wager that the lopsided jams happen more often than the ones that even out, when you look at the bigger picture.

        If I only consider jams where one team scores, what you indicate is what we should see in a competitive jam,

        Not what I said a competitive jam is. By your logic, a 1-1 jam wouldn’t be competitive, because both teams scored. But a 23-0 jam would be, because only one team scored. That is clearly not correct.

        A competitive jam is simply one where both teams have an equal chance to score points, and both teams have an equal chance to defend points. Any equal jam score (0-0, 1-1, 4-4, 10-10, 20-20) would have to be extremely competitive, because the offense and defense of both teams are cancelling each other out on the scoreboard in that contest for points.

        However, part of the territory that comes with ruleset-wide low PPJ averages is that there is always—ALWAYS—a lead jammer that can call off the jam after scoring points. In that situation, they almost always do, as to prevent the opposition from scoring and cancelling out the fact that they got to the pack to score points first. So of course the other team isn’t going to score on many jams!

        You argue that a 20-18 jam is competitive. Of course it is. Yet your argument disregards the reality that the same gameplay environment produces 23-0 or similarly ridiculous uncompetitive jam scores. The major issue many people have with WFTDA derby is that the 1, 2, 3, and 4-point differences earned in the very competitive, very difficult jams, are rendered irrelevant by the 15, 20, and 25+-point differences that come in very uncompetitive, very easy jams.

        Basically, in the vast majority of WFTDA games, the exciting hard-fought point differential wins are inconsequential to the final game result, because the boring, easy point differential wins are often the only ones that really matter.

        A low PPJ environment keeps the competitive score differences, but takes away the more easily-attainable uncompetitive score differences by making them proportionally more difficult. Your 20-18 jam in a WFTDA game would always be a 2-0 (or 3-1 or 4-2) jam elsewhere. If a team in a low PPJ game wanted to get a 23-0 score, however, that would be almost impossible simply because the difficulty level of doing so would be off the charts.

        In a big USARS mismatch, for example, a very good team would need to keep the pack moving forward as slowly as possible while simultaneously containing 2 or 3 opponents on defense. There were some pretty big mismatches at USARS Nationals this past weekend, but even so, the biggest jam, which only happened two or three times all weekend, was a 20-0. That seems about the upper limit for a USARS jam, because scoring that many points in a jam is very difficult, even for the best teams in the world playing against those way down the pecking order.

  7. Posted by Mike on 13 October 2014 at 9:15 am

    This is the area where we’re going to just end up disagreeing.

    I view competitiveness on the scale of the entire game, not on the scale of an individual jam.

    To me, competitive means I don’t know who is going to win through most of the game. No matter what the point scoring structure is, the trailing team could reasonably be able to come back and challenge the lead.

    If the losing team was able to remain in position to threaten the lead for at least 75% of the game, I view it as a competitive game. The size of the score isn’t what interests me, and the single jam scores only play into it in that they determine how large of a gap can reasonably be overcome.

    So to me, a game where I knew who was going to win by halftime isn’t competitive at any PPJ.


  8. Posted by Tim on 16 October 2014 at 7:17 pm

    The real issue is that WFTDA roller derby is BORING. The points per jam and the competitiveness of any jam or bout are just symptoms of the real problem. The real problem is that the skaters are not skating, and they are definitely not skating with any speed or skill. They are stopped, they are standing, or they are even going backwards, they are ignoring 80% of opponents on the track, and THIS IS BORING.

    The current rules give the advantage to the team whose blockers are in the back and stopped and completely ignoring everyone on the other team except their jammer. This same problem is the cause of both why it is so easy to score in WFTDA roller derby and why it is so boring. The reason it is so easy to score is because the pack barely ever moves forward. It is now common to have entire 2 minute jams where the pack does not complete one lap of the track. The pack never completes more than a few laps of the track anymore. So since a jammer scores by lapping opponents, and those opponents are not moving, all the jammer is really doing is skating a few laps with opponents acting as glorified traffic cones in the way.

    When I attended my first roller derby bout back in 2008 and up through about 2011 for the roller derby league in my area, everyone skated fast. They did not know about bullshit passive offense, and so they were ignorantly playing fast, skillful, and EXCITING roller derby. Scoring was much more difficult back then too. This was because the opponents you needed to pass in order to score were always skating away from you. In order to make a scoring pass, a jammer had to circle the track several times to catch back up to them. The pack was often circling the track at about 10 seconds per lap, and the jammer was skating even faster. They were skating, blocking, dodging, and jumping at breakneck speeds which was EXCITING.

    So the reason it is so easy to score is because with bullshit passive offense, if a jammer is going to make 4 scoring passes, she only needs to skate 5 laps since the pack is not going anywhere. Before bullshit passive offense, a jammer that made 4 scoring passes was circling the track 25 times at incredible speeds, literally flying around the track, and the blockers were skating fast too.

    You can talk about the number of points scored, the closeness of jams or bouts, the competitiveness of any bout, the way the rankings are calculated and all the other symptoms of the real problem, which again, is that WFTDA roller derby is now incredibly BORING because 4 out of 5 of a team’s players are not skating at all, and the jammer is not skating all that fast. They are in fact encouraged not to skate by the perverse incentives in the current WFTDA rules.

    The way roller derby was played when I first saw it was fast and exciting. I imagine that the same is true for everyone that is a fan of the sport, or at least was a fan at one time. Why do WFTDA’s own promotional videos feature fast skating action rather than the prevalent bullshit passive offense? The fast roller derby is what I want to see come back. It baffles me that the skaters in WFTDA are apparently against rule changes to effect this. Maybe bullshit passive offense really is incredibly fun for the skaters, but I don’t believe that. How can it be fun for the skaters, when they aren’t even doing any skating? It does not look fun to me, and it is absolutely not fun to watch.

    Windy Man, after finding and reading a lot of your posts, I am pleased that someone at least is pushing on this subject with some intelligence and suggestions for improvements. The other rule sets appear to have some solutions to the problems in WFTDA, although I’ve never seen any of those rules used in person. But it can’t get much worse than it is right now, so I am left hoping that either WFTDA starts adopting these rules or that my local league switches to these other rules.


  9. Posted by mike on 20 October 2014 at 2:58 pm

    Passive offense in WFTDA is a passing phase that is already passing.

    If you want to see WFTDA games without it, watch the WFTDA Championships at the end of the month.

    The problem that all the rule sets are seeing and responding to is that roller derby is an incredibly difficult sport to play. Simultaneous offense and defense is not something most teams are able to effectively deploy.

    Some of the rule sets respond to the problem by disabling defense. The active scorer rule where the pivot starts in front of the blockers essentially removes defense as a viable strategy so that teams gain little from defense. You get a more offense oriented game because all the pivot has to do is loiter at the front of the pack and avoid engagement and they can completely negate the blocker’s efforts to contain the jammer. You get passive play in front instead of in back of the pack.

    The WFTDA rule set rewards defensive play. This has resulted in the short term in slower games because teams have focused on defense as there is much to gain by developing a good defense.

    What you see at the top of the sport is defenses that have become so strong that they cannot be effectively penetrated without offense. The teams that then move to the very top are the ones that can deploy effective offense while still maintaining a strong defense. When top end WFTDA teams meet, the stronger offense wins since they both have staggeringly powerful defenses that even the best jammer cannot reliably penetrate alone. This is leading to faster play at the top end since passive offense against a division 1 team and an increasing number of division 2 teams is a losing strategy.

    The challenge is how to find effective methods to move these offensive strategies further and further down the rankings so that more and more teams are able to make the transition from playing purely defensive derby to playing a mix of offense and defense. And that’s a challenge since making that rapid transition while maintaining a cohesive pack is very difficult.


    • Posted by Costa Ladeas on 23 October 2014 at 3:51 pm

      from your lips to God’s ears Mike.


    • The active scorer rule where the pivot starts in front of the blockers essentially removes defense as a viable strategy so that teams gain little from defense. You get a more offense oriented game because all the pivot has to do is loiter at the front of the pack and avoid engagement and they can completely negate the blocker’s efforts to contain the jammer. You get passive play in front instead of in back of the pack.

      Not true. If a pivot “loiters” at the front of the pack, she is ignoring her defensive responsibilities, to the detriment of her team. (I wrote about this in a previous article on strategy, which you can see here.) For example, the pivot is the last line of defense against the other team breaking out first for lead jammer. If the pivot continuously allows the opposing jammer to break out of the pack first, her team will always be second into the pack, will not score very many points, and will probably lose the game.

      Also, the job of the defense isn’t to “contain the jammer.” It’s to contain the opposing team, of which the jammer is only one member. Stopping the opposing jammer is important, in that it ensures your team will get your jammer out first for lead. But if your team wants a big jammer lead, it will need to play even better defense to contain the opposing pivot, too. If your pivot is “loitering” at the front and not actively defending the opposing pivot, you’re not going to get much out of your lead jammer advantage, are you?

      So yeah, there can be passive strategies by pivots at the front of the pack in MADE/USARS. But they are often losing strategies.

      The challenge is how to find effective methods to move these offensive strategies further and further down the rankings so that more and more teams are able to make the transition from playing purely defensive derby to playing a mix of offense and defense.

      Notice how you kept saying that at “the top of the sport,” or with the “top end WFTDA teams,” things appear fine. Great! But these teams are a small minority in the WFTDA, and an even smaller percentage of teams that play by WFTDA rules overall.

      There’s a misconception out there that boring/uncompetitive derby in the WFTDA will go away once the “bad” teams get better at playing defense, thereby forcing more of their opponents to have to play offense. As I explained in my analysis, that is a very unreliable strategy. The level of teams playing one another any given day is variable, and you can’t rely on a variable if the goal is to see offense and defense being played consistently across the entire spectrum of games played, especially when so many teams will never approach anything close to top-level derby.

      As I said, the only way you can guarantee consistent competition is to describe it in the only thing that is consistent everywhere: The rules. A rule set needs to work for everyone, not just those that are very good. A rule set needs to foster competition between teams regardless of how close or mismatched they may be, not just between teams that are ranked near each other—and can afford to travel to peer opponents regularly.

      And that’s a challenge since making that rapid transition while maintaining a cohesive pack is very difficult.

      No shit, it’s supposed to be difficult. This is roller derby, a game where a team is expected to play offense and defense at the same time, which is impossible. But that’s the point: It’s equally impossible for both teams. The trick is to make sure both teams have a fair opportunity to play offense and defense simultaneously and at all times, regardless of the level of their opponent or current situation on the track.


  10. Posted by Mike on 3 November 2014 at 1:01 pm

    If teams of wildly different skill play each other, the game should not be competitive. The teams won’t have equal ability to score since they don’t have equal ability.

    Competitive play requires teams of similar skill.

    You don’t need to build the incentives for teams of similar ability to play each other into the rules. The WFTDA builds it into their rankings calculator. A team is only going to hurt itself playing teams far above or far below its ranking. For sanctioned play, this encourages teams of similar skill to play each other across the entirety of the organization.


  11. […] I had some great feedback from Part 1, including an absolutely amazing article from WindyMan (https://windymanrd.wordpress.com/2014/09/01/points-per-jam-roller-derbys-default-difficulty/) that I suggest everyone check out, and it is time for me to continue to the second part in this […]

