2017_18 ISU Judging Anomalies | Page 20 | Golden Skate

whatif

Medalist
Joined
Feb 20, 2014

LOL, I particularly like the irony in this statement:
“At Skate Canada, we have a history of very professional judging that’s very fair, and we’re proud of that. I feel that, as Canadians, when you win in [the] Olympics, it’s when you deserve it, and we feel like these Olympics medals, that we deserve [them].”

Oh Canada, never change.

I am happy that there were no Russians involved and we are still discussing judging conspiracies. Just shows how doomed this sport really is.
 

Metis

Shepherdess of the Teal Deer
Record Breaker
Joined
Feb 14, 2018
Now, that is a long post.
I do agree with your hypothesis; however, I'm not sure that you can assume independence in the sense you have written.
Judges have a meeting prior to an event, so they are biased towards something. I think that this kind of thinking by a judge will take time, and they don't have enough time to process it correctly.

Shepherdess of the teal deer, here to serve. 🦌🦌

I don’t believe judges are independent (have no knowledge of each other’s preferences, scoring history, nationality, etc.). We know they aren’t, actually. But even if you start from the assumption that they are and remove peeking, I’d argue you still wind up with the end result being that the “optimal” scoring option isn’t honesty, but to try to uncap the high end of the distribution for a higher overall average. Which means the rational choice for each actor (judge) produces an irrational outcome (the skater with the smartest judging panel wins, everything else being equal in a thought experiment), assuming everyone is a rational actor, which doesn’t seem to be the case and/or academic exercises are tidy on paper/forums and harder in the real world. I would argue, though, that the entire system is failed when the “best” option is not to mark someone honestly, but to try to exploit the trimmed mean.

Again, this is something of an academic point, although I’d throw it in the “untested hypothesis” bucket for explaining the meteoric rise in PCS, especially when given the average during an event. Using a trimmed mean also creates a disincentive to mark anything a 10, except in the case of very few performances, and even then, one of those 10s will always be tossed. So there’s also an artificial compression of the range of values available to the judges, whereas 6.0s could be handed out like candy... which did at least allow some performances to be given majority or even unanimous top marks when they deserved them. Even if it took a miracle for a panel to give unanimous 6.0s.

Metis:
I guess the question is, what do international judges see as their task?
And that’s the 64,000 dollar question, isn’t it? And, oddly, one reason why ordinals made life easier: when the US judge lowballed Soviet and East German skaters and they did the same in return, nobody was confused by what was happening.

Before addressing the rest of your (stellar) post, I just want to say: nothing I’ve posited about looking at judging via game theory requires intimate knowledge of the subject. You don’t need to understand the terms “rational actor” and “iterated game,” let alone be able to decipher the crazy work that goes on at the deeper end of the field and is way out of my league to handle (and I don’t even want to think about what someone with enough time, motivation, and math skills could do in terms of creating probabilistic models for judges). These are the points I’ve been trying to make with regard to PCS:
1. There’s a “metagame” to scoring, and the metagame does not reward “honest” (for any interpretation of the word) scoring.
2. The optimal strategy is to preserve the highest marks for the average. This isn’t game theory, just observation. BuzzFeed noted it in their article: “But very high scores can still influence the outcome by preventing the next-highest score for each part from being discarded.” (https://www.buzzfeed.com/johntemplon/by-voting-for-their-own-figure-skating-judges-may-have)
3. The current scoring system has been in place long enough to allow judges to understand the metagame. This does not mean every judge has to score skaters in the “optimal” way each time, but it does raise questions about the marks we consider to be “too high” — was it actual biased over-scoring or metagaming? They may look the same, but the solutions to each problem are different. (It also means we have enough data to do a systemic analysis over time and look for trends.)
4. Because the rational choice is to try to save as many high marks as possible for the average, the current system is fundamentally broken. For the judges themselves, we’re looking at what amounts to a version of the prisoner’s dilemma — assuming the desired outcome is “whomever skates best wins,” if each judge could trust that every other judge was scoring as honestly as humanly possible, there would be no need for a given judge to try to influence the average. Since that’s not the case, for many of the reasons you listed, the most rational move is to play the metagame, creating a scoring system in which the most rational choice each individual can make produces an irrational result in the aggregate and a worse possible result than if everyone just were as honest as possible. (Prisoner’s dilemma overview here.)
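
To make point 2 concrete, here is a minimal sketch in Python. It assumes a nine-judge panel where the single highest and single lowest mark are thrown out and the rest are averaged; the marks are invented for illustration, and `trimmed_mean` is just a helper name, not anything official.

```python
def trimmed_mean(marks):
    """Drop the single highest and single lowest mark, then average the rest."""
    if len(marks) < 3:
        raise ValueError("need at least three marks to trim one from each end")
    kept = sorted(marks)[1:-1]
    return sum(kept) / len(kept)


# Invented marks for one component from a hypothetical nine-judge panel.
honest_panel = [8.5, 8.75, 8.75, 9.0, 9.0, 9.0, 9.25, 9.25, 9.5]

# Same panel, except one judge who would honestly give 9.0 gives 10.0 instead.
strategic_panel = [8.5, 8.75, 8.75, 9.0, 9.0, 10.0, 9.25, 9.25, 9.5]

honest_result = trimmed_mean(honest_panel)
strategic_result = trimmed_mean(strategic_panel)

print(f"honest panel:    {honest_result:.3f}")
print(f"strategic panel: {strategic_result:.3f}")
print(f"gain from the discarded 10.0: {strategic_result - honest_result:.3f}")
```

With these made-up marks, the honest panel averages 9.000 and the panel with one inflated 10.0 averages about 9.071. The 10.0 itself is discarded, but it takes the place of the 9.5 that would otherwise have been dropped, so the 9.5 now counts.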

Regardless, the basic point is undeniably true — overscoring helps the average. Whether it’s intentional overscoring or inadvertent, it works. Are some judges likely doing this inadvertently due to corridor judging, having to make decisions quickly, TES inflating PCS, etc.? Sure. Are some doing it intentionally? I have no doubt the answer is yes. That was at the heart of the Sotnikova-Yuna controversy (especially given the composition of the judging panel), and it’s back again this Olympics.

This is, again, not about game theory per se; all I’ve tried to do is synthesize some of the theses I’ve already seen here and situate them within a framework that explains why it is “better” to overscore and be removed from the average than to hit the average (if you want a skater to win). That’s, unfortunately, incontrovertible math, but it also reveals the inherent brokenness of the current system, and that’s before human subjectivity comes in. There is an inherently qualitative aspect to judging figure skating that will always lead to complaints when transferred to numbers in PCS, which I think we’ve all accepted; personally, I’ve just been waiting for more judges to wake up and play the metagame, as the deficiencies in the system are glaring.

1) The primary goal is to evaluate the skating according to skating criteria as accurately as possible (with the caveat that all humans are biased and individual judges' unconscious biases will affect their evaluations even if they do their very best to be impartial).
I think it’s impossible for judges to avoid comparative marks, and as such, we may as well drop the facade. There are a million studies on how showing a woman a photo of a model will make her less likely to eat an Oreo later, for example, or the concept of “anchor numbers”; the human brain is what it is, and that’s fine. But if we’re still talking about “saving marks for later” and the skating order playing a role, then the idea that the current system isn’t ordinals, at least as far as PCS is concerned, is worth reconsidering.

I don’t disagree with you on the ideal. I suspect we disagree on how to achieve it.

Are referees and the technical committee and assessment commission also players in the game, or are they the officials that regulate the judges' and maybe tech panels' moves? Do the players need to worry about these higher arbiters?
Technical callers make the calls they feel like calling (Medvedeva’s flutz in her free skate wasn’t called, but Osmond’s was). Without data on the technical callers, we can’t check for bias along pertinent lines of inquiry — nationality, rate at which certain skaters are scrutinized and subsequently penalized (and how that compares to their standings), etc. And then, as has already been noted, there’s the equipment. As you start climbing the ranks, more than what can be controlled for in a statistical analysis comes in (which is what I think would be interesting insofar as trying to minimize subjectivity on the “objective” elements — !, e, etc., deductions), though to answer your question: there’s so much in figure skating that has relatively little to do with on-ice performance that we (and the skaters) will never fully know. I’m just interested in the data we have.

Is winning defined by amassing the most medals for the home team? In that case, getting the best skaters as playing pieces would help significantly. Should the federations be considered the primary players, applying strategy to develop the best skaters and send them to events as well as developing/choosing judges who are the best players of the scoring game?
Unfortunately, that’s up to various federations. And way outside the bounds of what we, as fans, can control.

Or do judges win the game by boosting their skaters the maximum number of places above what they actually deserve, in which cases there is more room for success the weaker the skaters actually are?
The judges — who were the “rational actors” I was discussing initially — have their own definition of winning, which likely changes on a per-skater basis. As I said from the start: the only reason to overscore and intentionally try to mess with the average is when a judge wants to preserve all high marks below their own for a given skater, which is likely to happen because they have a horse in the race (consciously or unconsciously) and they’re deliberately trying to move a given skater’s placement. A weaker skater is not going to benefit more unless they perform exceptionally well and pull a set of judges inclined to give them unusually high marks; how many skaters out of medal contention get component scores in the 9s? Or, to put it another way, would Nathan Chen have received the PCS marks he did for his free skate if his name weren’t Nathan Chen and he wasn’t expected to be skating in a different flight?

There’s so much in PCS I dislike, from the fact that it’s prone to inflation to the cases where it seems judges may as well have thrown darts at a board to arrive at the numbers, but my main issues with it are that it’s easily exploitable as well as fundamentally dishonest. It is absolutely a comparative measurement of the skaters, and there’s no getting around it, so why not introduce a system that allows for actual comparative judging with integrity? One option would be to add a ranking metric after each skater (on a per-group basis), essentially asking the judges “better or worse than the last performance?” and forcing them to compare each skater as better/worse or +/++/-/-- to each past performance. (Basically, skater 2 gets a better/worse comparison to 1; skater 4 gets compared to 3, 2, and 1.) Alternatively, have the judges rank the skaters after each flight. There are a lot of options. I’m drawing a blank on how to translate what amounts to ranked-choice voting into real-time judging, but I, personally, would prefer a more direct comparative measure of who “won” a given flight than components, which are genuine concepts with arbitrary values attached.

Ultimately, though, who’s defining what “win” means and what difference does it make? I do this tl;dr off my phone because I’m too sick to use a computer (autoimmune), and the part of me that’s fascinated by the statistical and game-theoretical aspects of PCS marks... I’ll never have a career, so sometimes my brain applies too much thought elsewhere, I guess. And I got somewhat invested in figure skating again as even when my nerves are being stripped, Hanyu’s performances can make me forget about the pain. As for the scoring, it’s the same as it ever was. If the ISU really wanted to make a more transparent scoring system that was somewhat less arbitrary, they could design one or modify the current CoP. But they won’t. And almost all the proposed changes sound terrible. So... here we go again.
 

Metis

Shepherdess of the Teal Deer
Record Breaker
Joined
Feb 14, 2018
Moving from a different thread:

I agree. But they wanted him to make the free skate to have a chance at redemption.

I am of the opinion that if Nathan had performed a clean short program, they would have handed him the win, deserved or not. Just like they handed it to Zag (though that one's deserved). The figure skating Olympics are about "stories". The boundary-pushing technician Nathan Chen, with his five different types of quads, saying "move over" to an injured Hanyu, just like the boundary-pushing technician Alina Zagitova, with her 3Lz-3Lo and backloading, said "move over" to the injured Medvedeva, is a good story -- that if you fall behind the times, you will be supplanted by fresher, "more deserving" talent. This is the only reason I can use to justify their PCS rise this season (though Nathan deserved at least some of it). It was to create a final showdown.


Which is why the Olympics are, in some ways, a joke, because the scores are made to fit the story, not the skating.

For what it’s worth, Chen could have received the same marks as Rizzo in PCS for the short and still not only qualified for the free, but placed ahead of Rizzo. He could lose four more points and still qualify ahead of Yee. That’s insane. I am still not certain how you can achieve greater than average (5.0) in PE (and possibly SS) when failing to execute a single jumping pass in the technical check. PCS may be subjective, but arguably there need to be more rules than just “programs with a serious error cannot receive 10s,” as if you take the ISU at its word (everyone done laughing?), 5.0 = average. Average compared to what? Either no one at the Olympics is average, or some people have to be average or below it for their group or the overall talent pool. Or there need to be clearer definitions for 5/6/7/8 so we don’t see quite so much corridor judging.

Kind of a perfect case study in how PCS inflation by reputation makes it nigh-impossible for talented unknowns or unfavorites to catch up, as even though Chen ended up in 17th, the difference in points between 17th and 9th was less than 3 points after the short.
 

CanadianSkaterGuy

Record Breaker
Joined
Jan 25, 2013
I am happy that there were no Russians involved and we are still discussing judging conspiracies. Just shows how doomed this sport really is.

That's what Euros was for! ;)

Although there was also the SD where the Russian judge put B/S in 3rd, while the mean average placement was 6.44. http://skatingscores.com/2018/oly/dance/short/tss/

The Russian judge also put Osmond wayyy behind Zag/Med in both programs:

http://skatingscores.com/2018/oly/ladies/short/tss/
http://skatingscores.com/2018/oly/ladies/long/tss/
 

Metis

Shepherdess of the Teal Deer
Record Breaker
Joined
Feb 14, 2018
That's what Euros was for! ;)

Although there was also the SD where the Russian judge put B/S in 3rd, while the mean average placement was 6.44. http://skatingscores.com/2018/oly/dance/short/tss/

The Russian judge also put Osmond wayyy behind Zag/Med in both programs:

http://skatingscores.com/2018/oly/ladies/short/tss/
http://skatingscores.com/2018/oly/ladies/long/tss/

I love our American judge so, so much. Who put on “Blame Canada” during the short? Be less transparent, please....
 

Spirals for Miles

Anna Shcherbakova is my World Champion
Record Breaker
Joined
Aug 25, 2017
Judge #2 at Worlds in the ladies short. Overscoring of everyone, but especially Kostner. They gave her an 83! That's higher than anything Medvedeva or Zagitova have done!
 

TryMeLater

On the Ice
Joined
Mar 31, 2013
Judge #2 at Worlds in the ladies short. Overscoring of everyone, but especially Kostner. They gave her an 83! That's higher than anything Medvedeva or Zagitova have done!

The KAZ judge at the Olys gave Zagitova 83.67, the RUS judge 83.37, the JPN judge 84.07, and the SVK judge 83.67.
The KAZ judge at the Olys gave Medvedeva 83.9, the RUS judge 84.10, the HUN judge 83.20.

Now, I'm not going to say whether the scores are warranted or not, but that's less than anything Zagitova or Medvedeva have done.
 

Spirals for Miles

Anna Shcherbakova is my World Champion
Record Breaker
Joined
Aug 25, 2017
Well, yes, but they've never actually received that score. But I guess Kostner got 80 too, not 83.
 

Procrastinator

On the Ice
Joined
Jan 12, 2014
russian judge scoring russians higher than osmond, canadian judge scoring Osmond higher than russians... nothing new :biggrin:
korean judge being the biggest fan of russians, giving 160+ to both, now that's interesting :laugh:

That's not what's being called out. It's the Russian judge lowballing Osmond in addition to scoring the RUS girls higher. The CAN judge is obviously overscoring Osmond but isn't throwing 8.75s at the two Russians.

Two things I've noticed generally these past two or three seasons bugging me, which came to a head today:

1) Pairs twists with crash landings / botched catches aren't penalized, especially not T/M's messy quad twist. +2.0 went up immediately and didn't budge, when that should have been +0 or lower.

2) The judges in the men's have started ignoring the footwork into solo triple/quad requirement and GOE reduction (unless the rules have been changed and I missed it?). Nathan does almost nothing into his 4F. Yes, I understand more footwork into it may be impossible, but those are the rules and he should still be penalized. The judges can still take into consideration that there is *something* before the jump and that the jump is difficult and only take off -1 instead of more. Kolyada is even worse with his lutz. His 3Lz today was GIGANTIC and had a difficult exit - a sure +3 IF this wasn't the short program. I had him at +1.

--

As an aside, I encourage you all to read the rather long post above; it gets at a lot of the game theoretic / incentive to overscore thing I think I discussed earlier in this thread, but much more eloquently. The solution is to recruit more non-powerhouse judges and to drop more of the scores.
 

bobbob

Medalist
Joined
Feb 7, 2014
Pairs free skate components today, listed in skate order.

A/W 55.83
Ziegler/Kiefer 58.38
Hocke/Blommaert 59.14
Duskova/Bidar 58.84
Ryom/Kim 60.34
Scimeca Knierim/Knierim 59.90
Yu/Zhang 65.67
Moore Towers/Marinaro 66.11
Astakhova/Rogonov 66.43
Marchi/Hotarek 68.07
Peng/Jin 67.63
Della Monica/Guarise 68.79
Zabiako/Enbert 69.43
James/Cipres 72.80

Savchenko/Massot 78.96
Tarasova/Morosov 73.94


Take the example of Peng/Jin, who by all accounts have been getting way lower PCS than Yu/Zhang. They come in here, skate a subpar program, with more mistakes than Yu/Zhang, and then turn up with 2 points higher PCS? Like what?

Now take the pairs ranked between 4th and 10th. Their long PCS are listed in skate order. You can see the PCS increasing almost linearly, with only a small bump down for P/J, presumably due to hometown scoring for M/H. Now, you might say later-skating pairs are probably better, which is true, but look at the short, with a different order.

Now look at the short PCS, in SKATE ORDER of these 8 pairs:
A/R 32.09
Peng/Jin 33.22
Mt/M 32.69
M/H 34.40
Y/Z 33.51
DM/G 34.24
Z/E 35.35
J/C 36.09

The PCS are different--but even with the order mixed up, the PCS seems to go with the order, not the skater!
Other than a single outlier in Savchenko/Massot, everyone else's PCS was basically the PCS of the skater before them increased by a little bit. Skate order is literally the only thing that matters when determining PCS. (If you are a home country skater, that helps too, and may be the only way of getting yourself off the PCS assigned to you based on your start order.) The only outliers here were home country skaters M/H, and not by much!
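
One rough way to put a number on this is a Spearman rank correlation between start position and PCS. The sketch below is only illustrative: it uses the eight pairs' PCS exactly as listed above, assumes there are no tied values, and the helper names are made up for this example. A value of 1 would mean PCS rose with every later start position.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_vs_order(values_in_skate_order):
    """Spearman rank correlation between start position and value (no ties)."""
    n = len(values_in_skate_order)
    value_ranks = ranks(values_in_skate_order)
    # Start positions are simply 1..n, so compare each position to the PCS rank.
    d_squared = sum((pos - r) ** 2 for pos, r in enumerate(value_ranks, start=1))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# PCS for the same eight pairs, in skate order, copied from the lists in this post.
short_pcs = [32.09, 33.22, 32.69, 34.40, 33.51, 34.24, 35.35, 36.09]
free_pcs = [65.67, 66.11, 66.43, 68.07, 67.63, 68.79, 69.43, 72.80]

short_rho = spearman_vs_order(short_pcs)
free_rho = spearman_vs_order(free_pcs)

print(f"short program: rho = {short_rho:.2f}")
print(f"free skate:    rho = {free_rho:.2f}")
```

On these figures it comes out at roughly 0.90 for the short and 0.98 for the free. That only measures how closely PCS tracks start position; on its own it can't separate a skate-order effect from later-skating pairs simply being better.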

Ridiculous!
 

Procrastinator

On the Ice
Joined
Jan 12, 2014
Nathan Chen's OBVIOUS UR 4F, no call: https://twitter.com/Alisha_Vaughn24/status/976856272745222145
Vincent Zhou's OBVIOUS UR 4Lz3T, no call: https://twitter.com/Alisha_Vaughn24/status/976857237103742976
Boyang Jin's 4T, called https://twitter.com/Alisha_Vaughn24/status/976852311464202246

Same competition, they're all Asian, and all Chinese honestly. Biggest difference: The first 2 represent USFA/ America. And Boyang represents China.
Nationalities MATTER.

*angles* matter, and we don't know what camera the judges used. The gifs, especially for Chen, are misleading. Also, UR does not mean slightly less than fully backward; it means 1/4 turn or more, with the benefit of the doubt given to the skater.

Sigh, the Boyang fans really are driving me mad like no other. Yes, he's talented. No, he's not getting lowballed.
 

plushyfan

Record Breaker
Joined
Jun 27, 2012
Country
Hungary
Yes, it shows that whether we like the new system or not, its main purpose was not met. The judges can still cheat if they want.
 

Barb

Record Breaker
Joined
Oct 13, 2009
Nathan Chen's OBVIOUS UR 4F, no call: https://twitter.com/Alisha_Vaughn24/status/976856272745222145
Vincent Zhou's OBVIOUS UR 4Lz3T, no call: https://twitter.com/Alisha_Vaughn24/status/976857237103742976
Boyang Jin's 4T, called https://twitter.com/Alisha_Vaughn24/status/976852311464202246

Same competition, they're all Asian, and all Chinese honestly. Biggest difference: The first 2 represent USFA/ America. And Boyang represents China.
Nationalities MATTER.

Not only nationalities; the reputation and politicking of coaches help too. Boyang should go with Brian or another big name.
 

charlotte14

Medalist
Joined
Aug 16, 2017
*angles* matter, and we don't know what camera the judges used. The gifs, especially for Chen, are misleading. Also, UR does not mean slightly less than fully backward; it means 1/4 turn or more, with the benefit of the doubt given to the skater.

Sigh, the Boyang fans really are driving me mad like no other. Yes, he's talented. No, he's not getting lowballed.
Some audience members at the Milan WC saw Vincent's and Nathan's questionable landings in real time. And nope, no one says Boyang was underscored. They're saying he was robbed because the American skaters do not get called for the same mistakes.
 

kurtstefan

Rinkside
Joined
Dec 25, 2014
I don't think it's just the case of the "fake Australian" judge. The whole technical panel for ladies and men might have used a completely different set of ISU technical guidelines from the rest of the ISU as 95% of calls at this event were simply mind boggling.

I am talking to you Ms Yukiko Okabe, Mr Scott Davis, and Mr Fernand Fedronic. J'accuse......

This person Frederonic, whom I know very well, should be removed by the ISU from any further judging.
 

Osmond4gold

Record Breaker
Joined
Jan 27, 2013
That's what Euros was for! ;)

Although there was also the SD where the Russian judge put B/S in 3rd, while the mean average placement was 6.44. http://skatingscores.com/2018/oly/dance/short/tss/

The Russian judge also put Osmond wayyy behind Zag/Med in both programs:

http://skatingscores.com/2018/oly/ladies/short/tss/
http://skatingscores.com/2018/oly/ladies/long/tss/

I rest my case re: Osmond being lowballed at the OGs. Looks like some judges are getting ...nervous. ;)
 

Spirals for Miles

Anna Shcherbakova is my World Champion
Record Breaker
Joined
Aug 25, 2017
I really don't think she was lowballed: she had more jumps in the first part, but more importantly, she made a mistake.

If you want to further discuss this, let's find the Olympics thread to do it on though :)
 