Scoring bias at the national level | Page 3 | Golden Skate

Scoring bias at the national level

Joined
Jun 21, 2003
Joe Inman (sp?), a US judge who also found his way to Golden Skate a number of years ago, did Michelle Kwan no favors at Salt Lake.
Poor Joe. The New York Times, not understanding the "OBO" scoring system that was in place at the time, published a big story saying that if Inman had given Michelle Kwan a higher mark then Michelle would have won the gold. This was incorrect, but oh well.

I believe that Inman is a professional pianist and piano teacher, and was always known for showing especial appreciation for skaters who interpreted their music well.
 

snowed

Rinkside
Joined
Feb 7, 2023
I have some observations about Worlds 2023.
- Using the national bias charts at https://skatingscores.com/2223/wc/ (they are in the top right corner, second line), a few judges were around 10 points above zero, which I guess is the average: for the men's FS, the Korean and French judges; for the women's FS, the American judge; for dance, the British judge was higher in both the rhythm dance and the free dance, also totaling about 10 points.

- Looking at the marks of the American judge in the women's FS, he seems to have a wider range than the other judges. Here are his overall points compared with the panel average, in order of placement: 1) Haein Lee +5, 2) Kaori similar, 3) Chaeyeon Kim +4, 4) Loena Hendrickx +1, 5) Isabeau +9, 6) Mai Mihara +4, many +/- up to 5 that I won't mention, 10) Nina Pinzarrone +10, 14) Amber Glenn +9, 1) Bradie +6, 21) Janna -9, 23) Alexandra Feigin -11, 24) Sofja Stepčenko -11.

- Looking at 3) Chaeyeon Kim, who got 139.46: the American judge gave 143.37, which was 3rd place in his overall marking; the Swiss judge gave 137.68 (so closer to 139.46), but that put her in 1st place in his own marking. Which is more correct?

- If we look at GOEs, the American judge's marks for the American ladies were mostly in line. His components were higher, but he was higher on the top-placed ladies too: he was the only one who gave 9.50 on all 3 components to 1) Haein, while the rest of the judges maxed out at 9.25, with some 8s as well. To 2) Kaori he gave 9.75 for SS; just one other judge gave that, and there were 9.0s too. To 6) Mai Mihara he gave 9.25 for SS; the next highest mark was 8.75.

To me, looking at scores does not say much (again, he is using a wider range, going from 152.19 to 88.46; the Swiss judge's range was 137.68 to 97.16). We should look more at placement, though judges cannot know how their marks will place the skaters. Anyway, the American judge "placed" 5) Isabeau in 4th, but wait: the Japanese judge placed 6) Mai in 5th, the Netherlands judge placed 4) Loena in 3rd, the Swiss judge placed 8) Kimmy in 7th, the Estonian judge placed 12) Nina in 10th.
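To make the "different range" point concrete, here is a rough sketch (not anything the ISU or SkatingScores actually computes, as far as I know) that takes only the figures quoted above and shows where Chaeyeon Kim's mark sits inside each judge's own range. The helper name `position_in_range` and the 0-to-1 scale are my own assumptions for illustration.

```python
def position_in_range(score, low, high):
    """Where a score sits inside a judge's own range: 0.0 = lowest, 1.0 = highest."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return (score - low) / (high - low)


# Figures quoted in the post (women's FS, Worlds 2023).
PANEL_SCORE_KIM = 139.46
judges = {
    "USA": {"kim": 143.37, "low": 88.46, "high": 152.19},
    "SUI": {"kim": 137.68, "low": 97.16, "high": 137.68},
}


def summarize(judges, panel_score):
    """Compare each judge's raw difference from the panel with the in-range position."""
    rows = {}
    for name, j in judges.items():
        rows[name] = {
            "diff_from_panel": round(j["kim"] - panel_score, 2),
            "position": round(position_in_range(j["kim"], j["low"], j["high"]), 3),
        }
    return rows


summary = summarize(judges, PANEL_SCORE_KIM)
for name, row in summary.items():
    print(name, row)
```

On raw points the Swiss judge is closer to the panel, yet his mark for Kim is the very top of his own range, while the American judge's higher mark sits lower down in his. Which view is "right" is exactly the question asked above.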

Amber Glenn came in 14th, but the American judge put her in 10th (TES was 13th, so close; components 7th). But I dare say he was right on components: he gave 9 for SS (she is strong), 8.5 for composition and 8 for presentation. I dare say the other judges were unfamiliar with her and lowballed her.

I guess my point is that I don't think looking at scores (or even placement, though placement would be better) shows much, as each judge has a different range. I think each anomaly (in an element or a component) should be reviewed by a panel of specialists (strong judges) and analyzed against video of the competition. Only then can a judge be found suspicious or guilty of national bias or incompetence.
 

NaVi

Medalist
Joined
Oct 30, 2014
What I find weird about the ISU judging processes is that they're too uniform between competitions no matter how high profile they are.

If you have an event like Worlds then you're going to have a large group of people already in attendance who could be potential judges. Instead of spelling out far in advance which countries will be in a judging pool, why not just have a judges' draft at Worlds from the potential judges? You could let them all go to the practices if they wanted to and do a quick drawing pretty close to the event. Then make it 15 judges, to increase the number of federations that would have to collude with one another to have an effect.

I'm not sure a scoring corridor is very healthy and with my scheme here for worlds I feel like it could be gotten rid of or loosened.
 
Joined
Jun 21, 2003
My biggest takeaway from all the information posted on this thread is that the ISU is well aware of the challenges it faces in giving effective oversight to judging. The particular example of the OP regarding the U.S. judge who overscored the three U.S. skaters at 2023 Worlds was, I think, handled in a professional and even-handed way. The judge was not, in the absence of any non-statistical evidence, accused of being an evil and dishonest person. Still, his marks were identified as being out of line and he was officially cautioned about it.

To me, the ISU safeguards worked well in this instance.
 
Joined
Jun 21, 2003
I'm not sure a scoring corridor is very healthy and with my scheme here for worlds I feel like it could be gotten rid of or loosened.
An intriguing idea. And certainly there is something off-putting about the notion of a scoring corridor -- "You have to give the same marks as the other judges do or else!"

On the other hand, the corridor could hardly be "looser" than it already is. It is practically impossible for a judge to get significantly and repeatedly out of the corridor. For one thing, a conniving cheater can easily figure out how to inflate his own skaters' marks while devaluing those of his skaters' competitors, and still skate free of the corridor. Even the U.S. judge who is the subject of this thread was not outside the corridor, notwithstanding that by a different measure he scored the U.S. skaters 9 points higher than the other judges did.

The second consideration is this: How else can we evaluate the accuracy of a judge's marks under the rules except to compare them with the marks that other well-qualified judges give to the same performance? Even so, there is something distasteful about finding fault with a judge for no other reason than that he refuses to be bullied by the majority. The whole thing is a tricky business, if you ask me.
 

4everchan

Record Breaker
Joined
Mar 7, 2015
Country
Martinique
The issue is not that one judge, once in a while loves a performance and marks it higher than all the others. I believe it's very fine that different people have different appreciation for a program. I am talking here more about PCS because GOE should still follow bullet points and should be rather objective.

The issue is when one judge keeps doing that for skaters of their own federation, and that's what the ISU has noticed with that one judge. Keep it coming, I say. As fans, we see it with the elaborate skating scores easily available to us. In the competition threads, there are a lot of people discussing the scores after the events. If fans can easily notice some biases, the ISU has to be responsible and punish these guys. It is difficult to prove an intent to cheat. However, it's very simple to observe that some judge has an abnormal pattern and issue warnings. It is quite possible that the judge didn't realize how biased they were... yet, I tend to believe that judges know exactly what they are doing ;)
 

snowed

Rinkside
Joined
Feb 7, 2023
The issue is when one judge keeps doing that for skaters of their own federation, and that's what the ISU has noticed with that one judge. ;)
If you look at what I posted a few posts above, every single judge was "placing" his own federation's skater one place above the average, just like the American judge. And the American judge had a wider mark range overall, not only with the 3 American skaters.
 

4everchan

Record Breaker
Joined
Mar 7, 2015
Country
Martinique
If you look at what I posted a few posts above, every single judge was "placing" his own federation's skater one place above the average, just like the American judge. And the American judge had a wider mark range overall, not only with the 3 American skaters.
yeah.. i saw. i agree with your analysis. I am not talking about this case in particular but in general. In competition threads, at the end of each event, some fans share the skating scores and we discuss it. Strange things happen regularly. My point is that it is okay to have some judges like some programs more than others, but if it's always the flag instead of the performance, then we have a problem. I haven't looked carefully but it seems to me that dance scores are even more at risk when it comes to nationalistic bias. ;)
 

snowed

Rinkside
Joined
Feb 7, 2023
To me, national bias is real, even if it's only unconscious bias, and I think that some of it is on purpose. It starts with a conflict of interest: the federations nominate the judges who serve as ISU judges (didn't a judge, maybe Russian, recently lose his/her federation's support?). But I think excluding the highest and lowest marks from the final mark corrects the national bias problem to a significant degree. Maybe the ISU should exclude the 2 lowest marks and the 2 highest marks...
 
Joined
Jun 21, 2003
As fans, we see it with the elaborate skating scores easily available to us. In the competition threads, there are a lot of people discussing the scores after the events. If fans can easily notice some biases, the ISU has to be responsible and punish these guys.
I disagree that the ISU has the responsibility to punish judges because of complaints by fans.

The ISU has access to the same data as posters to Golden Skate do. They must then follow their own procedures regarding what to do about it.

On the recent Russian nationals competition threads there were charges that the judges were giving higher marks to skaters who have large and vocal fan bases, and lower scores to those who didn't. This is most certainly true, but, as they say, "correlation does not equal causality." ;) No one should be surprised that the skaters who are perceived as "best" get the highest marks AND have the most fans.
 
Joined
Jun 21, 2003
Maybe ISU should exclude the 2 lowest marks and the 2 highest marks...
This has been tried in past versions of the IJS. One virtue of this plan is that then the recorded averages are out of 5 so all the PCSs would end in pretty decimals instead of weird equivalents of 28ths. This would eliminate rounding errors. ;)

However, from a statistical point of view a severely trimmed mean starts to verge over into the Alice-in-Wonderland realm of "non-parametric statistics." The distribution of values of the trimmed mean does not follow a nice mathematical formula, and in fact the sample standard deviation for the trimmed mean is best approximated by using the entire sample size (9) instead of the number of data actually used (7 or 5). (At least this is true under the assumption of symmetry -- there are the same number of high scores as there are low.) Statistically, the best solution is to have a thousand judges and count them all.

(This is all very cool mathematics, though I am not sure that it speaks to any question of relevance to figure skating. ;) )
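For anyone who wants to see the "28ths versus pretty decimals" point in numbers, here is a small sketch. The nine marks are made up for illustration (they are not from any real protocol), and `trimmed_mean` is just my own helper name; exact fractions are used so the denominators are visible.

```python
from fractions import Fraction


def trimmed_mean(marks, trim):
    """Average the marks after dropping the `trim` highest and `trim` lowest."""
    if len(marks) <= 2 * trim:
        raise ValueError("not enough marks left after trimming")
    kept = sorted(marks)[trim:len(marks) - trim]
    return sum(kept) / len(kept)


# Hypothetical component marks from a 9-judge panel, in steps of 0.25.
raw = ("8.25", "8.50", "8.50", "8.75", "8.75", "9.00", "9.00", "9.25", "9.50")
marks = [Fraction(m) for m in raw]

drop_one = trimmed_mean(marks, 1)  # 7 marks counted (one high, one low dropped)
drop_two = trimmed_mean(marks, 2)  # 5 marks counted (two high, two low dropped)

print(drop_one, float(drop_one))  # a fraction with denominator 28
print(drop_two, float(drop_two))  # a fraction with denominator 5
```

With 7 marks counted, quarter-point marks average out to some number of 28ths, which has to be rounded; with 5 counted, the average is always a multiple of 0.05.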
 

4everchan

Record Breaker
Joined
Mar 7, 2015
Country
Martinique
I disagree that the ISU has the responsibility to punish judges because of complaints by fans.
not because of complaints of fans :) did i say this? The ISU has the responsibility to punish judges because they are not good... What I said is that even fans can easily spot the nationalistic bias... so if we can do so, I bet the ISU would be able to do so easily.
The ISU has access to the same data as posters to Golden Skate do. They must then follow their own procedures regarding what to do about it.

On the recent Russian nationals competition threads there were charges that the judges were giving higher marks to skaters who have large and vocal fan bases, and lower scores to those who didn't. This is most certainly true, but, as they say, "correlation does not equal causality." ;) No one should be surprised that the skaters who are perceived as "best" get the best marks AND have the most fans.
 
Joined
Jun 21, 2003
not because of complaints of fan :) did i say this?
I was reacting to: " If fans can easily notice some biases, (then) the ISU has to be responsible and punish these guys."

It was the implied "then" that I think is wrong. IMHO the ISU has no responsibility to "punish" anyone because of what fans notice, or imagine that they notice, however loudly they rage.

If we (fans) can do so (spot bad and biased judging), I bet the ISU would be able to do so easily.
This conclusion I agree with. The ISU can and does spot the same anomalies as we do. How could they not -- are they worse at evaluating numbers than "we" are?

And now what? YMMV, but I think, all things considered, that the ISU is doing OK at using their authority to keep tabs on errant judges. Could they do better? Personally, I am not going to armchair-quarterback that one, but hey, have at it! It's our duty as skating fans.
 

4everchan

Record Breaker
Joined
Mar 7, 2015
Country
Martinique
I was reacting to: " If fans can easily notice some biases, (then) the ISU has to be responsible and punish these guys."

It was the implied "then" that I think is wrong. IMHO the ISU has no responsibility to "punish" anyone because of what fans notice, or imagine that they notice, however loudly they rage.


This conclusion I agree with. The ISU can and does spot the same anomalies as we do. How could they not -- are they worse at evaluating numbers than "we" are?

And now what? YMMV, but I think, all things considered, that the ISU is doing OK at using their authority to keep tabs on errant judges. Could they do better? Personally, I am not going to armchair-quarterback that one, but hey, have at it! It's our duty as skating fans.
you know, i barely speak English, so if you find flaws in my grammar, that's all good and thank you ! ... but the real flaws here are in nationalistic bias judging ;)
 

Miller

Final Flight
Joined
Dec 29, 2016
My biggest takeaway from all the information posted on this thread is that the ISU is well aware of the challenges it faces in giving effective oversight to judging. The particular example of the OP regarding the U.S. judge who overscored the three U.S. skaters at 2023 Worlds was, I think, handled in a professional and even-handed way. The judge was not, in the absence of any non-statistical evidence, accused of being an evil and dishonest person. Still, his marks were identified as being out of line and he was officially cautioned about it.

To me, the ISU safeguards worked well in this instance.
I've done a little bit of further analysis now that Christmas is over.

I looked at all the LP scores for Women for this year's Grand Prix series and Final. The good thing about this is, as someone mentioned above, every single skater should have a judge from their country on the judging panel. Then I looked for any individual 'effective score' (the score as if the judge had scored the competition by themselves) that was more than 7 points different from the actual. This was to catch any 'anomalies' for lower scoring skaters; the difference for Isabeau Levito that this thread is about was 8.71.

The first thing I noticed at a glance compared with when I last looked at this round about 2018/19 was that the scores were so much closer to the actual than before - has the setting up of SkatingScores round about that time had its effect (I note that the ISU was using the effective score in its judgement)? I'm sure a difference of 8.71 back in the day wouldn't have triggered anything - the stuff that got the 2 Chinese judges banned at the 2018 Olympics as mentioned earlier in the thread was so obvious it was untrue, especially in the Men's - according to the Chinese judge Boyang Jin (4th) would have been the new gold medallist/combined score world record holder with a score 50/55 points higher than Shoma Uno and Javier Fernandez who actually came 2nd and 3rd.

The analysis - there were 693 cases of judges scoring the competitors across the 7 competitions - 12 skaters x 9 judges in 5, 11 x 9 in one, and 9 x 6 in the Final.

I found 40 cases with a difference of more than 7 points in the effective score: 13 higher, 27 lower (don't get on the wrong side of Dutch judges - all 3 of them were brutal with skaters they didn't like!).

Only 3 of them were when a judge was judging their own country's skater: Rion Sumiyoshi (JPN) at GP France, 143.26 vs 136.04 actual; Lorine Schild (FRA) at Espoo, 122.28 vs 114.64; and Starr Andrews (USA) at Espoo, 105.34 vs 97.06. Bearing in mind the number of instances where a skater would have been judged by a judge from their own country, this is really quite low. I think the problem for Doug Williams was that he had 2 skaters in the same segment of a World Champs, plus the 6 out of 6 higher than the actual across both the SP and LP. Based on this I'd agree that the ISU is definitely picking up anomalies much more than it used to. Another way of looking at the difference for Isabeau Levito is that the 8.71 difference is only for GOEs and PCS: her base value of 58.46 would have been the same for all judges, so Mr Williams' difference would have been based on a score of 84.87 for GOEs/PCS compared with 76.16 actual, i.e. 11.43% higher. Is the trigger point plus or minus 10%?
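The arithmetic in this analysis can be checked with a short sketch. It uses only the figures quoted in the post; the 7-point threshold is the poster's own cut-off, the 10% trigger is only a guess raised as a question, and the function name `flag_cases` is made up for illustration.

```python
THRESHOLD = 7.0  # points; the cut-off used in the post, not an official ISU value

# Total judge-skater pairs: 5 events of 12 skaters x 9 judges,
# one event of 11 x 9, and the Final with 9 judges x 6 skaters.
total_cases = 5 * 12 * 9 + 11 * 9 + 9 * 6

# (skater, event, judge's effective score, actual score) as quoted in the post.
home_cases = [
    ("Rion Sumiyoshi", "GP France", 143.26, 136.04),
    ("Lorine Schild", "Espoo", 122.28, 114.64),
    ("Starr Andrews", "Espoo", 105.34, 97.06),
]


def flag_cases(cases, threshold):
    """Return the cases whose effective score differs from the actual by more than threshold."""
    flagged = []
    for skater, event, effective, actual in cases:
        diff = round(effective - actual, 2)
        if abs(diff) > threshold:
            flagged.append((skater, event, diff))
    return flagged


flagged = flag_cases(home_cases, THRESHOLD)

# Isabeau Levito: the base value is identical for every judge,
# so the difference comes only from GOEs and PCS.
judge_goe_pcs = 84.87
actual_goe_pcs = 76.16
levito_diff = round(judge_goe_pcs - actual_goe_pcs, 2)
levito_pct = 100 * (judge_goe_pcs - actual_goe_pcs) / actual_goe_pcs

print(total_cases, flagged)
print(levito_diff, round(levito_pct, 2))
```

The count comes to 693, all three home-country cases clear the 7-point line, and the Levito difference works out to roughly 11.4% of the GOE/PCS portion, in line with the figure given above.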

Overall I'd say the ISU's procedures are working well to pick up this case. However, when it comes to general slight nationalistic bias, i.e. judges in general giving their own skaters a little bit more than the final actual figure, I don't see much evidence of it being picked up, e.g. the 14 out of 16 (20 out of 22 including Mr Williams) cases where the effective score in the World Champs SP or LP was higher than the actual. This can only really be solved by not allowing judges to judge their own countries' skaters. Any one or two minor cases for an individual judge can easily be explained away; it's only when you look at the competition as a whole that you can see something is going on.
 
Joined
Jun 21, 2003
This can only really be solved by not allowing judges to judge their own countries' skaters ...
Sometimes 'tis better to bear the ills we have than fly to others that we know not of.

If the proposal is to have 9 sitting judges, with each skater marked by a slightly different sub-panel of 8, I think that this presents problems, too. For instance, an over-lenient but unbiased judge could end up giving high marks to everyone except the skater from a particular country, skewing the results. It could, in fact, provide extra incentive for a judge deliberately to lowball other skaters.

One version of the early IJS had, as I recall, 14 sitting judges, and featured a random and secret draw that selected which 9 marks would count and which 5 judges (without being told) would just be sitting there like fools. (This scheme was eventually laughed off the stage.)
 

BlissfulSynergy

Record Breaker
Joined
Sep 1, 2020
Country
Olympics
Oh well. I mean as fans, we are always finding some judges' scores rather suspect, particularly on PCS! I guess this type of ruling means that judges must stay in line with the status quo views, which more likely leads to cookie cutter judging. There are no easy answers the way this sport is structured, mismanaged, and failed in regard to leadership. 😞
 

4everchan

Record Breaker
Joined
Mar 7, 2015
Country
Martinique
Actually, cookie cutter judging should be the aim. This is more or less happening in other sports like diving. There is a bit of variation, but judges, as experts, should have objective ways (and yes, even for PCS) to give marks within a somewhat similar range. A consensus among judges is really the ultimate goal. I don't really understand why people are advocating against judges all giving similar marks. It actually would be very credible if the judges were to agree on the athletic performances. The issues arise when judges do not agree.
When judges disagree
1) it means the sport is not easily objectively judged either by its own nature or by the judging system
2) some judges may be cheating
3) judges have different preferences---which should not really happen in athletic performances.
4) some judges may have nationalistic bias

I am one of these fans who love looking at skating scores and how different judges marked skaters all over the place... but then, I had this thought : what if the skating scores lined up and provided a very educated and convincing evaluation of the performances ? Wouldn't the sport be better if that were to happen?

I believe so. Again, going back to other judged sports I do watch (diving, moguls), there is very little variation between judges. I don't see as many wuzrobbed... Athletes are fine with their scores (most often). Fewer scandals. Are there judges kicked out for nationalistic bias? Not that I have witnessed...
 

kolyadafan2002

Fan of Kolyada
Final Flight
Joined
Jun 6, 2019
Overall I'd say the ISU's procedures are working well to pick up this case. However, when it comes to general slight nationalistic bias, i.e. judges in general giving their own skaters a little bit more than the final actual figure, I don't see much evidence of it being picked up, e.g. the 14 out of 16 (20 out of 22 including Mr Williams) cases where the effective score in the World Champs SP or LP was higher than the actual. This can only really be solved by not allowing judges to judge their own countries' skaters. Any one or two minor cases for an individual judge can easily be explained away; it's only when you look at the competition as a whole that you can see something is going on.
You need the same panel for a whole event, as every judge scores differently, so it's difficult not to have judges from the skaters' countries. Maybe not have judges from the top 5 countries, to avoid placement issues? I don't know, still a reach. All I'm saying is ice dance judging is much worse than this example (look through the last 4 years of Christopher Buchanan on SkatingScores, or the Italian judges with G/F, or the US judges with C/B).
 