School Rankings

For anything Science Olympiad-related that might not fall under a specific event or competition.
legendaryalchemist
Exalted Member
Exalted Member
Posts: 29
Joined: February 7th, 2020, 2:07 pm
Division: C
State: WI
Pronouns: He/Him/His
Has thanked: 22 times
Been thanked: 11 times
Contact:

Re: School Rankings

Post by legendaryalchemist »

It seems to me that the largest tournaments (particularly MIT) are drowning out other competitions. Nearly every team that attended MIT had by far their best performance of the year there, according to the spreadsheet, which seems suspicious. Consequently, most teams that did not attend are underrated. The problem seems to be that tournaments are given different "weights" and teams' scores for a tournament are then some fraction of that weight. This makes it impossible for a stellar performance at a medium-sized meet to make a big difference. Maybe use something that standardizes the "weight" so that it corresponds to a given superscore?

A simple way (I'm sure you can think of a better system, this is just an example lol) of doing this would be to assign a score of 100*W/SS for a given tournament, where W is the weight of the tournament and SS is the team's superscore. A team's superscore will be higher at larger, more heavily weighted tournaments, so this would give medium-sized tournaments a chance to make a difference while still reserving the highest scores for the largest tournaments. For example, if team X sends equal-caliber teams to a 60-team tournament and a 120-team tournament of similar quality, the weight of the latter would be about twice that of the former, but team X's superscore would not quite double at the larger tournament, thus making the larger tournament worth more.
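Here's a rough sketch of the arithmetic in Python, purely as an illustration (the weights and superscores below are made up, not real tournament data):

def tournament_score(weight, superscore):
    # proposed formula: 100 * W / SS (higher is better)
    return 100 * weight / superscore

# hypothetical 60-team invite vs. 120-team invite with equal-caliber fields
print(tournament_score(60, 300))   # 20.0
print(tournament_score(120, 500))  # 24.0 -> larger invite worth more, but not twice as much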

This would also give an intuition for what differences in the rankings actually mean - a team with twice another's rating would be expected to score half as many points at a given tournament, assuming full stack (since the score is inversely proportional to the superscore). Under the current system, if New Trier has a score of 71 and Naperville North has a score of 31 (strange considering the UChicago results), it is unclear how they would be expected to fare against one another in competition beyond New Trier doing better. Of course, different ranking systems are ideal for different data sets. I think the current system is generally well adapted to produce reasonable results for the top 5 teams in the nation, but beyond that it loses some of its accuracy.
Last edited by legendaryalchemist on February 16th, 2021, 3:13 pm, edited 1 time in total.
These users thanked the author legendaryalchemist for the post:
pb5754 (February 16th, 2021, 2:55 pm)
Yale University, Class of 2026 | Marquette University High School, Class of 2022
Medal Count: 128 | Gold Count: 67
Userpage: https://scioly.org/wiki/index.php/User: ... yalchemist
builderguy135
Exalted Member
Exalted Member
Posts: 736
Joined: September 8th, 2018, 12:24 pm
Division: C
State: NJ
Pronouns: He/Him/His
Has thanked: 191 times
Been thanked: 143 times
Contact:

Re: School Rankings

Post by builderguy135 »

legendaryalchemist wrote: February 16th, 2021, 2:23 pm It seems to me that the largest tournaments (particularly MIT) are drowning out other competitions. Nearly every team that attended MIT had by far their best performance of the year there, according to the spreadsheet, which seems suspicious. Consequently, most teams that did not attend are underrated. The problem seems to be that tournaments are given different "weights" and teams' scores for a tournament are then some fraction of that weight. This makes it impossible for a stellar performance at a medium-sized meet to make a big difference. Maybe use something that standardizes the "weight" so that it corresponds to a given superscore?

A simple way (I'm sure you can think of a better system, this is just an example lol) of doing this would be to assign a score of 100*W/SS for a given tournament, where W is the weight of the tournament and SS is the team's superscore. A team's superscore will be higher at larger, more heavily weighted tournaments, so this would give medium-sized tournaments a chance to make a difference while still reserving the highest scores for the largest tournaments. For example, if team X sends equal-caliber teams to a 60-team tournament and a 120-team tournament of similar quality, the weight of the latter would be about twice that of the former, but team X's superscore would not quite double at the larger tournament, thus making the larger tournament worth more.

This would also give an intuition for what differences in the rankings actually mean - a team with twice another's rating would be expected to score half as many points at a given tournament, assuming full stack (since the score is inversely proportional to the superscore). Under the current system, if New Trier has a score of 71 and Naperville North has a score of 31 (strange considering the UChicago results), it is unclear how they would be expected to fare against one another in competition beyond New Trier doing better. Of course, different ranking systems are ideal for different data sets. I think the current system is generally well adapted to produce reasonable results for the top 5 teams in the nation, but beyond that it loses some of its accuracy.
While you do bring up a very important flaw in our ranking system, your proposed system far overvalues smaller tournaments, rewarding teams for "spamming" tournaments instead of doing well at only a few. The superscore metric is also effectively useless here, as the winning superscore varies significantly from competition to competition. We've experimented with a winning-score bonus for teams such as Enloe at NC in-state invitationals, but ultimately there is no consistent way to balance how hard a team swept against the difficulty of a competition: 100 points at MIT is far, far more impressive than even a 30 at other competitions.

While this may not be an optimal solution, it is, to us, close to the best that we can get. The only reason we calculate a school's score from its top 4 competitions is that we do not want teams that can attend fewer competitions to be undervalued. Some teams will be overvalued, some undervalued; there's no perfect balance, unfortunately.

Thanks for the feedback.

Edit: I've taken a look at your school's rankings on our spreadsheet. Unfortunately, it is definitely one of the teams underrated by our algorithm, and it seems like many of your complaints are somewhat specific to your own team. However, consider that it is not the small invites a team sweeps that show how good it is relative to other teams, but rather how well it performs at larger invites where it doesn't sweep.
Last edited by builderguy135 on February 16th, 2021, 4:07 pm, edited 1 time in total.
These users thanked the author builderguy135 for the post (total 2):
Umaroth (February 16th, 2021, 5:50 pm) • sneepity (February 16th, 2021, 8:27 pm)
West Windsor-Plainsboro High School North '22
BirdSO Co-Director
My Userpage
legendaryalchemist
Exalted Member
Exalted Member
Posts: 29
Joined: February 7th, 2020, 2:07 pm
Division: C
State: WI
Pronouns: He/Him/His
Has thanked: 22 times
Been thanked: 11 times
Contact:

Re: School Rankings

Post by legendaryalchemist »

builderguy135 wrote: February 16th, 2021, 3:58 pm
legendaryalchemist wrote: February 16th, 2021, 2:23 pm It seems to me that the largest tournaments (particularly MIT) are drowning out other competitions. Nearly every team that attended MIT had by far their best performance of the year there, according to the spreadsheet, which seems suspicious. Consequently, most teams that did not attend are underrated. The problem seems to be that tournaments are given different "weights" and teams' scores for a tournament are then some fraction of that weight. This makes it impossible for a stellar performance at a medium-sized meet to make a big difference. Maybe use something that standardizes the "weight" so that it corresponds to a given superscore?

A simple way (I'm sure you can think of a better system, this is just an example lol) of doing this would be to assign a score of 100*W/SS for a given tournament, where W is the weight of the tournament and SS is the team's superscore. A team's superscore will be higher at larger, more heavily weighted tournaments, so this would give medium-sized tournaments a chance to make a difference while still reserving the highest scores for the largest tournaments. For example, if team X sends equal-caliber teams to a 60-team tournament and a 120-team tournament of similar quality, the weight of the latter would be about twice that of the former, but team X's superscore would not quite double at the larger tournament, thus making the larger tournament worth more.

This would also give an intuition for what differences in the rankings actually mean - a team with twice another's rating would be expected to score half as many points at a given tournament, assuming full stack (since the score is inversely proportional to the superscore). Under the current system, if New Trier has a score of 71 and Naperville North has a score of 31 (strange considering the UChicago results), it is unclear how they would be expected to fare against one another in competition beyond New Trier doing better. Of course, different ranking systems are ideal for different data sets. I think the current system is generally well adapted to produce reasonable results for the top 5 teams in the nation, but beyond that it loses some of its accuracy.
While you do bring up a very important flaw in our ranking system, your proposed system far overvalues smaller tournaments, rewarding teams for "spamming" tournaments instead of doing well at only a few. The superscore metric is also effectively useless here, as the winning superscore varies significantly from competition to competition. We've experimented with a winning-score bonus for teams such as Enloe at NC in-state invitationals, but ultimately there is no consistent way to balance how hard a team swept against the difficulty of a competition: 100 points at MIT is far, far more impressive than even a 30 at other competitions.

While this may not be an optimal solution, it is, to us, close to the best that we can get. The only reason we calculate a school's score from its top 4 competitions is that we do not want teams that can attend fewer competitions to be undervalued. Some teams will be overvalued, some undervalued; there's no perfect balance, unfortunately.

Thanks for the feedback.

Edit: I've taken a look at your school's rankings on our spreadsheet. Unfortunately, it is definitely one of the teams underrated by our algorithm, and it seems like many of your complaints are somewhat specific to your own team. However, consider that it is not the small invites a team sweeps that show how good it is relative to other teams, but rather how well it performs at larger invites where it doesn't sweep.
My proposed system was just an example - you could certainly improve upon it. My main point is that when the rankings become a reflection of only one tournament (or a select few as other mega-invites come in), it kind of defeats the purpose of having rankings. Of course, the largest tournaments are the best indicators, but MIT is weighted so heavily that taking 25th there is worth more than winning SOLVI (a very competitive tournament in its own right).

I would like to thank you guys for all the effort you've put into this, though. I can imagine how much time it takes to assemble all of these results. I am not trying to belittle that - I'm just giving suggestions that I believe would enhance the accuracy of these rankings for the majority of schools.
Last edited by legendaryalchemist on September 21st, 2021, 1:08 pm, edited 2 times in total.
Yale University, Class of 2026 | Marquette University High School, Class of 2022
Medal Count: 128 | Gold Count: 67
Userpage: https://scioly.org/wiki/index.php/User: ... yalchemist
EwwPhysics
Exalted Member
Exalted Member
Posts: 158
Joined: February 22nd, 2020, 12:38 pm
Division: C
State: PA
Pronouns: She/Her/Hers
Has thanked: 144 times
Been thanked: 86 times

Re: School Rankings

Post by EwwPhysics »

I agree that the weighting seems to be a bit off. For example, I don't understand why our 41st and 46th at MIT were worth nearly twice as much as our 6th and 9th at SOAPS. Also, our single unstacked team at Duke got us more points than our two unstacked teams at SOAPS.

Regardless, nothing is perfect and I like having an easy way to get a rough estimate of the strength of teams <3
Lower Merion Captain '24
Cell bio, code, disease, forensics
Cell bio, codebusters, disease, envirochem (and widi, chem lab) 
Protein Modeling - 1st @ nats
Disease Detectives - 4th @ nats
Designer Genes - 1st @ states
Also fossils, widi, circuit
jaymaron
Member
Member
Posts: 6
Joined: May 30th, 2022, 8:52 am
Division: Grad
State: WI
Has thanked: 1 time
Been thanked: 1 time
Contact:

Re: School Rankings

Post by jaymaron »

Strong teams:

[image]

Strong states:

[image]

The number of points a team scores in a competition is a function of rank.
A natural ranking function is:

Points = -log_2(Rank/Teams)

where "Teams" is the number of teams. If Teams = 64, the points correspond to the number of games won in a single-elimination tournament.

Rank Points
1 6
2 5
4 4
8 3
16 2
32 1
64 0
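
A quick way to reproduce that table, as a rough Python sketch:

import math

def points(rank, teams=64):
    # -log_2(rank / teams): rank 1 of 64 gives 6 points, rank 64 gives 0
    return -math.log2(rank / teams)

for rank in (1, 2, 4, 8, 16, 32, 64):
    print(rank, points(rank))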

Expanded discussion: https://jaymaron.com/scioly.html#score

The Science Olympiad scoring function is goofy. Ideally, the score function should be a straight line in the plot.
Good scoring functions include those used in Formula 1, IndyCar racing, and World Cup skiing.

[image]
Last edited by jaymaron on June 15th, 2022, 8:27 am, edited 2 times in total.
jaymaron
Member
Member
Posts: 6
Joined: May 30th, 2022, 8:52 am
Division: Grad
State: WI
Has thanked: 1 time
Been thanked: 1 time
Contact:

Re: School Rankings

Post by jaymaron »

[image]

A handful of states dominate the invitationals.

The strongest invitationals are the National Invitational, MIT, Mason, Golden Gate, and BirdSO. Data for BearSO and BadgerSO are not available.

Many highly-rated teams don't make nationals. Nationals placements could be decided by both state championships and ratings from invitationals: take one team from each state, then take N more teams decided by the ratings. States of death include California, New Jersey, Washington, Ohio, Illinois, Michigan, Massachusetts, and Wisconsin.

Ratings algorithm:

A team's rating is a sum of scores from invitationals. The score from an invitational is

Score = -log_2(Rank/Teams)

Where "Rank" is the rank of the team in the invitational, and "Teams" is the
number of teams at the invitational. We set Teams to 64 for all invitationals.

Rank=1 -> Score=6
Rank=2 -> Score=5
Rank=4 -> Score=4
Rank=64 -> Score=0

A proper ratings algorithm requires rating both teams and invitationals simultaneously.

The rating of an invitational is the sum of the ratings of the teams that participate.

We can extend the algorithm to include ratings of invitationals. Then a team's rating is

Team rating = Sum over invitationals [ Score + log_2(E/Emax) ]

Where E is the rating of the invitational and Emax is the rating of the highest-rated invitational. Terms in the sum have a floor of zero: if a term is less than zero, it is set to zero.

We use a team's best 5 results from invitationals. Most heavyweight teams attend at least 5 invitationals.

The ratings can be calculated iteratively. Start by setting the invitational adjustment term to zero and computing team ratings from the raw scores. Then use the team ratings to calculate a new set of invitational ratings, recompute the team ratings with the updated adjustments, and repeat until the ratings converge.
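
A rough sketch of that iteration in Python, under my own assumptions about the data layout and the best-5 handling (not taken from the actual spreadsheet):

import math

# results[team] = list of (invitational, rank, num_teams); illustrative layout, not real data
def rate(results, n_best=5, n_iters=20):
    inv_rating = {}  # invitational ratings; empty on the first pass, so the adjustment term is zero
    team_rating = {}
    for _ in range(n_iters):
        emax = max(inv_rating.values()) if inv_rating else 1.0
        team_rating = {}
        for team, entries in results.items():
            terms = []
            for inv, rank, teams in entries:
                score = -math.log2(rank / teams)  # the post fixes Teams at 64 for all invitationals
                adj = math.log2(inv_rating[inv] / emax) if inv_rating.get(inv, 0) > 0 else 0.0
                terms.append(max(score + adj, 0.0))  # each term is floored at zero
            team_rating[team] = sum(sorted(terms, reverse=True)[:n_best])  # best 5 results count
        # an invitational's rating is the sum of the ratings of the teams that attended it
        inv_rating = {}
        for team, entries in results.items():
            for inv, _, _ in entries:
                inv_rating[inv] = inv_rating.get(inv, 0.0) + team_rating[team]
    return team_rating, inv_rating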

Expanded discussion of the algorithm: https://jaymaron.com/scioly.html#score

There are natural reasons why the score function is Score = -log_2(Rank/Teams): in the plot, this function is a straight line, and it weights results logarithmically.

The algorithm is expanded on in https://jaymaron.com/scioly.html
Last edited by jaymaron on June 22nd, 2022, 10:04 am, edited 1 time in total.
These users thanked the author jaymaron for the post:
Blesson (January 23rd, 2023, 7:17 pm)