Re: Musings on Test Length

Posted: January 8th, 2021, 11:41 pm
by SilverBreeze
Skink wrote: January 8th, 2021, 6:19 pm Interesting discussion. A few points:

1. I signed on to search for thoughts on the (troubling) fact that events seem entirely "open note" this season. What's the point in building a binder or cheat sheet when teams are going to have phones, tablets, or other computers available for looking up answers, a substantial fraction of which will be within reach (especially among the strongest competitors who know what they're looking for and able to find)? The average ES' tests are neither challenging nor lengthy such that I fear my teams are walking into something extremely highly scoring where the differences between scores may very well be determined by the quality of the open notes.
I agree with the underlined portion; however, to my knowledge, the only official competition with open note events is MIT, which is known for test quality and difficulty. (correct me if I'm wrong)
Skink wrote: January 8th, 2021, 6:19 pm 2. Whether questions are objective, subjective, or both, length must be there to separate scores adequately. The rule of thumb is the number of registered teams doubled is the minimum number of points there should be, but even this is limiting depending on the subject matter. The real problem here, as I see it, is the scoring system's demands for breaking all ties. See, at an invite of forty teams, there is no theoretical reason why Skink MS JV3 should score uniquely compared against some other team that came (very) inadequately prepared or not at all. Our goal is to separate the medalists and a few more places from there for the sake of the team score. There is no meaningful difference between a rank of 25 and 30, for example, but the scoring system pretends that there is. Discarding the assumption that low scores need to be separated, we shift focus to medalists, which are less of an issue.
I agree, and it's also important to give less-prepared teams a good test experience, even if they get similar scores. (you weren't implying the opposite; just wanted to clarify in case people begin attacking this as elitist)
Skink wrote: January 8th, 2021, 6:19 pm 3. I saw someone report a two hundred question test. There should almost never be a two hundred question test because this goes against the program's goals. The goal is for the ES to present a limited number of problems for teams to think critically about and solve; this requirement places a (kind of low) ceiling over how many questions can be completed in fifty minutes. Even the most brilliant and Scioly-famous participants squirm under pressure.
I'm not sure how much the "almost" is meant to encompass, but high-quality, difficult Orni tests regularly have 150-200 questions. When done well, this tests participants on interesting and useful information about birds while still engaging critical thinking. Perhaps it is the ID events themselves that need reworking to be less about only gathering, organizing, and retrieving information, but I think the underlined portion might be too broad a generalization.

Re: Musings on Test Length

Posted: January 9th, 2021, 6:53 am
by MorningCoffee
In my opinion, whether long tests are suitable depends on the event. As SilverBreeze said, excellent Orni tests tend to have around 150-200 questions, but for tests like Anatomy, that would be absolutely terrible unless most of it was multiple choice. The problem with that is that if the test writer included many difficult questions and some teams didn't know the answers, they could just guess and have a chance of getting them right, which defeats the purpose of Science Olympiad in general. The teams would not be applying their knowledge and could still get points. I also feel that longer tests favor bigger teams who can have partners, because some smaller teams don't have enough people to give everyone a partner, which makes it almost impossible to finish even half the questions on these long tests. This is fine for ID events, as most of the work is flipping through your binder and finding the answer. So the suitability of super long tests varies among events.

Re: Musings on Test Length

Posted: January 9th, 2021, 9:48 am
by Unome
MorningCoffee wrote: January 9th, 2021, 6:53 am In my opinion, whether long tests are suitable depends on the event. As SilverBreeze said, excellent Orni tests tend to have around 150-200 questions, but for tests like Anatomy, that would be absolutely terrible unless most of it was multiple choice. The problem with that is that if the test writer included many difficult questions and some teams didn't know the answers, they could just guess and have a chance of getting them right, which defeats the purpose of Science Olympiad in general. The teams would not be applying their knowledge and could still get points. I also feel that longer tests favor bigger teams who can have partners, because some smaller teams don't have enough people to give everyone a partner, which makes it almost impossible to finish even half the questions on these long tests. This is fine for ID events, as most of the work is flipping through your binder and finding the answer. So the suitability of super long tests varies among events.
Test length does tend to correlate with test quality to an extent (for various reasons not directly related to length itself, such as larger and higher-end tournaments tending to have better event supervisors). So on a longer test, guessing is not that likely to yield good results.

Also, even on a very high end lengthy test, a one-person team can still do very well. I'm confident that a good enough one-person team could finish top 20 in most events at MIT, based on my experiences in the past.

Re: Musings on Test Length

Posted: January 11th, 2021, 8:02 am
by knightmoves
Skink wrote: January 8th, 2021, 6:19 pm 1. I signed on to search for thoughts on the (troubling) fact that events seem entirely "open note" this season. What's the point in building a binder or cheat sheet when teams are going to have phones, tablets, or other computers available for looking up answers, a substantial fraction of which will be within reach (especially among the strongest competitors who know what they're looking for and able to find)? The average ES' tests are neither challenging nor lengthy such that I fear my teams are walking into something extremely highly scoring where the differences between scores may very well be determined by the quality of the open notes.
Most tests I've seen have not been open notes - they've followed the rules pretty closely (except that team members have been allowed a copy of the notes/binder each, rather than one between them.) But, of course, there's no way to enforce this when everyone is in their own home. So I think everyone's been assuming that some people are cheating, but I don't have a good guess as to how many people are cheating. Perhaps some people who have been ES in the big competitions can guess.
Skink wrote: January 8th, 2021, 6:19 pm 2. The real problem here, as I see it, is the scoring system's demands for breaking all ties. See, at an invite of forty teams, there is no theoretical reason why Skink MS JV3 should score uniquely compared against some other team that came (very) inadequately prepared or not at all. Our goal is to separate the medalists and a few more places from there for the sake of the team score.
In my opinion, there's no need to separate places after the medalists - having unique scores on each event doesn't make it less likely to have ties in the team totals. And the odds of two teams having the same pattern of scores (so that they can't be separated by number of first places etc.) are very small. And I actually think it's more honest to score teams as ties if they get the same score, rather than artificially breaking ties on largely meaningless distinctions.

I could even argue for not breaking ties for medalists, or (because in practice, competitions need to break ties because they don't have spare medals) breaking ties for the purposes of awarding medals, but still awarding points as if the places were tied.

But that isn't SO's policy.

Re: Musings on Test Length

Posted: January 11th, 2021, 11:25 am
by BennyTheJett
knightmoves wrote: January 11th, 2021, 8:02 am
Skink wrote: January 8th, 2021, 6:19 pm 1. I signed on to search for thoughts on the (troubling) fact that events seem entirely "open note" this season. What's the point in building a binder or cheat sheet when teams are going to have phones, tablets, or other computers available for looking up answers, a substantial fraction of which will be within reach (especially among the strongest competitors who know what they're looking for and able to find)? The average ES' tests are neither challenging nor lengthy such that I fear my teams are walking into something extremely highly scoring where the differences between scores may very well be determined by the quality of the open notes.
Most tests I've seen have not been open notes - they've followed the rules pretty closely (except that team members have been allowed a copy of the notes/binder each, rather than one between them.) But, of course, there's no way to enforce this when everyone is in their own home. So I think everyone's been assuming that some people are cheating, but I don't have a good guess as to how many people are cheating. Perhaps some people who have been ES in the big competitions can guess.
Skink wrote: January 8th, 2021, 6:19 pm 2. The real problem here, as I see it, is the scoring system's demands for breaking all ties. See, at an invite of forty teams, there is no theoretical reason why Skink MS JV3 should score uniquely compared against some other team that came (very) inadequately prepared or not at all. Our goal is to separate the medalists and a few more places from there for the sake of the team score.
In my opinion, there's no need to separate places after the medalists - having unique scores on each event doesn't make it less likely to have ties in the team totals. And the odds of two teams having the same pattern of scores (so that they can't be separated by number of first places etc.) are very small. And I actually think it's more honest to score teams as ties if they get the same score, rather than artificially breaking ties on largely meaningless distinctions.

I could even argue for not breaking ties for medalists, or (because in practice, competitions need to break ties because they don't have spare medals) breaking ties for the purposes of awarding medals, but still awarding points as if the places were tied.

But that isn't SO's policy.
I would like to say that as a competitor, I would either wholly disagree or agree. I think if you're going to break ties you should break all ties, but if you're not gonna break ties, don't break any. I personally find it more fun to write and grade longer and harder tests (although mostly so I can get meme answers), and I have an appreciation for the writers who put hours and hours into making their tests, as there's usually quite a difference between someone who made their test in 25 hours versus someone who made their test in 5 hours. I think the length of a "good" test depends on which event it is, the level of competition, and the quality of questions. I disagree with the "It's not 100 questions so it's a bad test" mindset because it encourages poor question writing to bump up the number of questions, which in the end will cause more ties, in my opinion. Apologies if someone already said this; I haven't been following this thread closely enough.

~Benny

Re: Musings on Test Length

Posted: January 11th, 2021, 2:49 pm
by EastStroudsburg13
BennyTheJett wrote: January 11th, 2021, 11:25 am
knightmoves wrote: January 11th, 2021, 8:02 am
Skink wrote: January 8th, 2021, 6:19 pm 1. I signed on to search for thoughts on the (troubling) fact that events seem entirely "open note" this season. What's the point in building a binder or cheat sheet when teams are going to have phones, tablets, or other computers available for looking up answers, a substantial fraction of which will be within reach (especially among the strongest competitors who know what they're looking for and able to find)? The average ES' tests are neither challenging nor lengthy such that I fear my teams are walking into something extremely highly scoring where the differences between scores may very well be determined by the quality of the open notes.
Most tests I've seen have not been open notes - they've followed the rules pretty closely (except that team members have been allowed a copy of the notes/binder each, rather than one between them.) But, of course, there's no way to enforce this when everyone is in their own home. So I think everyone's been assuming that some people are cheating, but I don't have a good guess as to how many people are cheating. Perhaps some people who have been ES in the big competitions can guess.
Skink wrote: January 8th, 2021, 6:19 pm 2. The real problem here, as I see it, is the scoring system's demands for breaking all ties. See, at an invite of forty teams, there is no theoretical reason why Skink MS JV3 should score uniquely compared against some other team that came (very) inadequately prepared or not at all. Our goal is to separate the medalists and a few more places from there for the sake of the team score.
In my opinion, there's no need to separate places after the medalists - having unique scores on each event doesn't make it less likely to have ties in the team totals. And the odds of two teams having the same pattern of scores (so that they can't be separated by number of first places etc.) are very small. And I actually think it's more honest to score teams as ties if they get the same score, rather than artificially breaking ties on largely meaningless distinctions.

I could even argue for not breaking ties for medalists, or (because in practice, competitions need to break ties because they don't have spare medals) breaking ties for the purposes of awarding medals, but still awarding points as if the places were tied.

But that isn't SO's policy.
I would like to say that as a competitor, I would either wholly disagree or agree. I think if you're going to break ties you should break all ties, but if you're not gonna break ties, don't break any.
I would agree. You can't treat medalists as if they're inherently different, because at the end of the day the difference between 3rd and 4th is the same as the difference between 15th and 16th when it comes to total team scores. If you break a tie between 3rd and 4th, but you don't do the same for 15th and 16th, you're giving the 4th place team an extra point that they wouldn't ordinarily get if they hadn't medaled. It has to be consistent.

Re: Musings on Test Length

Posted: January 11th, 2021, 3:23 pm
by knightmoves
EastStroudsburg13 wrote: January 11th, 2021, 2:49 pm I would agree. You can't treat medalists as if they're inherently different, because at the end of the day the difference between 3rd and 4th is the same as the difference between 15th and 16th when it comes to total team scores. If you break a tie between 3rd and 4th, but you don't do the same for 15th and 16th, you're giving the 4th place team an extra point that they wouldn't ordinarily get if they hadn't medaled. It has to be consistent.
I wouldn't break the 15/16 tie, and I'd award each team 15.5 points. I'm on the fence as to whether to break the 3/4 tie for the purposes of team scores, or just for medals. I'd be quite happy to award 3.5 points and the 3rd place medal to one team, and 3.5 points and the 4th place medal to the other team.

But there's no a priori reason why the difference between 3 and 4 has to be the same as the difference between 15 and 16 in the team score. There are certainly sports where the score differentials are larger for the first few places. It's a choice - it didn't come to us on stone tablets.
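
To make the tie arithmetic above concrete, here is a minimal sketch of scoring with unbroken ties, where tied teams each receive the average of the places they span (so a 15/16 tie gives 15.5 points each). It assumes that event points equal the finishing place and that a higher raw test score is better; the function name `fractional_ranks` and the sample scores are hypothetical, not anything from SO's rules.

```python
from collections import defaultdict


def fractional_ranks(scores):
    """Return {team: place points}, giving tied teams the average of their places.

    Assumption: a higher raw score is better and points equal the place.
    """
    # Group teams by raw score so that ties are easy to find.
    teams_by_score = defaultdict(list)
    for team, score in scores.items():
        teams_by_score[score].append(team)

    ranks = {}
    next_place = 1
    for score in sorted(teams_by_score, reverse=True):
        tied = teams_by_score[score]
        # The tied teams occupy places next_place .. next_place + len(tied) - 1;
        # each one gets the mean of that range.
        average_place = next_place + (len(tied) - 1) / 2
        for team in tied:
            ranks[team] = average_place
        next_place += len(tied)
    return ranks


example = {"A": 90, "B": 85, "C": 80, "D": 80, "E": 70}
print(fractional_ranks(example))
# {'A': 1.0, 'B': 2.0, 'C': 3.5, 'D': 3.5, 'E': 5.0}
```

One property of this scheme is that the points handed out in an event always sum to the same total whether or not ties occur, so leaving a tie unbroken does not add or remove points from the overall team standings.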

Re: Musings on Test Length

Posted: January 11th, 2021, 6:00 pm
by EastStroudsburg13
knightmoves wrote: January 11th, 2021, 3:23 pm
EastStroudsburg13 wrote: January 11th, 2021, 2:49 pm I would agree. You can't treat medalists as if they're inherently different, because at the end of the day the difference between 3rd and 4th is the same as the difference between 15th and 16th when it comes to total team scores. If you break a tie between 3rd and 4th, but you don't do the same for 15th and 16th, you're giving the 4th place team an extra point that they wouldn't ordinarily get if they hadn't medaled. It has to be consistent.
I wouldn't break the 15/16 tie, and I'd award each team 15.5 points. I'm on the fence as to whether to break the 3/4 tie for the purposes of team scores, or just for medals. I'd be quite happy to award 3.5 points and the 3rd place medal to one team, and 3.5 points and the 4th place medal to the other team.

But there's no a priori reason why the difference between 3 and 4 has to be the same as the difference between 15 and 16 in the team score. There are certainly sports where the score differentials are larger for the first few places. It's a choice - it didn't come to us on stone tablets.
I guess it's technically a choice to break the 3/4 tie and not break the 15/16 one. I just think it's a really bad one.

EDIT: Also, the comparison to sports has an inherently flawed premise. There are already score differentials in Science Olympiad: the difference in total team scores. The only comparison you can really make would be to a sport that uses multiple events, like swimming or track and field. I don't know how those sports handle ties but my guess is that they don't treat ties differently based on where the tie occurs in the placement (mostly because in those sports I believe points are only awarded to the top x places or so).

Re: Musings on Test Length

Posted: January 11th, 2021, 7:55 pm
by knightmoves
EastStroudsburg13 wrote: January 11th, 2021, 6:00 pm I guess it's technically a choice to break the 3/4 tie and not break the 15/16 one. I just think it's a really bad one.
Frankly I'd prefer not to break it, because I think most tie-breakers end up being pretty unsatisfactory, and the best answer you really have from the event is "these two teams did as well as each other". But unless events are going to have spare medals on hand (so you can award 2nd place to 2 teams) you have to break the tie for medal purposes, and I'd be worried that showing teams finishing in a tie but getting different medals is going to lead to more bad feeling than just knowing where you placed. But maybe not - maybe everyone would be fine with getting a 2nd place medal and 2.5 points for tying with third, but winning on a tiebreak.

I'm probably picking nits - although we've all seen cases where teams have tied in the final score (and so a tiebreak or not would affect the final rankings), I think the noise you introduce from breaking ties is probably smaller than all the other random noise sources that contribute to the score. So it probably doesn't really matter.

Re: Musings on Test Length

Posted: January 12th, 2021, 10:07 am
by EastStroudsburg13
knightmoves wrote: January 11th, 2021, 7:55 pm
EastStroudsburg13 wrote: January 11th, 2021, 6:00 pm I guess it's technically a choice to break the 3/4 tie and not break the 15/16 one. I just think it's a really bad one.
But unless events are going to have spare medals on hand (so you can award 2nd place to 2 teams) you have to break the tie
If you simplify down to this, you get my position. In my opinion, you can either not break ties and have spare medals just in case, or you break all ties. I really don't like the idea of treating medalists any differently than all of the other teams competing in the event.