Musings on Test Length

Post by **EastStroudsburg13** » January 5th, 2021, 7:22 am

To be perfectly frank, I'm not sure how I feel about intentionally making a test too long, at least for the regionals/state/national levels. There is some merit to doing so for invitationals, as it gives teams more material to study from, but when teams don't get their tests back, you've basically turned the event into a speed round. Answer as many questions as you can within 50 minutes. I don't think that really encourages critical thinking or application of skills. There's a difference between making a test long enough to challenge the best teams, and making a test that would be impossible for even the best teams to complete. Tests should be written to reward event preparedness more than test-taking strategy. It's a fine line, but an important one.

There's also a dichotomy I'm seeing getting thrown around regarding "easy" versus "hard" questions. What I think is important to realize is that questions don't neatly fit into these categories. There's "I did some studying" easy and then there's "I'm completely unprepared for this event" easy. There are medium questions that will be hard for the unprepared, there are hard questions that someone with a decent handle on the event might be challenged by, and then there are very hard questions that even medalists might have trouble with. Especially at the regional and invitational levels, there's a wide range of preparedness levels that teams will have, and just separating questions out into "easy" and "hard" is a bit of an oversimplification, imo.

I'm not a particularly prolific test writer; I think I've written maybe 1 or 2 tests over the past 3 seasons, so I'm not going to pretend it's easy. But I do think that there are certain things that writers should always have in mind when writing a test, and that includes keeping in mind that you're not just writing for the top teams; there's a lot of middle/lower teams that should have a good experience too.

Post by **SilverBreeze** » January 5th, 2021, 7:54 am

sneepity wrote: ↑January 5th, 2021, 5:36 am However, if partners were to split the studying itself, I guess you can say both partners aren't "fully" learned in their event and are just very invested in their topic (so they can effectively split the test sections). Is this a common practice? If both partners don't study all aspects of an event, I don't really consider them proficient at the event ( of course apart from interests).

This was very common practice at my middle school - I can't speak for other schools (anyone else want to weigh in on this, feel free!). The most competitive teams tend not to do it because partner matchups tend to change between competitions, and tryouts are single-person.

I would not consider them proficient at the event either, but my idea of proficient would medal at most competitions. Maybe they didn't get an understanding of the entire event, but they still put in the time and studied. (Having a grasp of all the event's easy to medium concepts and having a deeper understanding of one topic aren't the same thing, but I don't think there's cause to favor one over the other.) I think there's a lot of room for debate here, but my main goal is giving teams that did split topics a good testing experience (I'm obviously biased from competing in middle school haha).
What are your thoughts?

sneepity wrote: ↑January 5th, 2021, 5:36 am And I think you're right when you said that question difficulty should increase as you go- it's less likely that students can answer those questions, so they can afford to not do them. Kind of like bonus points, right?

But I'm still debated on what the most "accurate" way there is to properly test a student's knowledge (or their full potential), not just smart test taking strategies like skipping questions, partners dividing it into sections (although this is a fairly common type of strategy given the time constraints and the number of questions). What's your opinion on this?

I don't know, honestly. I try to do things like:
- on the multiple choice key, I make an effort to not use C too often (a common "tip" is to guess C, not A)
- clarify what I want, so students who don't know my writing style don't have to guess my meaning (some tests are extremely vague regarding what answer they want - looking at you, CollegeBoard)
- avoid question reuse, so students with my previous tests in their test banks have less of an advantage
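For the first point, here is a minimal sketch of how a writer might check whether one letter is overused on a multiple choice key. The function name `key_letter_shares` and the sample key are made up for illustration; nothing here comes from Scilympiad or any real grading tool.

```python
from collections import Counter


def key_letter_shares(answer_key: str, choices: str = "ABCD") -> dict[str, float]:
    """Return the fraction of questions whose correct answer is each letter.

    `answer_key` is a hypothetical string with one letter per question,
    e.g. "ABCD..." for questions 1, 2, 3, 4, ...
    """
    if not answer_key:
        raise ValueError("answer_key must contain at least one answer")
    counts = Counter(answer_key.upper())
    total = len(answer_key)
    # Letters that never appear get a share of 0.0 (Counter returns 0 for them).
    return {letter: counts[letter] / total for letter in choices}


# Made-up 10-question key where C is the answer four times.
shares = key_letter_shares("ABCDCCBACD")
for letter, share in shares.items():
    print(f"{letter}: {share:.0%}")
```

With four choices, a share far above 25% for any one letter suggests that blind guessing of that letter would pay off more than it should.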

About dividing into sections - I really don't know. My stance as an ES is that I'm testing what the pair know collectively, not what each of them know, but I agree getting an understanding of the whole event is better. I think what I learned from ES-ing that I didn't know as a competitor is that a test writer's goal is to separate each team, not just pick out the best one. What are your thoughts?

EDIT, didn't see East's post:

EastStroudsburg13 wrote: ↑January 5th, 2021, 7:22 am To be perfectly frank, I'm not sure how I feel about intentionally making a test too long, at least for the regionals/state/national levels. There is some merit to doing so for invitationals, as it gives teams more material to study from, but when teams don't get their tests back, you've basically turned the event into a speed round. Answer as many questions as you can within 50 minutes. I don't think that really encourages critical thinking or application of skills. There's a difference between making a test long enough to challenge the best teams, and making a test that would be impossible for even the best teams to complete. Tests should be written to reward event preparedness more than test-taking strategy. It's a fine line, but an important one.

From my perspective, the underlined portion depends more on question quality than the length of the test. A shorter test based on recall still cannot foster understanding, while a longer test with a lot of critical thinking still does. What are your thoughts?

EastStroudsburg13 wrote: ↑January 5th, 2021, 7:22 am I'm not a particularly prolific test writer; I think I've written maybe 1 or 2 tests over the past 3 seasons, so I'm not going to pretend it's easy. But I do think that there are certain things that writers should always have in mind when writing a test, and that includes keeping in mind that you're not just writing for the top teams; there's a lot of middle/lower teams that should have a good experience too.

Do longer tests create a worse experience for middle/lower teams? (that sounds snarky but it's not)
I've always thought a longer test and a shorter test made no difference - you're doing the same number of questions either way. Is there something I'm missing? I do feel like I should make more of an effort to clarify that percentage doesn't matter (I know some students come in expecting scores comparable to a school test).
I've also noticed that many test writers are using long tests to mitigate cheating.

Post by **knightmoves** » January 5th, 2021, 8:44 am

sneepity wrote: ↑January 5th, 2021, 5:36 am Making sure kids know (I think these are what you mentioned, please correct me if I'm wrong) that you can skip questions, and that they can split up the test evenly is a huge relief in a test taker's point of view. If an ES urged me to do this, and there were enough questions in the packet- my partner and I would be glad we were given some pointers. Your perspective as an (awesome if I didn't mention!) ES makes a lot of sense, since you want to be able to give students the chance to perform to their fullest!

From a philosophical point of view, I don't like splitting a test and dividing it between event partners, because what you're demonstrating in these circumstances is mostly parallel play rather than actual teamwork. But from a practical point of view, it's basically impossible to prevent that in a test. I've seen ES try to force event partners to work together on questions by doing things like giving out a bound / stapled booklet, and telling teams they may not split the pages, so it's harder for partners to work on different parts of the test, but that doesn't seem like a very satisfactory solution.

You could achieve this better on scilympiad (or a similar platform) by only presenting one question at once to the team, so that the partners must work together on the solution (or the one that knows this area solves the problem while the other waits) but it's hard to do the same thing in a paper test.

Stations achieve this to a point on an ID event - you only present a subset of the questions at once, so there's a reduced opportunity for the partners to divide and conquer.

Post by **SilverBreeze** » January 5th, 2021, 8:51 am

knightmoves wrote: ↑January 5th, 2021, 8:44 am From a philosophical point of view, I don't like splitting a test and dividing it between event partners, because what you're demonstrating in these circumstances is mostly parallel play rather than actual teamwork. But from a practical point of view, it's basically impossible to prevent that in a test. I've seen ES try to force event partners to work together on questions by doing things like giving out a bound / stapled booklet, and telling teams they may not split the pages, so it's harder for partners to work on different parts of the test, but that doesn't seem like a very satisfactory solution.

You could achieve this better on scilympiad (or a similar platform) by only presenting one question at once to the team, so that the partners must work together on the solution (or the one that knows this area solves the problem while the other waits) but it's hard to do the same thing in a paper test.

Stations achieve this to a point on an ID event - you only present a subset of the questions at once, so there's a reduced opportunity for the partners to divide and conquer.

Hmm my thought was splitting the test is teamwork - after all, for a school project, if you each work on a separate part, you wouldn't say you two didn't work together, right? You have to both agree on who does which part.

I really don't like the idea of forcing students to work on the same question. For one, it's really hard to work on the same problem at the same time, even aside from the fact that Scilympiad doesn't support it. You can't write the second half of an explanation while your partner writes the first; you'd have to be mind-readers. Having one person wait is even worse - I came here to take a 50 minute test, not sit there for 25 minutes either doing nothing or saying a few words.

When I work on tests, we always do separate parts of the test. The teamwork aspect is checking answers with each other. "Hey, is thermocline temperature or salinity again?" Stations actually frustrate me - we're about to run out of time, and all I can do is watch my partner write stuff down. I can't help even verbally, because it takes far longer to write the answer down than to say it.

Post by **knightmoves** » January 5th, 2021, 9:00 am

SilverBreeze wrote: ↑January 5th, 2021, 7:54 am
EastStroudsburg13 wrote: ↑January 5th, 2021, 7:22 am To be perfectly frank, I'm not sure how I feel about intentionally making a test too long, at least for the regionals/state/national levels. There is some merit to doing so for invitationals, as it gives teams more material to study from, but when teams don't get their tests back, you've basically turned the event into a speed round. Answer as many questions as you can within 50 minutes. I don't think that really encourages critical thinking or application of skills. There's a difference between making a test long enough to challenge the best teams, and making a test that would be impossible for even the best teams to complete. Tests should be written to reward event preparedness more than test-taking strategy. It's a fine line, but an important one.
From my perspective, the underlined portion depends more on question quality than the length of the test. A shorter test based on recall still cannot foster understanding, while a longer test with a lot of critical thinking still does. What are your thoughts?

I think very long multiple choice tests are usually a speed round. Imagine a single person taking a 300-question multiple choice test in 50 minutes: that's a sustained rate of one question every 10s. That doesn't really give you time to think about the question, but it does give a speed merchant time to pick out some key words and make a likely guess - and with such a long test, "a likely guess" is probably a better strategy than thinking about each question in enough detail to be confident that you have it right.
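The pacing arithmetic above can be sketched in a few lines. The helper name `seconds_per_question` and the `test_takers` parameter are assumptions added for illustration; the two-person case assumes the partners split the questions evenly and never overlap, which is an idealization.

```python
def seconds_per_question(num_questions: int, minutes: float, test_takers: int = 1) -> float:
    """Average seconds available per question if the work is split evenly."""
    if num_questions <= 0 or test_takers <= 0:
        raise ValueError("num_questions and test_takers must be positive")
    total_person_seconds = minutes * 60 * test_takers
    return total_person_seconds / num_questions


# One person, 300 questions, 50 minutes: the case described above.
print(seconds_per_question(300, 50))     # 10.0

# Two partners splitting the same test evenly (idealized).
print(seconds_per_question(300, 50, 2))  # 20.0
```

Even with a perfect split between two partners, the budget only doubles to 20 seconds per question.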

The challenge is that it's harder to build challenging, in-depth critical thinking / process into multiple choice questions, and the scilympiad format (or non-specialist graders in a paper test) strongly favors multiple choice. A computer, or any human with an answer sheet, can correctly grade a multiple choice test, but even fill-in-the-blank answers need a grader who understands whether the student's answer (which differs from that on the answer sheet) is correct or not. Long answers, where there's a calculation with partial credit available, or a paragraph of text, are basically impossible to grade for anyone who isn't a subject expert.

Post by **sneepity** » January 5th, 2021, 9:03 am

Silver- I'm gonna try to understand what you said-
Let's say we have 100 teams taking a test. All of them are varying in proficiency in the event (just a random test event)
50 teams take a short test (60 questions)
50 teams take a long test (90 questions)
If the test is very good at testing their knowledge (NOT their skill at taking a test, although this would be almost impossible in reality), there's gonna be a limit to how many questions both groups get to anyway. There's gonna be a bell curve in how many questions each group gets through.
I think Silver is trying to say that regardless of the test length, there's gonna be a cap (not a distinct one but you can notice it) to the number of questions a team can complete in a given amount of time (in our case, 50 minutes).
Of course, some teams are gonna get further, but as they get further, you need to challenge them by adding harder questions (harder questions which they would be less likely to answer correctly). This way, the teams that didn't get up to the latter questions have a fair chance as well.
On the other hand, having a short test would put a cap on the knowledge and the credit students can get. You can only include so many hard questions as the hardness and number goes up!
However, there are benefits to a short test as well: you can make sure everyone has ample time to think through each question, and you can make sure everyone tries their best to answer every question and gets credit.
This is just my opinion; I'm curious to know what you think!

"I would not consider them proficient at the event either, but my idea of proficient would medal at most competitions. Maybe they didn't get an understanding of the entire event, but they still put in the time and studied. (having a grasp of all the events' easy to medium concepts and having a deeper understanding of one topic aren't the same thing, but I don't think there's cause to favor one over the other) I think there's a lot of room for debate here, but my main goal with it is giving teams that did split topics a good testing experience(I'm obviously biased from competing in middle school haha).
What are your thoughts?" (quoting silver)

I think that everyone should at least have a basic understanding of the whole event and every topic and subtopic. If you were to split up the studying, it seems to me more like self study, or just understanding a topic a lot so you can do well on the test. But people can do this for many reasons, one being extreme interest in one aspect, or they (like I said) want to do well in the event. I'm torn on this too; I'm trying to decide whether it's putting teams that went into decent depth at a disadvantage or not. I do know that strategy is a part of the competition, but it's also important to keep in mind that overall knowledge and ability should be tested primarily. Maybe it's all just a part of the competition? Not sure.

"About dividing into sections - I really don't know. My stance as an ES is that I'm testing what the pair know collectively, not what each of them know, but I agree getting an understanding of the whole event is better. I think what I learned from ES-ing that I didn't know as a competitor is that a test writer's goal is to separate each team, not just pick out the best one. What are your thoughts?" (quoting silver)

This is very interesting- now that I think about it, I find it very hard to actually test what the pair knows collectively. We can't stop them from dividing the test, etc. But I don't get what you mean by the goal of a test writer being to separate each team- sorry if I'm getting this wrong, but does it mean to evaluate each team independently?

"Answer as many questions as you can within 50 minutes. I don't think that really encourages critical thinking or application of skills. There's a difference between making a test long enough to challenge the best teams, and making a test that would be impossible for even the best teams to complete. Tests should be written to reward event preparedness more than test-taking strategy. It's a fine line, but an important one." (quoting East)

You're right, it shouldn't become a speed test. But I think that making a long test that gets progressively harder is probably the best way to avoid this. As they get nearer to the end of the test, they'll be forced to spend more time on the problems, think about them more, etc. But critical thinking can be included too! Making a short or long test wouldn't really "force" you to stop writing questions that need a longer time to think about. After all, there's a *limit* to how far someone can get. And it is true, making a test too hard can challenge everyone, but the score still remains. I agree with you, though, event preparedness is crucial and it's something that should be worked on to prevent teams from relying on test taking strategies.

It is true that you can't effectively categorize question difficulty. It's based on various factors, like the vocabulary of the question, the amount of critical thinking, etc. But I think that someone can recognize a hard question or an easy question even if they have little knowledge of something. For the sake of test writing, and the layout of the test, ES are forced to designate a question as hard or easy (especially for scoring).

I've written maybe one or two tests, and I'm not very experienced either. I just think that this is very interesting too!

Post by **sneepity** » January 5th, 2021, 9:06 am

knightmoves wrote: ↑January 5th, 2021, 9:00 am I think very long multiple choice tests are usually a speed round. Imagine a single person taking a 300-question multiple choice test in 50 minutes: that's a sustained rate of one question every 10s. That doesn't really give you time to think about the question, but it does give a speed merchant time to pick out some key words and make a likely guess - and with such a long test, "a likely guess" is probably a better strategy than thinking about each question in enough detail to be confident that you have it right.

The test taking strategy of guessing because of time constraints- I feel like tests where no strategies are used are the ones which actually test someone's knowledge of the event. What do you think?

Post by **knightmoves** » January 5th, 2021, 9:09 am

SilverBreeze wrote: ↑January 5th, 2021, 8:51 am Hmm my thought was splitting the test is teamwork - after all, for a school project, if you each work on a separate part, you wouldn't say you two didn't work together, right? You have to both agree on who does which part.

But on a project, you still need to talk about how the parts are going to join together to make a coherent whole. If we're working on a history project, say, each of us might research a different part of the topic, but what we produce at the end is one report - not two short reports stuck together.

Whereas what happens when you split the test is more like doing math homework together, and deciding that one of you will answer the odd questions and the other will answer the even questions. There's no actual working together involved - even if you agree that once you've done your questions, you'll work on the ones your partner skipped or whatever.

Post by **sneepity** » January 5th, 2021, 9:12 am

knightmoves wrote: ↑January 5th, 2021, 9:09 am
SilverBreeze wrote: ↑January 5th, 2021, 8:51 am Hmm my thought was splitting the test is teamwork - after all, for a school project, if you each work on a separate part, you wouldn't say you two didn't work together, right? You have to both agree on who does which part.
But on a project, you still need to talk about how the parts are going to join together to make a coherent whole. If we're working on a history project, say, each of us might research a different part of the topic, but what we produce at the end is one report - not two short reports stuck together.

Whereas what happens when you split the test is more like doing math homework together, and deciding that one of you will answer the odd questions and the other will answer the even questions. There's no actual working together involved - even if you agree that once you've done your questions, you'll work on the ones your partner skipped or whatever.

A better example I think is dancing (yes, I know, not related to studying, but it makes sense!)
In a duet, you will have to collaborate and work together. There's no way you can do your best if you only know your part, but you don't know your partner's! Practice both for the best result :)

Post by **knightmoves** » January 5th, 2021, 9:28 am

sneepity wrote: ↑January 5th, 2021, 9:12 am A better example I think is dancing (yes, I know, not related to studying, but it makes sense!)
In a duet, you will have to collaborate and work together. There's no way you can do your best if you only know your part, but you don't know your partner's! Practice both for the best result

Sure, but with a duet, you're dancing with your partner - it's not just the two of you independently dancing solos on the same stage at the same time.

I think we agree about dancing - but perhaps not about how much a science olympiad test is like a dance.
