Balancing Difficulty and Accessibility in Test Writing

nicholasmaurer
Coach
Posts: 384
Joined: May 19th, 2017, 10:55 am
Division: Grad
State: OH
Location: Solon, OH

Re: Balancing Difficulty and Accessibility in Test Writing

Postby nicholasmaurer » September 26th, 2018, 5:45 pm

The hardest questions on the test should consist of applying various inter-related concepts of the event to solve a problem (or series of problems).
I agree with this 100%. For anyone familiar with USMLE-style questions, I find their philosophy to be beneficial. Essentially, you present a short vignette or scenario, and then ask objective questions (classically multiple choice) that require students to interpret the scenario and apply concepts.

For example, a Designer Genes test might provide students with brief background on a disease and a pedigree for its inheritance in an example family. Students could then be asked to identify the most likely inheritance pattern, which genetic biotechnology should be used to screen for the disease, and the likelihood of illness among offspring of different family members. These questions all relate to rather basic concepts, but you would be surprised at how many students struggle with application questions.
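As a rough illustration of the offspring-likelihood arithmetic, here's a minimal Python sketch of a Punnett square (the genotype strings and function name are just illustrative, not from any rules):

```python
from itertools import product

def offspring_risk(parent1, parent2, affected="aa"):
    """Chance a child is affected, from single-gene parent genotypes
    like 'Aa' -- a Punnett square as a function."""
    children = ["".join(sorted(a1 + a2)) for a1, a2 in product(parent1, parent2)]
    return children.count("".join(sorted(affected))) / len(children)

# Two unaffected carriers of an autosomal recessive disease:
print(offspring_risk("Aa", "Aa"))  # 0.25 -- the classic 1-in-4 risk
```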
Assistant Coach and Alumnus ('14) - Solon High School Science Olympiad
Tournament Director - Northeast Ohio Regional Tournament
Tournament Director - Solon High School Science Olympiad Invitational

Opinions expressed on this site are not official; the only place for official rules changes and FAQs is soinc.org.

lumosityfan
Exalted Member
Posts: 328
Joined: July 14th, 2012, 7:00 pm
Division: Grad
State: TX
Location: Houston, TX

Re: Balancing Difficulty and Accessibility in Test Writing

Postby lumosityfan » September 26th, 2018, 8:48 pm

I've had that feeling a lot, adi, when writing tests, and I'll just say that it takes a lot of time to find the optimal question distribution and difficulty. What I try to do is spread the question difficulty throughout the test, basing the overall difficulty on the level of competition I'm writing for (for instance, at regionals I aim for a top score of 75%, but at states/nationals I'll go harder, more towards 60%). I also try to group questions whenever I can (for instance, for DSOs I have some general DSO questions but also questions about a specific DSO that go in-depth on features tying into the general concepts).
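As a rough sketch of that kind of difficulty budgeting (every per-question probability below is made up), here's a minimal Python example that estimates the top team's expected score:

```python
def expected_top_fraction(questions):
    """questions: list of (points, p_top), where p_top is an estimate of
    the chance the *best* team answers that question correctly.
    Returns the expected top score as a fraction of total points."""
    total = sum(pts for pts, _ in questions)
    return sum(pts * p for pts, p in questions) / total

# Made-up mix: 20 easy, 10 medium, 5 hard questions
test = [(2, 0.95)] * 20 + [(4, 0.70)] * 10 + [(6, 0.30)] * 5
print(f"{expected_top_fraction(test):.0%}")  # 68% -- between the regs and states targets above
```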
John P. Stevens Class of 2015 (Go Hawks!)
Columbia University Class of 2019 (Go Lions!)
Houstonian

syo_astro
Exalted Member
Posts: 592
Joined: December 3rd, 2011, 9:45 pm
Division: Grad
State: NY

Re: Balancing Difficulty and Accessibility in Test Writing

Postby syo_astro » September 26th, 2018, 9:33 pm

In addition to the event type, this depends on the difficulty you're writing for: Regs and Nats are different.

First, note that even if you've worked hard at scioly, plenty of competitors won't have, so always write easier than you think you should for regs and (various) invites. You might love hard tests, etc., but it's good to be practical. Others have discussed distinguishing top scores, but you should also make sure nobody gets a 0 and that grading is fast. Just as bad as (or worse than) bad questions are questions that *nobody* answers. At regs, teams may also have to put someone into events they haven't studied for, and you may be given little (experienced) help, so hard can be bad.

Skimming quickly:
The disadvantage of questions that build on each other is that they can be intimidating. Another is that, depending on how you write them, they can be difficult to grade (a rubric can mitigate this, but it can definitely be tough to balance). One other helpful thing when asking about interrelated concepts is making sure all (or most) parts are answerable regardless of the previous parts (see the previous paragraph).
I'd agree that distinguishing the bottom of your score distribution is usually harder than the top (assuming you've already put in the effort not to copy questions and not to make the test too easy). I'd be curious to hear from anyone reading this thread who makes their tests too easy and has trouble coming up with harder questions.

At this point, I just try to think of some doable, less jargony questions that more people can answer or at least guess at. I...also add 10 tiebreakers based on various question types (e.g. recognize a feature of a diagram, explain a concept, math, short answer, etc.)...just in case. One other thing I sometimes do is ask random friends the first few things they think of in relation to [insert event topic]. That at least gives me an idea of the basics that random people actually have some concept of.

Edit: As Lumo brought up (and others have), it's also good to work out in your head what range of scores you want. If you want scores ranging from 20 to 80 with a mean of 40, then you'd better make roughly that percentage of the test actually doable (as in, people actually having a high probability of getting those points).
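A minimal Monte Carlo sketch of that sanity check, assuming a made-up question mix and a made-up spread of team skills:

```python
import random

def simulate_field(questions, team_skills):
    """questions: (points, base_p) pairs, where base_p is the chance an
    *average* team answers; each team's skill factor scales that chance.
    Returns a percentage score per team."""
    total = sum(pts for pts, _ in questions)
    scores = []
    for skill in team_skills:
        earned = sum(pts for pts, p in questions
                     if random.random() < min(1.0, p * skill))
        scores.append(100 * earned / total)
    return scores

random.seed(0)
questions = [(2, 0.70)] * 20 + [(4, 0.40)] * 10 + [(6, 0.15)] * 5
skills = [random.uniform(0.5, 1.5) for _ in range(24)]  # a 24-team field
scores = simulate_field(questions, skills)
print(f"mean={sum(scores) / len(scores):.0f} "
      f"low={min(scores):.0f} high={max(scores):.0f}")
```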
Last edited by syo_astro on October 4th, 2018, 2:31 pm, edited 2 times in total.
B: Crave the Wave, Environmental Chemistry, Robo-Cross, Meteorology, Physical Science Lab, Solar System, DyPlan (E and V), Shock Value
C: Microbe Mission, DyPlan (Earth's Fresh Waters), Fermi Questions, GeoMaps, Gravity Vehicle, Scrambler, Rocks, Astronomy
Grad: Writing Tests/Supervising (NY/MI)

windu34
Moderator
Posts: 1340
Joined: April 19th, 2015, 6:37 pm
Division: Grad
State: FL
Location: Gainesville, Florida

Re: Balancing Difficulty and Accessibility in Test Writing

Postby windu34 » September 30th, 2018, 8:42 pm

For anyone familiar with USMLE-style questions, I find their philosophy to be beneficial. Essentially, you present a short vignette or scenario, and then ask objective questions (classically multiple choice) that require students to interpret the scenario and apply concepts.
This is a REALLY good point. In fact, I'm gonna start looking at USMLE questions and let them influence my writing style a bit more. Here is a very comprehensive guide on how to write a USMLE question, and honestly, the vast majority of it is really good advice for improving question-writing in general that we as supervisors can take advantage of, regardless of event subject.
How to write USMLE questions
USMLE Step 1 Sample Exam
Boca Raton Community High School Alumni
Florida State Tournament Director 2020
National Physical Sciences Rules Committee Member
kevin@floridascienceolympiad.org || windu34's Userpage

Circuit Lab Event Supervisor for 2020: UT Austin (B/C), MIT (C), Solon (C), Princeton (C), Golden Gate (C), Nationals (C)

Jacobi
Exalted Member
Posts: 137
Joined: September 4th, 2018, 7:47 am

Re: Balancing Difficulty and Accessibility in Test Writing

Postby Jacobi » October 4th, 2018, 2:08 pm

My take on it is this:
-Tests need to be organized around concepts. Create a checklist, and make sure you cover each rubric point thoroughly. A haphazardly made test is a poorly made test.
-Let teams know how many points each question is worth. It's easier for everyone.
-Make the test look good. (LaTeX)
-Match the topic with the question type. An example from Chem Lab: a physical properties ID question is best as an MCQ, while an acid-base equilibrium question works best as free-response.
-True/False is just messed up.
-Station formats are good. Say there are 10 teams in the block (an average count). Set up 10 stations, give each team 3 minutes to take notes at each station, rotate everyone through every station, then give the rest of the time to complete responses. It requires fast thinking and it keeps competitors from becoming lethargic. :| :|
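A minimal sketch of that rotation, using the 10-team/10-station/3-minute numbers above (the scheduling rule here is just one reasonable choice):

```python
def rotation_schedule(n_teams=10, n_stations=10, minutes_each=3):
    """One way to run the rotation described above: in round r,
    team t sits at station (t + r) mod n_stations."""
    rounds = [{t: (t + r) % n_stations for t in range(n_teams)}
              for r in range(n_stations)]
    return rounds, n_stations * minutes_each  # schedule, note-taking minutes

rounds, minutes = rotation_schedule()
print(f"{minutes} min of note-taking; round 1 seats team 0 at station "
      f"{rounds[0][0]}")
```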
I'm not an experienced test writer.

Unome
Moderator
Posts: 4130
Joined: January 26th, 2014, 12:48 pm
Division: Grad
State: GA
Location: somewhere in the sciolyverse

Re: Balancing Difficulty and Accessibility in Test Writing

Postby Unome » October 4th, 2018, 3:15 pm

-Tests need to be organized around concepts. Create a checklist, and make sure you cover each rubric point thoroughly. A haphazardly made test is a poorly made test.
-Let teams know how many points each question is worth. It's easier for everyone.
-Make the test look good.
-Match the topic with the question type. An example from Chem Lab: a physical properties ID question is best as an MCQ, while an acid-base equilibrium question works best as free-response.
-True/False is just messed up.
Agree on all of these.
LaTeX
Someone who writes a poorly-formatted test in Word is still going to write a poorly-formatted test in LaTeX.
-Station formats are good. Say there are 10 teams in the block (an average count). Set up 10 stations, give each team 3 minutes to take notes at each station, rotate everyone through every station, then give the rest of the time to complete responses. It requires fast thinking and it keeps competitors from becoming lethargic. :| :|
Station tests tend to require faster thinking only because test writers tend to assume each station allows more time than it actually does (for better or worse). In my opinion, if the event format or the questions don't need stations, they're just extra effort that distracts from grading more efficiently.
Userpage
Chattahoochee High School Class of 2018
Georgia Tech Class of 2022

Opinions expressed on this site are not official; the only place for official rules changes and FAQs is soinc.org.

fishman100
Exalted Member
Posts: 478
Joined: January 28th, 2011, 1:26 pm
Division: Grad
State: VA

Re: Balancing Difficulty and Accessibility in Test Writing

Postby fishman100 » October 7th, 2018, 11:38 am

Little late to the party, but I wrote the VA Dynamic Planet regs/states tests this past year and saw a pretty even spread at states. To put some numbers behind that: the mean was 59/115, the high 87/115, the low 23/115, the stdev 18, and 3 of 24 teams were within 1 point of each other. I would've liked to see a higher mean and a lower stdev, but I don't think this spread is horrific by any means. I'm not a subject-matter expert or veteran test writer, but I thought I'd throw in my 2 cents.
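For anyone who wants to run the same numbers, a minimal Python sketch of those summary stats (the score list below is hypothetical, standing in for the real states data):

```python
import statistics

def summarize(scores, max_points=115):
    """The summary stats quoted above, from a raw score list."""
    return {
        "mean": round(statistics.mean(scores), 1),
        "high": max(scores),
        "low": min(scores),
        "stdev": round(statistics.stdev(scores), 1),  # sample stdev
        "pct_of_max": round(100 * statistics.mean(scores) / max_points),
    }

# Hypothetical 24-team list standing in for the real states scores
scores = [23, 31, 38, 40, 44, 47, 51, 52, 55, 57, 58, 59,
          60, 61, 63, 64, 66, 69, 72, 74, 77, 80, 84, 87]
print(summarize(scores))
```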

1) If you want to obtain a good spread, you need a few basic questions that level the playing field for everyone, and then gradually add questions that force students to problem-solve or think outside the box. Adding "basic questions" is good because it allows everyone to (hopefully) get 10-20% of the total points, and teams won't be frustrated by a test composed entirely of curveballs and overly difficult or abstract questions.

2) Last year, DP had a focus on problem solving, and I fully embraced that for 2 reasons. First, the majority of DP tests I've taken or seen are incredibly long multiple-choice tests with a few short answers mixed in. I always hated these kinds of tests as a competitor because they were tiring, boring, and didn't really enhance my problem-solving ability or thought process--especially compared to the engineering events, which I also competed in heavily. Second, I believe that a team that can demonstrate creativity in answering some pretty tough, open-ended problems deserves to win over a team that memorized a bunch of facts. This philosophy has been beaten into me by my professors, and I really agree with it. On a college midterm, you'll probably get lots of partial credit if you reach the wrong numeric answer with the right logic. I try to make my tests follow a similar style: points are awarded for a solid thought process, not necessarily the correct answer.

I made my test like an Astro test; each group of questions required teams to analyze/interpret 1-3 images, primarily graphs of things like tectonic plate location over time and relative sea level (RSL) changes over time. Instead of asking what an RSL curve meant, I asked teams to extrapolate how many inches it would rise by 2020, or to justify whether an impactful geologic event that occurred at the same time as a dip in the RSL directly caused the drop or whether the timing was just a coincidence. In another test, I gave students a set of tectonic plate movement data points that included some bogus data. I asked them to throw out whichever data points they thought were bogus, construct a best-fit line from the remaining points, and justify why they removed the data they did. These questions are hard because they're unexpected to the average competitor; students aren't exactly used to doing these kinds of tasks. They also roll multiple concepts into 1 question, which is what someone else in this thread said (the USMLE questions, I think?). I steered away from calculation questions that just plug into a formula, because those don't really prove that you can problem-solve. Astro tests can be guilty of this, so if I were to write an Astro test, this is something I would be wary of. (Not trying to shade anyone who's written an Astro test by any means!)
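A rough sketch of the intended workflow for that bogus-data question (the data, cutoff, and function are all illustrative, not what students were given):

```python
import numpy as np

def fit_without_outliers(x, y, cutoff=2.0):
    """Fit a line, flag points with outsized residuals, refit without them.
    A rough stand-in for the 'throw out the bogus data' task."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    keep = np.abs(residuals) < cutoff * residuals.std()
    return np.polyfit(x[keep], y[keep], 1), np.where(~keep)[0]

# Hypothetical plate-motion data: displacement (cm) vs. time (yr)
x = np.arange(10, dtype=float)
y = 2.5 * x + 1.0 + np.random.default_rng(0).normal(0.0, 0.3, 10)
y[3], y[7] = 40.0, -15.0  # inject two bogus readings
(slope, intercept), rejected = fit_without_outliers(x, y)
print(f"slope ~ {slope:.2f} cm/yr, rejected indices: {rejected}")
```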

3) Pros of writing a problem solving-based test: you really get to see who knows what. Depending on what kinds of questions you ask, you get to look at real data sets, and that's pretty awesome. You also get a pretty solid distribution, because not every team can problem-solve as effectively as the others.

4) Cons of writing a problem solving-based test: it takes a long time to write. It can be frustrating for students to take and hard for the ES to balance questions; I asked teams for feedback after regs/states, and the most common complaint was something like "I felt like I didn't need to know anything about DP at all; I just needed to know how to understand graphs." This is more of a criticism of me and the type of questions I asked (maybe I didn't balance the types of questions well enough), but at least people are acknowledging that they need some form of knowledge beyond simply recalling information to do well. Grading also takes significantly longer than for a predominantly MC test.

5) Balance is hard and takes experience. You may think your test is incredibly easy, but the test scores might prove otherwise. I gave the same "problem solving-style" test format at both regs and states; the regs scores were significantly lower than the states scores despite the regs test being "easier," just because most teams anticipated a simple MC/short-answer test (VA DP tends to use that format). I told students that the states test would be similar in style, and I think telling them helped them study more effectively.

6) Write a "question bank" and then allocate problems to the regs and states tests. When I was caught off-guard by the low regs scores, I already had a pool of questions of varying difficulty to draw from. I looked at which regs problems gave students the most trouble and threw out similar questions from the bank. The point is, you never know how students are going to perform, so it's better to have a range of questions already written and ready to go than to be blindsided by what you thought was an "easy" regs test and have to rewrite the entire states test because you made it harder than the regs test. This was incredibly useful because I was really busy with college stuff in the spring semester when I wrote the states test.
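A minimal sketch of such a bank, with made-up entries and a simple difficulty ceiling per tournament level:

```python
import random

# Hypothetical bank entries: (id, topic, difficulty 1-5, points)
BANK = [
    ("q01", "glaciers", 1, 2), ("q02", "glaciers", 4, 6),
    ("q03", "sea_level", 2, 3), ("q04", "sea_level", 5, 8),
    ("q05", "tectonics", 1, 2), ("q06", "tectonics", 3, 4),
    # ...a real bank would have many more
]

def draw_test(bank, max_difficulty, n_questions, seed=0):
    """Pull a test from the bank under a difficulty ceiling
    (a lower ceiling for regs, a higher one for states)."""
    pool = [q for q in bank if q[2] <= max_difficulty]
    return random.Random(seed).sample(pool, min(n_questions, len(pool)))

regs_test = draw_test(BANK, max_difficulty=3, n_questions=4)
states_test = draw_test(BANK, max_difficulty=5, n_questions=4, seed=1)
```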

Sorry for the massive word vomit haha, but I have some pretty strong opinions on how a scioly test should be written when the rules stipulate an emphasis on problem solving. A few students really disliked my test because it was hard to prepare for and some questions seemed to come out of left field, but I genuinely think that if you want to obtain a good spread, this is one way to do it. It also handles the difficulty vs. accessibility dilemma fairly well, since you don't need advanced resources to study for a problem solving-focused test. And because problem-solving tests are so wide in scope, you can really tune the difficulty. Regs is a great place to try out a problem solving-style test and your test-writing style in general. Invites are even better, but regs tests hold more weight, since most teams attend multiple invites and might not think your test is representative of the "norm." It's definitely something that comes with experience, but even if you write a "bad test," it's a learning experience for all parties. Have fun with it, especially since you're writing for an event you're passionate about!
Langley HS Science Olympiad '15

syo_astro
Exalted Member
Posts: 592
Joined: December 3rd, 2011, 9:45 pm
Division: Grad
State: NY

Re: Balancing Difficulty and Accessibility in Test Writing

Postby syo_astro » October 7th, 2018, 2:02 pm

If you're bad at writing questions, or you make tests too easy or too difficult, it'll show regardless. Some question types get flak for being guessable or memorization-heavy, but:
-T/F or Agree/Disagree questions are fine questions...maybe just misunderstood. They should be used for a few questions, as they are fast to grade, and they can reveal issues with your test if people get them wrong (e.g. too long, bad writing, bad question order, etc.). They are great to try at invites.
-Similarly, "basic calculations" can be misunderstood or overused (Fishman: just expanding your point!). In Astro, using the parallax formula is accessible and also useful in general. Obviously there are ways to make it conceptual/harder ;), but that isn't always necessary.

Also, approaching scioly as "measuring thinking, looking to past/college/professors for ideas, etc." is not always useful (I get what you mean in context, Fishman, but this seems to be a fairly common thought). "Non-test" skills are tough to translate into tests, and what you liked in college/scioly won't always be meaningful for middle and high schoolers. People are different, and I'd still opt for a mix of question types.

On test banks (totally agree, btw): I recommend outlines (I might've said this before, I forget). I've heard some people don't outline, and I couldn't believe it. Outlining means you can write questions of different difficulty or style side by side for a single topic in the rules. Outlines can really speed you up and can be reused.
B: Crave the Wave, Environmental Chemistry, Robo-Cross, Meteorology, Physical Science Lab, Solar System, DyPlan (E and V), Shock Value
C: Microbe Mission, DyPlan (Earth's Fresh Waters), Fermi Questions, GeoMaps, Gravity Vehicle, Scrambler, Rocks, Astronomy
Grad: Writing Tests/Supervising (NY/MI)

fishman100
Exalted Member
Posts: 478
Joined: January 28th, 2011, 1:26 pm
Division: Grad
State: VA

Re: Balancing Difficulty and Accessibility in Test Writing

Postby fishman100 » October 7th, 2018, 4:03 pm

Also, approaching scioly as "measuring thinking, looking to past/college/professors for ideas, etc." is not always useful (I get what you mean in context, Fishman, but this seems to be a fairly common thought). "Non-test" skills are tough to translate into tests, and what you liked in college/scioly won't always be meaningful for middle and high schoolers. People are different, and I'd still opt for a mix of question types.
Agreed, I've definitely heard this opinion before, and you're right: it depends on the competitors. I know I didn't have a good problem-solving mindset when I was in HS and would have struggled with these kinds of tests, but that's exactly the mindset I hope to inspire now that I have the capacity to write a test. It all circles back to balance: how many "soft skills" questions do you add without making the test too difficult? That's why writing tests for lower-level competitions is crucial: you can gauge the difficulty of your questions by how students score.
Langley HS Science Olympiad '15

SciolyHarsh
Member
Posts: 37
Joined: May 20th, 2018, 5:44 pm

Re: Balancing Difficulty and Accessibility in Test Writing

Postby SciolyHarsh » February 3rd, 2019, 6:46 pm

Late response, but I'm a Division C participant, and our school had us write tests for Division B students. I was in charge of a Dynamic Planet (Glaciers) test, and I tried to base my test on some tests I've seen. My test was around 40 MC questions and 15 FRQs with multiple parts. The thing is, I based it on Division C tests, since Division B and C Dynamic Planet cover the same content. I'm personally not the greatest at Dynamic, so I expected my test to be somewhat easy, but the average score was around 25%, and the highest score was Longfellow's at 50%. Was this a good idea, or did I misjudge the difficulty? I could share the test as a resource, but I don't know if it was worth taking, tbh.
2017-2018 Events: Chemistry Lab, Dynamic Planet, Microbe Mission, Experimental Design, Rocks and Minerals

2018-2019 Events: Dynamic Planet, Astronomy, Sounds of Music, Circuit Lab, Geologic Mapping

