
Article pubs.acs.org/jchemeduc

Concept Inventories: Predicting the Wrong Answer May Boost Performance

Vicente Talanquer*

Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721, United States

ABSTRACT: Several concept inventories have been developed to elicit students’ alternative conceptions in chemistry. It is suggested that heuristic reasoning may bias students’ answers in these types of assessments toward intuitively appealing choices. If this is the case, one could expect students to improve their performance by engaging in more analytical reasoning. Research has shown that analytical reasoning is activated when people experience metacognitive difficulty or conflict. This study presents the results of an intervention designed to trigger one such experience by asking students to make predictions about the wrong answers most commonly selected by unreflective students. Major findings show that this simple prompt has a significant positive impact on students’ answers, regardless of their academic performance in the course. The effects, however, are not uniform across different topics.

KEYWORDS: High School/Introductory Chemistry, First-Year Undergraduate/General, Chemical Education Research, Misconceptions/Discrepant Events, Assessment, Metacognition

FEATURE: Chemical Education Research



INTRODUCTION

A significant body of research in science and chemistry education in the past four decades has focused on eliciting and characterizing students’ initial conceptions about a wide variety of concepts and ideas.1−9 These studies have revealed that students’ conceptions prior to instruction are not necessarily aligned with normative scientific ideas, and that some of these non-normative conceptions (alternative conceptions) are pervasive among diverse groups of students and quite resistant to change.

There has been considerable debate about the origin of students’ alternative conceptions, the cognitive mechanisms through which these conceptions evolve, and the types of teaching strategies that enable and facilitate conceptual change.10−13 Despite these discussions, a variety of assessment instruments have been developed to help diagnose students’ prior ideas and to evaluate the effect of different instructional interventions on students’ conceptual understanding.6−9 These formative assessment instruments tend to take the form of concept inventories14 that target a core set of concepts within a specific domain (e.g., particulate nature of matter,15 chemical bonding,16 chemical equilibrium17).

Concept inventories are often multiple-choice assessment tools in which item distractors are carefully crafted from common answers given by students during individual interviews.9 Students completing these inventories are expected to select the options that best align with their current knowledge. There is discussion, however, about the different factors that may affect students’ choices. Novice students’ responses are rather sensitive to contextual features, and thus question type, setting, and language may influence their decisions.18,19

We have proposed that student thinking in chemistry is strongly influenced by implicit assumptions about the nature of chemical entities and processes (e.g., assuming that the macroscopic properties of a substance are an average of the properties of its submicroscopic components) and by heuristic reasoning strategies that reduce cognitive effort when making decisions (e.g., focusing on the effects of a single variable).20−22 This type of intuitive reasoning is guided by surface features of the system under analysis and by quick mental associations that facilitate the building of inferences.23 According to this framework, students’ answers to a concept inventory should be sensitive to interventions that lead students to invest more time evaluating options, reflecting on their choices, or adopting a more analytical approach in their decision making.

The results summarized in this paper present evidence of the impact of one such intervention on student performance. In particular, this study explores the effects of asking college general chemistry students to answer a question and then predict the answer commonly selected by students who get that same question wrong because they do not carefully reflect on their answer or are misguided by their intuition. The findings of this investigation show that this simple prompt has a significant positive impact on students’ answers, regardless of their academic performance in the course.

© XXXX American Chemical Society and Division of Chemical Education, Inc.



Journal of Chemical Education. DOI: 10.1021/acs.jchemed.7b00427. J. Chem. Educ. XXXX, XXX, XXX−XXX
Received: June 17, 2017. Revised: August 17, 2017

THEORETICAL FRAMEWORK

Current research on judgment and decision-making is often guided by the suggestion that the human mind engages in two different types of reasoning labeled Type 1 and Type 2.23−31 Type 1 reasoning invokes processes that are automatic, fast, and independent of working memory. These types of cognitive processes are triggered and applied autonomously, without the need for controlled attention. Type 1 reasoning corresponds to our common sense notion of intuitive thinking. Type 2 processes, on the other hand, are slow, sequential, and require working memory to function. Their application demands cognitive effort and conscious intervention. They constitute what we commonly identify as analytical or reflective thinking.

Type 1 processes seem to be the mind’s default response when confronting novel problems or situations or when working under conditions of limited time, knowledge, or motivation. This type of reasoning often results in appropriate judgments and decisions in diverse contexts,25,26 but it can also bias our choices.27 For example, people often judge statements that are easier to read as truer than those written using uncommon fonts or words and commonly make choices based on surface features that are not necessarily relevant but are easier to process or recognize (e.g., selecting between two products based on packaging appeal).28

Many Type 1 processes are shortcut reasoning strategies, often called heuristics, that reduce cognitive load by, for example, reducing the number of cues used in making a decision or providing rules of thumb for how and where to look for relevant information.31 Heuristics are mostly effective cognitive tools that efficiently use information readily available in the environment to make choices. They are, however, responsible for systematic errors in judgment, particularly when relevant decision-making cues are implicit rather than explicit or unknown to people. Making good judgments and decisions when facing academic tasks often requires overriding heuristic reasoning and replacing it with analytical (Type 2) cognitive processes.

The mind intervenes to inhibit or modify the responses automatically generated by Type 1 processes when these answers are somehow judged unsatisfactory.32 Interventions by the analytical system are more likely to occur when people are metacognitive and have relevant knowledge, high cognitive ability, or a disposition to be reflective. There are particular conditions, actions, or strategies, however, that seem to activate analytic forms of reasoning and reduce the likelihood of Type 1 responses. Research studies suggest that metacognitive experiences of difficulty may act as an alarm that triggers Type 2 processes. These experiences of difficulty may be orchestrated by, for example, providing people with information in degraded form32 or presenting them with more than one intuitive choice to create conflict in decision making.33

Metacognitive awareness about how easy or difficult it is to generate an initial response influences the extent of analytic engagement; in general, perceptions of difficulty enhance Type 2 processes. For example, asking individuals to rate their confidence in their answers34 or to assess the rightness of their responses35 often results in longer rethinking times and an increased probability of normative responses when confidence or feeling of rightness is low. Analytical reasoning may also be bolstered by asking people to adopt a different perspective (e.g., taking an outsider’s view,36 considering opposite hypotheses37) when making judgments and decisions, as this prompts them to reflect more carefully on available information and expand their search for supporting evidence.

RESEARCH QUESTION

This investigation was guided by the following research question:

• What is the effect on student performance on a concept inventory of asking them to predict the answers most commonly chosen by students who get each given question wrong because of lack of reflection or misguided intuition?

The core hypothesis was that if students’ answers to the questions in the concept inventory were biased by heuristic reasoning, prompting participants to more carefully discriminate between options and hinting at the pitfalls of unreflective reliance on intuition would enhance analytical reasoning, increasing the selection of normative responses.

RESEARCH METHODS

Context, Participants, and Data Collection

This study was carried out at a large, research-intensive public university with over 34,000 undergraduate students, 41% of them from under-represented groups. Study participants were science and engineering majors completing the second semester of an introductory general chemistry course. All students in this course are asked to complete an exit questionnaire as part of the participation activities in the class. At the time of this study, this multiple-choice assessment included 13 questions selected from prior research studies22,38,39 and existing concept inventories15,40 to explore the extent to which students finishing the general chemistry program apply normative chemistry ideas versus intuitive conceptualizations to explain and predict chemical properties and phenomena. In particular, this “intuitive chemistry inventory” (ICI) sought to assess performance in four areas:

i) Inheritance (3 items),20,41 referring to the intuitive tendency to attribute macroscopic properties to submicroscopic components (e.g., assuming that a sulfur atom has the same density and boiling point as a macroscopic sample of the substance);
ii) Additivity (3 items),38 referring to the intuitive tendency to assume that the properties of a composite entity (molecule, compound) are an average of the properties of its components (atoms, elements) (e.g., assuming that the product of the reaction between a blue and a yellow substance will always be green);
iii) Matter Tracking (4 items),40−42 referring to over-reliance on surface features and appearances to track matter during processes (e.g., assuming that the mass of a closed system will decrease when a gas is formed);
iv) Energy Investment (3 items),39 referring to the implicit conceptualization of molecules as springy structures that take effort to assemble but give back energy when broken apart (e.g., assuming that it takes more energy to synthesize a bigger molecule than a smaller molecule).

Although the different items included in each of the categories in the ICI have been validated by other researchers using various strategies,15,22,38−40 the instrument as a whole has not been the subject of a rigorous study to evaluate its robustness. Thus, its results should be interpreted with caution. The ICI is part of a battery of assessments used to collect data about student outcomes in our general chemistry program, and interested readers may contact the corresponding author for a copy of this tool.

At our institution, students complete this questionnaire online, in their own time, and without any supervision. On average, over two-thirds of all the students finishing the second semester of general chemistry have completed this questionnaire in the past three years. Data for this study were collected in the spring semesters of 2016 and 2017. In these two semesters, a majority of the students completed the standard ICI, while a subsection of them completed an extended version of this assessment. This latter version included the same multiple-choice questions as the standard questionnaire, but each of the items was followed by a paired question that included the same distractors but began with the statement “Select the option below that you think is most commonly chosen by students who get this question wrong because they do not carefully reflect on what the question is asking or are misguided by their intuition.” Each set of paired questions was presented on a single page. In both types of questionnaires, students were allowed to change any of their answers as they worked on each question and before submitting their final responses. Data collected in each of the two conditions, standard ICI and modified ICI with paired items (paired ICI), included responses from students in different course sections taught by different instructors.

Data Analysis

Although, on average, students assigned to each of the conditions had similar academic characteristics, they had a choice in completing the task and it was not possible to ensure equivalence in academic performance of the students who actually answered each of the questionnaires (we could not control for the types of students who decided to complete the assignment). For this reason, results on the standardized final exam were used to distribute ex post facto actual participants within each condition into equivalent groups with different grade averages spanning the A to F (failing) letter grade scale (see Table 1).

At our institution, all general chemistry instructors follow the same curriculum and implement common midterms and final exams. The full-year ACS standardized conceptual exam is used as the final exam in all sections during the second semester of the general chemistry course. The correlation coefficient between results in the ACS conceptual exam and the average grade earned in the course (which also includes contributions from midterm exams, homework, class participation, and laboratory work) at our university is typically greater than 0.7. Regression analyses of results in the ACS exam since it was first implemented four years ago show that the section in which students are enrolled is not a statistically significant variable on final exam grade. Consequently, performance in the ACS standardized exam was judged to be an appropriate variable to build equivalent groups across the two conditions in terms of fundamental knowledge of chemistry.

Given that the distribution of grades in each of the subgroups included in Table 1 was non-normal, a Mann−Whitney U test was used to compare the mean ranks of the two subgroups within each letter grade category. Nonsignificant differences were found for all groups. Additionally, group equivalence was tested using a two one-sided test (TOST) using the criteria suggested by Lewis and Lewis43 to define tolerance intervals (20% of the pooled standard deviation). All groups in Table 1 passed this equivalence test. Although the TOST assumes normality in the distribution of data, it has been shown to produce reliable results for non-normal distributions that are not highly skewed, particularly when tolerance intervals are conservative.44 These conditions are satisfied by the proposed groupings in which grade distributions exhibit skew and kurtosis absolute values lower than 1.5 in all cases, and equivalence was tested with tolerance values lower than 2% of the mean for the compared subgroups.

The results presented in the next section are based on the analysis using descriptive statistics of student responses to the ICI questionnaire in each of the subgroups in Table 1.

Table 1. Comparison of Mean Scores in the Final ACS Conceptual Exam and Associated Standard Deviation for Students within Each Subgroup

questionnaire       | subgroup A       | subgroup B       | subgroup C       | subgroup D       | subgroup F
                    | N    M     SD    | N    M     SD    | N    M     SD    | N    M     SD    | N    M     SD
standard, N = 1076  | 72   91.3  2.96  | 163  82.7  2.23  | 276  73.5  3.32  | 212  63.3  2.30  | 353  49.2  7.79
paired, N = 397     | 33   91.3  3.40  | 71   82.7  2.74  | 85   73.5  2.76  | 124  63.4  3.68  | 84   49.0  5.28

Table 2. Comparison of Mean Scores in the ICI and Associated Standard Deviation for Students within Each Subgroup, Created Using Results in the ACS Final Exam Grades

questionnaire       | subgroup A       | subgroup B       | subgroup C       | subgroup D       | subgroup F
                    | N    M     SD    | N    M     SD    | N    M     SD    | N    M     SD    | N    M     SD
standard, N = 1076  | 72   63.0  21.0  | 163  53.8  18.9  | 276  45.2  19.3  | 212  37.1  17.4  | 353  27.4  14.9
paired, N = 397     | 33   76.5  17.4  | 71   62.9  19.2  | 85   53.8  19.5  | 124  47.0  19.1  | 84   38.7  20.8
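The need for within-subgroup comparisons described under Data Analysis can be illustrated with the subgroup sizes alone. The following is a minimal sketch, not part of the study's analysis: the counts are copied from Table 1, while the dictionary and function names are illustrative assumptions.

```python
# Subgroup sizes (N) copied from Table 1; the variable names are illustrative.
STANDARD_N = {"A": 72, "B": 163, "C": 276, "D": 212, "F": 353}
PAIRED_N = {"A": 33, "B": 71, "C": 85, "D": 124, "F": 84}


def subgroup_shares(counts):
    """Return each subgroup's fraction of the condition total."""
    total = sum(counts.values())
    return {grade: n / total for grade, n in counts.items()}


standard_shares = subgroup_shares(STANDARD_N)
paired_shares = subgroup_shares(PAIRED_N)

if __name__ == "__main__":
    # Print the share of each letter grade subgroup in both conditions.
    for grade in STANDARD_N:
        print(f"{grade}: standard {standard_shares[grade]:.1%}, "
              f"paired {paired_shares[grade]:.1%}")
```

The two conditions differ in their mix of subgroups (subgroup F is about a third of the standard condition but roughly a fifth of the paired condition), which is consistent with the decision to compare ICI scores within letter grade subgroups rather than across whole conditions.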



MAJOR FINDINGS

As described in the previous section, study participants within each condition (standard and paired ICI) were distributed into five different subgroups based on their score in the ACS final exam. Table 2 summarizes results in the ICI for each subgroup. These results are displayed graphically in Figure 1 to make major data trends more explicit. The data show a statistically significant positive effect on student performance in the ICI of including an additional prompt asking students to adopt the perspective of a student who gets any given question wrong because of lack of reflection or over-reliance on intuition. All subgroups of students improved in their ICI scores with medium effect sizes as measured by Cohen’s d: A (0.67), B (0.55), C (0.45), D (0.48), and F (0.70). In most cases, score gains exceeded half of a standard deviation, with the largest effect observed for students with the weakest performance in the ACS final exam.

Data trends in Figure 1 show that scores in the ICI were correlated with performance in the ACS final exam for both conditions. The actual value of the Pearson’s correlation coefficient (r) for each data set revealed a moderate positive relationship between ICI and ACS exam scores in both cases (ACS Exam-Standard ICI r = 0.53; ACS Exam-Paired ICI r = 0.50).
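As a rough illustration of how an effect size of this kind can be obtained from summary statistics, the sketch below computes Cohen's d for each subgroup from the N, M, and SD values in Table 2. The pooled standard deviation formula is an assumption (a common convention), since the exact computation used in the study is not stated; values recomputed this way from rounded table entries need not reproduce the reported ones exactly.

```python
from math import sqrt

# (N, mean, SD) per subgroup, copied from Table 2.
STANDARD = {"A": (72, 63.0, 21.0), "B": (163, 53.8, 18.9),
            "C": (276, 45.2, 19.3), "D": (212, 37.1, 17.4),
            "F": (353, 27.4, 14.9)}
PAIRED = {"A": (33, 76.5, 17.4), "B": (71, 62.9, 19.2),
          "C": (85, 53.8, 19.5), "D": (124, 47.0, 19.1),
          "F": (84, 38.7, 20.8)}


def cohens_d(group1, group2):
    """Cohen's d using a pooled SD (assumed convention, not confirmed by the source)."""
    n1, m1, s1 = group1
    n2, m2, s2 = group2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    return (m2 - m1) / sqrt(pooled_var)


# Positive values mean the paired condition scored higher than the standard one.
effect_sizes = {g: cohens_d(STANDARD[g], PAIRED[g]) for g in STANDARD}

if __name__ == "__main__":
    for grade, d in effect_sizes.items():
        print(f"subgroup {grade}: d = {d:.2f}")
```

Under this convention every subgroup falls in the range usually described as a medium effect, in line with the qualitative claim in the text.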


Figure 1. Percent scores in the ICI for students in different subgroups who completed the standard or the paired questionnaires. The subgroup label (A, B, C, D, F) is indicative of the students’ score in the ACS final exam (Table 2). Standard error bars for each data point are included.
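The error bars mentioned in the caption can be approximated from Table 2. This sketch assumes that "standard error" means the standard error of the mean, SD/√N, which the source does not spell out; the names used are illustrative.

```python
from math import sqrt

# (N, SD) of ICI percent scores per subgroup, copied from Table 2.
STANDARD = {"A": (72, 21.0), "B": (163, 18.9), "C": (276, 19.3),
            "D": (212, 17.4), "F": (353, 14.9)}
PAIRED = {"A": (33, 17.4), "B": (71, 19.2), "C": (85, 19.5),
          "D": (124, 19.1), "F": (84, 20.8)}


def standard_error(n, sd):
    """Standard error of the mean for a group of size n with sample SD sd."""
    return sd / sqrt(n)


error_bars = {
    condition: {grade: standard_error(n, sd) for grade, (n, sd) in data.items()}
    for condition, data in (("standard", STANDARD), ("paired", PAIRED))
}

if __name__ == "__main__":
    for condition, bars in error_bars.items():
        for grade, se in bars.items():
            print(f"{condition:8s} subgroup {grade}: SE = {se:.2f}")
```

Because the paired subgroups are smaller, their error bars come out wider than those of the corresponding standard subgroups under this assumption.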

As shown in Figure 2, the positive impact of the inclusion of the paired question was not uniform across the different categories of items in the ICI. Most of the statistically significant differences between the two data sets were associated with the “Additivity” and “Energy Investment” categories, with medium effect sizes for all but one (Additivity, group B) of the subgroups. Differences in performance in “Inheritance” questions were only statistically significant for students in group F (lowest ACS exam scores), whereas no statistically significant differences were observed in any case for the “Matter Tracking” items.

Figure 2. Percent scores in the four main categories of items in the ICI for students in different subgroups based on ACS exam grades (Table 2) who completed the standard or the paired questionnaires. Standard error bars for each data point are included.

DISCUSSION AND IMPLICATIONS

The results of this investigation show the significant impact that a small intervention designed to affect students’ reliance on heuristic reasoning may have on their performance on a concept inventory. Given the nature of this research, it is not possible to ascertain why the presence of the additional prompt resulted in a greater likelihood of selecting normative responses. The reasons may be diverse, depending on individual student characteristics. Based on prior research,32−37 one can speculate that the prompt may have led some students to rethink their own answers, invest more time discriminating between options, or question their initial responses, particularly in those cases in which students were less certain about their choices. The prompt may have raised awareness of the potential presence of intuitive “traps” or triggered conflict between intuitive and learned responses. In any case, the results suggest that students’ answers when working on concept inventories should be taken with caution as they may not fully elicit what students know because their choices might be biased by heuristic reasoning. Some authors have made similar claims about the answers given by students to some types of questions found in concept inventories used in physics courses.45,46

The findings of this study also suggest that the benefits of somehow prompting students to be more reflective and adopt a more analytical stance may not extend to all types of questions or content areas. Major score gains in the paired ICI were mostly confined to two categories of questions (Additivity and Energy Investment), whereas the effects on the other two areas were considerably smaller. One can hypothesize that benefits will be larger in those areas in which students have developed some latent normative understandings that can compete with intuitive ideas if properly prompted. If those latent understandings are not present, it is unlikely that extended reflection will lead students to the correct answers. It could also be that some types of intuitive initial responses are triggered more easily and more widely among diverse types of students. It is worth noticing that the largest effects of the intervention corresponded to those sets of questions in which the average student performance in the standard ICI was the weakest. This may indicate that heuristic initial responses for these sets of questions (Additivity and Energy Investment) were more prevalent among all types of students, leading to lower average scores for all subgroups in the standard ICI but increasing the pool of students who could be positively affected by the additional prompt in the paired ICI.

One cannot exclude the possibility that the different effects observed between categories of questions (i.e., Additivity and Energy Investment versus Inheritance and Matter Tracking) may be due more to the specific questions that are included in the ICI than to the content that they target. However, average percent gains were consistently above 10 points for each of the questions in the Additivity and Energy Investment categories, while they were consistently below 10 points for each of the questions in the other two areas. Although there were variations in the gains per question within any given category, gain differences between categories were consistently larger than within categories. This suggests that the nature of the subject matter may have had a larger influence on student reasoning than the specific nature of the questions chosen in each area.

The ICI includes questions that explore the extent to which students rely on intuitive knowledge when making sense of chemical properties and phenomena. Thus, the results of this study may not necessarily apply to students working on typical academic assessments (e.g., exams, quizzes). There are, however, studies that suggest that students benefit from systematically engaging in metacognitive reflection when answering questions in traditional exams.34 For example, students tend to perform better if they rank their confidence in each answer in a traditional multiple-choice test and then invest time reflecting on and revising only those responses in which they have low confidence. These results suggest that heuristic reasoning may also be at play in these cases.

Assessing students’ understanding is challenging, particularly when using multiple-choice tests. Novice learners do not have a well-connected and coherent knowledge structure in chemistry, and their judgments and decisions are very sensitive to what information is presented and how. Their intuitive beliefs are often cued faster than their academic knowledge, and their ability to inhibit or modify their automatic responses is limited. Consequently, their answers may not reveal their incipient or latent understandings but rather their spontaneous intuitions. If this is the case, instructors should be cautious when diagnosing student understanding with a single multiple-choice instrument. Different tools may be needed to better probe student knowledge, or instruments may have to be modified to compel students to reflect on their answers. In particular, the findings of the present study support the claim that explicitly promoting metacognitive awareness and engagement may help students boost their performance in tests, particularly when it involves real-time monitoring of their answers. This can be achieved by the inclusion of prompts that require students to look at questions from a different perspective, evaluate confidence in their responses, or compare and contrast choices.
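One way to picture the kind of prompt inclusion suggested here is to derive a paired questionnaire from a standard one, as was done for the paired ICI. The sketch below is only an illustration under stated assumptions: the `Item` class, the `with_paired_prompts` function, and the example item are hypothetical and are not taken from the ICI; only the prompt wording comes from the methods section.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Prompt wording quoted from the description of the paired ICI.
PAIRED_PROMPT = (
    "Select the option below that you think is most commonly chosen by "
    "students who get this question wrong because they do not carefully "
    "reflect on what the question is asking or are misguided by their "
    "intuition."
)


@dataclass(frozen=True)
class Item:
    """A multiple-choice item: a question stem plus its answer options."""
    stem: str
    options: Tuple[str, ...]


def with_paired_prompts(items: List[Item]) -> List[List[Item]]:
    """Return one page per item: the original item followed by a paired
    question that reuses the same options."""
    return [[item, Item(stem=PAIRED_PROMPT, options=item.options)]
            for item in items]


# Hypothetical item written only for this example (not an actual ICI question).
example = Item(
    stem="A blue substance reacts with a yellow substance. The product is:",
    options=("always green",
             "always blue or yellow",
             "not predictable from the colors of the reactants"),
)
pages = with_paired_prompts([example])
```

Keeping each item and its paired question together in one inner list mirrors the reported design choice of presenting each set of paired questions on a single page.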



AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected]

ORCID

Vicente Talanquer: 0000-0002-5737-3313

Notes

The author declares no competing financial interest.



ACKNOWLEDGMENTS

This work could not have been completed without the help of all general chemistry instructors at the University of Arizona and the participation of their students.

REFERENCES

(1) Nakhleh, M. B. Why Some Students Don’t Learn Chemistry. J. Chem. Educ. 1992, 69 (3), 191−196.
(2) Wandersee, J. H.; Mintzes, J. J.; Novak, J. D. Research on Alternative Conceptions in Science. In Handbook of Research in Science Teaching and Learning; Gabel, D. L., Ed.; Macmillan: New York, 1994; pp 17−210.
(3) Libarkin, J. C.; Kurdziel, J. P. Research Methodologies in Science Education: Assessing Students’ Alternative Conceptions. J. Geosci. Educ. 2001, 49, 378−383.
(4) Taber, K. Chemical Misconceptions: Prevention, Diagnosis and Cure: Vol. 1: Theoretical Background & Vol. II: Classroom Resources; Royal Society of Chemistry: London, 2002.
(5) Kind, V. Beyond Appearances: Students’ Misconceptions about Basic Chemical Ideas; Royal Society of Chemistry: London, 2004.
(6) National Research Council (NRC). Discipline-Based Education Research; National Academy Press: Washington, DC, 2012.
(7) Libarkin, J. Concept Inventories in Higher Education Science. In Promising Practices in Undergraduate Science, Technology, Engineering, and Mathematics Education: Summary of Two Workshops; Proceedings of the National Research Council’s Workshop Linking Evidence to Promising Practices in STEM Undergraduate Education, Washington, DC, October 13−14, 2008.
(8) Bretz, S. L. A Chronology of Assessment in Chemistry Education. In Trajectories of Chemistry Education Innovation and Reform; Holme, T., Cooper, M. M., Varma-Nelson, P., Eds.; ACS Symposium Series 1145; American Chemical Society: Washington, DC, 2013; Chapter 10.
(9) Bretz, S. L. Designing Assessment Tools to Measure Students’ Conceptual Knowledge of Chemistry. In Tools of Chemistry Education Research; Bunce, D., Cole, R., Eds.; ACS Symposium Series; Oxford University Press: Washington, DC, 2014; pp 155−168.
(10) Bretz, S. L.; Nakhleh, M. B. Piaget, Constructivism, and Beyond. J. Chem. Educ. 2001, 78 (8), 1107.
(11) Ozdemir, G.; Clark, D. B. An Overview of Conceptual Change Theories. Eurasia J. Math. Sci. Technol. Educ. 2007, 3, 351−361.
(12) Vosniadou, S., Ed. International Handbook of Conceptual Change; Routledge: New York, 2008.
(13) Docktor, J.; Mestre, J. Synthesis of Discipline-based Education Research in Physics. Phys. Rev. ST Phys. Educ. Res. 2014, 10, 020119.
(14) Schultz, M.; Lawrie, G. A.; Bailey, C. H.; Bedford, S. B.; Dargaville, T. R.; O’Brien, G.; Tasker, R.; Thompson, C. D.; Williams, M.; Wright, A. H. Evaluation of Diagnostic Tools that Tertiary Teachers can Apply to Profile their Students’ Conceptions. Int. J. Sci. Educ. 2017, 39 (5), 565−586.
(15) Yezierski, E. J.; Birk, J. P. Misconceptions About the Particulate Nature of Matter: Using Animations to Close the Gender Gap. J. Chem. Educ. 2006, 83, 954−960.
(16) Luxford, C. J.; Bretz, S. L. Development of the Bonding Representations Inventory to Identify Student Misconceptions about Covalent and Ionic Bonding Representations. J. Chem. Educ. 2014, 91, 312−320.
(17) Voska, K. W.; Heikkinen, H. W. Identification and Analysis of Students’ Conceptions Used to Solve Chemical Equilibrium Problems. J. Res. Sci. Teach. 2000, 37 (2), 160−176.
(18) Clough, E. E.; Driver, R. A Study of Consistency in the Use of Students’ Conceptual Frameworks Across Different Task Contexts. Sci. Educ. 1986, 70, 473−496.
(19) diSessa, A. A.; Gillespie, N. M.; Esterly, J. B. Coherence Versus Fragmentation in the Development of the Concept of Force. Cogn. Sci. 2004, 28, 843−900.
(20) Talanquer, V. Common Sense Chemistry: A Model for Understanding Students’ Alternative Conceptions. J. Chem. Educ. 2006, 83, 811−816.
(21) Talanquer, V. Threshold Concepts in Chemistry: The Critical Role of Implicit Schemas. J. Chem. Educ. 2015, 92, 3−9.
(22) Talanquer, V. How Do Students Reason about Chemical Substances and Reactions? In Concepts of Matter in Science Education; Tsaparlis, G., Sevian, H., Eds.; Springer: Dordrecht, 2013; pp 331−346.
(23) Morewedge, C. K.; Kahneman, D. Associative Processes in Intuitive Judgment. Trends Cognit. Sci. 2010, 14, 435−440.
(24) Gilovich, T., Griffin, D., Kahneman, D., Eds. Heuristics and Biases: The Psychology of Intuitive Judgment; Cambridge University Press: Cambridge, UK, 2002.
(25) Todd, P. M.; Gigerenzer, G. Précis of Simple Heuristics that Make Us Smart. Behav. Brain. Sci. 2000, 23, 727−780.
(26) Gigerenzer, G.; Gaissmaier, W. Heuristic Decision Making. Annu. Rev. Psychol. 2011, 62, 451−482.
(27) Kahneman, D. Thinking, Fast and Slow; Farrar, Straus and Giroux: New York, 2011.
(28) Talanquer, V. Chemistry Education: Ten Heuristics to Tame. J. Chem. Educ. 2014, 91 (8), 1091−1097.
(29) Evans, J. St. B. T. Dual-Processing Accounts of Reasoning, Judgment, and Social Cognition. Annu. Rev. Psychol. 2008, 59, 255−278.
(30) Evans, J. St. B. T.; Stanovich, K. E. Dual-Process Theories of Higher Cognition: Advancing the Debate. Perspect. Psychol. Sci. 2013, 8, 223−241.
(31) Shah, A. K.; Oppenheimer, D. M. Heuristics Made Easy: An Effort-Reduction Framework. Psychol. Bull. 2008, 134, 207−222.
(32) Alter, A. L.; Oppenheimer, D. M.; Epley, N.; Eyre, R. N. Overcoming Intuition: Metacognitive Difficulty Activates Analytic Reasoning. J. Exp. Psychol. Gen. 2007, 136, 569−576.
(33) Bhatia, S. Conflict and Bias in Heuristic Judgment. J. Exp. Psychol. Learn. Mem. Cogn. 2017, 43, 319−325.
(34) Couchman, J. J.; Miller, N. E.; Zmuda, S. J.; Feather, K.; Schwartzmeyer, T. The Instinct Fallacy: The Metacognition of Answering and Revising During College Exams. Metacogn. Learn. 2016, 11, 171−185.
(35) Thompson, V. A.; Prowse Turner, J. A.; Pennycook, G. Intuition, Reason, and Metacognition. Cogn. Psychol. 2011, 63, 107−140.
(36) Milkman, K. L.; Chugh, D.; Bazerman, M. H. How Can Decision Making Be Improved? Persp. Psychol. Sci. 2009, 4, 379−383.
(37) Larrick, R. Debiasing. In Blackwell Handbook of Judgment and Decision Making; Koehler, D., Harvey, A., Eds.; Blackwell: Malden, MA, 2004; pp 316−337.
(38) Talanquer, V. Students’ Predictions about the Sensory Properties of Chemical Compounds: Additive versus Emergent Frameworks. Sci. Educ. 2008, 92, 96−114.
(39) Maeyer, J.; Talanquer, V. Making Predictions About Chemical Reactivity: Assumptions and Heuristics. J. Res. Sci. Teach. 2013, 50, 748−767.
(40) Mulford, D. R.; Robinson, W. R. An Inventory for Alternate Conceptions among First-Semester General Chemistry Students. J. Chem. Educ. 2002, 79, 739−744.
(41) Talanquer, V. On Cognitive Constraints and Learning Progressions: The Case of Structure of Matter. Int. J. Sci. Educ. 2009, 31, 2123−2136.
(42) Driver, R.; Squires, A.; Rushworth, P.; Wood-Robinson, V. Making Sense of Secondary Science: Research into Children’s Ideas; Routledge: London, 1994.
(43) Lewis, S. E.; Lewis, J. E. The Same or Not the Same: Equivalence as an Issue in Educational Research. J. Chem. Educ. 2005, 82, 1408−1412.
(44) Johnston, R. J.; Duke, J. M. Benefit Transfer Equivalence Tests with Non-normal Distributions. Environ. Resource Econ. 2008, 41, 1−23.
(45) Heckler, A. F. The Ubiquitous Patterns of Incorrect Answers to Science Questions: The Role of Automatic, Bottom-Up Processes. In Psychology of Learning and Motivation: Cognition in Education; Mestre, J. P., Ross, B. H., Eds.; Academic Press: Oxford, 2011; Vol. 55, pp 227−268.
(46) Wood, A. K.; Galloway, R. K.; Hardy, J. Can Dual Processing Explain Physics Students’ Performance on the Force Concept Inventory? Phys. Rev. Phys. Educ. Res. 2016, 12, 023101.