How do the kids speak? Improving educational use of text mining with child-directed
language models (2023)
Abstract: Most educational assessments tend to be constructed in a closed-ended format, which
is easier to score consistently and more affordable. However, recent work has leveraged
computational text methods from the information sciences to make open-ended measurement
more effective and reliable for older students. The purpose of this study is to determine
whether models used by computational text mining applications need to be adapted when
used with samples of elementary-aged children.
Measuring flexibility: A text-mining approach (2023)
Abstract: In creativity research, ideational flexibility, the ability to generate ideas by shifting
between concepts, has long been the focus of investigation. However, psychometric
work to develop measurement procedures for flexibility has generally lagged behind
other creativity-relevant constructs such as fluency and originality. Here, we build
from extant research to theoretically posit, and then empirically validate, a text-mining
based method for measuring flexibility in verbal divergent thinking (DT) responses.
The empirical validation of this method is accomplished in two studies. In the first
study, we use the verbal form of the Torrance Test of Creative Thinking (TTCT) to
demonstrate that our novel flexibility scoring method strongly and positively correlates
with traditionally used TTCT flexibility scores. In the second study, we conduct a
confirmatory factor analysis using the Alternate Uses Task to show reliability and
construct validity of our text-mining based flexibility scoring. In addition, we also
examine the relationship between personality facets and flexibility of ideas to provide
criterion validity of our scoring methodology. Given the psychometric evidence presented
here and the practicality of automated scores, we recommend adopting this new method
which provides a less labor-intensive and less costly objective measurement of flexibility.
Beyond Semantic Distance: Automated Scoring of Divergent Thinking Greatly Improves
with Large Language Models (2022)
Abstract: Automated scoring for divergent thinking seeks to overcome a key obstacle to creativity
measurement: the effort, cost, and reliability of scoring open-ended tests. For a
common test of divergent thinking, the Alternate Uses Task (AUT), the primary automated
approach casts the problem as a semantic distance between a prompt and the resulting
idea in a text model. This work presents an alternative approach that greatly surpasses
the performance of the best existing semantic distance approaches. Our system fine-tunes
deep neural network-based large language models (LLMs) on human-judged responses.
Trained and evaluated against one of the largest collections of human-judged AUT responses,
with 27 thousand responses collected from nine past studies, our fine-tuned large language models
achieved up to r = .81 correlation with human raters, greatly surpassing current systems
(r = .12-.26). Further, learning transfers well to new test items and the approach
is still robust with small numbers of training labels; in some cases, without any
training at all. This work also suggests a limit to the underlying assumptions of
the semantic distance model, showing that a purely semantic approach that uses the
stronger language representation of LLMs, while still improving on existing systems,
does not achieve comparable improvements to our fine-tuned system. The increase in
performance can support stronger applications and interventions in divergent thinking
and opens the space of automated divergent thinking scoring to new areas for improving
and understanding this branch of methods.
Citation: Organisciak, P., Acar, S., Dumas, D., & Berthiaume, K. (Pre-print, 2022). Beyond Semantic
Distance: Automated Scoring of Divergent Thinking Greatly Improves with Large Language
Models. http://dx.doi.org/10.13140/RG.2.2.32393.31840
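The semantic distance framing mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that the prompt and each response have already been mapped to vectors by some text model, and that distance is taken as one minus cosine similarity, which is a common choice but not necessarily the exact formulation used in the paper. The three-dimensional vectors and the helper names `cosine_similarity` and `semantic_distance` are hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_distance(prompt_vec, response_vec):
    """Distance as 1 - cosine similarity (an assumed, common definition)."""
    return 1.0 - cosine_similarity(prompt_vec, response_vec)


# Hypothetical toy embeddings; a real system would take these from a
# trained text model rather than hand-written numbers.
TOY_VECTORS = {
    "brick": [0.9, 0.1, 0.0],          # AUT prompt object
    "build a wall": [0.8, 0.2, 0.1],   # conventional use, close to the prompt
    "paperweight": [0.3, 0.7, 0.2],    # less conventional use, further away
}

if __name__ == "__main__":
    prompt = TOY_VECTORS["brick"]
    for response in ("build a wall", "paperweight"):
        score = semantic_distance(prompt, TOY_VECTORS[response])
        print(f"{response}: {score:.3f}")
```

Under this framing, a response that lies further from the prompt in the vector space receives a higher originality score. The abstract's point is that fine-tuning on human-judged responses outperforms this purely distance-based signal.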
Applying automated originality scoring to the verbal form of Torrance Tests of Creative
Thinking (2021)
Abstract: In this study, we applied different text-mining methods to the originality scoring
of the Unusual Uses Test (UUT) and Just Suppose Test (JST) from the Torrance Tests
of Creative Thinking (TTCT)–Verbal. Responses from 102 and 123 participants who completed
Form A and Form B, respectively, were scored using three different text-mining methods.
The validity of these scoring methods was tested against TTCT’s manual-based scoring
and a subjective snapshot scoring method. Results indicated that text-mining systems
are applicable to both UUT and JST items across both forms and students’ performance
on those items can predict total originality and creativity scores across all six
tasks in the TTCT-Verbal. Comparatively, the text-mining methods worked better for
UUT than JST. Of the three text-mining models we tested, the Global Vectors for Word
Representation (GloVe) model produced the most reliable and valid scores. These findings
indicate that creativity assessment can be done quickly and at a lower cost using
text-mining approaches.
Citation: Acar, S., Berthiaume, K., Grajzel, K., Dumas, D., Flemister, C. T., & Organisciak,
P. (2021). Applying automated originality scoring to the verbal form of Torrance Tests
of Creative Thinking. Gifted Child Quarterly. Advance online publication. https://doi.org/10.1177/00169862211061874
Measuring up: Aligning creativity assessment with the Standards. In M. Runco & S.
Acar (Eds.), Handbook of Creativity Assessment. (In Press)
Citation: Dumas, D., & Grajzel, K.* (in press). Measuring up: Aligning creativity assessment
with the Standards. In M. Runco & S. Acar (Eds.), Handbook of Creativity Assessment.
Cheltenham, UK: Edward Elgar Publishing.
Presentations
Berthiaume, K. (Chair) (2022). Innovations in methodological approaches in creativity
assessments for gifted education. Symposium accepted for presentation at American
Educational Research Association Annual Meeting, San Diego, CA.
Dumas, D., & Dong, Y. (April, 2022). Observing students’ zone of proximal creativity
using a dynamic assessment procedure. In K. Berthiaume (Chair), Innovations in Methodological
Approaches in Creativity Assessments for Gifted Education. Symposium to be presented
at the annual meeting of the American Educational Research Association, San Diego,
CA.
Grajzel, K., Dumas, D., Berthiaume, K., Acar, S., & Organisciak, P. (April, 2022).
Measuring flexibility: A text-mining approach. In K. Berthiaume (Chair), Innovations
in Methodological Approaches in Creativity Assessments for Gifted Education. Symposium
to be presented at the annual meeting of the American Educational Research Association,
San Diego, CA.
Grajzel, K., Dumas, D., & Berthiaume, K. (2022). Time spent on task positively predicts
creative quality of responses for elementary students. Roundtable talk accepted for
presentation at American Educational Research Association Annual Meeting, San Diego,
CA.
Acar, S., Berthiaume, K., Grajzel, K., & Flemister, T. (2021). Automated scoring of
Torrance Tests of Creative Thinking (TTCT) verbal form for originality. Combined session
at National Association for Gifted Children Annual Conference.