Ric Luecht, professor in the Educational Research Methodology department in the School of Education, discusses the importance of assessment engineering in education, and how his experience providing consulting services to outside organizations provides opportunities to his students.
1). You are a specialist in quantitative methods, as well as educational measurement and assessment design. What led you to this career, and what do you enjoy most about your work?
I began my work in the testing field developing hands-on clerical, accounting and manufacturing skills qualification and placement tests for Manpower, Inc., an international temporary employment agency. Eventually, I finished my graduate studies in measurement and evaluation and was hired as a measurement research scientist at ACT, where I worked in a small research group doing work on automated test assembly, computer-based testing research, multidimensional item response theory, and other state-of-the-art technical psychometric research. My work at ACT led to a senior research position with the National Board of Medical Examiners in Philadelphia, where I was a senior psychometrician and eventually led the team that computerized the U.S. Medical Licensing Examination. That was my last job in the private sector before joining the ERM Department at UNCG. With each new career opportunity, I have tried to learn as much as possible, take on tough challenges and do the best job possible, and let each new experience hopefully open some new doors for me. Thankfully, my career has evolved into an extremely interesting amalgamation of roles that often lets me work with top people in many different disciplines around the world. In addition to the measurement field, my work routinely requires cross-over into fields like computer science and information technology, applied mathematics, statistics, cognitive science, and design engineering. I particularly love to work on challenging technological assessment design issues, but also on developing user-friendly tools and/or finding ways to automate operations that are tedious, time-consuming or costly. I like challenges that require me to think outside the box–not being able to depend on conventional thinking and well-established solutions.
If asked what I do, I can honestly say that I get to work every day with some very smart people on interesting and challenging problems, and I then get the chance to share some of my experiences with some very motivated and bright graduate students. What could be more enjoyable than that?
2). Could you please talk about assessment engineering, the new approach to large-scale assessment design that you invented?
I wouldn’t say that I invented assessment engineering (AE). Rather, I cobbled together some ideas and technologies borrowed from multiple disciplines, laid out a logical framework and some terminology, and then began working hands-on with various groups to develop concrete AE applications. To date, those projects have included working with the Defense Manpower Data Center on the Armed Services Vocational Aptitude Battery (ASVAB) used for military placement, the Uniform CPA Examination, The College Board’s Advanced Placement Examinations, and even the SAT. In order to understand the practical and real value of an approach like AE, it is important to first understand the status quo in educational and psychological testing. Modern test development is actually not very modern–and not just because of the continued use of multiple-choice test items for many so-called “standardized tests.” High-quality, professionally developed tests are extremely expensive to design and build, with individual test items costing from several hundred to several thousand dollars each. There are many reasons for the high costs, but the primary reasons center on expensive item writing and pretesting of items that, for security reasons, typically have a very limited “shelf life.” AE tries to reduce costs by an order of magnitude by applying some well-established design principles from modern software- and manufacturing-engineering practices to the “art” of item and test design. For example, instead of designing a small number of individual test questions or tasks that have to be individually pilot-tested with real test takers, AE designers would develop and empirically validate a system of “templates,” each capable of generating hundreds or thousands of test items that all work more or less as exchangeable units within each template family. Fundamentally, it is all about replicable and scalable design–something that has not been achieved in the past 100 years of testing practice.
While reducing costs is certainly a motivation for testing organizations to consider AE, the personal motivation behind my research is to provide instructionally useful, engaging, high-quality assessment tools that teachers and students can use on-demand in the classroom. Developing useful formative assessments is, in my opinion, one of the most important challenges to meet in reforming education in a concrete way and providing empirical evidence as to what works or does not work. Most state end-of-grade or graduation tests developed for summative or accountability purposes are wholly inadequate for any formative use. Furthermore, if we attempted to apply our current, inefficient paradigm of developing high-cost, limited shelf-life summative assessments and data-hungry statistical models to meet the demand for hundreds of thousands of formatively useful assessment tasks, we would bankrupt most state education and accountability agencies. Neither is the solution to depend on unstandardized, oftentimes poor-quality teacher-made assessments. Ultimately, the formative assessment challenge–putting engaging and useful assessment tools in the hands of teachers and students on a daily basis–requires carefully designed assessment instruments that can profile students’ strengths and weaknesses along multiple knowledge and skill dimensions as they progress through a semester or academic year, and provide actionable remediation and direct links to instruction and learning. I am highly confident that AE-based assessment systems design can meet the massive item- and assessment-task production demands of producing low-cost, high-quality assessments for daily integration with classroom activities. It’s a tall order, but we’re slowly making progress with some of the proof-of-concept activities noted above. Stay tuned over the next few years… more to come.
3). What makes UNCG’s Educational Research Methodology department unique?
I joined the ERM faculty in 1999. Since then, we have added enormously talented and energetic faculty, become far more selective in our recruitment and selection of graduate students, and today can boast a bit about a department that has grown into one of the premier measurement departments in the world. We offer a unique blend of expertise that ranges from basic research in statistical and psychometric modeling to developing practical, state-of-the-art technologies being employed by the best testing organizations around. When our students enter the work force–whether in academia or the private sector–they can be confident that they have received some of the most comprehensive and up-to-date training available in the country if not the world. More recently, we have begun to also develop a unique program of study and work experiences to create a new generation of “super program evaluators.” I know of no other program in the country that can today claim to simultaneously be one of the best in both educational measurement and program evaluation. We are just about there. Definitely unique.
4). You provide consultation services to the private testing industry – could you please talk about the services you provide and explain the importance of this role in the current educational field?
I am often asked to sit on technical advisory committees (TACs) for state assessments as well as for professional certification and licensing agencies. For example, I was on several advisory groups for the American Institute of Certified Public Accountants in past years and currently sit on quite a few state TACs, as well as the PARCC TAC involved in developing joint assessments for about 24 million K-12 students in the U.S. These organizations generally ask me to join their TACs partly because of my former life as a practicing senior psychometrician and researcher in the testing industry and partly because of my on-going research on technologies like AE, automated test assembly, multistage testing and adaptive testing. The role of a TAC is advisory insofar as offering concrete recommendations on topics ranging from test score scale maintenance over time to setting performance standards, while also recognizing practical, financial, and logistical complications and limitations. I suppose that I perform that role well since I keep getting requests to join or remain on TACs.
Like others in our department and the measurement field, I also occasionally act as a consultant for organizations on a range of topics including their designs for assessment systems and new technologies such as automated test assembly, implementing various forms of computer-based testing, and developing new assessment tasks–especially technology-enhanced items. Occasionally, I may also be called on to audit or otherwise check the technical psychometric work of a testing vendor. As noted earlier, I like to learn new things and appreciate each new opportunity to work on challenging problems. In turn, those experiences give me an integrated perspective that often transcends the particular problems of one organization and instead lets me see general patterns and potential solutions.
There are four reasons why all of this work is important to our graduate program and more generally to the field of education. First, these opportunities to work with some great organizations and, in the case of TACs, to brainstorm with some of the brightest minds in psychometrics, are a personal joy and source of enrichment for me as a researcher, and force me to continually “raise my game” to higher levels. Second, the contacts that I and other ERM faculty make on TACs, at state departments of education, and within the testing industry at large are invaluable when it comes to finding internships for our graduate students or even post-graduate employment. Third, I truly believe that assessment serves an integral purpose in reforming education–especially along the formative lines noted above. Rather than merely writing journal articles and otherwise complaining about the status quo, my participation on TACs and my consulting work with various groups allow me to have a direct voice in the reform dialog and even the occasional opportunity to help develop solutions (like AE-inspired test design currently being used by PARCC). Finally, these advisory and consulting activities often directly lead to extramural funding opportunities–whether as fee-for-service contracts or grants.
5). What do you enjoy the most about teaching and mentoring your students in the ERM department at UNCG?
From a teaching perspective, I enjoy introducing students to real-world analysis and measurement challenges. There is often a huge gap between what is in our textbooks and journal articles and what practitioners need to know to handle challenges that do not always fit theories based on unrealistic, convenient assumptions. As one of my colleagues has said, “Ric, that is a practical solution and seems to work well, but does it work in theory?” From a mentoring perspective, there are two rather different types of mentoring experiences that make what I do so worthwhile. The first is working with very talented, confident students who want to work hard, who are willing to take some personal responsibility for their own learning, and who ultimately can take the lead on a new research or technological development. In those cases, it is fun to continually challenge them and eventually see them arrive at something that really is unique and capable of making a potential contribution to the field. The other type of mentoring experience is in getting students to go far beyond their own expectations or self-belief to do a challenging dissertation. For example, they may lack a particular skill set such as computer programming or initially fail to see how they could ever do a methodological comparison of two or three very [conceptually and analytically] different statistical techniques applied to a common data set. Helping them to gather the needed confidence, getting them to push themselves beyond what they thought possible, and eventually seeing them take ownership of the idea is wonderfully rewarding to me as a professor. It really is.