The world’s largest publisher of educational textbooks and resources, Pearson, has recently extended its work into digital media and learning. As well as producing innovative new digital learning resources and platforms, Pearson is positioning itself as a major center for the analysis of educational big data. This has implications for how learning will be conceptualized in the near future, and raises big questions about how the private ownership of educational data might shape emerging understandings and explanatory theories of the learning process itself.
The Big Data Gatekeeper
Originally established in 1844, by 2014 Pearson reported revenues of $7.9 billion, with operations in more than 70 countries and over 40,000 employees. Rather than relying on its educational textbook business alone, it had significantly broadened its field of operations to include major digital platforms for online publishing, global standardized testing and assessment, data analysis and digital research, data visualization tools for data-informed policymaking, and low-cost private schooling in low-income countries. I have recently been researching Pearson’s digital expansion, and have suggested that it is becoming a kind of gatekeeper in relation to the analysis of digital data in education — with implications for the definition and measurement of learning itself.
As part of its operational expansion, in 2012, Pearson established the Center for Digital Data, Analytics and Adaptive Learning (CDDAAL), one of five centers in its Research and Innovation Network. The aim of CDDAAL is to explore how “the billions of bits of digital data generated by students’ interactions with online lessons as well as everyday digital activities can be combined and reported to personalize learning.” Its staff possess expertise in data mining, computer science, algorithm design, intelligent systems, human-computer interaction, data analytics tools and methods, and interactive data visualization.
One of CDDAAL’s publications details how it intends to develop new kinds of digital research methods to deal with educational big data:
“Once much of teaching and learning becomes digital, data will be available not just from once-a-year tests, but also from the wide-ranging daily activities of individual students … in real time. … [W]e need further research that brings together learning science and data science to create the new knowledge, processes, and systems this vision requires.”
The authors argue that big data analysis methods will enable researchers to “capture stream or trace data from learners’ interactions” with learning materials, detect “new patterns that may provide evidence about learning,” and “more clearly understand the micro-patterns of teaching and learning by individuals and groups.”
Big data methods of pattern recognition are at the heart of its activities, and Pearson ambitiously aims to use pattern recognition to identify generalizable insights into learning processes not just at the level of the individual learner but at vast scale:
“Learner interactions with activities generate data that can be analysed for patterns. … Performance in individual activities can often provide immediate feedback … based on local pattern recognition, while performance over several activities can lead to profile updates, which can facilitate inferences about general performance. … These developments have the potential to inform us about patterns and trajectories for individual learners, groups of learners, and schools. They may also tell us more about the processes and progressions of development in ways that can be generalised outside of school.”
The promise of pattern recognition methods practiced by Pearson is therefore the generation of new generalizable theories and models of cognitive development and learner progression.
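The two-level process the earlier quote describes — immediate feedback from “local pattern recognition” on a single activity, and profile updates over several activities that support inferences about general performance — can be sketched in a few lines. This is a hedged illustration only: the function names, the 0.7 feedback threshold, and the exponentially weighted update rule are hypothetical stand-ins, not Pearson’s or Knewton’s actual methods.

```python
# Hypothetical sketch of the quote's two levels of analysis: per-activity
# "local pattern recognition" giving immediate feedback, and a profile
# updated over several activities to support inferences about general
# performance. All names and numbers here are illustrative assumptions.

def local_feedback(score, threshold=0.7):
    """Immediate feedback based on a single activity's score."""
    return "move on" if score >= threshold else "review this topic"

def update_profile(profile, topic, score, weight=0.3):
    """Update a running per-topic proficiency estimate
    (an exponentially weighted moving average)."""
    prior = profile.get(topic, 0.5)  # neutral prior before any data arrives
    profile[topic] = (1 - weight) * prior + weight * score
    return profile

# A learner works through three activities:
profile = {}
for topic, score in [("fractions", 0.9), ("fractions", 0.6), ("algebra", 0.4)]:
    feedback = local_feedback(score)       # local: one activity at a time
    update_profile(profile, topic, score)  # global: accumulates across activities

# profile now holds per-topic estimates from which "general performance"
# might be inferred across activities.
```

The design point the sketch makes is that the profile, not any single activity, is what licenses generalization — which is exactly where the theoretical claims discussed below enter.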
Those insights can then be made actionable as new software-based learning products; Pearson is of course well positioned as an educational publisher to codify these insights in its own software applications for schools and colleges. As Pearson’s own data analysts articulate it, they are motivated by a “theory of action”:
“By using better data analysis techniques applied to data captured from better designed activities, we hope to build more complete and accurate models of learners’ knowledge, skills, and attributes that will provide better information to teachers and learners and provide systems that are relevant to each student’s individual proficiency levels, interests, and current states.”
Notably, Pearson has partnered with Knewton, a major learning analytics provider, to power its digital content:
“The Knewton Adaptive Learning Platform™ uses proprietary algorithms to deliver a personalized learning path for each student…. ‘Knewton adaptive learning platform, as powerful as it is, would just be lines of code without Pearson,’ said Jose Ferreira, founder and CEO of Knewton. ‘You’ll soon see Pearson products that diagnose each student’s proficiency at every concept, and precisely deliver the needed content in the optimal learning style for each. These products will use the combined data power of millions of students to provide uniquely personalized learning.’”
What Pearson and Knewton appear to be proposing is a new data-driven model of precision pedagogy — an algorithmic approach to personalizing learning based on pattern recognition processes conducted on vast databases of student learning data — but its claim is bigger than this alone.
The Educational ‘Theory Gap’
Through its big data techniques of pattern recognition, Pearson appears to be proposing that it can generate new insights into and understandings of learning itself. It has been suggested that big data means “the end of theory” or even “the death of the theorist” as the data are seen to “speak for themselves” — revealing truths about human actions and social behaviours that existing theories from the human and social sciences cannot. However, Pearson’s ambitious aim is to go beyond “post-theory” claims and instead to generate new theories from analysing the learning data it possesses. In particular, it identifies a “theory gap between the dramatic increase in data-based results and the theory base to integrate them,” and intends to use the results of data analysis to build new theories that fill it.
One question to be raised here is about how “learning” can be counted in a database. In order for anything to be entered into a database, it first needs to be sorted into a classification system. This means that for Pearson and Knewton to make algorithmic calculations about learning processes, there needs to be a precise classification scheme available in advance into which various indicators of learning can be entered. The classifications of learning it uses are drawn from “learning science,” a field itself largely defined in terms of concepts and methods from the psychological and cognitive sciences. Therefore any inferences or insights drawn from its data analyses need to be understood as pre-defined by the theoretical, conceptual and classificatory systems of this particular field of educational research.
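The dependence on a prior classification scheme can be made concrete with a small sketch. Everything here is hypothetical — the event categories are invented for illustration — but the point stands: events that fall outside the predefined scheme cannot be counted, and so remain invisible to any later pattern analysis.

```python
from collections import Counter

# Hypothetical classification scheme, fixed in advance by the analyst.
SCHEME = {"correct_answer", "incorrect_answer", "hint_requested", "video_watched"}

def count_learning_events(events):
    """Tally only the events that fit the predefined scheme; drop the rest."""
    tallies = Counter()
    unclassified = 0
    for event in events:
        if event in SCHEME:
            tallies[event] += 1
        else:
            unclassified += 1  # e.g. a peer discussion the scheme never anticipated
    return tallies, unclassified

stream = ["correct_answer", "peer_discussion", "hint_requested", "correct_answer"]
tallies, dropped = count_learning_events(stream)
# tallies counts 2 correct answers and 1 hint request; the "peer_discussion"
# event is dropped, because the scheme has no category for it.
```

Whatever inferences are later drawn from `tallies`, they can only ever reflect the categories the scheme anticipated — which is the sense in which the analysis is pre-defined by its classificatory system.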
Ultimately, Pearson is positioning itself as a big data gatekeeper in relation to the production of new knowledge about learning. It has a vast organizational, technical and expert infrastructure — in the shape of CDDAAL and its partnership with Knewton — for conducting big data analyses in education. It is seeking to use the insights it generates from such analyses to construct new conceptual models and theories of learning that it can encode into new e-learning products.
This raises a very pressing question for research in digital media and learning, since insight into learning itself now appears increasingly likely to emanate from private companies with their own proprietorial systems, intellectual property claims, and market needs. These companies are staking their claim to expertise in the conceptualization of learning through their ownership of the systems required to analyze big data.
Who Owns Educational Theory?
Who owns big data? This is the important question posed by Evelyn Ruppert in a recent short article, which details how big data are the product of the different actors and technologies involved in their generation and analysis, each of which can in many ways be seen to “own” those data. If big data are increasingly owned by private companies, then major questions follow about who owns the insights that can be extracted from them.
Few education departments in universities have the big data infrastructure needed to conduct the kinds of advanced data-scientific studies that Pearson is able to do (Stanford University’s recent commitment to learning analytics is a notable exception, but Stanford has a long-standing synergy with Silicon Valley, which is where the promise of big data originates). This means that, as big data gains credibility as a source for educational knowledge production and theorizing, legitimacy is likely to flow toward those centers able to conduct such analyses.
In other words, there’s a political economy dimension to educational theorizing as it seems to be migrating toward well-resourced commercial research centres like those of Pearson. How learning is understood, conceptualized and theorized looks increasingly to be led by for-profit actors with the in-house expertise and technical capacity to generate insights from big data, who might then stand to gain commercially by designing and patenting e-learning software resources on the basis of the theories they’ve generated — essentially a case of locking-in a theory to a specific technical innovation. Audrey Watters suggests that the technological future of education is one in which software patents become educational theory:
“This version of the future does not guarantee that these companies have developed technologies that will help students learn. But it might mean that there will be proprietary assets to litigate over, to negotiate with, and to sell.”
In education departments, we are used to tracing the provenance of educational theories to their original thinkers. As big data practices increasingly infuse educational research, and educational analyses are increasingly performed by profit-making companies with ownership of the relevant big data facilities, might we need to address the question of who owns educational theory, and how it becomes patented into ed-tech products?
This is important not least because governments are taking an increasing interest in the potential of big data analysis in education. In the UK, for example, the government’s Educational Technology Action Group has advised much greater take-up of learning analytics across the entire spectrum of schools, colleges and universities, while another report by a cross-party group of politicians and experts has recommended linking learning analytics to teaching metrics in higher education. This latter example proposes to put privately owned big data analytics businesses at the center of the evaluation of teaching quality in universities.
The ownership of educational big data, the ownership of educational theory, and the application of such theories within proprietorial systems and software patents may then be leading to a near-future scenario in which private companies with market imperatives become government-approved sites of expertise in learning and teaching processes. In this context, how learning is conceptualized and understood looks likely to become the intellectual property of privately resourced research centers: a secret recipe to be performed by proprietorial algorithms that prescribe personalized, data-driven, precision pedagogies.
Banner image credit: r2hox