Wednesday, April 27, 2016
Morning sessions (10:30 AM – 12:00 PM)
Keynote discussion and panel
Keynote chat 1A1 (20 min): Professor Mireille Hildebrandt
Panel 1A2 (60 min): “Physiological Data and Learning Analytics: Opportunities and Challenges for Research and Practice”
Keywords: Physiological data, Learning Analytics, Biomarkers, wearable technology, self-monitoring, Quantified-Self
Wearable technology that monitors physiological data through noninvasive techniques has become popular in mainstream consumer markets. In education, the use of physiological data as one of, or as part of, other Learning Analytics (LA) tools is still marginal. Considering the fast pace of technological development in the field of wearable technology and physiological data, we consider it necessary to foster discussion among the education research community about the opportunities and challenges that physiological data might have for research and practice in learning. The expert panel will highlight topics related to self-monitored, body-generated data in the context of learning. In the panel, we introduce the most popular biomarkers related to learning and identify the main approaches to using this kind of data in education. The panel consists of a group interview with participating experts from the fields relevant to this emerging transdisciplinary theme. Furthermore, the panel will be open for discussion with the audience. In order to foster debate and enable participation, we adopt a conversation format called fishbowl discussion. The topics discussed during the panel will help learning researchers and practitioners to make more informed decisions and identify areas that are interesting for further exploration regarding the inclusion of physiological data in LA.
Presentation 1B1 (20 min): “Topic Modeling for Evaluating Students’ Reflective Writing: a Case Study of Pre-service Teachers’ Journals” (short paper)
Keywords: Text mining, Topic modeling, LDA, Reflection, Journal writing, Automated grading, Learning analytics, Education.
Journal writing is an important and common reflective practice in education. Students’ reflection journals also offer a rich source of data for formative assessment. However, the analysis of textual reflections in large classes presents challenges. Automatic analysis of students’ reflective writing holds great promise for providing adaptive real-time support for students. This paper proposes a method based on topic modeling techniques for the tasks of theme exploration and reflection grade prediction. We evaluated this method on a sample of journal writings from pre-service teachers. The topic modeling method was able to discover the important themes and patterns that emerged in students’ reflection journals. Weekly topic relevance and word count were identified as important indicators of their journal grades. Based on the patterns discovered by topic modeling, prediction models were developed to automate the assessment and grading of reflection journals. The findings indicate the potential of topic modeling to serve as an analytic tool for teachers to explore and assess students’ reflective thoughts in written journals.
Presentation 1B2 (30 min): “Combining Click-Stream Data with NLP Tools to Better Understand MOOC Completion” (full paper)
Keywords: MOOC, click-stream data, natural language processing, sentiment analysis, educational success, predictive analytics
Completion rates for massive open online classes (MOOCs) are notoriously low. Identifying student patterns related to course completion may help to develop interventions that can improve retention and learning outcomes in MOOCs. Previous research predicting MOOC completion has focused on click-stream data, student demographics, and natural language processing (NLP) analyses. However, most of these analyses have not taken full advantage of the multiple types of data available. This study combines click-stream data and NLP approaches to examine whether students’ on-line activity and the language they produce in the on-line discussion forum are predictive of successful class completion. We conduct this analysis on a subsample of 320 students who completed at least one graded assignment and produced at least 50 words in discussion forums, in a MOOC on educational data mining. The findings indicate that a mix of click-stream data and NLP indices can predict with substantial accuracy (78%) whether students complete the MOOC. This predictive power suggests that student interaction data and language data within a MOOC can help us both to understand student retention in MOOCs and to develop automated signals of student success.
Presentation 1B3 (30 min): “Towards Automated Content Analysis of Discussion Transcripts: A Cognitive Presence Case” (full paper)
Keywords: Community of Inquiry (CoI) model, content analysis, content analytics, online discussions, text classification
In this paper, we present the results of an exploratory study that examined the problem of automating content analysis of student online discussion transcripts. We looked at the problem of coding discussion transcripts for the levels of cognitive presence, one of the three main constructs in the Community of Inquiry (CoI) model of distance education. Using Coh-Metrix and LIWC features, together with a set of custom features developed to capture discussion context, we developed a random forest classification system that achieved 70.3% classification accuracy and 0.63 Cohen’s kappa, which is significantly higher than values reported in previous studies. Besides the improvement in classification accuracy, the developed system is also less sensitive to overfitting as it uses only 205 classification features, which is around 100 times fewer features than in similar systems based on bag-of-words features. We also provide an overview of the classification features most indicative of the different phases of cognitive presence, which gives additional insights into the nature of the cognitive presence learning cycle. Overall, our results show great potential of the proposed approach, with an added benefit of providing further characterization of the cognitive presence coding scheme.
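For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed from two label sequences (for example, human codes versus classifier predictions). The label names and the toy data are hypothetical and are not taken from the paper; the function name `cohens_kappa` is our own.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: human codes vs. classifier output for six messages.
human = ["explore", "explore", "integrate", "resolve", "explore", "integrate"]
model = ["explore", "integrate", "integrate", "resolve", "explore", "explore"]

accuracy = sum(h == m for h, m in zip(human, model)) / len(human)
kappa = cohens_kappa(human, model)
print(f"accuracy={accuracy:.3f} kappa={kappa:.3f}")
```

Kappa is lower than raw accuracy here because it discounts the agreement that two raters would reach by chance given how often each label is used.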
Presentation 1C1 (20 min): “Towards Personalizing An E-quiz Bank for Primary School Students: An Exploration with Association Rule Mining and Clustering” (short paper)
Keywords: Association rule mining, clustering, e-quiz bank, reading
Given the importance of reading proficiency and habits for young students, an online e-quiz bank, Reading Battle, was launched in 2014 to facilitate reading improvement for primary-school students. With more than ten thousand questions in both English and Chinese, the system has attracted nearly five thousand learners who have made about half a million question answering records. In an effort towards delivering personalized learning experience to the learners, this study aims to discover potentially useful knowledge from learners’ reading and question answering records in the Reading Battle system, by applying association rule mining and clustering analysis. The results show that learners could be grouped into three clusters based on their self-reported reading habits. The rules mined from different learner clusters can be used to develop personalized recommendations to the learners. Implications of the results on evaluating and further improving the Reading Battle system are also discussed.
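As a rough illustration of the kind of rule that association rule mining produces, the sketch below computes support and confidence for a single candidate rule over a handful of learner records. The item names, the records, and the helper `rule_metrics` are hypothetical; the abstract does not describe the study's actual features or mining algorithm.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    antecedent = frozenset(antecedent)
    consequent = frozenset(consequent)
    n = len(transactions)
    # Records containing every antecedent item.
    n_ante = sum(antecedent <= t for t in transactions)
    # Records containing every antecedent and every consequent item.
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / n
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Hypothetical learner records, each a set of observed attributes.
records = [
    {"reads_daily", "english_quiz", "high_score"},
    {"reads_daily", "chinese_quiz", "high_score"},
    {"reads_weekly", "english_quiz"},
    {"reads_daily", "english_quiz"},
]

support, confidence = rule_metrics(records, {"reads_daily"}, {"high_score"})
print(f"support={support:.2f} confidence={confidence:.2f}")
```

Running the same computation separately on each learner cluster would give cluster-specific rules, which is one plausible way to read the personalization idea described above.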
Presentation 1C2 (30 min): “Introduction of Learning Visualisations and Metacognitive Support in a Persuadable Open Learner Model” (full paper)
Keywords: Visual Learning Analytics, Open Learner Models, Learning Analytics for Learners, Supporting Metacognitive Activities
This paper describes open learner models as visualisations of learning for learners, with a particular focus on how information about their learning can be used to help them reflect on their skills, identify gaps in their skills, and plan their future learning. We offer an approach that, in addition to providing visualisations of their learning, allows learners to propose changes to their learner model. This aims to help improve the accuracy of the learner model by taking into account student viewpoints on their learning, while also promoting learner reflection on their learning as part of a discussion of the content of their learner model. This aligns well with recent calls for learning analytics for learners. Building on previous research showing that learners will use open learner models, we here investigate their initial reactions to open learner model features to identify the likelihood of uptake in contexts where an open learner model is offered on an optional basis. We focus on university students’ perceptions of a range of visualisations and their stated preferences for a facility to view evidence for the learner model data and to propose changes to the values.
Presentation 1C3 (30 min): “Impact of Data Collection on Interpretation and Evaluation of Student Models” (full paper)
Keywords: student modeling, evaluation, bias, data sets, parameter fitting
Student modeling techniques are evaluated mostly using historical data. Researchers typically do not pay attention to details of the origin of the data sets used. However, the way data are collected can have an important impact on the evaluation and interpretation of student models. We discuss in detail two ways in which data collection in educational systems can influence results: mastery attrition bias and adaptive choice of items. We systematically discuss previous work related to these biases and illustrate the main points using both simulated and real data. We summarize specific consequences for practice, not just for evaluating student models but also for data collection and publication of data sets.
Analytic visualizations and dashboards
Presentation 1D1 (20 min): “Semantic Visual Analytics for Today’s Programming Courses” (short paper)
Keywords: Visual analytics, programming, auto grading, semantic analytics, intelligent authoring, dashboard, orchestration technology
We designed and studied an innovative semantic visual learning analytics tool for orchestrating today’s programming classes. The visual analytics integrates sources of learning activities by their content semantics. It automatically processes paper-based exams by associating sets of concepts with the exam questions. Results indicated that the automatic concept extraction from exams was promising and could be a potential technological solution to a real-world issue. We also discovered that indexing effectiveness was especially prevalent for complex content by covering more comprehensive semantics. Subjective evaluation revealed that the dynamic concept indexing provided teachers with immediate feedback on producing more balanced exams.
Presentation 1D2 (30 min): “The Role of Achievement Goal Orientations When Studying Effect of Learning Analytics Visualizations” (full paper)
Keywords: Learning Analytics, Visualizations, Achievement Goal Orientation, Online Discussions
When designing learning analytics tools for use by learners we have an opportunity to provide tools that consider a particular learner’s situation and the learner herself. To afford actual impact on learning, such tools have to be informed by theories of education. Particularly, educational research shows that individual differences play a significant role in explaining students’ learning process. However, limited empirical research in learning analytics has investigated the role of theoretical constructs, such as motivational factors, that are underlying the observed differences between individuals. In this work, we conducted a field experiment to examine the effect of three designed learning analytics visualizations on students’ participation in online discussions in authentic course settings. Using hierarchical linear mixed models, our results revealed that effects of visualizations on the quantity and quality of messages posted by students with differences in achievement goal orientations could either be positive or negative. Our findings highlight the methodological importance of considering individual differences and pose important implications for future design and research of learning analytics visualizations.
Presentation 1D3 (30 min): “The NTU Student Dashboard: Implementing a whole institution learning analytics platform to improve student engagement” (practitioner presentation)
Keywords: Learning, analytics, student, engagement, retention, attainment, institutional, change, collaboration
The NTU Student Dashboard is a learning analytics solution designed to improve overall engagement by raising student and staff awareness about how students are engaging with their course. In 2013-14, NTU piloted the Dashboard using the Solutionpath Stream tool. The findings from the pilot led to the institution-wide adoption of the Dashboard in 2014-15. Research at NTU (Foster, Kerrigan & Lawther, 2016, forthcoming) demonstrates that student engagement measured by the Dashboard strongly correlates with both progression and attainment. This, however, is only the first step in an ongoing institutional change process.
Early afternoon sessions (01:00 PM – 02:30 PM)
Presentation 2A1 (30 min): “Investigating collaborative learning success with physiological coupling indices based on electrodermal activity” (full paper)
Keywords: learning analytics, biosensors, electrodermal activity, collaborative learning, physiological coupling indices
Collaborative learning is considered a critical 21st century skill. Much is known about its contribution to learning, but investigating the process of collaboration remains a challenge. This paper approaches the investigation of collaborative learning from a psychophysiological perspective. An experiment was set up to explore whether biosensors can play a role in analysing collaborative learning. On the one hand, we identified five physiological coupling indices (PCIs) found in the literature: 1) Signal Matching (SM), 2) Instantaneous Derivative Matching (IDM), 3) Directional Agreement (DA), 4) Pearson’s correlation coefficient (PCC) and 5) the Fisher’s z-transform (FZT) of the PCC. On the other hand, three collaborative learning measurements were used: 1) collaborative will (CW), 2) collaborative learning product (CLP) and 3) dual learning gain (DLG). Regression analyses showed that out of the five PCIs, IDM related the most to CW and was the best predictor of the CLP. Meanwhile, DA predicted DLG the best. This is a first step in determining informative collaboration measures for designing a learning analytics biofeedback dashboard.
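Two of the five indices named above have standard textbook definitions, so they can be sketched without guessing at the paper's procedure: Pearson's correlation coefficient between two learners' signals, and its Fisher z-transform (the inverse hyperbolic tangent of r). The signal values below are made up for illustration, and the abstract does not say how the signals were windowed or preprocessed; the other three indices are not sketched here.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher z-transform of a correlation; undefined for r = +1 or r = -1."""
    return math.atanh(r)


# Hypothetical electrodermal activity samples for two collaborating learners.
eda_a = [0.10, 0.20, 0.40, 0.30, 0.50]
eda_b = [0.20, 0.20, 0.50, 0.40, 0.60]

r = pearson(eda_a, eda_b)
z = fisher_z(r)
print(f"PCC={r:.3f} FZT={z:.3f}")
```

The z-transform is commonly used because its sampling distribution is closer to normal than that of r, which makes it more convenient in regression analyses.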
Presentation 2A2 (30 min): “A Pedagogical Framework for Learning Analytics in Collaborative Inquiry Tasks: An Example from a Teamwork Competency Awareness Program” (full paper)
Keywords: Teamwork, teamwork competency, collaboration, pedagogical model, learning design, dispositional analytics, assessment, twenty-first century skills, evaluation
Many pedagogical models in the field of learning analytics are implicit and do not overtly direct learner behavior. While this allows flexibility of use, this could also result in misaligned practice, and there are calls for more explicit pedagogical models in learning analytics. This paper presents an explicit pedagogical model, the Team and Self Diagnostic Learning (TSDL) framework, in the context of collaborative inquiry tasks. Key informing theories include experiential learning, collaborative learning, and the learning analytics process model. The framework was trialed through a teamwork competency awareness program for 14 year old students. A total of 272 students participated in the program. This paper foregrounds students’ and teachers’ evaluative accounts of the program. Findings reveal positive perceptions of the stages of the TSDL framework, despite identified challenges, which points to its potential usefulness for teaching and learning. The TSDL framework aims to provide theoretical clarity of the learning process, and foster alignment between learning analytics and the learning design. The current work provides trial outcomes of a teamwork competency awareness program that used dispositional analytics, and further efforts are underway to develop the discourse layer of the analytic engine. Future work will also be dedicated to application and refinement of the framework for other contexts and participants, both learners and teachers alike.
Presentation 2A3 (20 min): “An Analysis Framework for Collaborative Problem Solving in Practice-based Learning Activities: A Mixed-method Approach” (short paper)
Keywords: Collaborative learning, problem solving, practice-based learning, analysis frameworks
Systematic investigation of the collaborative problem solving process in open-ended, hands-on, physical computing design tasks requires a framework that highlights the main process features, stages and actions that then can be used to provide ‘meaningful’ learning analytics data. This paper presents an analysis framework that can be used to identify crucial aspects of the collaborative problem solving process in practice-based learning activities. We deployed a mixed-methods approach that allowed us to generate an analysis framework that is theoretically robust, and generalizable. Additionally, the framework is grounded in data and hence applicable to real-life learning contexts. This paper presents how our framework was developed and how it can be used to analyse data. We argue for the value of effective analysis frameworks in the generation and presentation of learning analytics for practice-based learning activities.
LA challenges, accessibility and ethics
Presentation 2B1 (30 min): “Privacy and Analytics – it’s a DELICATE Issue. A Checklist for Trusted Learning Analytics.” (full paper)
Keywords: learning analytics, data management, transparency, control, implementation, privacy, ethics, identity, trust
The widespread adoption of Learning Analytics (LA) and Educational Data Mining (EDM) has somewhat stagnated recently, and in some prominent cases even been reversed following concerns by governments, stakeholders and civil rights groups. In this ongoing discussion, fears and realities are often indistinguishably mixed up, leading to an atmosphere of uncertainty among potential beneficiaries of Learning Analytics, as well as hesitation among institutional managers who aim to innovate their institution’s learning support by implementing data and analytics with a view to improving student success. In this paper, we try to get to the heart of the matter by analysing the most common concerns and the propositions made by the LA community to address them. We conclude the paper with an eight-point checklist named DELICATE that can be applied by researchers, policy makers and institutional managers to facilitate a trusted implementation of Learning Analytics.
Presentation 2B2 (20 min): “Using predictive indicators of student success at scale – implementation successes, issues and lessons from the Open University” (practitioner presentation)
Keywords: Predictive analytics, intervention strategies, students at risk, intervention tools
The Open University has deployed two predictive models to identify students at risk of drop-out for intervention by their tutors and student support staff. This presentation will describe the deployment of the two models and outline the technical and cultural challenges experienced along with the lessons learned. Additional application of the models will also be explored, including their use in aggregate to inform senior management of potential curriculum areas that might under-perform and to help module leaders identify the pinch points in their learning designs.
Presentation 2B3 (20 min): “What Can Analytics Contribute to Accessibility in e-Learning Systems and to Disabled Students’ Learning?” (short paper)
Keywords: Learning Analytics, Metrics, Accessibility, HCI, Technology Enhanced Learning, Higher Education
This paper explores the potential of analytics for improving accessibility of e-learning systems and for supporting disabled learners in their studies. The work is set in the context of learning and academic analytics’ focus on issues of retention. The definitions of disability and accessibility in e-learning are outlined and the implications of these for how disabled students’ needs may be modeled in learning analytics systems are briefly discussed. A comparative analysis of completion rates between disabled and non-disabled students in a large data set of 5 years of Open University modules is presented. The wide variation in comparative retention rates is noted and characterized. A key assertion of this paper is that learning analytics provide a set of tools for identifying and understanding such discrepancies and that analytics can be used to focus interventions that will improve the retention of disabled students in particular. A comparison of this quantitative approach with that of qualitative end-of-module surveys is made. How an approach called Critical Learning Paths, currently being researched, may be used to identify accessibility deficits in module components that are significantly impacting on the learning of disabled students is described. An outline agenda for onward research currently being planned is given. It is hoped that this paper will stimulate a wider interest in the potential benefits of learning analytics for higher educational institutions as they try to assure the accessibility of their e-learning and for the provision of support for disabled students.
Determination of off-task/on-task behaviours
Presentation 2C1 (30 min): “Affecting Off-Task Behaviour: How Affect-aware Feedback Can Improve Student Learning” (full paper)
Keywords: Affect, Feedback, Exploratory Learning Environments
This paper describes the development and evaluation of an affect-aware intelligent support component that is part of a learning environment known as iTalk2Learn. The intelligent support component is able to tailor feedback according to a student’s affective state, which is deduced both from speech and interaction. The affect prediction is used to determine which type of feedback is provided and how that feedback is presented (interruptive or non-interruptive). The system includes two Bayesian networks that were trained with data gathered in a series of ecologically-valid Wizard-of-Oz studies, where the effect of the type of feedback and the presentation of feedback on students’ affective states was investigated. This paper reports results from an experiment that compared a version that provided affect-aware feedback (affect condition) with one that provided feedback based on performance only (non-affect condition). Results show that students who were in the affect condition were less off-task, a result that was statistically significant. Additionally, the results indicate that students in the affect condition were less bored. The results also show that students in both conditions made learning gains that were statistically significant, while students in the affect condition had higher learning gains than those in the non-affect condition, although this result was not statistically significant in this study’s sample. Taken all together, the results point to the potential and positive impact of affect-aware intelligent support.
Presentation 2C2 (30 min): “Investigating Boredom and Engagement during Writing Using Multiple Sources of Information: The Essay, The Writer, and Keystrokes” (full paper)
Keywords: Intelligent Tutoring Systems, Natural Language Processing, Stealth Assessment, Corpus Linguistics, Writing
Computer-based writing systems have been developed to provide students with instruction and deliberate practice on their writing. While generally successful in providing accurate scores, a common criticism of these systems is their lack of personalization and adaptive instruction. In particular, these systems tend to place the strongest emphasis on delivering accurate scores, and therefore tend to overlook additional variables that may ultimately contribute to students’ success, such as their affective states during practice. This study takes an initial step toward addressing this gap by building a predictive model of students’ affect using information that can potentially be collected by computer systems. We used individual difference measures, text features, and keystroke analyses to predict engagement and boredom in 132 writing sessions. The results from the current study suggest that these three categories of features can be utilized to develop models of students’ affective states during writing sessions. Taken together, features related to students’ academic abilities, text properties, and keystroke logs were able to more than double the accuracy of a classifier to predict affect. These results suggest that information readily available in computer-based writing systems can inform affect detectors and ultimately improve student models within intelligent tutoring systems.
Learning tools and interventions
Presentation 2D1 (30 min): “Interactive Surfaces and Learning Analytics: Data, Orchestration Aspects, Pedagogical Uses and Challenges” (full paper)
Keywords: Design, groupware, visualisations design, dashboard, studies in the wild, awareness, face-to-face
The proliferation of varied types of multi-user interactive surfaces (such as digital whiteboards, tabletops and tangible interfaces) is opening a new range of applications in face-to-face (f2f) contexts. They offer unique opportunities for Learning Analytics (LA) by facilitating multi-user sensemaking of automatically captured digital footprints of students’ f2f interactions. This paper presents an analysis of current research exploring learning analytics associated with the use of surface devices. We use a framework to analyse our first-hand experiences, and the small number of related deployments according to four dimensions: the orchestration aspects involved; the phases of the pedagogical practice that are supported; the target actors; and the levels of iteration of the LA process. The contribution of the paper is two-fold: 1) a synthesis of conclusions that identify the degree of maturity, challenges and pedagogical opportunities of the existing applications of learning analytics and interactive surfaces; and 2) an analysis framework that can be used to characterise the design space of similar areas and LA applications.
Presentation 2D2 (30 min): “Evaluation of an Adaptive Practice System for Learning Geography Facts” (full paper)
Keywords: evaluation, learning curve, attrition bias, computerized adaptive practice, survival analysis
Computerized educational systems are increasingly provided as open online services which provide an adaptive, personalized learning experience. To fully exploit the potential of such systems, it is necessary to thoroughly evaluate different design choices. However, both openness and adaptivity make proper evaluation difficult. We provide a detailed report on the evaluation of an online system for adaptive practice of geography, and use this case study to highlight methodological issues with the evaluation of open online learning systems, particularly attrition bias. To facilitate evaluation of learning, we propose to use randomized reference questions. We illustrate the application of survival analysis and learning curves for declarative knowledge. The results provide an interesting insight into the impact of adaptivity on learner behaviour and learning.
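To make the survival analysis idea concrete, here is a minimal sketch of the Kaplan-Meier estimator, a standard survival analysis technique, applied to how long learners stay in a practice system. The abstract does not state which estimator the authors used, so this is a stand-in chosen for illustration; the durations, the censoring flags, and the function name `kaplan_meier` are all hypothetical.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival curve.

    durations: time until the learner left, or until observation ended.
    observed:  True if the learner was seen leaving, False if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    pairs = sorted(zip(durations, observed))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        events = 0
        leaving = 0
        # Group all learners that share the same time t.
        while i < len(pairs) and pairs[i][0] == t:
            if pairs[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both dropouts and censored learners leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical data: number of questions answered before each learner stopped.
questions_answered = [3, 5, 5, 8, 10]
dropout_observed = [True, True, False, True, False]

curve = kaplan_meier(questions_answered, dropout_observed)
for t, s in curve:
    print(f"after {t} questions: {s:.2f} of learners remain")
```

Censored learners still count as at risk up to the point where they were last seen, which is how the estimator avoids treating "still active at the end of data collection" as a dropout.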
Presentation 2D3 (20 min): “Towards Analytics for Educational Interactive e-Books. The case of the Reflective Designer Analytics Platform (RDAP)” (short paper)
Keywords: designer analytics, exploratory data analysis, usability
This paper presents an analytics dashboard that has been developed for designers of interactive e-books. This is part of the EU-funded MC Squared project that is developing a platform for authoring interactive educational e-books. The primary objective is to develop technologies and resources that enhance creative thinking for both designers (authors) and learners. The learning material is expected to offer learners opportunities to engage creatively with mathematical problems and develop creative mathematical thinking. The analytics dashboard is designed to increase authors’ awareness so that they can make informed decisions on how to redesign and improve the e-books. This paper presents architectural and design decisions on key features of the dashboard, and discusses the evaluation of a high-fidelity prototype. We discuss our future steps and some findings related to use of the dashboard for exploratory data analysis that we believe generalise to similar work.
Late afternoon sessions (03:00 PM – 04:30 PM)
Teaching and teacher analytics
Presentation 3A1 (30 min): “Teaching Analytics: Towards Automatic Extraction of Orchestration Graphs Using Wearable Sensors” (full paper)
Keywords: Teaching analytics, Multimodal learning analytics, Activity detection, Wearable sensors, Teacher reflection
‘Teaching analytics’ is the application of learning analytics techniques to understand teaching and learning processes, and eventually enable supportive interventions. However, in the case of (often, half-improvised) teaching in face-to-face classrooms, such interventions would first require an understanding of what the teacher actually did, as the starting point for teacher reflection and inquiry. Currently, such teacher enactment characterization requires costly manual coding by researchers. This paper presents a case study exploring the potential of machine learning techniques to automatically extract teaching actions during classroom enactment, from five data sources collected using wearable sensors (eye-tracking, EEG, accelerometer, audio and video). Our results highlight the feasibility of this approach, with high levels of accuracy in determining the social plane of interaction (90%, Kappa=0.8). The reliable detection of concrete teaching activity (e.g., explanation vs. questioning) still remains challenging (67%, Kappa=0.56), a fact that will prompt further research on multimodal features and models for teaching activity extraction, as well as the collection of a larger multimodal dataset to improve the accuracy and generalizability of these methods.
Presentation 3A2 (30 min): “Student perspectives on data provision and use: Starting to unpack disciplinary differences” (full paper)
Keywords: Disciplinary differences, Student needs, Learning analytics, Knowledge, Legitimation Code Theory, Sociology of education
How can we best align learning analytics practices with disciplinary knowledge practices in order to support student learning? Although learning analytics itself is an interdisciplinary field, it tends to take a ‘one-size-fits-all’ approach to the collection, measurement, and reporting of data, overlooking disciplinary knowledge practices. In line with a recent trend in higher education research, this paper considers the contribution of a realist sociology of education to the field of learning analytics, drawing on findings from recent student focus groups at an Australian university. It examines what learners say about their data needs with reference to organizing principles underlying knowledge practices within their disciplines. The key contribution of this paper is a framework that could be used as the basis for aligning the provision and/or use of data in relation to curriculum, pedagogy, and assessment with disciplinary knowledge practices. The framework extends recent research in Legitimation Code Theory, which understands disciplinary differences in terms of the principles that underpin knowledge-building. The preliminary analysis presented here both provides a tool for ensuring a fit between learning analytics practices and disciplinary practices and standards for achievement, and signals disciplinarity as an important consideration in learning analytics practices.
Presentation 3A3 (20 min): “Design and Evaluation of Teacher Assistance Tools for Exploratory Learning Environments” (short paper)
Keywords: teacher assistance tools, exploratory learning, intelligent support
We present our approach to designing and evaluating tools that can assist teachers in classroom settings where students are using Exploratory Learning Environments (ELEs), using as our case study the MiGen system, which targets 11-14 year old students’ learning of algebra. We discuss the challenging role of teachers in exploratory learning settings and motivate the need for visualisation and notification tools that can assist teachers in focusing their attention across the whole class and inform teachers’ interventions. We present the design and evaluation approach followed during the development of MiGen’s Teacher Assistance tools, drawing parallels with the recently proposed LATUX workflow but also discussing how we go beyond this to include a large number of teacher participants in our evaluation activities, so as to gain the benefit of different viewpoints. We present and discuss the results of the evaluations, which show that participants appreciated the capabilities of the tools and were mostly able to use them quickly and accurately.
Institutional perspectives and challenges
Presentation 3B1 (30 min): “Going Enterprise: Challenges Faced and Lessons Learned When Scaling and Integrating a Research-based Product for Commercial Distribution.” (practitioner presentation)
Keywords: Scalable solutions, Cloud-based, Platform analytics, accessibility.
This presentation explores some of the challenges faced by researchers when tasked with implementing a research-based product for large-scale distribution. We divide these challenges into three categories: presentation layer, methodology, and back-end infrastructure and architecture. Using our own experience as an example, we explain the rationale behind the decisions made and look at how they were implemented in practice. Finally, we provide some guidelines for a successful transition from pilot project to enterprise solution.
Presentation 3B2 (30 min): “The Learning Analytics Readiness Instrument” (full paper)
Keywords: Learning Analytics, Readiness, Survey Design, Higher Education, Reflection, Ethics
Little is known about the processes institutions use when discerning their readiness to implement learning analytics. This study aims to address this gap in the literature by using survey data from the beta version of the Learning Analytics Readiness Instrument (LARI). Twenty-four institutions were surveyed and 560 respondents participated. Five distinct factors were identified from a factor analysis of the results: Culture; Data Management Expertise; Data Analysis Expertise; Communication and Policy Application; and Training. Data were analyzed using both the role of those completing the survey and the Carnegie classification of the institutions as lenses. Generally, information technology professionals and institutions classified as Research Universities–Very High research activity had significantly different scores on the identified factors. Working within a framework of organizational learning, this paper details the concept of readiness as a reflective process, as well as how the implementation and application of analytics should be carried out with ethical considerations in mind. Limitations of the study, as well as next steps for research in this area, are also discussed.
Presentation 3B3 (20 min): “Real-time indicators and targeted supports: Using online platform data to accelerate student learning” (short paper)
Keywords: Networked Improvement Community, Behavior Modeling, Statistical Analysis, Growth Mindset, Just-in-time Interventions, Productive Persistence, HLM (Hierarchical linear models)
Statway® is one of the Community College Pathways initiatives designed to promote students’ success in their developmental math sequence and reduce the time required to earn college credit. A recent causal analysis confirmed that Statway dramatically increased students’ success rates in half the time across two different cohorts. These impressive results also hold across gender and race/ethnicity groups. However, there is still room for improvement. Students who did not succeed overall in Statway often did not complete the first course of the two-course sequence. Therefore, the objective of this study is to formulate a series of indicators from self-report and online learning system data, alerting instructors to students’ progress during the first weeks of the first course in the Statway sequence.
MOOC discussion analysis
Presentation 3C1 (30 min): “Bringing Order to Chaos in MOOC Discussion Forums with Content-Related Thread Identification” (full paper)
Keywords: Massive open online courses, social interaction, information overload, discussion forum, natural language processing, machine learning, content-related
This study addresses the issues of overload and chaos in MOOC discussion forums by developing a model to categorize and identify threads based on whether or not they are substantially related to the course content. Content-related posts were defined as those that give/seek help for the learning of course material and share/comment on relevant resources. A linguistic model was built based on manually-coded starting posts in threads from a statistics MOOC (n=837) and tested on thread starting posts from the second offering of the same course (n=304) and a different statistics course (n=298). The number of views and votes threads received were tested to see if they helped classification. Results showed that content-related posts in the statistics MOOC had distinct linguistic features which were mainly domain-independent; the linguistic model demonstrated good cross-course reliability (all recall and precision > .76) and was useful across all time segments of the courses; number of views and votes were not helpful for classification.
Presentation 3C2 (30 min): “Investigating Social and Semantic User Roles in MOOC Discussion Forums” (full paper)
Keywords: MOOCs, Discussion Forums, Social Network Analysis, Blockmodeling, Socio-semantic analysis
This paper describes the analysis of the social and semantic structure of discussion forums in massive open online courses (MOOCs) in terms of information exchange and user roles. To that end, we analyse a network of forum users based on information-giving relations extracted from the forum data. Connection patterns that appear in the information exchange network of forum users are used to define specific user roles in a social context. Semantic roles are derived by identifying thematic areas in which an actor seeks information (problem areas) and the areas of interest in which an actor provides information to others (expertise). The interplay of social and semantic roles is analysed using a socio-semantic blockmodelling approach. The results show that social and semantic roles are not strongly interdependent. This indicates that communication patterns and interests of users develop simultaneously only to a moderate extent. In addition to the case study, the methodological contribution is in combining traditional blockmodelling with semantic information to characterise participant roles.
Presentation 3C3 (20 min): “Untangling MOOC Learner Networks” (short paper)
Keywords: MOOCs, forums, networked learning
Research in formal education has repeatedly offered evidence of the value social interactions bring to student learning. However, it remains unclear whether the development of such interpersonal relationships has the same influence on learning in the context of large-scale open online learning. For instance, in MOOCs group members change continuously and the interactions can quickly amass to chaos, therefore impeding an individual’s propensity to foster relationships. This paper examined a MOOC for its potential to develop social processes. As it is exceedingly difficult to establish a relationship with somebody who seldom accesses a MOOC discussion, we singled out a cohort defined by their regular forum presence. The study, then, analysed this ‘cohort’ and its development, as compared to the entire MOOC learner network. Mixed methods of social network analysis (SNA), content analysis and statistical network modelling revealed the potential for unfolding social processes among a more persistent group of learners in the MOOC setting.
Language, writing, and interaction
Presentation 3D1 (30 min): “Reflecting on Reflective Writing Analytics: Assessment Challenges and Iterative Evaluation of a Prototype Tool” (full paper)
Keywords: Learning Analytics, Writing Analytics, Reflection, Natural Language Processing, Metadiscourse, Rhetoric
When used effectively, reflective writing tasks can deepen learners’ understanding of key concepts, help them critically appraise their developing professional identity, and build qualities for lifelong learning. As such, reflective writing is attracting substantial interest from universities concerned with experiential learning, reflective practice, and developing a holistic conception of the learner. However, reflective writing is for many students a novel genre to compose in, and tutors may be inexperienced in its assessment. While these conditions set a challenging context for automated solutions, natural language processing may also help address the challenge of providing real-time, formative feedback on draft writing. This paper reports progress in designing a writing analytics application, detailing the methodology by which informally expressed rubrics are modelled as formal rhetorical patterns, a capability delivered by a novel web application. The application has been through iterative evaluation on an independently human-annotated corpus, showing improvements from the first to second version. We conclude by discussing the reasons why classifying reflective writing has proven complex, and reflect on the design processes enabling work across disciplinary boundaries to develop the prototype to its current state.
Presentation 3D2 (30 min): “Longitudinal Engagement, Performance, and Social Connectivity: a MOOC Case Study Using Exponential Random Graph Models” (full paper)
Keywords: MOOC, network analysis, forum participation, exponential random graph model, ERGM, learning
This paper explores a longitudinal approach to combining engagement, performance and social connectivity data from a MOOC using the framework of exponential random graph models (ERGMs). The idea is to model the social network in the discussion forum in a given week not only using performance (assignment scores) and overall engagement (lecture and discussion views) covariates within that week, but also using the same person-level covariates from adjacent previous and subsequent weeks. We find that over all eight weekly sessions, the social networks constructed from the forum interactions are relatively sparse and lack the tendency for preferential attachment. By analyzing data from the second week, we also find that individuals with higher performance scores from current, previous, and future weeks tend to be more connected in the social network. Engagement with lectures had significant but sometimes puzzling effects on social connectivity. However, the relationships between social connectivity, performance, and engagement weakened over time, and results were not stable across weeks.
Presentation 3D3 (20 min): “English Language Learner Experiences of Formal and Informal Learning Environments” (short paper)
Keywords: ubiquitous learning, experience sampling methodology, informal learning, formal learning, language learning, English language learner, communication, affect, analytics
Many people who do not know English have moved to English-speaking countries to learn English. Once there, they learn English through formal and informal methods. While considerable work has studied the experiences of English language learners in different learning environments, we have yet to see analytics that detail the experiences of this population within formal and informal learning environments. This study used the experience sampling methodology to capture the information that is needed to detail the communication and affective experiences of advanced English language learners. The collected data reveal differences in how English language learners perceived their communication success based on their learning context, with higher levels of communicative success experienced in formal learning settings. No such differences were found for learners’ affect, which was highly negative. The data suggest a need for additional emotional support within formal and informal learning environments as well as a need for oral communication support within informal contexts.
Thursday, April 28, 2016
Morning sessions (10:30 AM – 12:00 PM)
Keynote discussion and panel
Keynote chat 4A1 (20 min): Professor Paul A. Kirschner
Panel 4A2 (60 min): “Learning Analytics: Visions of the Future”
Keywords: future, LACE project, learning, learning analytics, vision
It is important that the LAK community looks to the future, in order that it can help develop the policies, infrastructure and frameworks that will shape its future direction and activity. Taking as its basis the Visions of the Future study carried out by the Learning Analytics Community Exchange (LACE) project, the panelists will present eight future scenarios and their implications. The session will include time for the audience to discuss both the findings of the study and actions that could be taken by the LAK community in response to these findings.
Presentation 4B1 (20 min): “Analysing Engagement in an Online Management Programme and Implications for Course Design” (short paper)
Keywords: analysing interaction data, engagement and performance, predicting student performance
We analyse engagement and performance data arising from participants’ interactions with an in-house LMS at Imperial College London while a cohort of students follow two courses on a new fully online postgraduate degree in Management. We identify and investigate two main questions relating to the relationships between engagement and performance, drawing recommendations for improved guidelines to inform the design of such courses.
Presentation 4B2 (30 min): “Measuring financial implications of an early alert system” (full paper)
Keywords: Financial, valuing, student retention, early alert systems
The prevalence of early alert systems (EAS) at tertiary institutions is increasing. These systems are designed to assist with targeted student support in order to improve student retention. They also require considerable human and capital resources to implement, with significant costs involved. It is therefore imperative that the systems can demonstrate quantifiable financial benefits to the institution. The purpose of this paper is to report on the financial implications of implementing an EAS at an Australian university as a case study. The case study institution implemented an EAS in 2011 using data generated from a data warehouse. The data set comprises 16,124 students enrolled between 2011 and 2013. Using a treatment effects approach, the study found that the cost of a student discontinuing was on average $4,687. Students identified by the EAS remained enrolled for longer, with the institution gaining approximately an additional $4,004 in revenue per student. All schools within the institution had a significant positive effect associated with the EAS. Finally, the EAS showed significant value to the institution regardless of when the student was identified. The results indicate that the EAS had significant financial benefits to this institution and that the benefits extended to the entire institution beyond the first year of enrolment.
Presentation 4B3 (30 min): “Getting Started with Learning Analytics: Implementing a Predictive Model Validation Project at North Carolina State University” (practitioner presentation)
Keywords: open learning analytics, predictive model validation analysis, practical implementation strategies
This session will present a practical strategy deployed at North Carolina State University (NC State) that allows institutions to explore the use of learning analytics without the complexity and risk associated with production implementations. At the heart of this strategy is a predictive model validation analysis in which historical data is “run” through an open predictive model designed for general use in higher education. This approach sheds light on the effectiveness of the model and what implementation challenges may arise when larger scale deployment is undertaken. Presenters will share an overview of the strategy and analysis results.
Presentation 4C1 (20 min): “Data2U: Scalable Real time Student Feedback in Active Learning Environments” (short paper)
Keywords: dashboard, visualization, active learning
The majority of applications and products that use learning analytics to understand and improve learning experiences assume the creation of actionable items that will affect students, but only through an intermediary, typically a government organization, institutional management, or instructors. Much less focus is devoted to aggregating data and providing the derived insight directly to students. Student engagement is becoming central in the strategy to increase the quality of a learning environment. Ideally, learning analytics could be used to provide real-time insight tightly integrated with the learning outcomes. This paper presents two contributions in this space. First, an abstract architecture is described as a framework to identify the requirements to deploy this type of application. The second contribution uses a case study deployed in a first year engineering course using active learning to explore the behavior of the students when interacting with a dashboard constantly providing them with indicators of their engagement with the course activities. The results show different patterns of use and their evolution throughout the duration of the experience and shed some light on how students perceive this resource.
Presentation 4C2 (30 min): “Supporting learning by considering emotions: Tracking and Visualization. A case study” (full paper)
Keywords: Self-reflection, quantified-self, students’ emotions, face to face interactions, visual dashboards, visualization
An adequate emotional state in students has proved to be essential for learning. This paper explores the possibility of obtaining students’ feedback about the emotions they feel in class in order to discover potential emotion patterns that might indicate learning failures. This paper presents a visual dashboard that allows students to track their emotions and follow up on their evolution during the course. We have compiled the principal classroom-related emotions and developed a two-phase inquiry process to: verify the possibility of measuring students’ emotions in the classroom; discover how emotions can be displayed to promote self-reflection; and confirm the impact of emotions on learning performance. Our results suggest that students’ emotions in class are related to evaluation marks. This shows that early information about students’ emotions can be useful for teachers and students to improve classroom results and learning outcomes.
Presentation 4C3 (30 min): “A Rule-Based Indicator Definition Tool for Personalized Learning Analytics” (full paper)
Keywords: Learning analytics, open learning analytics, personalized learning analytics, indicator
In the last few years, there has been a growing interest in learning analytics (LA) in technology-enhanced learning (TEL). Generally, LA deals with the development of methods that harness educational data sets to support the learning process. Recently, the concept of open learning analytics (OLA) has received a great deal of attention from the LA community, due to the growing demand for self-organized, networked, and lifelong learning opportunities. A key challenge in OLA is to follow a personalized and goal-oriented LA model that tailors the LA task to the needs and goals of multiple stakeholders. Current implementations of LA rely on a predefined set of questions and indicators. There is, however, a need to adopt a personalized LA approach that engages end users in the indicator definition process by supporting them in setting goals, posing questions, and self-defining the indicators that help them achieve their goals. In this paper, we address the challenge of personalized LA and present the conceptual, design, and implementation details of a rule-based indicator definition tool to support flexible definition and dynamic generation of indicators to meet the needs of different stakeholders with diverse goals and questions in the LA exercise.
Theoretical and conceptual models
Presentation 4D1 (20 min): “Designing MOOCs for Success: A Student Motivation-Oriented Framework” (short paper)
Keywords: Structural equation modeling, confirmatory factor analysis, learning analytics, motivation, MOOC
Considerable literature exists regarding MOOCs. Evaluations of MOOCs range from ringing endorsements to their vilification as a delivery model. Much evaluation focuses on completion rates and/or participant satisfaction. Overall, MOOCs are ill-defined and researchers struggle with appropriate evaluation criteria beyond attrition rates. In this paper, we provide a brief history of MOOCs, a summary of some evaluation research, and we propose a new model for evaluation with an example from a previously-delivered MOOC. Measurement of the MOOC success framework through four student satisfaction types is proposed in this paper with a model for informal learning satisfaction, one of the proposed types, theorized and tested. Results indicated that theoretical underpinnings, while intended to improve instruction, might not have influenced the same satisfaction construct. Therefore, future research into alternative satisfaction factor models is needed.
Presentation 4D2 (30 min): “The Assessment of Learning Infrastructure (ALI): The Theory, Practice, and Scalability of Automated Assessment” (full paper)
Keywords: Assessment of Learning Infrastructure, Automated Analysis, Randomized Controlled Experiments at Scale, The ASSISTments TestBed, Universal Data Reporting, Tools for Learning Analytics
Researchers invested in K-12 education struggle not just to enhance pedagogy, curriculum, and student engagement, but also to harness the power of technology in ways that will optimize learning. Online learning platforms offer a powerful environment for educational research at scale. The present work details the creation of an automated system designed to provide researchers with insights regarding data logged from randomized controlled experiments conducted within the ASSISTments TestBed. The Assessment of Learning Infrastructure (ALI) builds upon existing technologies to foster a symbiotic relationship beneficial to students, researchers, the platform and its content, and the learning analytics community. ALI is a sophisticated automated reporting system that provides an overview of sample distributions and basic analyses for researchers to consider when assessing their data. ALI’s benefits can also be felt at scale through analyses that crosscut multiple studies to drive iterative platform improvements while promoting personalized learning.
Presentation 4D3 (30 min): “When should we stop? – Towards Universal Instructional Policies” (full paper)
Keywords: instructional policies, student models, Dynamic Bayesian Networks
The adaptivity of intelligent tutoring systems relies on the accuracy of the student model and the design of the instructional policy. Recently an instructional policy has been presented that is compatible with all common student models. In this work we present the next step towards a universal instructional policy. We introduce a new policy that is applicable to an even wider range of student models including DBNs modeling skill topologies and forgetting. We theoretically and empirically compare our policy to previous policies. Using synthetic and real world data sets we show that our policy can effectively handle wheel spinning as well as forgetting across common student models.
Early afternoon sessions (01:00 PM – 02:30 PM)
Presentation 5A1 (20 min): “Model Accuracy – Training vs. Reality” (practitioner presentation)
Keywords: Predictive model, Random forest, Accuracy, Validation, Model efficacy
Blue Canary is a higher education data and analytics company based in Phoenix, Arizona, USA. We worked with a university to help predict at-risk students in their undergraduate degree programs. Our model predicted attendance in a given week since we knew that missing a week of class was a proxy for attrition. The models were trained and selected using standard efficacy measures (precision, recall, F1 score). After using the models in production for 6 months, we saw that those metrics for actual data were fairly true to the training metrics. This validated the development of our predictive models.
Presentation 5A2 (20 min): “Applying classification techniques on temporal trace data for shaping student behavior models” (short paper)
Keywords: Learner behavioral modeling, assessment analytics, computer-based testing, supervised learning classification
Differences in learners’ behavior have a deep impact on their educational performance. Consequently, there is a need to detect and identify these differences and build suitable learner models accordingly. In this paper, we report on the results from an alternative approach for dynamic student behavioral modeling based on the analysis of time-based student-generated trace data. The goal was to unobtrusively classify students according to their time-spent behavior. We applied 5 different supervised learning classification algorithms on these data, using as target values (class labels) the students’ performance score classes during a Computer-Based Assessment (CBA) process, and compared the obtained results. The proposed approach has been explored in a study with 259 undergraduate university participant students. The analysis of the findings revealed that a) the low misclassification rates are indicative of the accuracy of the applied method and b) the ensemble learning (treeBagger) method provides better classification results compared to the others. These preliminary results are encouraging, indicating that a time-spent driven description of the students’ behavior could have an added value towards dynamically reshaping the respective models.
Presentation 5A3 (30 min): “Using A/B Testing in MOOC Environments” (full paper)
Keywords: MOOC, A/B testing, Microservices E-Learning, Controlled Online Tests
In recent years, Massive Open Online Courses (MOOCs) have become a phenomenon offering the possibility to teach thousands of participants simultaneously. At the same time, the platforms used to deliver these courses are still in their fledgling stages. While course content and didactics of those massive courses are the primary key factors for the success of courses, a smart platform may still increase or decrease the learners’ experience and learning outcomes. This paper proposes the usage of an A/B testing framework that can be used within a microservice architecture to validate hypotheses about how learners use the platform and to enable data-driven decisions about new features and settings. To evaluate this framework, three new features (Onboarding Tour, Reminder Mails and a Pinboard Digest) have been identified based on a user survey. They have been implemented and introduced on two large MOOC platforms and their influence on the learners’ behavior has been measured. Finally, this paper proposes a data-driven decision workflow for the introduction of new features and settings on e-learning platforms.
Proficiency and positioning
Presentation 5B1 (30 min): “Translating network position into performance: Importance of Centrality in Different Network Configurations” (full paper)
Keywords: Social network analysis, ERGM, MOOC, Simmelian ties
As the field of learning analytics continues to mature, there is a corresponding evolution and sophistication of the associated analytical methods and techniques. In this regard social network analysis (SNA) has emerged as one of the cornerstones of learning analytics methodologies. However, despite the noted importance of social networks for facilitating the learning process, it remains unclear how and to what extent such network measures are associated with specific learning outcomes. Motivated by Simmel’s theory of social interactions and building on the argument that social centrality does not always imply benefits, this study aimed to further contribute to the understanding of the association between students’ social centrality and their academic performance. The study reveals that learning analytics research drawing on SNA should incorporate both descriptive and statistical methods to provide a more comprehensive and holistic understanding of a student’s network position. In so doing researchers can undertake more nuanced and contextually salient inferences about learning in network settings. Specifically, we show how differences in the factors framing students’ interactions within two instances of a MOOC affect the association between the three social network centrality measures (i.e., degree, closeness, and betweenness) and the final course outcome.
Presentation 5B2 (20 min): “Data-driven Proficiency Profiling – Proof of Concept” (short paper)
Keywords: Data-driven, Tutoring system, Student classification, Clustering
Data-driven methods have previously been used in intelligent tutoring systems to improve student learning outcomes and predict student learning methods. We have been incorporating data-driven methods for feedback and problem selection into Deep Thought, a logic tutor where students practice constructing deductive logic proofs. These methods include data-driven hints and a data-driven mastery learning system (DDML) which calculates student proficiency based on rule scores weighted based on expert input in order to assign problem sets of appropriate difficulty. In this latest study we have implemented our data-driven proficiency profiler (DDPP) into Deep Thought as a proof of concept. The DDPP determines student proficiency without expert involvement by comparing relevant student rule scores to previous students who behaved similarly in the tutor and successfully completed it. The results show that the DDPP did improve in performance with additional data and proved to be an effective proof of concept.
Presentation 5B3 (20 min): “Learning analytics dashboard for improving the course passing rate in a randomized controlled experiment.” (practitioner presentation)
Keywords: Learning analytics, dashboard, predictive models, randomized controlled trial, flip the classroom, students at risk, online feedback.
Many freshman computer science students failed the essential Java programming course. This course is supported by e-learning systems. Students who performed all their tasks in the e-learning system were more successful. This is the reason for developing a dashboard application. The aim is to increase the number of students passing by giving feedback on their online behavior. Each week the dashboard shows students which tasks they have performed, how their results compare with those of fellow students, and their chance of passing the course. To prove the effect of the dashboard, a randomized controlled experiment was set up with all 544 freshman students.
Learning design and analytics
Presentation 5C1 (30 min): “A Conceptual Framework linking Learning Design with Learning Analytics” (full paper)
Keywords: Learning analytics, Intervention design, Learning design, Conceptual framework
In this paper we present a learning analytics conceptual framework that supports enquiry-based evaluation of learning designs. The dimensions of the proposed framework emerged from a review of existing analytics tools, the analysis of interviews with teachers, and user case profiles to understand what types of analytics would be useful in evaluating a learning activity in relation to pedagogical intent. The proposed framework incorporates various types of analytics, with the teacher playing a key role in bringing context to the analysis and making decisions on the feedback provided to students as well as the scaffolding and adaptation of the learning design. The framework consists of five dimensions: temporal analytics, tool-specific analytics, cohort dynamics, comparative analytics and contingency. Specific metrics and visualisations are also defined for each dimension of the conceptual framework. Finally the development of a tool that partially implements the conceptual framework is discussed.
Presentation 5C2 (20 min): “The impact of 151 learning designs on student satisfaction and performance: social learning (analytics) matters” (short paper)
Keywords: Learning design, Learning analytics, Academic retention.
Since LAK2015 an increasing number of researchers are taking learning design into consideration when predicting learning behavior and outcomes across different modules. Learning design is widely studied in the Higher Education sector, but few studies have empirically connected learning designs of a substantial number of courses with learning behavior in Learning Management Systems (LMSs) and learning performance. This study builds on preliminary learning design work that was presented at LAK2015 by the Open University UK. In this study we linked 151 modules and 111.256 students with students’ behavior (
Presentation 5C3 (30 min): “Student differences in regulation strategies and their use of learning resources: implications for educational design” (full paper)
Keywords: Individual differences, Regulation strategies, Blended learning, Learning dispositions, Cluster analysis
The majority of the learning analytics research focuses on the prediction of course performance and modeling student behaviors with a focus on identifying students who are at risk of failing the course. Learning analytics should have a stronger focus on improving the quality of learning for all students, not only identifying at-risk students. In order to do so, we need to understand what successful patterns look like when reflected in data and subsequently adjust the course design to avoid unsuccessful patterns and facilitate successful patterns. However, when establishing these successful patterns, it is important to account for individual differences among students since previous research has shown that not all students engage with learning resources to the same extent. Regulation strategies seem to play an important role in explaining the different usage patterns students display when using digital learning resources. When learning analytics research incorporates contextualized data about student regulation strategies we are able to differentiate between students at a more granular level. The current study examined if regulation strategies could account for differences in the use of various learning resources. It examines how students regulated their learning process and subsequently used the different learning resources throughout the course and established how this use contributes to course performance. The results show that students with different regulation strategies use the learning resources to the same extent. However, the use of learning resources influences course performance differently for different groups of students. This paper recognizes the importance of contextualization of learning data resources with a broader set of indicators to understand the learning process.
With our focus on differences between students, we strive for a shift within learning analytics from identifying at-risk students towards contributing to the educational design process and enhancing the quality of learning for all students.
Presentation 5D1 (30 min): “Sequencing Educational Content in Classrooms using Bayesian Knowledge Tracing” (full paper)
Keywords: Sequencing, Bayesian knowledge tracing, field studies
Despite the prevalence of e-learning systems in schools, most of today’s systems do not personalize educational data to the individual needs of each student. Although much progress has been made in modeling students’ learning from data and predicting performance, these models have not been applied in real classrooms. This paper proposes a new algorithm for sequencing questions to students that is empirically shown to lead to better performance and engagement in real schools when compared to a baseline approach. It is based on using knowledge tracing to model students’ skill acquisition over time, and to select questions that advance the student’s learning within the range of the student’s capabilities, as determined by the model. The algorithm is based on a Bayesian Knowledge Tracing (BKT) model that incorporates partial credit scores, reasoning about multiple attempts to solve problems, and integrating item difficulty. This model is shown to outperform other BKT models that do not reason about (or reason about some but not all) of these features. The model was incorporated into a sequencing algorithm and deployed in two schools where it was compared to a baseline sequencing algorithm that was designed by pedagogical experts. In both schools, students using the BKT sequencing approach solved more difficult questions, and with better performance than did students who used the expert-based approach. Students were also more engaged using the BKT approach, as determined by their log-ins in the system and a questionnaire. We expect our approach to inform the design of better methods for sequencing and personalizing educational content to students that will meet their individual learning needs.
Presentation 5D2 (20 min): “Studying the relationship between BKT adjustment error and the skill’s difficulty index” (short paper)
Keywords: BKT, BKT-BF, RMSE modeling, educational data mining, difficulty index
Bayesian Knowledge Tracing (BKT) is one of the most popular knowledge inference models due to its interpretability and ability to infer student knowledge. Proper student modeling can help guide the behavior of a cognitive tutor system and provide insight to researchers on understanding how students learn. Using four different datasets we study the relationship between the error coming from adjusting the parameters and the difficulty index of the skills, and the effect of the size of the dataset on this relationship.
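For readers unfamiliar with BKT, the following is a minimal sketch of the standard four-parameter model (initial knowledge, learning, slip, and guess) and of a root-mean-square error computed over one student's responses. It is not the authors' fitting procedure (the keywords mention BKT-BF, which is not reproduced here); the function names and the parameter values in the usage note are assumptions made for illustration.

```python
from math import sqrt


def bkt_predict_correct(p_know, p_slip, p_guess):
    """Probability of a correct answer given the current knowledge estimate."""
    return p_know * (1 - p_slip) + (1 - p_know) * p_guess


def bkt_update(p_know, correct, p_slip, p_guess, p_transit):
    """Standard BKT step: Bayesian posterior on the response, then learning."""
    if correct:
        numerator = p_know * (1 - p_slip)
        denominator = numerator + (1 - p_know) * p_guess
    else:
        numerator = p_know * p_slip
        denominator = numerator + (1 - p_know) * (1 - p_guess)
    posterior = numerator / denominator
    # The student may also learn the skill after this practice opportunity.
    return posterior + (1 - posterior) * p_transit


def bkt_rmse(responses, p_init, p_slip, p_guess, p_transit):
    """RMSE between predicted correctness and observed 0/1 responses."""
    p_know = p_init
    squared_error = 0.0
    for response in responses:
        predicted = bkt_predict_correct(p_know, p_slip, p_guess)
        squared_error += (response - predicted) ** 2
        p_know = bkt_update(p_know, bool(response), p_slip, p_guess, p_transit)
    return sqrt(squared_error / len(responses))


if __name__ == "__main__":
    # Hypothetical response sequence and parameter values, for illustration only.
    print(bkt_rmse([0, 1, 1, 0, 1, 1], p_init=0.5, p_slip=0.1,
                   p_guess=0.2, p_transit=0.1))
```

A per-skill RMSE of this kind could then be compared against a difficulty index such as the proportion of incorrect responses for that skill, which is one plausible reading of the relationship the abstract studies.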
Presentation 5D3 (30 min): “Modeling Common Misconceptions in Learning Process Data” (full paper)
Keywords: Misconceptions, Knowledge Component Model, Additive Factors Model, Q-Matrix, Fraction Arithmetic
Student mistakes are often not random but, rather, reflect thoughtful yet incorrect strategies. In order for educational technologies to make full use of students’ performance data to estimate the knowledge of a student, it is important to model not only the conceptions but also the misconceptions that a student’s particular pattern of successes and errors may indicate. The student models that drive the “outer loop” of Intelligent Tutoring Systems typically only track positive skills and conceptions, not misconceptions. Here, we present a method of representing misconceptions in the kinds of Knowledge Component models, or Q-Matrices, that are used by student models to estimate latent knowledge. We show, in a case study on a fraction arithmetic dataset, that incorporating a misconception into the Knowledge Component model dramatically improves model fit. We also derive qualitative insights from comparing predicted learning curves across models that incorporate varying misconception-related parameters. Finally, we show that the inclusion of a misconception in a Knowledge Component model can yield individual student estimates of misconception strength. These estimates are significantly correlated with out-of-tutor individual measures of student errors indicative of the misconception.
Late afternoon sessions (03:00 PM – 04:30 PM)
SOLAR General Annual Meeting
The Annual General Meeting is a meeting of all members of SoLAR. The meeting will include introducing the Executive Committee and Officer Bearers board of directors and informing the members of previous and future activities. It is an opportunity for discussion and input from all members.
Friday, April 29, 2016
Morning sessions (10:30 AM – 12:00 PM)
Keynote discussion and panel
Keynote chat 7A1 (20 min): Professor Robert J. Mislevy
Panel 7A2 (60 min): “Institutional Learning Analytics Centres: Contexts, Strategies and Insights”
Keywords: Learning Analytics, Organizational Strategy, Innovation Diffusion
An indicator of the maturing field of learning analytics is the creation of new organizational entities dedicated to using learning analytics services to improve the student experience through institutional research. Going beyond traditional Business Intelligence (BI), these groups operate firmly at the intersection of learning and analytics: they can speak the language of pedagogy and assessment with educators, invent/deploy novel analytics tools, while engaging IT and BI colleagues around mainstreaming services. The end-users targeted by these learning analytics centres are educators and learners. In this panel, the leaders of seven differently configured centres, from diverse universities, share insights on issues such as providing rapid value from pilots, research-based innovation, ways to engage stakeholders, vendor partnerships, data quality, and alignment with university strategy. Our hope is that attendees will leave with fresh ideas on the options they have to advance learning analytics in their own contexts.
Presentation 7B1 (20 min): “Recipe for Success – Lessons Learnt from Using xAPI within the Connected Learning Analytics Toolkit” (short paper)
Keywords: xAPI, CLA toolkit, CLRecipe, Architecture, Learning Analytics, Learning Record Store
An ongoing challenge for Learning Analytics research has been the scalable derivation of user interaction data from multiple technologies. The complexities associated with this challenge are increasing as educators embrace an ever growing number of social and content related technologies. The Experience API (xAPI) alongside the development of user specific record stores has been touted as a means to address this challenge, but a number of subtle considerations must be made when using xAPI in Learning Analytics. This paper provides a general overview to the complexities and challenges of using xAPI in a general systemic analytics solution, the Connected Learning Analytics (CLA) toolkit. The importance of careful design is emphasised, as is the notion of common vocabularies and xAPI Recipes. Early decisions about vocabularies and structural relationships between statements can serve to either facilitate or handicap later analytics solutions. The CLA toolkit case study provides us with a way of examining both the strengths and the weaknesses of the current xAPI specification, and we conclude with a proposal for how xAPI might be improved by using JSON-LD to formalise Recipes in machine readable form.
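As a rough illustration of why shared vocabularies matter, the sketch below builds an xAPI statement with the actor, verb, and object parts that every statement carries, and checks it against a small agreed verb list. The helper names, the verb list, and the example.org identifiers are assumptions made for this example; the check is a simplified stand-in for the idea of a Recipe and does not reproduce the CLA toolkit's CLRecipe format.

```python
import json

# Hypothetical agreed vocabulary: verb IRIs that all data sources must use.
ALLOWED_VERBS = {
    "http://adlnet.gov/expapi/verbs/commented",
    "http://adlnet.gov/expapi/verbs/shared",
}


def make_statement(email, verb_id, verb_label, activity_id, activity_name):
    """Build a minimal xAPI statement as a plain dictionary."""
    return {
        "actor": {"objectType": "Agent", "mbox": f"mailto:{email}"},
        "verb": {"id": verb_id, "display": {"en-US": verb_label}},
        "object": {
            "objectType": "Activity",
            "id": activity_id,
            "definition": {"name": {"en-US": activity_name}},
        },
    }


def follows_recipe(statement, allowed_verbs):
    """Check the three core parts exist and the verb is in the vocabulary."""
    has_parts = all(key in statement for key in ("actor", "verb", "object"))
    return has_parts and statement["verb"].get("id") in allowed_verbs


if __name__ == "__main__":
    statement = make_statement(
        "learner@example.org",
        "http://adlnet.gov/expapi/verbs/commented",
        "commented",
        "http://example.org/forum/post/1",  # hypothetical activity IRI
        "Forum post 1",
    )
    print(json.dumps(statement, indent=2))
    print(follows_recipe(statement, ALLOWED_VERBS))
```

If two platforms record the same kind of interaction under different verb IRIs, a check like this fails for one of them, which is the sort of early vocabulary decision the abstract warns can handicap later analytics.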
Presentation 7B2 (30 min): “How CRS deployed Watershed LRS and xAPI to evaluate the effectiveness of training for disaster response teams” (practitioner presentation)
Keywords: case studies, data integration, information visualization, data sources, lifelong learning, xAPI
The Catholic Relief Services (CRS) Emergency Response and Recovery programme sends teams to serve in areas that have been hit by disaster, such as an earthquake, flood or tsunami. CRS has trained 3,300 first responders sent to these areas since 2009. But how effective is that training when these teams arrive in some of the most challenging environments on earth? This presentation will explore how CRS deployed Watershed LRS to capture data about both learning and job performance via the xAPI. Find out how CRS use this data to evaluate and improve their training provision and become more effective.
Presentation 7B3 (30 min): “How CUES deployed Watershed LRS and xAPI to track and analyse continuous learning.” (practitioner presentation)
Keywords: case studies, data integration, information visualization, data sources, lifelong learning, xAPI
The Credit Union Executives Society (CUES) is an international membership association dedicated to the education and development of credit union CEOs, directors and future leaders. CUES provides resources to their members including learning materials on their LMS and content on their website. Members also learn from external resources across the internet. This presentation will explore how CUES deployed Watershed LRS to capture data about members’ learning via the xAPI. This data includes tracking of LMS, website and 3rd party content. Find out how CUES is using this data to inform their member engagement initiatives and provide the best possible services.
Supporting learning and achievement
Presentation 7C1 (20 min): “Forecasting Student Achievement in MOOCs with Natural Language Processing” (short paper)
Keywords: intention, motivation, natural language processing, certification, prediction
Student intention and motivation are among the strongest predictors of persistence and completion in Massive Open Online Courses (MOOCs), but these factors are typically measured through fixed-response items that constrain student expression. We use natural language processing techniques to evaluate whether text analysis of open responses questions about motivation and utility value can offer additional capacity to predict persistence and completion over and above information obtained from fixed-response items. Compared to simple benchmarks based on demographics and dictionary-based language analyses, we find that a machine learning prediction model can learn from unstructured text to predict which students will complete an online course. We show that the model performs well out-of-sample within a single course, and out-of-context in a related course, though not out-of-subject in an unrelated course. These results demonstrate the potential for natural language processing to contribute to predicting student success in MOOCs and other forms of open online learning.
Presentation 7C2 (30 min): “Is the Doer Effect a Causal Relationship? How Can We Tell and Why It’s Important” (full paper)
Keywords: Learn by doing, learning prediction, course effectiveness, doer effect, learning engineering
The “doer effect” is an association between the number of online interactive practice activities students do and their learning outcomes that is not only statistically reliable but has much higher positive effects than other learning resources, such as watching videos or reading text. Such an association suggests a causal interpretation–more doing yields better learning–which requires randomized experimentation to most rigorously confirm. But such experiments are expensive, and any single experiment in a particular course context does not provide rigorous evidence that the causal link will generalize to other course content. We suggest that analytics of increasingly available online learning data sets can complement experimental efforts by facilitating more widespread evaluation of the generalizability of claims about what learning methods produce better student learning outcomes. We illustrate with analytics that narrow in on a causal interpretation of the doer effect by showing that doing within a course unit predicts learning of that unit content more than doing in units before or after. We also provide generalizability evidence across four different courses involving over 12,500 students that the learning effect of doing is about six times greater than that of reading.
Presentation 7C3 (30 min): “Towards triggering higher-order thinking behaviors in MOOCs” (full paper)
Keywords: Discussion, Learning analytics, coding manual, regression analysis, propensity score matching, LDA topic modeling
With the aim of better scaffolding discussion to improve learning in a MOOC context, this work investigates what kinds of discussion behaviors contribute to learning. We explored whether engaging in higher-order thinking behaviors results in more learning than paying general or focused attention to course materials. In order to evaluate whether to attribute the effect to engagement in the associated behaviors versus persistent characteristics of the students, we adopted two approaches. First, we used propensity score matching to pair students who exhibit a similar level of involvement in other course activities. Second, we explored individual variation in engagement in higher-order thinking behaviors across weeks. The results of both analyses support the attribution of the effect to the behavioral interpretation. A further analysis using LDA applied to course materials suggests that more social oriented topics triggered richer discussion than more biopsychology oriented topics.
Early afternoon sessions (01:00 PM – 02:20 PM)
Real time data
Presentation 8A1 (30 min): “A Study On Eye Fixation Patterns of Students in Higher Education Using an Online Learning System” (full paper)
Keywords: Eye Tracking, Human-Computer Interaction, Instructional Design, Cognitive Activity, Online Learning
We study how the use of online learning systems stimulates cognitive activities, by conducting an experiment with the use of eye tracking technology to monitor eye fixations of 60 final year students engaging in online interactive tutorials at the start of their Final Year Project module. Our findings show that the students’ learning behaviours fall into three different types of eye fixation patterns, and the data corresponding to the different types of learners relates to the performance of the students in other related academic modules. We conclude that this method of studying eye fixation patterns can identify different types of learners with respect to their cognitive capability and academic potential, and also allow educators to understand how their instructional design and online learning environment can stimulate higher-order cognitive activities.
Presentation 8A2 (20 min): “A Gaze-based Learning Analytics Model: In-Video Visual Feedback to Improve Learner’s Attention in MOOCs” (short paper)
Keywords: Eye-tracking, video based learning, MOOCs, Student attention
In the context of MOOCs, “With-me-ness” refers to the extent to which the learner succeeds in following the teacher, specifically in terms of looking at the area in the video that the teacher is explaining. In our previous works, we employed eye-tracking methods to quantify learners’ With-me-ness and showed that it is positively correlated with their learning gains. In this contribution, we describe a tool that is designed to improve With-me-ness by providing a visual-aid superimposed on the video. The position of the visual-aid is suggested by the teachers’ dialogue and deixis, and it is displayed when the learner’s With-me-ness is under the average value, which is computed from the other students’ gaze behavior. We report on a user-study that examines the effectiveness of the proposed tool. The results show that it significantly improves the learning gain and it significantly increases the extent to which the students follow the teacher. Finally, we demonstrate how With-me-ness can create a complete theoretical framework for conducting gaze-based learning analytics in the context of MOOCs.
Supporting SRL and 21st century skills
Presentation 8B1 (30 min): “Exploring the relation between Self-regulation, Online Activities, and Academic Performance: A case study” (full paper)
Keywords: self-regulated learning, quantitative analysis, case-study
The areas of educational data mining and learning analytics focus on the extraction of knowledge and actionable items from data sets containing detailed information about students. However, the potential impact from these techniques is increased when properly contextualized within a learning environment. More studies are needed to explore the connection between student interactions, approaches to learning, and academic performance. Self-regulated learning (SRL) is defined as the extent to which a student is able to motivationally, metacognitively, and cognitively engage in a learning experience. SRL has been the focus of research in traditional classroom learning and is also argued to play a vital role in the online or blended learning contexts. In this paper we study how SRL affects students’ online interactions with various learning activities and its influence on academic performance. The results derived from a naturalistic experiment among a cohort of first year engineering students showed that positive self-regulated strategies (PSRS) and negative self-regulated strategies (NSRS) affected both the interaction with online activities and academic performance. NSRS directly predicted academic outcomes, whereas PSRS only contributed indirectly to academic performance via the interactions with online activities. These results point to concrete avenues to promote self-regulation among students in this type of learning context.
Presentation 8B2 (20 min): “Fostering 21st century literacies through a collaborative critical reading and learning analytics environment: User-perceived benefits and problematics” (short paper)
Keywords: Visual learning analytics, Computer-supported collaborative learning, 21st century skills, Critical literacy, Perceived usefulness
The affordances of learning analytics (LA) are being increasingly harnessed to enhance 21st century (21C) pedagogy and learning. Relatively rare, however, are use cases and empirically based understandings of students’ actual experiences with LA tools and environments aimed at fostering 21C literacies, especially in secondary schooling and Asian education contexts. This paper addresses this knowledge gap by presenting 1) a first iteration design of a computer-supported collaborative critical reading and LA environment and its 16-week implementation in a Singapore high school; and 2) foregrounding students’ quantitative and qualitative accounts of the benefits and problematics associated with this learning innovation. We focus the analytic lens on the LA dashboard components that provided visualizations of students’ reading achievement, 21C learning dispositions, critical literacy competencies and social learning network positioning within the class. The paper aims to provide insights into the potentialities, paradoxes and pathways forward for designing LA that take into consideration the voices of learners as critical stakeholders.
Presentation 8B3 (20 min): “Improving efficacy attribution in a self-directed learning environment using prior knowledge individualization” (short paper)
Keywords: Massive Open Online Courses (MOOCs), Self-directed learning, Self-selection bias, Bayesian Knowledge Tracing, Efficacy attribution, Prior knowledge, Individualization, Education
Models of learning in EDM and LAK are pushing the boundaries of what can be measured from large quantities of historical data. When controlled randomization is present in the learning platform, such as randomized ordering of problems within a problem set, natural quasi-randomized controlled studies can be conducted, post-hoc. Difficulty and learning gain attribution are among factors of interest that can be studied with secondary analyses under these conditions. However, much of the content that we might like to evaluate for learning value is not administered as a random stimulus to students but instead is being self-selected, such as a student choosing to seek help in the discussion forums, wiki pages, or other pedagogically relevant material in online courseware. Help seekers, by virtue of their motivation to seek help, tend to be the ones who have the least knowledge. When presented with a cohort of students with a bi-modal or uniform knowledge distribution, this can present problems with model interpretability when a single point estimation is used to represent cohort prior knowledge. Since resource access is indicative of a low knowledge student, a model can tend towards attributing the resources with low or negative learning gain in order to better explain performance given the higher average prior point estimate. In this paper we present several individualized prior strategies and demonstrate how learning efficacy attribution validity and prediction accuracy improve as a result. Level of education attained, relative past assessment performance, and the prior per student cold start heuristic were employed and compared as prior knowledge individualization strategies.
Presentation 8C1 (30 min): “Using Game Analytics to Evaluate Puzzle Design and Level Progression in a Serious Game” (full paper)
Keywords: Serious Games, Educational Data Mining, Survival Analysis, Complex Problem Solving, Learning Analytics
Modeling and visualizing behavior in complex problem solving tasks such as games is important for both assessing learning and for the design of content. Our previous work has demonstrated that players who perceive a game as more challenging are likely to perceive greater learning from that game. However, this may not be the case for all sources of challenge. Using Interaction Networks, we identified two primary types of errors in Quantum Spectre; Science Errors related to the game’s core educational content, and Puzzle Errors related to rules of the game but not to science knowledge. Preliminary regression analyses showed no strong relationship between the Puzzle Errors identified during gameplay and science learning gains for either group, confirming that these errors seem to be unrelated to science content. However, we also found a significant drop out in both groups (though larger for the game group) on one puzzle where such errors were most common. In this work, we explore and model the game log data for both groups to discover why this dropout occurred.
Presentation 8C2 (20 min): “Bayesian Modelling of Student Misconceptions in the one-digit Multiplication with Probabilistic Programming” (short paper)
Keywords: Learning Analytics, Bayesian Modelling, Probabilistic Programming, One-Digit Multiplication
One-digit multiplication errors are one of the most extensively analysed mathematical problems. Research work primarily emphasises the use of statistics whereas learning analytics can go one step further and use machine learning techniques to model simple learning misconceptions. Probabilistic programming techniques ease the development of probabilistic graphical models (Bayesian networks) and their use for prediction of student behaviour that can ultimately influence decision processes.
Presentation 8C3 (20 min): “Enhancing the Efficiency and Reliability of Group Differentiation through Partial Credit” (short paper)
Keywords: Partial Credit, Group Differentiation, Resampling with Replacement, Randomized Controlled Trial, Data Mining
The focus of the learning analytics community bridges the gap between controlled educational research and data mining. Online learning platforms can be used to conduct randomized controlled trials to assist in the development of interventions that increase learning gains; datasets from such research can act as a treasure trove for inquisitive data miners. The present work employs a data mining approach on randomized controlled trial data from ASSISTments, a popular online learning platform, to assess the benefits of incorporating additional student performance data when attempting to differentiate between two user groups. Through a resampling technique, we show that partial credit, defined as an algorithmic combination of binary correctness, hint usage, and attempt count, can benefit assessment and group differentiation. Partial credit reduces sample sizes required to reliably differentiate between groups that are known to differ by 58%, and reduces sample sizes required to reliably differentiate between less distinct groups by 9%.
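The resampling idea can be sketched as follows: repeatedly draw equal-sized samples with replacement from each group, test whether the samples differ, and record how often they do at each sample size. This is a minimal sketch under stated assumptions, not the authors' procedure: the Welch t statistic, the fixed critical value of 2.0, and the score lists in the usage note are illustrative choices, and the abstract's partial credit formula is not reproduced.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch t statistic for the difference in means of two samples."""
    standard_error = sqrt(variance(sample_a) / len(sample_a)
                          + variance(sample_b) / len(sample_b))
    difference = mean(sample_a) - mean(sample_b)
    if standard_error == 0:
        # Degenerate resample with no spread in either sample.
        return float("inf") if difference != 0 else 0.0
    return difference / standard_error


def detection_rate(group_a, group_b, n, trials=500, t_crit=2.0, seed=0):
    """Share of resamples of size n in which the groups look different.

    t_crit is a fixed approximate cut-off (an assumption of this sketch),
    not an exact critical value for the given degrees of freedom.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample_a = rng.choices(group_a, k=n)  # sampling with replacement
        sample_b = rng.choices(group_b, k=n)
        if abs(welch_t(sample_a, sample_b)) > t_crit:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical per-student scores for two conditions.
    control = [0.2, 0.4, 0.5, 0.6, 0.3, 0.7, 0.5, 0.4]
    treatment = [0.4, 0.6, 0.7, 0.8, 0.5, 0.9, 0.6, 0.7]
    for n in (5, 10, 20, 40):
        print(n, detection_rate(control, treatment, n))
```

Running the same loop once on binary correctness and once on a partial credit score would show which measure reaches a given detection rate at a smaller sample size, which is the comparison the abstract reports.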
Presentation 8D1 (30 min): “What and When: The Role of Course Type and Timing in Students’ Academic Performance” (full paper)
Keywords: Early Warning System, Disciplinary Fields, Time Based Learning Analytics, Undergraduate Education
In this paper we discuss the results of a study of students’ academic performance in first year general education courses. Using data from 566 students who received intensive academic advising as part of their enrollment in the institution’s pre-major/general education program, we investigate individual student, organizational, and disciplinary factors that might predict a student’s potential classification in an Early Warning System as well as factors that predict improvement and decline in their academic performance. Disciplinary course type (based on Biglan’s typology) was significantly related to a student’s likelihood to enter below average performance classifications. Students were most likely to enter a classification in fields such as the natural sciences, mathematics, and engineering, compared with humanities courses. We attribute these disparities in academic performance to disciplinary norms around teaching and assessment. In particular, the timing of assessments played a major role in students’ ability to exit a classification. Implications for the design of Early Warning analytics systems as well as academic course planning in higher education are offered.
Presentation 8D2 (20 min): “Predicting Student Performance on Post-requisite Skills Using Prerequisite Skill Data: An alternative method for refining Prerequisite Skill Structures” (short paper)
Keywords: Prerequisite Skill Structures, Linear Regression, Prerequisite Skills, Learning Maps
Prerequisite skill structures have been closely studied in past years leading to many data-intensive methods aimed at refining such structures. While many of these proposed methods have yielded success, defining and refining hierarchies of skill relationships are often difficult tasks. The relationship between skills in a graph could either be causal, indicating a prerequisite relationship (skill A must be learned before skill B), or non-causal, in which the ordering of skills does not matter and may indicate that both skills are prerequisites of another skill. In this study, we propose a simple, effective method of determining the strength of pre-to-post-requisite skill relationships. We then compare our results with a teacher-level survey about the strength of the relationships of the observed skills and find that the survey results largely confirm our findings in the data-driven approach.
Presentation 8D3 (20 min): “Generating Actionable Predictive Models of Academic Performance” (short paper)
Keywords: Learning analytics, personalization, feedback, recursive partitioning.
The pervasive collection of data has opened the possibility for educational institutions to use analytics methods to improve the quality of the student experience. However, the adoption of these methods faces multiple challenges particularly at the course level where instructors and students would derive the most benefit from the use of such analytics and predictive models. The challenge lies in the knowledge gap between how the data is captured, processed and used to derive models of student behavior, and the subsequent interpretation and the decision to deploy pedagogical actions and interventions by instructors. Simply put, the provision of learning analytics alone has not necessarily led to changing teaching practices. In order to support pedagogical change and aid interpretation, this paper proposes a model that can enable instructors to readily identify subpopulations of students to provide specific support actions. The approach was applied to a first year course with a large number of students. The resulting model classifies students according to their predicted exam scores, based on indicators directly derived from the learning design.
Late afternoon sessions (03:00 PM – 03:45 PM)
Community Building Panel
In this panel, Presidents and other leadership from a spectrum of leading research societies in the Learning Sciences, Educational Technology, and Analytics space will share their views on the potential synergies arising between the communities. In particular, the discussion will focus on current synergies and potential areas for further bridging and collaborations. The goal is to identify new research and community building opportunities for partnerships as we prepare for a new year of cutting edge research as we close out this year’s conference.