user2code2vec: Embeddings for Profiling Students Based on Distributional Representations of Source Code

Authors: 

David Azcona, Piyush Arora, I-Han Hsiao, Alan Smeaton

Publication Type: 
Refereed Conference Meeting Proceeding
Abstract: 
In this work, we propose a new methodology to profile individual students of computer science based on their programming design using a technique called embeddings. We investigate different approaches to analyze user source code submissions in the Python language. We compare the performances of different source code vectorization techniques to predict the correctness of a code submission. In addition, we propose a new mechanism to represent students based on their code submissions for a given set of laboratory tasks on a particular course. This way, we can make deeper recommendations for programming solutions and pathways to support student learning and progression in computer programming modules effectively at a Higher Education Institution. Recent work using Deep Learning tends to work better when more and more data is provided. However, in Learning Analytics, the number of students in a course is an unavoidable limit. Thus we cannot simply generate more data as is done in other domains such as FinTech or Social Network Analysis. Our findings indicate there is a need to learn and develop better mechanisms to extract and learn effective data features from students so as to analyze the students’ progression and performance effectively.
Conference Name: 
The 9th International Learning Analytics & Knowledge Conference, LAK 2019
Digital Object Identifier (DOI): 
10.1145
Publication Date: 
04/03/2019
Conference Location: 
United States of America
Research Group: 
Institution: 
Dublin City University (DCU)
Open access repository: 
Yes