Journal of Information Systems Education (JISE)

Volume 34, Issue 1, Pages 84-93

Winter 2023


A Data-Driven Approach to Compare the Syntactic Difficulty of Programming Languages


Erno Lokkila
Athanasios Christopoulos
Mikko-Jussi Laakso
University of Turku
Turku, 20014, Finland

Abstract: Educators who teach programming often ask, “Which programming language should I teach first?” The debate behind this question has a long history, and a definitive answer remains out of reach. Nonetheless, several efforts in the literature examine, analyse, and discuss the pros and cons of mainstream programming languages in view of their potential to facilitate the teaching of programming concepts, especially to novice programmers. In line with these efforts, we explore this question by comparing the syntactic difficulty of two modern, but fundamentally different, programming languages: Java and Python. To this end, we introduce a standalone and purely data-driven method which stores code submissions and clusters the errors that occur with the aid of a custom transition probability matrix. To evaluate this model, a total of 219,454 submissions, made by 715 first-year undergraduate students across 259 unique programming exercises, were gathered and analysed. The results indicate that Python is an easier-to-grasp programming language and is, therefore, highly recommended as the stepping stone in introductory courses. Moreover, adopting the described method not only enables educators to identify students who struggle with coding (syntax-wise) but also paves the way for personalised and adaptive learning practices.
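
The abstract does not spell out how the custom transition probability matrix is constructed. As a rough illustration of the general idea only, and not the authors' implementation, the sketch below estimates first-order transition probabilities between submission outcomes. The state labels (`syntax_error`, `accepted`), the sample data, and the function name `transition_matrix` are all hypothetical.

```python
from collections import Counter, defaultdict


def transition_matrix(sequences):
    """Estimate first-order transition probabilities from state sequences.

    Hypothetical helper: each sequence is the ordered list of outcomes of
    one student's consecutive submissions to one exercise.
    """
    # Count how often each state is immediately followed by each other state.
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, following in zip(seq, seq[1:]):
            counts[current][following] += 1

    # Normalise each row so the outgoing probabilities of a state sum to 1.
    matrix = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        matrix[state] = {nxt: n / total for nxt, n in followers.items()}
    return matrix


# Made-up example data (assumption, not taken from the study).
attempts = [
    ["syntax_error", "syntax_error", "accepted"],
    ["syntax_error", "accepted"],
    ["accepted"],
]
probs = transition_matrix(attempts)
print(probs)
```

Under these assumptions, a language in which students move from a syntax error to an accepted submission with higher probability could be read as syntactically easier; the clustering step mentioned in the abstract is not covered by this sketch.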

Keywords: Computer programming, Computer science, Data analytics, Higher education, Information systems education, Program assessment & design


Recommended Citation: Lokkila, E., Christopoulos, A., & Laakso, M.-J. (2023). A Data-Driven Approach to Compare the Syntactic Difficulty of Programming Languages. Journal of Information Systems Education, 34(1), 84-93.