Data modelling as an activity operates at the intersection of software design and programming. It takes input from the problem domain to be addressed by the information system and creates a description of this domain in terms that lend themselves to the rigorous procedures of programming (i.e. coding). Some sort of data
modelling is often required to provide a manageable overview of a problem domain prior to embarking on the development of the implemented solution. In this respect, data modelling stands out as a particularly important topic for novice students to master in order to handle the complex tasks involved in system design and
development. Accordingly, data modelling is increasingly taught as an essential part of system design and development in introductory computer science courses. A significant amount of research has been carried out, providing insight into various aspects related to the teaching and learning of computer science – in particular,
psychological and organisational issues concerning introductory courses in programming, in addition to studies of expert behaviour. Some of these contributions, and the topics they cover, are presented and discussed in chapter 2. The learning of system design and data modelling has, however, received far less attention in computer
science education research than the more traditional issues related to the learning or understanding of programming (McCracken, 2004). Contributing to the body of knowledge in computer science education research, this thesis addresses the learning of data modelling in school and undergraduate university computer
science classrooms. Special attention is given to some aspects of this learning process where language plays an important role.
The first aspect studied, which was also the initial focus for this project, concerns the scientific concept building of students learning data modelling. Data modelling as an activity relies on scientific concepts like connectivity, attributes and different types of keys. The results presented concern students’ understanding of
candidate key, primary key, and foreign key. Emphasising that scientific concepts are not absorbed ready-made, but formed under the influence of teaching and learning in social settings, Vygotsky states that “to uncover the complex relation between instruction and the development of scientific concepts is an important task”
(Vygotsky, 1986: p162). The study of conceptual knowledge in novices is accordingly seen as an important source of information for the future design of teaching and facilitation of learning.
Furthermore, a conceptual data model is supposed to represent a subset of some problem domain (Peckham & Maryanski, 1988). In order to maintain a comprehensible link between the different parts of the data model and the “real world” features that they represent, it is common to label the components of the data model using terms from the language of the problem domain. It has been shown in studies of programming (e.g. Bonar & Soloway, 1985) that this mapping is not
necessarily trivial. This thesis addresses the issue of labelling as the second major aspect in which language relates to the learning of data modelling.
Across both of these aspects, it is possible to discuss cognition and learning both at the individual level and as a socially distributed construction of knowledge. I will take a distributed cognition perspective, adopted from Salomon (1993), in order to allow for discussion of both of these levels of cognition as well as the interaction between them. This perspective is discussed in section 3.2. The inclusion of socially constituted cognitions introduces a third aspect of the relationship between language and the learning of data modelling. This last aspect
concerns the collaborative problem-solving activities in the classrooms as discursive practices constituting and shaping the collective construction of knowledge within both of the first two aspects. This third aspect has methodological implications, as it
forms the rationale behind the link between the choice of data collection method and the research questions.