Abstract Multiple item characteristics, some unrelated to mathematics, can affect item difficulty. The effect of item characteristics such as number of words and comparative language has been studied in larger state assessments in an American context but not yet in a Norwegian setting. In this paper, the relationship between mathematical and linguistic item characteristics and variation in item difficulty is investigated in two tests of elementary mathematics using an explanatory item response modelling approach. The results show that number of words is the biggest driver of item difficulty in the second-grade test, and that comparative terms and number of words combined are the biggest drivers of item difficulty in the third-grade test, explaining 38% and 45% of the variance, respectively. A higher number of words was related to a higher expected difficulty in both tests, and the presence of a comparative term in an item was related to a higher expected difficulty in the third-grade test. These findings indicate that the number of words should be considered when creating new test items, both in research and in practice, as it might have an unexpected impact on item difficulty. A next step would be to investigate the item characteristics further in a test built on a mathematical and linguistic framework, and to extend the mathematical framework to distinguish better between different types of mathematical content.