The study of armed conflict has in recent years seen increasing use of forecasting models. These have brought with them a shift from explanatory power to predictive power as the criterion for evaluating model performance (Gurr et al., 1999; Goldstone et al., 2010; Hegre et al., 2013). As methods of evaluation change, so must our diagnostic tools. Tests for statistical outliers are common, but so far little has been done to adapt such tests to evaluation by predictive power. In order to improve our understanding of theory, and ultimately to give better advice to policy makers, it is important to investigate the effect of single countries on a model's forecasts. In this thesis I present a method for detecting statistical outliers in forecasting models using common measures of predictive power. By applying the method to a forecasting model, I attempt to uncover patterns among the outlying countries that could further the theoretical understanding of armed conflict occurrence. I utilize a dynamic forecasting model developed by Hegre et al. (2013) and a cross-sectional time-series dataset containing 162 countries observed between 1950 and 2013. The model is re-estimated once for every country, each time dropping one country from the estimation and evaluation process. The results are compiled into evaluation sets, which are then used to estimate each country's influence on model accuracy. Four measures of predictive power are used for this evaluation: ROC AUC, PR AUC, F-score, and Brier score. I find that a country's effect on the coefficients is only partially related to its effect on predictive power. By examining the outliers in detail, I illustrate differences in how the measures weigh predictions, and how this affects the overall score. I also show that cross-validation using cross-sectional time-series data is problematic and greatly influenced by the choice of evaluation period.
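The leave-one-country-out procedure described above can be sketched as follows. This is a minimal, hypothetical illustration, not the thesis's actual implementation: the panel data and forecasts are random stand-ins, and for brevity the toy version drops each country only from the evaluation set rather than from both estimation and evaluation. The four measures are computed with scikit-learn.

```python
import numpy as np
from sklearn.metrics import (roc_auc_score, average_precision_score,
                             f1_score, brier_score_loss)

rng = np.random.default_rng(0)
countries = ["A", "B", "C", "D"]  # stand-ins for the 162 countries
# Toy panel: per country, 40 country-year outcomes (conflict = 1) and
# a predicted conflict probability from some forecasting model.
panel = {c: (rng.integers(0, 2, 40), rng.random(40)) for c in countries}

def evaluate(y_true, y_prob, threshold=0.5):
    """The four measures of predictive power on one evaluation set."""
    return {
        "roc_auc": roc_auc_score(y_true, y_prob),
        "pr_auc": average_precision_score(y_true, y_prob),
        "f_score": f1_score(y_true, y_prob >= threshold),
        "brier": brier_score_loss(y_true, y_prob),
    }

def pooled(exclude=None):
    """Pool all country-years, optionally leaving one country out."""
    keep = [c for c in countries if c != exclude]
    y = np.concatenate([panel[c][0] for c in keep])
    p = np.concatenate([panel[c][1] for c in keep])
    return y, p

baseline = evaluate(*pooled())
# A country's influence = change in each score when it is dropped.
influence = {c: {m: evaluate(*pooled(exclude=c))[m] - baseline[m]
                 for m in baseline}
             for c in countries}
for c, deltas in influence.items():
    print(c, {m: round(v, 3) for m, v in deltas.items()})
```

Countries whose removal shifts a score far more than the others are candidate outliers under that measure; because the four measures weigh predictions differently, a country can be influential on one score but not on another.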