Abstract

Problem

The positive relationship between HIV-prevalence and education in sub-Saharan Africa has been verified by several studies. However, this finding has been challenged recently, and it is in this context that I place my thesis.
The HIV-epidemic in sub-Saharan Africa is the only major epidemic that has rooted itself in the general population. The other large epidemics spread mainly in marginalized groups such as injecting drug users, men who have sex with men and people involved in the commercial sex industry. UNAIDS characterizes the sub-Saharan epidemic as a generalized epidemic, since more than 1% of the population is infected and the main mode of transmission is heterosexual intercourse. This implies that demographic and socio-economic variables will interact with the level of HIV-infection, so variables other than education need to be taken into account when analysing variations in HIV-infection levels. The know-do gap, the distance from knowledge to practice, is also quite large in many cultures. Relying solely on education can therefore bias the estimates, which is yet another argument for including more explanatory variables in the analysis. A particular problem is that countries with large Muslim communities have low levels of HIV-infection and, at the same time, low levels of education.
According to theory and previous research, the variables that affect the level of HIV-infection can be grouped into three groups: education effects, migration effects and socio-economic factors. Previous research has also focused mainly on regional and individual studies; little has been done at the macro level. My problem is therefore twofold. First, are there variables other than education that explain the cross-country variation in the level of HIV-infection? Second, is it possible to see the same effects at the macro level that have been observed at the regional and individual levels?

Methods and data

I have done a cross-section analysis on country-level data from sub-Saharan Africa. The dependent variable in the analysis is HIV-prevalence, the percentage of the population that is infected by HIV. Because I have so few observations, I have had to divide the data into two datasets. The first dataset has primary education as the education variable and the second has secondary education. The other explanatory variables reflect the remaining two groups of effects that affect HIV-prevalence: migration and socio-economic factors. For migration I have used the percentage of the population that lives in urban areas, which is a proxy for the degree of urbanization. The socio-economic factors included are the percentage of Muslims in the population, GDP per capita in US $ adjusted for purchasing power parity, and total expenditure on health measured as a percentage of GDP. Data on HIV-prevalence is taken from “2006 Report on the global AIDS epidemic” published by UNAIDS in May 2006. Data on education is collected by UNESCO and downloaded from the Global Education Database operated by USAID.
Data on GDP and urbanization is taken from UNDP’s Human Development Report Statistics 2005, data on total expenditure on health is from the World Health Organization, and the data for Muslims in the population is reported by CIA’s World Factbook.
I have used Ordinary Least Squares and Instrumental Variable Estimation performed by PcGive in GiveWin version 2.10. In advance I feared endogeneity problems in the education variable, but in the analysis there are signs of endogeneity only in the variable for total expenditure on health.

Results and conclusions

The results show an omitted variable bias in the coefficient for education when the percentage of Muslims in the population is not included. When I include the variable for Muslims in the population, the coefficient for education changes from 0.0142 to -0.0048 for primary education and from 0.00544 to -0.0166 for secondary education. The coefficients for education are not significant in any of these cases. There is one exception, where the t-value for the coefficient for secondary education is -2.37, but in that case total expenditure on health is included in the model. Since there are strong signs of endogeneity in the total expenditure on health variable, the results cannot be trusted when it is included. GDP per capita and degree of urbanization do not have a significant effect on prevalence in these datasets. However, since there are so few observations, the question of significance should not be given too much weight. The coefficient for GDP per capita ranges from -0.00023 to 0.000691 and the coefficient for urban residents ranges from -0.02078 to 0.0254. There are no substantial differences between the results in the dataset with primary education and the results for secondary education.
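The mechanism behind this omitted variable bias can be illustrated with a minimal sketch on synthetic data. Nothing below comes from the thesis datasets or from PcGive: the sample size, the data-generating coefficients and the function names `ols` and `simulate` are assumptions chosen only for illustration. The sketch generates a Muslim-share variable that lowers both education and prevalence, then compares the education coefficient from a regression that omits the Muslim share with one that includes it.

```python
import random


def ols(rows, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def simulate(n=2000, seed=0, true_edu=-0.02):
    """Return the education coefficient without and with the Muslim-share control.

    All parameters are hypothetical; the data are purely synthetic.
    """
    rng = random.Random(seed)
    muslim, edu, hiv = [], [], []
    for _ in range(n):
        m = rng.uniform(0, 100)                      # Muslim share, percent
        e = 80 - 0.5 * m + rng.gauss(0, 5)           # education falls with m
        h = 20 - 0.2 * m + true_edu * e + rng.gauss(0, 2)  # prevalence falls with m
        muslim.append(m)
        edu.append(e)
        hiv.append(h)
    short = ols([[1.0, e] for e in edu], hiv)                         # m omitted
    full = ols([[1.0, e, m] for e, m in zip(edu, muslim)], hiv)       # m included
    return short[1], full[1]


if __name__ == "__main__":
    short_b, full_b = simulate()
    print("education coefficient, Muslim share omitted: ", short_b)
    print("education coefficient, Muslim share included:", full_b)
```

In this sketch the omitted variable is negatively correlated with education and has a negative effect on prevalence, so leaving it out pushes the education coefficient upward. That is the same direction of bias as reported above, although the magnitudes here are arbitrary.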
From this analysis it is possible to conclude that what explains the variation in the level of HIV-infection differs at the macro level compared to the regional and individual levels. Studies that have included education but excluded proxies for culture and religion can have the same omitted variable bias as I have found here. This is a relatively basic result, but it is important, because overlooking elementary relations is incompatible with scientific research. The omitted variable bias points in the direction of a negative relationship between education and HIV-infection, and this can imply that aspects other than information and education need to be stressed in efforts to prevent the spread of HIV.