Data source
Secondary data from the Zambia Demographic and Health Surveys (ZDHS) conducted in 2007, 2013 and 2018 were used [43]. Specifically, the study used the women’s individual recode (IR) files, which contain the responses of women aged 15–49 who were enrolled in the surveys. The Demographic and Health Survey (DHS) is a nationwide cross-sectional survey that is usually carried out in low- and middle-income countries every five years [44] and collects data on a range of demographic and health indicators. The DHS has been an essential source of country-level data on sexual and reproductive health in low- and middle-income countries, as it gathers data on indicators such as marriage, sexual activity, fertility, fertility preferences and family planning [44]. A stratified, two-stage sampling approach is usually employed in selecting the DHS sample. A pooled sample of 9990 women aged 20–29 years who had ever been married before the survey and had complete information on reported age at first marriage was included in the analysis. The 20–24 age group is the typical age range for researching child marriage among ever-married women [1, 11, 17]. We used the broader age range of 20–29 years because the number of ever-married women aged 20–24 years was not sufficient for our analysis. Because age at first marriage was measured retrospectively, we excluded all teenage women aged 15–19 years from the analysis: not all members of this cohort had completed childhood, so they had not yet been exposed to the full period of risk of child marriage. The selection criteria for the study sample across the three surveys are described in Fig. 1.
Measures
Outcome measure
The outcome variable for this study was age at first marriage, defined as the age at which a woman or a man first married or started cohabiting with a partner. It is usually presented as less than 18 years or 18 years and above [8, 9, 35]. During the DHS, all women who reported having ever been married were asked to state the age at which they got married or started cohabiting with a partner. The variable was collected and recorded as continuous data. To facilitate binary analysis, we recoded it into two categories: (i) less than 18 years and (ii) 18 years or above. The resulting binary outcome was coded “0” for age at first marriage or cohabitation of 18 years or above and “1” for age at first marriage or cohabitation below 18 years, which was treated as child marriage.
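The recoding described above can be sketched as follows. This is a minimal illustration, not the study's actual code (the analysis was run in Stata); the function name and the handling of missing ages are assumptions made for the example.

```python
from typing import Optional

# Threshold stated in the text: marriage or cohabitation before age 18
# is treated as child marriage.
CHILD_MARRIAGE_AGE_CUTOFF = 18


def recode_child_marriage(age_at_first_marriage: Optional[float]) -> Optional[int]:
    """Return 1 if first marriage/cohabitation was before 18, else 0.

    Missing ages are passed through as None (an assumption for this
    sketch; the study kept only women with complete information).
    """
    if age_at_first_marriage is None:
        return None
    return 1 if age_at_first_marriage < CHILD_MARRIAGE_AGE_CUTOFF else 0


# Hypothetical reported ages, for illustration only.
reported_ages = [15, 17, 18, 22, None]
outcome = [recode_child_marriage(age) for age in reported_ages]
print(outcome)
```

A woman who reports marrying at exactly 18 falls in the reference category (coded 0), consistent with the "18 years or above" wording.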
Independent variables
Based on existing literature [30, 34, 35, 45], a number of explanatory variables were selected. These included: age of the woman; age at first sex; education; literacy; residence; region; wealth status; employment status; exposure to family planning messages; age at first birth; gave birth in the last five years; age of partner; education of partner; and employment of partner. These variables were grouped into individual-level and community-level variables.
Individual level factors
Individual-level factors included age of the woman in single years (20 to 29); education level (none, primary, secondary and higher); literacy (illiterate and literate); age at first sex (less than 15, 15–19, 20–24 and 25–29); age at first birth (less than 15, 15–19 and 20–29); age of partner at the time of the survey (less than 25, 25–29, 30–34 and 35+); and wealth status (poor, middle and rich). Other individual-level variables included employment status (not working and working); education level of partner (none, primary, secondary and higher); partner’s employment status (not working and working); gave birth in the last five years (no and yes); media exposure (no and yes); and desired family size (less than 4 children, 4–5 children and 6+ children).
Community-level factors
Socioeconomic and demographic characteristics (education, employment, wealth status, age at first birth) and behaviour-related factors (fertility desire, exposure to family planning (FP) messages) were aggregated from the individual level to the community level so that they could be studied at the community or neighbourhood level. These community variables were chosen based on their significance in previous research [21, 30]. A community was defined as the primary sampling unit (i.e., cluster) of the ZDHS. Household wealth, employment, women’s education, age at first birth, ideal number of children, and exposure to media FP messages were aggregated within each sampling unit to generate the community-level variables. Except for residence, the community-level factors were individual-level variables aggregated at the cluster level, measured as average proportions and classified into low, medium and high levels for ease of interpretation. The following categorisation was used to group the percentages into three discrete categories: low = “0–49 percent”; medium = “50–75 percent”; high = “75–100 percent”. A number of studies guided the construction of the indices and community variables used in this study [22, 25, 40, 45, 46, 47].
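The aggregation step can be illustrated with a short sketch. It is not the study's code: the function names and example data are hypothetical, the proportions are unweighted for simplicity, and because the stated bands meet at 75 percent and leave a gap between 49 and 50, the sketch assumes that values below 50 are low, values from 50 to 75 inclusive are medium, and values above 75 are high.

```python
from collections import defaultdict


def cluster_percentages(records):
    """Aggregate a 0/1 individual-level indicator to the cluster level.

    records: iterable of (cluster_id, indicator) pairs, indicator in {0, 1}.
    Returns {cluster_id: percent of women in the cluster with indicator == 1}.
    Unweighted for simplicity (an assumption of this sketch).
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for cluster_id, indicator in records:
        totals[cluster_id] += 1
        positives[cluster_id] += indicator
    return {c: 100.0 * positives[c] / totals[c] for c in totals}


def classify_level(percent):
    """Map a cluster percentage to low/medium/high.

    Boundary handling is assumed: <50 low, 50-75 inclusive medium, >75 high.
    """
    if percent < 50:
        return "low"
    if percent <= 75:
        return "medium"
    return "high"


# Hypothetical data: (cluster id, 1 if the woman has secondary education or higher).
records = [
    (1, 1), (1, 0), (1, 0), (1, 0),
    (2, 1), (2, 1), (2, 0),
    (3, 1), (3, 1), (3, 1), (3, 1), (3, 0),
]
percentages = cluster_percentages(records)
levels = {cluster: classify_level(p) for cluster, p in percentages.items()}
print(levels)
```

Each woman in a cluster would then carry the cluster's level as a community-level covariate.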
Statistical analysis
Data analysis was done at three levels (descriptive, bivariate and multilevel) using Stata version 17, with a 5% level of significance. At the descriptive level, percent distributions of the outcome indicator were presented. At the bivariate level, cross-tabulations with chi-square tests were used to analyse the association between child marriage and the selected independent variables. To assess the effects of the identified individual-level and community-level factors on child marriage in Zambia, a two-level multilevel binary logistic regression model was applied to pooled data from all three survey phases. The first level was the individual and the second level was the community. The Stata “melogit” command was used to account for the clustering of the outcome variable within and across the sampling clusters of the survey design. Adjusted odds ratios (aOR) with corresponding 95% confidence intervals (CI) were reported. Four multilevel logistic models were estimated. Model 1 included the outcome variable only, in order to test the random variability in the intercept. Model 2 included the individual-level variables to examine the effect of women’s characteristics on early marriage, while Model 3 examined the effect of community-level characteristics only. Model 4 included both the individual-level and community-level factors. All covariates were included in the multilevel analyses regardless of their level of significance in the bivariate analysis, because all the variables in our conceptual framework have been reported to significantly influence child marriage in prior studies [30, 34, 35, 45].
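For reference, a two-level random-intercept logistic model of the kind described above is conventionally written as shown below. The notation is introduced here for illustration only and is an assumption, not the study's own specification: \(\pi_{ij}\) is the probability that woman \(i\) in community \(j\) married before age 18, \(X_{pij}\) are individual-level covariates, \(Z_{qj}\) are community-level covariates, and \(u_j\) is the community random intercept.

```latex
\[
\operatorname{logit}(\pi_{ij})
  = \log\!\left(\frac{\pi_{ij}}{1-\pi_{ij}}\right)
  = \beta_0 + \sum_{p} \beta_p X_{pij} + \sum_{q} \gamma_q Z_{qj} + u_j,
\qquad u_j \sim N\!\left(0, \sigma_u^{2}\right)
\]
```

In this notation, Model 1 retains only \(\beta_0 + u_j\), Model 2 adds the \(X\) terms, Model 3 adds the \(Z\) terms only, and Model 4 includes both sets.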
The intra-class correlation (ICC) was used to assess variation between communities and the relative effect of community-level variables. The ICC provides information on the share of variance at each level. The latent method was used to calculate the ICC at each level. It assumes a threshold model, approximating the level 1 variance by \(\pi^{2}/3 \approx 3.29\) [40, 47, 48]. To explain the heterogeneity in the probabilities of early marriage, the Proportional Change in Variance (PCV) was computed for each model compared with the empty model. The PCV provided information on the share of variance explained by each model relative to Model 1. The Akaike Information Criterion (AIC) was used to compare models and measure goodness of fit [40, 47]. The model with the lower AIC was considered a better fit for the data. To assess multicollinearity among the independent factors, the variance inflation factor (VIF) was used. There were no concerns with multicollinearity in any of the variables (all VIF < 5). The VIF values are presented in Additional file 1: Table 1.
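The three model-comparison quantities above follow standard formulas, sketched below. The function names and the numeric inputs are hypothetical and chosen only to illustrate the arithmetic; they are not results from the study.

```python
import math

# Level 1 (individual) variance under the latent-variable method,
# fixed at pi^2 / 3 (about 3.29) for a logistic model.
LEVEL1_VARIANCE = math.pi ** 2 / 3


def icc_latent(community_variance):
    """ICC = community variance / (community variance + pi^2 / 3)."""
    return community_variance / (community_variance + LEVEL1_VARIANCE)


def pcv(null_variance, model_variance):
    """Proportional change in community variance relative to the empty model."""
    return (null_variance - model_variance) / null_variance


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 * log-likelihood (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical community-level variances, for illustration only.
null_var = 0.5   # empty model (Model 1)
full_var = 0.3   # model with covariates

print(f"ICC (empty model): {icc_latent(null_var):.3f}")
print(f"PCV vs empty model: {pcv(null_var, full_var):.2f}")
print(f"AIC: {aic(log_likelihood=-100.0, n_params=5):.1f}")
```

A PCV of 0.40 in this illustration would mean that 40 percent of the community-level variance in the empty model is accounted for by the added covariates.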
Ethical approval
The data analysed in this study are available in the public domain at https://dhsprogram.com/. Permission to use the data was obtained from the DHS Program. None of the datasets used in this study contained any personal identification information from survey participants. The original Zambian DHS biomarker and survey protocols were approved by the Tropical Disease and Research Center (TDRC) and the Research Ethics Review Board of the Centers for Disease Control and Prevention (CDC), Atlanta.