Forecasting total fertility rates for urban and rural areas in Pakistan with a coherent functional model

Objective: To project total fertility rates for urban and rural areas in Pakistan up to 2027 with 80% prediction intervals. Method: The secondary-data study was conducted from March 2020 to August 2020 at the Data Bank in the Department of Statistics, University of Karachi, Pakistan, and comprised publicly available data from the Pakistan Demographic Survey from 1984 to 2007. Two statistical models, the functional time series model and the coherent functional model, were used to predict age-specific fertility rates. The forecasting performance of the models was compared through error measures. Data was analysed using R version 3.6.3. Results: Using the coherent functional model, the predicted total fertility rate in 2027 was 1.7 (80% prediction interval: 0.4-4.4) births per woman for urban areas and 2.2 (80% prediction interval: 0.6-5.3) births per woman for rural areas. The total fertility rate predicted by the functional time series model was 2.1 (80% prediction interval: 1.6-2.6) births per woman in urban areas and 2.7 (80% prediction interval: 1.7-4.1) births per woman in rural areas. The empirical comparison of forecast error measures obtained from the two models indicated that the coherent functional model performed better for forecasting total fertility rates for urban and rural areas in Pakistan. Conclusion: The fertility projections obtained from the functional time series model and the coherent functional model described the future fertility behaviour of urban and rural populations.


Introduction
Fertility is one of the main demographic features of any population 1 and can be measured through birth statistics. Fertility rates are important for low- and middle-income countries (LMICs) as well as for high-income countries (HICs). The total fertility rate (TFR) in LMICs is 3.0, which is double the TFR in HICs. 2 Numerous factors affect fertility rates, such as age at marriage, early marriages due to social and economic factors, female education, urbanisation and use of contraceptives. 3 The population of Pakistan increased from 32.5 million in 1947 4 to 207.85 million, according to the Pakistan Bureau of Statistics (PBS), making it the sixth most populous country in the world. 5 This population increase places a heavy burden on Pakistan's economy. Demographers have observed a fertility transition in Pakistan after the 1990s due to contraceptive awareness programmes and the introduction of family planning (FP) schemes since that period. 6 The TFR is the most widely used index of fertility. It is computed by adding the age-specific fertility rates (ASFRs) of a given year for the age groups 15-19, 20-24, ..., 45-49 years, which cover the whole reproductive life span of mothers, and multiplying this sum by the length of the age interval (5 years): $\mathrm{TFR} = 5 \times \sum_{a} \mathrm{ASFR}_a$, where $a$ indexes the seven age groups. 7 The global TFR was forecasted to be 1.6 in 2100. 7 According to the World Bank, the TFR for Pakistan was 3.4 births per woman in 2020. 8 The TFR decline varies considerably by geographical location and sociodemographic subgroup. The decrease in TFR is also partly due to the postponement of marriage by women, knowledge of fertility control and the use of contraceptives. 8 The desire to have more children is important for understanding future reproductive behaviour. The TFR has influenced the health of women. 8 Some of the observed changes in women's maternal and reproductive health have undoubtedly led to the decline in fertility rates.
Furthermore, women's roles, such as marital status, employment and child care, may have an impact on women's health as a result of changes in fertility patterns. 9 According to the Pakistan Demographic Survey (PDS) 1984-2007, the TFR in the urban areas of Pakistan decreased from 6.24 to 3.2 births per woman. The TFR in rural areas of Pakistan decreased from 7.27 to 4 births per woman. 10,11 As per the PDS 2020, the women in rural areas have an average of 4.1 children compared to 3.1 children in urban areas. 12

Open Access J Pak Med Assoc
Modelling fertility curves is the main concern when studying reproductivity. The goal of any modelling exercise is to extract as much information as possible from the available data. Fertility forecasts are an important input to assessing population growth. Reasonably accurate fertility forecasting techniques are therefore of great importance to society and play an important role in guiding policymakers. However, TFR forecasts for urban and rural areas of Pakistan are lacking. The current study was planned to fill the gap by projecting TFRs for urban and rural areas in Pakistan up to 2027 with 80% prediction intervals, and to compare the functional time series (FTS) model with the coherent functional model (CFM) for such projections.

Materials and Methods
The secondary-data study was conducted from March 2020 to August 2020 at the Data Bank maintained in the Department of Statistics, University of Karachi, Pakistan, and comprised publicly available PDS data from 1984 to 2007. 10,11 The place of residence was classified as urban or rural based on socioeconomic characteristics, education facilities, hospitals, or health clinics. The start and end dates recorded for the study were September 2020 and August 2021.
The PDS is managed by the Federal Bureau of Statistics (FBS), and is based on a nationally representative sample. It is used to generate annual birth and death rates for Pakistan in both rural and urban areas. A stratified two-stage sample design has been adopted for the survey. The current study used the complete sample of all ASFRs (15-49 years) and the female population in urban and rural areas from 1984 to 2007. The PDS reports describe the details considered when determining the sample, such as the sampling procedure, sample size, response rate and fieldwork procedures. 10,11 Women aged 15-49 years who usually resided in households in the sample area were included, while women who were not members of those households and those living in military barracks and other security or prohibited areas were excluded.
The secondary data of annual ASFRs and female population in urban and rural areas were obtained from the PDS. The data was available for 1984-86, 1988-92, 1995-97, 1999-2001, 2003, 2005 and 2007. 10,11 The missing data for the years 1987, 1993, 1994, 1998, 2002, 2004 and 2006 was estimated using interpolating splines in the R package stats. 13
The FTS model was developed in 2007 for forecasting demographic indicators, like fertility and mortality, based on functional principal components. In this model, data is presented in the form of curves that are observed at regular time intervals. 14 Suppose $y_{t,u}(x_{i,u})$ and $y_{t,r}(x_{i,r})$ are the observed fertility rates of urban and rural areas at age $x$ in year $t$, $t = 1, 2, \ldots, n$, and $g_{t,u}(x_{i,u})$ and $g_{t,r}(x_{i,r})$ are the underlying smooth functions, which are observed with error. Thus, the main FTS model for urban and rural areas is given by:
$$y_{t,u}(x_{i,u}) = g_{t,u}(x_{i,u}) + \sigma_{t,u}(x_{i,u})\,\varepsilon_{t,i}$$
$$y_{t,r}(x_{i,r}) = g_{t,r}(x_{i,r}) + \sigma_{t,r}(x_{i,r})\,\varepsilon_{t,j}$$
where $x_{i,u}$ and $x_{i,r}$ represent the centres of the age groups of urban and rural areas, $i = 1, \ldots, p$, $\varepsilon_{t,i}$ and $\varepsilon_{t,j}$ are independent and identically distributed standard normal random variables, and $\sigma_{t,u}(x_{i,u})$ and $\sigma_{t,r}(x_{i,r})$ allow the amount of noise to vary with age (heteroscedasticity). The FTS model uses a functional principal component approach to decompose the time series of functional data into principal components and their scores. Hence, the smoothed curves are decomposed using the following model:
$$g_{t,u}(x) = \mu_u(x) + \sum_{k=1}^{K} \alpha_{t,u,k}\,\varphi_{k,u}(x) + e_{t,u}(x)$$
$$g_{t,r}(x) = \mu_r(x) + \sum_{k=1}^{K} \alpha_{t,r,k}\,\varphi_{k,r}(x) + e_{t,r}(x)$$
where $\mu_u(x)$ and $\mu_r(x)$ represent the mean log fertility rates of urban and rural areas across years, $\varphi_{k,u}(x)$ and $\varphi_{k,r}(x)$ are sets of orthogonal basis functions, $\alpha_{t,u,k}$ and $\alpha_{t,r,k}$ define univariate time series, and $e_{t,u}(x)$ and $e_{t,r}(x)$ represent the model errors, which are assumed to be serially uncorrelated with $e_t(x) \sim N(0, V(x))$. Firstly, the smoothed log fertility rates for urban and rural regions in Pakistan were obtained. Secondly, a model with four basis functions ($K = 4$) was selected.
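The interpolation of the missing survey years described above can be illustrated with a short sketch. The study used interpolating splines from the R package stats; the Python code below swaps in a hand-written natural cubic spline, which may differ from the spline variant the authors applied. The function name `natural_cubic_spline` and the ASFR values are hypothetical and serve only to show the mechanics; the observed and missing years are those listed in the text.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through the points (xs, ys)."""
    n = len(xs) - 1  # number of intervals between knots
    if n < 1:
        raise ValueError("at least two points are required")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Second derivatives at the knots; natural boundaries keep m[0] = m[n] = 0.
    m = [0.0] * (n + 1)
    size = n - 1  # number of interior knots
    if size > 0:
        diag = [2.0 * (h[j] + h[j + 1]) for j in range(size)]
        rhs = [6.0 * ((ys[j + 2] - ys[j + 1]) / h[j + 1]
                      - (ys[j + 1] - ys[j]) / h[j]) for j in range(size)]
        # Thomas algorithm: forward elimination of the tridiagonal system.
        for j in range(1, size):
            w = h[j] / diag[j - 1]
            diag[j] -= w * h[j]
            rhs[j] -= w * rhs[j - 1]
        # Back substitution.
        sol = [0.0] * size
        sol[-1] = rhs[-1] / diag[-1]
        for j in range(size - 2, -1, -1):
            sol[j] = (rhs[j] - h[j + 1] * sol[j + 1]) / diag[j]
        m[1:n] = sol

    def evaluate(x):
        # Locate the interval [xs[i], xs[i + 1]] that contains x.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


# Survey years with data and years without data, as listed in the text.
observed_years = [1984, 1985, 1986, 1988, 1989, 1990, 1991, 1992, 1995,
                  1996, 1997, 1999, 2000, 2001, 2003, 2005, 2007]
missing_years = [1987, 1993, 1994, 1998, 2002, 2004, 2006]

# Hypothetical ASFR series for one age group (births per woman per year).
# These are NOT survey values; a linear decline is used for illustration.
observed_asfr = [0.250 - 0.003 * (year - 1984) for year in observed_years]

spline = natural_cubic_spline(observed_years, observed_asfr)
filled = {year: spline(year) for year in missing_years}
```

In practice the same interpolation would be repeated for every age group and for both areas before the curves are smoothed.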
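It has been proposed that each univariate time series $\{\alpha_{1,k}, \ldots, \alpha_{n,k}\}$ can be forecasted independently using a univariate time series model. 14 The h-step-ahead forecasts of the urban and rural fertility curves are then given by:
$$\hat{y}_{n+h|n,u}(x) = \hat{\mu}_u(x) + \sum_{k=1}^{K} \hat{\alpha}_{n+h|n,u,k}\,\hat{\varphi}_{k,u}(x)$$
$$\hat{y}_{n+h|n,r}(x) = \hat{\mu}_r(x) + \sum_{k=1}^{K} \hat{\alpha}_{n+h|n,r,k}\,\hat{\varphi}_{k,r}(x)$$
where $K$ is the number of basis functions, $\hat{\mu}_u(x)$ and $\hat{\mu}_r(x)$ are the estimates of the mean functions, $\hat{\varphi}_{k,u}(x)$ and $\hat{\varphi}_{k,r}(x)$ are the estimates of the basis functions, and $\hat{\alpha}_{n+h|n,u,k}$ and $\hat{\alpha}_{n+h|n,r,k}$ denote the h-step-ahead forecasts of $\alpha_{n+h,u,k}$ and $\alpha_{n+h,r,k}$ obtained using a state-space exponential smoothing model.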
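The forecasting step can be sketched in a few lines: each score series is extrapolated separately, and the forecast curve is rebuilt as the mean function plus the weighted basis functions. The sketch below uses Holt's linear-trend exponential smoothing with hand-picked smoothing parameters as a simple stand-in for the automatically selected state-space exponential smoothing model; the function names and all numerical inputs are hypothetical.

```python
import math


def holt_forecast(series, horizon, alpha=0.8, beta=0.2):
    """h-step-ahead forecast from Holt's linear-trend exponential smoothing.

    alpha and beta are fixed by hand here (an assumption); an automatic
    procedure would estimate them from the data.
    """
    level, trend = series[0], series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + horizon * trend


def forecast_log_curve(mean, bases, score_forecasts):
    """Mean function plus the sum of forecast scores times basis functions."""
    return [mu + sum(a * phi[i] for a, phi in zip(score_forecasts, bases))
            for i, mu in enumerate(mean)]


# Hypothetical inputs on a grid of three ages; none of these come from the PDS.
mean_log_rate = [math.log(0.10), math.log(0.20), math.log(0.05)]
basis_functions = [[0.5, 0.6, 0.4], [0.7, -0.1, -0.6]]
score_history = [[0.9, 0.6, 0.3, 0.0], [0.10, 0.05, 0.02, 0.00]]

h = 5  # forecast horizon in years
score_forecasts = [holt_forecast(scores, h) for scores in score_history]
log_curve = forecast_log_curve(mean_log_rate, basis_functions, score_forecasts)

# The model is fitted to log rates, so forecasts are exponentiated.
forecast_rates = [math.exp(value) for value in log_curve]
```

Because the basis functions are orthogonal, forecasting each score series independently is a reasonable simplification, which is why the loop treats the series one at a time.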
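The CFM was first proposed in 2013 and is based on the product and ratio of ASFRs. It has been applied to forecast the mortality rates of subgroups within a large population. 16 Different types of coherent forecasts have previously been studied, 17,18 and a 2016 study obtained coherent ASFR forecasts for urban and rural areas in Pakistan. 19 Suppose that $f_{t,r}(x)$ and $f_{t,u}(x)$ denote the smoothed rural and urban fertility rates for age $x$ and year $t$, $t = 1, 2, \ldots, n$, respectively. The product model is defined as the product of the smoothed urban and rural fertility rates, and the ratio model is the ratio of the smoothed urban to rural fertility rates, each under the square root. The equations are given as:
$$p_t(x) = \sqrt{f_{t,u}(x)\,f_{t,r}(x)}, \qquad q_t(x) = \sqrt{f_{t,u}(x)\,/\,f_{t,r}(x)}$$
The log product and log ratio are modelled thus:
$$\log p_t(x) = \mu_p(x) + \sum_{k=1}^{K} \alpha_{t,k}\,\varphi_k(x) + e_t(x)$$
$$\log q_t(x) = \mu_q(x) + \sum_{l=1}^{L} \gamma_{t,l}\,\psi_l(x) + w_t(x)$$
where $\mu_p(x)$ and $\mu_q(x)$ are the mean functions, $\varphi_k(x)$ and $\psi_l(x)$ are the basis functions, and $e_t(x)$ and $w_t(x)$ are the model errors of the product and ratio models, respectively. Forecasts are obtained by forecasting each coefficient series $(\alpha_{t,1}, \alpha_{t,2}, \ldots, \alpha_{t,K})$ and $(\gamma_{t,1}, \gamma_{t,2}, \ldots, \gamma_{t,L})$ using independent univariate time series models. The product model coefficients were forecasted using nonstationary autoregressive integrated moving average (ARIMA) models. The ratio model coefficients are forecasted using any stationary autoregressive moving average (ARMA) (p,q) or autoregressive fractionally integrated moving average (ARFIMA) (p,d,q) process.

Suppose $\hat{\alpha}_{n+h|n,k}$ denotes the h-step-ahead forecast of $\alpha_{n+h,k}$ and suppose $\hat{\gamma}_{n+h|n,l}$ denotes the h-step-ahead forecast of $\gamma_{n+h,l}$. Then, the h-step-ahead forecasts of the log product and log ratio are given as:
$$\log \hat{p}_{n+h|n}(x) = \hat{\mu}_p(x) + \sum_{k=1}^{K} \hat{\alpha}_{n+h|n,k}\,\varphi_k(x), \qquad \log \hat{q}_{n+h|n}(x) = \hat{\mu}_q(x) + \sum_{l=1}^{L} \hat{\gamma}_{n+h|n,l}\,\psi_l(x)$$
and the h-step-ahead forecasts of the log fertility rates follow as:
$$\log \hat{f}_{n+h|n,u}(x) = \log \hat{p}_{n+h|n}(x) + \log \hat{q}_{n+h|n}(x), \qquad \log \hat{f}_{n+h|n,r}(x) = \log \hat{p}_{n+h|n}(x) - \log \hat{q}_{n+h|n}(x)$$
Forecast accuracy was assessed using the mean absolute error (MAE), root mean squared error (RMSE) and mean absolute percentage error (MAPE):
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| h_i - \hat{h}_i \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( h_i - \hat{h}_i \right)^2}, \qquad \mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n} \left| \frac{h_i - \hat{h}_i}{h_i} \right|$$
where $h_i$ is the observed value, $\hat{h}_i$ is the predicted value and $n$ is the number of predicted values. The model with the smallest values of MAE, RMSE and MAPE is taken as the most appropriate forecasting model.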
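The product-ratio idea behind the CFM can be made concrete with a minimal sketch. It assumes the ratio is taken as urban to rural, in line with the urban-to-rural ratio reported in the Results; the function names and the rates are hypothetical. The point illustrated is that the transformation is exactly invertible, so forecasts of the product and ratio can always be mapped back to urban and rural rates.

```python
import math


def to_product_ratio(urban, rural):
    """Geometric mean and square-root ratio of two age-specific rate curves."""
    product = [math.sqrt(u * r) for u, r in zip(urban, rural)]
    ratio = [math.sqrt(u / r) for u, r in zip(urban, rural)]  # urban-to-rural (assumed)
    return product, ratio


def from_product_ratio(product, ratio):
    """Recover the urban and rural curves from the product and ratio curves."""
    urban = [p * q for p, q in zip(product, ratio)]
    rural = [p / q for p, q in zip(product, ratio)]
    return urban, rural


# Hypothetical smoothed ASFRs (births per woman per year); NOT survey values.
urban_rates = [0.04, 0.15, 0.20, 0.14, 0.08, 0.03, 0.01]
rural_rates = [0.07, 0.22, 0.26, 0.20, 0.13, 0.06, 0.02]

product, ratio = to_product_ratio(urban_rates, rural_rates)
urban_back, rural_back = from_product_ratio(product, ratio)
```

Coherence comes from modelling the ratio with a stationary process: the forecast ratio stays bounded, so the urban and rural forecasts cannot drift apart indefinitely.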
Data was analysed using R version 3.6.3. FTS and CFM were used to forecast the ASFRs for urban and rural areas with 80% prediction intervals (PIs). The R packages used to apply these models were 'demography' and 'forecast'. After computing the ASFR forecasts with 80% PIs, the TFR with 80% PIs was obtained for urban and rural regions using the formula $\mathrm{TFR} = 5 \times \sum_{a} \mathrm{ASFR}_a$, with the sum taken over the seven 5-year age groups from 15-19 to 45-49 years. The forecast accuracy of the two models at horizons of 5 and 10 years was evaluated using MAE, RMSE and MAPE. The 20-year forecasts (2008-27) of the FTS model coefficients for urban and rural regions were worked out.
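The TFR calculation can be illustrated as follows. The sketch assumes ASFRs expressed as births per woman per year for the seven 5-year age groups, and it assumes that the interval limits of the TFR are obtained by applying the same formula to the lower and upper ASFR limits, which is one reading of the procedure described above. All values and the function name are hypothetical.

```python
def total_fertility_rate(asfr_per_woman, group_width=5):
    """TFR as the sum of age-specific rates times the age-group width."""
    return group_width * sum(asfr_per_woman)


# Hypothetical forecast ASFRs for ages 15-19, 20-24, ..., 45-49 (NOT study results).
asfr_point = [0.05, 0.20, 0.25, 0.20, 0.12, 0.05, 0.01]
asfr_lower = [0.03, 0.15, 0.19, 0.15, 0.08, 0.03, 0.00]
asfr_upper = [0.08, 0.26, 0.32, 0.26, 0.17, 0.08, 0.02]

tfr_point = total_fertility_rate(asfr_point)
# Assumption: the interval limits are aggregated with the same formula.
tfr_lower = total_fertility_rate(asfr_lower)
tfr_upper = total_fertility_rate(asfr_upper)
```

Summing pointwise limits in this way can give a wider interval than one derived from simulated TFR paths, because the age-specific limits are unlikely to be reached at all ages simultaneously.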

Results
In the FTS model, the percentage of variation explained by the four basis functions for urban areas was 90.8%, 5.1%, 2.4% and 1.1%, respectively. For rural areas it was 89.4%, 3.9%, 2.0% and 0.8%. Hence, the four basis functions together explained approximately 99.4% of the total variability in the fertility rates of urban women, and 99.1% of the total variability in the fertility rates of rural women. The forecasts of the coefficients were then multiplied by the basis functions of the respective areas, and the results were added to obtain forecasts of the overall fertility curves. The major changes in fertility rates among Pakistani urban women occurred at the ages of 22 years and 40 years. In contrast, the major changes in fertility rates among Pakistani rural women occurred at the ages of 18 years and 37 years. No attempt was made to interpret the other basis functions as they involved second and higher-order effects. The 20-year forecasts for urban and rural regions showed that fertility declined in both regions, with a greater decline in the urban region than in the rural region (Figures 1-2).
In CFM, the 'product component of smooth log fertility rates' for the 1984-2007 period represented the geometric mean of the urban and rural fertility rates. The product component indicated that the fertility rates increased with age until age 27 years, and then declined. This decline was sharper in the age group of 25-35 years and relatively slower at the earlier and older ages. The ratio component of 'urban-to-rural smoothed log fertility rates' showed that the urban-to-rural fertility ratio was less than 1, which suggested that urban women had lower fertility than rural women at all ages during the 1984-2007 period. Additionally, the ratio decreased for women aged <25 years and >35 years, while the ratio for women aged 25-35 years was relatively stable. The 20-year forecasts of the product and ratio models showed a decline in fertility rates for both urban and rural women.
First, the FTS and CFM models were fitted to the ASFRs and the ASFR forecasts for both regions were obtained; the TFR for urban and rural regions was then obtained as five times the sum of the forecast ASFRs. The predicted TFR from FTS and CFM was 2.1 and 1.7 births per woman, respectively, for urban areas and 2.7 and 2.2 births per woman, respectively, for rural areas in 2027, indicating a sharper TFR decline under CFM than under FTS for 2027 (Table 1).
To compare the forecast accuracy at different forecast horizons for fertility rates in urban and rural areas, the available fertility data was split into two parts: the training set (1984-97) and the testing set (1998-2007). The training set was used to estimate the parameters of the two forecasting models, and the testing set was used to evaluate the accuracy of the models.
The errors obtained by fitting the FTS and CFM models to TFRs of urban and rural areas at all forecasting horizons (h=5 and 10) were noted. CFM provided the minimum value of errors compared to the FTS model at all forecast horizons for urban and rural areas. The projected fertility from CFM was more accurate than that from the FTS model (Table 2).
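The accuracy comparison can be sketched with the three error measures applied to a held-out test period. The split years follow the text; the observed and forecast TFR values are hypothetical, and taking the first h test years as the errors at horizon h is an assumption about how the horizons were evaluated.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error (observed values must be non-zero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


train_years = list(range(1984, 1998))  # 1984-97, used to fit the models
test_years = list(range(1998, 2008))   # 1998-2007, used to assess accuracy

# Hypothetical observed and forecast TFRs for the test years (NOT study results).
observed_tfr = [4.6, 4.5, 4.3, 4.2, 4.1, 4.0, 3.9, 3.8, 3.7, 3.6]
forecast_tfr = [4.5, 4.4, 4.4, 4.1, 4.0, 3.8, 3.8, 3.6, 3.6, 3.4]

# Assumption: errors at horizon h are computed over the first h test years.
errors_by_horizon = {
    h: (mae(observed_tfr[:h], forecast_tfr[:h]),
        rmse(observed_tfr[:h], forecast_tfr[:h]),
        mape(observed_tfr[:h], forecast_tfr[:h]))
    for h in (5, 10)
}
```

Repeating the calculation for each model and area, and choosing the model with the smaller values, mirrors the comparison summarised in Table 2.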

Discussion
The study presented the application of FTS and CFM models for forecasting the TFR of urban and rural areas in Pakistan. The overall pattern of fertility cannot be precisely predicted. However, aspirations for a better material existence in conjunction with increasing unemployment and economic uncertainty are likely to force the population to follow this path. 21 Women's health is said to be at risk if they give birth to children "too frequently". The sociocultural norms in Pakistan expose women to this risk. Social pressure to marry off daughters at an early age is still prevalent. Hence, many young women marry early and are expected to bear a child soon after marriage, and to continue childbearing into the late years of life. 22 Therefore, there is a need to control health problems by focussing on future fertility trends in urban and rural regions in the country.
Vol. 73, No. 7, July 2023
Fertility in Pakistan has registered some change, and this change is shared quite evenly by urban and rural residents, and, to some extent, across geographical regions. 22,23 The most important factor in the transition of fertility in Pakistan is the changing age at marriage, which is higher in urban areas than in rural areas. 24,25 The result might be that childbearing starts later in urban areas. Contraceptive use is much higher in urban areas, reducing the effect of shorter lactation to some extent. A study investigated the association between fertility patterns and child survival, 6 while another study concluded that the rise in the prevalence of contraceptives had intensified the fertility transition in Pakistan, and had resulted in a decrease in marital fertility. 25 Forecasting comparisons have influenced the field of forecasting greatly over the years, providing a solid basis for assessing different extrapolation approaches, and learning empirically how to advance forecasting theory and practice. 24 Some polynomial models have also been fitted to fertility rates in Pakistan and its urban and rural areas. 1 A study that used the coherent approach to forecast regional fertility in Serbia confirmed that the forecasts of the two regions were highly convergent in the long term. 19 Yasmeen et al. used the FTS model for forecasting fertility rates, 13 while different types of coherent forecasts have also been studied. 18,25 In the application of the FTS and CFM models, rural women of Pakistan showed higher fertility rates than urban women. The forecasts of future fertility rates (2008-27) suggested that these differences are expected to be maintained over the next 20 years. These predictions are consistent with the findings of an earlier study. 17 The comparison of the FTS model with the CFM showed that CFM forecasts were coherent and more plausible than those obtained from independent models fitted to each subgroup. They were also more accurate.
The application of an automatic forecasting strategy based on these models to the TFR of urban and rural areas showed that the CFM is appropriate for forecasting the TFR for the next 20 years in urban and rural areas. The predicted TFRs from the FTS model and the CFM showed that future fertility is expected to decline, but other sociodemographic factors may also affect it. The overall comparison of the models was made on the forecasted fertility rates for 2008-27 in urban and rural areas. The smallest error measures for forecasting TFRs were obtained with the CFM. The future TFRs of urban and rural areas showed that the reduction would be smaller in rural areas than in urban areas. These predictions are consistent with earlier findings. 13,19 The current study has limitations as some important risk factors that may influence fertility rates were not included, such as marriage age, husband's education, women's education, husband's occupation, and women's work status. Currently, FTS and CFM models do not incorporate such effects, but the smoothing process used in these models may reduce the variation attributable to such effects. Improving the prediction models is likely to require the inclusion of additional known risk factors, which may play an important role in describing the fertility pattern more accurately.
Fertility forecasts are an essential input to any population projection as they provide a component of long-term planning of children's services. In the context of Pakistan, the first step towards controlling fertility is to educate women about FP methods. In Pakistan, a large share of the population lives in rural areas where there are not enough health facilities, sanitation is poor, and the education level is low. The current forecast may help the government allocate more resources to FP services, contraceptive awareness programmes, and educational facilities in rural areas.

Conclusion
Rural areas should be the focus of FP interventions and policy initiatives for fertility control because two-thirds of the Pakistani population lives in these areas.