Statistical Annex

The Statistical Annex is available as a separate download.

Methodological Annex

1. Sanitation and Morbidity of Infectious Gastrointestinal Diseases

The analysis of the effects of sanitation on the incidence of diarrhea cross-referenced absence from work due to diarrhea and vomiting with access to sewage, access to treated water, and socioeconomic indicators. The data come from the 2013 National Health Survey (Pesquisa Nacional de Saúde, PNS) conducted by the IBGE. The socioeconomic indicators used in the econometric model are, for individuals: (i) gender and (ii) age group; and, for households: (iii) roofing material; (iv) garbage collection system; (v) availability of a refrigerator; (vi) unit of the Federation in which the individual lives; and (vii) area of the household (rural or urban).

A logistic regression model was used in which the dependent variable, absence from activities due to diarrhea, is binary, taking the value 1 for absence and 0 for non-absence. The logistic regression model is described by the following equation:

$$P(y = 1 \mid x_1, \dots, x_k) = G(\gamma_0 + \gamma_1 x_1 + \dots + \gamma_k x_k), \qquad G(z) = \frac{e^{z}}{1 + e^{z}}$$

where y is the dependent variable (absence due to diarrhea), x_j are the explanatory variables, with j = 1, 2, ..., k, and γ are the coefficients quantifying the relationships between these variables and the dependent variable. G is a function that takes values strictly between zero and one: 0 < G(z) < 1 for all real numbers z. This ensures that the estimated probabilities lie strictly between zero and one.
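As a minimal illustration of the functional form described above, the following Python sketch implements the logistic function G and evaluates one predicted probability. The intercept, coefficients and covariate values are hypothetical placeholders chosen only for illustration; they are not estimates from the survey.

```python
import math


def logistic(z):
    """Logistic function G(z) = exp(z) / (1 + exp(z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predicted_probability(intercept, coefs, xs):
    """P(y = 1 | x) = G(gamma_0 + gamma_1 * x_1 + ... + gamma_k * x_k)."""
    z = intercept + sum(g * x for g, x in zip(coefs, xs))
    return logistic(z)


# Hypothetical values for illustration only (not estimates from the PNS 2013).
gamma_0 = -2.0
gammas = [-0.2, -0.15]  # two illustrative sanitation indicators
x = [1.0, 0.0]          # household has the first service but not the second

p = predicted_probability(gamma_0, gammas, x)
print(f"Predicted probability of absence: {p:.4f}")
```

Whatever values are plugged in, the linear index is mapped into the unit interval, which is the property the text relies on.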

The model estimated to analyze the effect of sanitation on the probability of absence from routine activities due to diarrhea or vomiting produced statistically significant results with the expected signs. The greater the share of the population with access to treated water and to the sewage collection network, the lower the probability of absence from routine activities due to diarrhea or vomiting. The coefficients of these two variables are presented in Table A.M.1. The other control variables had the expected sign and are statistically significant.

Table A.M.1
Regression of absences due to diarrhea, Brazil, 2013

| Variable | Coefficient | Standard error | p-value |
|---|---|---|---|
| Access to treated water | -0.2243 | 0.0082 | 0.0000 |
| Access to sewage system | -0.1797 | 0.0055 | 0.0000 |

Source: Pesquisa Nacional de Saúde 2013 (IBGE, 2015). Note: Log-likelihood: 3,300,153.094. Elaboration: Ex Ante Consultoria Economica.
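One common way to read logit coefficients such as those in Table A.M.1 is to exponentiate them into odds ratios. The sketch below does this for the two reported coefficients. It assumes they are log-odds coefficients from the logistic model described above, and the interval uses a simple normal approximation that ignores the survey design; both are assumptions of this sketch, not statements from the annex.

```python
import math

# Coefficients and standard errors as reported in Table A.M.1.
table_am1 = {
    "Access to treated water": (-0.2243, 0.0082),
    "Access to sewage system": (-0.1797, 0.0055),
}


def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in the variable."""
    return math.exp(coef)


def wald_interval(coef, se, z=1.96):
    """Approximate 95% interval for the odds ratio (normal approximation)."""
    return math.exp(coef - z * se), math.exp(coef + z * se)


for name, (coef, se) in table_am1.items():
    ratio = odds_ratio(coef)
    low, high = wald_interval(coef, se)
    print(f"{name}: odds ratio {ratio:.3f} (approx. 95% interval {low:.3f} to {high:.3f})")
```

An odds ratio below one means lower odds of absence when the variable increases, which matches the negative signs in the table.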

 

 

2. Sanitation and Days of Leave due to Infectious Gastrointestinal Diseases

The analysis of the effects of sanitation on the number of days of leave due to diarrhea or vomiting identified the relationship between the number of days of leave reported in the PNS and the availability of sanitation (adequate access to water and sewage collection), controlling for a set of variables. The database used was the 2013 National Health Survey (PNS) conducted by the IBGE, and the control variables were: (i) gender; (ii) age group; (iii) roofing material of the dwelling; (iv) waste collection system; (v) availability of a refrigerator; (vi) unit of the Federation in which the individual lives; (vii) area of housing (rural or urban); and (viii) place of residence (capital, metropolitan regions or interior).

The econometric model used was a Poisson model. This type of model is used when the dependent variable is a count variable, in this case the number of days of leave (1, 2, 3, etc.). The technique models the expected value as an exponential function, according to the following equation:

$$E(y \mid x_1, \dots, x_k) = \exp(\gamma_0 + \gamma_1 x_1 + \dots + \gamma_k x_k)$$

Since exp(·) is always positive, the equation guarantees that the predicted values of y are always positive. For inference using the Poisson model, see Wooldridge (2006).
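The positivity argument above can be sketched in a few lines of Python. The function below computes the exponential conditional mean, and a second helper gives the Poisson probability of each count for that mean. The intercept, coefficients and covariates are hypothetical and serve only to show the mechanics.

```python
import math


def poisson_mean(intercept, coefs, xs):
    """E(y | x) = exp(gamma_0 + sum_j gamma_j * x_j); strictly positive by construction."""
    return math.exp(intercept + sum(g * x for g, x in zip(coefs, xs)))


def poisson_pmf(count, mean):
    """P(y = count) for a Poisson-distributed variable with the given mean."""
    return math.exp(-mean) * mean ** count / math.factorial(count)


# Hypothetical coefficients and covariates, for illustration only.
mu = poisson_mean(1.0, [-0.1, -0.2], [1.0, 1.0])
print(f"Expected number of days: {mu:.3f}")
for days in range(4):
    print(f"P(days = {days}) = {poisson_pmf(days, mu):.3f}")
```

Even a very negative linear index yields a small but positive expected count, which is why the exponential form suits count outcomes.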

The estimated model produced statistically significant results with the expected signs. The greater the share of the population with access to sewage, the smaller the number of days of leave due to diarrhea or vomiting. Access to treated water also had a beneficial effect, shortening the duration of leave. The other control variables had the expected sign and are statistically significant.

Table A.M.2
Days of absence due to diarrhea or vomiting, Brazil, 2013

| Variable | Coefficient | Standard error | p-value |
|---|---|---|---|
| Access to treated water | -0.0594 | 0.0019 | - |
| Access to sewage system | -0.1681 | 0.0020 | - |

Source: Pesquisa Nacional de Saúde 2013 (IBGE, 2015).
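In a Poisson model with an exponential mean, a coefficient b implies that the expected count is multiplied by exp(b) when the variable increases by one unit. The sketch below applies that standard conversion to the coefficients of Table A.M.2 and expresses the result as a percentage change in expected days of leave. It is illustrative arithmetic under the model form described above, not a calculation taken from the report.

```python
import math

# Coefficients as reported in Table A.M.2.
table_am2 = {
    "Access to treated water": -0.0594,
    "Access to sewage system": -0.1681,
}


def percent_change_in_expected_count(coef):
    """Percentage change in E(y | x) for a one-unit increase in the variable."""
    return 100.0 * (math.exp(coef) - 1.0)


for name, coef in table_am2.items():
    change = percent_change_in_expected_count(coef)
    print(f"{name}: {change:+.2f}% expected days of leave")
```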

 

 

3. Sanitation and School Lag

The analysis of the effects of sanitation on school performance was based on the dependent variable school lag, constructed as the difference between the person's years of study and the school year they should be attending. The analysis was applied only to school-age individuals. The database used was the 2016 Continuous National Household Sample Survey (PNADC), and the control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) housing wall material; (vii) housing roof material; (viii) garbage collection system; (ix) unit of the Federation in which the individual lives; (x) area of housing (rural or urban); and (xi) place of residence (capital, metropolitan regions or interior).
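The construction of the school lag variable can be sketched as follows. The annex does not state the school entry age or how negative differences are treated, so the entry age of 6 and the floor at zero used here are assumptions made only for this illustration, and the function names are hypothetical.

```python
# Assumption for this sketch: children start the first school year at age 6.
ASSUMED_SCHOOL_ENTRY_AGE = 6


def expected_years_of_study(age, entry_age=ASSUMED_SCHOOL_ENTRY_AGE):
    """Years of study a person of this age would have completed with no delay."""
    return max(age - entry_age, 0)


def school_lag(age, years_of_study, entry_age=ASSUMED_SCHOOL_ENTRY_AGE):
    """Years of lag: expected years of study minus actual years, floored at zero."""
    return max(expected_years_of_study(age, entry_age) - years_of_study, 0)


# Hypothetical individuals: (age, completed years of study).
for age, years in [(10, 4), (12, 4), (15, 6)]:
    print(f"age {age}, {years} years of study -> lag of {school_lag(age, years)} year(s)")
```

The result is a non-negative integer, which is what makes a count model a natural choice for this variable.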

The econometric model used was a Poisson model. This type of model is used when the dependent variable is a count variable; in this case, the number of years of school lag. The technique models the expected value as an exponential function, according to the following equation:

$$E(y \mid x_1, \dots, x_k) = \exp(\gamma_0 + \gamma_1 x_1 + \dots + \gamma_k x_k)$$

Since exp(·) is always positive, the equation guarantees that the predicted values of y are always positive. For inference using the Poisson model, see Wooldridge (2006).

The estimated model produced statistically significant results with the expected signs. The greater the share of the population with access to sewage, the lower the school lag; that is, access to this service contributes positively to school performance. Access to treated water also had a beneficial effect, reducing the school lag. The other control variables had the expected sign and are statistically significant.

Table A.M.3
Regression of school lag, Brazil, 2016

| Variable | Coefficient | Standard error | p-value |
|---|---|---|---|
| Access to treated water | -0.0111 | 0.0002 | 0.0000 |
| Access to sewage system | -0.0151 | 0.0002 | 0.0000 |
| Bathroom availability | -0.0731 | 0.0004 | 0.0000 |

Source: PNADC 2016 (IBGE, 2017). Elaboration: Ex Ante Consultoria Economica.

 

 

4. Sanitation and School Performance

The analysis of the effects of sanitation on school performance was based on cross-referencing performance in the ENEM 2016 tests with data on the availability of a bathroom in the household and a broad set of socioeconomic control indicators. The population analyzed was between 15 and 29 years of age. The database used was the ENEM 2016 microdata provided by INEP. The control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) housing wall material; (vii) housing roof material; (viii) garbage collection system; (ix) unit of the Federation in which the individual lives; (x) area of housing (rural or urban); and (xi) place of residence (capital, metropolitan regions or interior).

The econometric models were linear equations estimated by OLS, in which the dependent variables are the grades (D_i) in the tests of natural sciences (CN), humanities (CH), languages and codes (LC), mathematics (MT), and essay (RE). A regression was also estimated for the average of the grades in the five tests (average). The following equation describes the statistical model:

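$$D_i = \gamma_0 + \gamma_1 x_1 + \dots + \gamma_k x_k + \varepsilon, \qquad i \in \{\text{CN}, \text{CH}, \text{LC}, \text{MT}, \text{RE}, \text{average}\}$$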

The regression results are presented in Table A.M.4. As expected, the absence of a bathroom in the student's home reduces the student's grades in all ENEM tests. The table also shows the interaction between gender and the availability of a bathroom in the candidate's household. With the exception of the mathematics test, where the interaction is positive (that is, among women the unavailability of a bathroom has a smaller effect on the test score), the unavailability of a bathroom has an additional negative effect on women's grades in the other tests.

Table A.M.4
Regression of school performance, Brazil, 2016

| Partial effect of the existence of a bathroom in the house | Coefficient | Standard error | p-value |
|---|---|---|---|
| Natural Sciences | -1.8478 | 0.4883 | 0.0000 |
| Humanities | -5.8168 | 0.5001 | 0.0000 |
| Languages and Codes | -4.4733 | 0.4696 | 0.0000 |
| Math | -4.6107 | 0.1515 | 0.0002 |
| Essay | 0.7207 | 0.2307 | 0.0000 |
| Average | -6.2637 | 0.4760 | 0.0000 |

| Interaction of the partial effect with the female gender | Coefficient | Standard error | p-value |
|---|---|---|---|
| Natural Sciences | -0.6865 | 0.6339 | 0.0000 |
| Humanities | -1.1645 | 0.6493 | 0.0000 |
| Languages and Codes | -3.3668 | 0.6096 | 0.0000 |
| Math | 4.1588 | 0.8942 | 0.0000 |
| Essay | -4.2797 | 1.3620 | 0.0000 |
| Average | -1.0677 | 0.6180 | 0.0000 |

Source: ENEM 2016 (INEP, 2017). Elaboration: Ex Ante Consultoria Economica.
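To read the two blocks of Table A.M.4 together, the partial effect and its interaction with the female gender can be added. The sketch below does this arithmetic. It assumes that men are the reference group and that the interaction coefficient simply adds to the partial effect for women; the annex does not state either point explicitly, so both are assumptions of this illustration.

```python
# Coefficients as reported in Table A.M.4.
partial_effect = {
    "Natural Sciences": -1.8478,
    "Humanities": -5.8168,
    "Languages and Codes": -4.4733,
    "Math": -4.6107,
    "Essay": 0.7207,
    "Average": -6.2637,
}
interaction_female = {
    "Natural Sciences": -0.6865,
    "Humanities": -1.1645,
    "Languages and Codes": -3.3668,
    "Math": 4.1588,
    "Essay": -4.2797,
    "Average": -1.0677,
}


def combined_effect_for_women(test):
    """Partial effect plus the female interaction term (assumed additive)."""
    return partial_effect[test] + interaction_female[test]


for test in partial_effect:
    print(f"{test}: reference group {partial_effect[test]:+.4f}, "
          f"women {combined_effect_for_women(test):+.4f}")
```

Under these assumptions the positive interaction in mathematics largely offsets the partial effect for women, while in the other tests the two terms reinforce each other.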

 

 

5. Sanitation and Productivity

The analysis of the effects of sanitation on labor income was based on cross-referencing hourly compensation with data on access to sewage, access to treated water, availability of a bathroom in the household, and a broad set of socioeconomic control indicators. The database used was the 2016 Continuous National Household Sample Survey (PNADC). The control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) housing wall material; (vii) housing roof material; (viii) garbage collection system; (ix) unit of the Federation in which the individual lives; (x) area of housing (rural or urban); and (xi) place of residence (capital, metropolitan regions or interior).

The econometric model was a linear model estimated by OLS, in which the dependent variable, mean hourly compensation, was transformed into its natural logarithm (ln y) for better statistical fit. The following equation describes the statistical model:

$$\ln y = \gamma_0 + \gamma_1 x_1 + \dots + \gamma_k x_k + \varepsilon$$

The regression results are presented in Table A.M.5. All three sanitation coefficients are positive and statistically significant. The larger the share of the population with access to sewage, the greater its labor income. Access to treated water also positively affects workers' income. The absence of a bathroom in the household reduces expected average hourly remuneration by 21.7%.

Table A.M.5
Productivity regression, Brazil, 2016

| Variable | Coefficient | Standard error | p-value |
|---|---|---|---|
| Access to treated water | 0.0314 | 0.0003 | 0.0000 |
| Access to sewage system | 0.0695 | 0.0003 | 0.0000 |
| Bathroom availability | 0.2150 | 0.0014 | 0.0000 |

Source: PNADC 2016 (IBGE, 2017). Elaboration: Ex Ante Consultoria Economica.
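When the dependent variable is in logarithms, a coefficient b on an indicator variable corresponds to an exact proportional change of exp(b) - 1 in the untransformed variable, and for small b this is close to b itself. The sketch below applies that standard conversion to the coefficients of Table A.M.5. It is illustrative arithmetic only: the annex does not state how the report's own 21.7% figure was computed, and this sketch does not attempt to reproduce it.

```python
import math

# Coefficients as reported in Table A.M.5 (dependent variable: ln of hourly compensation).
table_am5 = {
    "Access to treated water": 0.0314,
    "Access to sewage system": 0.0695,
    "Bathroom availability": 0.2150,
}


def percent_gain_with(coef):
    """Exact percentage difference in y when the indicator goes from 0 to 1."""
    return 100.0 * (math.exp(coef) - 1.0)


def percent_loss_without(coef):
    """Exact percentage difference in y when the indicator goes from 1 to 0."""
    return 100.0 * (1.0 - math.exp(-coef))


for name, coef in table_am5.items():
    print(f"{name}: +{percent_gain_with(coef):.1f}% with, "
          f"-{percent_loss_without(coef):.1f}% without")
```

The gain and the loss differ because they are measured against different baselines, a distinction that matters more as the coefficient grows.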

 

 

6. Factors Determining Access to Sanitation

The analysis of the determinants of access to sanitation was based on cross-referencing access to sewage and access to treated water with socioeconomic indicators. The data come from the 2016 Continuous National Household Sample Survey (PNADC) carried out by the IBGE. The control variables were, for individuals: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) income; and, for households: (vii) housing wall material; (viii) housing roof material; (ix) garbage collection system; (x) unit of the Federation in which the individual lives; (xi) area of housing (rural or urban); and (xii) place of residence (capital, metropolitan regions or interior).

A logistic regression model was used in which the dependent variables, not having access to treated water and not having access to the sewage collection service, are binary, taking the value 1 for no access and 0 for access. The logistic regression model is described by the following equation:

$$P(y = 1 \mid x_1, \dots, x_k) = G(\gamma_0 + \gamma_1 x_1 + \dots + \gamma_k x_k), \qquad G(z) = \frac{e^{z}}{1 + e^{z}}$$

where y is the dependent variable (not having access to treated water, or not having access to the sewage collection service), x_j are the explanatory variables, with j = 1, 2, ..., k, and γ are the coefficients quantifying the relationships between these variables and the dependent variable. G is a function that takes values strictly between zero and one: 0 < G(z) < 1 for all real numbers z. This ensures that the estimated probabilities lie strictly between zero and one.

The models estimated to analyze the probabilities of not having access to treated water and of not having access to the sewage collection service presented satisfactory results. The coefficients of the main explanatory variables used to estimate the probabilities are shown in Table A.M.6.

Table A.M.6
Regressions of the probabilities of not having access to treated water or of not having access to the sewage collection service, Brazil, 2016

| Variable | Inadequate water access: Coefficient | Inadequate water access: Standard error | Inadequate water access: p-value | Inadequate access to sewage collection: Coefficient | Inadequate access to sewage collection: Standard error | Inadequate access to sewage collection: p-value |
|---|---|---|---|---|---|---|
| Urban Area | -2.4714 | 0.0006 | 0.0000 | 2.0001 | 0.0007 | 0.0000 |
| Capital Area | -0.8233 | 0.0006 | 0.0000 | -1.2156 | 0.0006 | 0.0000 |
| Other municipalities in the metropolitan region | 0.6203 | 0.0006 | 0.0000 | -0.1795 | 0.0006 | 0.0000 |
| Municipalities in Integrated Development Regions | -0.0530 | 0.0024 | 0.0000 | 0.4554 | 0.0020 | 0.0000 |
| Male Gender | 0.0334 | 0.0004 | 0.0000 | 0.0436 | 0.0004 | 0.0000 |
| up to 4 years old | 0.0271 | 0.0015 | 0.0000 | 0.1311 | 0.0015 | 0.0000 |
| from 5 to 14 years of age | 0.1431 | 0.0016 | 0.0000 | 0.3243 | 0.0017 | 0.0000 |
| from 15 to 19 years of age | 0.2071 | 0.0016 | 0.0000 | 0.4278 | 0.0016 | 0.0000 |
| from 20 to 29 years of age | 0.1663 | 0.0016 | 0.0000 | 0.3656 | 0.0016 | 0.0000 |
| from 30 to 39 years of age | 0.1768 | 0.0015 | 0.0000 | 0.2756 | 0.0015 | 0.0000 |
| from 40 to 59 years of age | 0.1492 | 0.0015 | 0.0000 | 0.1417 | 0.0016 | 0.0000 |
| White Race | 0.0193 | 0.0276 | 0.4851 | -0.9181 | 0.0235 | 0.0000 |
| Black Race | 0.0191 | 0.0276 | 0.4885 | -0.7722 | 0.0235 | 0.0000 |
| Of Asian descent | -0.0853 | 0.0278 | 0.0022 | -1.2276 | 0.0238 | 0.0000 |
| Multiracial | 0.0485 | 0.0276 | 0.0790 | -0.7325 | 0.0235 | 0.0000 |
| Indigenous | -0.1318 | 0.0279 | 0.0000 | -0.7300 | 0.0238 | 0.0000 |
| Uneducated | 0.3542 | 0.0011 | 0.0000 | 0.6503 | 0.0011 | 0.0000 |
| Incomplete elementary school | 0.3271 | 0.0010 | 0.0000 | 0.5553 | 0.0009 | 0.0000 |
| Complete elementary school | 0.2184 | 0.0011 | 0.0000 | 0.3121 | 0.0011 | 0.0000 |
| Incomplete high school | 0.1644 | 0.0012 | 0.0000 | 0.3415 | 0.0012 | 0.0000 |
| Complete high school | 0.1187 | 0.0009 | 0.0000 | 0.1671 | 0.0009 | 0.0000 |
| Incomplete higher education | 0.0784 | 0.0014 | 0.0000 | -0.0130 | 0.0013 | 0.0000 |
| Income class - 1st decile | 0.6257 | 0.0012 | 0.0000 | 0.9935 | 0.0012 | 0.0000 |
| Income class - 2nd decile | 0.5923 | 0.0012 | 0.0000 | 0.7186 | 0.0012 | 0.0000 |
| Income class - 3rd decile | 0.5922 | 0.0012 | 0.0000 | 0.6161 | 0.0011 | 0.0000 |
| Income class - 4th decile | 0.4565 | 0.0012 | 0.0000 | 0.5151 | 0.0011 | 0.0000 |
| Income class - 5th decile | 0.4880 | 0.0011 | 0.0000 | 0.5083 | 0.0011 | 0.0000 |
| Income class - 6th decile | 0.4345 | 0.0011 | 0.0000 | 0.4539 | 0.0011 | 0.0000 |
| Income class - 7th decile | 0.3450 | 0.0011 | 0.0000 | 0.3291 | 0.0011 | 0.0000 |
| Income class - 8th decile | 0.2750 | 0.0011 | 0.0000 | 0.2548 | 0.0011 | 0.0000 |
| Income class - 9th decile | 0.3324 | 0.0011 | 0.0000 | 0.1856 | 0.0011 | 0.0000 |

Source: PNADC 2016 (IBGE, 2017). Elaboration: Ex Ante Consultoria Economica.
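Coefficients on categories of the same variable in Table A.M.6 can be compared with each other: in a logit model, the odds ratio between two categories is the exponential of the difference of their coefficients. The sketch below applies this to a few income deciles in the "inadequate water access" regression. It assumes that all decile coefficients are measured against the same omitted reference category, which the annex does not name; the selection of deciles is arbitrary.

```python
import math

# "Inadequate water access" coefficients for selected income deciles (Table A.M.6).
water_income_coef = {1: 0.6257, 5: 0.4880, 9: 0.3324}


def odds_ratio_between(coef_a, coef_b):
    """Odds of the outcome in category a relative to category b, other variables equal."""
    return math.exp(coef_a - coef_b)


for decile in (1, 5):
    ratio = odds_ratio_between(water_income_coef[decile], water_income_coef[9])
    print(f"Decile {decile} vs decile 9: odds of inadequate water access x {ratio:.2f}")
```

A ratio above one indicates that the lower-income decile has higher odds of lacking adequate water access than the 9th decile, holding the other variables fixed.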