## 1. Sanitation and Morbidity of Infectious Gastrointestinal Diseases

The analysis of the effects of sanitation on the incidence of diarrhea cross-referenced absences from routine activities due to diarrhea or vomiting with access to sewage collection, access to treated water, and socioeconomic indicators. The data come from the 2013 National Health Survey (Pesquisa Nacional de Saúde) conducted by the IBGE. The socioeconomic indicators used in the econometric model are, for individuals: (i) gender and (ii) age group; and, for households: (iii) roofing material; (iv) garbage collection system; (v) availability of a refrigerator; (vi) unit of the Federation in which the individual lives; and (vii) area of the household (rural or urban).

A logistic regression model was used in which the dependent variable, absence from activities due to diarrhea, is binary, taking the value 1 for absence and 0 for non-absence. The logistic regression model is described by the following equation:

$$P(y = 1 \mid x_1, \ldots, x_k) = G(\gamma_0 + \gamma_1 x_1 + \cdots + \gamma_k x_k), \qquad G(z) = \frac{\exp(z)}{1 + \exp(z)}$$

in which $y$ represents the dependent variable (absence due to diarrhea), $x_j$ is the information provided by the set of explanatory variables, with $j = 1, 2, \ldots, k$, and $\gamma_j$ are the coefficients quantifying the relationships between these variables and the dependent variable. $G$ is a function that takes values strictly between zero and one: $0 < G(z) < 1$ for all real numbers $z$. This ensures that the estimated probabilities are strictly between zero and one.
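The bounding property of G can be checked numerically. The sketch below assumes the standard logistic form G(z) = exp(z) / (1 + exp(z)); the function name `logistic` and the grid of values are illustrative and do not come from the source.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function G(z) = exp(z) / (1 + exp(z)).

    The two branches are algebraically identical; splitting on the sign
    of z avoids overflow in math.exp for large |z|.
    """
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative values of the linear index z (not survey data).
grid = [-20.0, -5.0, -1.0, 0.0, 1.0, 5.0, 20.0]
probabilities = [logistic(z) for z in grid]

for z, p in zip(grid, probabilities):
    print(f"z = {z:6.1f}  ->  G(z) = {p:.6f}")
```

Mathematically G never reaches 0 or 1; in floating-point arithmetic the result can round to exactly 0 or 1 for very large |z|, which is why the grid stops at moderate values.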

The model estimated to analyze the effect of sanitation on the probability of absence from routine activities due to diarrhea or vomiting produced satisfactory results. The greater the share of the population with access to treated water and to the sewage collection network, the lower the probability of absence from routine activities due to diarrhea or vomiting. The coefficients of these two variables are presented in Table A.M.1. The other control variables had the expected sign and are statistically significant.

**Table A.M.1**

Regression of absences due to diarrhea, Brazil, 2013

Variable | Coefficient | Standard error | p-value |
---|---|---|---|
Access to treated water | -0.2243 | 0.0082 | 0.0000 |
Access to sewage system | -0.1797 | 0.0055 | 0.0000 |

Source: Pesquisa Nacional de Saúde 2013 (IBGE, 2015). Note: log-likelihood: 3,300,153.094. Elaboration: Ex Ante Consultoria Economica.
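One common way to read logit coefficients such as those in Table A.M.1 is as odds ratios. The sketch below assumes the reported coefficients are on the log-odds scale, as in a standard logit; the variable names are illustrative.

```python
import math

# Logit coefficients copied from Table A.M.1.
coefficients = {
    "Access to treated water": -0.2243,
    "Access to sewage system": -0.1797,
}

# exp(coefficient) is the multiplicative change in the odds of absence
# associated with a one-unit increase in the variable, holding the other
# regressors fixed.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.3f}")
```

An odds ratio below one means lower odds of absence, which agrees with the negative signs in the table.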

## 2. Sanitation and Days of Leave due to Infectious Gastrointestinal Diseases

The analysis of the effects of sanitation on the number of days of leave due to diarrhea or vomiting related the number of days of leave reported in the National Health Survey (PNS) to the availability of sanitation (adequate access to water and sewage collection), controlling for a set of variables. The database used was the 2013 National Health Survey conducted by the IBGE, and the control variables were: (i) gender; (ii) age group; (iii) roofing material of the household; (iv) garbage collection system; (v) availability of a refrigerator; (vi) unit of the Federation in which the individual lives; (vii) area of housing (rural or urban); and (viii) place of residence (capital, metropolitan regions or interior).

The econometric model used was a Poisson model. This type of model is used when the dependent variable is a count variable, in this case the number of days of leave (1, 2, 3, etc.). The technique consists of modeling the expected value as an exponential function according to the following equation:

$$E(y \mid x_1, \ldots, x_k) = \exp(\gamma_0 + \gamma_1 x_1 + \cdots + \gamma_k x_k)$$

where $x_j$ are the explanatory variables and $\gamma_j$ their coefficients. Since $\exp(\cdot)$ is always positive, the equation guarantees that the predicted values of $y$ will always be positive. On inference with the Poisson model, see Wooldridge (2006).

The estimated model produced satisfactory results. The greater the share of the population with access to sewage collection, the smaller the number of days of leave due to diarrhea or vomiting. Access to treated water also had a favorable effect, helping to shorten the duration of the leave. The other control variables had the expected sign and are statistically significant.

**Table A.M.2**

Days of absence due to diarrhea or vomiting, Brazil, 2013

Variable | Coefficient | Standard error | p-value |
---|---|---|---|
Access to treated water | -0.0594 | 0.0019 | - |
Access to sewage system | -0.1681 | 0.0020 | - |

Source: Pesquisa Nacional de Saúde 2013 (IBGE, 2015).
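Because the Poisson model uses an exponential mean, a coefficient can be converted into a percentage change in the expected count. The sketch below applies that conversion to the coefficients of Table A.M.2; the helper name `percent_change` is illustrative, and the reading assumes a one-unit increase in the variable with the other regressors held fixed.

```python
import math

# Poisson coefficients copied from Table A.M.2.
coefficients = {
    "Access to treated water": -0.0594,
    "Access to sewage system": -0.1681,
}


def percent_change(coefficient: float) -> float:
    """Percentage change in the expected count for a one-unit increase."""
    return 100.0 * (math.exp(coefficient) - 1.0)


effects = {name: percent_change(b) for name, b in coefficients.items()}

for name, pct in effects.items():
    print(f"{name}: {pct:+.2f}% expected days of leave")
```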

## 3. Sanitation and School Lag

The analysis of the effects of sanitation on school performance was based on the dependent variable school delay, constructed as the difference between the person's years of study and the grade they should be attending. This analysis was applied only to school-age individuals. The database used was the 2016 Continuous National Household Sample Survey (PNADC), and the control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) housing wall material; (vii) housing roof material; (viii) garbage collection system; (ix) unit of the Federation in which the individual lives; (x) area of housing (rural or urban); and (xi) place of residence (capital, metropolitan regions or interior).
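The construction of the dependent variable can be sketched as follows. The function name, the sample records and the floor at zero (so that the variable is a non-negative count suitable for a count model) are assumptions made for illustration; the source does not describe how students ahead of the expected grade are treated.

```python
def school_delay(expected_years: int, completed_years: int) -> int:
    """Years of school delay: expected minus completed years of study.

    Assumption: students at or ahead of the expected grade get a delay of 0.
    """
    return max(0, expected_years - completed_years)


# Hypothetical students: (years they should have completed, years completed).
students = [(8, 8), (8, 6), (5, 6)]
delays = [school_delay(expected, done) for expected, done in students]

for (expected, done), delay in zip(students, delays):
    print(f"expected={expected}, completed={done} -> delay={delay}")
```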

The econometric model used was a Poisson model. This type of model is used when the dependent variable is a count variable; in this case, the variable is the number of years of school delay. The technique consists of modeling the expected value as an exponential function according to the following equation:

$$E(y \mid x_1, \ldots, x_k) = \exp(\gamma_0 + \gamma_1 x_1 + \cdots + \gamma_k x_k)$$

where $x_j$ are the explanatory variables and $\gamma_j$ their coefficients. Since $\exp(\cdot)$ is always positive, the equation guarantees that the predicted values of $y$ will always be positive. On inference with the Poisson model, see Wooldridge (2006).

The estimated model produced satisfactory results. The greater the share of the population with access to sewage collection, the lower the school delay; that is, access to this service contributes positively to school performance. Access to treated water also had a favorable effect, helping to reduce the school delay. The other control variables had the expected sign and are statistically significant.

**Table A.M.3**

Regression of school delay, Brazil, 2016

Variable | Coefficient | Standard error | p-value |
---|---|---|---|
Access to treated water | -0.0111 | 0.0002 | 0.0000 |
Access to sewage system | -0.0151 | 0.0002 | 0.0000 |
Bathroom availability | -0.0731 | 0.0004 | 0.0000 |

Source: PNADC 2016 (IBGE, 2017). Elaboration: Ex Ante Consultoria Economica.

## 4. Sanitation and School Performance

The analysis of the effects of sanitation on school performance cross-referenced performance in the ENEM 2016 tests with data on the availability of a bathroom in the household and a broad set of socioeconomic control indicators. The population analyzed was between 15 and 29 years of age. The database used in this evaluation was the ENEM 2016 microdata provided by INEP. The control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) housing wall material; (vii) housing roof material; (viii) garbage collection system; (ix) unit of the Federation in which the individual lives; (x) area of housing (rural or urban); and (xi) place of residence (capital, metropolitan regions or interior).

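The econometric models used were linear equations estimated by OLS, in which the dependent variables are the grades in the tests ($D_i$) of: natural sciences (CN), humanities (CH), languages and codes (LC), mathematics (MT), and writing (RE). A regression was also estimated for the average of the grades of the five tests (average). The following equation describes the statistical model:

$$D_i = \gamma_0 + \gamma_1 x_1 + \cdots + \gamma_k x_k + \varepsilon$$

where $i$ indexes the test, $x_j$ are the explanatory variables, $\gamma_j$ their coefficients and $\varepsilon$ the error term.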

The regression results are presented in Table A.M.4. The estimated models produced satisfactory results. As expected, the absence of a bathroom in the student's home reduces the student's grades in the ENEM tests. The table also shows the interaction between gender and the availability of a bathroom in the candidate's household. The interaction is positive only in the math test, meaning that among women the unavailability of a bathroom has a smaller effect on the test score. In the other tests the interaction is negative, so the unavailability of a bathroom has an additional negative effect on women's grades.

**Table A.M.4**

Regression of school performance, Brazil, 2016

Partial effect of the existence of bathroom in the house | Coefficient | Standard error | p-value |
---|---|---|---|
Natural Sciences | -1.8478 | 0.4883 | 0.0000 |
Humanities | -5.8168 | 0.5001 | 0.0000 |
Languages and Codes | -4.4733 | 0.4696 | 0.0000 |
Math | -4.6107 | 0.1515 | 0.0002 |
Essay | 0.7207 | 0.2307 | 0.0000 |
Average | -6.2637 | 0.4760 | 0.0000 |

Interaction of the partial effect with the female gender | Coefficient | Standard error | p-value |
---|---|---|---|
Natural Sciences | -0.6865 | 0.6339 | 0.0000 |
Humanities | -1.1645 | 0.6493 | 0.0000 |
Languages and Codes | -3.3668 | 0.6096 | 0.0000 |
Math | 4.1588 | 0.8942 | 0.0000 |
Essay | -4.2797 | 1.3620 | 0.0000 |
Average | -1.0677 | 0.6180 | 0.0000 |

Source: ENEM 2016 (INEP, 2017). Elaboration: Ex Ante Consultoria Economica.
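To see how the two panels of Table A.M.4 combine, the sketch below adds the partial effect and the female interaction for each test. It assumes a standard interaction specification, in which the effect for women is the sum of the two coefficients and the effect for men is the partial effect alone; the dictionary names are illustrative.

```python
# Coefficients copied from Table A.M.4.
partial_effect = {
    "Natural Sciences": -1.8478,
    "Humanities": -5.8168,
    "Languages and Codes": -4.4733,
    "Math": -4.6107,
    "Essay": 0.7207,
    "Average": -6.2637,
}
female_interaction = {
    "Natural Sciences": -0.6865,
    "Humanities": -1.1645,
    "Languages and Codes": -3.3668,
    "Math": 4.1588,
    "Essay": -4.2797,
    "Average": -1.0677,
}

# Effect for women = partial effect + interaction (assumed specification).
effect_for_women = {
    test: round(partial_effect[test] + female_interaction[test], 4)
    for test in partial_effect
}

for test, effect in effect_for_women.items():
    print(f"{test}: men {partial_effect[test]:+.4f}, women {effect:+.4f}")
```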

## 5. Sanitation and Productivity

The analysis of the effects of sanitation on labor income cross-referenced hourly compensation with data on access to sewage collection, access to treated water, availability of a bathroom in the household, and a broad set of socioeconomic control indicators. The database used in this evaluation was the 2016 Continuous National Household Sample Survey (PNADC). The control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) housing wall material; (vii) housing roof material; (viii) garbage collection system; (ix) unit of the Federation in which the individual lives; (x) area of housing (rural or urban); and (xi) place of residence (capital, metropolitan regions or interior).

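The econometric model used was a linear model estimated by OLS, in which the dependent variable, mean hourly compensation, was transformed into its natural logarithm ($\ln y$) for better statistical adequacy. The following equation describes the statistical model:

$$\ln y = \gamma_0 + \gamma_1 x_1 + \cdots + \gamma_k x_k + \varepsilon$$

where $x_j$ are the explanatory variables, $\gamma_j$ their coefficients and $\varepsilon$ the error term.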

The regression results are presented in Table A.M.5. The estimated model produced satisfactory results. The larger the share of the population with access to sewage collection, the higher its labor income. Access to treated water also positively affects workers' income. The absence of a bathroom in the household reduces the expected average hourly remuneration by 21.7%.

**Table A.M.5**

Productivity regression, Brazil, 2016

Variable | Coefficient | Standard error | p-value |
---|---|---|---|
Access to treated water | 0.0314 | 0.0003 | 0.0000 |
Access to sewage system | 0.0695 | 0.0003 | 0.0000 |
Bathroom availability | 0.2150 | 0.0014 | 0.0000 |

Source: PNADC 2016 (IBGE, 2017). Elaboration: Ex Ante Consultoria Economica.
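In a regression with a log-transformed dependent variable, a coefficient can be turned into a percentage change in the level of the variable. The sketch below shows both the usual approximation (100 times the coefficient) and the exact conversion for the coefficients of Table A.M.5. The helper name is illustrative, and the results need not coincide with percentages quoted in the text, which may rest on a different conversion or specification.

```python
import math

# Log-linear (OLS on ln y) coefficients copied from Table A.M.5.
coefficients = {
    "Access to treated water": 0.0314,
    "Access to sewage system": 0.0695,
    "Bathroom availability": 0.2150,
}


def exact_percent_change(coefficient: float) -> float:
    """Exact percentage change in y for a one-unit increase in the regressor."""
    return 100.0 * (math.exp(coefficient) - 1.0)


approximate = {name: 100.0 * b for name, b in coefficients.items()}
exact = {name: exact_percent_change(b) for name, b in coefficients.items()}

for name in coefficients:
    print(f"{name}: approx {approximate[name]:.2f}%, exact {exact[name]:.2f}%")
```

The approximation and the exact value are close for small coefficients and drift apart as the coefficient grows.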

## 6. Factors Determining Access to Sanitation

The analysis of the determinants of access to sanitation cross-referenced access to sewage collection and access to treated water with socioeconomic indicators. The data come from the 2016 Continuous National Household Sample Survey (PNADC) carried out by the IBGE. The control variables were: (i) age; (ii) age squared; (iii) gender; (iv) race; (v) schooling; (vi) income; and household information: (vii) housing wall material; (viii) housing roof material; (ix) garbage collection system; (x) unit of the Federation in which the individual lives; (xi) area of housing (rural or urban); and (xii) place of residence (capital, metropolitan regions or interior).

Logistic regression models were used in which the dependent variables, not having access to treated water and not having access to the sewage collection service, are binary, taking the value 1 for not having access and 0 for having access. The logistic regression model is described by the following equation:

$$P(y = 1 \mid x_1, \ldots, x_k) = G(\gamma_0 + \gamma_1 x_1 + \cdots + \gamma_k x_k), \qquad G(z) = \frac{\exp(z)}{1 + \exp(z)}$$

where $y$ is the dependent variable (not having access to the service), $x_j$ is the information provided by the set of explanatory variables, with $j = 1, 2, \ldots, k$, and $\gamma_j$ are the coefficients quantifying the relationships between these variables and the dependent variable. $G$ is a function that takes values strictly between zero and one: $0 < G(z) < 1$ for all real numbers $z$. This ensures that the estimated probabilities are strictly between zero and one.

The models estimated to analyze the probabilities of not having access to treated water and of not having access to the sewage collection service produced satisfactory results. The coefficients of the main explanatory variables used to estimate the probabilities are shown in Table A.M.6.

**Table A.M.6**

Regressions of probabilities for not having access to treated water or of not having access to the sewage collection service, Brazil, 2016

Variable | Inadequate water access: Coefficient | Inadequate water access: Standard error | Inadequate water access: p-value | Inadequate access to sewage collection: Coefficient | Inadequate access to sewage collection: Standard error | Inadequate access to sewage collection: p-value |
---|---|---|---|---|---|---|
Urban Area | -2.4714 | 0.0006 | 0.0000 | 2.0001 | 0.0007 | 0.0000 |
Capital Area | -0.8233 | 0.0006 | 0.0000 | -1.2156 | 0.0006 | 0.0000 |
Other municipalities in the metropolitan region | 0.6203 | 0.0006 | 0.0000 | -0.1795 | 0.0006 | 0.0000 |
Municipalities in Integrated Development Regions | -0.0530 | 0.0024 | 0.0000 | 0.4554 | 0.0020 | 0.0000 |
Male Gender | 0.0334 | 0.0004 | 0.0000 | 0.0436 | 0.0004 | 0.0000 |
up to 4 years old | 0.0271 | 0.0015 | 0.0000 | 0.1311 | 0.0015 | 0.0000 |
from 5 to 14 years of age | 0.1431 | 0.0016 | 0.0000 | 0.3243 | 0.0017 | 0.0000 |
from 15 to 19 years of age | 0.2071 | 0.0016 | 0.0000 | 0.4278 | 0.0016 | 0.0000 |
from 20 to 29 years of age | 0.1663 | 0.0016 | 0.0000 | 0.3656 | 0.0016 | 0.0000 |
from 30 to 39 years of age | 0.1768 | 0.0015 | 0.0000 | 0.2756 | 0.0015 | 0.0000 |
from 40 to 59 years of age | 0.1492 | 0.0015 | 0.0000 | 0.1417 | 0.0016 | 0.0000 |
White Race | 0.0193 | 0.0276 | 0.4851 | -0.9181 | 0.0235 | 0.0000 |
Black Race | 0.0191 | 0.0276 | 0.4885 | -0.7722 | 0.0235 | 0.0000 |
Of Asian descent | -0.0853 | 0.0278 | 0.0022 | -1.2276 | 0.0238 | 0.0000 |
Multiracial | 0.0485 | 0.0276 | 0.0790 | -0.7325 | 0.0235 | 0.0000 |
Indigenous | -0.1318 | 0.0279 | 0.0000 | -0.7300 | 0.0238 | 0.0000 |
Uneducated | 0.3542 | 0.0011 | 0.0000 | 0.6503 | 0.0011 | 0.0000 |
Incomplete elementary school | 0.3271 | 0.0010 | 0.0000 | 0.5553 | 0.0009 | 0.0000 |
Complete elementary school | 0.2184 | 0.0011 | 0.0000 | 0.3121 | 0.0011 | 0.0000 |
Incomplete high school | 0.1644 | 0.0012 | 0.0000 | 0.3415 | 0.0012 | 0.0000 |
Complete high school | 0.1187 | 0.0009 | 0.0000 | 0.1671 | 0.0009 | 0.0000 |
Incomplete higher education | 0.0784 | 0.0014 | 0.0000 | -0.0130 | 0.0013 | 0.0000 |
Income class - 1st decile | 0.6257 | 0.0012 | 0.0000 | 0.9935 | 0.0012 | 0.0000 |
Income class - 2nd decile | 0.5923 | 0.0012 | 0.0000 | 0.7186 | 0.0012 | 0.0000 |
Income class - 3rd decile | 0.5922 | 0.0012 | 0.0000 | 0.6161 | 0.0011 | 0.0000 |
Income class - 4th decile | 0.4565 | 0.0012 | 0.0000 | 0.5151 | 0.0011 | 0.0000 |
Income class - 5th decile | 0.4880 | 0.0011 | 0.0000 | 0.5083 | 0.0011 | 0.0000 |
Income class - 6th decile | 0.4345 | 0.0011 | 0.0000 | 0.4539 | 0.0011 | 0.0000 |
Income class - 7th decile | 0.3450 | 0.0011 | 0.0000 | 0.3291 | 0.0011 | 0.0000 |
Income class - 8th decile | 0.2750 | 0.0011 | 0.0000 | 0.2548 | 0.0011 | 0.0000 |
Income class - 9th decile | 0.3324 | 0.0011 | 0.0000 | 0.1856 | 0.0011 | 0.0000 |

Source: PNADC 2016 (IBGE, 2017). Elaboration: Ex Ante Consultoria Economica.
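Coefficients of two categories of the same variable can be compared directly. The sketch below contrasts the 1st and 9th income deciles of Table A.M.6, assuming that both decile indicators are measured against the same omitted reference category (the source does not name it) and that the coefficients are on the log-odds scale. The dictionary names are illustrative.

```python
import math

# Coefficients copied from Table A.M.6 for two income deciles.
decile_1 = {"water": 0.6257, "sewage": 0.9935}
decile_9 = {"water": 0.3324, "sewage": 0.1856}

# exp(difference of coefficients) gives the odds of lacking access in the
# 1st decile relative to the 9th decile, other regressors held fixed.
relative_odds = {
    service: math.exp(decile_1[service] - decile_9[service])
    for service in decile_1
}

for service, ratio in relative_odds.items():
    print(f"Inadequate {service} access, 1st vs 9th decile: {ratio:.2f}x odds")
```

A ratio above one means that the odds of inadequate access are higher in the 1st decile than in the 9th.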