TEXTO DE DISCUSSÃO Nº 43

APPLYING THE BOOTSTRAP TECHNIQUES IN DETECTING TURNING POINTS: A STUDY OF CONSUMER SENTIMENT SURVEY

Pedro Guilherme Costa Ferreira
José Lisboa Gondin Junior
Viviane Seda Bittencourt

2014


Abstract: The purpose of this study is to improve the ability of the Consumer Confidence Index (CCI) to detect turning points by narrowing its statistical confidence interval, applying the Bootstrap Technique to the Consumer Survey of the Getúlio Vargas Foundation (IBRE/FGV). Confidence Indicators are estimates that reflect not only macroeconomic conditions but also psychological factors that traditional economic indicators cannot capture. The results indicate that the ability to detect turning points improves significantly: the share of monthly changes in the CCI that are statistically significant rises from 41% to 68% when the Theoretical Confidence Interval is replaced by the Bootstrap Confidence Interval. This result, besides increasing the dynamism of the survey, allows the indicator to detect turning points more quickly. An example of the effectiveness of the new methodology occurs in July 2009, when it indicates a significant monthly change in the CCI, whereas the current methodology indicates a significant monthly change only a few months later.

Keywords: Consumer Sentiment; Leading Indicators; Survey Methodology; Confidence Interval; Bootstrap; Brazil

JEL Classification: C42, C82, D84, E27, E32

1. Introduction

According to (Issler, Notini, & Rodrigues, 2009), every society has an interest in knowing its Economic Business Cycle, that is, which state it is in (expansion or recession). However, both the Business Cycle and Economic Sentiment are unobserved variables, and there is no consensus on how to estimate these latent variables. The impossibility of estimating the Business Cycle and Economic Sentiment directly has led to the construction of proxies for them. These proxies can be used in real time and/or for forecasting. Among them are Consumer Surveys, which have been conducted in at least forty-five countries (Curtin, 2007).

The monitoring of consumer sentiment aims to produce information on consumers' decisions about spending and future savings. These, in turn, are useful indicators for anticipating the short-term tendency of the economy. The Consumer Survey, in particular, aims to generate information that reflects current macroeconomic conditions and to extract psychological information not captured by traditional economic indicators, thus contributing to the improvement of economic forecasting models.

In market economies, consumer spending represents two thirds of the whole economy. Consequently, small changes in the composition of household spending may have a large economic impact (Curtin, 2007). Moreover, consumers' ability to predict cyclical economic changes is, in general, likely to coincide with that of other economic variables during periods of stable economic growth, while the importance of gut feelings increases near turning points or as a result of non-economic impacts (Parigi & Golinelli, 2004).

To synthesize the results of each survey, indicators known as Confidence Indicators have been developed. These indicators are endogenous variables that reflect economic activity and are capable of quantifying psychological factors not captured by other economic variables. Using these variables in economic and statistical models can improve the analysis of economic phenomena by taking into account unobserved variables such as optimism, which makes it possible to improve short-term forecasts and the detection of possible turning points. According to (Curtin, 2007), the small contribution of these variables has disappointed many researchers. However, the same is true of most other economic variables.

The idea of this article is to improve the sensitivity of the indicator. This contribution will allow researchers to detect turning points faster than with the traditional approach. To do this, we apply the Bootstrap technique to the Consumer Confidence Index produced by FGV's Consumer Survey (SACE, 2013). The proposed methodology uses the (Efron, 1979) technique to evaluate the estimator's variance from the data of a single sample.

The results show that the indicator's sensitivity increases considerably, from 41% with the theoretical interval to 68% with the bootstrap interval. This result, besides increasing the dynamism of the survey, allows the indicator to capture turning points faster. An example of the effectiveness of the new methodology occurs in July 2009, when it indicates a significant monthly change in the CCI, whereas the current methodology indicates a significant monthly change only a few months later.

Beyond this introduction, this article is organized as follows. Section 2 presents a brief discussion of the Bootstrap technique, the Consumer Confidence Index, its indicators and their relationship with macroeconomic variables, and the proposed model. Section 3 presents the empirical results, and Section 4 presents the final conclusions.

2. The Bootstrap technique, the Consumer Survey and the Proposed Model

2.1. The Bootstrap Technique

The Bootstrap Technique, introduced by Efron (1979), is a nonparametric, computer-intensive statistical method. It enables the analyst to evaluate the variability of estimators based on the data of a single sample.


This technique is indicated for problems in which conventional statistical techniques are difficult to apply. In most cases it offers advantages with either large or small samples: in large samples its results are close to those obtained by asymptotic methods, and in small samples it can outperform them.

In practical terms, the technique consists of drawing randomly, with replacement, from the original sample to generate a sample of the same size and stratum, called a Bootstrap Sample. A suitable number of bootstrap samples is drawn in order to obtain a Bootstrap Distribution of the statistic under study. The set of values obtained by bootstrapping is thus an estimate of the true sampling distribution of the statistic. As shown in (Efron, 1992), the Bootstrap Distribution converges to the real sampling distribution when the number of bootstrap samples tends to infinity.

Let $X = (x_1, x_2, \ldots, x_{n-1}, x_n)$ be the original sample and the integer $n$ the length of $X$. Assume that $X$ is generated by an unknown probabilistic model described by its cumulative distribution function $F$, and consider a statistic $\theta = S(X)$. Let $X_i^*$, $i = 1, 2, \ldots, B$, be the $i$-th Bootstrap Sample of length $n$ obtained from the sample $X$. For each Bootstrap Sample $X_i^*$ there is a corresponding statistic $\theta_i^*$, i.e., $\theta_i^* = S(X_i^*)$. The mean, variance and standard error of the bootstrapped estimator of $\theta$ may be defined by

$$\bar{\theta}^* = \frac{\sum_{i=1}^{B} \theta_i^*}{B} \tag{2.1}$$

$$\mathrm{Var}(\theta_i^*) = \frac{\sum_{i=1}^{B} (\theta_i^* - \bar{\theta}^*)^2}{B - 1} \tag{2.2}$$

$$SE_{boot} = \sqrt{\mathrm{Var}(\theta_i^*)} \tag{2.3}$$

respectively. In (Efron, 1992), it is shown that

$$\mathrm{Var}(SE_{boot}) \approx \frac{C_1}{n^2} + \frac{C_2}{nB} \tag{2.4}$$

where $C_1$ and $C_2$ are constants that depend on the distribution $F$, but not on $n$ or $B$.
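Equations (2.1) to (2.3) can be illustrated with a minimal Python sketch. The function names, the choice of the sample mean as the statistic and the ten toy answers are assumptions made only for illustration; they are not taken from the survey.

```python
import math
import random


def bootstrap_replicates(sample, statistic, n_boot, rng):
    """Return n_boot values of `statistic`, each computed on a resample of
    the same length as `sample`, drawn with replacement."""
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_boot)]


def bootstrap_summary(replicates):
    """Mean (2.1), variance (2.2) and standard error (2.3) of the replicates."""
    b = len(replicates)
    mean = sum(replicates) / b
    variance = sum((t - mean) ** 2 for t in replicates) / (b - 1)
    return mean, variance, math.sqrt(variance)


def sample_mean(values):
    return sum(values) / len(values)


# Hypothetical answers on a 1-5 scale; not survey data.
rng = random.Random(0)
answers = [2, 4, 3, 5, 3, 1, 4, 3, 2, 4]
replicates = bootstrap_replicates(answers, sample_mean, 1000, rng)
mean_star, var_star, se_boot = bootstrap_summary(replicates)
```

Fixing the seed of the random generator makes the replicates reproducible, which is convenient when comparing runs with different values of B.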


Hence, the uncertainty associated with the Bootstrap estimator ultimately depends on the size of the original sample. In other words, there is no guarantee that the Bootstrapped Estimator converges to the true value when the number of Bootstrap Samples tends to infinity. However, we obtain a good estimate of the confidence interval.

To evaluate the Bootstrap confidence interval, the Percentile Method was used. That is, suppose one settles for 1000 bootstrapped replications of $\hat{\theta}$, denoted by $\theta_1^*, \theta_2^*, \ldots, \theta_{1000}^*$. After ranking from bottom to top, let us denote these bootstrap values as $\theta_{(1)}^*, \theta_{(2)}^*, \ldots, \theta_{(1000)}^*$. Then the bootstrapped percentile confidence interval at the 95% level of confidence would be $[\theta_{(25)}^*, \theta_{(975)}^*]$. Turning to the theoretical aspects of this method, it should be pointed out that the method requires the symmetry of the sampling distribution of $\hat{\theta}$ around $\theta$ (Singh & Xie, 2008); (Babu & Singh, 1983); (Beran, 1990).
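The ranking step of the Percentile Method can be sketched as follows. The function name and the rounding rule used to turn percentiles into ranks are assumptions; other conventions for choosing the ranks exist and give slightly different endpoints when B is small.

```python
def percentile_interval(replicates, level=0.95):
    """Percentile interval from ranked bootstrap replicates.

    With B = 1000 and level = 0.95 this returns the 25th and 975th ranked
    values (1-based ranks), as in the example in the text.
    """
    ranked = sorted(replicates)
    b = len(ranked)
    alpha = 1.0 - level
    lower_rank = max(1, round(b * alpha / 2))        # 25 when B = 1000
    upper_rank = min(b, round(b * (1 - alpha / 2)))  # 975 when B = 1000
    return ranked[lower_rank - 1], ranked[upper_rank - 1]


# Toy replicates 1..1000, so each value equals its own rank.
low, high = percentile_interval(list(range(1, 1001)))
```

With the toy input, `low` and `high` are simply the ranks selected, 25 and 975.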

2.2. Brief explanation of the Consumer Survey methodology

The Consumer Survey is a monthly survey that aims to generate indicators on topics such as the general economic situation. The questions may be classified into: (i) observations at the time the survey is performed, and (ii) forecasts for the next six months. Each question offers response options framed comparatively, for instance: 5 – much better; 4 – better; 3 – the same; 2 – worse; 1 – much worse (SACE, 2013).

The Survey is conducted in the seven Brazilian capitals with the largest GDP (Belo Horizonte - BH, Brasília - BS, Porto Alegre - PO, Recife - Re, Salvador - Sa, Rio de Janeiro - RJ, São Paulo - SP). The sample is stratified by income level (I1 – less than US$ 897.00; I2 – between US$ 897.01 and US$ 2,051.00; I3 – between US$ 2,051.01 and US$ 4,103.00; I4 – more than US$ 4,103.00; exchange rate R$/US$ = 2.34 on 12/19/2013) and by region of interest (capitals), in proportion to the share of household consumption in each stratum.

In order to calculate the Consumer Confidence Index (CCI), the statistic of interest in this article, a few remarks are necessary. The CCI is the arithmetic mean of five indicators (defined below), calculated from five questions of the Consumer Survey: Local Economic Situation at the moment (LESM); Households Financial Situation at the moment (HFSM); Local Economic Situation in the next six months (LESF); Households Financial Situation in the next six months (HFSF); and Intention to Purchase Durable Goods in the next six months (IPDGF). The first two questions assess the present economic moment and the last three assess consumers' expectations of the economy in the future (Diagram 2.1) (SACE, 2013).

Diagram 2.1 – Questions that comprise the Consumer Confidence Index (CCI)
Source: Authors

The sample has twenty-eight strata. In each of the seven Brazilian state capitals (São Paulo, Belo Horizonte, Brasília, Rio de Janeiro, Salvador, Porto Alegre and Recife), four monthly household income levels are considered (I1 – less than US$ 897.00; I2 – between US$ 897.01 and US$ 2,051.00; I3 – between US$ 2,051.01 and US$ 4,103.00; I4 – more than US$ 4,103.00; exchange rate R$/US$ = 2.34 on 12/19/2013), as exemplified in Diagram 2.2.

Diagram 2.2 – Organogram (part of the sample design for one Indicator)
Source: Authors

The strata are weighted by their participation in Brazilian household consumption. Thus, for each question one Indicator is calculated by adding 100 to the difference between the aggregated favorable and unfavorable responses, i.e.:

Indicator = 100 + favorable – unfavorable

The Consumer Survey aims to estimate, with low sampling error and high probabilistic reliability, the proportions of responses to multiple-choice questions (SACE, 2013). In this situation, the sample size is determined so as to estimate the parameters of a random variable with a Multinomial distribution, where the sample size solves equation (2.5):

$$P\left(\left|\hat{P}_i - P_i\right| \le \text{error}\right) = 1 - \alpha, \quad i = 1, 2, \ldots, k \tag{2.5}$$

where:

$P_i$: proportion being estimated;

$\hat{P}_i$: estimator of the proportion $P_i$ ($\hat{P}_i = n_i / n$, where $n_i$ is the number of responses favorable to alternative $i$ and $n$ is the sample size);

error: maximum error of the estimate resulting from the use of a sample (referred to as sampling error and usually set to 0.02 or 2%);

$1 - \alpha$: level of probabilistic reliability of the sample (usually 95%).

The Consumer Survey, conducted monthly by IBRE/FGV, currently has a sample size of two thousand Brazilian consumers. Following international standards, for that sample size and a confidence level of 95%, the absolute sampling error is 2.19%. See Table 2 in (SACE, 2013).

2.3. The Proposed Method

The main goal of this article is to introduce a method that goes beyond the maximum variance and maximum absolute error of the variable concerned, which are fixed for all months at 2.19%. In this article, the sampling error is instead estimated for each month of the whole time series of the CCI. The Proposed Method consists of a Bootstrap resampling of each stratum of the monthly sample (e.g. I1 in the LESF question from Rio de Janeiro) in order to create a Bootstrap Sample. The sampling design described in (SACE, 2013) and the careful collection process conducted by IBRE/FGV in the seven Brazilian state capitals guarantee that this bootstrap sample is a good approximation of a true random sample. Given a monthly sample, the proposed algorithm resamples within each stratum, preserving its characteristics, and can be presented in three steps:

(i) For each question (e.g. LESF), group the interviewees by state capital (e.g. Rio de Janeiro) and income level (e.g. I3) (Section 2.2); twenty-eight Answer Sets (column Answer in Table 2.1) are thus obtained for each question.

(ii) Generate 2,500 resamples, drawn uniformly at random with replacement, from each Answer Set. 28 ordered lists of resamples are thus obtained.

(iii) Join the 28 strata, matching the lists position by position. 2,500 bootstrap samples are thus generated for each question.

Table 2.1 – Example of the Answer matrix

Interviewee  Question  State Capital  Income level  Answer
1            LESF      RJ             I3            2
2            LESF      RJ             I3            4
3            LESF      RJ             I3            3
4            LESF      RJ             I3            5
5            LESF      RJ             I3            3
6            LESF      RJ             I3            1

Source: Authors

Following those steps, 2,500 bootstrap samples are generated from a single monthly sample. A CCI* is then calculated for each bootstrap sample, as described above. The histogram of CCI* is, in fact, an empirical distribution from which a 95% confidence interval is taken (Figure 2.1).

Figure 2.1 – Empirical distribution of CCI* for 2,500 bootstrap samples from the monthly sample of September 2013.
Source: Authors
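The three steps of Section 2.3 and the calculation of CCI* can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: answers 4 and 5 are treated as favorable and 1 and 2 as unfavorable, the stratum weights are hypothetical, each question is resampled independently, and the toy data use two questions and two strata instead of five and twenty-eight.

```python
import random


def indicator(strata_answers, weights):
    """Indicator = 100 + favorable - unfavorable, in percentage points.

    Assumption: answers 4-5 count as favorable and 1-2 as unfavorable;
    `weights` are hypothetical consumption shares that sum to 1.
    """
    balance = 0.0
    for stratum, answers in strata_answers.items():
        favorable = sum(a >= 4 for a in answers) / len(answers)
        unfavorable = sum(a <= 2 for a in answers) / len(answers)
        balance += weights[stratum] * (favorable - unfavorable)
    return 100.0 + 100.0 * balance


def resample_strata(strata_answers, rng):
    """Steps (i)-(iii): resample each stratum with replacement, keeping its
    size, and join the strata back into one bootstrap sample."""
    return {
        stratum: rng.choices(answers, k=len(answers))
        for stratum, answers in strata_answers.items()
    }


def bootstrap_cci(survey, weights, n_boot, rng):
    """Return n_boot values of CCI*, the mean of the question indicators."""
    values = []
    for _ in range(n_boot):
        indicators = [
            indicator(resample_strata(strata, rng), weights)
            for strata in survey.values()
        ]
        values.append(sum(indicators) / len(indicators))
    return values


# Toy data: question -> stratum (capital, income level) -> answers.
survey = {
    "LESF": {("RJ", "I3"): [2, 4, 3, 5, 3, 1], ("SP", "I1"): [4, 4, 2, 3, 5, 5]},
    "HFSF": {("RJ", "I3"): [3, 3, 4, 2, 5, 4], ("SP", "I1"): [1, 2, 4, 3, 3, 4]},
}
weights = {("RJ", "I3"): 0.4, ("SP", "I1"): 0.6}
cci_star = bootstrap_cci(survey, weights, 2500, random.Random(1))
```

Resampling inside each stratum keeps the stratum sizes, and therefore the sample design, unchanged in every bootstrap sample; the list `cci_star` plays the role of the empirical distribution shown in Figure 2.1.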


3. Results

As explained in Section 2.3, the proposed methodology bootstraps the answers while respecting the answer blocks (capital and income level), aggregates the statistics calculated in each block and estimates a distribution for the CCI. Although this methodology is in line with best statistical practice in the use of bootstrap resampling, the authors were concerned to test the robustness of the method before analysing the results themselves. To that end, the proposed method was run several times with different numbers of resamples (2,000, 6,000 and 10,000), and the CCI and its confidence interval were calculated for one monthly sample. As can be seen in Table 3.1, in all cases the mean, the standard error and the confidence interval converge to the same result to one decimal place. The same analysis was carried out for several other monthly samples, and the results were satisfactory. Thus, the algorithm is robust with 2,500 resamples.

Table 3.1 – Results of resampling the microdata, September 2013

Resamples  Mean      95% confidence interval  Standard error
2,000      113.1764  [111.7539, 114.5117]     0.6979
6,000      113.2074  [111.8521, 114.5343]     0.6842
10,000     113.1952  [111.8663, 114.5619]     0.6869

Source: Authors
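A robustness check of this kind can be sketched by repeating the bootstrap with several numbers of resamples and comparing the resulting standard errors. The sketch below uses a synthetic sample of 200 answers and the sample mean as the statistic, both assumptions made for illustration; it does not reproduce the figures in Table 3.1.

```python
import math
import random


def bootstrap_se(sample, n_boot, rng):
    """Bootstrap standard error of the sample mean from n_boot resamples."""
    n = len(sample)
    means = [sum(rng.choices(sample, k=n)) / n for _ in range(n_boot)]
    center = sum(means) / n_boot
    return math.sqrt(sum((m - center) ** 2 for m in means) / (n_boot - 1))


# Synthetic answers on a 1-5 scale; not the survey microdata.
data_rng = random.Random(42)
sample = [data_rng.randint(1, 5) for _ in range(200)]

# One independent run per number of resamples, as in Table 3.1.
standard_errors = {
    n_boot: bootstrap_se(sample, n_boot, random.Random(n_boot))
    for n_boot in (2000, 6000, 10000)
}
spread = max(standard_errors.values()) - min(standard_errors.values())
```

A small `spread` relative to the standard errors themselves suggests that increasing the number of resamples further would change the estimate very little.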

Regarding the overall results for the historical series from September 2005 to September 2013, the maximum standard error was 0.9, with a mean value of 0.65, and the maximum confidence interval was 1.68 percentage points, with a mean value of 1.14 percentage points. Another result that shows the robustness of the method and its suitability for the problem is the way the confidence interval of the statistic of interest varies across the months of the survey. As can be seen, the confidence interval of the parameter is wider in 2008/2009 (Chart 3.1), years of crisis, and narrower in 2011/2012 (Chart 3.2), years of relative calm. Finally, the skewness of the bootstrap sample of CCIs was analysed and the symmetry of their distribution was confirmed, a result that validates the use of the percentile method to calculate the bootstrap confidence interval, as highlighted by (Hall, 1988).


Chart 3.1 – Boxplot – bootstrapped CCI values in the period around the 2008 economic crisis

Source: Authors


Chart 3.2 – Boxplot – bootstrapped CCI values in a period of relative economic stability

Source: Authors

Turning to the results, as can be seen in Chart 3.3, when the confidence interval described in Section 2.2 is used, the sensitivity of the survey is low and often late in confirming turning points in the economy. That is, in some cases, as highlighted in the chart, there is a change in consumer expectations but, in terms of statistical significance, that change can only be confirmed after a certain period. In Chart 3.3, for example, at points 1 and 2 the change of direction is already significant in the month in which it occurs: the trends change in Jan-06 and Jun-08 and the CCI signals them in the same month. On the other hand, there are long periods in which the index signals no change, notably Feb-09 to Oct-09, and months in which the turning point is not detected in the month of occurrence, notably points 3, 4 and 5, where the trend changes in Jul-06, Nov-06 and Aug-07 and the CCI signals it only in Oct-06, Apr-07 and Dec-07, respectively.

With the proposed method (historical series, Chart 3.4), the sensitivity of the indicator, marked by the vertical orange line, increases considerably, from 41% with the theoretical interval to 68% with the bootstrap interval. This result, besides increasing the dynamism of the survey, allows the indicator to "capture" turning points more quickly. For example, for the cases highlighted above, the change in Jul-06 is signalled in Sep-06, that of Nov-06 is signalled in Jan-07, and that of Aug-07 is signalled in the same month.


Chart 3.3 – CCI historical series with confidence interval assuming maximum variance – Consumer Survey – Brazil

Source: Authors
(*) black dotted lines indicate the theoretical confidence interval;
(*) orange bars indicate the points where the change in the indicator is statistically significant.
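As a consistency check on the theoretical interval, the 2.19% sampling error quoted in Section 2.2 agrees with the usual normal-approximation margin of error for a proportion under maximum variance ($p = 0.5$), with $n = 2000$ and 95% confidence. That SACE (2013) derives the figure in exactly this way is an assumption; the arithmetic below only shows that the numbers are compatible.

```latex
\[
  \text{error}
  = z_{0.975}\sqrt{\frac{p(1-p)}{n}}
  = 1.96\sqrt{\frac{0.5 \times 0.5}{2000}}
  \approx 1.96 \times 0.01118
  \approx 0.0219 = 2.19\%
\]
```

Because $p(1-p)$ is largest at $p = 0.5$, this margin is an upper bound that does not vary from month to month, which is the limitation the bootstrap interval is meant to address.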

Chart 3.4 – CCI historical series with Bootstrap confidence interval – Consumer Survey – Brazil

Source: Authors
(*) black dotted lines indicate the bootstrap confidence interval;
(*) orange bars indicate the points where the change in the indicator is statistically significant.
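The text does not spell out the rule used to mark a monthly change as statistically significant. The sketch below assumes one plausible rule: a change is flagged when its absolute value exceeds the half-width of that month's confidence interval. The index values are hypothetical, and the half-widths 2.19 and 1.14 are borrowed from figures quoted in the text purely as illustrative magnitudes.

```python
def significant_changes(cci, half_widths):
    """Flag months whose change from the previous month exceeds the
    half-width of that month's confidence interval (assumed rule)."""
    flags = [False]  # the first month has no previous value
    for t in range(1, len(cci)):
        flags.append(abs(cci[t] - cci[t - 1]) > half_widths[t])
    return flags


def sensitivity(flags):
    """Share of month-to-month changes flagged as significant."""
    changes = flags[1:]
    return sum(changes) / len(changes)


# Hypothetical index values; not the published CCI series.
cci = [110.0, 111.5, 113.2, 112.9, 110.4]
theoretical = significant_changes(cci, [2.19] * len(cci))
bootstrap = significant_changes(cci, [1.14] * len(cci))
```

In this toy series the narrower interval flags three of the four monthly changes and the wider one flags a single change, which mirrors the direction of the 41% to 68% comparison without reproducing it.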

Another interesting point is Jul/2009. As can be seen in Chart 3.3, under the current methodology there is no clear signal that consumers are feeling the slowdown of the economy; in Chart 3.4, however, there is a statistically significant signal of a fall in consumer confidence, anticipating a sequence of falls in the CCI.

4. Final remarks

As observed in this article, the proposed method achieved its goal of improving the statistical sensitivity of the indicator, making consumers' perceptions at each moment clearer. Evidence of this is the result for Sep/2008, when the bootstrap methodology signalled, with 95% confidence, that consumers were more pessimistic about the state of the economy. As a secondary but also important result, in the months around the crisis (Sep-08 to Feb-09) the coefficient of variation of the bootstrap samples was found to be 130 percentage points higher than in calm periods (Sep-11 to Feb-12). This result may be understood as a strong leading indicator of periods of crisis and needs to be studied further. Finally, the proposed methodology proved useful for monitoring business cycles and can be used by other Surveys that aim to increase sensitivity in the detection of turning points.

5. References

Babu, G. J., & Singh, K. (1983). Inference on means using the bootstrap. Ann. Stat., 11.
Beran, R. (1990). Refining bootstrap simultaneous confidence sets. Jour. Amer. Stat. Assoc., pp. 417-428.
Curtin, R. (2007). Consumer Sentiment Surveys: Worldwide Review and Assessment. Journal of Business Cycle Measurement and Analysis.
Efron, B. (1979). Bootstrap Methods: another look at the jackknife. Ann. Stat., 7, pp. 1-26.
Efron, B. (1992). Jackknife-after-bootstrap standard errors and influence functions (with discussion). J. R. Stat. Soc. B, 54, pp. 463-479.
Hall, P. (1988). Theoretical comparison of bootstrap confidence intervals. Ann. Stat., 16, pp. 927-953.
Issler, J. V., Notini, H. H., & Rodrigues, C. F. (2009, June). Um Indicador Coincidente e Antecedente da Atividade Econômica Brasileira. Ensaios Econômicos.


Parigi, G., & Golinelli, R. (2004). Consumer Sentiment and Economic Activity: A Cross Country Comparison. Journal of Business Cycle Measurement and Analysis, pp. 147-70.
SACE. (2013). Consumer Survey Methodology - Superintendence of Economic Cycles (SACE). Retrieved December 01, 2013, from Brazilian Institute of Economics (IBRE | FGV): http://portalibre.fgv.br
Singh, K., & Xie, M. (2008). Bootstrap: A Statistical Method. Unpublished Working Paper. Rutgers University.


Rio de Janeiro www.fgv.br/ibre

Rua Barão de Itambi, 60 22231-000 - Rio de Janeiro – RJ

São Paulo Av. Paulista, 548 - 6º andar 01310-000 - São Paulo – SP
