R_Econometría: Basic Econometrics. Gujarati and Porter: Réplica de ejemplos con R.

Basic Econometrics from Damodar Gujarati and Dawn Porter: R replications

En esta entrada se presenta, una de las tantas formas, de replicar los ejemplos de la 5ta edición del libro Basic Economectics de Damodar Gujarati y Dawn Porter.

Hace mucho que no se publicaba nada en este blog respecto a temas de econometría y es irónico, ya que el nombre es r-econ que hace referencia a econometría con R, sin embargo, espero con estas nuevas entradas, honrar el nombre del blog.

Empecemos por el incio, ¿por qué este libro y no el de Wooldridge o cualquier otro de econometría?, precisamente, porque le tengo un cariño muy especial a este libro y es de los de referencia más básicas que muchos estudiantes consultan; posteriormente, espero tener la oportunidad de replicar los ejemplos de otros libros intermedios y más avanzados, no el de Wooldridgre, porque este ya tiene solución en esta página.

Antes de empezar, quiero aclarar que esto no pretende de ser un manual de solución a los ejercicios propuestos, esto sería darles un atajo a los estudiantes y quitarles el incentivo del estudio, más bien, esto es sólo, como su título lo indica, la forma de replicar los ejemplos de las regresiones presentadas en cada capítulo del libro usando mi herramienta de trabajo preferida: R! para que sea un punto de partida para aquellos que quieran reciclar estos códigos y hacerlos suyos en el estudio de la econometría.

Los datos se importarán directamente del sitio https://rdrr.io/github/brunoruas2/gujarati/. Dicho todo lo anterior, vamos a ponernos manos a la obra.

Instalación de paquete con los datos

Primero procederemos a instalar de github el paquete necesario para obtener las tablas de datos de nuestro interés, para ello hacemos lo siguiente:

install.packages("remotes")
remotes::install_github("brunoruas2/gujarati")
library(gujarati)

Introduction

Dado que la Tabla I.1 no se encuentra disponible en el paquete previamente instalado, vamos a obtenerla de la página econometrics.com

Table_I.1 <- read.delim("http://www.econometrics.com/manuals/gujarati/data_3.1.shd", sep=" ", header=FALSE)[,2:4]
colnames(Table_I.1) <- c("Year", "PCE", "GDP")

plot(Table_I.1$GDP, Table_I.1$PCE,
     bty = "l",
     main = "Figure I.3:  Personal consumption expenditure (Y)
     in relation to GDP (X), 1982-1996, in billions of 2000 dollars.",
     font.main = 1,
     xlab = "GDP (X)",
     ylab = "PCE (Y)")
abline(lm(PCE~GDP, data=Table_I.1))

Chapter 1: The Nature of Regression Analysis

library(gujarati) # carga las tablas del libro
library(readr)    # para usar parse_number

Este paquete tiene un problema en la definición de los datos dentro de los dataframes, todos los datos los presenta como factor, aún debiendo ser valores numéricos, se presentan como factor por lo cual, antes de empezar vamos a crear una función para que cargue correctamente los datos y no tengamos que estar corrigiendo en cada tabla los valores con el atributo incorrecto.

load_table <- function(df){
  ind_character <- sapply(lapply(df, grepl, pattern="\\d+\\.?\\d*"), any)
  df[ ,ind_character] <- data.frame(sapply(df[,ind_character], function(x) readr::parse_number(as.character(x))))
  return(df)
}

Table1_1 <- load_table(Table1_1)

plot(Table1_1$Y1,Table1_1$X1, xaxt="n", yaxt="s",
     main="FIGURE 1.6:  Relationship between eggs produced and prices, 1990",
     font.main=1,
     ylab="Price of eggs per dozen (in cents)",
     xlab="Number of eggs produced (millions)",
     bty="l")
axis(2, xaxp=c(40,160,6))
axis(1, xaxp=c(0,8000,4))

Chapter 2: Two-Variable Regression Analysis: Some Basic Ideas

Aparentemente, las columnas de todas las tablas de este paquete son factor, por lo cual siempre vamos estar convirtiendo a numéricas las variables que así deben ser, es molesto pero es la solución de momentánea.

Table2_4  <- load_table(Table2_4)
Table2_5  <- load_table(Table2_5)

plot(c(80,260), c(1,200),
     main = "Figure 2.4",
     xlab = "Weekly income, $",
     ylab = "Weekly consumption expenditure, $",
     type = "n", bty="l", xaxt = "n",
     sub = "Figure 2.4 Regression lines based on two different samples")
axis(1, xaxp = c(80,260,9), tick = TRUE)
points(Table2_4$X, Table2_4$Y, pch=20, bty="l")
points(Table2_5$X, Table2_5$Y, pch=4)
abline(coef(lm(Y ~ X, data=Table2_4)))
abline(coef(lm(Y ~ X, data=Table2_5)), lty=2)
savefont <- par(font=6)
legend("topleft",c("First sample (Table 2.4)", "Second sample (Table 2.5)"),
       pch = c(4, 20), col=c("black"), bg="white", bty="n")
text(170,160, "Regression based on \n the second sample", adj = c(0, -.1))
arrows(x0=205, y0=157, x1=225, y1=150, length=0.1, code=0, col="gray30", lwd=1)
text(220,105, "Regression based on \n the first sample", adj = c(0, -.1))

Chapter 3: Two-Variable Regression Model: The Problem of Estimation

EXAMPLE 3.1

# reutilizamos la Table_I.1
model3_1 <- lm( PCE ~ GDP, Table_I.1 )
summary(model3_1)

## 
## Call:
## lm(formula = PCE ~ GDP, data = Table_I.1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -131.44  -67.37  -42.13   -8.75 1013.72 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -181.13011   67.77077  -2.673   0.0105 *  
## GDP            0.70578    0.01043  67.659   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 175.7 on 44 degrees of freedom
## Multiple R-squared:  0.9905, Adjusted R-squared:  0.9903 
## F-statistic:  4578 on 1 and 44 DF,  p-value: < 2.2e-16

Para un output más profesional, listo para imprimir, podemos utilizar stargazer:

#library(stargazer)
stargazer::stargazer(model3_1, type="html")


	Dependent variable:

	PCE

GDP	0.706^***
	(0.010)

Constant	-181.130^**
	(67.771)


Observations	46
R²	0.990
Adjusted R²	0.990
Residual Std. Error	175.724 (df = 44)
F Statistic	4,577.750^*** (df = 1; 44)

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 3.2

Table2_8 <- load_table(Table2_8)
model3_2 <- lm(FOODEXP ~ TOTALEXP, data = Table2_8)
stargazer::stargazer(model3_2, type='html')


	Dependent variable:

	FOODEXP

TOTALEXP	0.437^***
	(0.078)

Constant	94.209^*
	(50.856)


Observations	55
R²	0.370
Adjusted R²	0.358
Residual Std. Error	66.856 (df = 53)
F Statistic	31.103^*** (df = 1; 53)

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 3.3

Table3_3 <- load_table(Table3_3)
model3_3 <- lm(Cellphone ~ Pcapincome, data = Table3_3)
model3_4 <- lm(PCs ~ Pcapincome, data = Table3_3)
stargazer::stargazer(model3_3, model3_4, title = "Example 3.3", align = TRUE, type='html')

**Example 3.3**

	Dependent variable:

	Cellphone	PCs
	(1)	(2)

Pcapincome	0.002^***	0.002^***
	(0.0003)	(0.0001)

Constant	14.467^**	-6.583^**
	(6.152)	(2.744)


Observations	34	34
R²	0.602	0.829
Adjusted R²	0.590	0.824
Residual Std. Error (df = 32)	20.555	9.167
F Statistic (df = 1; 32)	48.476^***	155.177^***

Note:	p<0.1; p<0.05; p<0.01

Chapter 4: Classical Normal Linear Regression Model (CNLRM)

##install.packages("maxLik")
library(maxLik)

# Definiendo la verosimilitud
loglik <- function(theta){
  b1 <- theta[1]
  b2 <- theta[2]
  sigma <- theta[3]
  n <- nrow(Table2_8)
  
  # Esto corresponde a la Eq. (5) de la página 103
  (-n*log(sigma)) - ((n/2)*log(2*pi)) - (1/2)* sum((Table2_8$FOODEXP-b1-b2*Table2_8$TOTALEXP)^2/sigma^2)
} 

# maximixando el logaritmo de la verosimilitud:
results_loglik <- maxLik(loglik, start = c(b1=100, b2=1, sigma=60), method = "NR")
summary(results_loglik)

## --------------------------------------------
## Maximum Likelihood estimation
## Newton-Raphson maximisation, 40 iterations
## Return code 8: successive function values within relative tolerance limit (reltol)
## Log-Likelihood: -308.1625 
## 3  free parameters
## Estimates:
##       Estimate Std. error t value Pr(> t)    
## b1    94.20622    4.73688   19.89  <2e-16 ***
## b2     0.43681    0.01539   28.38  <2e-16 ***
## sigma 65.63805    2.20132   29.82  <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## --------------------------------------------

Como se puede observar los resultados son muy parecidos (casi iguales) a los obtenidos por Mínimos Cuadrados Ordinarios (MCO), sin embargo, debo aclarar dos puntos:

El resultado obtenido en sigma corresponde a la desviación estándar, así, para obtener la varianza se tiene que elevar ese resultado al cuadrado, por tanto, la varianza estimada es results$estimate["sigma"]^2 que da como resultado 4308.354.
El otro punto importante es que hay un fallo en el libro donde indica que el estimador máximoverosimil (ML) de la varianza de los residuos es 4407.1563 en realidad es 4307.157, el cual puede extraerse basado en la ecuación (14) del libro y con la siguiente instrucción de R sum(model3_2$residuals^2)/55. La estimación de la varianza de los errores por MCO es correcta según reporta el libro y se obtiene en R a partir de: sum(model3_2$residuals^2)/53

Cabe destacar que estas diferencias respecto a los resultados presentados en el libro se debe, principalmente, a que la respuesta del libro es una respuesta exacta obtenida a partir de las expresiones analíticas de cada estimador máximo verosímil, mientras que las respuestas obtenidas usando maxLikes una aproximación basado en optimización numérica.

Chapter 5: Two-Variable Regression: Interval Estimation and Hypothesis Testing

Sin ejercicios que reproducir

Chapter 6: Extensions of the Two-Variable Linear Regression Model

Regresión a través del origen (es decir, regresión sin intercepto)

EXAMPLE 6.1

Table6_1 <- load_table(Table6_1)
model6_1 <- lm(Y ~ X - 1, data = Table6_1)  # sin intercepto
model6_2 <- lm(Y ~ X, data = Table6_1)  # con intercepto
stargazer::stargazer(model6_1, model6_2, type="html", title = "Example 6.1")

**Example 6.1**

	Dependent variable:

	Y
	(1)	(2)

X	1.156^***	1.171^***
	(0.074)	(0.075)

Constant		-0.447
		(0.363)


Observations	240	240
R²	0.502	0.503
Adjusted R²	0.500	0.501
Residual Std. Error	5.549 (df = 239)	5.543 (df = 238)
F Statistic	241.243^*** (df = 1; 239)	241.336^*** (df = 1; 238)

Note:	p<0.1; p<0.05; p<0.01

Regresión sobre variables estandarizadas.

Table6_2  <- load_table(Table6_2)
model6_3 <- lm(GPDIBL ~ GDPB, data = Table6_2)   # Eq (6.3.6)
model6_4 <- lm(GPDIBL ~ GDPB, data = data.frame(scale(Table6_2[, c("GPDIBL", "GDPB")]))) # Eq (6.3.7)
stargazer::stargazer(model6_3, model6_4, type="html", align = TRUE)


	Dependent variable:

	GPDIBL
	(1)	(2)

GDPB	0.254^***	0.982^***
	(0.013)	(0.050)

Constant	-926.090^***	0.000
	(116.358)	(0.049)


Observations	16	16
R²	0.965	0.965
Adjusted R²	0.962	0.962
Residual Std. Error (df = 14)	66.427	0.194
F Statistic (df = 1; 14)	383.469^***	383.469^***

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 6.3

#install.packages("equatiomatic")
library(equatiomatic)
Table6_3 <- load_table(Table6_3)
model6_5 <- lm(log(EXPDUR) ~ log(PCEXP), data=Table6_3)
extract_eq(model6_5, wrap = TRUE, use_coefs = TRUE)  # para extraer la ecuación como se presenta en el libro

\[ \begin{aligned} \operatorname{\widehat{log(EXPDUR)}} &= -7.54 + 1.63(\operatorname{\log(PCEXP)}) \end{aligned} \]

EXAMPLE 6.4

Table6_3$T <- 1:nrow(Table6_3)
model6_6 <- lm(log(EXPSERVICES) ~ T, data=Table6_3)
extract_eq(model6_6, wrap = TRUE, use_coefs = TRUE)

\[ \begin{aligned} \operatorname{\widehat{log(EXPSERVICES)}} &= 8.32 + 0.01(\operatorname{T}) \end{aligned} \]

EXAMPLE 6.5

model6_7 <- lm(FOODEXP ~ log(TOTALEXP), data = Table2_8)
extract_eq(model6_7, wrap = TRUE, use_coefs = TRUE)

\[ \begin{aligned} \operatorname{\widehat{FOODEXP}} &= -1283.91 + 257.27(\operatorname{\log(TOTALEXP)}) \end{aligned} \]

EXAMPLE 6.6

Table6_4 <- load_table(Table6_4)
model6_8 <- lm(CM ~ I(1/PGNP), data = Table6_4)
extract_eq(model6_8, wrap = TRUE, use_coefs = TRUE)

\[ \begin{aligned} \operatorname{\widehat{CM}} &= 81.79 + 27273.17(\operatorname{1/PGNP}) \end{aligned} \]

EXAMPLE 6.7

Table6_5 <- load_table(Table6_5)

# creando el cambio en la tasa de inflación
Table6_5$dINFLRATE <- with(Table6_5, INFLRATE - dplyr::lag(INFLRATE))

linear_model <- lm(dINFLRATE ~ UNRATE, data = Table6_5)
reciprocal_model <- lm(dINFLRATE ~ I(1/UNRATE), data = Table6_5)
stargazer::stargazer(linear_model, reciprocal_model, title = "Example 6.7", align = TRUE, type="html")

**Example 6.7**

	Dependent variable:

	dINFLRATE
	(1)	(2)

UNRATE	-0.638^***
	(0.149)

I(1/UNRATE)		17.208^***
		(5.232)

Constant	3.784^***	-3.068^***
	(0.903)	(0.970)


Observations	46	46
R²	0.294	0.197
Adjusted R²	0.277	0.179
Residual Std. Error (df = 44)	1.445	1.540
F Statistic (df = 1; 44)	18.280^***	10.815^***

Note:	p<0.1; p<0.05; p<0.01

Chapter 7: Multiple Regression Analysis: The Problem of Estimation

EXAMPLE 7.1

model7_1 <- lm(CM ~ PGNP + FLR, data = Table6_4)
stargazer::stargazer(model7_1, type="html", title = "Example 7.1")

**Example 7.1**

	Dependent variable:

	CM

PGNP	-0.006^***
	(0.002)

FLR	-2.232^***
	(0.210)

Constant	263.642^***
	(11.593)


Observations	64
R²	0.708
Adjusted R²	0.698
Residual Std. Error	41.748 (df = 61)
F Statistic	73.833^*** (df = 2; 61)

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 7.3

Table7_3 <- load_table(Table7_3)
colnames(Table7_3)[-1] <- c("Y", "X1", "X2")
cobb_douglas <- lm(Y ~ X1 + X2, data=Table7_3)
stargazer::stargazer(cobb_douglas, type="html", title = "Example 7.3")

**Example 7.3**

	Dependent variable:

	Y

X1	47.987^***
	(7.058)

X2	9.952^***
	(0.978)

Constant	233,621.600
	(1,250,364.000)


Observations	51
R²	0.981
Adjusted R²	0.980
Residual Std. Error	6,300,694.000 (df = 48)
F Statistic	1,243.514^*** (df = 2; 48)

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 7.4

Table7_4 <- load_table(Table7_4)
total_cost <- lm(Y ~ X + I(X^2) + I(X^3), data = Table7_4)
extract_eq(total_cost, wrap = TRUE, use_coefs = TRUE)

\[ \begin{aligned} \operatorname{\widehat{Y}} &= 141.77 + 63.48(\operatorname{X}) - 12.96(\operatorname{X\texttt{^}2}) + 0.94(\operatorname{X\texttt{^}3}) \end{aligned} \]

plot(Table7_4$X, Table7_4$Y,
     las = 1,
     main = "FIGURE 7.2 The total cost curve.",
     xlab = "Output",
     ylab = "Total cost of production")

Chapter 8: Multiple Regression Analysis: The Problem of Inference

CM1 <- lm(CM ~ PGNP, data = Table6_4)  # Eq (8.4.14)
anova(CM1) # resultados de Table 8.5: ANOVA Table for Regression Equation (8.4.14)

EXAMPLE 8.4

Para hacer test de hipótesis a partir del “General F Testing” se puede proceder con cualquiera de las dos formas siguientes para reproducir el ejemplo 8.4

#install.packages("car")
library(car)
Table7_9 <- load_table(Table7_9)

# forma 1: estimar el modelo y usar el paquete 'car'
model1 <- lm(Y ~ X2 + X3 + X4 + X5, data = log(Table7_9))
linearHypothesis(model1, c("X4=0", "X5=0"), test = "F")

# forma 2: estimar el modelo restringido y obtener la anova de ambos modelos
model2 <- lm(Y ~ X2 + X3, data = log(Table7_9))
anova(model2, model1)

Note que ambos procedimientos dan el mismo resultado. Particularmente, lo que nos interesa es el estadístico F y el p-valor que coinciden con los valores reportados en el libro.

EXAMPLE 8.5

library(lmtest)
Table7_6 <- load_table(Table7_6)

roses_lin <- lm(Y ~ X2 + X3, data = Table7_6)
roses_log <- lm(Y ~ X2 + X3, data = log(Table7_6))
petest(roses_lin, roses_log)

Note el p-valor de 0.1225 que es el mismo del libro.

Chapter 9: Dummy Variable Regression Models

EXAMPLE 9.1

Table9_1 <- load_table(Table9_1)
lm(Salary ~ D2 + D3, data=Table9_1)

## 
## Call:
## lm(formula = Salary ~ D2 + D3, data = Table9_1)
## 
## Coefficients:
## (Intercept)           D2           D3  
##       48015         1524        -1721

EXAMPLE 9.3

lm(Salary ~ D2 + D3 + Spending, data=Table9_1)

## 
## Call:
## lm(formula = Salary ~ D2 + D3 + Spending, data = Table9_1)
## 
## Coefficients:
## (Intercept)           D2           D3     Spending  
##    28694.92     -2954.13     -3112.19         2.34

EXAMPLE 9.4

Table9_2 <- load_table(Table9_2)
eq1 <- lm(SAVINGS ~ DUM + INCOME + DUM*INCOME, data = Table9_2) # Eq (9.5.4)
eq2 <- lm(SAVINGS ~ INCOME, subset = DUM==0, data = Table9_2)   # Eq (9.5.5)
eq3 <- lm(SAVINGS ~ INCOME, subset = DUM==1, data = Table9_2)   # Eq (9.5.6)

stargazer::stargazer(eq1, eq2, eq3, type="html", title = "Example 9.4", align = TRUE)

**Example 9.4**

	Dependent variable:

	SAVINGS
	(1)	(2)	(3)

DUM	152.479^***
	(33.082)

INCOME	0.080^***	0.080^***	0.015
	(0.014)	(0.008)	(0.008)

DUM:INCOME	-0.065^***
	(0.016)

Constant	1.016	1.016	153.495^***
	(20.165)	(11.638)	(32.712)


Observations	26	12	14
R²	0.882	0.902	0.207
Adjusted R²	0.866	0.892	0.141
Residual Std. Error	23.150 (df = 22)	13.361 (df = 10)	28.875 (df = 12)
F Statistic	54.784^*** (df = 3; 22)	92.190^*** (df = 1; 10)	3.136 (df = 1; 12)

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 9.6

Table9_4 <- load_table(Table9_4)

# crear una dummy para el primer trimestre
Table9_4$D1 <- with(Table9_4, D2==0 & D3==0 & D4==0)*1

# estimar la regresión sin intercepto
four_dummies <- lm(FRIG ~ -1 + D1 + D2 + D3 + D4, data = Table9_4)  # Eq (9.7.2)

# estimar regresión con 3 dummies y con intercepto
three_dummies <- lm(FRIG ~  D2 + D3 + D4, data = Table9_4)  # Eq (9.7.3)

stargazer::stargazer(four_dummies, three_dummies, type="html", title = "Example 9.6", align = TRUE)

**Example 9.6**

	Dependent variable:

	FRIG
	(1)	(2)

D1	1,222.125^***
	(59.990)

D2	1,467.500^***	245.375^***
	(59.990)	(84.839)

D3	1,569.750^***	347.625^***
	(59.990)	(84.839)

D4	1,160.000^***	-62.125
	(59.990)	(84.839)

Constant		1,222.125^***
		(59.990)


Observations	32	32
R²	0.987	0.532
Adjusted R²	0.985	0.482
Residual Std. Error (df = 28)	169.679	169.679
F Statistic	518.003^*** (df = 4; 28)	10.601^*** (df = 3; 28)

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 9.7

Table9_6 <- load_table(Table9_6)
colnames(Table9_6) <- c("TotalCost", "Output")

# creando la variable (X-X*), luego  una dummy para (X-X*)>0, finalmente, el producto de ambas
Table9_6$Output2 <- (Table9_6$Output - 5500)
Table9_6$D <- (Table9_6$Output2 > 0)*1
Table9_6$Output2_D <- with(Table9_6, Output2 * D)

lm(TotalCost  ~ Output + Output2_D, data=Table9_6 )

## 
## Call:
## lm(formula = TotalCost ~ Output + Output2_D, data = Table9_6)
## 
## Coefficients:
## (Intercept)       Output    Output2_D  
##   -145.7167       0.2791       0.0945

Indian Wage Earners, 1990. Página 300

Para este ejemplo se mostrará cómo se resolvería en R, sin embargo, no los resultados no coinciden con los mostrados en el libro, ya que la Table9_7 contiene 114 observaciones, las mismas que se muestran en la Tabla 9.7 de la página 301 del libro, mientras que en el output de Eviews mostrado en el libro se registra un total de 261 observaciones utilizadas.

Table9_7 <- load_table(Table9_7)
wage1 <- lm(log(WI) ~ ., data=Table9_7)
wage2 <- lm(log(WI) ~ AGE + DSEX + DSEX*DE2 + DSEX*DE3 + DSEX*DE4 + DPT, data=Table9_7)
wage3 <- lm(log(WI) ~ AGE + DSEX + DSEX:DE2 + DSEX:DE3 + DSEX:DE4 + DPT, data=Table9_7)

stargazer::stargazer(wage1, wage2, wage3, type="html", title="Modelling the wage equation of workers in a town in southern India", align = TRUE)

**Modelling the wage equation of workers in a town in southern India**

	Dependent variable:

	log(WI)
	(1)	(2)	(3)

AGE	0.029^***	0.030^***	0.029^***
	(0.005)	(0.005)	(0.005)

DE2	0.157	0.064
	(0.165)	(0.177)

DE3	0.505^***	0.433^**
	(0.165)	(0.178)

DE4	0.623^**	0.450
	(0.267)	(0.365)

DPT	0.319^**	0.333^**	0.369^***
	(0.126)	(0.128)	(0.129)

DSEX:DE2		0.656	0.734
		(0.490)	(0.465)

DSEX:DE3		0.463	0.893^*
		(0.486)	(0.462)

DSEX:DE4		0.400	0.830^**
		(0.526)	(0.392)

DSEX	-0.666^***	-0.804^***	-0.908^***
	(0.144)	(0.167)	(0.162)

Constant	3.622^***	3.642^***	3.740^***
	(0.170)	(0.171)	(0.162)


Observations	114	114	114
R²	0.499	0.512	0.479
Adjusted R²	0.471	0.470	0.450
Residual Std. Error	0.607 (df = 107)	0.608 (df = 104)	0.619 (df = 107)
F Statistic	17.769^*** (df = 6; 107)	12.123^*** (df = 9; 104)	16.412^*** (df = 6; 107)

Note:	p<0.1; p<0.05; p<0.01

Chapter 10: Multicollinearity: What Happens If the Regressors Are Correlated?

EXAMPLE 10.1

Table10_5 <- load_table(Table10_5)
model10_1 <- lm(Y ~ X2 + X3, data = Table10_5)
summary(model10_1)

## 
## Call:
## lm(formula = Y ~ X2 + X3, data = Table10_5)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.1120  -4.4767   0.9206   4.1744   7.5844 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) 24.77473    6.75250   3.669  0.00798 **
## X2           0.94154    0.82290   1.144  0.29016   
## X3          -0.04243    0.08066  -0.526  0.61509   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.808 on 7 degrees of freedom
## Multiple R-squared:  0.9635, Adjusted R-squared:  0.9531 
## F-statistic:  92.4 on 2 and 7 DF,  p-value: 9.286e-06

anova(model10_1)

EXAMPLE 10.2

Table10_7 <- load_table(Table10_7)
summary(lm(log(C) ~ log(Yd) + log(W) + I, data = Table10_7))

## 
## Call:
## lm(formula = log(C) ~ log(Yd) + log(W) + I, data = Table10_7)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.018441 -0.010001  0.000337  0.007039  0.032578 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.4677111  0.0427780 -10.933 7.33e-15 ***
## log(Yd)      0.8048729  0.0174979  45.998  < 2e-16 ***
## log(W)       0.2012700  0.0175926  11.441 1.43e-15 ***
## I           -0.0026891  0.0007619  -3.529 0.000905 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01193 on 50 degrees of freedom
## Multiple R-squared:  0.9996, Adjusted R-squared:  0.9995 
## F-statistic: 3.783e+04 on 3 and 50 DF,  p-value: < 2.2e-16

Scatter plot for example 10.2

pairs(Table10_7[, -1], pch = 19)

An Extended Example: The Longley Data

Table10_8 <- load_table(Table10_8)
# completar la tabla con los valore del libro
Table10_8 <- rbind(Table10_8, c(1962, 70551, 1169, 554894, 4007, 2827, 130081, 16))
stargazer::stargazer(lm(Y ~ ., data = Table10_8[, -1]), type="html")


	Dependent variable:

	Y

X1	1.506
	(8.491)

X2	-0.036
	(0.033)

X3	-2.020^***
	(0.488)

X4	-1.033^***
	(0.214)

X5	-0.051
	(0.226)

TIME	1,829.151^***
	(455.478)

Constant	77,270.120^***
	(22,506.710)


Observations	16
R²	0.995
Adjusted R²	0.992
Residual Std. Error	304.854 (df = 9)
F Statistic	330.285^*** (df = 6; 9)

Note:	p<0.1; p<0.05; p<0.01

Puede observarse que los resultados son equivalentes a los presentados por el libro, excepto por el valor de la pendiente, puede significar un error de escritura en el libro. Los resultados de la matriz de correlaciones tampoco coinciden con los reportados en el libro.

cor(Table10_8[, -1])

##              Y        X1        X2         X3         X4        X5      TIME
## Y    1.0000000 0.9708985 0.9835516  0.5024981  0.4573074 0.9603906 0.9713295
## X1   0.9708985 1.0000000 0.9915892  0.6206334  0.4647442 0.9791634 0.9911492
## X2   0.9835516 0.9915892 1.0000000  0.6042609  0.4464368 0.9910901 0.9952735
## X3   0.5024981 0.6206334 0.6042609  1.0000000 -0.1774206 0.6865515 0.6682566
## X4   0.4573074 0.4647442 0.4464368 -0.1774206  1.0000000 0.3644163 0.4172451
## X5   0.9603906 0.9791634 0.9910901  0.6865515  0.3644163 1.0000000 0.9939528
## TIME 0.9713295 0.9911492 0.9952735  0.6682566  0.4172451 0.9939528 1.0000000

Chapter 11: Heteroscedasticity: What Happens If the Error Variance Is Nonconstant?

EXAMPLE 11.1

data11_1 <- read.table("http://www.econometrics.com/manuals/gujarati/data_11.1.shd")
colnames(data11_1) <- c("Y", "X")

# regresión de la Eq (11.5.3)
y_hat <- lm(Y ~ X, data = data11_1)

# obteniendo los residuos y elevándolos al cuadrado
data11_1$u <- residuals(y_hat)

# regresión de la Eq (11.5.4)
u2_hat <- lm(log(I(u^2)) ~ log(X), data =  data11_1)
summary(u2_hat)

## 
## Call:
## lm(formula = log(I(u^2)) ~ log(X), data = data11_1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.18904 -0.43633 -0.10916  0.08593  3.10162 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   35.827     38.323   0.935    0.381
## log(X)        -2.802      4.196  -0.668    0.526
## 
## Residual standard error: 1.467 on 7 degrees of freedom
## Multiple R-squared:  0.05989,    Adjusted R-squared:  -0.07442 
## F-statistic: 0.4459 on 1 and 7 DF,  p-value: 0.5257

EXAMPLE 11.2

summary(lm(abs(u) ~ X, data=data11_1 ))

## 
## Call:
## lm(formula = abs(u) ~ X, data = data11_1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -161.27  -79.65  -55.17  -37.46  558.67 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 407.47601  633.18944   0.644    0.540
## X            -0.02035    0.06750  -0.301    0.772
## 
## Residual standard error: 228 on 7 degrees of freedom
## Multiple R-squared:  0.01282,    Adjusted R-squared:  -0.1282 
## F-statistic: 0.09088 on 1 and 7 DF,  p-value: 0.7718

EXAMPLE 11.4

Table11_3 <- load_table(Table11_3)
library(lmtest)
gqtest(Y ~ X, order.by = ~ X, fraction = 4/30, data=Table11_3[-c(14:17)])

## 
##  Goldfeld-Quandt test
## 
## data:  Y ~ X
## GQ = 4.0746, df1 = 11, df2 = 11, p-value = 0.01409
## alternative hypothesis: variance increases from segment 1 to 2

EXAMPLE 11.5

bptest(Y ~ X, studentize = FALSE, data=Table11_3)

## 
##  Breusch-Pagan test
## 
## data:  Y ~ X
## BP = 5.214, df = 1, p-value = 0.02241

EXAMPLE 11.7

data_example_11_7 <- read.table("http://www.econometrics.com/manuals/gujarati/data_11.7.shd")
colnames(data_example_11_7) <- c("Compensation", "Employment_size", "Sigma")
WLS <- lm(Compensation ~ Employment_size, weights = 1/Sigma^2, data = data_example_11_7) 
OLS <- lm(Compensation ~ Employment_size, data = data_example_11_7)
stargazer::stargazer(WLS, OLS, column.labels = c("WLS", "OLS"), type="html", title = "Example 11.7: WSL vs OLS", align = TRUE)

**Example 11.7: WSL vs OLS**

	Dependent variable:

	Compensation
	WLS	OLS
	(1)	(2)

Employment_size	154.241^***	148.800^***
	(16.946)	(14.406)

Constant	3,406.216^***	3,417.778^***
	(80.926)	(81.070)


Observations	9	9
R²	0.922	0.938
Adjusted R²	0.911	0.930
Residual Std. Error (df = 7)	0.135	111.592
F Statistic (df = 1; 7)	82.849^***	106.682^***

Note:	p<0.1; p<0.05; p<0.01

EXAMPLE 11.11

library(sandwich)
Table11_11 <- read.table("http://www.econometrics.com/manuals/gujarati/data_11.11.shd")
colnames(Table11_11) <- c("Salary", "Famincome", "Propvalue")
OLS <- lm(Salary ~ ., data  = log(Table11_11))
Robust <- coeftest(OLS, vcov = vcovHC(OLS, type = "HC1"))
stargazer::stargazer(OLS, Robust, column.labels = c("OLS se", "Robust se"), type="html", align = TRUE, 
                     title = "Example 11.11: OLS estimations vs robust standard errors estimations.")

**Example 11.11: OLS estimations vs robust standard errors estimations.**

	Dependent variable:

	Salary
	OLS	coefficient
		test
	OLS se	Robust se
	(1)	(2)

Famincome	0.258^***	0.258^**
	(0.080)	(0.101)

Propvalue	0.070^***	0.070
	(0.021)	(0.046)

Constant	7.020^***	7.020^***
	(0.805)	(0.772)


Observations	94
R²	0.220
Adjusted R²	0.203
Residual Std. Error	0.093 (df = 91)
F Statistic	12.824^*** (df = 2; 91)

Note:	p<0.1; p<0.05; p<0.01

Chapter 12: Autocorrelation: What Happens If the Error Terms Are Correlated?

Section 12.13

#library(sandwich)
Table10_7 <- load_table(Table10_7)
reg1 <- lm(log(C) ~ log(Yd) + log(W) + I, data = Table10_7)
dwtest(reg1)

## 
##  Durbin-Watson test
## 
## data:  reg1
## DW = 1.2892, p-value = 0.0009444
## alternative hypothesis: true autocorrelation is greater than 0

# Página 450
reg2 <- lm(log(C) ~ log(Yd) + log(W) + I + dplyr::lag(log(C)), data = Table10_7)
dwtest(reg1)

## 
##  Durbin-Watson test
## 
## data:  reg1
## DW = 1.2892, p-value = 0.0009444
## alternative hypothesis: true autocorrelation is greater than 0

# Página 451
coeftest(reg1, vcov = vcovHAC(reg1))

## 
## t test of coefficients:
## 
##                Estimate  Std. Error  t value  Pr(>|t|)    
## (Intercept) -0.46771107  0.04263209 -10.9709 6.492e-15 ***
## log(Yd)      0.80487293  0.01714664  46.9406 < 2.2e-16 ***
## log(W)       0.20126996  0.01517438  13.2638 < 2.2e-16 ***
## I           -0.00268906  0.00089154  -3.0162  0.004017 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Chapter 13: Econometric Modeling: Model Specification and Diagnostic Testing

EXAMPLE 13.4

# Davidson–MacKinnon J Test, página 491-492
library(lmtest)
Table13_3 <- load_table(Table13_3)

example_13.4 <- ts(Table13_3[,-1], frequency = 1, start = 1970)
example_13.4 <- na.contiguous(cbind(example_13.4, lag(example_13.4, k = -1)))
colnames(example_13.4) <- c("PPCE", "PDPI", "lag_PPCE", "lag_PDPI")

model1 <- lm(PPCE ~ PDPI + lag_PDPI, data = example_13.4)
model2 <- lm(PPCE ~ PDPI + lag_PPCE, data = example_13.4)
jtest(model1, model2)

Chapter 14: Nonlinear Regression Models

EXAMPLE 14.1

Table14_1 <- load_table(Table14_1)
summary(nls(Y ~ a*exp(b*X), start=list(a=0.3, b=0), data=Table14_1))

## 
## Formula: Y ~ a * exp(b * X)
## 
## Parameters:
##     Estimate Std. Error t value Pr(>|t|)    
## a  0.5089023  0.0074592   68.22 1.12e-14 ***
## b -0.0059652  0.0004844  -12.31 2.29e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.01303 on 10 degrees of freedom
## 
## Number of iterations to convergence: 5 
## Achieved convergence tolerance: 1.444e-06

Para obtener los mismos resultados del ejemplo, la ecuación que se introdujo en la función nls fue $\hat{Fee}_t = \beta_1 e^{\beta_2 Asset}$ y no la ecuación (14.5.1) mostrada en el libro que es $\hat{Fee}_t = \beta_1 Asset^{\beta_2}$. Así, la ecuación (14.5.1) debería ser $\hat{Fee}_t = 0.5089 e^{-0.0059 Asset}$. Adicionalmente, verifíquese que esta es la ecuación planteada en (14.3.1). Debe entenderse que esto es un error de escritura en el libro.

Si la espeficiación del modelo fuese $\hat{Fee}_t = \beta_1 Asset^{\beta_2}$, entonces los resultados cambiarían y serían los siguientes:

summary(nls(Y ~ a*(X^b), start=list(a=0.3, b=0), data=Table14_1))

## 
## Formula: Y ~ a * (X^b)
## 
## Parameters:
##   Estimate Std. Error t value Pr(>|t|)    
## a  0.52640    0.01694  31.067 2.80e-11 ***
## b -0.06947    0.01067  -6.509 6.82e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02388 on 10 degrees of freedom
## 
## Number of iterations to convergence: 6 
## Achieved convergence tolerance: 5.919e-07

Como nota importante del error de este ejemplo es que se debe ser cuidadoso con la especificación del modelo teórico y la estimación del modelo empírico, ambas deben coincidir.

EXAMPLE 14.2

Table14_3 <- load_table(Table14_3)
summary(nls(GDP ~ a*(Labor^b)*(Capital^c), start = list(a=.5, b=.2, c=.9), data=Table14_3))

## 
## Formula: GDP ~ a * (Labor^b) * (Capital^c)
## 
## Parameters:
##   Estimate Std. Error t value Pr(>|t|)    
## a  0.52923    0.27124   1.951   0.0677 .  
## b  0.18106    0.14130   1.281   0.2173    
## c  0.88277    0.07081  12.466 5.61e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6643 on 17 degrees of freedom
## 
## Number of iterations to convergence: 4 
## Achieved convergence tolerance: 1.487e-06

EXAMPLE 14.3

Table14_2 <- load_table(Table14_2)
summary(lm(log(POPULATION) ~ YEAR, data=Table14_2))

## 
## Call:
## lm(formula = log(POPULATION) ~ YEAR, data = Table14_2)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.010838 -0.002177  0.001099  0.003441  0.011455 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -8.710e+00  1.477e-01  -58.96   <2e-16 ***
## YEAR         1.063e-02  7.429e-05  143.06   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.005022 on 36 degrees of freedom
## Multiple R-squared:  0.9982, Adjusted R-squared:  0.9982 
## F-statistic: 2.047e+04 on 1 and 36 DF,  p-value: < 2.2e-16

Chapter 15: Qualitative Response Regression Models

EXAMPLE 15.1

Table15_1 <- load_table(Table15_1)

lpm <- lm(Y ~ X, data=Table15_1)
predict(lpm, newdata = data.frame(X=12))  # predicción del modelo con X=12

##         1 
## 0.2798857

Table 15.8, Table 15.13 y Table 15.14. Ver este enlace para más detalle sobre estimación y predicción de con estos modelos

Table15_7 <- load_table(Table15_7)
lpm <- lm(GRADE ~GPA + TUCE + PSI, data = Table15_7)
logit <- glm(GRADE ~GPA + TUCE + PSI, family = "binomial", data = Table15_7)
probit <- glm(GRADE ~ GPA + TUCE + PSI, family = binomial(link = "probit"), data = Table15_7)
stargazer::stargazer(lpm, logit, probit, title = "Modelos de respuesta cualitativa", type="html", align = TRUE)

**Modelos de respuesta cualitativa**

	Dependent variable:

	GRADE
	OLS	logistic	probit
	(1)	(2)	(3)

GPA	0.464^***	2.826^**	1.626^**
	(0.162)	(1.263)	(0.690)

TUCE	0.010	0.095	0.052
	(0.019)	(0.142)	(0.081)

PSI	0.379^**	2.379^**	1.426^**
	(0.139)	(1.065)	(0.587)

Constant	-1.498^***	-13.021^***	-7.452^***
	(0.524)	(4.931)	(2.572)


Observations	32	32	32
R²	0.416
Adjusted R²	0.353
Log Likelihood		-12.890	-12.819
Akaike Inf. Crit.		33.779	33.638
Residual Std. Error	0.388 (df = 28)
F Statistic	6.646^*** (df = 3; 28)

Note:	p<0.1; p<0.05; p<0.01

Chapter 16: Panel Data Regression Models

Sin datos para reproducir los ejemplos de estimaciones.

Chapter 17: Dynamic Econometric Models: Autoregressive and Distributed-Lag Models

EXAMPLE 17.7

Table17_2 <- load_table(Table17_2)
example_17.7 <- ts(Table17_2[,-1], frequency = 1, start = 1959)
example_17.7 <- na.contiguous(cbind(example_17.7, lag(example_17.7, k = -1)))
colnames(example_17.7) <- c("PPCE", "PDPI", "lag_PPCE", "lag_PDPI")
summary(lm(PPCE ~ PDPI + lag_PPCE, data = example_17.7))

## 
## Call:
## lm(formula = PPCE ~ PDPI + lag_PPCE, data = example_17.7)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -613.26  -87.23   27.00  164.31  365.84 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -252.91895  157.35174  -1.607   0.1151    
## PDPI           0.21389    0.07062   3.029   0.0041 ** 
## lag_PPCE       0.79715    0.07331  10.874 4.74e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 224.9 on 44 degrees of freedom
## Multiple R-squared:  0.9982, Adjusted R-squared:  0.9981 
## F-statistic: 1.231e+04 on 2 and 44 DF,  p-value: < 2.2e-16

EXAMPLE 17.11

# install.packages("dLagM")
library(dLagM)
Table17_8 <- load_table(Table17_8)
almon <- polyDlm(y=Table17_8$INVETORY, x=Table17_8$SALES, q=3 , k=2, show.beta = FALSE)
summary(almon$model)

## 
## Call:
## "Y ~ (Intercept) + X.t"
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -58320 -15092  -3625  13570  52228 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.585e+04  6.597e+03   3.918  0.00035 ***
## z.t0         1.115e+00  5.382e-01   2.072  0.04494 *  
## z.t1        -3.714e-01  1.374e+00  -0.270  0.78842    
## z.t2        -6.005e-02  4.550e-01  -0.132  0.89568    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 23950 on 39 degrees of freedom
## Multiple R-squared:  0.9755, Adjusted R-squared:  0.9736 
## F-statistic: 517.8 on 3 and 39 DF,  p-value: < 2.2e-16

Chapter 22: Time Series Econometrics: Forecasting

EXAMPLE 22.4

#Table22_5
#Table22_7

Table22_6 <- load_table(Table22_6)
yen <- ts(Table22_6$Yen.exchange, start=c(1971,1), frequency = 12)

# usando función Arima del paquete forecast 
library(forecast)
Arima(log(yen), order=c(0,1,1), include.constant = TRUE)

## Series: log(yen) 
## ARIMA(0,1,1) with drift 
## 
## Coefficients:
##          ma1    drift
##       0.3293  -0.0028
## s.e.  0.0443   0.0016
## 
## sigma^2 estimated as 0.0006623:  log likelihood=1002.67
## AIC=-1999.34   AICc=-1999.29   BIC=-1987.03

# o directamente con R base
arima(diff(log(yen)), order=c(0,0,1))

## 
## Call:
## arima(x = diff(log(yen)), order = c(0, 0, 1))
## 
## Coefficients:
##          ma1  intercept
##       0.3293    -0.0028
## s.e.  0.0443     0.0016
## 
## sigma^2 estimated as 0.0006593:  log likelihood = 1002.67,  aic = -1999.34

#FIN.

Páginas

Basic Econometrics. Gujarati and Porter: Réplica de ejemplos con R.