The gravity model
The gravity model is commonly applied in economics and has been deemed to be a successful tool for estimating international trade (Anderson 1979), a general framework to examine trade patterns (Eichengreen and Irwin 1995) and one of the most “empirically successful” trade analytical tools in economics (Anderson and van Wincoop 2003, p.170). The theoretical foundation of the model has been established through the work of several scholars, such as Linnemann (1966); Bergstrand (1985); Evenett and Keller (2002) and Anderson and van Wincoop (2003).
The gravity model estimates bilateral trade flows where trade is positively related to the level of GDP of the trading partners and negatively related to the distance between them. In the model, bilateral trade flows are based on the mutual gravitational force between the nations, with the gravity variable GDP reflecting mass. In addition to the conventional standard version of the model, several modifications can be made and dummy variables added (Chi and Kilduff 2010).
The gravity model has been widely used to estimate product and factor movements within the context of bilateral trade flows across international borders (Anderson 1979; Bergstrand 1985; McCallum 1995; Baier and Bergstrand 2001; Hummels 2001; Feenstra 2002; Anderson and van Wincoop 2003; Anderson and van Wincoop 2004; Anderson 2011) and trade agreements (McCallum 1995; Lavergne 2004; Rose 2004; Carrere 2006; Baier and Bergstrand 2007; Caporale et al. 2009; Cipollina and Salvatici 2010; Kepaptsoglou et al. 2010). Nobel laureate, Jan Tinbergen, was the first to apply the gravity model to the effect of Free Trade Agreements (FTAs) on bilateral trade flows, by including them in the model as a dummy variable (Tinbergen 1962). Since then, the gravity model has become the foundation for estimating the effects of FTAs and customs unions on bilateral trade flows (Bayoumi and Eichengreen 1995), particularly in relation to bilateral trade flows between fellow members of the EU (Balassa 1967; Aitken 1973; Abrams 1980; Brada and Mendez 1985; Frankel et al. 1995).
There is minimal agreement as to which variables should be included in the gravity equation and which should be omitted (Yamarik and Ghosh 2005). Anderson and van Wincoop (2003) point out that bias can appear in both the estimation and the analysis through the omission of relevant variables. However, trade data appear to perform well empirically in the gravity model (Feenstra 2002) and, as a result, the gravity model has gained in popularity in the empirical trade literature (Yamarik and Ghosh 2005).
The gravity equation is derived as a reduced form from a general equilibrium model of international trade in final goods. According to Chi and Kilduff (2010) the original gravity model in international trade is defined as:
$$ {T}_{ij}=A\times \left(\frac{Y_i\times {Y}_j}{D_{ij}}\right) $$
(1)
…where the variables are defined as follows:
Tij trade flow from country i to country j;
Yi GDP of country i;
Yj GDP of country j;
Dij physical distance between country i and country j and;
A is a constant.
Nevertheless, according to Bergstrand (1985), the gravity model in international trade commonly takes the form:
$$ {T}_{ij}={\beta}_0{\left({Y}_i\right)}^{\beta_1}{\left({Y}_j\right)}^{\beta_2}{\left({D}_{ij}\right)}^{\beta_3}{\left({A}_{ij}\right)}^{\beta_4}{\mu}_{ij} $$
(2)
…where the parameters to be estimated are denoted by β and the variables are defined as follows:
Tij trade flow from country i to country j;
Yi GDP of country i;
Yj GDP of country j;
Dij physical distance between country i and country j;
Aij other factor(s) either aiding or resisting trade between country i and country j and;
μij a logarithmic-normally distributed error term with E(ln μij) = 0.
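Because equation (2) is multiplicative, taking natural logarithms of both sides gives a form that is linear in the parameters. As a sketch that uses only the symbols defined above:
$$ \ln {T}_{ij}=\ln {\beta}_0+{\beta}_1\ln {Y}_i+{\beta}_2\ln {Y}_j+{\beta}_3\ln {D}_{ij}+{\beta}_4\ln {A}_{ij}+\ln {\mu}_{ij} $$
Since E(ln μij) = 0, the term ln μij plays the role of a zero-mean regression error, and each β can be read as an elasticity of trade with respect to the corresponding variable.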
The gravity equation is normally specified in a double-logarithmic form and estimated using Ordinary Least Squares (OLS) regression analysis (Eichengreen and Irwin 1995), although there are some exceptions to this general practice. Variations which have been applied to resolve a number of different issues include the use of non-linear least squares (Anderson and van Wincoop 2003), maximum likelihood estimation (Baier and Bergstrand 2007), a tobit model form (Chen 2004; Martin and Pham 2015), Poisson pseudo-maximum-likelihood estimation (Santos Silva and Tenreyro 2006) and a semi-logarithmic form (Eichengreen and Irwin 1995).
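To make the double-logarithmic OLS approach concrete, the following is a minimal sketch rather than the estimation code used in this study. It fits a reduced gravity equation with only importer GDP and distance, and the data are synthetic and noise-free. The function name `ols` and all numbers are assumptions chosen for illustration.

```python
import math


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved with Gauss-Jordan elimination and partial pivoting."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(row[a] * row[b] for row in X) for b in range(k)]
        + [sum(row[a] * yi for row, yi in zip(X, y))]
        for a in range(k)
    ]
    for col in range(k):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * p for v, p in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic illustration (NOT real trade data): exports = 2 * GDP_j^0.8 * D^-0.9
gdp_j = [5e10, 2e11, 8e11, 3e12, 1.5e13, 4e11]
dist = [300.0, 9000.0, 1200.0, 6500.0, 700.0, 15000.0]
exports = [2.0 * g**0.8 * d**-0.9 for g, d in zip(gdp_j, dist)]

# Double-log specification: ln(exports) = alpha + b1*ln(GDP_j) + b2*ln(D)
X = [[1.0, math.log(g), math.log(d)] for g, d in zip(gdp_j, dist)]
y = [math.log(t) for t in exports]
alpha, beta_gdp, beta_dist = ols(X, y)
```

Because the synthetic data contain no noise, the fitted coefficients recover the elasticities used to generate them. In applied work a numerical library would normally replace the hand-written solver.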
Sample
The sample used to estimate the model consisted of all countries to which Germany exported more than 1000 passenger cars in the designated year and for which data were available. Country i in the model denotes Germany and country j the importing country. The total sample consists of more than 80 observations per year, representing approximately 98% of the total quantity of passenger cars exported by Germany over the 4-year period 2012 to 2015 inclusive. [Footnote 5]
In specifying the sample, the work was delimited by focussing solely on the export of complete cars. Thus, interactions between countries or industries which take place either before or after a complete car is exported are not accounted for. This means that the following are not addressed in the sample specification, data collection, model estimation or forecasts: the movement of components; whether Brexit scenarios bring about a change in the export quantities of passenger cars from Germany to other countries or; from where the UK would import cars in the future in the case that export quantities from Germany are predicted to decline. Model forecasts assume ceteris paribus applies to external factors. Thus, for example, they do not take into account the expected growth in demand for electric cars which, inevitably, will disrupt the current market structure. Finally, the work does not distinguish between new and used passenger cars.
Selection of variables and data collection
The dependent variable in the model is the volume of passenger cars exported from Germany (country i) to a range of importing nations (country j). Data on the export and import quantities of passenger cars from country i to country j were collected from the Comtrade database [Footnote 6] (United Nations 2017a). The collection of the required trade data was undertaken on the basis of the following approach:
- Data were extracted using the 4th version of the Harmonized Commodity Description and Coding System (hereinafter HS) which is an international nomenclature. The six-digit system consists of goods classified at different levels of specificity.
- Some countries do not report data at lower commodity code levels (United Nations Statistics 2017a). Hence, this analysis uses the highest commodity code level for which quantity is reported. Thus, the commodity code “HS 8703 Passenger Cars” is used to collect data on imported and exported quantities of passenger cars.
- Although the Comtrade database provides information on quantity, weight and value of trade, this analysis utilises quantities so that issues such as valuation and currency conversion are avoided.
- In line with the advice of United Nations Statistics (2017a), the total quantities were based on the consolidated amount for all countries and not what the database refers to as “world” totals.
- An average was taken for those situations where there were differences between reported export and reported import quantities (United Nations Statistics 2017b).
- Where relevant, export quantities include re-exports.
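The averaging of exporter-reported and importer-reported quantities, together with the sample threshold of more than 1000 cars, can be sketched as follows. The function names are hypothetical. Falling back to the single available figure when one side is missing is an assumption, since the text only describes the case where both figures exist and differ.

```python
MIN_UNITS = 1000  # sample rule from the text: more than 1000 cars in the year


def reconcile_quantity(reported_export, reported_import):
    """Average the exporter- and importer-reported quantities.

    If only one side reported (the other is None), that figure is used
    as-is; this fallback is an assumption, not stated in the source.
    """
    values = [v for v in (reported_export, reported_import) if v is not None]
    if not values:
        return None
    return sum(values) / len(values)


def in_sample(quantity):
    """True if the reconciled quantity passes the 'more than 1000' rule."""
    return quantity is not None and quantity > MIN_UNITS
```

For instance, `reconcile_quantity(1000, 1200)` averages the two mirror figures, and `in_sample` then decides whether the country enters the estimation sample for that year.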
The core of the gravity model is based on GDP and distance, but a variety of variables were considered for initial inclusion (Yamarik and Ghosh 2005). Selecting the appropriate variables for inclusion is important since including irrelevant variables can lower the precision of the model, while omitting variables that are important could introduce bias into the model estimates (Greene 2003).
The independent variables included within the initial specification of the gravity model to be tested are as follows: the GDP of countries i and j; the GDP per capita of countries i and j; the population of countries i and j; the geographical distance between the trade partners; the quality of logistics in country j and; the import tariff on passenger cars moving from country i to country j. In addition to these, the gravity model is initially specified to include a number of dummy variables controlling for: membership of the EEA; if country j has direct access to the sea; country adjacency and; if countries i and j share a common language. The choice of these variables was made by reviewing work by, for example, Aitken (1973), Rose (2004) and Chi and Kilduff (2010), who have all performed similar studies.
GDP and population
GDP is included in the model on the basis that the GDP of an exporting nation measures its productive capacity (Aitken 1973; Abrams 1980), while the GDP of an importing nation provides a measure of absorptive capacity or potential market size (Tinbergen 1962). Together with population, the value of GDP will impact the demand for imports (Aitken 1973; Abrams 1980). In terms of the exporting nation, the potential for economies of scale suggests that the larger the population, the more efficient is market production (Aitken 1973).
GDP per capita for countries i and j are also included in the model because, as established by Linder (1961), countries that have similar demand structures trade more with each other than dissimilar countries and that greater inequality has a negative effect on trade. Bergstrand (1990) argues that this relationship is present in both the supply structure, based on the Heckscher-Ohlin theorem, as well as in the demand structure, such as in the work by Linder (1961).
Data on GDP, population and GDP per capita for all countries were collected from the World Bank (2017a). GDP data referred to the GDP at purchaser’s prices in USD [Footnote 7]; population data were the total population based on mid-year figures for all residents, regardless of legal status or citizenship and; GDP per capita is the ratio of the former over the latter (World Bank 2017a). In utilising this source, it should be recognised that the World Bank relies on international and regional sources such as the United Nations (2017b), Eurostat by the European Commission (2017) and Prism (2017). The World Bank also uses national statistics gathered from census reports and other national sources, which means that it is reliant on those individual countries to provide updated statistics (see World Bank (2017a) for more details). Countries which did not report their national statistics were excluded from the sample.
Distance, Total logistics cost and the quality of logistics
Geographical distance has long been treated as a proxy for transportation cost (for example, see Linnemann 1966). Disdier and Head (2008) found that bilateral trade is almost directly inversely proportionate to physical distance, with an average increase of distance by 10% reducing the trade between the parties by approximately 9%. Chi and Kilduff (2010) suggest that this is because transportation costs and convenience favour closer relationships and sourcing. Due to the advancement of logistics-related technology, distance as a proxy for transportation costs has been questioned and total logistics costs argued as being a more appropriate input variable. Disdier and Head (2008) have shown, however, that the effect of geographical distance has not declined in more recent years, indicating that technological change has not led to a reduction in the impact of distance.
A distance variable is thus included within the model as one proxy for total logistics cost, with distance measured either from the capital city of country i to the capital city of country j, as suggested by Yamarik and Ghosh (2005), or as the “great circle distance” from the location where the largest port is situated in country i to the location of the largest port of country j, in line with Smarzynska (2001). The choice between these two measures is made on a country-by-country basis where countries north of Turkey or located within Europe were assumed to transport cars by land and the others by sea. If country j lacked a port and was assumed to transport cars by sea, the distance was measured from the capital city of country j to the closest port, and from that port to the largest port of country i. Road transport distances were obtained from Google Maps (2017) and sea transport distances from Marinetraffic (2017). [Footnote 8]
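The distances in this study were taken from Google Maps and Marinetraffic, not computed. The sketch below only illustrates what a great-circle distance between two coordinate pairs is, using the haversine formula. It assumes a spherical Earth with a mean radius of 6371 km, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean Earth radius, spherical model


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in
    decimal degrees (latitude, longitude)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above 1 near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))
```

A great-circle figure is a lower bound on an actual sailing or road distance, which is one reason routed distances from mapping services can differ from it noticeably.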
In order to test other potential influences on total logistics cost, the model initially included a proxy for infrastructure, namely the total span of the motorway network, in line with Bougheas et al. (1999). However, due to the characteristics of the international car trade (i.e. it is mostly moved as seaborne freight in car carriers), the model was later modified to instead include a dummy variable for country j’s direct access to the sea. Google Maps (2017) provided the source for data on whether country j had direct access to the sea and for countries that share a border with country i.
The overall quality of a nation’s logistics system is sourced from the World Bank (2017b), where the Logistics Performance Index (LPI) is derived from a survey where respondents rate countries based on several logistics performance criteria: “the efficiency of customs and border clearance”; “the quality of trade and transport infrastructure”; “the ease of arranging competitively priced shipments”; “the competence and quality of logistics services”; “the ability to track and trace consignments” and; “the frequency of which shipments reach consignees within scheduled or expected delivery times” (World Bank 2014, pp.51–52). The index is only made available every second year. Hence, the index for 2012 was applied to the models for 2012 and 2013 and the index for 2014 was applied in 2014 and 2015. The input variable was based on the country with the highest index value being the benchmark and determined as follows for the importing nation, country j:
$$ {LOGIS}_j=\left(\ \frac{x_j}{x_i}\ \right)\times 100 $$
(4)
…where:
LOGISj represents the overall quality of logistics performance of country j in year t; xj is the observed quality of logistics in country j in year t and; xi is the observed quality of logistics in the country with the highest LPI value in year t.
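The benchmarking in equation (4), and the reuse of each biennial LPI edition for two model years, can be sketched as below. The country labels and scores in the usage note are hypothetical placeholders, not World Bank figures, and the function names are assumptions.

```python
# Model year -> LPI edition applied, as described in the text.
LPI_EDITION = {2012: 2012, 2013: 2012, 2014: 2014, 2015: 2014}


def logis_index(lpi_scores):
    """Rescale LPI scores so that the best-performing country equals 100.

    lpi_scores: dict mapping country -> LPI score for one edition.
    """
    best = max(lpi_scores.values())
    return {country: score / best * 100 for country, score in lpi_scores.items()}
```

With hypothetical scores `{"A": 4.0, "B": 3.0, "C": 2.0}`, country A is the benchmark and the other values are expressed as a percentage of its score.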
Tariffs
All countries profit from fewer barriers to trade (Eaton and Kortum 2002) and reductions of tariffs have been argued to explain about 26% of the growth of trade in OECD countries between the late 1950s and the late 1980s (Baier and Bergstrand 2001). Therefore, a variable reflecting the tariff rate was included in the model. For the purpose of collecting the data, the most-favoured-nation (MFN) tariff rates for ‘HS 8703 Passenger Cars’ were sourced from the World Trade Organization (2017b). The rates were presented as applied MFN tariff rates in weighted averages based on the sub-categories of ‘HS 12 8703 Passenger Cars’. The data were compared to all of the EU’s preferential trade agreements (PTAs) and, if there was a deviation, the bound rate in the PTA was applied. In cases where HS 8703 was not specifically referred to in a PTA, the applied MFN tariff rate presented by the World Trade Organization (2017b) was utilised.
In addition, the most recent updated tariff rates were assumed to be valid in the years following. Thus, if country j reported a tariff rate x for HS 8703 in year t, then this rate was applied in years t + 1 and t − 1 in cases where there was no other tariff rate present. If there was a change of tariff rate x to tariff rate z in year t + 1, the tariff rate z was applied in t + 1 and all years following it. If the tariff rate x was introduced and came into effect in year t − 1, but tariff rate y was applied in all years before year t − 1, then the tariff rate x applies in year t − 1 and all years following it. A value of 1 was added to all tariff rates so that logarithms could be applied.
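One way to read the carry-over rules above is as a forward fill of the most recent reported rate, followed by a backward fill for any years before the first report. The sketch below follows that reading. The function names are hypothetical, and the unit of the rate is left as whatever the source reports.

```python
import math


def fill_tariffs(reported, years):
    """Fill missing tariff rates for an ordered list of years.

    reported: dict mapping year -> rate for the years with a reported rate.
    Gaps take the most recent earlier rate; years before the first report
    take the first reported rate.
    """
    filled, last = {}, None
    for year in years:  # forward fill
        if reported.get(year) is not None:
            last = reported[year]
        filled[year] = last
    nxt = None
    for year in reversed(years):  # backward fill for leading gaps
        if filled[year] is None:
            filled[year] = nxt
        else:
            nxt = filled[year]
    return filled


def log_tariff(rate):
    """ln(1 + rate): adding 1 keeps zero tariffs defined under the log."""
    return math.log(1.0 + rate)
```

For example, with rates reported only for 2013 and 2015, the 2013 rate is applied to 2012 and 2014, and the 2015 rate applies from 2015 onwards.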
Language commonality and country adjacency
Language commonality was included to show whether countries i and j shared a language or cultural similarity (Frankel et al. 1995) since this makes trade easier (Bougheas et al. 1999). When two countries share a language, it increases trade “substantially” (Havrylyshyn and Pritchett 1991, p.6). In addition to the language commonality variable, the model also included a border effect dummy variable. Aitken (1973, p.882) argues that neighbouring countries can be expected to trade more with each other due to “similarity of tastes and an awareness of common interests”. The data on language commonality was based on CIA (2017).
EEA membership
Most economists argue that international trade should be free (Rose 2004). However, the regional integration provided by the EU has the “potential to harm participants through trade diversion or nonparticipants nearby through worsened terms of trade” (Eaton and Kortum 2002, p.1743). A dummy variable is included, therefore, for membership of the European Community. Baier and Bergstrand (2001) explain that it might seem unnecessary to include dummy variables to reflect a preferential trade agreement (hereinafter PTA), but the PTA itself might lead to greater trade beyond the effect of no tariff barriers. Input data on membership of the EEA was collected from the European Union (2017) and included all member countries of the EU or EFTA.
Model specification
In summary, the fully specified model is as follows:
$$ {\displaystyle \begin{array}{c}\ln \left({EX}_{ij}\right)=\alpha +{\beta}_1\ln \left({GDP}_i\right)+{\beta}_2\ln \left({GDP}_j\right)+{\beta}_3\ln \left({D}_{ij}\right)+{\beta}_4\ln \left({POP}_i\right)\\ {}+{\beta}_5\ln \left({POP}_j\right)+{\beta}_6\ln \left({GDP CAP}_i\right)+{\beta}_7\ln \left({GDP CAP}_j\right)\\ {}+{\beta}_8\ln \left({TARIFF}_{ij}\right)+{\beta}_9\ln \left({LOGIS}_j\right)+{\beta}_{10}{CA}_{ij}\\ {}+{\beta}_{11}{LC}_{ij}+{\beta}_{12}{EEA}_j+{\beta}_{13}{SEA}_j+{e}_{ij}\end{array}} $$
(3)
…where the parameters to be estimated are denoted by β and the variables are defined as follows:
EXij export of passenger cars from country i to country j, in units;
GDPi GDP of country i, in current USD;
GDPj GDP of country j, in current USD;
Dij physical distance between the trade centre in country i and country j, in kilometres;
POPi total population of country i;
POPj total population of country j;
GDPCAPi GDP per capita of country i, in current USD;
GDPCAPj GDP per capita of country j, in current USD;
TARIFFij tariff rate that country j imposes on passenger cars from country i;
LOGISj quality of logistics of country j;
CAij country adjacency, a dummy variable with a value of 1 if country j shares a common border with country i, 0 otherwise;
LCij common language, a dummy variable with a value of 1 if country j shares an official language with country i, 0 otherwise;
EEAj European Community, a dummy variable with a value of 1 if country j is a member of the European Community, 0 otherwise;
SEAj direct access to the sea, a dummy variable with a value of 1 if country j has direct access to the sea, 0 otherwise; and
eij the error term.
This model is estimated using OLS regression analysis. A higher GDP and population in country j were expected to lead to greater demand and, consequently, to higher passenger car exports from country i to country j. A higher GDP and population in country i were expected to increase the production capacity. Similarly, a higher quality of logistics in the importing country can also be expected to be positively related to trade volumes. With respect to the dummy variables, the adjacency of trading nations, a common language, membership of the EEA and direct access to the sea would all also be expected to facilitate trade. Hence, the variables GDPj; POPj; GDPCAPj; LOGISj; CAij; LCij; EEAj and SEAj were all expected to be positively correlated with export quantities. On the other hand, increasing physical distance between trade partners, as well as higher tariffs, were both expected to have a depressing effect on trade quantities. Hence, the variables Dij and TARIFFij were expected to be negatively correlated with trade volumes.
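As an illustration of how one observation maps onto the right-hand side of equation (3), the sketch below builds a regressor row in the order β1 to β13, preceded by the intercept. The continuous variables enter in logarithms and the dummies enter as 0 or 1. The record field names are hypothetical, and `tariff_plus_one` is assumed to hold the tariff rate with 1 already added.

```python
import math

# Continuous variables, in the order beta_1 ... beta_9 of equation (3).
LOG_VARS = ["gdp_i", "gdp_j", "distance_km", "pop_i", "pop_j",
            "gdpcap_i", "gdpcap_j", "tariff_plus_one", "logis_j"]
# Dummy variables, in the order beta_10 ... beta_13 (CA, LC, EEA, SEA).
DUMMY_VARS = ["adjacent", "common_language", "eea_member", "sea_access"]


def regressor_row(record):
    """Return [1, ln(continuous vars)..., dummies...] for one country pair."""
    row = [1.0]  # intercept (alpha)
    row += [math.log(record[name]) for name in LOG_VARS]
    row += [float(record[name]) for name in DUMMY_VARS]
    return row
```

Stacking one such row per importing country, with ln(EXij) as the dependent variable, gives the data layout that an OLS routine expects.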