Income Inequality Data

08 November 2018

There has been much progress in improving the availability, quality and comparability of income and wealth inequality data. Several cross-national databases containing summary inequality statistics are now available. In this note, we review the World Bank’s PovcalNet, the Luxembourg Income Study and Wealth Study Databases (LIS, LWS), the Standardized World Income Inequality Database (SWIID), the World Income Inequality Database (WIID), the World Wealth and Income Database (WID.world), the All the Ginis dataset, the Estimated Household Income Inequality dataset (EHII) and the Global Consumption and Income Project (GCIP). All of the databases examined make their data publicly available free of charge, with the exception of LIS/LWS.
The databases reviewed differ considerably in purpose, coverage, data sources and indicators provided. Some of them are just repositories of estimates compiled from primary and other secondary sources. Others provide original estimates based on microdata, drawn mainly from a growing number of household surveys. Some rely on imputation methods to obtain estimates for years when data are missing while others do not. As a result, coverage by country and year differs significantly across datasets. Some databases are produced by institutions while others are developed by individual researchers. Some institutions make data harmonization one of their priorities, while others offer diverse sets of data together with the metadata needed to identify differences across data sources and countries.

Although there is significant agreement among these datasets, there are also inconsistencies in both the levels and trends of inequality obtained from each database for any given indicator. Some of the differences across databases are illustrated in the full note. Overall, there are trade-offs between breadth (coverage) and comparability. Maximizing comparability and quality means focusing on a small number of (developed) countries. It also requires thoroughly harmonizing data, using data from one source or using only a single basis of calculation. Increasing coverage means drawing on less reliable data, using different variables to produce estimates (income is used in practically all developed countries; consumption is often the underlying measure in developing countries), and/or making assumptions to impute values where data are missing.
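
As a rough illustration of why the basis of calculation matters, the sketch below computes a Gini coefficient for the same five households twice, once on income and once on consumption. The household figures are invented purely for illustration and are not drawn from any of the databases reviewed; the function name `gini` and both lists are assumptions of this sketch, and the formula is the standard rank-based Gini without the survey weights or equivalence scales that the databases apply.

```python
def gini(values):
    """Rank-based Gini coefficient for a list of non-negative values.

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,
    where x_i are the values sorted in ascending order and i starts at 1.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical households (illustrative numbers only, not real data).
# Consumption is assumed to be flatter than income because richer
# households are assumed to save a larger share of what they earn.
household_income = [10, 20, 30, 40, 100]
household_consumption = [12, 20, 27, 34, 60]

gini_income = gini(household_income)
gini_consumption = gini(household_consumption)

print(f"Gini on income basis:      {gini_income:.3f}")
print(f"Gini on consumption basis: {gini_consumption:.3f}")
```

Under these assumed figures the consumption-based value comes out lower than the income-based one for the same households, which suggests why estimates built on different underlying variables should not be compared directly.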

Among the databases examined, PovcalNet appears to have the most non-imputed estimates for the largest number of countries. It is also the data source used for the international monitoring of SDG target 10.1. On the other hand, LIS is the only source that uses a uniform set of assumptions and definitions on the basis of thoroughly harmonized microdata to maximize comparability. SWIID provides the most complete dataset, but many of the values are imputed.

Read our Note on Income Inequality Data
