CADS

Data Analytics Themes

Global Studies and Living Standards Analytics

The global studies and living standards analytics node of DART applies innovative data analysis techniques to household living standards survey data, promoting a better understanding of how and why living standards vary across the world. The approach is strongly cross-disciplinary, joining fields such as geography, economics, sociology and statistics, as well as global studies and computer science. Past and current interests include studies related to Africa, Vietnam and Asia, as well as investigations of the international digital divide.

Core Group Members

Joel Deichmann, Dominique Haughton, Abdi Eshghi, Phong Nguyen, Maria Skaletsky

Current Projects

Small area estimation and multilevel models in Vietnam, using Vietnam Household Living Standards Surveys

Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1), Phong Nguyen (T&C Consulting, Hanoi, Vietnam).

Multilevel models and longitudinal living standards in Vietnam

Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1), Phong Nguyen (T&C Consulting, Hanoi, Vietnam).

Tet and holiday expenditures in Vietnam

Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1), Le Thi Xuan Mai (Science University, Ho Chi Minh City, Vietnam), Phong Nguyen (T&C Consulting, Hanoi, Vietnam).

Global digital divide

Robert Galliers (Bentley University), Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1), Maria Skaletsky (Bentley University).

Reflections on the United Nations Millennium Development Goals using Kohonen self-organizing maps

Joel Deichmann (Bentley University), Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1).

Informal sector, business environment and economic growth: a comparative analysis of West and Central Africa (funded by CRDI, Canada)

Jean-Jacques Ekomié (Université Omar Bongo, Gabon), Dominique Haughton (Bentley University and Universities Paris 1 and Toulouse 1), Bernadette Kamgnia (Université Yaoundé 2, Cameroon), Aly Mbaye (Université Cheikh Anta Diop, Senegal), Steve Golub (Swarthmore College).

Agent-based model of “clandos” taxis and other public transportation in Libreville, Gabon

Jean-Jacques Ekomié (Université Omar Bongo, Gabon), Dominique Haughton (Bentley University and Universities Paris 1 and Toulouse 1), Adrien Lammoglia (VitiTerroir, France), Aly Mbaye (Université Cheikh Anta Diop, Senegal).

Quality of Life and Joining the European Union

Joel Deichmann (Bentley University), Abdi Eshghi (Bentley University), Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1), Mingfei Li (Bentley University).

Representative Publications
  • Deichmann, Joel I., Eshghi, Abdolreza, Haughton, Dominique, and Li, Mingfei. 2017. Socio-Economic Convergence in Europe One Decade after the EU Enlargement of 2004: Application of Self-Organizing Maps. Eastern European Economics. 3, 236-260. DOI:10.1080/00128775.2017.1287547.
     
  • “Reciprocity in social networks - A case study in Tamil Nadu, India”, Case Studies in Business, Industry and Government Statistics, 5(2), 126-131 (S. Arumugam, D. Haughton, B. Vasanthi and Changan Zhang) (2014).
     
  • “Kohonen Self-organizing maps as a tool for assessing progress toward the UN Millennium Development Goals”, Journal of Human Development and Capabilities, 14(3), 393-419 (Joel Deichmann, Dominique Haughton, Charles Malgwi and Olomayokun Soremekun) (2013).
     
  • “Multilevel models and inequality in Vietnam”, Journal of Data Science, 8, 289-306 (Dominique Haughton, Phong Nguyen) (2010). 
     
  • “Living standards of Vietnamese provinces: a Kohonen map”, Case Studies in Business, Industry and Government Statistics, 2(2), 109-113 (Dominique Haughton, Phong Nguyen and Irene Hudson) (2009).
     
  • “Measuring the international digital divide: an application of Kohonen self-organizing maps”, Journal of International Knowledge and Learning, 3(6), 552-575 (J. Deichmann, D. Haughton, A. Eshghi, D. Haughton, S. Sayek and S. Woolford) (2007).
     
  • “Shifts in living standards: the case of Vietnamese households 1992-1998”, Philippine Journal of Development, 32(1), 79-101 (Dominique Haughton, Le Thi Thanh Loan) (2005).
     
  • “A Kohonen map of Eastern European countries and former Soviet republics”, Journal of Business Strategies, 20(1), 23-44 (J. Deichmann, A. Eshghi, D. Haughton, S. Sayek, N. Teebagy, H. Topi) (2003).
     
  • “Determinants of foreign direct investment in the Eurasian transition states”, Eastern European Economics, 41(1), 5-34 (J. Deichmann, A. Eshghi, D. Haughton, S. Sayek, N. Teebagy) (2003).

Bentley Big Data Initiative (BBDI)

“Big data” has emerged as an important and valuable component in the decision-making process for business, healthcare, finance and many other fields. There are two key aspects of big data. First, there is an incredible volume of data presently generated and available for analysis. Second, data comes in a variety of formats, and different types of data require different types of analysis. Novel techniques are being developed to analyze this variety and volume of data. Our goal at this stage is to collect and develop techniques to analyze large datasets and various types of data, and to educate interested members of the Bentley community in their use.

Technical Group Members

David Oury, Jason Wells, Brock Tibert

Current Projects

Open government datasets (Kevin Mentzer)


The Open Data Initiative by President Obama is designed to establish “transparency, public participation and collaboration.” The datasets that this initiative makes available come from the health care, finance and global development sectors (to name just a few). They enable researchers to study government functions and spending, and to enhance the analysis of private datasets by incorporating government data into them. Our first step is to identify and catalog those datasets that appear useful to the Bentley community. Then we will develop methods to investigate these datasets.

Analyzing large datasets and using parallel processing with R (David Oury)


R is an excellent language for mathematical modeling and research, but some large datasets are difficult if not impossible to analyze with R. Fortunately, libraries are available which facilitate work with large datasets and which provide access to external databases. These libraries are studied with the goal of increasing the size of datasets with which we can work and the types of analysis applied to these datasets. Computers with multiple cores and clusters of computers are increasingly common and available to everyday users. Though R was not designed for a multiprocessor environment, there are now R libraries that make this possible. Several parallel processing models are made available through these libraries. We are studying these models and the libraries that implement them. Our goal is to document techniques for using R to work with large datasets and to work in a multiprocessing environment.

The Kiva dataset and graphical analysis (David Oury)


Kiva.org has been instrumental in microlending by connecting lenders and borrowers worldwide. Two lenders are considered “connected” when they fund the same loan. Network analysis is used to study the networks formed by these connections to better understand lenders and help them identify meaningful loans. Our goal is to increase funding (through Kiva) by helping lenders strengthen their engagement in the lending process.
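The notion of “connected” lenders described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual method: the function names (co_funding_edges, degrees), the (lender, loan) record format and the sample records are all hypothetical and do not come from the Kiva dataset. Each pair of lenders who funded the same loan becomes an edge, weighted by the number of loans they share.

```python
from collections import defaultdict
from itertools import combinations


def co_funding_edges(fundings):
    """Map each unordered lender pair to the number of loans both funded."""
    # Group lenders by the loan they funded.
    lenders_by_loan = defaultdict(set)
    for lender, loan in fundings:
        lenders_by_loan[loan].add(lender)

    # Every pair of lenders on the same loan is "connected" once per shared loan.
    weights = defaultdict(int)
    for lenders in lenders_by_loan.values():
        for a, b in combinations(sorted(lenders), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degrees(edges):
    """Count how many distinct lenders each lender is connected to."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(degree)


# Hypothetical (lender, loan) records, invented for illustration only.
fundings = [
    ("ana", "loan1"), ("ben", "loan1"),
    ("ana", "loan2"), ("ben", "loan2"), ("cho", "loan2"),
    ("dee", "loan3"),
]

edges = co_funding_edges(fundings)
for (a, b), shared in sorted(edges.items()):
    print(f"{a} - {b}: {shared} shared loan(s)")
print(degrees(edges))
```

A lender who never shares a loan with anyone (here the hypothetical “dee”) has no edges and so does not appear in the degree table. The resulting weighted edge list could then be handed to a dedicated network analysis library for further study.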

U.S. Patent Database (Michael Walsh, Fred Ledley)


The Center for Integration of Science and Industry is analyzing the U.S. Patent database to study the emergence of technology in the fields of biotechnology, drug development and renewable energy. Our initial research seeks to use patent grants and applications to evaluate technological evolution at both global and institutional levels. This relational database includes all publicly available data associated with full-text patent grants, patent applications, classifications, and assignments. It allows for in-depth research and analysis of available patent data through various data mining and analytics techniques.

Benchmarking and system configuration (Jason Wells)


The BBDI group uses a ScaleMP system and cluster consisting of six Dell M710 blades, with a total of 72 cores and 864 GB of memory, running Red Hat Enterprise, which supports several NoSQL servers. Realizing the benefits of this system will require performance monitoring with an eye toward optimizing system configurations.

Publications

To come

Resources

To come

Social Networks Analysis Project (SNAP)

The SNAP group is interested in a variety of problems related to networks, where the focus is on links between people or entities. The team has investigated the evolution of cross-departmental co-publication links, introduced methods to generate random networks from a given distribution, and recently investigated social links between households in a village in South India.

Core Group Members

Susan Adams, Nathan Carter, Charles Hadlock, Dominique Haughton

Current Projects

Temporal Social Network analysis of interlocking boards in the CAC 40 and Dow Jones (Paris 1 University funded project)

Francois-Xavier Dudouet (Université Paris Dauphine), Dominique Haughton (Bentley University, Paris 1 and Toulouse 1 Universities), Kevin Mentzer (Bentley University), Pierre Latouche (Université Paris 1) and Fabrice Rossi (Université Paris 1).

Temporal Social Network analysis of the cross-departmental co-publication network at a business university

Dominique Haughton (Bentley University, Paris 1 and Toulouse 1 Universities), Maria Skaletsky (Bentley University).

A comparative study of reciprocity in two rural social networks in Tamil Nadu, India

Subramanian Arumugam (Kalasalingam University, India), Dominique Haughton (Bentley University, Paris 1 and Toulouse 1 Universities), Balasubramanian Vasanthi (Kalasalingam University, India), Changan Zhang (Bentley University and Epsilon).

Representative Publications
  • “Reciprocity in social networks - A case study in Tamil Nadu, India”, Case Studies in Business, Industry and Government Statistics, 5(2), 126-131 (S. Arumugam, D. Haughton, B. Vasanthi and Changan Zhang) (2014).
  • “On the generation of random networks from a given distribution or type”, Computational Statistics and Data Analysis, 52(8), 3928-3938 (N. Carter, C. Hadlock, D. Haughton) (2008).
  • “Proactive encouragement of interdisciplinary research teams in a business school environment: strategy and results”, Journal of Higher Education Management, 30(2), 153-164 (S. Adams, N. Carter, C. Hadlock, D. Haughton, G. Sirbu) (2008).
  • “Change in connectivity in a social network over time: a Bayesian perspective”, Connections, 28(2), 17-27 (S. Adams, N. Carter, C. Hadlock, D. Haughton, G. Sirbu) (2007).
  • “A recipe for Collaborative Research”, BizEd, September/October 2006 (S. Adams, N. Carter, C. Hadlock, D. Haughton, G. Sirbu).

Marketing Analytics

The Marketing Analytics team focuses on leveraging novel analytics techniques to contribute significantly to solving marketing problems. The team has investigated determinants of churn in the telecommunication industry, as well as how the digital divide impacts consumption patterns. In a corporate-academia NSF-funded initiative, the team has successfully used Hidden Markov Models to impute unknown competitor marketing activity.

Core Group Members

Dominique Haughton, Abdi Eshghi, Heikki Topi, Changan Zhang

Current Projects

Collaborative networks and cost-effectiveness in the pharmaceutical industry: an academic-corporate initiative to improve predictive modeling with Big Data

Dominique Haughton (Bentley University and Universities Paris 1 and Toulouse 1), Danny Jin (Epsilon), John Lin (Epsilon), Heikki Topi (Bentley University), Qizhi Wei (Epsilon), Changan Zhang (Bentley University and Epsilon).

Representative Publications
  • “Imputing unknown competitor activity with Hidden Markov Models”, under revision for Journal of Direct, Data and Digital Marketing Practice (Dominique Haughton, Guangying Hua, Danny Jin, John Lin, Qizhi Wei, and Changan Zhang) (2013). 
  • “Determinants of customer loyalty in the wireless telecommunications industry”, Telecommunications Policy, 31(2), 93-106 (D. Haughton, J. Deichmann, A. Eshghi, N. Teebagy, H. Topi) (2006).
  • “Determinants of customer churn behavior: the case of the local telephone service”, Marketing Management Journal, 16(2), 179-187 (D. Haughton, J. Deichmann, A. Eshghi, N. Teebagy, H. Topi) (2006).
  • “Digital divide and consumption patterns in the U.S.: an exploratory investigation”, Marketing Management Journal, 15(4), 108-122 (A. Eshghi, D. Haughton, H. Topi, J. Deichmann) (2005).

Health Care Analytics

Health care systems have become more complex across the world and many stakeholders are interested in the impact of different policies on health care. New questions arise constantly in the measurement of quality, in health care management, medical science and clinical studies. These questions need to be investigated and answered in depth.

The Health Analytics Research Group (HARG) is a subgroup of health care analytics in DART. It involves members from both academic and non-academic fields, with interests in health care and medical research-related topics. Existing nationwide health care databases, such as the U.S. Department of Veterans Affairs Decision Support Systems database or the U.S. Medicare/Medicaid database, can support research on topics such as patients’ outcomes, epidemiology studies, related economic research, patients’ and providers’ behavior, service assessment and health care insurance. The team uses these data sources and up-to-date analytics techniques to address important health care issues.

Core Group Members

Mingfei Li (Department of Mathematical Sciences), Swati Mukerjee (Department of Economics), Gang Li (Department of Management), Lan Xia (Department of Marketing), Jennifer Xu (Department of Computer Information Systems), Chao Wang (Business Analytics PhD Student), Ying Wang (Business Analytics PhD Student)

Current Projects

The Direct and Indirect Effect of Payment Methods on Patient Satisfaction

Mingfei Li, Swati Mukerjee, Gang Li, Lan Xia, Jennifer Xu

Selecting the Leading Factors for Sleep Problems in Pediatric Burns: The Multicenter Benchmarking Study

Chao Wang

Biomarkers of miRNA in Alzheimer Disease

Mingfei Li (Bentley University), Vladimir Tsivinski

Representative Publications

To come

Media Analytics

Media analytics is concerned with the analysis of data of any type arising on websites and in print, audio and video media, etc. Typically, text analytics and social network analyses are used, along with more traditional techniques, to derive novel insights. A recent project has contributed two novel applications of social network visualization and data mining techniques to the motion picture industry.

The Music Analytics Group (MAG), a subgroup of Media Analytics in DART, is an interdisciplinary research group focused on projects at the intersection of music studies and data analysis. The group is interested in all aspects of a wide range of music from a variety of geographical locations and in all types of data analysis.

Core Group Members

Dominique Haughton, Mingfei Li, Piaomu Liu

Current Projects

Movie Analytics: a Hollywood introduction to big data, monograph for Springer-Verlag

Dominique Haughton (Bentley University, Universities Paris 1 and Toulouse 1), Mark-David McLaughlin (Cisco Systems and Bentley University), Kevin Mentzer (Bentley University), Nathalie Villa-Vialaneix (Institut National pour la Recherche Agronomique Toulouse and Université Paris 1), Changan Zhang (Bentley University and Epsilon).

Representative Publications

To come

Finance Analytics

The finance analytics theme focuses on applications of novel data mining techniques to problems related to the financial industry. Current projects involve microfinance and the study of interlocking boards in Europe and the United States.

Core Group Members

Dominique Haughton, David Oury

Current Projects
  • Temporal Social Network analysis of interlocking boards in the CAC 40 and Dow Jones (Paris 1 University funded project): Francois-Xavier Dudouet (Université Paris Dauphine), Dominique Haughton (Bentley University, Paris 1 and Toulouse 1 Universities), Pierre Latouche (Université Paris 1) and Fabrice Rossi (Université Paris 1).
  • The Kiva dataset and graphical analysis (David Oury): Kiva.org has been instrumental in microlending by connecting lenders and borrowers worldwide. Two lenders are considered “connected” when they fund the same loan. Network analysis is used to study the networks formed by these connections to better understand lenders and help them identify meaningful loans. Our goal is to increase funding (through Kiva) by helping lenders strengthen their engagement in the lending process.

Publications

To come

Analytics: Other

Many other issues in analytics have arisen recently. These areas include ethics and analytics (for example, definitions of what constitutes an ethical collection and use of data, or trade-offs between privacy and security), causal business analytics (for example, when can one claim causal impacts of variables on other variables in realistic business settings?) and government analytics (for example, analyses of government databases for policy research).

Core Group Members

Dominique Haughton, Mingfei Li, Heikki Topi

Current Projects

Cause-and-Effect Business Analytics, book for Chapman and Hall, Dominique Haughton (Bentley University and Universities Paris 1 and Toulouse 1), Jonathan Haughton (Suffolk University), Victor Lo (Fidelity Investments).

Machine-Learning Based Personalization: Uplift Modeling and Related Analytics, Dominique Haughton (Bentley University and Universities Paris 1 and Toulouse 1), Victor Lo (Fidelity Investments), Mingfei Li (Bentley University).

Representative Publications

To come