+49 30 2592190 (ext. 252) Monday-Friday: 09:00 - 17:00

Welcome to Hertie School Data Science Lab

Public Policy challenges meet the power of Data Science

The Hertie School Data Science Lab is a research and training hub that tackles societal challenges with computational and data-intensive methods.

Our mission is to foster, advance and promote excellence in data science research, education and application to enable better decision-making for the common good.

Over the next decade, data science is expected to have a significant impact across all sectors of the economy and society through automation and personalisation of services. The Hertie School Data Science Lab will leverage and amplify breakthroughs in data science and artificial intelligence (AI) to tackle major societal problems.

Our Work

What we do

Research

The key research focus of the Hertie School as a whole, and of the Data Science Lab in particular, is contributing to the common good. Core research areas include experimental survey methods, causal inference, and natural language processing. Additional research areas will be developed organically through growth and cooperation with partner institutions.

Teaching

The Lab builds capacity by training and educating a new generation of public policy professionals and practitioners who use data science and AI to tackle today's complex societal challenges. We aim to nurture our students to become AI-enabled policy makers and practitioners through the new MSc Data Science for Public Policy at the Hertie School.

Research Consulting

Our Research Consulting team provides resources and advice on methodological and technical issues for graduate students, researchers, and faculty at the Hertie School and the SCRIPTS Cluster of Excellence. We primarily focus on questions of research design, quantitative measurement and statistical inference, computational tools, and data visualisation.

Our Teaching

Course offerings from the Lab

Introduction to Data Science

An intensive foray into the science of extracting insights from data

This course will teach you how to do data science with R. In recent years, data analysis skills have become essential for those pursuing careers in policy advocacy and evaluation, business consulting and management, or academic research in the fields of education, health, medicine, and social science. This course provides students with advanced data science skills using the powerful R programming language.

Course Page Syllabus

Python Programming for Data Scientists

Explore the foundation of computer science and the art of elegant code

Python is a versatile and expressive programming language that is becoming increasingly important in data science and analysis, and it has become essential in modern-day applications of machine learning and NLP. This course is an introduction to the Python programming language for students without prior programming experience. We cover data structures, control flow, object-oriented programming and algorithm analysis. Upon completion, students will have mastered foundational concepts of programming and be able to write professional-grade Python code.

Course Page Syllabus

Mathematics for Data Science

Crucial understanding of maths for success in data science

This course aims to deliver a compact and tailored introduction to the core mathematical concepts of data science. Linear algebra, probability theory, statistics, and optimisation are mathematical pillars underlying the practice of data science. The course covers foundational mathematical concepts such as norms, matrix algebra, multivariate calculus, maximum likelihood estimation and many more in theory and practice. Upon completing the course, students will have a broad knowledge of linear algebra, probability theory, statistics, and optimisation necessary to understand the theoretical underpinnings of modern statistics and machine learning methods.

Syllabus

Data Structures and Algorithms

The fundamental building blocks of any programming language

If programming is an art, then data structures and algorithms are the colours and strokes with which programmers display their expertise. Data structures are the ways in which data is stored so that algorithms can operate on it. Algorithms, in turn, are instructions on how to handle data efficiently to achieve a desired result. By the end of the course, students will understand these elementary building blocks of programming, solve simple coding problems, evaluate the complexity and efficiency of algorithms, and know the best use cases for each type of data structure.

Syllabus

Causal Inference

A framework to gain a deeper understanding of the world and its causal mechanisms

Public policy and data-driven organisations are complex arenas, and the ability to uncover causal connections rather than simple correlations is vital in evidence-based decision making. This course teaches the analytical framework of contemporary causal inference, which is connected to modern statistical and machine learning methods. The course covers a comprehensive array of topics and guidelines on designing and implementing causal evaluation research based on the latest methodologies. Special emphasis will be given to the application of causal analysis for policymakers and development practitioners.

Course Page Syllabus

Machine Learning

Theoretical and hands-on practice with the most exciting data science tool today

Machine learning is a core technology of artificial intelligence and data science that enables computers to learn without being explicitly programmed. Recent advances in machine learning have given us the innovations behind self-driving cars, AlphaGo, Amazon, and Netflix. This technology has also allowed us to predict armed conflict and post-electoral violence, detect fake news, develop targeted provision of care and public services, and implement early policy interventions. This course provides a hands-on introduction to machine learning. The course covers topics in supervised and unsupervised learning, including the most common learning algorithms such as regression, classification, random forests, clustering, and dimensionality reduction. Students will learn the fundamental concepts underlying machine learning algorithms, but will equally focus on the practical use of machine learning algorithms using open-source frameworks.

Syllabus

Natural Language Processing with Deep Learning

Use state-of-the-art deep learning techniques to build cutting-edge NLP models for our AI-powered future

Natural Language Processing (NLP) is a key technology of the information age. Automatically processing natural language outputs is a key component of artificial intelligence. Applications of NLP are everywhere, as people and institutions largely communicate in language. Recently, statistical techniques based on neural networks have achieved a number of remarkable successes in natural language processing, leading to a great deal of commercial and academic interest in the field. This course provides an overview of modern data-driven models, including richer structural representations of how words interact to create meaning. We will discuss salient linguistic phenomena and successful computational models. We will also cover machine learning techniques relevant to natural language processing.

Course Page Syllabus

Text as Data: Quantitative Text Analysis for Political Science & Public Policy

Mining massive records of text for human knowledge and data understanding

In the study of politics, text of one kind or another is often essential to measuring important underlying concepts, e.g. policy sentiment, issue frames, or ideological positions. It also poses special challenges, notably to machines but also to researchers who simply cannot themselves read everything their research designs require and must therefore look to operationalise and extend their own understanding using quantitative tools. This course is about the kinds of statistical and computational tools that political scientists and policy analysts have found useful for treating text 'as data'.

Course Page Syllabus

Governance and Politics of Artificial Intelligence

Explore the way in which AI is transforming our economy, society and politics and how we can use it as a force for good

Innovations in artificial intelligence (AI) are transforming economies and societies globally, and with them politics. This course explores these transformations and the corresponding policy challenges. As a governance school, Hertie has a special responsibility to address these critical topics. Integrating perspectives from both natural and social sciences, this course will provide learning experiences that examine the impact of AI on humans and societies. We will explore the proliferation of algorithmic decision-making and autonomous systems; the issues of ethics, fairness, transparency and accountability raised by AI techniques such as machine learning; balances and interactions between regulation and innovation; the effects of AI on human rights and economic wellbeing; the global AI arms race; and the increasing oppressive capabilities of state and non-state actors. We consider both public and private strategies of regulation, and local, national, and transnational aspects of governance.

Syllabus

Decision Making and Data Science

How to make decisions under uncertainty and amid a proliferation of data

This course provides relevant insights into the day-to-day operations of data and policy practitioners in the field and how their choices affect the work of their organizations at large. The course begins with a formal introduction to decision theory and how to reason when faced with a multitude of choices, information and data sources, especially when the decision taken will have real-world consequences. The second part of the course will feature practitioners from government, industry and non-profit organizations, who will explain in detail how these principles and theories are applied in practice. Depending on availability, the course seeks to include a range of speakers to generate a lively discussion on how data-driven decision-making is implemented in different scenarios and professional settings.

Syllabus

Artificial Intelligence in Government

What AI means for the present and future of governments

Artificial intelligence and machine learning have dominated the headlines in the last few years, bringing many promises for transforming government work. Whether the aim is to gain efficiency in current processes, improve service delivery or transform decision making, there are many ways to utilise artificial intelligence technologies in a government context. What are these technologies and where can they be applied in a government context? What benefits can public sector organisations derive from deploying such technologies, and how can they go about embedding them to deliver tangible benefits? What are the key management challenges in implementing such technologies and how can they be addressed? This course aims to demystify these concepts by looking at government implementation experiences. We look beyond the hype and focus on the real challenges and opportunities of practical applications of AI for government organisations. We also consider challenges and opportunities arising from the ethical, fair, transparent, and accountable deployment of artificial intelligence, and will look at key factors for successful implementation such as data management and data sharing, public sector innovation, project management, change management, cross-sectoral collaboration and safe implementation.

Syllabus

Artificial Intelligence and Climate Change

How can AI address the climate change challenge?

Artificial intelligence (AI) and climate change are both topics at the top of the policy agenda that require a deep technological understanding of the problem space. It comes as no surprise that these two topics also affect one another in multifaceted ways. This course will explore the relationship between AI and climate change through a policy lens, and ask what policy-makers can do to align AI with climate change goals. Readings will provide students with insights into cutting-edge research using AI and machine learning (ML) to address climate change. The course will also cover how AI is deployed in ways that are detrimental to these goals, provide a perspective on systemic effects of AI-driven technologies and their impacts on social well-being, and discuss energy and resource consumption related to AI’s computational requirements. Together, we will explore technology assessment and design possible policy instruments. Students will learn how to navigate this intersection of two hot-button topics and provide informed and practical advice to policy makers.

Syllabus

Statistics II: Statistical Modeling & Causal Inference

Explore the science of causal analysis

This course continues the sequence in statistical modeling. Assuming prior knowledge of simple and multiple linear regression modelling, it introduces students to a new perspective on studying causes and effects in policy analysis. Based on a framework of causality, the course agenda covers various strategies to uncover causal relationships using statistical tools. We start by reflecting on causality and the ideal research design, and then learn to use a framework to evaluate the causal impact of policy interventions and other treatments. Then, we revisit common regression estimators of causal effects and learn about their limits. Next, we will focus on matching, instrumental variables, regression discontinuity designs, difference-in-differences and panel estimators, and techniques to explore heterogeneous effects. All classes divide time between theory and application.

Syllabus

Our Students

Check out our students' projects

Drug Screening and Family Welfare

This project discovered evidence of a causal link between implementation of a state drug screening questionnaire and a reduction in applications for public assistance across the United States.

Project Github
View report

The Key Drivers of Bicycle Accidents in Berlin

This project presents a machine learning approach to predicting injury severity and accident likelihood of bicycle accidents in Berlin.

See project page

COVID Fake News Detection

This project constructs a fake news detection algorithm using Natural Language Processing. The built model is then used to quantify the extent to which these types of machine learning models degrade over time.

See project page

Predicting Payment Method from Transaction Characteristics

This study runs explainable machine learning algorithms on the Kenya Financial Diaries, a high-frequency financial transaction dataset.

See project page

Automating photo-ID of Whales and Dolphins

This project presents a classical machine learning and an advanced deep learning approach to automate the photo-ID of whales and dolphins on the species and individual level.

See project page

PISA-Revisited

This project uses a machine learning framework to identify the strongest predictors of reading scores for boys and girls from the complete 2018 PISA dataset.

See project page

Data for Democracy: Can Machine Learning Predict Democratic Backsliding?

The project develops a forecasting model that predicts the risk of future regime transitions toward more autocratic systems of governance.

See project page

Predicting Drug Consumption Through Personality

The project presents a machine learning approach to predicting the consumption of five categories of drugs using personality traits.

See project page

High-resolution Traffic Accident Prediction

The project trained several machine learning models to predict the risk of occurrence of a traffic accident on a road segment for a given hour and day in Berlin.

See project page

Detecting Hate Speech on Twitter with Transformers

This project devises a method for detecting hate speech on Twitter using deep learning algorithms. To this end, it proposes using Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art method for the majority of NLP tasks.

See project page

Environmental Links to Health Outcomes

This project presents a machine learning approach to predicting maternal and infant health outcomes in Tanzania and Rwanda, using data on the individual's external and built environment.

See project page

Electricity Demand Forecasting

This project develops a deep learning approach to Electricity Demand Forecasting (EDF). By looking at the Romanian electricity market, the project demonstrates how moving from traditional econometric models to algorithms such as LSTM-based RNNs brings significant improvements in accuracy.

See project page

Face Mask Detection using ML Pixel Analysis

This project teaches you how to use ML classifiers for image classification using pixel analysis. Additionally, it gives you insight into how to evaluate ML classifier performance in terms of accuracy and speed.

See project page

Thesis Supervisor Recommender

This project employs a Latent Dirichlet Allocation (LDA) topic model to uncover the underlying topics in the collection of supervision plans at the Hertie School and match them with students' thesis proposals using a similarity score ranking.

See project page

News Summarisation with BERT

This project explores the potential of using BERT as the basis for a document-level encoder that can capture the meaning of sentences and generate representations for them, ultimately providing a reliable and accurate automated summarization process for news articles from different international outlets.

See project page

Predicting Policy Domains and Preferences with BERT and CNN

This project presents a deep learning approach to classifying labeled texts and phrases in party manifestos, using the coding scheme and documents from the Manifesto Project Corpus.

See project page

Contact

Contact Us

Location:

Friedrichstraße 180, 10117 Berlin

Open Hours:

Monday-Friday:
09:00 - 17:00

Call:

+49 30 2592190 (ext. 252)
