Improving Accessibility, Harmonisation and Data Linkage in Europe

Event review

On the 15th of February 2024 (14:00 – 16:00 CET), we hosted our 1st Scientific Online Seminar, entitled “Improving Accessibility, Harmonisation and Data Linkage in Europe”. The objective of the event was to share methodological advances and perspectives being developed in the framework of the project, while also engaging in dialogue with leading experts and learning from their practical experience.

Follow up

How policy and science can work towards a rich data landscape in Europe: policy insight from Mapineq project coordinator Jani Erola.

Summary of Content

Our 1st Mapineq Scientific Seminar opened up a space to discuss a variety of topics, examples and recommendations on developments related to administrative register data for the social sciences. Below, a summary of the discussions is presented, illustrated with quotes from the speakers.

Accessibility: How did we get here?

The history of availability and access to anonymised administrative register data for scientific purposes in Europe varies considerably from country to country. In the Nordic countries, it has been possible to access and use individual data covering the whole population of a country since the mid-1980s or 1990s, depending on the country. In France, opening up access has been a slow but successful process, while in the Netherlands it has been rather fast and efficient. In other countries, open access has only just begun or is not yet in place. While there are undeniable differences in pace and in the legal complexity of opening up secure access to this type of data, there is a general consensus among experts that the overall story is one of progress and expansion in the availability and use of data.

Speakers repeatedly mentioned the importance of concrete developments to make this happen: 

First, building relationships and trust with data providers and national data protection authorities is essential. Security is always a core element in discussions about providing access to data, and stakeholder dialogue about the importance of access for scientific use of data often helps in the process of providing secure access. In addition, promoting a culture of data sharing and educating the public about what opening up secure data access actually means for citizens is recommended by experts in many countries as a way to facilitate future progress.

Second, collaboration between stakeholders is essential for many reasons, including promoting changes in legislation to enable secure access. In the words of Roxane Silberman: 

“We had to change legal frameworks. This is something that is still ongoing and has demanded a lot of effort and collaboration between researchers and national authorities. We had to change different regulations, mostly related to statistical and privacy protection law. We also had the chance to specifically focus on administrative data in the digital law very recently and the possibility to request access for linkage with administrative data for research purposes. Security services providing services were at the core of these developments.”

Roxane Silberman, Scientific Advisor at the Secure Data Access Center (CASD) and Elected Chair of the European Statistical Advisory Committee (ESAC)

Harmonisation and Data Linkage in Europe: Remarkable achievements in a short span

Examples of rapid and substantial progress in harmonisation and data linkage in Europe were widely shared at the seminar. In the context of the Mapineq project, register data are being complemented by other sources, such as survey data and contextual information (e.g. policies and environmental indicators such as air pollution), to produce unique research results (for examples, see the Mapineq publications list). In addition, the Mapineq geo-linked inequality database – to be launched in Autumn 2024 – will include institutional and policy measures and a variety of environmental, demographic and socio-economic indicators in a single dashboard, with the possibility to link them with micro-level datasets.

Siri Eldevik Håberg presented fascinating research from Norway, where linked data from the population register, medical register, education-related individual data, tax-related register, and others were the main data input. Rodosthenis Rodosthenous presented work being done at FinnGen, where experts are collecting and analysing genome and health data from 500,000 Finnish biobank participants. Tina Hinz, in turn, introduced a new data service created for research purposes by the Research Data Centre of the German Federal Office for Migration and Refugees.

Tom Emery introduced ODISSEI, the Dutch Open Data Infrastructure for Social Science and Economic Innovations. With 45 member organisations – including all social science faculties and economics faculties in the Netherlands – ODISSEI provides access to large, longitudinal data collections that can be linked to register data from Statistics Netherlands (CBS) in a secure environment. This virtual network enables researchers to answer new, interdisciplinary research questions and to investigate existing questions in new ways.

Finally, a common concern of the participants is the documentation of datasets. It is often the case that data providers only share documentation on the current state of datasets and do not include information on how the data collection or data structure has changed over time. In addition, privacy documentation is not always available.

Challenges in ethics

The seminar agenda also included presentations on ethics in data collection and data security, which is a fundamental issue in the Mapineq project. First, Daunia Pavone shared some thoughts based on decades of experience working in the humanitarian sector. She suggested starting the approach to ethical data management with a simple question: “While we’re trying to do something good, how do we avoid doing something bad?”. This applies to all stages, from data collection to data storage, analysis, sharing and use. She also highlighted the importance of Data Responsibility in Humanitarian Action, a concept shared by humanitarian organisations that refers to the safe, ethical and effective management of personal and non-personal data for operational response.

Another shared concern among participants was how to effectively build citizens’ trust in data collection and sharing processes. Essentially, it’s about fostering a data-sharing environment where individuals trust the data providers, are knowledgeable enough to make informed decisions, and can critically analyse the consequences of sharing personal information.

The future is about investing in cooperation

Participants stressed the importance of ongoing dialogue and the creation of networks to share knowledge and best practices. As Jani Erola concluded: 

“As technological developments and knowledge accumulation is happening faster and faster, innovation is becoming much harder than it used to be. So, it makes a lot of sense to join forces to innovate. You might get more out of it. And even if you only collaborate occasionally, it is more fun than working on your own. And I think it is good to remember that the original idea of academia – and I think it is still the idea of academia – is the idea that we are a global community of scientists. We need to embrace that principle.”

Jani Erola, Mapineq Project Coordinator

Watch the recording of the plenary session now

Structure and Speakers

The Seminar was divided into three parts: first, a plenary session took place. Speakers included: 

  • Roxane Silberman, Scientific Advisor at the Secure Data Access Center (CASD) and Elected Chair of the European Statistical Advisory Committee (ESAC), one of the governance bodies of the European Statistical System and Eurostat. 
  • Tom Emery, Director of ODISSEI, the Dutch National Infrastructure for Social Science and Associate Professor in the Department of Public Administration and Sociology of Erasmus University Rotterdam. 
  • Siri Eldevik Håberg, Director of the Centre for Fertility and Health, a Centre of Excellence (SFF) funded by The Norwegian Research Council.
  • Daunia Pavone, Senior Data and Analysis Quality Advocate at the International Organization for Migration.
  • Jani Erola, Professor of Sociology at the University of Turku, Director of the INVEST Research Flagship Center and Project Coordinator of Mapineq.

Moderator: Daniela Vono de Vilhena, Population Europe / Mapineq.

The second part of the event was devoted to the presentation of case studies and best practices on four different topics. This was possible thanks to the use of breakout rooms. Below, the titles and speakers of the group discussions are presented:

1- Creating comparable datasets 

  • Markus Jäntti, Professor of Economics at the Swedish Institute for Social Research at Stockholm University and Work-Package Leader at the Mapineq Project. 
  • Domantas Jasilionis, Researcher at the Max Planck Institute of Demographic Research and Member of the Human Mortality Database Executive Board. 

Moderator: Daniela Vono de Vilhena, Population Europe / Mapineq.

2- Linking administrative and other types of data

  • Aki Koivula, Senior Researcher at the INVEST Research Flagship Centre, University of Turku.
  • Rodosthenis Rodosthenous, Research Coordinator at the Institute for Molecular Medicine in Finland and Head of the Sample and Data Logistics team at the FinnGen research project.

Moderator: Elina Kilpi-Jakonen, University of Turku / Mapineq.

3- Improving data accessibility

  • Jan Paul Heisig, Head of the Research Group Health and Social Inequality at WZB Berlin Social Science Center, Professor of Sociology at the Freie Universität Berlin and Work-Package Co-Leader at the Mapineq Project. 
  • Tina Hinz, Researcher at the division “Research Data Centre (FDZ)” at the German Federal Office for Migration and Refugees. 

Moderator: Peter Weissenburger, Population Europe.

4- Challenges in ethics, re-identification and data security

  • Melinda Mills, Professor of Demography and Population Health at Oxford Population Health and Nuffield College, and Director of the Leverhulme Centre for Demographic Science and Demographic Science Unit at the University of Oxford. She is also Professor of Data Science and Public Health Policy at the University of Groningen and Department of Genetics, UMCG, the Netherlands and Work-Package Leader at the Mapineq Project.
  • Andrea Ganna, Associate Professor in Health Data Science at the Helsinki Institute of Life Science (HiLIFE) and Research Associate at Harvard Medical School and Massachusetts General Hospital. 

Moderator: Andreas Edel, Population Europe.

Finally, the third part of the meeting consisted of a summary of the breakout room discussions for all participants, followed by a wrap-up of the event by Mapineq’s Project Coordinator, Professor Jani Erola.

This article is based on this post, first published on Population Europe.