SoBigData strives to formulate a vision of Responsible Data Science for the social sciences and associated innovations. SoBigData facilitates the use of big data for research and provides access to various data resources. Such resources may include, but are not limited to, user mobility data, web page data, social media and transaction records. Some digital records of personal activities may contain sensitive information, and some datasets, such as Flickr or Twitter data, may contain content protected by Intellectual Property Rights (IPR). SoBigData promotes and adopts the legally and ethically grounded collection, management and analysis of this data. It therefore applies policies for collecting and analysing large-scale datasets ethically by design, implements privacy-enhancing tools, and works towards ensuring that its analyses are just, fair and non-discriminatory.

Legal and Ethical framework

The legal and ethical framework aims to make ethical and legal aspects fundamental elements of the social mining experiments performed within the RI. To this end, the following functionalities are supported:

  1. SoBigData offers end users functionalities that inform them when they are handling personal data:
    • The e-infrastructure provides guidelines for recording the relevant legal aspects in dataset descriptions by means of specific metadata.
    • The legal requirements associated with a dataset determine the level of sharing at which the data can be accessed.
    • Each dataset is assigned responsible scientists, and all virtual and on-site access is monitored and, when needed, made accountable.
    • Security measures enforce the required level of sharing. These include access restrictions on IT systems and storage, in compliance with national and EU legislation.
  2. SoBigData provides a framework for assessing the privacy risks inherent in a dataset that is to be shared. For a given dataset, the framework makes it possible to explore the scenarios that arise from combining three intertwined dimensions: privacy risks, data quality and remedies (e.g. sanitization).
  3. SoBigData will offer end users short tutorials on ethics and the social dilemmas surrounding data, with the aim of strengthening the sense of responsibility of all SoBigData data scientists.
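
As an illustration only, the interplay between legal metadata, privacy risk and level of sharing described above could be sketched as follows. This is not the SoBigData framework or its API: the names DatasetMetadata, SharingLevel, uniqueness_risk and suggest_sharing_level, the three sharing levels and the 0.2 threshold are all hypothetical. The risk measure used here, the fraction of records that are unique on a set of quasi-identifiers, is just one simple proxy for re-identification risk.

```python
from collections import Counter
from dataclasses import dataclass
from enum import IntEnum


class SharingLevel(IntEnum):
    """Hypothetical sharing levels, from least to most restrictive."""
    OPEN = 0
    REGISTERED = 1
    ON_SITE_ONLY = 2


@dataclass
class DatasetMetadata:
    """Hypothetical legal metadata attached to a dataset description."""
    name: str
    contains_personal_data: bool
    ipr_terms: str
    responsible_scientist: str


def uniqueness_risk(records, quasi_identifiers):
    """Return the fraction of records whose quasi-identifier values are unique."""
    if not records:
        return 0.0
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


def suggest_sharing_level(meta, risk, threshold=0.2):
    """Map metadata and a risk score to a sharing level (assumed policy)."""
    if not meta.contains_personal_data:
        return SharingLevel.OPEN
    if risk > threshold:
        return SharingLevel.ON_SITE_ONLY
    return SharingLevel.REGISTERED


records = [
    {"zip": "56100", "age": 30},
    {"zip": "56100", "age": 30},
    {"zip": "56127", "age": 41},
    {"zip": "50100", "age": 25},
]
meta = DatasetMetadata("mobility-sample", True, "CC BY-SA", "J. Doe")

risk = uniqueness_risk(records, ["zip", "age"])
level = suggest_sharing_level(meta, risk)
print(risk, level.name)
```

In this toy dataset two of the four records are unique on (zip, age), so the assumed policy suggests on-site access only. A remedy such as generalizing or dropping the age attribute would lower the risk score at the cost of data quality, which is the trade-off among the three dimensions named in point 2.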

Privacy-by-design and Value-Sensitive Design in social mining

SoBigData develops big data analytics and social mining tools following the privacy-by-design methodology, which requires that privacy requirements be built into the design of the analytical process from the start.

Moreover, value-sensitive design principles are applied throughout the research infrastructure and, where possible, privacy-preserving techniques are integrated at the technical level. Researchers in SoBigData are encouraged to make value choices explicit wherever they occur. SoBigData supports researchers in applying ethical standards through (1) workflows that include privacy impact assessments at the earliest possible stage, (2) a knowledge base of best practices, and (3) an overarching responsibility architecture that ensures the use of these best practices becomes standard practice.

Intellectual Property Rights

The SoBigData RI provides access to some datasets that may be protected by Intellectual Property Rights (IPR), most typically copyright. In particular, this may apply to social media content, such as pictures or posts from Facebook, Twitter or Flickr. The use of copyrighted material for research, for example through modification, replication or re-publication, may be subject to limitations set by the author (e.g., a Flickr photo released under a Creative Commons (CC) share-alike license) or by the platform terms (e.g., the Twitter terms). The type of IP, the right holder and the terms of use associated with SoBigData datasets are indicated in the metadata. Researchers are made aware of third-party terms, and agree to comply with them when using SoBigData datasets, by accepting the SoBigData RI terms of use.

Eminent experts prepared to serve on the Ethics Board

  • Prof. Helen Nissenbaum of New York University. Professor Nissenbaum is an international expert in the field of ethics and IT, and has gained wide international recognition for her work on privacy, big data, ethics of algorithms, and Values and Design. 
  • Prof. Dag Elgesem of The University of Bergen, Information Sciences and Media Studies. Prof. Elgesem has a background in logic and analytical philosophy and has published widely on Ethics and IT, and Internet research ethics. Professor Elgesem is leading a large research initiative (SAMKUL) in Norway on the foundations of digital societies.
  • Jeroen Terstegge is a legal expert and independent privacy consultant with decades of experience in data protection and privacy in Europe. He was a senior staff member of the Dutch Data Protection Office and Privacy Officer of Philips. He is also a member of a Privacy and Big Data Committee established in 2015 by the Ministry of Economic Affairs in The Netherlands.

Two members of the SoBigData consortium, Prof. Jeroen van den Hoven, Delft (Chair), and Prof. Nikolaus Forgó, Hannover (Vice Chair), interact with these external members when and where appropriate. Prof. van den Hoven collects feedback from the Board on the SoBigData ethical and legal framework, to be integrated into the draft text of the relevant deliverables.