About open data
Open data, according to the Open Definition, are:
“data anyone can access, use, modify and share, for any purpose”.
They are a very important subset of the wide domain of public sector information, whose reuse is promoted by a European directive (transposed in Portugal by Law no. 26/2016 of 22 August).
The open data movement, an integral part of Open Government policies, combines the principles of transparency, participation and collaboration with the economic development potential brought by digital technology.
The large volume of data generated and centralized by the Public Administration holds, in itself, great potential for use and development, which may be valuable to the State, to civil society and to the business world.
Most of these data are already deemed public by law. The great challenge (and the main concern of open data initiatives such as dados.gov) is to facilitate their access and reuse, benefiting several groups and sectors of society:
- citizens, who may now access more directly the information that belongs to them, reinforcing transparency and the State's accountability to voters;
- government institutions, which become more transparent and can become more efficient and effective, also reinforcing their public service role and their access to data from other bodies;
- the business sector, which may reuse public information to create applications, platforms or services with high commercial potential;
- and many other sectors, such as journalism, university research or even non-profit organizations with civic concerns.
Meeting this challenge means providing data in machine-readable form, through open formats and tools, so that any citizen or entity can reuse, transform or integrate them. Such data are usually provided as datasets.
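To illustrate what machine-readable data in an open format allows, here is a minimal sketch in Python. It assumes a small CSV file of public service desks; the sample rows, the column names and the helper functions `load_records` and `count_by` are hypothetical and exist only for this example.

```python
import csv
import io

# Hypothetical sample: a tiny dataset of public service desks in CSV form.
SAMPLE_CSV = """name,municipality,opening_hours
Citizen Desk A,Lisboa,09:00-17:00
Citizen Desk B,Porto,08:30-16:30
"""

def load_records(csv_text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def count_by(records, field):
    """Count how many records share each value of the given field."""
    counts = {}
    for record in records:
        counts[record[field]] = counts.get(record[field], 0) + 1
    return counts

records = load_records(SAMPLE_CSV)
print(count_by(records, "municipality"))
```

Because CSV is an open format, the same few lines work in any language with a CSV parser, with no proprietary software needed to read the data.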
Open datasets
A collection of data, published or curated by a single agent, and available for access or download in one or more formats.
--W3C Data Catalog Vocabulary (DCAT)
Datasets, within the scope of public open data, are sets of data in digital format, focused on a specific topic. Examples include a list of addresses of public services, or the monthly data on the management or contract award practices of a public body.
Data with no reuse restrictions
One of the crucial aspects of open data is the idea that any person or entity may use, transform or adapt public data, including to monetize them in a business context.
When we talk about public open data, we may consider data regarding: public contract awards, management of public bodies, economic and financial statistics, public expenses and revenues, electoral results, georeferencing of addresses and public services, transport timetables and service quality indicators, among many others.
Much of this information may already be publicly available, but for a specific government dataset to be classified as Open, there can be no restrictions on its access, whether legal, political, technological or financial.
Open data are therefore distinct from data that are merely made available to the public. Open data are always covered by open licenses, which allow commercial reuse. Otherwise, they cannot be deemed open data.
Privacy and open data
Not all public sector information should become public. Many datasets held by the Public Administration must remain under restricted access, for security reasons, legal reasons or citizens’ right to privacy. Open data policies do not require all State information to be “opened”, only that which may be deemed public.
Decisions on opening data from specific sectors should be made by the entities that manage them, in coordination with bodies such as the National Data Protection Committee.
When we speak of open data, we mainly mean government data that are, or should be, available to society and thus have the potential to become open, ensuring their reuse in new projects.
How to identify datasets for publication?
If a body does not have several well-delimited, structured datasets that can be opened immediately, it may start by opening only a subset of its data.
A good starting point is information that is already available to the public, either as website content or on request to the organization, provided as less “processed” datasets. Here, the advantage of providing raw data in an open, machine-readable format is clear: it favors free reuse.
Overall, data re-users especially appreciate datasets that address concerns such as:
- Access to the entities, available services and service desks;
- Transparency in the accounts, activities and resources of the entity;
- Monitoring of a public interest topic, sector or area.
It is also very important to establish a commitment to maintaining and updating the data. To that end, and even though the initial effort is slightly higher, the use of automated mechanisms (e.g. an API) is recommended, in order to guarantee that the process is carried out regularly.
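As a rough illustration of such an automated update, the sketch below checks whether a published dataset is older than an agreed interval and, if so, prepares an HTTP request carrying fresh records. Everything specific here is an assumption made for the example: the endpoint URL, the `X-API-KEY` header name, the JSON payload shape and the 30-day interval are hypothetical and do not describe the API of any particular portal.

```python
import json
import urllib.request
from datetime import date, timedelta

# Hypothetical values: replace with the portal's real endpoint and your own key.
API_URL = "https://example.org/api/datasets/service-desks/resources/"
API_KEY = "YOUR-API-KEY"

def is_stale(last_updated, today, max_age_days=30):
    """Return True when the published data is older than the agreed interval."""
    return today - last_updated > timedelta(days=max_age_days)

def build_update_request(records, api_url=API_URL, api_key=API_KEY):
    """Prepare (but do not send) an HTTP POST carrying the fresh records as JSON."""
    body = json.dumps({"records": records}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
    )

if is_stale(date(2024, 1, 1), date(2024, 3, 1)):
    request = build_update_request([{"name": "Citizen Desk A"}])
    # urllib.request.urlopen(request) would send it; omitted so the sketch stays offline.
    print(request.get_method(), request.full_url)
```

Run from a scheduler (a cron job, for instance), a script along these lines would keep the update commitment without manual intervention.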
Data with geographical information
For the data to be opened, special attention and effort should go into georeferencing the information, as this significantly increases the interest and reuse potential of open data. By enriching a dataset with geographical references, the entity makes it possible to develop applications that display the data on a map. Such applications are highly appreciated by end users and by the developer community, as they let people navigate the data in a user-friendly and pleasant way and encourage the creation of apps for mobile devices.
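One common way to publish georeferenced data is GeoJSON, an open format that most mapping tools can read. The following is a minimal sketch, assuming each record already carries `lat` and `lon` fields; the field names, the `to_geojson` helper and the sample record (with approximate, illustrative coordinates) are hypothetical.

```python
import json

def to_geojson(records):
    """Convert records with 'lat' and 'lon' fields into a GeoJSON FeatureCollection."""
    features = []
    for record in records:
        # Everything except the coordinates becomes a descriptive property.
        properties = {k: v for k, v in record.items() if k not in ("lat", "lon")}
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [record["lon"], record["lat"]]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}

# Hypothetical sample record; the coordinates are illustrative only.
desks = [{"name": "Citizen Desk A", "lat": 38.7223, "lon": -9.1393}]
print(json.dumps(to_geojson(desks), indent=2))
```

The resulting file can be loaded directly by most web and mobile mapping libraries, which is what makes map-based applications cheap to build on top of the dataset.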
Interacting with the re-users
Interaction with the community of re-users may provide valuable leads on which data to provide.
The body may carry out any type of public consultation and ask its most frequent contacts or other stakeholders which data they would like to access.
AMA may also help in this process, including by collaborating in the organization of workshops or events aimed at promoting these interactions. Contact us at dados@ama.pt.