The Open Geospatial Data Ecosystem

This summer my first peer-reviewed article, “The Open Geospatial Data Ecosystem”, was published in “Kart og plan”. Unfortunately, the journal is not that digital, and they decided to withhold the issue from the web for a year, “in order to protect the printed version”. What?!

However, I was given a link to a PDF of my article and told I could distribute it. I take this as approval to publish the article on my blog, so that is exactly what I’ll do.

The full article can be downloaded here: http://docs.atlefren.net/ogde.pdf, and the abstract is provided here:

Open Governmental Data, Linked Open Data, Open Government, Volunteered Geographic Information, Participatory GIS, and Free and Open Source Software are all parts of The Open Geospatial Data Ecosystem. How do these data types shape what we define as Open Geospatial Data; Open Data of a geospatial nature? While all these areas are well described in the literature, there is a lack of a formal definition and exploration of the concept of Open Geospatial Data as a whole. A review of current research, case-studies, and real-world examples, such as OpenStreetMap, reveal some common features; governments are a large source of open data due to their historical role and as a result of political pressure on making data public, and the large role volunteers play both in collecting and managing open data and in developing open source tools. This article provides a common base for discussion. Open Geospatial data will be even more important as it matures and more governments and corporations release and use open data.

Project description in the bag

Things take time. Last autumn I was told that I had to submit a formal project description of my PhD project to the doctoral committee at IVT. I ended up submitting it on 18 April, and by 2 May it had been processed. A pleasant message, that one:

The doctoral committee approves the final project description for the PhD thesis of Atle Frenvik Sveen

But what is a project description? It is more or less self-explanatory, isn’t it? More specifically, it is a description of background, goals, scope (and limitations), method, ethical considerations, expected results, and a plan for the work. To me it seems a bit far-fetched to have to answer so much in detail before I am properly under way, but I do understand that some reflection is needed. I am not entirely sure whether this document counts as public material, but in any case I intend to excerpt the “most important” content here, to give an overview of what I am working on.

In the background section I go into what has been done previously, and talk about what makes this work relevant:

Geospatial Data has been created and managed since the first maps were made (Garfield, 2013). The impact of the digital revolution on this field has far-ranging consequences. A map is but one of several representations of the underlying digital data. The digitalization of the map-making process thus involves several shifts. One is the de-coupling of the printed map from the actual data, another is the fact that geospatial data can be used for more than printing maps.

Open Data is another consequence of digitalization. There is an increasing political pressure to make digital data produced and maintained by governments available to the public (Cox & Alemanno, 2003; Ginsberg, 2011; Yang & Kankanhalli, 2013). Political accountability, business opportunities, and a more general trend towards openness are all cited as reasons behind this movement (Huijboom & Broek, 2011; Janssen, Charalabidis, & Zuiderwijk, 2012; Sieber & Johnson, 2015). In practice this means that geospatial data from a range of sources are becoming available for everyone to use for whatever purpose they see fit.

A third trend is crowdsourcing, or Volunteered Geographic Information (VGI) (Goodchild, 2007). This concept bears some resemblance to Free and Open Source Software. The underlying concept is that amateurs collaborate on tasks such as writing online encyclopedias, writing computer software, or, as in the case of geospatial data, create a database of map data covering the world: OpenStreetMap (OSM) (Haklay & Weber, 2008).

What is lacking is a combined overview and a set of best practices. What characterizes a system built to handle an automated gathering of geospatial data published in a myriad of formats, with different metadata standards (or no metadata at all), with different update frequencies, and different licenses? A thorough investigation of these problems will enable a better understanding of what data is of interest, how it should be shared, and how the promised value of Open Geospatial Data can be extracted.
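To make that question slightly more concrete, here is a minimal Python sketch of the kind of record such a harvesting system might keep for each source. Everything in it is an illustrative assumption and not taken from the project description: the `DataSource` class, its field names, and the rule that a source with unknown update frequency is always re-fetched.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class DataSource:
    """Hypothetical catalog entry for one open geospatial dataset."""
    name: str
    url: str
    data_format: str                            # e.g. "GeoJSON", "Shapefile", "GML"
    license: Optional[str] = None               # None when no license is stated
    metadata_standard: Optional[str] = None     # None when there is no metadata
    update_interval_days: Optional[int] = None  # None when the frequency is unknown
    last_fetched: Optional[date] = None         # None when never fetched

    def is_due(self, today: date) -> bool:
        """Return True if the source should be fetched again."""
        if self.last_fetched is None or self.update_interval_days is None:
            # Never fetched, or unknown frequency: fetch to be on the safe side.
            return True
        return today - self.last_fetched >= timedelta(days=self.update_interval_days)


def sources_to_fetch(sources: List[DataSource], today: date) -> List[str]:
    """Names of the sources that are due for a new harvest."""
    return [s.name for s in sources if s.is_due(today)]
```

The optional fields are the point of the sketch: a missing license or missing metadata is recorded as such, so that it can later feed into an assessment of quality and fitness for use.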

I sum up the goals quite simply like this:

The overall objectives of this project are (1) to establish guidelines on how to store and manage geospatial data from disparate sources, with different structure and quality, and (2) to explore how this data can be utilized for value generation and decision support. The overarching theme of both objectives is how the Open Source mindset can be utilized.

As for expected results, this sums up most of it quite well:

There are two main results we hope to obtain from this project. The first is a better understanding of how geospatial data can be gathered from disparate sources and stored in an efficient manner that can be utilized. The other main result is to find new areas, products, and methods that can be carried out by using this data. Establishing systems for assessing quality and fitness for use of the data is also an important aspect.

If you are interested in reading the full project description, you can find it here: phd_prosjektbeskrivelse_atlefren.

Fastmail not taking security seriously?

About three years ago I figured I’d had enough Google-control of my online communication and was looking for an alternative email-provider. A friend of mine recommended Fastmail, which seemed like a good solution: Great web-interface, Android app, and the possibility of using an address from my own domain.

I signed up and have been using Fastmail since (with a redirect from my Gmail-address). The service has had some small issues (mainly the Android app being anything but “fast”), but overall I’ve been a happy customer.

Yesterday I figured out that I wanted to test 1Password, moving away from LastPass after the recent security issues. In this process I decided to use the “generate password” functionality in 1Password to set a new, strong password for my Fastmail account. Before I did that I made sure to set the “Account Recovery” email and phone number, so that if I made an error I would still be able to access my email.

And I was right. Indeed I made an error. I copied the generated password from 1Password and pasted it into the change password dialog on Fastmail. This logged me out, and then I managed to copy something else, removing the password from my clipboard. Then I managed to do something stupid in the 1Password app, and my generated, 30-character, completely random password was lost. I had managed to lock myself out of my email account! Stupid! But hey, I have a recovery email, right?

So I headed to the “Lost password” screen and typed in my Gmail address (to which I had received a confirmation mail from Fastmail 10 minutes earlier).

Then I got the message:

The existing email address you entered was not for an existing user, or was for an account that has been disabled. Please try again

What?! OK, after retrying 5-6 times I had to open a ticket and provide a lot of information to regain access through a manual process. In the ticket I wrote:

Thanks for the verification details.
I have now set your backup email address to:
*****@gmail.com

And I’m back in. Hooray! But I’m still wondering why the recovery email I entered did not work, so I’m asking:

Wasn’t my backup email set, or were there some problems regarding this feature? I am quite sure that I set my backup email yesterday.

The reply to this confused me:

Looks like the backup email address was not set. We then set it from our end and it worked for you. Please let me know if you need any further assistance.

After some back and forth I find out why:

Did you set this address from the Password & Security screen? If that is the case, you had set the “Recovery email address”. This is currently different from the backup email. Backup email can be set from the backend only.

And the password reset can be done using the backup email address only. The recovery process through recovery email address is not yet released into production. So I am afraid it will not work as of now.

What the actual, flying, fuck? The “Password & Security screen” is a frontend for some code that does not work? It presents itself as a way of setting a recovery mail, while it actually does nothing? The situation seems to have been like this for about 8 months, as this page from July 2016 clearly states:

Add your mobile phone number(s) and backup email address to the recovery options on the Password & Security screen. If you get locked out, we can use this to help verify your identity and restore access to your account.

I did express these concerns, and the reply I got was:

I really understand your frustration. I am sorry about that. I will pass your feedback to our supervisors.

We hope to implement the recovery procedure very soon.

But who knows? If they’ve been delaying this for 8 months now, I’m not confident that this will be fixed anytime soon, and I fear that the “Password & Security screen” will continue to be a non-functioning, misleading page that does nothing but confuse the users. If the information isn’t used, don’t give the user the impression that it will be. I can understand that not everything can be implemented at once, but have the balls to admit it, don’t lie to me. And about security issues? This is talentless!

So, to recap: The “Password & Security screen” of Fastmail is a sham. The information entered there is not used. In order to regain access to your account if you lose your password you have to have a “backup email”. This backup email is not the same as the “recovery email”. The backup email has to be set by Fastmail staff.

Geospatial anarchy

It’s not that long since I started my PhD, but it feels like more than a mere 2.5 months.

But, what have I been doing? Well, one thing is that I’m taking classes, so some time has been spent attending lectures and exams (had my first exam in 8 years today, strange feeling). I’ve also started my literature review, so I’ve done a lot of reading.

But, to not derail too much: the title of this blog post is “Geospatial Anarchy”, which was the title of a talk I gave at the Danish mapping conference “Kortdage” a week ago (see abstract here). The talk was in Norwegian (but understandable by Danes, I hope). There is not much point in sharing my slides, as they are kinda devoid of meaning without me talking.

But, even better, the conference also asked if I could write an article covering the topic of the talk. Given the rather short deadline I opted out of the peer-review process and submitted a non-reviewed article instead.

I’ll post the abstract here, and if you want to read the whole article it’s available here.

OpenStreetMap (OSM) is the largest and best-known example of geospatial data creation using Volunteered Geographic Information (VGI). A large group of non-specialists joins their efforts online to create an open, worldwide map of the world. The project differs from traditional management of geospatial data on several accounts: both the underlying technology (Open Source components) and the mindset (schema-less structures using tags and changesets). We review how traditional organizations are currently using the OSM technology to meet their needs and how the mindset of OSM could be employed to traditional management of spatial datasets as well.
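As a rough illustration of what “schema-less structures using tags and changesets” means, here is a minimal Python sketch. It is a simplified model of the idea, not the actual OSM data model or API: the `Node` and `Changeset` classes, their fields, and the `set_tag` helper are hypothetical names made up for this example. The tag keys `amenity` and `opening_hours` are ones commonly used in OSM.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Simplified, hypothetical stand-in for an OSM-style point feature."""
    node_id: int
    lat: float
    lon: float
    tags: Dict[str, str] = field(default_factory=dict)  # free-form key/value pairs
    version: int = 1


@dataclass
class Changeset:
    """A group of edits made together, with a comment describing them."""
    changeset_id: int
    comment: str
    edited_node_ids: List[int] = field(default_factory=list)

    def set_tag(self, node: Node, key: str, value: str) -> None:
        """Set a tag on a node, bump its version, and record the edit."""
        node.tags[key] = value
        node.version += 1
        self.edited_node_ids.append(node.node_id)


# No fixed schema: a new key can be added without altering any table definition.
cafe = Node(node_id=1, lat=63.4305, lon=10.3951, tags={"amenity": "cafe"})
edit = Changeset(changeset_id=100, comment="Add opening hours")
edit.set_tag(cafe, "opening_hours", "Mo-Fr 08:00-17:00")
```

The contrast with traditional management is that a new attribute here is just another key in `tags`, while the changeset keeps track of which features were edited together and why.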

A student again!

No, I don’t intend to open every post with an explanation of why it has been so long since the last one, but it really has been an awfully long time since I last wrote anything. Blogging may be a bit out of fashion, but I intend to take it up again. Why? Read on and find out.

So here I am at Gløshaugen again, 7 years after I handed in my master’s thesis and thought I was done. What happened? Was an error discovered that means I have to retake courses? Am I tired of life as a geomatician and about to start studying fluid mechanics?

No. Life has a way of presenting opportunities you can’t really say no to. For me it began almost 2 years ago, when I applied for a job at Norkart and ended up in a great community of like-minded map nerds (hi Alex). Still, that doesn’t quite explain why I am now a student again. The answer is not far away: this spring I got a phone call from Terje Midtbø asking whether I was interested in doing a doctorate. I answered “yes, but that is surely something that has to be cleared with Norkart”, to which Terje replied: “I have been in contact with Norkart, and they are positive to an industrial PhD!”

After a week of thinking (and consultations with B) I concluded that this is an opportunity I cannot pass up. The chance to immerse myself in a topic I find exciting, over a long period, while working for Norkart with the community that exists there, and getting paid? Bring it on!

But what, then, is the topic? A premise of an industrial PhD is that the topic must be relevant to the company as well as the candidate. After some thinking and consultations I (we?) landed on management and analysis of geographic datasets, with a focus on open datasets. The working title is “Geospatial data management in the future – Combining data management techniques from opensource communities and governmental bodies to enable cross-data-exploration and sensemaking”, which is no small matter.

The short form in Norwegian is “Håndtering av romlige datasett i fremtiden” (“Management of spatial datasets in the future”), and a popular-science presentation of the project is as follows.

Spatial data, or map data, is the data behind all the maps and location services we use every day. In addition, this is data the public sector uses in a range of areas of public administration: cadastral data (property data), management of nature conservation areas, and road construction are a few examples.
The PC revolution, and later the smartphone revolution, has increased both access to and use of map data. Using GPS receivers, you can easily find and share your own position. One result of this revolution is a Wikipedia-like service where everyone can add and edit map data for the whole world: OpenStreetMap.

Both the use and the production of spatial datasets have increased and will continue to increase almost exponentially, and there is a range of formats, selections of mapped objects, and large differences in the quality and accuracy of this data. This raises a number of interesting issues and questions: how can we best combine map data from a range of different sources? How can we efficiently store and update large amounts of map data? How can we bring together data from different sources and analyze them to find new relationships?

So, this is what I will devote 75% of my working day (+++++) to for the next 4 years. I hope to be able to make contributions that many find interesting, in the form of the required scientific articles, talks, code (open), and not least updates here on the blog. This probably means that most of the content going forward will be quite narrow, but then this blog has never been very broad..

In practical terms I have Terje Midtbø as main supervisor and Alexander Nossum as co-supervisor, and I have office space both at NTNU (Lerkendalsbygget) and at Norkart’s Trondheim office. I have no teaching duties, but I probably won’t manage to stay out of student projects anyway.

And yes, I am very much looking forward to getting started. Formally I start on 1 September, but things take time, and I am now trying to get into the literature and refine my project. Follow along here, or meet me over a beer for more details!