Category Archives: English

The SOSI-format: The crazy, Norwegian, geospatial file format

Imagine trying to coordinate the exchange of geospatial data long before the birth of the Shapefile, before XML and JSON was thought of. Around the time when “microcomputers” was _really_ new, and mainframes was the norm. Before I was born.

Despite this (rather sorry) state of affairs, you realize that the “growing use of digital methods used in the production and use of geospatial data raises several coordiantion-issues” [my translation]. In addition, “there is an expressed wish from both software companies and users of geospatial data that new developments does not lead to a chaos of digital information that cannot be used without in-depth knowledge and large investments in software” [my translation].

Pretty forward-thinking if you ask me. Who was thinking about this in 1980? Turns out that two Norwegians, Stein W. Bie and Einar Stormark, did this in 1980, by writing this report.

This report is fantastic. It’s the first hint of a format that Norwegians working with geospatial data (and few others) still has to relate to today. The format, known as the “SOSI-Format” (not to be confused with the SOSI Standard) is a plaintext format for representing points, lines, areas and a bunch of other shapes, in addition to attribute data.

My reaction when I first encountered this format some 8 years ago was “what the hell is this?”, and I started on a crusade to get rid of the format (“there surely are better formats”). But I was hit by a lot of resistance. Partly because I confused the format with the standard, partly because I was young and did not know my history, partly because the format is still in widespread use, and partly because the format is in many ways really cool!

So, I started reading up on the format a bit (and made a parser for it in JavaScript, sosi.js). One thing that struck me was that a lot of things I’ve seen popping up lately has been in the SOSI-format for ages. Shared borders (as in TopoJSON) Check! Local origins (to save space) Check! Complex geometries (like arcs etc) Check!

But, what is it like? It’s a file written in what’s referred to as “dot-notation” (take a look at this file and you’ll understand why). The format was inspired by the british/canadion format FILEMATCH and a french database-system called SIGMI (anyone?).

The format is, as stated, text (i.e. ASCII) based, with the reason was that this ensured that data could be stored and transferred on a wide range of media. At the time of writing the report, there existed FORTRAN-implementations (for both Nord-10/S and UNIVAC 1100) for reading and writing. Nowadays, there exists several closed-source readers and writes for the format (implemented by several Norwegian GIS vendors), in addition to several Open Source readers.

The format is slated for replacement by some GML-variation, but we are still waiting. There is also GDAL/OGR support for the format, courtesy of Statens Kartverk. However, this requires a lot of hoop-jumping and make-magic on Linux. In addition, the current implementation does not work with utf-8, which is a major drawback as most .SOS-files these days are in fact utf-8.

So, there we are. The official Norwegian format for exchange of geographic information in 2018 is a nearly 40 year old plain text format. And the crazy thing is that this Norwegian oddity is actually something other countries are envious about, as we actually have a common, open (!), standard for this purpose, not some de-facto reverse-engineered binary format.

And, why indeed, why should the age of a format be an issue, as long as it works?

Fastmail not taking security seriously?

About three years ago I figured I’d had enough Google-control of my online communication and was looking for an alternative email-provider. A friend of mine recommended Fastmail, which seemed like a good solution: Great web-interface, Android app, and the possibility of using an address from my own domain.

I signed up and have been using Fastmail since (with a redirect from my Gmail-address). The service has had some small issues (mainly the Android app being anything but “fast”), but overall I’ve been a happy customer.

Yesterday I figured out that I wanted to test 1password, moving away from LastPass after the recent security issues. In this process I decided to use the “generate password” functionality in 1password to set a new, strong password for my Fastmail account. Before I did that I made sure to set the “Account Recovery” email and phone number, so that if I made en error I would still be able to access my email.

And I was right. Indeed I made an error. I copied the generated password from 1password and pasted it into the change password dialog on fastmail. This logged me out, and then I managed to copy something else, removing the password from my clipboard. Then I managed to do something stupid in the 1password app, and my generated, 30-character, completely random, password was lost. I had managed to lock myself out of my email-account! Stupid! But hey, I have a recovery-email, right?

So I headed to the “Lost password screen” and typed in my gmail.address (to which I 10 minutes before had recieved a confirmation mail from fastmail).

Then I got the message:

The existing email address you entered was not for an existing user, or was for an account that has been disabled. Please try again

What?! Ok, after re-trying 5-6 times i had to open a ticket and provide a lot of information to regain-access by a manual process. In the ticket I wrote:

Thanks for the verification details.
I have now set your backup email address to:
*****@gmail.com

And I’m back in. Hooray! But I’m still wondering why the recovery email I entered did not work, so I’m asking:

Wasn’t my backup email set, or was there some problems regarding this feature? I am quite sure that I set my backup email yesterday.

The reply to this confused me:

Looks like the backup email address was not set. We then set it from our end and it worked for you. Please let me know if you need any further assistance.

After some back and forth I find out why:

Did you set this address from the Password & Security screen? If that is the case, you had set the “Recovery email address”. This is currently different from the backup email. Backup email can be set from the backend only.

And the password reset can be done using the backup email address only. The recovery process through recovery email address is not yet released into production. So I am afraid it will not work as of now.

What the actual, flying, fuck? The “Password & Security screen” is a frontend for some code that does not work? It presents itself as a way of setting a recovery mail, while it actually does nothing? The situation seems to have been like this for about 8 months, as this page from july 2016 clearly states:

Add your mobile phone number(s) and backup email address to the recovery options on the Password & Security screen. If you get locked out, we can use this to help verify your identity and restore access to your account.

I did express these concerns, and the reply I got was:

I really understand your frustration. I am sorry about that. I will pass your feedback to our supervisors.

We hope to implement the recovery procedure very soon.

But who knows? If they’ve been delaying this for 8 months now, I’m not confident that this will be fixed anytime soon, and that the “Password & Security screen” will continue to be a non-functioning, misleading page that does nothing but confuse the users. If the information isn’t used, don’t give the user the impression that it will. I can understand that not everything can be implemented at once, but have the balls to admit it, don’t lie to me. And about security issues? This is talentless!

So, to recap: The “Password & Security screen” of Fastmail is a sham. The information used there is not used. In order to regain access to your account if if loose your password you have to have a “backup email”. This backup email is not the same as the “recovery email”. The backup email has to be set by Fastmail staff.

Geospatial anarchy

It’s not that long ago since I started my PhD, but it feels like more time than a mere 2.5 months since.

But, what have I been doing? Well, one thing is that I’m taking classes, so some time has been spent attending lectures and examns (had my first examn in 8 years today, strange feeling). I’ve also started my literature review, so I’ve done a lot of reading.

But, to not derail too much: the title of this blog post is “Geospatial Anarchy”, which was the title of a talk I gave at the danish mapping conference “Kortdage” a week ago (see abstract here). The talk was in Norwegian (but understandable by danes, I hope). There is not much of a point in sharing my slides, as they are kinda devoid of meaning without me talking.

But, even better, the conference also asked if I could write an article covering the topic of the talk. Given the rather short deadline I opted out of the peer-review-process, but submitted a non-reviewed article.

I’ll post the abstract here, and if you want to read the whole article it’s available here.

OpenStreetMap (OSM) is the largest and best-known example of geospatial data creation using Volunteered Geographic Information (VGI). A large group of non-specialists joins their efforts online to create an open, worldwide map of the world. The project differs from traditional management of geospatial data on several accounts: both the underlying technology (Open Source components) and the mindset (schema-less structures using tags and changesets). We review how traditional organizations are currently using the OSM technology to meet their needs and how the mindset of OSM could be employed to traditional management of spatial datasets as well.

Testing Geospatial claims using Qgis, CartoDB and Cesium.js

andersnatten
This summer I hiked to Andersnatten, a rather small mountain in the southeast of Norway. On the start of the trail there was a sign, that among other things, said that you can see 7 parishes from the top. When we reached the top we had a great view, but I couldn’t see any parishes.. That is, I have no idea where the parishes are, so I couldn’t refute or confirm the claim.

Beeing a geospatial geek I thought that this should be possible to remedy. I just needed the Parishes as polygons, and then I could do some analysis. Well, turns out that I couldn’t find any georeferenced parishes. The closes I got was these scanned paper maps. I couldn’t let this stop me, so I opened Qgis and set to work. Luckily the parish borders resembles modern day municipality borders rather close, so with the georeferenced paper maps, and some other Qgis magic (perhaps more on this in a later blog post) I managed to digitize all the 315 parishes from 1801.

I then loaded these digitized parish polygons into cartodb, and colored the ones around Sigdal, the parish where Andersnatten is. The map turned out like this:

With this in place it was rather certain that 7 was a reasonable amount, there are 6 parishes sharing a border with Sigdal, these should be see-able. According to this article “Dust, water vapour and pollution in the air will rarely let you see more than 20 kilometres (12 miles), even on a clear day.”

Ok, 20 km, lets see how close the 7 nearest parishes are to Andersnatten, using a PostGIS query in Cartodb:

SELECT 
	name,
	ST_Distance(
      the_geom::geography,
      ST_SetSRID(
        ST_MakePoint(9.41677103, 60.11744509),
        4326
      )::geography) / 1000 as dist
FROM prestegjeld
ORDER BY dist
LIMIT 7

This gives

Sigdal      0
Rolloug     5.818130882558
Flesberg    13.510679336426
Modum       19.686924594812
Næss        20.103683955646
Nordrehoug  22.199498534301
Tind        24.804343832136

Ok, so the farthest parish of the seven are 24 kms away, give us some leeway since we are on a 733 m high peak. Interestingly, the closes ones doesn’t overlap with neighboring parishes.

Ok, but, line of sight? What if there are other mountains blocking the view? Since I’m already working on a Cesium.js project I decided to add the CartoDB map to a 3d Model and do some visual tests.

View North
View North

View Southeast
View Southeast

View South
View South

View West
View West

Oh, that’s a suprise. Ok, Cesium does not add “dust, water vapour or air pollution”, and the height model might be a bit off, but nonetheless: 13 (possibly 14) parishes can be seen in this model! That is double the number stated on the sign! Guess they have backing for their claim after all!

Oh, by the way: the digitized parishes are available at GitHub

GIS Programming: languages breakdown

Yesterday i found this post on geoawesomeness, with the intriguing headline “Learning GIS programming: An overview”. After reading it I felt a bit disappointed though. It was basically a breakdown of different programming languages and their usages in the GIS field. While this in itself is a good thing, I think it left a great deal out and confused some. Then I thought “well, instead of complaining, do it better yourself!”.

So here we go, my breakdown of some selected programming languages, their usages in the GIS field, along with notable examples and libraries. I’m ordering the languages roughly by age. Bear in mind that I was born in the 80ies, so your favourite language from before 2000 might not make it to the list if it’s not around anymore.

Fortran
This may be the exception to the previous statement, but Fortran is still around, I’ve even programmed in Fortran.

Fortran is an imperative language, the first compiled language, dated back to 1957. It’s still used today in numerical computations, but in the GIS field it’s largely legacy code that still is in Fortran. The only example I can think of is a set of geodesic functions we used at the university: Holsen’s småprogrammer.

Unless you know Fortran by heart and like working with legacy code you can safely ignore this language.

C/C++
C and C++ is actually two different languages, or: C++ is a superset of C with object oriented capabilities, while C is an imperative language. They date from 1970/1980, and since I don’t really know these languages I’ll treat them like the same. My impression is that they are rather “down to the metal”, you have pointers, memory management and stuff like that.

Unlike Fortran, C/C++ is still in widespread use, in the GIS-field it’s being used for several desktop applications of some age, as well as in what I’ll call the “first wave” of open source libraries and utilities. Notable mentions are PostGIS, OGR/GDAL, PROJ.4 and Mapserver.

While you may now know C/C++ and never really write a line of code in it, you will be using some tools written in it, either as a database, through the command line or through language bindings.

Java
Java is an object-oriented, multi-purpose language from 1995. It was originally developed by SUN. It’s become known for it’s rather “enterprisy” libraries, with several layers of abstractions and other strange things. Despite this, the language has gained widespread use, although it’s prime time may be past, although Java is the programming language for Android apps.

Java libraries was the “second wave” of open source GIS, and brought us libraries and tools like GeoServer, GeoTools, JTS and GeoWebCache.

Just because of GeoServer I think you should know some Java to get along as a GIS-developer. GeoServer has support for plugins, written in Java. This means that mastering Java you will be able to extend GeoServer to your needs. Java is not all that difficult in itself, but Java-code tends to be bogged down in layers of abstraction as mentioned earlier.

C#
C# is, in a way, Microsofts version of Java. It’s object oriented at it’s core, and a multi purpose language. Released in 2000 it’s gained a large following in “Microsoft shops”, and it’s way better than anything Microsoft has previously made. The language itself is rather nice, but suffers from some of the same enterprisyness as Java, and the tooling is completely tied to Microsoft (Visual Studio and the like) if you stick with the .NET-platform (as most do).

This may be the reason why the open source community hasn’t embraced C#, but there are some ok-ish libraries, mainly NetTopologySuite and some ports of Proj.4. At least in Norway you’ll have to be a good navigator to avoid C# and .NET, it seems to be the preferred language and platform for several consultancies, software houses and governmental bodies.

Python
Python is a multi-paradigm, dynamically typed language focused on readability. It’s not the fastest language around, but can use C/C++ bindings to speed up things.

Python has been adopted by ESRI as the scripting language of choice for their ArcGIS-platform, as well as by QGIS, where you have access to a python REPL and can write plugins using Python. There are also other GIS-libraries for Python, mainly Shapely, Fiona and Rasterio, as well as several other tools. On the applications side there is the tile server MapProxy and several other utilities.

Python is a really great programming language in itself, easy to grasp, enforces clean, readable code, and with the usage both in ESRI and QGIS it’s a language that you most definably should know it you work with GIS.

JavaScript
JavaScript was once known as the programming language for web browsers, and was regarded as clumsy, difficult and a toy language. That’s changed a bit the last years, with better tooling and some improvements to the language itself, but it still is a dynamic language with both parts object orientation and functional programming sides to it. The rise of Node.js also made JavaScript a general-purpose language, and this constitutes the “third wave” of open source GIS-libraries if you ask me.

From the advent of Google Maps and OpenLayers, JavaScript found it’s place in the GIS-domain as the language to write web map clients in (that is, after people realized that Flash and Sliverlight where blind alleys). Now there is a large ecosystem of browser-libraries, such as OpenLayers 2 & 3, Leaflet, mapbox-gl-js, proj4js and several more.

As for Node.js, this has been adopted by the “geohipster”-company Mapbox, which uses JavaScript for several parts of their server-stack, resulting in open source libraries such as Turf.js.

Again, JavaScript is really a language to focus on if you plan on doing any web-related GIS-work at all. Just don’t think you know JavaScript because the syntax is close to that of Java/C#, and do take your time to dig in to the functional sides of the language. And stay clear of Angular.js, unless you really like enterprisy code! JavaScript still has it’s quirks, and there are released several new frameworks, tools and libraries each day, so you may find this language a bit confusing.

These are mainly the languages that are used today as far as I know, but there are some other languages that might be worth looking into, namely:

Swift/Objective-C: Used for app development on the Apple platform, I really don’t know anything about this, but there gotta be some libraries, as there are maps both on iPhones and iPads.

Go Is a relatively new language from Google, perhaps described as C for the new century? I’ve never used it, but I wan’t to use it as I know several people who really seem to like it. As for GIS-libraries I’m not sure, but I believe there are wrappers for OGR/GDAL and Proj.4 available.

Clojure Is a Lisp-implementation using the JVM. It’s really functional programming, a style which I’ve been attracted to the last year or so, although I haven’t used Clojure at all, and I do not know if there are any GIS-libraries available, but hopefully?

There are probably a dozen more languages that could be included in this list, like Scala, Groovy, Ruby or PHP, but I really don’t know them at any depth, and I’m not sure about how they stack up when it comes to GIS. If you do know, drop me a comment!