
BEHOLD, the DATA!


I procrastinated on starting this assignment quite a bit. My course load for Software Engineering has overshadowed most of my other classes, and I’m doing my best.

Anyways, this weekend I set out to make some good progress on this assignment. Previously, I had found a data source, and I wanted to put it to use. The next day, I found an appropriate source for another set of data that would go well with the first.

The first set of data I found was an Excel file containing the estimated percentage of internet users per country by year for 2000-2009 from the International Telecommunication Union (ITU). Overall, this data set contained information from about 220 countries.

The second set of data I found comes from the U.S. Census Bureau’s International Data Base (IDB), and I was able to obtain population data for countries over the same range of years.

To actually use this data, I had to do a fair amount of scripting.

ITU’s data was simple enough: I copied the data (by selecting the rows I needed) into Python, and after some string replacement, I had some usable lists of data.
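A minimal sketch of that kind of cleanup, not the actual script. It assumes the rows copied out of Excel arrive as tab-separated text with the country name first and one percentage per year after it. The function name `parse_itu_rows` is hypothetical, and the sample rows are placeholders rather than real ITU figures.

```python
def parse_itu_rows(pasted, years):
    """Turn tab-separated spreadsheet rows into {country: {year: percent}}."""
    data = {}
    # Only trim newlines so a trailing empty cell (a trailing tab) survives.
    for line in pasted.strip("\n").splitlines():
        cells = [cell.strip() for cell in line.split("\t")]
        country, values = cells[0], cells[1:]
        by_year = {}
        for year, value in zip(years, values):
            value = value.replace("%", "")
            try:
                by_year[year] = float(value)
            except ValueError:
                # Assumption: blank or non-numeric cells mean "no data".
                by_year[year] = None
        data[country] = by_year
    return data


# Placeholder input: made-up countries and made-up percentages.
sample = "Atlantis\t1.5\t2.0%\t\nElbonia\t10\t12.5\t15\n"
itu = parse_itu_rows(sample, [2000, 2001, 2002])
```

Keeping missing cells as `None` instead of dropping them makes it easier to see later which country/year pairs had no data.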

The data from the IDB was a bit harder to grab. First of all, the data was on 10 different pages (one for each year), and it was not so friendly. My steps for getting the data went something like:

  1. Run a regex with JavaScript to remove commas from population numbers.
  2. Select all data and paste into Python.
  3. Run the string of data through a parsing function that I wrote.
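The steps above can be sketched in one place. This version strips the commas in Python with `re` rather than in JavaScript, and it assumes each pasted line is a country name followed by a population figure. The real IDB page layout is not reproduced here, so `parse_idb_page` and the sample text are hypothetical.

```python
import re

# A comma between a digit and a group of exactly three digits, as in "1,234,567".
THOUSANDS_COMMA = re.compile(r"(?<=\d),(?=\d{3}\b)")
# Country name, whitespace, then a run of digits at the end of the line.
ROW = re.compile(r"^(?P<country>.+?)\s+(?P<population>\d+)$")


def parse_idb_page(pasted):
    """Parse one pasted page of 'country population' lines into a dict."""
    cleaned = THOUSANDS_COMMA.sub("", pasted)
    populations = {}
    for line in cleaned.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and blank lines do not match and are skipped
            populations[match.group("country")] = int(match.group("population"))
    return populations


# Placeholder page text: made-up countries and made-up numbers.
page = "Country Population\nAtlantis 1,234,567\nElbonia, North 48,000\n\n"
populations = parse_idb_page(page)
```

The lookbehind and lookahead mean only commas inside numbers are removed, so a comma in a country name is left alone. With one page per year, the function would be called once for each page.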

Now that the data was in a usable form, I had a bit more mashing to do to get the two sources to line up with each other. I’m not going to go into too much detail, but after much reorganization, I ended up with two data sets. The first set is the population data from 188 countries over the course of ten years (2000-2009). The second set is the estimated number of internet users in the same 188 countries over the same ten years.
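One plausible way to get from a percentage of internet users and a population to an estimated user count is plain multiplication. The sketch below assumes that calculation and assumes both inputs are nested dicts of the form `{country: {year: value}}`. The name `estimate_users` and the sample numbers are made up for illustration.

```python
def estimate_users(percent_online, population):
    """Estimated users = population * percent / 100, rounded to whole people."""
    estimates = {}
    for country, by_year in percent_online.items():
        for year, percent in by_year.items():
            people = population.get(country, {}).get(year)
            # Skip country/year pairs that are missing from either source.
            if percent is None or people is None:
                continue
            estimates.setdefault(country, {})[year] = round(people * percent / 100)
    return estimates


# Placeholder values, not real data.
percent_online = {"Atlantis": {2000: 10.0, 2001: None}}
population = {"Atlantis": {2000: 1000, 2001: 1100}}
users = estimate_users(percent_online, population)
```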

In the end, yeah, I lost about 40 countries or so. Honestly, both sets of data probably contained the same countries (or at least were probably closer than they ended up), but due to differences in names and naming conventions, some countries were dropped from the set.
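A sketch of how this kind of name mismatch could be reduced before joining the two sets. It assumes both data sets are dicts keyed by country name. The `ALIASES` table, the function names, and the sample entries are illustrative only, and a real alias list would have to be built by hand from the names that fail to match.

```python
import re
import unicodedata

# Hypothetical alias table: maps a normalized variant to one canonical key.
ALIASES = {
    "korea south": "south korea",
    "korea rep": "south korea",
}


def normalize(name):
    """Strip accents, lowercase, drop punctuation, collapse spaces, apply aliases."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    text = " ".join(text.split())
    return ALIASES.get(text, text)


def match_countries(first, second):
    """Return (matched name pairs, names that found no partner)."""
    first_keys = {normalize(name): name for name in first}
    second_keys = {normalize(name): name for name in second}
    shared = sorted(first_keys.keys() & second_keys.keys())
    matched = [(first_keys[key], second_keys[key]) for key in shared]
    dropped = sorted(
        [first_keys[key] for key in first_keys.keys() - second_keys.keys()]
        + [second_keys[key] for key in second_keys.keys() - first_keys.keys()]
    )
    return matched, dropped


# Placeholder inputs; the values do not matter for matching.
itu_names = {"Korea (Rep.)": 1, "Côte d'Ivoire": 2, "Atlantis": 3}
idb_names = {"Korea, South": 10, "Cote d'Ivoire": 20, "Elbonia": 30}
matched, dropped = match_countries(itu_names, idb_names)
```

Printing `dropped` after each run shows which names still need an alias, which is a quick way to see whether a missing country is truly absent from one source or just spelled differently.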

Now, to the visualization.

I will admit that what I have working right now has some flaws. In searching for a cool way to display the data, I ran into some nice tools, but I just do not have the time to work with some of them. I settled on a library called Raphaël, a JavaScript library that works with vector graphics via SVG.  Unfortunately, there are a few minor things that I’m not so happy with about my end result. It is what it is though, and even with the inconsistencies, I think the data is displayed in a nice fashion.

That’s about all I have to say before I throw up the link to the final version. Oh, users beware: I can only vouch for this working in Chrome on Windows 7. I’m pretty sure this doesn’t do much of anything in Firefox (it doesn’t in 3.6; I can’t speak for 4). IE? Good luck.

Oh, and if you are on Chrome, then be sure to click around. You can change the year that is currently shown, and you can also change which countries you wish to see visualized.

Also, it’s really best to have a large screen and view it full-screen if you can.

Anyways, here it is: Population vs Estimated Internet Users 2000-2009.

If I get a chance I may be updating this, but that chance is looking slim at the moment.

