Results tagged “data”

Hey, I need your input on a "Crunching Public Data" course at Code Lesson

I’ve often heard the saying “you know a subject well when you can teach it confidently.” So I’ve decided to put that to the test by stepping up to teach “Crunching Public Data” at Code Lesson this spring.

This is the first time that Code Lesson will be running this four-week online course, so the focus of the learning experience is really in my hands (and yours), and I’m excited to be able to package up some of the “working with data” tricks I’ve learned over the last fifteen years of working with advocacy organizations and publishers, as well as some of the inspiration and ideas that have come from being involved with groups like Civic Access and the Electoral Data Consortium.

Over the next few weeks, I will be working on the course outline — starting to add more detail for each section — and drafting the first week of course material. This is where I really need your input: to help ensure that this course is valuable to people like you — people who might consider taking the course — I want to make sure that it incorporates ideas and tools that would interest you.

There are two things that I could use feedback on right away:

  • The target audience for the course: Who is it?
  • The title of the course: Does it speak to the audience?

Who are the people that might need this course to improve their storytelling?

My first thought is that the course will be prepared for folks who don’t have a lot of practical experience using software and programming to work with data: mostly journalists, researchers, and civic enthusiasts — that is my best guess. The course will focus on:

  • A) finding publicly available data relevant to your line of investigation,
  • B) exploring that data,
  • and C) publishing meaningful representations of that data.

I’ve been asked to focus in a generic sense on Data.gov and similar providers, but I’m hoping that participants will work with datasets that are local to them, e.g., data that is available in their city, town, or region, or a specific area of thematic interest.

Depending on the level of technical experience that participants come to the course with, I would like to spend a fair bit of time introducing a small set of freely available tools for the exploration and publishing of data, and will ask participants to work on a project that will demonstrate their understanding of one or more of the techniques or tools introduced throughout the course.

A basic familiarity with Python is currently listed as a prerequisite on the Code Lesson site, but my sense is that the course will keep the programming-related tasks very, very light; for example, Python might be introduced in the context of using CSVkit to quickly investigate a large dataset, or using ScraperWiki to obtain some data that isn’t readily available.
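
To give a rough sense of how light that programming could be, here is a minimal sketch. It is an illustration only, not course material: it uses Python's standard csv module rather than CSVkit, and the sample data and column names are invented.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a downloaded public dataset.
SAMPLE = """city,category,amount
Vancouver,parks,1200
Toronto,transit,5400
Vancouver,transit,3100
"""


def summarize(csv_text, column):
    """Return the column names, the row count, and value counts for one column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    counts = Counter(row[column] for row in rows)
    return reader.fieldnames, len(rows), counts


fields, total, counts = summarize(SAMPLE, "city")
print(fields)                 # which columns does the file have?
print(total)                  # how many records are there?
print(counts.most_common())   # which values show up most often?
```

With a real file, the same function would take the text read from disk instead of the inline sample.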

So my question for you is: Who are the people that might need this kind of a course, and what specifically would they be hoping to learn? Are there aspects of “telling stories with data,” or “finding and understanding data,” that would be critical to include — even at a conceptual level — in this course?

I’m keen to hear from people who’ve been thinking about this a lot (I know there are lots of you out there!), and from those people who might actually take a course like this if it delivered practical skills that could be used every day.

Crunching public data? What does it mean?

The venerable quasi-sage guru of personal branding, Kris Krug, offered: “The title might need some help. Google has some good language around empowering new journalistic practices through programming and data. I’d read up on their scholarship and grants and morph the language a bit.” The scholarship that Kris refers to is the “AP-Google Journalism & Technology Scholarship.”

I agree that the title needs some work, and I had initially proposed “Crunching Public Data: Finding, exploring, and visualizing data to tell better stories.” However, perhaps that title doesn’t quite hit the mark either in terms of being accessible to people that might be looking for a course like this.

The question here is: In the context of finding, exploring, and presenting the stories that can be found in “data,” does the term “crunching” add anything of value? Straight up: Do you have a suggestion for a title that would have more resonance with your friends and colleagues that might be interested in a course like this?

Open Journalism, Open Web, Open Learning

Over the past year and a half, I’ve had the incredible opportunity to think about, develop, and deliver online curriculum at the nexus of journalism, software, and the open Web.

The first pilot course was the result of a mini-grant from the Knight Foundation to Hacks/Hackers and Mozilla that brought together forty participants — twenty working journalists, and twenty professional software developers — for a six week online, peer-to-peer, learning experience.

This past summer, as part of the Knight-Mozilla News Technology Partnership, I ran what came to be known as the MozNewsLab, which took sixty participants through an intense four-week lecture-based program that aimed to introduce new thinking from luminaries of the journalism and software worlds.

I’m keen to keep working on the challenge of delivering practical skills to those individuals that are working to keep our communities knowable and our governments transparent and accountable, and I’m excited to have the opportunity to translate the learnings from running relatively large online courses into a learning experience that is more intimate and hands-on.

In addition to the four-week version of this course at Code Lesson, I’ll also be delivering a two-day workshop version of the material in Vancouver in May or June (details to follow).

So, shoot me a note via Twitter, LinkedIn, e-mail, or via the comments here if you have any thoughts on the audience for this course, and, given the audience, an appropriate name.

Many thanks in advance! :)

Comments

2 Comments

Great project! Sounds awesome.

Did you check out the "Data Journalism Handbook" project that came out of the Mozilla Festival?
http://owni.eu/2011/11/15/hacks-and-hackers-gather-to-write-the-first-data-journalism-handbook/

Version 0.2 here:
http://mzl.la/A91XK5

I think you're right to zero in on the question of your target audience. And maybe include them right in the course title.

"Crunching public data for journalism, activism and storytelling," or something like that?


TrackBack URL: http://www.phillipadsmith.com/trackback/2772

The ultimate data backup triple-play for under $500

For the last couple years I've struggled to find the perfect backup solution. The perfect backup solution I was after had to meet certain criteria:

  • It had to be continuous and require almost no thought;
  • It had to be both onsite (for fast access) and offsite (in case of theft);
  • It had to be encrypted so that my clients' data was protected.

Recently, it all came into focus... so I thought I'd share my "ultimate data backup triple-play for under $500" in case you're in a similar situation.

The first thing I did was ditch my Buffalo Linkstation Mini 1TB Network Attached Storage (NAS) device (great conceptually, terrible in practice) and buy a Western Digital My Book Studio 2TB drive with FireWire 800 and USB 2.0 interfaces. Unlike the NAS device, the FireWire 800 connection means that my local, onsite backups are blazingly fast, and the device only cost $210 CAD at Canada Computers.

Next I signed up for Backblaze -- an online (thus offsite) backup service -- after reading this (very convincing) article about their hardware and HTTP-based backup software. The Backblaze service costs $50/year for one computer with unlimited data (which is the key, as I have a lot of data to back up).

Finally, I found a way to make the process of backing up to my 160GB "classic" iPod painless and functional by ditching my hand-crafted rsync scripts and replacing them with the easy-as-pie iPodBackup software. The current cost for a 160GB iPod is roughly $259.00 and you can probably find one a lot cheaper on eBay or Craigslist.

Those pieces in place, here's how it all works:

  • I have a full backup of my computer on the 2TB hard drive that runs continuously via Apple's Time Machine software (not as terrible a piece of software as I thought it would be, to be honest). The hard drive mentioned above is one of the few at that price that comes with built-in hardware-based encryption -- so the drive is locked and encrypted when I dismount it.
  • A continuous encrypted backup of my essential client files (~40GB) happens via Backblaze so that I never need to think about it and can access the data in a pinch from the road. Backblaze lets you provide your own private encryption key, so that data is also encrypted both en route to Backblaze and at their facility.
  • Finally, I perform a semi-regular encrypted backup of my essential client files (~40GB) onto my iPod, which I bring along with me on trips so that I have a copy of all my client data in my pocket. The iPodBackup software handles the creation of an encrypted "sparse image" before it moves the backup to the iPod, so I never have to worry that much about losing the iPod or having it stolen, as the data is encrypted.

All this for under $500. That's a low price to pay for complete peace of mind. :)

And, because I saw a tweet from my friend Rolf about it this morning, I should mention quickly what I do on the server that hosts my e-mail and Web sites. Basically, after much futzing around, I ended up with a simple solution using rsync and expect (to handle authentication prompts) that backs up all of my Web site data, e-mail, and anything else lying around my account. This is all backed up to the free 100GB Strongspace account that I received as part of my lifetime hosting account with Textdrive (now Joyent). That backup runs every day by itself -- never have to think about it! -- and makes those files available via SFTP and a nifty Web interface.
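
For anyone who wants a starting point, here is a minimal sketch of that kind of server backup. It is not the actual script: it assumes SSH key authentication instead of expect-driven prompts, wraps rsync in Python, and uses a made-up source path and remote host.

```python
import shlex
import subprocess


def build_rsync_command(source_dir, remote_target):
    """Build an rsync-over-SSH command that copies source_dir to remote_target."""
    return [
        "rsync",
        "-az",        # archive mode, compress data during transfer
        "-e", "ssh",  # use SSH as the transport
        # Trailing slash: copy the directory's contents, not the directory itself.
        source_dir.rstrip("/") + "/",
        remote_target,
    ]


def run_backup(source_dir, remote_target, dry_run=True):
    """Return the command line; only run rsync when dry_run is False."""
    command = build_rsync_command(source_dir, remote_target)
    if not dry_run:
        subprocess.run(command, check=True)
    return shlex.join(command)


# Hypothetical path and host, for illustration only.
print(run_backup("/home/me/sites", "me@backup.example.com:backups/sites/"))
```

Leaving dry_run at its default only prints the command, which makes it easy to check before scheduling the real run (for example, from a daily cron job).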

Note to those of you that use a Linux desktop operating system: obviously, a lot of the above is Mac-centric. If you have some suggestions on how to achieve roughly the same set-up on open hardware and free software, it would be great if you could pop it into the comments. :-)

Comments

2 Comments

Dropbox

Hi Phillip,

Great post! I'll probably use some combo of what you've described, but my file size and backup needs are nowhere as extensive as yours seem to be.

I'm only regularly backing up my 'active' files - those files that I'm regularly changing. The rest just sit dormant.

I chose to go with Dropbox. It sits on any platform and is operating system agnostic. Its cost is about $50 US per year and it's true cloud computing (of course, you have to trust 'the cloud').

It automatically makes a backup on your desktop while you're logged in. It also automatically updates any other platforms that carry the same Dropbox account, so you're always updated. Finally, if you are using a third-party platform (i.e., a computer at a library), you can access your files via a Dropbox login.

It's simplified my file-life in a quantum way :)

Thanks for sharing your

Thanks for sharing your thoughts and ideas. I too was trying to find a perfect storage for the endless pile of data that I have related to my business. There were times where I had nightmares of losing all that data. I have grown even more anxious for searching the perfect backup storage. Praise God Almighty I have finally found it! Thanks
remote online backup


TrackBack URL: http://www.phillipadsmith.com/trackback/1831

Apathy is Boring says "Parliament is sitting" for youth.

Exciting news that I missed at the end of last week (thanks Drumbeat!)... Ilona Dougherty from the awesome Apathy is Boring organization writes about (yet another) open data initiative -- this time, aimed at Canadian youth:

We've been wondering for quite some time how we can make Parliament and the legislative process easier to understand for Canadian youth. With the help of the Department of Canadian Heritage, Michael Lenczner, and Daniel Haran, we recently started developing a website that aggregates Parliamentary data and (more importantly) makes this information meaningful to young Canadians.

An early version of CitizenFactory.com went live at the end of March. We're planning a publicized launch in the coming months, once additional features have been added.

Since our soft launch, we've been excited to see Michael Mulley (who was kind enough to help out during a Citizen Factory hack day) launch OpenParliament. Datadotgc.ca also launched, and - thanks to the work of David Eaves and others - Parliament has agreed to provide more data in XML.

Our goal with Citizen Factory is not only to provide this information to youth, but also to help them decode it and take action. Apathy is Boring has access to tens of thousands of youth across Canada with whom we'll share this tool.

We thought this was as good a time as any to tell you about Citizen Factory. We would appreciate your comments and feedback.

Open data in Canada is clearly on a roll. Happy days ahead.  :-)

(Via Civic Access.)
 


TrackBack URL: http://www.phillipadsmith.com/trackback/1947

