Projects — What are they?
Scope
The "project" is intended to be an undertaking in your area of interest that is worthy of the investment of a semester's worth of time by up to three people. Students with thesis-related projects may have some advantage in topic selection, but not necessarily.
You are very strongly encouraged to choose collaborators of mixed skill levels. More senior members of a team will hone their ability to communicate the substance and technical implementation of their own work. More junior members of a team will gain familiarity with a wide range of data and data-visualization problems that they never knew they had.
Streams
- Development on an open source project (features and/or bugs).
- Development of your own work, such as theis projects, where data and model visualization components get the development help of a team, for critique, suggestion, bugfix, sharing of exactly the kind of basic code snippets / tricks that form the invaluable accumulation of graduate work.
- Really 2.5; is a variant of (2) using data that will be made available from a partner with lots of data and some visualization targets specified.
(1) is a service to the world and landing a pull request on an open source project is arguably the best way to become known among would-be peers in the field. Work on (2), essentially pairing on thesis projects, gives the benefit of project management (scope definition, sprint organization, self assessment), as well as specialization — never underestimate the benefit of “another pair of eyes” on a problem you are working on. The data partner for (3) offers high visibility for the result of your combined effort.
Some suggestions of varying scope:
- todo lists/wikis/issues of open source projects in data visualization
- ggvis
- ggplot2
- python ggplot
- crossfilter
- write a tool for exploratory/graphical methods where they are lacking. (e.g. coefplot ) )
- replication of published research for more visualization (e.g. tables to graphs); diet study; college instructor studies;
- improve or add insights to data presentations such as environmental performance index.
- modeling and presenting data for your other work (making more and better statistical graphics than you would otherwise).
Assessment
You are expected to write down achievable-but-difficult (or difficult-but-achievable) goals, and objectives with observable results.
Iteration Plans
Work should be broken up into two-week sprints with specific objectives. At the end of each sprint, you will evaluate your own progress, and schedule items/objectives for the next two weeks.
Ideally, you'll complete more than you thought you would in each sprint. By the third iteration plan add items/objectives that you would have been wary of at the outset. See an example of a first sprint plan below…
Sprint 1: Identify objectives for overall project:
Stream 1:
- Fork the project and make a branch or branches for each feature or bug you will target.
- make notes about your plan for attacking the features/bugs.
- Identify strengths of group members (writing doc, coming up with examples / demonstrations of feature, tests for bug)
Stream 2:
- Identify data sources you would like to use and brainstorm (sketch/scan?) graphs you each would ideally like to make.
- Identify strengths of group members — who is better at munging data, making graphs, making fine detail adjustments?
- Decide target: will it be static or interactive? Shiny (sortof interactive R) or standalone D3? A couple of you might have a full-stack option, serving on heroku or similar (talk to me about this).
- Set up your own common repo to work on (not the class one), and add each other as collaborators.
Stream 3:
- Identify datasets / target graphs of interest
- Identify strengths — data manipulation, graphs, brainstorm / sketch ideas like we did in class 1. Divide work!