You may replace up to 3 assignments with a semester-long project (depending on the scope of your project).

Important Dates and What to Hand In

Project Teams                                         1/29   -
Prospectus                                            2/8    25%
Participate in E6998.003 Poster Session (optional)    5/3
Report                                                5/5    75%

Project Teams

Teams should consist of 1-4 people (we strongly recommend 2 or 3). In addition, if you have a project in mind, please briefly describe (in 1–2 sentences) what you are thinking. We have included a list of possible projects at the end of this document, although you are not required to choose from these.



Prospectus

Your research prospectus will contain an overview of the research problem, your hypothesis [1], a first pass at related work, a description of how you plan to complete the project, and metrics to decide whether it worked. A good prospectus is basically the skeleton of the full report.

Your prospectus should follow the example:

It is highly recommended that you come to office hours to discuss project ideas before writing the prospectus.

[1] This is science, after all.


  1. Rename your prospectus file to the following format, with last names in alphabetical order: prospectus_<lastname1>_<lastname2>.._<lastnameN>.pdf
  2. Upload the file by 2/8, 11:59 PM EST.
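
The naming rule above can be sketched in Python. This is a minimal sketch, not an official course tool: the function name `submission_filename` and the sample last names are hypothetical. It assumes last names are sorted case-insensitively and used exactly as typed, since the handout does not say how to treat capitalization or spaces.

```python
def submission_filename(prefix: str, last_names: list[str]) -> str:
    """Build a name like prospectus_<lastname1>_<lastname2>.._<lastnameN>.pdf."""
    if not last_names:
        raise ValueError("at least one last name is required")
    # Assumption: sort case-insensitively; the handout only says "alphabetical order".
    ordered = sorted((name.strip() for name in last_names), key=str.casefold)
    return f"{prefix}_{'_'.join(ordered)}.pdf"


# Hypothetical three-person team.
print(submission_filename("prospectus", ["Smith", "Chen", "Lee"]))
# prints: prospectus_Chen_Lee_Smith.pdf
```

Passing a different prefix, such as "report", produces the corresponding file name for the other deliverable.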

Poster Session (optional)

You can optionally participate in E6998.003’s poster session. Simply make a poster and present the project during the session. This gives you an opportunity to present a short demo of your work and show what you have accomplished in the class!


Report

You will prepare a conference-style report on your project with a maximum length of 15 pages (10 pt font or larger, one or two columns, 1 inch margins, single or double spaced; more is not better). Your report should expand upon your prospectus: introduce and motivate the problem your project addresses, describe related work in the area, discuss the elements of your solution, and present results that measure the behavior, performance, or functionality of your system (with comparisons to other related systems as appropriate).

Because this report is the primary deliverable upon which you will be graded, do not treat it as an afterthought. Plan to leave at least a week to do the writing, and make sure you proofread and edit carefully!


  1. Rename your report file to the following format, with last names in alphabetical order: report_<lastname1>_<lastname2>.._<lastnameN>.pdf
  2. Upload the file by 5/5, 11:59 PM EST.

What is Expected

Good projects can vary dramatically in complexity, scope, and topic. The only requirement is that they be related to something we have studied in this class and that they contain some element of novelty – e.g., that you do more than simply engineer a piece of software that someone else has described or architected. For instance, a new analysis, a new hard-to-collect dataset, a new optimization, and a new way of interacting with data are all fair game. To help you determine if your idea is of reasonable scope, we will arrange to meet with each group several times throughout the semester.

Project Suggestions

The following are examples of possible projects – they are by no means a complete list and you are free to select your own projects. In general, projects can be of four varieties:

  1. Research project: model an unsolved problem, propose an algorithmic solution, evaluate it, and report your findings.
  2. Win: pick an existing useful application and a well-recognized metric (latency, prediction, etc.) and win against the state of the art.
  3. Break and fix: implement a state-of-the-art algorithm on real data, show that it doesn’t actually work (results are poor, it’s slow, etc.), and make it work.
  4. End-to-end system: identify a new modality or new exploration approach for a particular domain and build a prototype that a normal person could use.

More Well-defined Projects

The steps to complete the following projects are relatively well-defined. All of them have the potential to be publishable work. Your report does not need to be published for you to do well in the class.

Does chat-based querying actually work?

Be Relatable:

PDFs + tables

Implement Innochat

Query The Web

Which Optimization Makes Sense?

Run some perceptual studies:

Data file formats

Harder (more ambiguous) Projects


Explanation and Cleaning

Core Data Processing for Viz

Recommendations and Predictions