Research Project

You will work in groups of three or four (some exceptions to group size will be permitted) on a custom research project of your choosing. The project will span 12 weeks of the semester and will result in:

a high-quality, conference-style paper (5-8 pages for groups of 3; 6-8 pages for groups of 4. This page limit does not include your required references and the Impact Statement)
a poster presentation of your work; and
the associated code to reproduce all work

MOTIVATION

We wish to provide students with not only a solid foundation of advanced concepts, but also the opportunity to conduct research within the field of NLP. That is, students in this course will be expected to demonstrate:

a strong understanding of the course content, via homework assignments
the ability to craft real-world solutions to problems, via homework assignments that involve implementing models on actual data
knowledge of a particular NLP research problem; the ability to pose a related, interesting, unsolved research question; principled approaches that aim to answer the research question

Computer Science, especially Machine Learning and NLP, is a field that progresses at an incredible rate. State-of-the-art models from five years ago are often barely sufficient to serve as baseline models for any particular task. Yet, we want everyone to gain skills that will last them for many years to come. Gaining research experience greatly helps toward this, as it requires critically reading and evaluating the latest and greatest approaches to a given problem. Then, one must poke holes in the current community-wide state of accepted knowledge (i.e., published research papers at top venues) and formulate specific research questions that the community at large doesn’t yet know the answers to. Working toward an answer is immediately fruitful to the problem at hand. More importantly, this process of conducting research, in general, is a healthy mechanism that allows one to keep learning continuously and not rely on knowledge that will inevitably become outdated.

RESEARCH PAPERS

In the world of research, the currency of knowledge is in the form of research papers. Specifically, in Computer Science, the gold standard is conference papers, not journals, and there are two formats:

long papers (~8 pages)
short papers (~4 pages)

Obviously, teaching how to conduct research is beyond the scope of any particular course, as it’s the entire purpose of a PhD. However, very briefly, a research paper should:

identify a concrete problem/issue (e.g., a new problem or an issue discovered with existing work)
exhaustively and meticulously review and summarize related work (very important)
offer novel insights or a solution to the problem
irrefutably defend the proposed solution (e.g., offer new state-of-the-art results with thorough experiments). The reader should be left with no doubt that your work is a viable and useful solution to your important problem. Demonstrating this is the crux of your narrative.
be objectively written (i.e., doesn’t editorialize or use first-person pronouns)
be specific (e.g., no weasel words or fluff)
not over-generalize or make bold claims beyond what its work demonstrates
summarize your work/findings to varying degrees and styles (via the Abstract, Introduction, and Conclusion)

It is incredibly fun to brainstorm and try out/implement clever ideas you may have to a problem. As a consequence, it is natural to skimp on the literature review and to get excited about your own ideas. Don’t fall into this trap, as you’ll inevitably suffer the consequences in the long run. For example, it is a horrible experience to spend many weeks or months on a solution, just to later realize that it’s already been tried and published. Make sure you thoroughly read related works, as it can save you from unnecessary work and trouble.

GOLD STANDARD

The top NLP research conferences are:

ACL
NAACL (the North American-located chapter of ACL)
EACL (the European-located chapter of ACL)
EMNLP
COLING
AAAI/ICLR/ICML (these concern AI/ML in general, but some NLP papers are accepted)

All published works from these conferences are made available to the public for free. Simply search for any year’s “accepted papers” (e.g., ACL 2023 accepted papers). Again, the field moves incredibly fast, so results/models from papers published > 5 years ago tend to be significantly less relevant. However, it is important to be aware of the full history of the problem, including past approaches from > 5 years ago.

GUIDANCE

In this course, we aim to provide as much structure and guidance as possible so as to help each student gain high-quality, directed (not aimless) experience with NLP research. Toward this, there are several milestones/assessments throughout the project (percentages listed below are out of the total course grade):

Phase 1: Proposal (10%)
Phase 2: Related Work + Introduction (ungraded)
[OPTIONAL] Self-/peer- check-in (ungraded)
Phase 3: Paper Progress Report (5%)
Phase 4: FINAL DELIVERABLES:
- research paper (5-8 pages for groups of 3 students; 6-8 pages for groups of 4 students) (20%)
- poster session (10%)
- code (ungraded)
- [OPTIONAL] Self-/peer- check-in

Identifying a well-scoped, important research problem is often difficult. To help with this:

We will ask each student to come up with a project idea (Phase 0)
We will collect all of the aforementioned Project Ideas and share them with the class, so that all students benefit from each other’s ideas
The Project Ideas will facilitate finding project partners.
We, the teaching staff, will provide feedback throughout your project.

Each project will be appointed a TA. We have 10 TAs, and 500 students initially registered in the course. Thus, each TA will oversee roughly 10-15 projects. It is impossible for the TAs to be a priori familiar with everyone’s custom problem/project (e.g., the background literature). Instead, you can expect your TA to:

help you decide if your identified problem is too grand and impossible, too easy, or inappropriate for research (e.g., too applied and more of a software development project)
provide feedback on selecting a reasonable baseline model
offer suggestions and feedback during designated Office Hours and address any egregious issues mentioned in your self-/peer- evaluation forms.

Further, we want to provide you all with sufficient time to work on your research project, so the last homework assignment is due roughly a month before the end of the semester – to minimize having competing workloads.

READING RESEARCH PAPERS

Effectively and efficiently reading research papers is a learned skill. In short though, a common and useful approach is to consider reading each paper in three stages:

Stage 1: read the Abstract (and optionally Introduction) (2-10 minutes)
Stage 2: read the Conclusion, then skim most of the paper (10-30 minutes)
Stage 3: thoroughly read the entire paper, with the goal of being able to understand the intricate details (30 minutes - 3+ hours)

Only proceed to a given stage after you’ve deemed it necessary to do so, given what you learned from the previous stage (e.g., you may have learned the work isn’t pertinent to your interests). Notice that each subsequent stage requires roughly an order of magnitude more time (2 minutes, 20 minutes, 200 minutes). It’s not worth your time to thoroughly read most papers, simply because their work isn’t closely related to your work. In other words, it is completely unnecessary to fully read every paper that looks interesting on the surface (e.g., title). This staged approach is especially helpful when you are just starting out on a new project and are still learning the scope of the field. Moreover, essentially no paper is perfectly, fully understandable; every paper has a page limit, and there’s only so much one can explain. Thus, it’s often the case that some equations will have poorly-defined variables, or that tiny details about the data and models are left out. So, please don’t waste your time by treating every paper as an oracle that is important for your work. Only the most related works are worth spending many hours diving into.

We highly recommend using ConnectedPapers to assist in finding related works.

You may find a few highly-useful papers on arXiv.org. Beware, though: arXiv papers are unvetted pre-prints that are not peer-reviewed. It is common practice for authors to first place their paper on arXiv before submitting it to top-tier conferences (a la flag-planting). So, while some arXiv papers will go on to become incredibly impactful published papers, the majority will not. If you find any arXiv paper that seems useful to your project, search for it on Semantic Scholar or Google Scholar to see if it’s been subsequently published in a top-tier research conference. This will help you assess how much stock to place in the given paper.

MORALE

We encourage everyone to please not feel intimidated by the breadth and depth of NLP research papers. You are in a safe and structured class environment, and we want to see everyone learn and grow. We hope everyone feels inspired to read tons of papers (even if they seem confusing at first), come up with several research questions and ideas, and execute your envisioned solutions as quickly as possible. In research, most attempted solutions do not work – this is the case for everyone. That’s part of the process and the necessary path for successfully contributing new nuggets of knowledge to a field. That’s the beauty and fun of science. The key is to “fail fast”, learn from each experiment’s results, and to document your insightful conclusions and successes.

EXPECTATIONS

LITERATURE SEARCH/REVIEW

It is easy to detect when one has conducted an insufficient, non-comprehensive review of past works. For example, the cited papers are old, only loosely or too generally related to the problem/project, or only a few are listed. There is no magical number, but it is reasonable to expect that the literature review of a well-researched project will require each student to skim ~50+ abstracts, skim 25+ papers, and thoroughly read a dozen papers. Each project varies.

QUALITY OF RESEARCH

This is likely everyone’s first experience conducting NLP research, and that is completely fine. That is truly the point of this course – to provide a structured yet challenging introduction to the latest, advanced techniques in NLP. Since research papers are the de facto medium for detailing and disseminating new knowledge, we set this as your goal. Specifically, using the templates found here, you should write a 5-8 page paper (for groups of 3) or 6-8 page paper (for groups of 4).

We highly recommend that your team uses Overleaf to collaboratively edit your paper.

To be clear, your goal is to produce a paper that is worthy of submitting to ACL/NAACL/EMNLP. Of course, we only have one semester together, and research results are always impossible to predict. Thus, you will not be evaluated on whether your research results are profound and actually worth submitting to a conference; rather, you will be evaluated on all aspects of how you conducted research (e.g., successfully identifying an important problem, thoroughly reading and summarizing the related work, coming up with reasonable solutions, methodically executing those ideas, effectively writing and orally presenting your work).

You are also expected to include an Impact Statement and instructions on how to reproduce/run your code. Further details are in the DELIVERABLES section below.

SUCCESSFUL EXAMPLES

Chris’ Harvard version of this course in 2021 had a few papers get accepted into ACL:
- What GPT Knows About Who is Who. This was a great project because it took the recent, powerful technique of Probing and applied it to a difficult core NLP problem, which hadn’t yet been done before.
- Automatic Fake News Detection: Are current models “fact-checking” or “gut-checking”?. This was a great project because it cleverly and critically inspected an aspect of a challenging NLP task that we tended to overlook and assume was correct.
ACL 2023 Short Papers
NAACL 2022 long and short papers
EMNLP 2022 long and short papers
Jeff Huang maintains a list of best papers from many conferences. For the purposes of this class, the list of best papers is only useful for inspiration and to get a glimpse into how varying the styles can be amongst great research.

DELIVERABLES

NOTE: Research projects naturally evolve over time. So, if your project shifts gears as it progresses, that is completely fine. You can answer a slightly different research problem than what you initially set out to answer in your team’s Proposal, as long as it’s in the same ballpark. With that said:

after the Phase 2 deadline, you are expected to not change your project much. We can help with this process.
all Phase deadlines must be met on time
if you change focus from your team’s Proposal, you must:
- discuss this change with us ahead of time
- “catch up” by re-submitting the previous phases
projects cannot change team members after Phase 1

FORMING TEAMS (UNGRADED):

If you are enrolled in 6.8611 and thus taking the CIM portion of the course, you must select team members who attend your own CIM session. We will provide a spreadsheet to assist with this. For your project, your team must select (and optionally enhance) one of your team members’ ideas that were selected from PHASE 0.
If you are not enrolled in a CIM portion of the course (i.e., all 6.8610 students), then you are permitted to select non-CIM teammates. We will provide a spreadsheet to assist with this. For your project, your team must select (and optionally enhance) one of your team members’ ideas that were selected from PHASE 0.
If you wish for us to help you select teammates, we will be happy to do so. Please rank as many of the projects as possible (via a Google Form that we will provide), and we will help find the optimal student-to-project assignment so as to reach the maximum, global happiness.

PHASE 1: PROPOSALS (10%)

With your newly-formed teams, you will refine your research project idea and write a 1-2 page proposal. Specifically, you must:

describe the problem in terms of why it’s an important one (every published paper’s abstract/intro accomplishes this, so we’re looking for something similar, not ground-breaking claims about how it will impact society)
list a few (3-5) related papers and any commentary you may wish to include
detail the measurement for success (e.g., BLEU score, F1 score, a novel comparison, etc.)
list the pertinent dataset(s)
describe a few ideas (1-2) you have for approaching/solving the problem

We’ll provide feedback on your proposal.
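
As an illustration of what a concrete "measurement for success" can look like, here is a minimal sketch of F1 for a single positive class. The function name, labels, and toy data are hypothetical and only for illustration; your project may call for a different metric (e.g., BLEU) or an established evaluation library.

```python
def f1_score(gold, predicted, positive_label):
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    false_pos = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    false_neg = sum(1 for g, p in pairs if g == positive_label and p != positive_label)
    if true_pos == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up for this example).
gold = ["pos", "neg", "pos", "pos", "neg"]
predicted = ["pos", "pos", "neg", "pos", "neg"]
score = f1_score(gold, predicted, "pos")
print(f"F1 = {score:.3f}")
```

Stating the metric this precisely in the proposal (which class counts as positive, how ties or empty predictions are handled) makes later results easier to compare.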

PHASE 2: RELATED WORK + INTRODUCTION (UNGRADED)

To help keep everyone on track, we will ask you to submit your paper, which should include, at a minimum, an Introduction and Related Works.

Since you are still becoming familiar with your project, the Related Works section doesn’t have to be perfect or exhaustive yet. The expectation is that you’ve clearly identified the most similar works and the broader, tangential scope of related works. Likewise, the Introduction should clearly introduce and describe your problem. We expect this section to improve significantly over time, but for now it should be well-organized, such that any outsider could easily follow and digest your problem and the scope of it.

[OPTIONAL] SELF-/PEER- CHECK-IN (UNGRADED):

We hope that each team member fairly contributes. People rarely deliberately choose to be a bad team member; it’s usually by accident due to poorly communicated expectations. To help with this communication and accountability, we encourage each student to privately reflect on how their team is doing: what is working well, what could be improved, and whether all team members are fairly contributing. If something isn’t going well, please let us know via the Google Form, so that we can offer support and address the situation. If you don’t submit this “check-in” Google Form, we will assume everything is going well.

NOTE: Although we designate two check-in points for optionally completing this Google Form (now and during the final project submission), you are always allowed and encouraged to inform us of any issues that arise! The reason we have two specific, designated times is to encourage you to deliberately take the time to internally reflect.

PHASE 3: PAPER PROGRESS REPORT (5%)

You will start new sections describing your Models and Experiments/Results. Specifically, you are expected to explain your system in any way you see fit, which is usually in a section titled Models or Methodology. Your experiments section should describe your exact setup, along with your baseline model’s results (usually in a sub-section titled Results). Again, it is okay if these results aren’t good yet, as nobody can perfectly predict the outcome. This is the nature of science and research. However, we do expect to see a reasonable approach for a baseline model, one that is not expected to perform very well but is a simple yet sensible initial approach to the problem. As a reminder, baseline results are critical to your work as they will help inform you of future directions to take, and they will serve as a reference point for your later results. That is, if you later develop a complicated, technical solution, how should one interpret its results? How will we know if its particular accuracy score is actually good? The baseline model provides that contrast and puts all future experiments into perspective.
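
To make the idea of a reference point concrete, here is a minimal sketch of one very simple baseline for a classification task: always predict the most frequent training label. This is only an assumed example, not a required or recommended baseline for your project; the function names and toy data are hypothetical.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that ignores its input and always outputs
    the most frequent label seen in the training data."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda _example: most_common_label


def accuracy(gold, predicted):
    """Fraction of predictions that match the gold labels."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy data (made up for this example).
train_labels = ["neg", "pos", "neg", "neg"]
test_texts = ["great", "awful", "fine"]
test_labels = ["pos", "neg", "neg"]

predict = majority_baseline(train_labels)
baseline_acc = accuracy(test_labels, [predict(text) for text in test_texts])
print(f"majority-class baseline accuracy: {baseline_acc:.3f}")
```

If a later, more complicated model does not clearly beat a number like this on the same test set, its accuracy is hard to call good, which is exactly the contrast a baseline is meant to provide.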

We will grade your progress on Introduction, Related Works, and Models/Results.

PHASE 4: FINAL DELIVERABLES (30% IN TOTAL)

1. PAPER + IMPACT STATEMENT (20%)

Using the ACL templates, write a paper that is 5-8 pages (for groups of 3 students) or 6-8 pages (for groups of 4). This page limit excludes your required references and impact statement. To be clear, your references do not count toward the page limit.
After your paper and references, append an Impact Statement that details the possible ethical and societal ramifications of your project. The length should be at least 2 paragraphs. The full expectations/details can be found here.

You will be evaluated on:

all technical aspects of your research, ensuring that you:
- identified a sound problem
- conducted a thorough literature review
- crafted a reasonable approach(es)
- ran thorough experiments
- provided a detailed analysis of results
the clarity of your writing (e.g., ease of reading and understanding, grammatical errors)
the technical appropriateness of your writing (e.g., no subjectiveness, weasel words, or fluff)
clear figures with meaningful captions
your Impact Statement (e.g., is it thoughtful and reasonable?)

Submission: Upload your entire paper (.PDF) with references and the impact statement to Canvas.

2. POSTER PRESENTATION (10%)

Your team is required to make a poster presentation.

You will be evaluated solely on the clarity and narrative of your poster.

That is, your poster grade will not include an evaluation of the technical aspects of your project. That would be redundant, as it’s already part of your paper’s grade.

Submission: Upload your Poster to Canvas.

3. CODE (0%)

Make a .zip file that includes:

your code, which should be readable and runnable
a short README file that details how to run your code
your data, if the resulting .zip file remains under 100 MB in size. If including your data yields a file size > 100 MB, do not include your data in the .zip file. In this case, post it online (e.g., Google Drive) and describe in your README documentation where the data can be downloaded
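
One possible way to build the archive and check it against the size limit is sketched below. The helper names are hypothetical, and the limit is assumed to mean 100 * 1000 * 1000 bytes (the stricter reading of "100mb"); adjust both to your own setup.

```python
import os
import zipfile

# Assumption: the limit is interpreted as 100 * 1000 * 1000 bytes.
SIZE_LIMIT_BYTES = 100 * 1000 * 1000


def make_zip(zip_path, file_paths):
    """Write the given files into a compressed .zip and return its size in bytes.
    Files are stored flat under their base names."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in file_paths:
            archive.write(path, arcname=os.path.basename(path))
    return os.path.getsize(zip_path)


def fits_limit(zip_path, limit=SIZE_LIMIT_BYTES):
    """True if the archive is strictly smaller than the size limit."""
    return os.path.getsize(zip_path) < limit


# Example usage (hypothetical file names):
#   make_zip("project.zip", ["README.md", "train.py", "data.csv"])
#   if not fits_limit("project.zip"):
#       # rebuild without the data and link to it from the README instead
#       make_zip("project.zip", ["README.md", "train.py"])
```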

You will be evaluated on your documentation and the ease in running/reproducing your results.

Submission: Upload your .zip file to Canvas.

4. [OPTIONAL] SELF-/PEER- CHECK-IN (UNGRADED):

Same process as the earlier self-/peer- check-in; this is optional, but we are happy to read how your team is doing, what worked well, what didn’t work well, and whether you were limited by any team members who didn’t contribute appropriately.