Video has emerged as a dominant medium for online education, as evidenced by the millions of students learning from educational videos on Massive Open Online Courses (MOOCs), Khan Academy, and YouTube. The large-scale data collected from students' interactions with video provide a unique opportunity to analyze and improve the video learning experience. We combine click-level interaction data, such as pausing, resuming, or navigating between points in the video, with analysis of the video content, including its visual, text, and speech components, to examine peaks in viewership and student activity. Such analysis can reveal points of interest or confusion in the video and suggest production and editing improvements. Furthermore, we envision novel video interfaces and learning platforms that automatically adapt to learners' collective watching behaviors.
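
To make the idea of an activity peak concrete, the following is a minimal sketch of how click-level events might be binned per second of video and scanned for peaks. It is illustrative only and is not the method used in this work: the event format (a timestamp in seconds paired with an event kind), the function names, the moving-average window, and the `min_ratio` threshold are all assumptions introduced here.

```python
from collections import Counter


def activity_per_second(events, duration):
    """Count interaction events in each one-second bin of the video.

    `events` is a hypothetical list of (timestamp_seconds, kind) tuples,
    e.g. (5.2, "pause"); `duration` is the video length in whole seconds.
    """
    counts = Counter(int(t) for t, _kind in events if 0 <= t < duration)
    return [counts.get(second, 0) for second in range(duration)]


def smooth(series, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def find_peaks(series, min_ratio=2.0):
    """Return indices of local maxima that reach min_ratio times the mean.

    The threshold rule is an assumption chosen for simplicity.
    """
    if not series:
        return []
    mean = sum(series) / len(series)
    peaks = []
    for i in range(1, len(series) - 1):
        is_local_max = series[i] > series[i - 1] and series[i] >= series[i + 1]
        if is_local_max and series[i] >= min_ratio * mean:
            peaks.append(i)
    return peaks


# Toy data: a burst of pause/seek/play events around second 5.
events = [(1.0, "play"), (5.2, "pause"), (5.7, "seek"), (5.9, "play"), (6.1, "pause")]
series = activity_per_second(events, duration=10)
peaks = find_peaks(smooth(series))
```

In this sketch, a detected peak index is only a candidate point of interest or confusion; interpreting it would still require looking at the video content around that second.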