Home Page

Biomedical Informatics 214:
Representations and Algorithms for Computational Molecular Biology
(also listed as Bioengineering 214, Computer Science 274 and Genetics 214)

This year's course content will be hosted on Canvas. Please proceed there for current information.


Description:

This course introduces the basic computational methods used in molecular biology, combining core lectures and programming assignments with a midterm and a final. It introduces and uses biological data sources available on the internet. Topics include basic algorithms for the analysis of sequence, structure and function, in particular the alignment of biological sequences and structures, as well as more advanced representational and algorithmic issues. These include, for example, dynamic programming algorithms for alignment, structural superposition algorithms, computing with distance information, 1D and 3D motif definition and computation, hidden Markov models, phylogenetic trees, statistical feature detection, RNA sequence and structure, chemoinformatics and network analysis.


Staff:

Russ B. Altman
Professor of Bioengineering, Genetics, & Medicine (and Computer Science by courtesy)
russ.altman at


Lectures: Tuesdays and Thursdays, 4:15pm-5:30pm, Gates B3

Sections: A few Fridays, 3:15pm-4:05pm, Gates B3
Week 1 (September 26): Python Tutorial 1 (of 2) - Basic syntax and usage
Week 2 (October 3): Python Tutorial 2 (of 2) - Slightly more advanced topics

Course Coordinator:

Tiffany Murray (tiffany.murray at
Department of Bioengineering
Shriram Center, 443 Via Ortega Room 213

Teaching Assistants:

Kun-Hsing Yu
Office Hours: Mondays, 2:30-3:30 PM, Medical School Office Building X237
Office hour before midterm: Thursday Oct. 30 4:15-5:30 PM, Gates B3
No office hour on Nov. 3.

Tomer Altman
Office Hours: (starting October 28th)
Tuesdays, 1:30-2:30 PM, Medical School Office Building, X237
Simulcast using Google Hangouts

Winn Haynes
Office Hours: Wednesdays, 1-2 PM, Location:
- Oct 1 and Oct 8: Medical School Office Building X138
- Oct 22 and after: Medical School Office Building X237

Emily Tsang
Office Hours: Fridays, 3-4 PM, Medical School Office Building X237
Friday Dec. 5: 1:30-2:30 PM, MSOB X237

David Poznik
Office Hours: Mondays, 4:15-5:15pm, Littlefield 301 (southeast corner)
Note: Littlefield is located near the oval. If 301 is occupied, there will be a note on the door directing you to another room on the third floor.

Collin Melton
Office Hours: Tuesdays, 9:50am-10:50am, MSOB X271

Kun-Hsing Yu
Office Hours: Wednesdays, 10:30am-11:30am, Huang 16

Sameer Arya
Office Hours: Thursdays, 2:15pm-3:15pm, MSOB X275

Linda Szabo
Office Hours: Fridays, 9am-10am, MSOB X275

Contacting the TAs: The staff mailing list is biomedin214-aut1415-staff at
This list is for personal matters only. The staff will not respond to non-private questions on the mailing list. Please post them to Piazza instead (see below).

Announcements / Discussion:

All announcements will be posted as Instructor Notes on the Piazza forum:
Sign up now.

Please post questions about assignments to the Piazza forum so that all students can benefit from the answers. Students are encouraged to answer each other's questions, and participation points will be awarded accordingly. The TAs will monitor the forum to answer clarification questions and to endorse student answers.


Units:

Biomedin 214

  • This course is normally taken for 4 units.
  • It can be taken for 3 units by arrangement with instructor ONLY.

Biomedin 216 [by arrangement with instructor only]
Students must attend all lectures; absences must be approved by the instructor.

  • 1 unit: lectures only
  • 2 units: lectures, assignments, midterm, final

Grading:

The course will be graded by performance on 3 short homework assignments (15%), 4 programming projects (60%), a midterm (10%), a final (10%), and participation (5%).
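As a rough illustration of how these weights combine, here is a minimal Python sketch. The function name `course_score` and the component keys are hypothetical, and it assumes each component has already been reduced to a single score on a 0-100 scale; how individual assignments within a component are averaged is not specified here.

```python
# Weights in percent, taken from the grading breakdown; they sum to 100.
WEIGHTS = {
    "homework": 15,
    "projects": 60,
    "midterm": 10,
    "final": 10,
    "participation": 5,
}


def course_score(component_scores):
    """Weighted course score, given one 0-100 score per component.

    Assumption: each component is already summarized as a single number.
    """
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return weighted_sum / 100


if __name__ == "__main__":
    example = {
        "homework": 90,
        "projects": 80,
        "midterm": 70,
        "final": 75,
        "participation": 100,
    }
    print(course_score(example))  # 81.0
```

Because the projects carry 60% of the weight, a 10-point change in the project score moves the total by 6 points, while the same change on either exam moves it by 1.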

Participation is based on:

  • Attending class. To receive credit for attendance, you must physically sign the sheet. Please do not email later to request a retroactive sign-in.
  • Asking questions in class. Please identify yourself to Dr. Altman.
  • Contributing to the Piazza newsgroup by asking and/or answering good questions.

Code / Language Policy:

Familiarize yourself with the Code/Language Policy before starting the first programming assignment.

Exams:

Both exams are open-notes. No internet.
Midterm: 6:00-7:30 pm, Monday, November 3 (Week 7) LKSC 120 and 130.
Final: 12:15-3:15 pm, Monday, December 8 @ LK101/102
Last name A - F: Alway M208
Last name G - Z: Alway M114

Late Policy:

Each student is granted 168 "free" late hours that can be used as extensions for any project or assignment. This is a total of 168 hours for the entire quarter, not per assignment. Late time is measured with no distinction for weekends or holidays, and is rounded up to the next whole hour (thus, 10 minutes late = 1 hour late). After you use up all your free hours, your grade on late projects/assignments will be reduced by 0.5 percentage points for each late hour. So if your project is graded as 85 (of 100) but was turned in 24 hours late (beyond the free 168), the penalty is 24 x 0.5 = 12 points and we record a 73.
The clock runs the same for everyone, even those who join the course late.
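The arithmetic of the late policy can be sketched in a few lines of Python. This is only an illustration, not the staff's grading script: the names `late_hours` and `apply_late_policy` are hypothetical, and flooring the recorded score at 0 is an assumption the policy does not state.

```python
import math

FREE_LATE_HOURS = 168    # total free late hours for the whole quarter
PENALTY_PER_HOUR = 0.5   # percentage points lost per late hour beyond the free pool


def late_hours(minutes_late):
    """Round lateness up to whole hours (10 minutes late counts as 1 hour)."""
    if minutes_late <= 0:
        return 0
    return math.ceil(minutes_late / 60)


def apply_late_policy(raw_score, minutes_late, free_hours_left=FREE_LATE_HOURS):
    """Return (recorded_score, free_hours_remaining) for one submission."""
    hours = late_hours(minutes_late)
    covered = min(hours, free_hours_left)   # hours absorbed by the free pool
    penalized = hours - covered             # hours that cost points
    # Assumption: the recorded score never drops below 0.
    recorded = max(0.0, raw_score - PENALTY_PER_HOUR * penalized)
    return recorded, free_hours_left - covered


if __name__ == "__main__":
    # The example from the policy: 85 points, 24 hours late, no free hours left.
    print(apply_late_policy(85, 24 * 60, free_hours_left=0))  # (73.0, 0)
    # Ten minutes late with the full pool: no penalty, one free hour used.
    print(apply_late_policy(85, 10))  # (85.0, 167)
```

The free pool is passed in and returned so that the remaining balance can be carried from one submission to the next across the quarter.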

Partner Policy:

For assignments:
Students may discuss and work on problems in groups but must write up their own solutions. When writing up the solutions, students must write the names of people with whom they discussed the assignment.

For programming projects:
Students may discuss ideas with others. However, programs are to be completed independently and should be original work. Posting or sharing code, such as on Piazza or in a public repository, will be considered a violation of the Honor Code. If students choose to use a version control service such as GitHub, it is their responsibility to make sure that permissions are set so that their code is not accessible by anyone else. Names of students with whom programming ideas were discussed should be included with assignments and explicitly indicated in the header comments of all source code files.

If you do not list a particular student, this will be interpreted as an attestation that you did not speak with this student about the assignment or project.

Honor Code:

Students must abide by the terms of the Stanford Honor Code.

Auditors:

Auditors for the course should take it for one unit as BMI 216. This option requires attendance (and sign-in) at each lecture, but does not require completion of homework or exams. Auditors who want to sit in on the course without officially signing up for 1 unit of credit should get approval from Dr. Altman, and will also be asked to attend all lectures and sign in.

Prerequisites:

  1. Programming skills are required at the level of CS106B or CS106X. This course has a significant programming component, so students should enter it with the ability to create moderately complex data structures and implement algorithms using them. CS161 and CS108 would also be great, but you should be OK without them. Students who have attempted to take BMI214 with just CS106A under their belts have struggled, so caveat emptor. In particular, we highly recommend a good understanding of recursion for the first programming project, and the better a feel you have for classes, functions, and standard data structures (lists, dictionaries/maps, sets, etc), the easier a time you'll have. Acceptable languages are outlined in the code policy. If you're comfortable with the material of CS106B, but have never used Python, you'll be fine. However, we strongly advise that you learn it as soon as possible so that when you're working on the projects, you can focus on the algorithms rather than the syntax. Treat learning Python as your homework for the first week of class! Project 1 can be tricky to debug, and the more comfortable you are with Python, the fewer bugs you'll have to find :)
  2. Biology 40 or equivalent is recommended, since we will quickly move through many biology topics. We recommend that all students page through the Biology Tutorial on the sidebar of this site. For many students, this will be a quick review, but it might take an hour or two for those who do not think about biology very much.

Computer Resources:

  • You will need to use your SUNet ID. If you don't have a SUNet ID, see
  • Make sure that you are registered for the course on Axess so that you will receive email announcements sent to the course list
  • You will need to have access to email, the course website, and the Stanford FarmShare computing resource. All of these resources are available at Stanford computer labs (such as at Sweet Hall) as well as through remote access (SSH or VPN).
The Stanford FarmShare corn cluster is available for running compute-intensive jobs for the course. To log in to the corn cluster machines, use a secure shell (ssh) client.
On Windows: You will have to download a terminal emulator that supports ssh. Stanford offers a few free ones here; a popular one is PuTTY. Directions for using PuTTY to connect to corn:
1. Under "Host Name", enter
2. Under "Protocol", choose SSH
3. Press the "open" button.
A terminal window should appear, connected to corn. PuTTY will tell you if there was an error.
On Mac OS X, Unix, Linux:
1. Open a terminal window
2. Type ssh -X
NOTE: If you are unfamiliar with the Unix command-line, this is a helpful list of basic commands:
For more information on the various campus computers you can access:
You will also need to be able to transfer files to the corn machines using SCP. If your computer has a command-line scp, you can transfer a file with:
$ scp
Alternatively, GUI SFTP clients are available:
For Mac OS X:
There are also other options available from the Stanford IT websites:
Some course material will be placed on the course website in *.pdf (Adobe Acrobat) format, which allows the documents to be read on multiple platforms. Readers are available for free for Windows, Macintosh and many Unix platforms at the Adobe website.

Students with Documented Disabilities:

Students who may need an academic accommodation based on the impact of a disability must initiate the request with the Office of Accessible Education (OAE). Professional staff will evaluate the request with required documentation, recommend reasonable accommodations, and prepare an Accommodation Letter for faculty dated in the current quarter in which the request is being made. Students should contact the OAE as soon as possible since timely notice is needed to coordinate accommodations. The OAE is located at 563 Salvatierra Walk; phone: 723-1066; URL:

Optional Course Textbook:

Other Recommended books:

Note on courses in computational biology:

  • BMI 214 (also listed as CS 274, BioE 214, Genetics 214) is this course. It has been taught since 1996 and is an introduction to representations and algorithms for analysis of sequence, structure and function. It requires programming skills and aims to give an understanding of the molecular biological problems that arise, and how algorithms are developed to address them. It does not train students to be expert users of tools, but gives them an in-depth knowledge of some tools and a broad introduction to the technical issues in analysis of biological data.

There are several other courses at Stanford that may cover some overlapping material, but every instructor explains things differently and with different emphasis. (All opinions are mine, and they may be wrong. I haven't taken any of these classes, just read about them and talked a bit to the instructors, so talk to someone who has!)

  • BMI 217 (Atul Butte, Translational Bioinformatics): covers some similar topics in RNA expression analysis (clustering/classification) and genome analysis, but links these more to clinical concepts, and also has an independent project at the end. Students routinely take this class and BMI 214.
  • BMI 215 (Nigam Shah, Data Driven Medicine): covers some natural language processing, but not at the molecular level; it also covers ontologies, which are not a focus of our course. Students routinely take this class and BMI 214.
  • BMI 258 (Biochemistry 158, Doug Brutlag, Genomics, Bioinformatics, and Medicine) is an introduction to genomics, bioinformatics and applications to medicine, and goes into more detail on the clinical implications of genomics.
  • BMI 231 (Biochemistry 218, Doug Brutlag, Computational Molecular Biology): (no longer offered, but course videos and lecture slides are available) this is more for tool users, but does cover some similar issues in genomics, sequence and structure analysis, in a complementary way to this class. No programming required. Students who have taken this and BMI 214 report some overlap, but very different perspectives provided.
  • BMI 262 (CS 262, Serafim Batzoglou, Computational Genomics): covers very similar concepts in sequence analysis more deeply and from a computer science perspective (dynamic programming alignment, HMM, Gibbs sampling, context-free grammars, phylogenetics), but does not overlap very much in 3D structure or function analysis. Students who have taken this and BMI 214 say that this is more rigorous in terms of algorithms, complexity, etc., and so somewhat complementary.
  • BMI 273A (CS 173, CS 273A, Serafim Batzoglou & Gill Bejerano, Computational Tour of Human Genome): both cover details of genome sequencing, genome organization and informatics analysis. Programming encouraged but not required. I do not have data about overlap of this class and BMI 214.
  • Genetics 211 (Mike Cherry & Gavin Sherlock): This is an intro to genomics through Python programming and many genetics students have told me that it is a great warmup for BMI 214 if you are not as strong at programming.