Instructors: Professor Beth Plale and Chathura Herath
Plale: LH301C, 812-855-4373, e-mail, office hours Tue 3:00 - 5:00pm
Herath: e-mail. Chathura is also available as chathurah on Skype, Yahoo, and Gmail.
Associate Instructors:
Yuan Luo: LH301H, e-mail, office hours Thu 2:30 - 3:50 p.m., or by appointment.
Pairoj Rattadilok: LH310, e-mail, office hours Wed 2:00 - 3:30 p.m., or by appointment.
Topics and Agenda
Goals of distributed systems | 13 Jan |
LEAD Portal for scientific discovery | 15 Jan |
Architectures: web services, cloud computing, overlay networks, ... | 19-28 Jan |
Virtualization and Communication: threads, VMs, RPC, ... | 2-18 Feb |
Performance and benchmarks | 23-25 Feb |
Midterm exam | 2 Mar |
Naming: global naming, name spaces | 4 Mar |
Synchronization: clocks, elections, mutual exclusion | 9-11 Mar |
Spring Break | 16, 18 Mar |
Consistency: data centric, eventual | 23-25 Mar |
Fault tolerance: resilience, reliable group communication, recovery | 30 Mar - 01 Apr |
Distributed file systems and distributed storage systems: Google File System, Big Table, NFS ... | 6-13 Apr |
Workflow systems and geoscience informatics: research topics | 15-20 Apr |
Student presentations of synthesis paper (grad only) | 22-29 Apr |
Final Exam | Mon 3 May 7:15-9:15 p.m. |
Textbook and materials: The course textbook is Distributed Systems: Principles and Paradigms, 2nd Ed., by Andrew S. Tanenbaum and Maarten van Steen (Prentice Hall, 2007). You are strongly advised to get the book. Other readings will come from conference and journal papers that can be downloaded from sources such as the IEEE Digital Library, the ACM Digital Library, or Citeseer. We will be using the Oncourse site for this course.
Abstracts
You will write abstracts for assigned readings from papers; there will be about a dozen papers in all. The abstract serves to organize your thoughts for the class discussion. It should be about 500 words in length and should i.) identify the problem being solved, ii.) identify the solution the author proposes and how the author validates it, and iii.) provide an assessment of the importance of the work. Abstracts will be submitted via Oncourse and are due at the beginning of the class in which the paper is discussed.
B534 enrollees will be responsible for submitting abstracts for all 12 papers; B490 enrollees will be responsible for submitting 9 out of 12 of the abstracts.
Required Readings
Goals of Distributed Systems (12-14 Jan) | Chapter 1, Sections 1.1 - 1.3 |
Architectures (19-28 Jan) |
Chapter 2, Sections 2.1, 2.2
Curbera, F., et al. Unraveling the Web Services Web: an Introduction to SOAP, WSDL, and UDDI, IEEE Internet Computing, 6, 2, Mar/Apr 2002 [Link]
Curbera, F. et al. The Next Step in Web Services: How three specifications support creating robust service compositions, CACM 46, 10, Oct 2003 [Link]
Armbrust, M. et al. Above the Clouds: A Berkeley View of Cloud Computing, U Calif Berkeley Tech Report UCB/EECS-2009-28, Feb 2009 [Link] |
Virtualization and Communication (2-18 Feb) |
Chapter 3, Sections 3.1 - 3.4
Chapter 4, Sections 4.1 - 4.3
Barham, P. et al., Xen and the Art of Virtualization, ACM Symposium on Operating Systems Principles, 2003. [Link] |
Performance Evaluation (23-25 Feb) | Vivek S. Pai, Peter Druschel, and Willy Zwaenepoel, Flash: An Efficient and Portable Web Server, Proceedings of the USENIX 1999 Annual Technical Conference Monterey, CA, June 1999 [Link] |
Naming (2-4 Mar) | Chapter 5, Sec 5.1 – 5.4 |
Synchronization (9-11 Mar) |
Chapter 6, Sec 6.1, 6.3
Lamport, L., Time, Clocks, and the Ordering of Events in a Distributed System, Communications of the ACM, 21, 7, Jul 1978 [Link] |
Consistency (23-25 Mar) |
Vogels, W., Eventually Consistent, Communications of the ACM, 52, 1, Jan 2009 [Link]
Terry, D.B., et al. Session Guarantees for Weakly Consistent Replicated Data, Proceedings of the ACM Third International Conference on Parallel and Distributed Information Systems, 1994 [Link] |
Fault Tolerance (30 Mar – 01 Apr) | Chapter 8, Sections 8.1, 8.2, 8.3, 8.6 |
Distributed File and Storage Systems (6 – 13 Apr) |
Chapter 11, Sections 11.1 – 11.9
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber, Bigtable: A Distributed Storage System for Structured Data, OSDI 2006. [Link]
Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, The Google File System, 19th ACM Symposium on Operating Systems Principles, Lake George, NY, October 2003. [Link]
Santry, D., et al. Deciding when to forget in the Elephant file system, ACM Symposium on Operating Systems Principles, 1999 [Link] |
Workflow Systems and Geoscience Informatics (15-22 Apr) | Paper TBA |
Projects
The course includes three projects: two programming projects and one synthesis paper. Students enrolled in B490 will do the two programming projects but not the synthesis paper.
The programming projects require programming experience and will build your systems programming skills. Distributed systems today are too large for any one person to write, so the systems programmer must be comfortable working with APIs, libraries, and code from other programmers and other organizations. You will likely work in Java on a Linux platform, though you may choose other languages or platforms. The programming projects are group projects. The project grade will be based on a demo, the quality of the code, and a written report.
For the synthesis paper (B534 students), you will research an area of distributed systems by selecting and reading three related conference papers from selective conference venues. From these readings you will develop a taxonomy and use the taxonomy to guide you in structuring your paper. The synthesis paper can be a breakthrough experience in independent scholarship for a student. When coupled with one of the projects or outside work, the synthesis paper can provide a path for independent scholarship beyond the spring semester.
The course prerequisite is CSCI P536 Advanced Operating Systems, CSCI P436 Operating Systems, or consent of instructor. The prerequisite exists because the programming projects are intended as a systems programming experience that builds on core competency in synchronization, concurrency, file systems, and single-image programming. If you think you have the requisite skills but have not taken P536 or P436, talk to the instructor.
Grading
The course grade is determined by the student's performance across three areas: projects (50%), readings and discussion (25%), and exams (25%).
Academic Misconduct
Your academic conduct while taking this course is bound by the IU Code of Student Rights, Responsibilities, and Conduct. In particular, Part II discusses your responsibility to uphold and maintain academic and professional honesty and integrity: http://www.iu.edu/~code/code/responsibilities/academic/index.shtml