CSI 5v93: Machine Learning, Spring 2005

Announcements

Thursday, 3/24/2005
The description of the paper presentation is posted. Please read it and get started on the paper presentation.
Tuesday, 3/1/2005
The fourth assignment is posted. It is due on March 22 and involves a considerable amount of work; please start on it immediately.
Tuesday, 2/15/2005
Here are my accuracy results for the k-nearest neighbor classifier on the ZIP data, trained on the training set. You can compare these with your own results. Note that the test accuracy rises slightly from k = 1 to k = 5.
accuracy      k = 1  k = 5  k = 25  k = 125
training set  1.000  0.979  0.947   0.890
test set      0.943  0.944  0.918   0.855
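For anyone checking their own numbers, here is a minimal sketch of how such accuracies can be computed. It is not the code used to produce the results above: the toy data, the function names `knn_predict` and `accuracy`, and the tie-breaking rule are all assumptions made for illustration, and the ZIP data is not loaded.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    # Assumed tie rule: most_common keeps first-seen order, so ties favor the closer neighbor.
    return votes.most_common(1)[0][0]


def accuracy(train_x, train_y, eval_x, eval_y, k):
    """Fraction of evaluation points whose predicted label matches the true label."""
    correct = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(eval_x, eval_y))
    return correct / len(eval_x)


# Hypothetical toy data (two 2-D classes), standing in for the ZIP digits.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [(0.5, 0.5), (5.5, 5.5), (1.0, 1.0), (4.0, 4.0)]
test_y = [0, 1, 0, 1]

for k in (1, 3, 5):
    train_acc = accuracy(train_x, train_y, train_x, train_y, k)
    test_acc = accuracy(train_x, train_y, test_x, test_y, k)
    print(f"k = {k}: training {train_acc:.3f}, test {test_acc:.3f}")
```

With k = 1, every training point is its own nearest neighbor, so training accuracy is 1.000 as long as no two identical points carry different labels. This brute-force version sorts all training points for every query, which is simple but slow on larger data sets.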
Friday, 2/11/2005
The third assignment is posted.
Friday, 1/28/2005
Please see the second assignment for an update (see the announcement at the top of the page).
Thursday, 1/27/2005
The second assignment is posted, and the due date has been changed to February 8th.
Tuesday, 1/11/2005
Welcome to the course! The first assignment is posted.

Objectives

This is a course in machine learning, a subfield of artificial intelligence. Machine learning is a large, interesting, and fast-growing field. The central problem we address in this class is how to use a computer to build models that learn, or make inferences, from data. We then want to use the learned models to make predictions about unknowns.

This course covers:

This list of topics is optimistic. Be prepared to invest the time necessary to understand the concepts, and to do the programming projects. My best advice is to attend the lectures, read the book, ask questions, and start projects early.

Practical information

Lectures are from 8:00 AM to 9:00 AM in Rogers 312 on Tuesdays and Thursdays.

My office is in the Rogers Engineering and Computer Science building. My office hours are Tuesday through Friday, 10-11 AM, and by appointment. I am often in my office and am glad to talk to students.

Schedule

Here is an aggressive schedule of the material we will cover:

Week  Dates       New topics                     Chapters    Tuesday                                   Thursday
1     Jan 11, 13  Introduction                   1, 2        Notes, Homework 1 assigned                Notes
2     Jan 18, 20  -                              -           Notes                                     Notes, Homework 1 due
3     Jan 25, 27  Linear regression methods      3           Notes, Homework 2 assigned                Notes
4     Feb 1, 3    -                              -           Notes                                     Notes
5     Feb 8, 10   Linear classification methods  4           Notes, Homework 2 due                     Notes, Homework 3 assigned
6     Feb 15, 17  -                              -           Notes                                     Case study
7     Feb 22, 24  Bayesian learning              Mitchell 6  Case study, Homework 3 due                Notes
8     Mar 1, 3    -                              -           Notes, Homework 4 assigned                Notes
9     Mar 8, 10   -                              -           Notes                                     Notes
10    Mar 15, 17  -                              -           spring break                              spring break
11    Mar 22, 24  SVMs                           12          Notes                                     Notes, Homework 4 due, Paper presentations assigned
12    Mar 29, 31  -                              -           Notes                                     Notes
13    Apr 5, 7    Paper presentations            -           Notes                                     Paper presentations
14    Apr 12, 14  -                              -           Paper presentations, Homework 5 assigned  diadeloso
15    Apr 19, 21  Unsupervised learning          14          Notes                                     Notes
16    Apr 26, 28  -                              -           Notes                                     Notes, Homework 5 due

The final exam is on Saturday, May 7th, from 2 to 4 PM.

Textbooks & resources

Required text: we will be using The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. You can purchase this book from the Baylor bookstore or Amazon, among other places.

Optional text: Machine Learning by Tom Mitchell.

Further online resources:

Grading

Grades will be assigned based on this breakdown:

Here is a tentative grading scale:
A: 90-100, B+: 88-89, B: 80-87, C+: 78-79, C: 70-77, D: 60-69, F: 0-59

Some projects may be worth more than others. Exams are closed-book. The final will be comprehensive.

Policies

Academic honesty

I take academic honesty very seriously.

Many studies, including one by Sheilah Maramark and Mindi Barth Maline, have suggested that "some students cheat because of ignorance, uncertainty, or confusion regarding what behaviors constitute dishonesty" (Maramark and Maline, Issues in Education: Academic Dishonesty Among College Students, U.S. Department of Education, Office of Research, August 1993, page 5). In an effort to reduce misunderstandings in this course, a minimal list of activities that will be considered cheating is given below.


Copyright © 2004 Greg Hamerly, with some content taken from a syllabus by Jeff Donahoo.
Computer Science Department
Baylor University
