Degree Requirements – Machine Learning Sub-Plan

The Machine Learning Sub-Plan includes an additional core course in Machine Learning and a variety of focused electives. Machine learning helps manage and interpret large amounts of data by automating the processes by which models of data are built. This emphasis will prepare graduates to be innovative scientific leaders across sectors who understand the complexities of machine learning as a particular kind of data science.

Plan of Study

You should work with your faculty to develop a Master’s Plan of Study during your first few months in the program. The Plan of Study should be submitted to the Graduate College no later than your second semester in the program.

The Master’s Plan of Study identifies 1) courses you intend to transfer from other institutions; 2) courses already completed at the University of Arizona which you intend to apply toward the graduate degree; and 3) additional coursework to be completed to fulfill degree requirements. The Plan of Study must have the approval of the Director of Graduate Studies before it can be submitted to the Graduate College.

Core Courses

  • 9 units total

This course introduces fundamental ideas of the Information Age, focusing on the value, organization, use, and processing of information. The course is organized as a survey of these ideas, with readings from the research literature. Specific topics (e.g., visualization, retrieval) will be covered by guest faculty who research in each of these areas.

Machine learning describes the development of algorithms that can modify their internal parameters (i.e., "learn") to recognize patterns and make decisions based on example data. These examples can be provided by a human, or they can be gathered automatically as part of the learning algorithm itself. This course will introduce the fundamentals of machine learning, describe how to implement several practical methods for pattern recognition, feature selection, clustering, and decision making for reward maximization, and provide a foundation for the development of new machine learning algorithms.

Experiential Courses

Complete 3 units total. More information on experiential courses is available on our internships and individual studies pages.

The Internship is intended to provide an opportunity for students to build on what they have mastered in the program and to practice their knowledge and skills in the real world. The internship should be relevant to the student's degree competencies and contribute to the development and reinforcement of the student's knowledge and skill sets in the field of Information Science. The student should propose an internship plan and then identify an internship site supervisor, who typically is external. The site supervisor and the graduate advisor of the school need to approve the plan prior to course registration. The plan should include goals for the internship, degree competencies addressed by the internship, expected tasks to be completed, a work schedule, and the assessment plan. The amount of work should be appropriate for the units registered (3 units = 135 hours). The internship may be paid or unpaid. A student may take an internship in the same organization where the student is employed, but the work planned for the internship needs to be clearly separate from the work expected as part of that employment. At the conclusion of the internship, the site supervisor is expected to submit a written assessment of the student's work.

The Capstone Project is intended to provide an opportunity for students to show off what they have mastered in the program. The project should be relevant to MS degree competencies and contribute to the development and reinforcement of the student's knowledge and skill sets in the field of Information Science. The student should propose a project plan, and the faculty advisor should approve it before registration. The project plan should include goals for the project, MS competencies addressed by the project, system design, an implementation schedule, and the assessment plan. The project plan should also include reasonable milestones and checkpoints. The amount of work should be appropriate for a 3-unit course. The primary faculty advisor must be an SI faculty member, but faculty members from other units may participate in advising the student.

Capstone Project

For either course:

  • Identify your internship supervisor (INFO 693) or iSchool faculty supervisor (INFO 698)
  • Request an experience via Handshake as described on our internships and individual studies pages
  • The internship or capstone project must exercise all competencies required for the M.S. degree
  • The internship or capstone project must have a software development component. Capstone code must be deposited in GitHub or another source code repository
  • Upon completing the internship or capstone project, submit a report (5,000-6,000 words in length) in the form of an academic paper, documenting what has been accomplished and explaining how the competencies have been demonstrated
  • Your supervisor(s) will complete a competencies evaluation form, evaluate the project, and assign a pass/fail grade

You must submit your application in Handshake. More information can be found on the individual studies page.

Machine Learning Elective Courses

  • Minimum 9 units

Choose three courses (Minimum 9 units) from the following:

Bayesian modeling and inference is a powerful modern approach to representing the statistics of the world, reasoning about the world in the face of uncertainty, and learning about it from data. It cleanly separates the notions of representation, reasoning, and learning. It provides a principled framework for combining multiple sources of information, such as prior knowledge about the world, with evidence about a particular case in observed data. This course will provide a solid introduction to the methodology and associated techniques, and show how they are applied in diverse domains ranging from computer vision to molecular biology to astronomy. Graduate-level requirements include different exams requiring greater depth of understanding of topics, and graduate students will be assigned questions based on graduate-specific assignment topics.

This course will introduce students to the concepts and techniques of data mining for knowledge discovery. It includes methods developed in the fields of statistics, large-scale data analytics, machine learning, pattern recognition, database technology, and artificial intelligence for automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns. Topics include understanding varieties of data, data preprocessing, classification, association and correlation rule analysis, cluster analysis, outlier detection, and data mining trends and research frontiers. We will use software packages for data mining, explaining the underlying algorithms and their use and limitations. The course includes laboratory exercises, with data mining case studies using data from many different sources such as social networks, linguistics, geo-spatial applications, marketing, and/or psychology.

This course introduces the key concepts underlying statistical natural language processing. Students will learn a variety of techniques for the computational modeling of natural language, including n-gram models, smoothing, hidden Markov models, Bayesian inference, expectation maximization, the Viterbi algorithm, the inside-outside algorithm for probabilistic context-free grammars, and higher-order language models. Graduate-level requirements include assignments of greater scope than undergraduate assignments. In addition to being more in-depth, graduate assignments are typically longer, and additional readings are required.

This course provides a broad technical introduction to the tools, techniques and concepts of artificial intelligence. The course will focus on methods for automating decision making under a variety of conditions, including full and partial information, and dealing with uncertainty. Students will gain practical experience writing programs that use these techniques to solve a variety of problems.

Topics include problem solving (search spaces, uninformed and informed search, games, and constraint satisfaction), principles of knowledge representation and reasoning (propositional and first-order logic, logical inference, planning), and representing and reasoning with uncertainty (decision theory, reinforcement learning, Bayesian networks, probabilistic inference, basic discrete-time probabilistic models).

Most web data today consists of unstructured text. This course will cover the fundamental knowledge necessary to organize such texts, search them in a meaningful way, and extract relevant information from them. This course will teach natural language processing through the design and development of end-to-end natural language understanding applications, including sentiment analysis (e.g., is this review positive or negative?), information extraction (e.g., extracting named entities and their relations from text), and question answering (retrieving exact answers to natural language questions such as "What is the capital of France?" from large document collections). We will use several natural language processing toolkits, such as NLTK and Stanford's CoreNLP. The main programming language used in the course will be Python, but code written in Java or Scala will be accepted as well. Graduate-level requirements include implementing more complex, state-of-the-art algorithms for the three proposed projects. This will require additional reading of conference papers and journal articles.

Most web data today consists of unstructured text. The existence of this data matters only if it is made available in a way that lets users quickly find information relevant to their needs. This course will cover the fundamental knowledge necessary to build such systems, including web crawling, index construction and compression, Boolean, vector-based, and probabilistic retrieval models, text classification and clustering, link analysis algorithms such as PageRank, and computational advertising. Students will also complete one programming project, in which they will construct a complex application that combines multiple algorithms into a system that solves real-world problems. Graduate-level requirements include implementing more complex, state-of-the-art algorithms for the programming project, which might require additional reading of research articles. Written assignments will have additional questions for graduate students.

Neural networks are a branch of machine learning that combines a large number of simple computational units to allow computers to learn from and generalize over complex patterns in data. Students in this course will learn how to train and optimize feedforward, convolutional, and recurrent neural networks for tasks such as text classification, image recognition, and game playing.

General Elective Courses

  • Minimum 9 units
  • Choose three elective courses with the INFO prefix
  • Up to two elective courses may be substituted from other academic units with advisor approval