Master's in Data Science


 


Introduction

The School of Computing and Information Sciences (SCIS) offers an interdisciplinary Master of Science degree in Data Science (MS – Data Science) to impart broad and deep technical training in data science, drawing on faculty expertise across the FIU campus, and allowing for specialization in several key application areas of importance to the industry. This MS program in Data Science will prepare the students for the global marketplace where major decisions in every discipline are becoming increasingly “data-driven”.

The MS degree program in Data Science will have separate tracks to prepare students to become data scientists with specializations in areas such as Computational Data Analytics, Business Data Analytics, Hospitality Data Analytics, and Biostatistics Data Analytics. The program will be a technical degree program that will admit students with sound analytical skills in one of many disciplines and will impart rigorous data analysis skills appropriate for their discipline. Thus, the program will encompass courses allowing the study of data science from a variety of professional and academic perspectives.


Outcomes

The following student outcomes are expected for this program:

  1. Content Knowledge: Proficiency in one or more of the toolkits for comprehensive data analysis and visualization.
  2. Critical Thinking:  Proficiency in modern data analytics – Students will demonstrate proficiency in design and implementation of a comprehensive analysis of an industry-relevant data set.
  3. Communication: Oral Communication Skills in Data Analytics – Students will demonstrate effective oral presentation skills in the field of data analytics.
  4. Communication: Written Communication Skills in Data Analytics – Students will demonstrate effective written communication skills in the field of data analytics.


Contact

  • Contact Carlos Cabrera (grad-info@cis.fiu.edu) for admission information, application, documents, and status.
  • For curricular information contact:
    • Prof. Giri Narasimhan (giri@cis.fiu.edu) [Computational Data Analytics track]
    • Dr. Richard Klein [Business Analytics track]
    • Prof. Miguel Alonso [Hospitality Analytics track]
    • Prof. Changwon Yoo [Biostatistics Data Analytics track]


Admission Requirements

The following admission requirements for the MS – Data Science program are in addition to the University’s graduate admission requirements.

  • The program will admit students with a Bachelor’s degree in a discipline that is appropriate for the specialization sought. For example, a student seeking to specialize in Computational Data Analytics would be required to have a Bachelor’s degree in Computer Science, Computer Engineering, Information Technology, Mathematics, Statistics, or a related discipline. Students seeking to specialize in other tracks would be required to have an appropriate Bachelor’s degree.
  • ‘B’ average or better in all course work attempted while registered as an upper-division student in the Bachelor’s program, and a GRE general test score with a minimum quantitative score of 148.
  • Three letters of recommendation from persons in a position to judge the applicant’s potential success in graduate study.
  • International graduate student applicants whose native language is not English are required to submit a score for the Test of English as a Foreign Language (TOEFL) or for the International English Language Testing System (IELTS). A total score of 80 on the iBT TOEFL or 6.5 overall on the IELTS is required.
  • Approval of the corresponding Graduate Committee.


Curriculum

The proposed MS – Data Science program will be course based, requiring 30 credits from 10 courses. Because of its interdisciplinary nature, several specialization tracks are planned across multiple disciplines (computer science, business, public health, hospitality). However, a common required core of 4 courses plus a capstone is proposed for all specialization tracks, together with 5 additional courses to be chosen from an approved set of courses from a specialization track. The core courses are carefully designed to include three complementary perspectives: (1) common core principles governing Data Science across all application areas; (2) a practical, hands-on study of basic data analysis tools; and (3) a study of the broader context of Data Science as applied to multiple application areas. The foundations will be rigorous and are meant for students with proven quantitative skills. The goal is to understand both the theory and the practice of Data Science, with perspectives from both the computational side and the statistical side.
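As a quick arithmetic check of the structure described above, the following Python sketch tallies the courses and credits. The value of 3 credits per course is an assumption inferred from the 30-credit, 10-course total (the capstone is offered at variable credit but totals 3), and all variable names are illustrative only.

```python
# Assumption: every course counts for 3 credits (30 credits / 10 courses).
CREDITS_PER_COURSE = 3

core_courses = 4       # common required core
capstone_courses = 1   # required capstone
track_electives = 5    # chosen from one specialization track

total_courses = core_courses + capstone_courses + track_electives
total_credits = total_courses * CREDITS_PER_COURSE

# Credits for the two blocks of the curriculum.
core_and_capstone_credits = (core_courses + capstone_courses) * CREDITS_PER_COURSE
track_credits = track_electives * CREDITS_PER_COURSE

print(f"{total_courses} courses, {total_credits} credits")
print(f"core + capstone: {core_and_capstone_credits}, track: {track_credits}")
```

Under this assumption the core plus capstone and the specialization track each account for 15 credits, matching the section headings that follow.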

Core Courses and Capstone (15 credits):

The four core courses impart principles fundamental to Data Science; two are offered by the School of Computing and Information Sciences (SCIS) and two by the Department of Statistics. One of the four core courses is new and its precise course prefix and number will be determined later once the course is approved through the curriculum process.

CAP 5768                   Introduction to Data Science (new course) 3

CAP 5771 (or COP 5577)     Principles of Data Mining 3

STA 6244                   Data Analysis I (or equivalent course) 3

STA 6247                   Data Analysis II (or equivalent course) 3

Biostatistics students may replace STA 6244 and STA 6247 with:

PHC 6052                   Biostatistics 1 (equivalent to STA 6244 Data Analysis I)

PHC 6091                   Biostatistics 2 (equivalent to STA 6247 Data Analysis II)

 

Required Capstone Course: Each track will include one capstone course, designed specifically for that purpose. It will involve a large data analysis project that synthesizes the student’s learning from the MS degree program. It will be offered at variable credit (1-3) so that a student can complete it over two semesters.

IDC 6940           Capstone Course in Data Science 3

ISM 6930         Special Topics in Management Information Systems (Required for Business)

 

Specialization Tracks (15 credits each):

Several specialization tracks have been developed to cater to enrolled students with different backgrounds, needs, and program specializations. Five elective courses are to be selected from the set of elective graduate courses for the chosen track. With the permission of the academic advisor, students may be allowed to combine courses from more than one elective sequence if it enables better thematic specialization.

Computational Data Analytics: Within this track, students with computing majors can readily design course sequences that help them specialize in Bioinformatics, Medical Informatics, Financial computing, Network Traffic Analysis, Computing Forensics, Big Data algorithms, and much more.

CAP 5510C       Introduction to Bioinformatics

CAP 5610         Introduction to Machine Learning

CAP 5738         Data Visualization

CAP 6776         Advanced Topics in Information Retrieval

CAP 6778         Advanced Topics in Data Mining

CEN 5082        Grid Enablement of Scientific Applications

CIS 5372          Fundamentals of Computer Security

CIS 5374          Information Security and Privacy

CIS 6931          Special Topics: Advanced Topics in Information Processing

COP 6727        Advanced Database Systems

COT 6405        Analysis of Algorithms

COT 6936        Topics in Algorithms

TCN 6420        Modeling and Performance Evaluation of Telecommunications Networks

EEL 6803         Advanced Digital Forensics Engineering (taught as special topics course)

STA 6636         High Dimension Data Analysis

 


Business Analytics: The Business Analytics Track seeks applicants with a highly quantitative undergraduate business degree, including Accounting, Finance, or Information Systems. The program also encourages applicants with degrees in Computer Science, Industrial Engineering, Mathematics, and Statistics. Applicants should have earned grades of B or better in all undergraduate mathematics, statistics, and quantitative methods coursework. Applicants with several years of work experience in a quantitative role may also be competitive without a relevant undergraduate degree or coursework. The program looks for GRE scores of 148 and up, particularly on the quantitative reasoning component of the exam.

ISM 6136         Business Analytics Applications

ISM 6128         Business Process Design

ISM 6208         Data Warehousing and Data Visualization for Business

ISM 6251         Emerging Information Technologies (Unstructured Data and Web Analytics)

CAP 6778         Advanced Topics in Data Mining

STA 6636         High Dimension Data Analysis

 

Hospitality Analytics:

HMG 6xxx        Data Science in Hospitality

HMG 6xxx        Customer Experience Design & Behavior Analysis

HMG 6xxx        Revenue Optimization Science

HMG 6xxx        Travel and Tourism Data Analysis

CAP 5610         Introduction to Machine Learning

CAP 5738         Data Visualization

CIS 5372          Fundamentals of Computer Security

 

Biostatistics Data Analytics:

PHC 6064         Models for Binary Public Health Outcomes

PHC 6067         Probabilistic Graphical Models

PHC 6056         Longitudinal Health Data Analysis

PHC 6060         Principles and Approaches to Biostatistical Consulting

PHC 6059         Cohort Studies and Lifetime Events in Public Health

PHC 6093         Biostatistical Data Management Concepts and Procedures



Sample Course of Study

Fall

Course Number    Title                              Credits
CAP 5768         Introduction to Data Science       3
STA 6244         Data Analysis I                    3
CAP 5771         Principles of Data Mining          3

Spring

Course Number    Title                              Credits
STA 6247         Data Analysis II                   3
                 Specialization Track course        3
                 Specialization Track course        3

Summer

Course Number    Title                              Credits
IDC 6xxx         Capstone Course in Data Science    2

Fall

Course Number    Title                              Credits
IDC 6xxx         Capstone Course in Data Science    1
                 Specialization Track course        3
                 Specialization Track course        3
                 Specialization Track course        3


Course Descriptions

Core Courses

CAP 5768 Introduction to Data Science: Introduction to the basic principles governing the management and analysis of data.
CAP 5771 Principles of Data Mining: Introduction to data mining concepts, knowledge representation, inferring rules, statistical modeling, decision trees, association rules, classification rules, clustering, predictive models, and instance-based learning.
STA 6244 Data Analysis I: Introduction; Review of Probability; Collecting Data; Exploring and Summarizing Data; Sampling Distributions of Statistics; Basic Concepts of Inference; Inference for a Single Population; Inference for Two Samples; Inference for Proportions and Count Data; Simple Linear Regression.
STA 6247 Data Analysis II: Correlation Analysis; Multiple Linear Regression; Analysis of Single-Factor Experiments; Two-Factor Experiments with Fixed Crossed Factors; Nonparametric Statistical Methods; Time Series Analysis.
IDC 6940 Capstone Course in Data Science: Project course to synthesize concepts from databases, analytics, visualization, and management of data. The project will use Python, SQL, R, and/or other specialized data analysis toolkits to solve data science problems in specific application areas.

 

Computational Data Analytics

CAP 5510C Introduction to Bioinformatics: Introduction to bioinformatics; algorithmic, analytical, and predictive tools and techniques; programming and visualization tools; machine learning; pattern discovery; analysis of sequence alignments, phylogeny data, gene expression data, and protein structure.
CAP 5610 Introduction to Machine Learning: Decision trees, Bayesian learning, and reinforcement learning, as well as theoretical concepts such as inductive bias, PAC learning, and the minimum description length principle.
CAP 5738 Data Visualization: Advanced class on data visualization principles and techniques. Students propose, implement, and present a project with strong collaborative and visual components.
CAP 6776 Advanced Topics in Information Retrieval: Information Retrieval (IR) principles including indexing and searching document collections, as well as advanced IR topics such as Web search and IR-style search in databases.
CAP 6778 Advanced Topics in Data Mining: Web, stream data, and relational data mining, graph mining, spatiotemporal data mining, privacy-preserving data mining, high-dimensional data clustering, social network, and linkage analysis.
CEN 5082 Grid Enablement of Scientific Applications: Fundamental principles and applications of high-performance computing and parallel programming using OpenMP, MPI, Globus Toolkit, Web Services, and Grid Services.
CIS 5372 Fundamentals of Computer Security: Information assurance algorithms and techniques. Security vulnerabilities. Symmetric and public key encryption. Authentication and Kerberos. Key infrastructure and certificates. Mathematical foundations.
CIS 5374 Information Security and Privacy: Information Security Planning, Planning for Contingencies, Policy, Security Program, Security Management Models, Database Security, Privacy, Information Security Analysis, Protection Mechanisms.
CIS 6931 Special Topics: Advanced Topics in Information Processing: Selected special topics in information processing.
COP 6727 Advanced Database Systems: Design, architecture, and implementation aspects of DBMS, distributed databases, and advanced aspects of databases selected by the instructor.
COT 6405 Analysis of Algorithms: Design of advanced data structures and algorithms; advanced analysis techniques; lower bound proofs; advanced algorithms for graph, string, geometric, and numerical problems; approximation algorithms; randomized and online algorithms.
COT 6936 Topics in Algorithms: Advanced data structures, pattern matching algorithms, file compression, cryptography, computational geometry, numerical algorithms, combinatorial optimization algorithms, and additional topics.
TCN 6420 Modeling and Performance Evaluation of Telecommunications Networks: Covers methods and research issues in the modeling and performance evaluation of high-speed and cellular networks. Focuses on tools from Markov queues and queuing network theory and their applications.
EEL 6803 Advanced Digital Forensics Engineering: This course provides students with the advanced skills to track and counter a wide range of sophisticated threats including espionage, hacktivism, financial crime syndication, and APT groups.
STA 6636 High Dimension Data Analysis: Statistical techniques used to analyze high-dimensional data sets. Topics include machine learning, high-dimensional data, discriminant analysis, and clustering.

 

Business Analytics

ISM 6136 Business Analytics Applications: The course will give students the skills needed to manage and deliver business intelligence (BI). It covers data warehouse concepts, dimensional modeling, OLAP cubes, advanced reporting and visualization, and data mining.
ISM 6128 Business Process Design: The course covers fundamental concepts, principles, and techniques that can be used to improve business performance through the analysis, modeling, and design of the as-is and the to-be business processes.
ISM 6208 Data Warehousing and Data Visualization for Business: Data Warehousing and Online Analytical Processing tools will be utilized to organize and analyze large volumes of data in order to explain the past, monitor the present, and anticipate the future.
ISM 6251 Emerging Information Technologies (Unstructured Data and Web Analytics): This course covers emerging information and communication technologies that are changing the way business is conducted in the global economy.

 

Biostatistics Data Analytics

PHC 6052 Biostatistics 1: An introduction to basic biostatistical techniques for MPH students majoring in Biostatistics, but also open to those seeking a thorough understanding of and ability to use the essential biostatistical procedures. Prerequisites: familiarity with basic algebra and basic calculus.
PHC 6091 Biostatistics 2: Continuation of Biostatistics 1. Covers advanced methods for ANOVA, different regression and correlation techniques, and survival analyses. Prerequisite: PHC 6052.
PHC 6064 Models for Binary Public Health Outcomes: This course will offer students a focused introduction to statistical models for the analysis of binary medical and public health data. The course will provide an introduction to the application of statistical models for public health outcomes in epidemiology, dietetics, and nursing. Prerequisites: PHC 6052 or permission of the instructor.
PHC 6067 Probabilistic Graphical Models: Concepts and implementation of Probabilistic Graphical Models, comparative study of the models, and their suitability for various datasets. Prerequisites: PHC 6052, PHC 6091, or permission of the instructor.
PHC 6056 Longitudinal Health Data Analysis: Applied longitudinal health data analysis; methods to compare different health treatments and behavioral interventions. The focus will be on models for single and multiple correlated public health outcomes. Prerequisites: PHC 6052, PHC 6091, or permission of the instructor.
PHC 6060 Principles and Approaches to Biostatistical Consulting: The course specifically addresses the process of providing biostatistical consulting support for public health, medical, and clinical research. Prerequisites: PHC 6052, PHC 6091, PHC 6093.
PHC 6059 Cohort Studies and Lifetime Events in Public Health: Concepts of lifetime events and survival data in Public Health; modern methods used to analyze time-to-event data; non-parametric and parametric models.
PHC 6093 Biostatistical Data Management Concepts and Procedures: Covers procedures and tools for data management, including data collection, transfer, handling, quality, and security issues for research projects in public health, medicine, and related fields.

Hospitality Analytics

HMG 6xxx Data Science in Hospitality: Applies data science techniques, such as data wrangling, data management, exploratory data analysis, predictive modeling, regression, and classification, to problems in the hospitality and tourism industry.
HMG 6xxx Customer Experience Design & Behavior Analysis: Involves crafting the customer experience and measuring it to produce robust data sets; modeling, analyzing, and assessing the customer experience; and iteratively designing and influencing customer experience and behavior using data-driven decision making.
HMG 6xxx Revenue Optimization Science: Applies data and mathematical techniques and algorithms to understand the factors that affect business revenue and to further optimize revenue through data-driven decision making and predictive models.
HMG 6xxx Travel and Tourism Data Analysis: Applies data science and analysis techniques to analyze tourist behavior, forecast tourism demand, design travel packages, develop data-driven destination marketing strategies, and identify factors that influence customer travel and tourism.


FAQ

What computing resources will be at the disposal of MS-DS students?

We are in the process of seeking funding for a specialized laboratory to cater to the special high-performance computing needs of the MS-DS program.

Will new tracks be added in the future?

New tracks will be considered as and when faculty expertise and interest are identified.

How will the program emphasize skills such as presentation and communication?

This is an important component of the MS-DS program. Both oral and written communication skills will be emphasized for the capstone project. Other graduate courses will often have projects and presentations to further add to this training.

Will every student have an advisor?

There is a track director for each track. Additionally, each student will work with an appropriate faculty advisor for his/her capstone project.

Will the GRE requirement be eliminated in the future?

Discussions are underway within FIU to figure out how best to use GRE scores in the future. The results of these discussions are likely to affect the GRE requirements of MS-DS as well.

Who is advising on how the program shapes up in the future?

We will be forming an Industrial Advisory Board as the program ramps up, so that we not only seek advice on the curriculum but also find potential collaborators for the student capstone projects.

What hurdles do you foresee as this new degree program is rolled out? How responsive will you be in modifying the program as you encounter problems in the future?

We are very aware that this is a novel and experimental program. Challenges include teaching core courses to students with very diverse backgrounds and skill levels. These courses will be carefully monitored to find ways to improve. The 4 directors of the specialization tracks are in consultation on this matter.

MDC has an Associate degree in Business Analytics and a newer BS degree program in a related field. Is there any coordination with that program?

We are in the process of reaching out to them. We hope to attract the students who have finished MDC’s Associate degree to the MS-DS program.

What are the expected outcomes for this program?

The expected student outcomes are listed in the Outcomes section above.

What is the purpose of the Capstone course? Can I use my own data or my own project for the course?

The Capstone course is intended to synthesize all that you have learnt in your MS-DS program and to apply it to real-life data. The project will be decided by the student in consultation with a faculty advisor. Data for the project would also be chosen in consultation with the faculty advisor.

What happens if I have a BS degree in a STEM discipline not listed on your website? What happens if I don’t have specific prerequisites for one of the core courses?

Talk to an advisor in the specialization track that you think is best suited to your needs. Contact information is provided in the Contact section above.

What is the cost of the program?

The in-state tuition cost is estimated at $456 per credit × 30 credits = $13,680. This does not include other fees, books, supplies, etc.
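The estimate above can be reproduced with a short Python sketch. The function and constant names are illustrative only, and the calculation assumes that the $456 figure is a per-credit rate applied uniformly to all 30 credits; fees, books, and supplies are not modeled.

```python
def estimated_tuition(rate_per_credit: int, credits: int) -> int:
    """Return tuition in whole dollars for a flat per-credit rate."""
    return rate_per_credit * credits


# Figures taken from the estimate above.
IN_STATE_RATE = 456    # dollars, assumed to be per credit
PROGRAM_CREDITS = 30   # total credits required by the program

total = estimated_tuition(IN_STATE_RATE, PROGRAM_CREDITS)
print(f"Estimated in-state tuition: ${total:,}")
```

Changing the rate or credit count in the two constants gives the corresponding estimate for a different tuition schedule.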

What languages will be used in the program?

The MS-DS program will emphasize the use of R, Python, and SQL. Other languages may be relevant for individual classes.

Will there be evening classes? Will there be Saturday classes? What about online classes?

Almost all of our graduate classes are evening classes to cater to working students. Some are even scheduled on Saturdays. If you need to take a class that is scheduled during the day, it is possible to petition the department to move it to an evening class. FIU is slowly converting all courses to have an online or a hybrid version.