MGT560m Big Data and Cloud Computing

The growth in available data challenges many companies, but it also presents an opportunity for those that can master the vast and varied data available to them. This growth includes traditional structured data as well as unstructured data created by both people and machines. Analysts must be comfortable with the new technologies and tools being developed to store, retrieve, analyze, and report on these vast data resources. This course introduces students to the technologies currently deployed to overcome the challenges of Big Data.

Today’s data environment benefits greatly from the reduced costs of collecting, storing, and processing data, which enable companies to maintain ever-increasing data repositories. Initially, stored data was structured and focused on transactions and interactions with customers, suppliers, financial services, transportation services, and extended supply chain partners. More recently, social media data has been added to these repositories. This data is often inherently unstructured and multimedia, including Facebook or Twitter posts, user reviews, blogs, pictures, and videos. Add information from Internet of Things sensors on vehicles, machines, fixed points, inventory, and medical devices, or pictures and videos from internal cameras, and we find massive volume and variety in the data available to today’s analysts.

Hadoop and related technologies supported by the Apache Software Foundation are the current standard for storing vast amounts of heterogeneous data across commodity servers. This course introduces students to current Apache technologies, including Hadoop, YARN, MapReduce, Sqoop, Hive, Pig, and Spark. Each plays a distinct role in building clusters of commodity servers, managing vast amounts of structured and unstructured data, processing data in parallel, organizing data for analysis, and developing queries for reports. Through hands-on examples using relevant data, students develop competencies in these technologies and come to appreciate the challenges and opportunities of Big Data.

Prerequisites: MGT560G

Fall 2018