

In the world of big data, raw, unorganized data is often stored in relational, non-relational, and other storage systems. However, on its own, raw data doesn't have the proper context or meaning to provide meaningful insights to analysts, data scientists, or business decision makers.

Big data requires a service that can orchestrate and operationalize processes to refine these enormous stores of raw data into actionable business insights. Azure Data Factory is a managed cloud service that's built for these complex hybrid extract-transform-load (ETL), extract-load-transform (ELT), and data integration projects.

Usage scenarios

For example, imagine a gaming company that collects petabytes of game logs that are produced by games in the cloud. The company wants to analyze these logs to gain insights into customer preferences, demographics, and usage behavior. It also wants to identify up-sell and cross-sell opportunities, develop compelling new features, drive business growth, and provide a better experience to its customers.

To analyze these logs, the company needs to use reference data such as customer information, game information, and marketing campaign information that is in an on-premises data store. The company wants to combine this on-premises data with additional log data that it has in a cloud data store.

To extract insights, the company hopes to process the joined data by using a Spark cluster in the cloud (Azure HDInsight), and publish the transformed data into a cloud data warehouse such as Azure Synapse Analytics to easily build a report on top of it. The company wants to automate this workflow, and monitor and manage it on a daily schedule.
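To make the gaming scenario concrete, the following is a minimal sketch of its central step: joining game log records with customer reference data, then aggregating the result for a report. It uses plain Python and in-memory data only. It is not the Azure Data Factory API and not Spark code. All field names, sample values, and function names (`join_logs`, `minutes_by_campaign`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical reference data, standing in for the on-premises data store.
customers = {
    "c1": {"region": "EU", "campaign": "spring_sale"},
    "c2": {"region": "US", "campaign": "launch_promo"},
}

# Hypothetical game log records, standing in for the cloud data store.
game_logs = [
    {"customer_id": "c1", "game": "racer", "minutes": 30},
    {"customer_id": "c2", "game": "racer", "minutes": 45},
    {"customer_id": "c1", "game": "puzzle", "minutes": 15},
    {"customer_id": "c9", "game": "puzzle", "minutes": 5},  # no reference row
]


def join_logs(logs, reference):
    """Inner-join log records to reference rows on customer_id."""
    joined = []
    for record in logs:
        ref = reference.get(record["customer_id"])
        if ref is None:
            # Skip logs that have no matching customer in the reference data.
            continue
        joined.append({**record, **ref})
    return joined


def minutes_by_campaign(joined):
    """Sum play time per marketing campaign over the joined rows."""
    totals = defaultdict(int)
    for row in joined:
        totals[row["campaign"]] += row["minutes"]
    return dict(totals)


joined_rows = join_logs(game_logs, customers)
report = minutes_by_campaign(joined_rows)
print(report)  # {'spring_sale': 45, 'launch_promo': 45}
```

In the scenario described above, a step like this would run at a much larger scale on a Spark cluster, with the orchestration service responsible for moving the data, triggering the job on a schedule, and publishing the output to the data warehouse.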
