Rename from ESA to DMA #108

Merged 10 commits on Dec 13, 2021
2 changes: 1 addition & 1 deletion .github/workflows/lint.yml
@@ -2,7 +2,7 @@ name: Lint Code Base

on:
push:
branches-ignore: [master]
branches: [main]
pull_request:
branches: [main]
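
For reference, a sketch of how the trigger block in `.github/workflows/lint.yml` would read once the `branches-ignore: [master]` line is replaced by `branches: [main]`. The indentation is an assumption, since the rendered diff above does not preserve it, and the rest of the workflow is not shown here:

```yaml
# Sketch of the resulting trigger section (indentation assumed).
on:
  push:
    branches: [main]   # previously: branches-ignore: [master]
  pull_request:
    branches: [main]
```

With this change, the lint workflow would run on pushes to `main` rather than on pushes to every branch except `master`.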

34 changes: 17 additions & 17 deletions README.md
@@ -1,35 +1,35 @@
# Enterprise-Scale Analytics - Data Product Analytics
# Data Management & Analytics Scenario - Data Product Analytics

## Objective

The [Enterprise-Scale Analytics](https://aka.ms/adopt/datamanagement) architecture provides a prescriptive data platform design coupled with Azure best practices and design principles. These principles serve as a compass for subsequent design decisions across critical technical domains. The architecture will continue to evolve alongside the Azure platform and is ultimately driven by the various design decisions that organizations must make to define their Azure data journey.
The [Data Management & Analytics Scenario](https://aka.ms/adopt/datamanagement) provides a prescriptive data platform design coupled with Azure best practices and design principles. These principles serve as a compass for subsequent design decisions across critical technical domains. The architecture will continue to evolve alongside the Azure platform and is ultimately driven by the various design decisions that organizations must make to define their Azure data journey.

The Enterprise-Scale Analytics architecture consists of two core building blocks:
The Data Management & Analytics architecture consists of two core building blocks:

1. *Data Management Zone* which provides all data management and data governance capabilities for the data platform of an organization.
1. *Data Landing Zone* which is a logical construct and a unit of scale in the Enterprise-Scale Analytics architecture that enables data retention and execution of data workloads for generating insights and value with data.
1. *Data Landing Zone* which is a logical construct and a unit of scale in the Data Management & Analytics architecture that enables data retention and execution of data workloads for generating insights and value with data.

The architecture is modular by design and allows organizations to start small with a single Data Management Zone and Data Landing Zone, but also allows to scale to a multi-subscription data platform environment by adding more Data Landing Zones to the architecture. Thereby, the reference design allows to implement different modern data platform patterns like data-mesh, data-fabric as well as traditional datalake architectures. Enterprise-Scale Analytics has been very well aligned with the data-mesh approach, and is ideally suited to help organizations build data products and share these across business units of an organization. If core recommendations are followed, the resulting target architecture will put the customer on a path to sustainable scale.
The architecture is modular by design and allows organizations to start small with a single Data Management Zone and Data Landing Zone, but also allows to scale to a multi-subscription data platform environment by adding more Data Landing Zones to the architecture. Thereby, the reference design allows to implement different modern data platform patterns like data-mesh, data-fabric as well as traditional datalake architectures. Data Management & Analytics Scenario has been very well aligned with the data-mesh approach, and is ideally suited to help organizations build data products and share these across business units of an organization. If core recommendations are followed, the resulting target architecture will put the customer on a path to sustainable scale.

![Enterprise-Scale Analytics](/docs/images/EnterpriseScaleAnalytics.gif)
![Data Management & Analytics](/docs/images/DataManagementAnalytics.gif)

---

_The Enterprise-Scale Analytics architecture represents the strategic design path and target technical state for your Azure data platform._
_The Data Management & Analytics Scenario represents the strategic design path and target technical state for your Azure data platform._

---

This respository describes a Data Product template for Data Analytics and Data Science. Data Products are another unit of scale inside a Data Landing Zone through the means of Resource Groups. Resource Groups inside the Data Landing Zone subscription are created and handed over to cross-functional teams to provide them an environment in which they can work on their own data use-cases. The ownership of this resource group and operation of services within is handed over to the Data Product teams. In order to enable self-service, the owning teams are free to deploy their own services within the guardrails set by Azure Policy. Repository templates can be used for these teams to more quickly scale within an organization and rollout common data analysis patterns not just once but multiple times across various use-cases. The ownership of templates is also handed over, which ultimately gives these teams a starting point while allowing them to enhance the template based on their specific requirements. This Data Product template deploys a set of services, which can be used for data analytics and data science. The template includes services such as Azure Machine Learning, Cognitive Services and Azure Search. The Data Product teams can then leverage these tools to generate insights and value with data.
This repository describes a Data Product template for Data Analytics and Data Science. Data Products are another unit of scale inside a Data Landing Zone through the means of Resource Groups. Resource Groups inside the Data Landing Zone subscription are created and handed over to cross-functional teams to provide them an environment in which they can work on their own data use-cases. The ownership of this resource group and operation of services within is handed over to the Data Product teams. In order to enable self-service, the owning teams are free to deploy their own services within the guardrails set by Azure Policy. Repository templates can be used for these teams to more quickly scale within an organization and rollout common data analysis patterns not just once but multiple times across various use-cases. The ownership of templates is also handed over, which ultimately gives these teams a starting point while allowing them to enhance the template based on their specific requirements. This Data Product template deploys a set of services, which can be used for data analytics and data science. The template includes services such as Azure Machine Learning, Cognitive Services and Azure Search. The Data Product teams can then leverage these tools to generate insights and value with data.

> **Note:** Before getting started with the deployment, please make sure you are familiar with the [complementary documentation in the Cloud Adoption Framework](https://aka.ms/adopt/datamanagement). Also, before deploying your first Data Product, please make sure that you have deployed a [Data Management Zone](https://github.com/Azure/data-management-zone) and at least one [Data Landing Zone](https://github.com/Azure/data-landing-zone). The minimal recommended setup consists of a single [Data Management Zone](https://github.com/Azure/data-management-zone) and a single [Data Landing Zone](https://github.com/Azure/data-landing-zone).

## Deploy Enterprise-Scale Analytics
## Deploy Data Management & Analytics Scenario

The Enterprise-Scale Analytics architecture is modular by design and allows customers to start with a small footprint and grow over time. In order to not end up in a migration project, customers should decide upfront how they want to organize data domains across Data Landing Zones. All Enterprise-Scale Analytics architecture building blocks can be deployed through the Azure Portal as well as through GitHub Actions workflows and Azure DevOps Pipelines. The template repositories contain sample YAML pipelines to more quickly get started with the setup of the environments.
The Data Management & Analytics architecture is modular by design and allows customers to start with a small footprint and grow over time. In order to not end up in a migration project, customers should decide upfront how they want to organize data domains across Data Landing Zones. All Data Management & Analytics architecture building blocks can be deployed through the Azure Portal as well as through GitHub Actions workflows and Azure DevOps Pipelines. The template repositories contain sample YAML pipelines to more quickly get started with the setup of the environments.

| Reference implementation | Description | Deploy to Azure | Link |
|:---------------------------|:------------|:----------------|------|
| Enterprise-Scale Analytics | Deploys a [Data Management Zone](https://github.com/Azure/data-management-zone) and one or multiple Data Landing Zones all at once. Provides less options than the the individual Data Management Zone and Data Landing Zone deployment options. Helps you to quickly get started and make yourself familiar with the reference design. For more advanced scenarios, please deploy the artifacts individually. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2FenterpriseScaleAnalytics.json/uiFormDefinitionUri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.enterpriseScaleAnalytics.json) | |
| Data Management & Analytics Scenario | Deploys a [Data Management Zone](https://github.com/Azure/data-management-zone) and one or multiple Data Landing Zones all at once. Provides less options than the the individual Data Management Zone and Data Landing Zone deployment options. Helps you to quickly get started and make yourself familiar with the reference design. For more advanced scenarios, please deploy the artifacts individually. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2FdataManagementAnalytics.json/uiFormDefinitionUri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataManagementAnalytics.json) | |
| Data Management Zone | Deploys a single Data Management Zone to a subscription. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-management-zone%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-management-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataManagementZone.json) | [Repository](https://github.com/Azure/data-management-zone) |
| Data Landing Zone | Deploys a single Data Landing Zone to a subscription. Please deploy a [Data Management Zone](https://github.com/Azure/data-management-zone) first. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-landing-zone%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-landing-zone%2Fmain%2Fdocs%2Freference%2Fportal.dataLandingZone.json) | [Repository](https://github.com/Azure/data-landing-zone) |
| Data Product Batch | Deploys a Data Workload template for Data Batch Analysis to a resource group inside a Data Landing Zone. Please deploy a [Data Management Zone](https://github.com/Azure/data-management-zone) and [Data Landing Zone](https://github.com/Azure/data-landing-zone) first. |[![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#blade/Microsoft_Azure_CreateUIDef/CustomDeploymentBlade/uri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-product-batch%2Fmain%2Finfra%2Fmain.json/uiFormDefinitionUri/https%3A%2F%2Fraw.git.ttaallkk.top%2FAzure%2Fdata-product-batch%2Fmain%2Fdocs%2Freference%2Fportal.dataProduct.json) | [Repository](https://github.com/Azure/data-product-batch) |
@@ -40,13 +40,13 @@ The Enterprise-Scale Analytics architecture is modular by design and allows cust

To deploy the Data Product into your Data Landing Zone, please follow the step-by-step instructions:

1. [Prerequisites](/docs/EnterpriseScaleAnalytics-Prerequisites.md)
2. [Create repository](/docs/EnterpriseScaleAnalytics-CreateRepository.md)
3. [Setting up Service Principal](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
1. [Prerequisites](/docs/DataManagementAnalytics-Prerequisites.md)
2. [Create repository](/docs/DataManagementAnalytics-CreateRepository.md)
3. [Setting up Service Principal](/docs/DataManagementAnalytics-ServicePrincipal.md)
4. Template Deployment
1. [GitHub Action Deployment](/docs/EnterpriseScaleAnalytics-GitHubActionsDeployment.md)
2. [Azure DevOps Deployment](/docs/EnterpriseScaleAnalytics-AzureDevOpsDeployment.md)
5. [Known Issues](/docs/EnterpriseScaleAnalytics-KnownIssues.md)
1. [GitHub Action Deployment](/docs/DataManagementAnalytics-GitHubActionsDeployment.md)
2. [Azure DevOps Deployment](/docs/DataManagementAnalytics-AzureDevOpsDeployment.md)
5. [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md)

## Contributing

Original file line number Diff line number Diff line change
@@ -26,7 +26,7 @@ First, you need to create an Azure Resource Manager service connection. To do so
1. On the next page select **Service principal (manual)**.
1. Select the appropriate environment to which you would like to deploy the templates. Only the default option **Azure Cloud** is currently supported.
1. For the **Scope Level**, select **Subscription** and enter your `subscription Id` and `name`.
1. Enter the details of the service principal that we have generated in step 3. (**Service Principal Id** = **clientId**, **Service Principal Key** = **clientSecret**, **Tenant ID** = **tenantId**) and click on **Verify** to make sure that the connection works.
1. Enter the details of the service principal that we have generated in step 3. (**Service Principal ID** = **clientId**, **Service Principal Key** = **clientSecret**, **Tenant ID** = **tenantId**) and click on **Verify** to make sure that the connection works.
1. Enter a user-friendly **Connection name** to use when referring to this service connection. Take note of the name because this will be required in the parameter update process.
1. Optionally, enter a **Description**.
1. Click on **Verify and save**.
@@ -42,7 +42,7 @@ In order to deploy the Infrastructure as Code (IaC) templates to the desired Azu
- `.ado/workflows/dataProductDeployment.yml` and
- `infra/params.dev.json`.

Update these files in a seperate branch and then merge via Pull Request to trigger the initial deployment.
Update these files in a separate branch and then merge via Pull Request to trigger the initial deployment.

### Configure `dataProductDeployment.yml`

@@ -61,7 +61,7 @@ The following table explains each of the parameters:
| Parameter | Description | Sample value |
|:--------------------------------------------|:------------|:-------------|
| **AZURE_SUBSCRIPTION_ID** | Specifies the subscription ID of the Data Management Zone where all the resources will be deployed | <div style="width: 36ch">`xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`</div> |
| **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/EnterpriseScaleAnalytics-Prerequisites.md#supported-regions) | `northeurope` |
| **AZURE_LOCATION** | Specifies the region where you want the resources to be deployed. Please check [Supported Regions](/docs/DataManagementAnalytics-Prerequisites.md#supported-regions) | `northeurope` |
| **AZURE_RESOURCE_GROUP_NAME** | Specifies the name of an existing resource group in your data landing zone, where the resources will be deployed. | `my-rg-name` |
| **AZURE_RESOURCE_MANAGER _CONNECTION_NAME** | Specifies the resource manager connection name in Azure DevOps. You can leave the default value if you want to use GitHub Actions for your deployment. More details on how to create the resource manager connection in Azure DevOps can be found further above or [here](https://docs.microsoft.com/azure/devops/pipelines/library/connect-to-azure?view=azure-devops#create-an-azure-resource-manager-service-connection-with-an-existing-service-principal). | `my-connection-name` |
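
As an illustration only, the four parameters in the table above could be set in an Azure Pipelines `variables` block along these lines. The actual layout of `.ado/workflows/dataProductDeployment.yml` is not shown in this diff, so the block structure is an assumption; the values are the sample values from the table, and the connection-name variable is written without the space that appears in the table cell, on the assumption that the space is only there for wrapping:

```yaml
# Hypothetical sketch; the real file may arrange these differently.
variables:
  AZURE_RESOURCE_MANAGER_CONNECTION_NAME: "my-connection-name"    # name of the service connection created earlier
  AZURE_SUBSCRIPTION_ID: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"   # placeholder, replace with your subscription ID
  AZURE_LOCATION: "northeurope"                                   # must be one of the supported regions
  AZURE_RESOURCE_GROUP_NAME: "my-rg-name"                         # existing resource group in the data landing zone
```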

@@ -144,15 +144,15 @@ As a last step, you need to create an Azure DevOps pipeline in your project base

1. Click on **Continue** and then on **Run**.

## Merge these changes back to the `main` branch of your repo
## Merge these changes back to the `main` branch of your repository

After following the instructions and updating the parameters and variables in your repository in a separate branch and opening the pull request, you can merge the pull request back into the `main` branch of your repository by clicking on **Merge pull request**. Finally, you can click on **Delete branch** to clean up your repository. By doing this, you trigger the deployment workflow.

## Follow the workflow deployment

**Congratulations!** You have successfully executed all steps to deploy the template into your environment through Azure DevOps.

Now, you can navigate to the pipeline that you have created as part of step 5 and monitor it as each service is deployed. If you run into any issues, please check the [Known Issues](/docs/EnterpriseScaleAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-product-analytics/issues) if you come accross a potential bug in the repository.
Now, you can navigate to the pipeline that you have created as part of step 5 and monitor it as each service is deployed. If you run into any issues, please check the [Known Issues](/docs/DataManagementAnalytics-KnownIssues.md) first and open an [issue](https://github.com/Azure/data-product-analytics/issues) if you come across a potential bug in the repository.

>[Previous](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
>[Next](/docs/EnterpriseScaleAnalytics-KnownIssues.md)
>[Previous](/docs/DataManagementAnalytics-ServicePrincipal.md)
>[Next](/docs/DataManagementAnalytics-KnownIssues.md)
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
# Data Product Analytics - Create repository from the template

First, you must generate your own respository based off this template respository. To do so, please follow the steps below:
First, you must generate your own repository based off this template repository. To do so, please follow the steps below:

1. On GitHub, navigate to the [main page of this repository](https://github.com/Azure/data-management-zone).
1. Above the file list, click **Use this template**
@@ -16,5 +16,5 @@ First, you must generate your own respository based off this template respositor
1. Optionally, to include the directory structure and files from all branches in the template and not just the default branch, select **Include all branches**.
1. Click **Create repository from template**.

>[Previous](/docs/EnterpriseScaleAnalytics-Prerequisites.md)
>[Next](/docs/EnterpriseScaleAnalytics-ServicePrincipal.md)
>[Previous](/docs/DataManagementAnalytics-Prerequisites.md)
>[Next](/docs/DataManagementAnalytics-ServicePrincipal.md)