This topic uses a MySQL database deployed on an Elastic Compute Service (ECS) instance as an example to describe how to establish a network connection between a resource group and a data source that is hosted on an ECS instance.
Use scenarios
If your data source meets the following condition, we recommend that you use this solution:
The data source is hosted on an Elastic Compute Service (ECS) instance.
Solution description
Same Alibaba Cloud account and same region
If the ECS instance on which the data source is hosted and the resource group belong to the same Alibaba Cloud account and reside in the same region, we recommend that you deploy the resource group and the ECS instance in the same VPC. This way, the resource group can access the data source over a VPC.
Same Alibaba Cloud account and different regions, or different Alibaba Cloud accounts and different regions
If the ECS instance on which the data source is hosted and the resource group reside in different regions, regardless of whether they belong to the same Alibaba Cloud account or to different Alibaba Cloud accounts, we recommend that you use Cloud Enterprise Network (CEN) or a VPC peering connection to connect the VPC of the resource group to the VPC in which the ECS instance is deployed. This way, the resource group can access the data source over a VPC.
Diagrams for network connectivity
Prerequisites
The data source that you want to use is deployed on an ECS instance. For more information about data source types that are supported by DataWorks, see Supported data source types and synchronization operations.
A workspace is created. For more information, see Create a workspace.
A serverless resource group is created and associated with your workspace. For more information, see Step 1: Create a serverless resource group and Step 2: Associate the resource group with a workspace.
Billing
Billing varies based on the network connectivity tool that you use. For more information, see Billing of CEN or Billing of VPC peering connections.
If you use a VPC peering connection and the ECS instance on which the data source is hosted and the resource group belong to different Alibaba Cloud accounts but reside in the same region, no fees are generated.
Configure network connectivity
This section describes the general procedure for configuring network connectivity between a data source and a DataWorks resource group. Read it to understand the key logic of the configuration. For a detailed walkthrough, see the Configuration example section of this topic.
Step 1: Obtain basic information
Same Alibaba Cloud account and same region
Data source side
Information about the VPC and vSwitch of the ECS instance
Log on to the ECS console. In the top navigation bar, select the region in which the ECS instance resides.
In the left-side navigation pane, open the Instance page. Find the ECS instance on which the MySQL database is deployed and click the name of the instance to go to the Instance Details tab.
In the Configuration Information section, obtain the VPC and vSwitch information of the ECS instance. In this example, VPC 1 is used.
DataWorks side
Information about the VPC and vSwitch with which the serverless resource group is associated
Go to the Resource Groups page in the DataWorks console, find the desired resource group, and then click Network Settings in the Actions column.
On the page that appears, view the VPC and vSwitch with which the serverless resource group is associated based on the purpose of the serverless resource group.
For example, if you want to connect a MySQL data source hosted on an ECS instance to DataWorks for data synchronization, view the VPC and vSwitch with which the serverless resource group is associated in the Data Scheduling & Data Integration section. In this example, the VPC is VPC 2.
Same Alibaba Cloud account and different regions
Data source side
Region information: In this example, an ECS instance that resides in the China (Hangzhou) region is used.
Information about the VPC and vSwitch of the ECS instance
Log on to the ECS console. In the top navigation bar, select the region in which the ECS instance resides.
In the left-side navigation pane, open the Instance page. Find the ECS instance on which the MySQL database is deployed and click the name of the instance to go to the Instance Details tab.
In the Configuration Information section, obtain the VPC and vSwitch information of the ECS instance.
DataWorks side
Region information: In this example, a DataWorks workspace and a serverless resource group that reside in the China (Shanghai) region are used.
Information about the VPC and vSwitch with which the serverless resource group is associated
Go to the Resource Groups page in the DataWorks console, find the desired resource group, and then click Network Settings in the Actions column.
On the page that appears, view the VPC and vSwitch with which the serverless resource group is associated based on the purpose of the serverless resource group.
For example, if you want to connect a MySQL data source hosted on an ECS instance to DataWorks for data synchronization, view the VPC and vSwitch with which the serverless resource group is associated in the Data Scheduling & Data Integration section.
Different Alibaba Cloud accounts and different regions
Data source side
Account information: In this example, Alibaba Cloud Account A is used.
Region information: In this example, an ECS instance that resides in the China (Hangzhou) region is used.
Information about the VPC and vSwitch of the ECS instance:
Log on to the ECS console. In the top navigation bar, select the region in which the ECS instance resides.
In the left-side navigation pane, open the Instance page. Find the ECS instance on which the MySQL database is deployed and click the name of the instance to go to the Instance Details tab.
In the Configuration Information section, obtain the VPC and vSwitch information of the ECS instance.
DataWorks side
Account information: In this example, Alibaba Cloud Account B is used.
Region information: In this example, a DataWorks workspace and a serverless resource group that reside in the China (Shanghai) region are used.
Information about the VPC and vSwitch with which the serverless resource group is associated:
Go to the Resource Groups page in the DataWorks console, find the desired resource group, and then click Network Settings in the Actions column.
On the page that appears, view the VPC and vSwitch with which the serverless resource group is associated based on the purpose of the serverless resource group.
For example, if you want to connect a MySQL data source hosted on an ECS instance to DataWorks for data synchronization, view the VPC and vSwitch with which the serverless resource group is associated in the Data Scheduling & Data Integration section.
Step 2: Establish a network connection
Same Alibaba Cloud account and same region
If VPC 1 and VPC 2 are the same, the ECS instance and the serverless resource group are deployed in the same VPC, and a network connection is established between them by default.
If VPC 1 and VPC 2 are different, click Add VPC Association on the network settings page of the serverless resource group to associate the serverless resource group with VPC 1, so that the serverless resource group and the ECS instance are deployed in the same VPC.
Same Alibaba Cloud account and different regions
Cloud Enterprise Network (CEN): This tool is suitable for establishing network connections among multiple VPCs in complex network environments. For more information, see Connect VPCs in different regions.
VPC peering connection: This tool is suitable for establishing a network connection between two VPCs. For detailed configuration steps, see Use VPC peering connection for private network communication.
If errors occur when you configure network connectivity, submit a ticket to contact technical support of the related Alibaba Cloud service.
Different Alibaba Cloud accounts and different regions
CEN: This tool is suitable for establishing network connections among multiple VPCs in complex network environments. For more information, see Connect VPCs in different regions.
VPC peering connection: This tool is suitable for establishing a network connection between two VPCs. For detailed configuration steps, see Use VPC peering connection for private network communication.
If errors occur when you configure network connectivity, submit a ticket to contact technical support of the related Alibaba Cloud service.
Step 3: Add a route for the serverless resource group
If the ECS instance on which the data source is hosted and the serverless resource group belong to the same Alibaba Cloud account but reside in different regions, or belong to different Alibaba Cloud accounts, you must add a route for the serverless resource group. The route must point to the CIDR block of the vSwitch in which the ECS instance resides.
Go to the Resource Groups page in the DataWorks console, find the desired resource group, and then click Network Settings in the Actions column.
On the page that appears, find the VPC with which the serverless resource group is associated based on the purpose of the serverless resource group, and click Custom Route in the Actions column.
In the Custom Route panel, click Add Route. In the Add Route dialog box, select CIDR Block for Connection Method and set Destination CIDR Block to the CIDR block of the vSwitch in which the ECS instance resides.
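The rule that decides whether this step applies can be written as a small check. The following Python sketch only restates the condition described at the start of this step. The function name needs_custom_route and its two boolean parameters are hypothetical and are not part of any DataWorks API.

```python
def needs_custom_route(same_account: bool, same_region: bool) -> bool:
    """Return True when the serverless resource group needs a custom route.

    Mirrors the rule in this step: a route is required when the ECS instance
    and the resource group are in the same account but different regions,
    or when they belong to different accounts.
    """
    return (same_account and not same_region) or (not same_account)


if __name__ == "__main__":
    # Scenario of the configuration example in this topic:
    # Account A vs. Account B, China (Hangzhou) vs. China (Shanghai).
    print(needs_custom_route(same_account=False, same_region=False))
```

Only the same-account, same-region case returns False, which matches the scenario in which the resource group and the ECS instance share one VPC.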
Step 4: (Optional) Enable remote access for the database
Some databases accept only local connections by default. For such a database, enable remote access in its configuration file so that specified users can connect to the database from other hosts by using its IP address and port number. The configuration method varies by database. For details, see the official documentation of your database.
For more information about how to enable remote access for a MySQL database, see 4. Enable remote access for the MySQL database.
Step 5: Configure a security group rule for the ECS instance
Alibaba Cloud ECS implements firewall capabilities through security group rules. Add a rule to the security group of the ECS instance that opens the database port to the VPC with which the DataWorks resource group is associated. The DataWorks resource group can then access the services deployed on the ECS instance.
Log on to the ECS console. In the top navigation bar, select the region in which the ECS instance resides.
In the left-side navigation pane, open the Instance page. Find the ECS instance on which the MySQL database is deployed and click the name of the instance to go to the Instance Details tab.
Click the Security Groups tab. On the Security Groups tab, find the security group and click the name of the security group to go to the Security Group Details tab.
In the Access Rule section, click Quick Add. In the Quick Add dialog box, configure the following key parameters. You can retain default values for other parameters.
Authorization Object: Enter the CIDR block of the vSwitch with which the DataWorks resource group is associated.
Port Range: Select the port number of the database deployed on the ECS instance. For example, you can use port 3306 to connect to the MySQL database.
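To reason about whether a given rule would admit traffic from the resource group, the two parameters above can be modeled in a few lines. This is a deliberately simplified sketch of a single inbound allow rule and not the actual ECS security group evaluation logic, which also involves priorities and deny rules. The function name rule_allows, the "low/high" port range notation, and the source IP address 172.16.66.10 are assumptions for illustration. The CIDR block 172.16.66.0/24 is borrowed from the configuration example in this topic.

```python
import ipaddress


def rule_allows(source_ip: str, port: int, authorized_cidr: str, port_range: str) -> bool:
    """Check one inbound allow rule against a source IP address and a port.

    port_range is assumed to be written as "low/high" (for example
    "3306/3306") or as a single port number such as "3306".
    """
    low, _, high = port_range.partition("/")
    high = high or low  # a single port means low == high
    in_cidr = ipaddress.ip_address(source_ip) in ipaddress.ip_network(authorized_cidr)
    in_ports = int(low) <= port <= int(high)
    return in_cidr and in_ports


if __name__ == "__main__":
    # Hypothetical address inside the vSwitch CIDR block of the resource group.
    print(rule_allows("172.16.66.10", 3306, "172.16.66.0/24", "3306/3306"))
```

A source address outside the authorized CIDR block, or a port outside the range, makes the check fail. This is the usual reason a connectivity test fails after the routes are already in place.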
Test network connectivity
Log on to the DataWorks console. In the top navigation bar, select the desired region. In the left-side navigation pane, open the Data Integration page. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.
In the left-side navigation pane, click Data source. On the Data Sources page, click Add Data Source. In the Add Data Source dialog box, select the desired data source type and configure the related parameters to add a data source of this type.
In the Connection Configuration section of the Add MySQL Data Source dialog box, select the serverless resource group that has been connected to the data source and click Test Network Connectivity in the Connection Status column.
Note: If Connection failed is displayed in the Connection Status column, you can click Self-service Troubleshoot in the column to resolve the issue. If the network connection between the resource group and the data source still cannot be established, submit a ticket.
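When troubleshooting, it can help to probe the database port independently of the console. The following Python sketch is a generic TCP check and is not what the Test Network Connectivity button does internally. It assumes that you have a machine on the same network path, for example another ECS instance in the VPC with which the resource group is associated. This topic does not describe such a machine. The function name tcp_port_open is hypothetical.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

For example, calling tcp_port_open with the private IP address of the ECS instance and port 3306 shows whether the routes and the security group rule let a TCP handshake through. A True result does not prove that the MySQL user is allowed to log on.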
Configuration example
This section provides an example on how to establish a network connection between a resource group and a data source that is hosted on an ECS instance. In this example, a MySQL database deployed in an ECS instance in the China (Hangzhou) region within Alibaba Cloud Account A and a DataWorks workspace created in the China (Shanghai) region within Alibaba Cloud Account B are used.
1. Basic information
Parameter | Data source | DataWorks resource group |
Account | Alibaba Cloud Account A | Alibaba Cloud Account B |
Region | China (Hangzhou) | China (Shanghai) |
VPC | See the basic information page of the ECS instance. | See the network settings page of the resource group. |
2. Establish a network connection
In this example, a VPC peering connection is used to establish a network connection between the ECS instance and the resource group.
If errors occur when you configure network connectivity, submit a ticket to contact technical support of the related Alibaba Cloud service.
Go to the VPC Peering Connection page of the VPC console by using Alibaba Cloud Account A. In the top navigation bar, select China (Hangzhou). On the VPC Peering Connection page, click Create Peering Connection. On the Create VPC Peering Connection page, configure the relevant parameters.
The following table describes the key parameters that you must configure in this example. You can retain default values for other parameters.
Parameter | Configuration description and example |
Peering Connection Name | The name of the peering connection. In this example, Account_A to Account_B is used. |
Requester VPC Instance | The VPC with which the ECS instance within Alibaba Cloud Account A is associated. In this example, select Account_A_hangzhou_VPC. |
Accepter Account Type | In this example, select Cross-account. |
UID of the receiver | Enter the ID of Alibaba Cloud Account B. |
Accepter Region Type | In this example, select Cross-region. |
Accepter Region | The region in which the DataWorks workspace and resource group reside within Alibaba Cloud Account B. In this example, select China (Shanghai). |
Accepter VPC | Enter the ID of the VPC with which the DataWorks resource group is associated within Alibaba Cloud Account B. In this example, the VPC is Account_B_shanghai_VPC. |
Click OK to complete the peering connection configuration. The Basic Information tab of the peering connection appears. The status of the peering connection is Peering Accepting.
Go to the VPC Peering Connection page of the VPC console by using Alibaba Cloud Account B. In the top navigation bar, select China (Shanghai). A peering connection that is the same as the peering connection within Alibaba Cloud Account A appears. Click Accept in the Actions column. After the acceptance, the status of the peering connection changes to Activated.
On the VPC Peering Connection page, find the created peering connection and click Configure Route in the Accepter VPC column. In the Configure Route dialog box, configure the Name parameter and set the Destination CIDR Block parameter to the CIDR block of the VPC with which the ECS instance is associated. In this example, 192.168.0.0/16 is used.
Go to the VPC Peering Connection page of the VPC console by using Alibaba Cloud Account A. In the top navigation bar, select China (Hangzhou). On the VPC Peering Connection page, find the created peering connection.
Click Configure Route in the Requester VPC column. In the Configure Route dialog box, configure the Name parameter and set the Destination CIDR Block parameter to the CIDR block of the VPC with which the resource group is associated. In this example, 172.16.0.0/12 is used.
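Each side of the peering connection routes the other side's CIDR block through the peer, so the two blocks typically must not overlap. Otherwise, a destination could not be told apart from a local address. The following Python sketch checks this for the two example values above by using the standard ipaddress module. The function name cidrs_overlap is hypothetical.

```python
import ipaddress


def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR blocks share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


if __name__ == "__main__":
    # VPC of the ECS instance vs. VPC of the resource group in this example.
    print(cidrs_overlap("192.168.0.0/16", "172.16.0.0/12"))
```

If your own VPCs use overlapping ranges, consider routing a narrower, non-overlapping vSwitch CIDR block instead. That arrangement is outside the scope of this example.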
3. Add a route for the serverless resource group
Go to the Resource Groups page of the DataWorks console by using Alibaba Cloud Account B, find the DataWorks resource group, and then click Network Settings in the Actions column.
On the page that appears, find the VPC with which the DataWorks resource group is associated based on the purpose of the resource group, and click Custom Route in the Actions column.
In the Custom Route panel, click Add Route. In the Add Route dialog box, select CIDR Block for Connection Method and set Destination CIDR Block to the CIDR block of the vSwitch in which the ECS instance resides. In this example, the Destination CIDR Block parameter is set to 192.168.6.0/24.
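A quick consistency check on the values in this example can catch a mistyped destination. The following Python sketch verifies that the route destination lies inside the VPC CIDR block of the ECS instance and that it covers the private IP address of the ECS instance, 192.168.6.172, which is used later in this example. The function names are hypothetical. The subnet_of method requires Python 3.7 or later.

```python
import ipaddress


def is_within_vpc(vswitch_cidr: str, vpc_cidr: str) -> bool:
    """Return True if the vSwitch CIDR block is contained in the VPC CIDR block."""
    return ipaddress.ip_network(vswitch_cidr).subnet_of(ipaddress.ip_network(vpc_cidr))


def route_covers_host(destination_cidr: str, host_ip: str) -> bool:
    """Return True if the route destination includes the given host address."""
    return ipaddress.ip_address(host_ip) in ipaddress.ip_network(destination_cidr)


if __name__ == "__main__":
    print(is_within_vpc("192.168.6.0/24", "192.168.0.0/16"))
    print(route_covers_host("192.168.6.0/24", "192.168.6.172"))
```

If either check returns False for your own values, the custom route does not lead to the ECS instance, even if the peering connection is in the Activated state.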
4. Enable remote access for the MySQL database
Connect to the ECS instance in which the MySQL database is deployed to enable remote access for the database.
The following commands apply only to MySQL 8.0 running on Linux. Adapt them to your operating system and MySQL version.
Find the path of the my.cnf configuration file. In most cases, the path is /etc/my.cnf.
find / -name my.cnf
Run the vim /etc/my.cnf command to edit the configuration file. Replace /etc/my.cnf with the actual path of the my.cnf configuration file.
Press i to enter insert mode and add the following line under [mysqld], for example at the end of that section:
bind-address=0.0.0.0
Press Esc and then enter :wq! to save and exit.
Run the systemctl restart mysqld command to restart the service.
Create a user to remotely connect to the MySQL database when you add a data source to the DataWorks workspace.
Run the mysql -u root -p command to log on to the database as an administrator.
Create a user and configure a password.
-- "dataworks_user" indicates the username. You can specify a different username.
-- "%" indicates that the user can access the database from any IP address. You can also specify a specific IP address for fine-grained management.
-- "StrongPassword123!" indicates the password. Replace it with your own password.
CREATE USER 'dataworks_user'@'%' IDENTIFIED BY 'StrongPassword123!';
Grant the permissions on the database to the user.
-- Run one of the following commands:
-- Grant all permissions on all databases to the user. Proceed with caution.
GRANT ALL PRIVILEGES ON *.* TO 'dataworks_user'@'%' WITH GRANT OPTION;
-- Grant permissions on a specific database such as mydatabase to the user.
GRANT ALL PRIVILEGES ON mydatabase.* TO 'dataworks_user'@'%' WITH GRANT OPTION;
Run the FLUSH PRIVILEGES; command to refresh permissions and then run the exit command to exit the database.
Test remote connection.
mysql -u dataworks_user -h <Primary private IP address of the ECS instance> -p
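If the remote connection test fails, one thing to rule out is that the bind-address setting did not end up in the [mysqld] section. The following Python sketch reads the text of a my.cnf file and reports the value it finds. It is a minimal line-based parser written for illustration, not MySQL's own option-file logic. It does not follow !include or !includedir directives and does not strip comments that trail a value. The function name mysqld_bind_address is hypothetical.

```python
from typing import Optional


def mysqld_bind_address(cnf_text: str) -> Optional[str]:
    """Return the last bind-address value set under [mysqld], or None."""
    section = None
    value = None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and !include directives.
        if not line or line.startswith(("#", ";", "!")):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "mysqld" and "=" in line:
            key, _, val = line.partition("=")
            # MySQL accepts bind-address and bind_address interchangeably.
            if key.strip().replace("_", "-") == "bind-address":
                value = val.strip()
    return value


if __name__ == "__main__":
    sample = "[client]\nport=3306\n\n[mysqld]\nbind-address = 0.0.0.0\n"
    print(mysqld_bind_address(sample))
```

To check a real file, pass its contents, for example the text read from /etc/my.cnf. A result of None or 127.0.0.1 suggests that the database is still not listening for remote connections.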
5. Configure a security group rule for the ECS instance
Log on to the ECS console by using Alibaba Cloud Account A. In the top navigation bar, select China (Hangzhou).
In the left-side navigation pane, open the Instance page. Find the ECS instance on which the MySQL database is deployed and click the name of the instance to go to the Instance Details tab.
Click the Security Groups tab. On the Security Groups tab, find the security group and click the name of the security group to go to the Security Group Details tab.
In the Access Rule section, click Quick Add. In the Quick Add dialog box, configure the following key parameters. You can retain default values for other parameters.
Authorization Object: Enter the CIDR block of the vSwitch with which the DataWorks resource group is associated. In this example, 172.16.66.0/24 is used.
Port Range: Select the port number of the database deployed on the ECS instance. In this example, 3306 is used.
6. Test network connectivity
Log on to the DataWorks console by using Alibaba Cloud Account B. In the top navigation bar, select the desired region. In the left-side navigation pane, open the Data Integration page. On the page that appears, select the desired workspace from the drop-down list and click Go to Data Integration.
In the left-side navigation pane, click Data source. On the Data Sources page, click Add Data Source.
In the Add Data Source dialog box, select MySQL and configure the related parameters to add a MySQL data source.
Configuration Mode: Select Connection String Mode.
Host IP Address: Enter the private IP address of the ECS instance. In this example, 192.168.6.172 is used.
Port Number: Enter 3306.
Database Name: Enter the name of an existing database.
Username and Password: Enter the username and password of the dataworks_user account created in the 4. Enable remote access for the MySQL database step.
In the Connection Configuration section of the Add MySQL Data Source dialog box, select the resource group that is associated with the workspace and click Test Network Connectivity in the Connection Status column to check whether the network connectivity test is passed.
Note: If the test fails, you can click Self-service Troubleshoot to resolve the issue. If the test still fails after troubleshooting, submit a ticket.
References
For answers to frequently asked questions about network connectivity, see Network connectivity and operations on resource groups.