Hello! My name is Josh Frantz, and I am a Security Consultant at Rapid7. I’m here today to talk about using the Data Warehousing functionality in InsightVM, our vulnerability management solution, to export scan data to a managed Cloud SQL instance in Google Cloud Platform (GCP).
In this example, we’ll use a simple public IP address in GCP for our database (you could instead use a private IP address or set up a VPN or proxy) and an instance of InsightVM from which to export our data. As always, make sure you have proper approval and change orders in place before performing the steps below.
The following are the hardware requirements for the Data Warehouse in Google Cloud:
- 2 GHz+ processor (Quad-core processor recommended)
- 32 GB RAM (minimum), 72 GB+ RAM (recommended)
- 1 TB HDD (minimum), 2 TB+ HDD (recommended)
- 100 Mbps network interface (minimum), 1 Gbps (recommended)
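If you want to sanity-check a planned instance size against the minimums above before creating anything, a few lines of Python will do. The sketch below is illustrative only: the `Spec` class, its field names, and the `shortfalls` helper are made up for this post, and how you map a Cloud SQL machine type onto these numbers is up to you.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Spec:
    """A sizing description using the same units as the list above."""
    cpu_ghz: float
    ram_gb: int
    disk_tb: float
    network_mbps: int


# Minimum values copied from the requirements list in this post.
MINIMUM = Spec(cpu_ghz=2.0, ram_gb=32, disk_tb=1.0, network_mbps=100)


def shortfalls(candidate: Spec, required: Spec = MINIMUM) -> List[str]:
    """Return one message per field where the candidate is below the requirement."""
    problems = []
    for field in ("cpu_ghz", "ram_gb", "disk_tb", "network_mbps"):
        have = getattr(candidate, field)
        need = getattr(required, field)
        if have < need:
            problems.append(f"{field}: {have} < {need}")
    return problems


# A candidate with too little memory is flagged; everything else passes.
print(shortfalls(Spec(cpu_ghz=2.5, ram_gb=16, disk_tb=1.0, network_mbps=1000)))
```

An empty list means the candidate meets every minimum. You can pass a second `Spec` holding the recommended values to check against those instead.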
Let’s get started!
What is data warehousing?
Data warehousing is a mechanism that lets you export data from InsightVM and use it for external reporting or business intelligence in tools such as Kenna, Power BI, or Splunk. The export performs an extract, transform, and load (ETL) process into the target warehouse using a dimensional model (more about that can be found here).
Why Cloud SQL from Google Cloud?
Cloud SQL is a fully managed database service that makes it extremely easy to set up, maintain, and manage your databases in the cloud. It provides scalability, high performance, and convenience for your applications running anywhere. This means:
- Updates, patching, replication, and backups are all done automatically.
- Storage can be set to grow automatically, so you worry less about running out of disk space.
- Cloud SQL is SSAE 16-, ISO 27001-, PCI DSS v3-, and HIPAA-compliant.
Setting up the managed SQL instance in GCP
Before you set up your warehousing job in InsightVM, you first need to configure a database instance within Google Cloud. To do this, go to your Google Cloud console and find the SQL option on the left-hand side.
Click the Create Instance button at the top, then select PostgreSQL 9.6 as your database engine. After clicking Next and waiting a few minutes for the Compute Engine API to initialize, you can fill in the instance details. As a reminder, we’re using a public IP address for this, so I’ll configure it so that only my console IP address can access this instance.
After selecting the specifications (listed at the beginning of this post) and any other optional configuration, click Create. After a few minutes, your instance will be created. For the next step, you will download the required certificates and import them into InsightVM so you can connect to your Cloud SQL instance.
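Authorized networks in Cloud SQL are entered in CIDR notation, so restricting access to a single machine means entering its address as a /32. Here is a minimal sketch that validates an address and formats it that way. The function name is hypothetical, the sample address comes from a documentation-only range, and I’m assuming an IPv4 address because that is what this walkthrough uses.

```python
import ipaddress


def to_authorized_network(ip: str) -> str:
    """Validate a single IPv4 address and return it as a /32 CIDR block."""
    # ip_address raises ValueError if the string is not a valid address.
    address = ipaddress.ip_address(ip.strip())
    if address.version != 4:
        raise ValueError(f"expected an IPv4 address, got {ip!r}")
    return str(ipaddress.ip_network(f"{address}/32"))


# 203.0.113.10 is a placeholder; substitute your console's public address.
print(to_authorized_network("203.0.113.10"))  # 203.0.113.10/32
```

Running the address through a check like this catches typos before they end up in the allow list, where a wrong value would lock your console out.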
After the instance is created, you can see the instance details. Copy the IP address of the instance and the database name listed in the details. Before you set up the actual warehousing, you need a certificate. Click on Connections, then scroll down and select Allow only SSL connections so that no one can connect to this instance over an unencrypted channel. The instance will be updated, and you can then create a certificate and download it.
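If you later want to connect to this instance from your own tooling, the same certificate can be supplied to any client built on libpq (psql, for example) through a connection string. The following is a rough sketch that only builds that string. The host, database, user, and file names are placeholders, `build_dsn` and `quote_value` are hypothetical helpers, and whether your instance also expects a client certificate and key depends on how it is configured. The password is deliberately left out; libpq can read it from the `PGPASSWORD` environment variable or a password file.

```python
from typing import Optional


def quote_value(value: str) -> str:
    """Quote a value per libpq rules if it is empty or contains
    whitespace, single quotes, or backslashes."""
    if value and not any(ch.isspace() or ch in "'\\" for ch in value):
        return value
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_dsn(
    host: str,
    dbname: str,
    user: str,
    ca_path: str,
    port: int = 5432,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
) -> str:
    """Build a libpq keyword/value connection string that requires SSL
    and verifies the server against the downloaded CA certificate."""
    params = {
        "host": host,
        "port": str(port),
        "dbname": dbname,
        "user": user,
        "sslmode": "verify-ca",
        "sslrootcert": ca_path,
    }
    # Only add the client pair when both halves are supplied.
    if client_cert and client_key:
        params["sslcert"] = client_cert
        params["sslkey"] = client_key
    return " ".join(f"{key}={quote_value(val)}" for key, val in params.items())


# All values below are placeholders for illustration.
print(build_dsn("203.0.113.20", "postgres", "postgres", "server-ca.pem"))
```

Using `sslmode=verify-ca` rather than `require` means the client checks the server’s certificate against the CA file instead of merely encrypting the session.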
Once you’ve downloaded your SSL certificate, you're ready to log in to InsightVM.
Configuring InsightVM’s Data Warehouse export
For this next step, you’ll log in to InsightVM and configure your Data Warehouse export on a schedule to send data to your SQL instance in GCP. Before you configure the connection, you need to import the certificate downloaded in the previous step. You can do that by going to the Administration menu and navigating to Manage certificates under Scan Options.
You can then open the certificate you downloaded in a text editor and copy the contents into the box that pops up after you select Import Certificates toward the top of the page.
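Copying a certificate out of a text editor makes it easy to lose a line or pick up stray characters. As a quick pre-flight check, the sketch below confirms that the text contains at least one complete PEM certificate block whose body is valid base64. It is a structural check only and does not parse or validate the certificate itself, and the function name is my own invention.

```python
import base64
import binascii
import re
from typing import List

# Matches one PEM certificate block and captures the base64 body.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----\s*(.+?)\s*-----END CERTIFICATE-----",
    re.DOTALL,
)


def pem_certificate_bodies(text: str) -> List[bytes]:
    """Return the decoded body of every PEM certificate block in text.

    Raises ValueError if no block is found or a body is not valid base64.
    """
    bodies = []
    for match in PEM_BLOCK.finditer(text):
        # Remove line breaks and other whitespace inside the body.
        payload = "".join(match.group(1).split())
        try:
            bodies.append(base64.b64decode(payload, validate=True))
        except binascii.Error as exc:
            raise ValueError("certificate body is not valid base64") from exc
    if not bodies:
        raise ValueError("no PEM certificate block found")
    return bodies
```

To use it, read the downloaded file into a string, pass it to `pem_certificate_bodies`, and check that it returns without raising before you paste the contents into InsightVM.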
After pasting in your certificate, press Import and go back to the Administration menu.
Next, scroll toward the bottom of the page and click Manage next to Data Warehousing.
Make sure that Enable Export is checked, and fill in the information required to connect to your instance in GCP. The default port for PostgreSQL is 5432, which Cloud SQL uses as well. Make sure you use the IP address of the Cloud SQL instance you created earlier, along with the database name, user, and password.
You can test the connection and verify that it succeeds, then go to the Schedule tab on the left to select a start time and frequency. In this case, we set up a daily export starting at 8 a.m.
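If the connection fails, it helps to know whether the network path is open at all. The sketch below attempts a plain TCP connection to the PostgreSQL port. It says nothing about credentials or SSL, and it needs to run from the InsightVM console host, since that is the only address allowed through in this setup. The function name and the sample address are placeholders.

```python
import socket


def port_is_reachable(host: str, port: int = 5432, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # The connection is closed again as soon as the with block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Replace the placeholder with your Cloud SQL instance's IP address.
    print(port_is_reachable("203.0.113.20"))
```

A result of `False` usually points at the authorized networks list or a firewall in between rather than at the database user or password.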
Note that the larger the dataset, the more network usage and storage your Cloud SQL instance will require. Our example is a rather small dataset (about 2 GB), so I’m okay with running it once a day. Optionally, go to Data Retention and select a data retention option.
After configuring, click Save in the top right, and the export will start running on the schedule you selected. Once the first scheduled export has run, the Instance Details screen will show that your Cloud SQL instance now holds data.
You have now successfully connected your InsightVM dimensional warehouse to Google Cloud and can use this data in reporting, business intelligence, or log aggregation systems.
For more information on how to connect to this instance and import data into your applications, check out this how-to guide. As always, thank you for reading, and expect more blog posts about warehousing your data on other cloud platforms soon.
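To put rough numbers on that, you can work out the shortest possible time to move a dataset of a given size over a given link. This is back-of-the-envelope arithmetic, not a measurement: it assumes decimal units (1 GB is 8,000 megabits), a link that is fully available to the export, and that the whole dataset crosses the wire in a single run. I have not measured how much data an export actually transfers, so treat the result as a floor for planning.

```python
def min_transfer_seconds(dataset_gb: float, link_mbps: float) -> float:
    """Lower bound on the time to move dataset_gb over a link_mbps link."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    megabits = dataset_gb * 1000 * 8  # GB -> MB -> megabits
    return megabits / link_mbps


# The 2 GB example dataset over the minimum and recommended links.
print(min_transfer_seconds(2, 100))   # 160.0
print(min_transfer_seconds(2, 1000))  # 16.0
```

At this size the transfer is a matter of seconds to minutes on either link, which is why a daily schedule is comfortable here. The same calculation for a dataset hundreds of times larger is what should drive your choice of frequency.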