Extract and Save Data from FinSpace with Managed kdb Insights

2 minute read
Content level: Intermediate
0

Demonstrate how to use PyKX to locally save contents of Managed kdb tables and queries.

You will need your environment’s ID (ENV_ID in code example), you can find it on the kdb environment page on the AWS console: Enter image description here
Enter image description here

Using the PyKX libraries and AWS boto3 client, connect to the running FinSpace cluster ‘welcome_db’

import os
import boto3
import datetime

import pykx as kx

clusterName="cluster_welcomedb"
userName="sagemaker"

# create the boto client for finspace
session = boto3.Session()
client = session.client(service_name='finspace')

# get the userArn from finspace
resp=client.get_kx_user(environmentId=ENV_ID, userName=userName)
userArn = resp.get("userArn")

# using the userArn, get a connection string to the running FinSpace kdb cluster
resp=client.get_kx_connection_string(environmentId=ENV_ID, userArn=userArn, clusterName=clusterName)

conn_str = resp.get("signedConnectionString", None)

# parse the connection string into its components
conn_parts = conn_str.split(":")

host=conn_parts[2].strip("/")
port = int(conn_parts[3])
username=conn_parts[4]
password=conn_parts[5]

# now create a connection to the cluster with PyKX
hdb = kx.SyncQConnection(host=host, port=port, username=username, password=password)

# ensure the database is loaded
hdb('.Q.l `$.aws.akdbp,"/",.aws.akdb')

print(f"Database: {str(hdb('.aws.akdb'))} loaded")
print(f"Tables: {str(hdb('tables[]'))}")

Database: welcomedb loaded Tables: ,`example

With the client connected, query for each table’s contents and save it as a csv file.

# Select all, save each table locally
#

# name of database being saved
DB_NAME=str(hdb(".aws.akdb"))

# local directory to save database to
THIS_DATABASE_DIR = f"{DATABASES_DIR}/{DB_NAME}"

# list of tables
tables = hdb("tables[]").py()
print(tables)

# for each table
for t in tables:
    c=hdb(f"count {t}").py()
    print(f"Saving Table: {t} with {c:,} rows")
    
    # select all of table
    tb = hdb.sql(f"select * from {t}")

    # save tables as csv file to local DB path
    kx.q.write.csv(f"{THIS_DATABASE_DIR}/{t}.csv", tb,",")

print("Done Saving")

['example'] Saving Table: example with 10,000,000 rows Done Saving

References

FinSpace User Guide FinSpace Workshop Connecting to Amazon FinSpace Managed kdb Insights Clusters

profile pictureAWS
EXPERT
published 2 months ago59 views