Connecting to Amazon FinSpace Managed kdb Insights Clusters

6 minute read
Content level: Intermediate

Connecting to a FinSpace managed kdb cluster uses the same IPC communication as any other q process. The managed service ensures that only those entitled to access the cluster can get connection information through the service APIs. In this re:Post article we show how to get a signed URL to a managed kdb cluster (the connection string) and then how to use that connection string in various q clients to connect to the FinSpace managed kdb cluster.

Entitlements And The Connection String

To connect a client to a running FinSpace Managed kdb cluster, you first need a connection string (a signed URL) for the managed kdb instance. The service API GetKxConnectionString (CLI, REST, boto) checks that the calling user has access to the referenced environment and cluster and, if so, returns a signed URL to the cluster. For more information on setting up a user in FinSpace, see the FinSpace User Guide: Interacting with a kdb cluster. You can also read how to set up a user in the Amazon FinSpace With Managed kdb Insights Foundations Workshop.

The response to the service API GetKxConnectionString returns one element, signedConnectionString; it is that element we will use in the clients below.

The AWS CLI example below uses jq -r to extract the signedConnectionString element from the service API response:

sh-4.2$ aws finspace get-kx-connection-string --environment-id $ENV_ID --user-arn $USER_ARN \
--cluster-name cluster_welcomedb | jq -r '.signedConnectionString'

:tcps://vpce-0e26363340a92f198-ogxvd2j2.vpce-svc-04b62f50186b0d296.us-east-1.vpce.amazonaws.com:443:
sagemaker:Host=vpce-0e26363340a92f198-ogxvd2j2.vpce-svc-04b62f50186b0d296.us-east-1.
vpce.amazonaws.com&Port=443&User=sagemaker&Action=finspace%3AConnectKxCluster&X-Amz-Security-
Token=IQoJb3JpZ2luX2VjEHgaCXVzLWVhc3QtMSJHMEUCIQDK0%2BnZFlgQMIRQ6JUFusGFEzMlh%2FRxvQSyZhBJKC
%2BjlQIgL4ofS05aHZxsgsxnJPCHUJY7HeEvQt3QZ2JlAYk%2BcAEq9wIIYRAAGgw4Mjk4NDU5OTg4ODkiDDUdkotv%2
FhMpoCN03CrUAnGcAC1FtLcgK2nMOu9hGt7dPEmDaLRJ7sO3XRJv3jUaM4eLgHbNzC8Ym3XVW9XcqzPoySV1O3hkvZ
2wOH2fxAa9DSPphMrSolt4H%2F3QG%2BIbfuT%2BURIueTgIh2vvfxLjHbNwJKm5pGmOURWoVTUI5HVifC3ISYhM6AI
T4fs4w0dbst0VRpNgZ41wq1S%2BCOI7zXQFQfmUsUffsXh5QQcH9lH4VWdPHlyjDxx%2FnV6ah3Px7mTevWQMC2qrU
%2FIVrYehjGOSreLmS2hpOb6RJoHmVv9rEPq6IN%2BWAoqIqIS9BueHp62gQiO5Y%2FJbAVbrWTtSTavPSsu6TQEHKEJ
aSeLEfe%2Bjho9wKLZUzD28wj7%2FpVO4PYhglW8HHK%2B2t0OgcOcGEZDqihZGgQH3f8xK86WDpnzvtj4FvmkpyAU
%2FkNiUNzoFRogvHa48auNsKYdsk86q0Zijhzgw45%2F1uwY6vwGTOiKt7AOlA%2BL8XlCBMbMEP16WONg1lskQ3AQB
2bVbVO4kIJXTTCu5FZq3957E2p2yjKWUuwM63FBhLK7E1g3hGsdcemIwmR8nTbUX0KOaMCfTzArLdormCTBAGR88SLSv
UCLWU7P46X4OlTVe0KWlpQBQAn%2BnpzYHTHnSSx8o3kmeN8U1LuOIZgl8tj4N%2BGTVvRek5h73UuUYa0AXDnzTC
%2F5%2FP%2B0NiXrrVoknyED65qccB%2FwgGXDKVGSNSlITGQ%3D%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Date=20250107T160139Z&X-Amz-SignedHeaders=host&X-Amz-Expires=900&X-Amz-Credential=
ASIA4CNVNBUUWXGGDD3V%2F20250107%2Fus-east-1%2Ffinspace-apricot%2Faws4_request&
X-Amz-Signature=fa7a4edf0f228c8a739462b0876d5dee1f3638419474e87e64ba2649c4b0d9ee

The signedConnectionString element can be used “as is” when connecting from another q process. It also contains all the components needed for kdb IPC communication (host, port, username, and password), so it can be used from other q clients such as PyKX or R.
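
The sample output above carries X-Amz-Date and X-Amz-Expires parameters in its password portion, which in SigV4-style signed requests give the signing time and the validity window in seconds. As a minimal sketch, assuming the string always has the shape shown above, a client could estimate when a connection string stops being usable. The function name signed_string_expiry and the shortened SAMPLE string are illustrative only and are not part of any FinSpace API.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs


def signed_string_expiry(conn_str: str) -> datetime:
    """Estimate the UTC time at which a signed connection string expires."""
    # Layout assumed: ':tcps://host:port:user:password'; the password is
    # everything after the fifth colon and is a query-string of signing fields.
    password = conn_str.split(":", 5)[5]
    params = parse_qs(password)

    # X-Amz-Date is the signing time in ISO 8601 basic format (UTC)
    signed_at = datetime.strptime(
        params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)

    # X-Amz-Expires is the validity window in seconds
    return signed_at + timedelta(seconds=int(params["X-Amz-Expires"][0]))


# Shortened, made-up string with the same layout as the sample output
SAMPLE = (
    ":tcps://example-host:443:sagemaker:"
    "Host=example-host&Port=443&User=sagemaker"
    "&X-Amz-Date=20250107T160139Z&X-Amz-Expires=900"
)

print(signed_string_expiry(SAMPLE).isoformat())
```

With the values from the sample output (signed at 16:01:39 UTC, 900 seconds), the estimate falls 15 minutes after signing, so it is reasonable to request a fresh connection string shortly before each new connection rather than caching one.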

Connecting from another q process

Use system from the q process to capture the connection string and use the string to connect to the managed cluster in FinSpace using hopen.

q)\c 20 2000 
q)conn_str: system "aws finspace get-kx-connection-string --environment-id $ENV_ID --user-arn $USER_ARN --cluster-name cluster_welcomedb | jq -r '.signedConnectionString'"
q)con: hopen `$conn_str[0]
q)con "tables[]"
,`example
q)

Notes:

  • Assumes AWS CLI is available from the q process
  • Ensure the full string is captured by increasing the console size with “\c 20 2000”
  • system returns a list of strings; use conn_str[0] to get the first element of the list

Connecting with qcon

The q console (qcon) is a simple q client for connecting to a remote q process and is available from kdb on GitHub. It takes one argument, the remote q process as a string (host:port:username:password). The connection string returned by the service API get-kx-connection-string starts with “:tcps://”; those first 8 characters must be removed before the string is passed to qcon.

sh-4.2$ export CONN_STRING=$(aws finspace get-kx-connection-string --environment-id $ENV_ID --user-arn $USER_ARN --cluster-name cluster_welcomedb | jq -r '.signedConnectionString')
sh-4.2$ qcon ${CONN_STRING:8}
q)tables[]
,`example
q)
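
The same prefix removal can be sketched in Python, for example when a script prepares the argument before launching qcon. This is only an illustration: the helper name qcon_target and the sample string are hypothetical, and the check on the prefix is an added safeguard rather than something the service requires.

```python
PREFIX = ":tcps://"  # the 8 leading characters that qcon does not accept


def qcon_target(conn_str: str) -> str:
    """Return the host:port:username:password form expected by qcon."""
    if not conn_str.startswith(PREFIX):
        raise ValueError("connection string does not start with ':tcps://'")
    # Equivalent to the shell expansion ${CONN_STRING:8}
    return conn_str[len(PREFIX):]


# Made-up string with the same layout as a signed connection string
print(qcon_target(":tcps://example-host:443:sagemaker:Host=example-host&Port=443"))
```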

Connecting with PyKX

PyKX is the Python library for kdb/q provided by KX. Documentation is available from KX, and the package is installed from PyPI.

With a few utility functions you can use the AWS Python library (boto3) with PyKX to connect to a FinSpace Managed kdb cluster.

import boto3
import pykx as kx

ENV_ID="YOUR ENVIRONMENT ID HERE"

#-------------------------------------------------------------

def get_kx_connection_string(client, clusterName:str, userName:str, environmentId:str):
    resp=client.get_kx_user(environmentId=environmentId, userName=userName)

    userArn = resp.get("userArn")
    resp=client.get_kx_connection_string(environmentId=environmentId, userArn=userArn, clusterName=clusterName)

    return resp.get("signedConnectionString", None)


def parse_connection_string(conn_str: str):
    conn_parts = conn_str.split(":")

    host=conn_parts[2].strip("/")
    port = int(conn_parts[3])
    username=conn_parts[4]
    password=conn_parts[5]

    return host, port, username, password


def get_pykx_connection(client, clusterName: str, userName: str, environmentId:str):
    conn_str = get_kx_connection_string(client, environmentId=environmentId, clusterName=clusterName, userName=userName)

    host, port, username, password = parse_connection_string(conn_str)

    return kx.SyncQConnection(host=host, port=port, username=username, password=password)

#-------------------------------------------------------------

# create finspace client using default credentials
session = boto3.Session()
client = session.client(service_name='finspace')

# Connect to cluster with PyKX
hdb = get_pykx_connection(client=client, clusterName="cluster_welcomedb", userName="sagemaker", environmentId=ENV_ID )

# list tables on the cluster
hdb("tables[]").py()

Notes:

  • Be sure to provide your FinSpace environment ID in the variable ENV_ID
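
To make the indexing in parse_connection_string concrete, the sketch below runs a variant of it on a made-up string with the same layout as a signed connection string. The variant limits the split to five colons so that a password containing ":" would stay in one piece; this is a defensive assumption, since the sample signed string shown earlier URL-encodes its colons as %3A.

```python
def parse_connection_string(conn_str: str):
    """Split ':tcps://host:port:user:password' into its four parts."""
    # maxsplit=5 keeps any ':' inside the password intact
    parts = conn_str.split(":", 5)

    # parts[0] is '' (text before the leading ':') and parts[1] is 'tcps'
    host = parts[2].strip("/")   # '//example-host' -> 'example-host'
    port = int(parts[3])
    username = parts[4]
    password = parts[5]

    return host, port, username, password


# Made-up string; a real one comes from get_kx_connection_string
sample = ":tcps://example-host:443:sagemaker:Host=example-host&Port=443&User=sagemaker"
host, port, username, password = parse_connection_string(sample)
print(host, port, username)
```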

Connecting with R

You can connect to managed clusters from R using the client provided by the rkdb package on GitHub.

# Define utility functions
# ------------------------------------------------------------

#' Test if package is already installed
#'
#' @param mypkg package name to check
#' @returns TRUE: if package is installed, else FALSE
#'
is.installed <- function(mypkg) {
    is.element(mypkg, installed.packages()[,1])
}

#' Install package if not installed
#'
#' @param x package name to confirm installation
#' @returns NULL
#'
pkgTest <- function(x) {
    if ( !is.installed(x) ) {
        install.packages(x, dep=TRUE)
    }
    return(NULL)
}

#' Get a connection to a FinSpace managed kdb cluster
#'
#' @param environment_id FinSpace environment ID
#' @param cluster_name FinSpace cluster name to connect to
#' @param user_name FinSpace user connecting to cluster
#' @returns A connection to the requested cluster
#'
get_connection <- function(environment_id, cluster_name, user_name) {
    # create finspace service client (uses default credentials)
    finspace <- paws::finspace()

    # Call service to get information about the FinSpace user
    user_resp <- finspace$get_kx_user(
        environmentId = environment_id,
        userName = user_name
    )

    # get the connection string to the cluster for the finspace user, passing the userARN
    conn_str <- finspace$get_kx_connection_string(
        environmentId = environment_id,
        userArn = user_resp$userArn,
        clusterName = cluster_name
    )

    # Break into parts for connection function call
    # (stringr is called by namespace because it is never attached with library())
    split_df <- stringr::str_split(conn_str$signedConnectionString, ":")

    # split the connection string
    conn_parts <- split_df[[1]]

    # get out the bits
    host <- gsub('/', '', conn_parts[3])
    port <- conn_parts[4]
    user <- conn_parts[5]
    pass <- conn_parts[6]

    # username and password are packed together for the connection call
    userpass <- paste(user, pass, sep=":")

    return(open_connection(host, port, userpass))
}

# ------------------------------------------------------------

# ensure needed packages are installed
for (p in c("paws", "nanotime", "stringr", "devtools") ) {
    pkgTest(p)
}

# rkdb is not on CRAN, installing is different (uses devtools)
if ( !is.installed("rkdb") ) {
    devtools::install_github('KxSystems/rkdb')
} 

library(rkdb)

# FinSpace environment ID
ENV_ID <- "YOUR ENVIRONMENT ID HERE"

# Cluster to connect to, created by welcome notebook: welcome.ipynb
CLUSTER_NAME <- "cluster_welcomedb"

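# FinSpace kdb user to connect as (assumed here to be the same "sagemaker" user as in the PyKX example)
KDB_USERNAME <- "sagemaker"

# connect to cluster
conn <- get_connection(ENV_ID, CLUSTER_NAME, KDB_USERNAME)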

# Available Tables in FinSpace Cluster
execute(conn, 'tables[]')

Notes:

  • Be sure to provide your FinSpace environment ID in the variable ENV_ID
AWS (EXPERT), published 9 days ago