How to Filter by Query on Nested Fields in AWS AppSync

0

Problem and Expected Results

I'm using a proof of concept schema and DynamoDB Table setup to filter on nested field values. I've followed the ideas very generally here (https://medium.com/open-graphql/implementing-search-in-graphql-11d5f71f179) as well as the documentation for $utils.transform.toDynamoDBFilterExpression (https://docs.aws.amazon.com/appsync/latest/devguide/resolver-util-reference.html#transformation-helpers-in-utils-transform).

The basic idea is this: using the same sort of principles, I'd like to filter by any arbitrarily deep nested field (short of the 32 document path length limit in DynamoDB). The relevant setup looks like this:

AppSync schema (apologies for the naming conventions; was supposed to be a quick and dirty PoC):

query {
    listActiveListingsBySubAndFilter(
        filter: TableTestMasterDataTable_ImportV1FilterInput!,
        limit: Int,
        nextToken: String
    ): TestMasterDataTable_ImportV1Connection
}

input TableBooleanFilterInput {
	ne: Boolean
	eq: Boolean
}

input TableDataObjectFilterInput {
	beds: TableFloatFilterInput
	baths: TableFloatFilterInput
}

input TableFloatFilterInput {
	ne: Float
	eq: Float
	le: Float
	lt: Float
	ge: Float
	gt: Float
	contains: Float
	notContains: Float
	between: [Float]
}

input TableIDFilterInput {
	ne: ID
	eq: ID
	le: ID
	lt: ID
	ge: ID
	gt: ID
	contains: ID
	notContains: ID
	between: [ID]
	beginsWith: ID
}

input TableIntFilterInput {
	ne: Int
	eq: Int
	le: Int
	lt: Int
	ge: Int
	gt: Int
	contains: Int
	notContains: Int
	between: [Int]
}

input TableStringFilterInput {
	ne: String
	eq: String
	le: String
	lt: String
	ge: String
	gt: String
	contains: String
	notContains: String
	between: [String]
	beginsWith: String
}

input TableTestMasterDataTable_ImportV1FilterInput {
	id: TableStringFilterInput
	status: TableStringFilterInput
	sub: TableStringFilterInput
	data: TableDataObjectFilterInput
}

type TestMasterDataTable_ImportV1 {
	id: String!
	status: String!
	sub: String!
	data: AWSJSON
}

type TestMasterDataTable_ImportV1Connection {
	items: [TestMasterDataTable_ImportV1]
	nextToken: String
}

input UpdateTestMasterDataTable_ImportV1Input {
	id: String!
	status: String
	sub: String!
	data: AWSJSON
}

VTL request and response resolvers:

## Request resolver

#set( $filter = $ctx.args.filter )
#set( $path = $filter.data )

{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "listings-index",  ## GSI on table with HASH: status, RANGE: sub
    "query" : {
        "expression": "#status = :status and #sub = :sub",
        "expressionNames" : {
        	"#status" : "status",
            "#sub" : "sub"
    	},
        "expressionValues" : {
            ":status" : $util.dynamodb.toDynamoDBJson("Active"),
            ":sub" : $util.dynamodb.toDynamoDBJson($filter.sub.eq)
        }
    },
    "filter" : $util.transform.toDynamoDBFilterExpression($path),
    "limit": $util.defaultIfNull($ctx.args.limit, 20),
    "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null))
}


## Response resolver

{
    "items": $util.toJson($ctx.result.items),
    "nextToken": $util.toJson($util.defaultIfNullOrBlank($context.result.nextToken, null))
}

Example DynamoDB Table element:

{
  "_meta": {
    "exposure": 0.08,
    "lastActive": 1557800000,
    "lastUpdated": 1557878400,
    "lastView": 1557878500,
    "numViews": 63,
    "posted": 1557878400
  },
  "buildingID": "325-5th-Ave,-New-York,-NY-10016,-USA",
  "data": {
    "agent": [
      {
        "agentID": "daeo@gmail.com"
      },
      {
        "agentID": "ben@gmail.com"
      }
    ],
    "amenities": [
      "hot tub",
      "time machine"
    ],
    "baths": 2,
    "beds": 2
  },
  "id": "325-5th-Ave,-New-York,-NY-10016,-USA#37C:1557878400",
  "status": "Active",
  "sub": "new-york/manhattan/listings",
  "unitNum": "37C",
  "unitRefID": "325-5th-Ave,-New-York,-NY-10016,-USA#37C"
}

Based on all of this, if I run the following query:

listActiveListingsBySubAndFilter(filter: {
    "sub" : {
      "eq" : "new-york/manhattan/listings"
    },
    "data": {
      "beds": {
      	"eq": 2.0
      }
    }) {
    items {
      id
      status
    }
    nextToken
}

I would expect to get something like this in return:

{
  "data": {
    "listActiveListingsBySubAndFilter": {
      "items": [
          {
              "id": "325-5th-Ave,-New-York,-NY-10016,-USA#37C:1557878400",
              "status": "Active"
          }
      ],
      "nextToken": null
    }
  }
}

Note: this is the only expected return since there's only one item matching these requirements in the database at this time.

Actual Results

All of that said, the results I'm getting (or lack thereof) aren't making much sense. No matter the query (data.beds, data.baths), if the field is nested in data the return is the same:

{
  "data": {
    "listActiveListingsBySubAndFilter": {
      "items": [],
      "nextToken": null
    }
  }
}

I've verified the query is working as expected and the filter expression is formatted appropriately (it works on other non-nested fields like id). What's perplexing is that the filter just doesn't seem to get applied (or maybe is being applied in some non-intuitive way?). For reference, here's a snippet of a typical CloudWatch log for the above:

{
    "context": {
        "arguments": {
            "filter": {
                "sub": {
                    "eq": "new-york/manhattan/listings"
                },
                "data": {
                    "beds": {
                        "eq": 2
                    }
                }
            },
            "limit": 200
        },
        "stash": {},
        "outErrors": []
    },
    "fieldInError": false,
    "errors": [],
    "parentType": "Query",
    "graphQLAPIId": "q7ueubhsorehbjpr5e6ymj7uua",
    "transformedTemplate": "\n\n{\n    \"version\" : \"2017-02-28\",\n    \"operation\" : \"Query\",\n    \"index\" : \"listings-index\",\n    \"query\" : {\n        \"expression\": \"#status = :status and #sub = :sub\",\n        \"expressionNames\" : {\n        \t\"#status\" : \"status\",\n            \"#sub\" : \"sub\"\n    \t},\n        \"expressionValues\" : {\n            \":status\" : {\"S\":\"Active\"},\n            \":sub\" : {\"S\":\"new-york/manhattan/listings\"}\n        }\n    },\n    \"filter\" : {\"expression\":\"(#beds = :beds_eq)\",\"expressionNames\":{\"#beds\":\"beds\"},\"expressionValues\":{\":beds_eq\":{\"N\":2.0}}},\n    \"limit\": 200,\n    \"nextToken\": null\n}"
}

Notice the filter expressionValues value in transformedTemplate: { "N" : 2.0 } (sans $util.toDynamoDBJson formatting) and compare it to the value in the object in DynamoDB on that field.

I've tried everything, including changing the fields themselves to strings and doing various filter operations like eq and contains to see if this was some odd type inconsistency, but no luck.

As of now, I have two backup solutions that involve either "pulling up" all the relevant fields I might want to filter on (cluttering my records with attributes I'd rather keep nested) or creating a new nested type containing only high-level fields for filtering on -- i.e., effectively split the records into a record reference and a record filter reference. In this scenario, we'd get some "Listing" record that has as its data field value something like ListingFilterData -- e.g.:

type Listing {
    id: String!
    sub: String!
    status: String!
    data: ListingFilterData!
}

type ListingFilterData {
    beds: Float!
    baths: Float!
}

Both are doable, but I'd rather try to solve the current issue instead of adding a bunch of extra data to my table.

Any thoughts?

Edited by: april-labs on Sep 16, 2019 4:31 PM

asked 5 years ago2552 views
4 Answers
0

The following statements in the VTL are relevant:
#set( $filter = $ctx.args.filter )
#set( $path = $filter.data )
...
"filter" : $util.transform.toDynamoDBFilterExpression($path)

Now, the filter passed in the query is:
{
"sub" : {
"eq" : "new-york/manhattan/listings"
},
"data": {
"beds": {
"eq": 2.0
}
}
}

Here, $path is the value of "data", ie {"beds":{"eq": 2.0}}
But, the object in the table is :
"data": {
"agent": [
{
"agentID": "daeo@gmail.com"
},
{
"agentID": "ben@gmail.com"
}
],
"amenities": [
"hot tub",
"time machine"
],
"baths": 2,
"beds": 2
}

The parent property "data" is missing in the query created by the VTL. Therefore, it does not match anything returned by the query and the result is null. You can try to add the parent node "data".

answered 5 years ago
0

After some more fiddling, I happened upon the solution implied here: https://stackoverflow.com/questions/55132782/appsync-graphql-how-to-filter-a-list-by-nested-value. Following from what I could glean about solution, I successfully implemented a hardcoded nested query filter using the following VTL request resolver (and changing the filter expression keyname to avoid a reserved word conflict on data`):

#set( $filter = $ctx.args.filter )
#set( $path = $filter.filterData ) ## currently, unused

{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "listings-index",
    "query" : {
        "expression": "#status = :status and #sub = :sub",
        "expressionNames" : {
        	"#status" : "status",
            "#sub" : "sub"
    	},
        "expressionValues" : {
            ":status" : $util.dynamodb.toDynamoDBJson("Active"),
            ":sub" : $util.dynamodb.toDynamoDBJson($filter.sub.eq)
        }
    },
    "filter" : {
		"expression" : "#filterData.beds = :beds",
        "expressionValues" : {
        	":beds" : $util.dynamodb.toDynamoDBJson(2.0)
        }
    },
    "limit": $util.defaultIfNull($ctx.args.limit, 20),
    "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null))
}

This returns my expected result:

{
  "data": {
    "listActiveListingsBySubAndFilter": {
      "items": [
        {
          "id": "325-5th-Ave,-New-York,-NY-10016,-USA#37C:1557878400",
          "status": "Active"
        }
      ],
      "nextToken": null
    }
  }
}

Update #2 from 9/17/19

After further playing around with the request resolver, I think I've found a quick and dirty way to dynamically grab the path and target vars for creating a filter expression for my nested attributes. Note: The whole thing still returns an empty result set and it assumes there's only one filter key (for now), but the reserved keyword bit seems to have been solved. Still wondering why the results aren't showing up as expected though.

#set( $filter = $ctx.args.filter )
#foreach( $parent in $filter.keySet() )
	#set( $path = $parent )
#end
#set( $target = $filter[$path] )
#foreach( $ff in $target.keySet() ) ## should only contain one Map key-value pair
	#set( $fp = $ff )
#end
#set( $fv = $target[$fp] )

{
    "version" : "2017-02-28",
    "operation" : "Query",
    "index" : "listings-index",
    "query" : {
        "expression": "#status = :status and #sub = :sub",
        "expressionNames" : {
        	"#status" : "status",
            "#sub" : "sub"
    	},
        "expressionValues" : {
            ":status" : $util.dynamodb.toDynamoDBJson("Active"),
            ":sub" : $util.dynamodb.toDynamoDBJson($filter.sub.eq)
        }
    },
    "filter" : {
		"expression" : "#ffp = :$fp",  ## filter path parent.target = :target
        "expressionNames" : {
        	"#ffp" : "${path}.${fp}"
        },
        "expressionValues" : {
        	":$fp" : $util.dynamodb.toDynamoDBJson(${fv.eq}), ## :target : value to filter for
        }
    },
    "limit": $util.defaultIfNull($ctx.args.limit, 200),
    "nextToken": $util.toJson($util.defaultIfNullOrEmpty($ctx.args.nextToken, null))
}

Inspecting the CloudWatch log transformedTemplate shows the expression names and values are being substituted appropriately:

"filter" : {
    "expression\" : "#ffp = :beds",
    "expressionNames" : {
        "#ffp" : "data.beds"
    },
    "expressionValues" : {
        ":beds" : { "N": 2.0 }
    }
}

"Parent" category of beds (data) is present in the VTL query as far as I can tell.

answered 5 years ago
0

Update from 09/18/19

I may have finally discovered the root of the problem: it seems that the way in which expressionNames are evaluated does not allow for a key to be a docpath. If I run either of the following filters (notice the use of a non-reserved DynamoDB keyword to illustrate the problem is with expression name substitution), I'll get the result I'm looking for:

"filter" : {
	"expression" : "filterData.beds = :beds",  ## filter path parent.target = :target
        "expressionValues" : {
        	":beds" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

or

"filter" : {
	"expression" : "filterData.beds = :${fp}",  ## filter path parent.target = :target
        "expressionValues" : {
        	":{fp}" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

Now, if I make a minor change, only attempting to substitute with an expression name value

"filter" : {
	"expression" : "#filterData.beds = :${fp}",  ## filter path parent.target = :target
        "expressionNames": {
        	"#filterData.beds" : "filterData.beds"
        },
        "expressionValues" : {
        	":{fp}" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

I get the following error message:

"ExpressionAttributeNames contains invalid key: Syntax error; key: \"#filterData.beds\" (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ValidationException"

Even with a hardcoded path substitution, VTL seems to read the path as a single key name. Same issue when swapping the values of the expressions dynamically so there're no hardcoded strings.

answered 5 years ago
0

SOLVED

I happened upon this https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ExpressionAttributeNames.html#Expressions.ExpressionAttributeNames.AttributeNamesContainingDots by accident, and it gave me the little bit extra I needed to find a workable solution with dynamic key names!

Here's what the filter expression looks like now:

"filter" : {
		"expression" : "#path.#filter = :${fp}",  ## filter path parent.target = :target
        "expressionNames": {
        	"#path" : "${path}",
        	"#filter" : "${fp}"
        },
        "expressionValues" : {
        	":${fp}" : $util.dynamodb.toDynamoDBJson(${fv.eq}) ## :target : value to filter for
        }
    }

The hold up here was that while expression attribute names are generally interpreted as document paths, with the introduction of the substituted name, the interpreter treats the key name as a scalar attribute and NOT as a document path. You need to individually identify the path elements and substitute for each.

answered 5 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions