Questions tagged with AWS Step Functions

Step Functions Choice Type coerces non-numeric values to 0 for Numeric*Path choices (e.g., NumericGreaterThanEqualsPath)

Here is a simple state machine to reproduce the bug:

```
{
  "StartAt": "First",
  "States": {
    "First": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.key",
          "NumericGreaterThanEqualsPath": "$.find",
          "Next": "ChoiceMatched"
        }
      ],
      "Default": "DefaultChoice"
    },
    "ChoiceMatched": {
      "Type": "Pass",
      "Result": "ChoiceMatched",
      "End": true
    },
    "DefaultChoice": {
      "Type": "Pass",
      "Result": "DefaultChoice",
      "End": true
    }
  }
}
```

Here are the test cases:

// Results in "DefaultChoice" for a non-numeric value of "key". Expected behavior.

```
{ "key": "ABC", "find": 0 }
```

// Results in "ChoiceMatched" when "key" is 0 and "find" is non-numeric. Incorrect behavior: "ABC" was treated as numeric.

```
{ "key": 0, "find": "ABC" }
```

// Results in "ChoiceMatched" when "key" is 1 and "find" is non-numeric. Incorrect behavior: "ABC" was treated as numeric.

```
{ "key": 1, "find": "ABC" }
```

// Results in "DefaultChoice" when "key" is -1 and "find" is non-numeric. Incorrect behavior: "ABC" was treated as numeric.

```
{ "key": -1, "find": "ABC" }
```

The last three cases show that the value of the "find" attribute ("ABC") was converted to a number, specifically 0, when compared against the "key" value. This happens only when the non-numeric value is the one referenced by the Path operand, not when it is the variable being compared (test case 1). Also, this coercion happens only for numeric comparisons. Comparisons for other data types, such as string, work as expected (there is another odd behavior with timestamps, but I will ask a separate question for that).

The documentation for the Choice state (https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-choice-state.html) says: "For each of these operators, the corresponding value must be of the appropriate type: string, number, Boolean, or timestamp. Step Functions doesn't attempt to match a numeric field to a string value." That is clearly not the observed behavior.

If this can't be fixed, the documentation should at least be updated to specify the conversion rules. Thanks.
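
Until the behavior is clarified, one possible guard is to check both operands with the `IsNumeric` comparison operator before the numeric comparison runs. The rule below is a minimal sketch of what the single entry in `"Choices"` could look like with that guard added; it reuses the state names from the reproduction above. It assumes the `And` rule is enough to keep the coerced comparison from matching, which has not been verified against the service.

```json
{
  "And": [
    { "Variable": "$.key", "IsNumeric": true },
    { "Variable": "$.find", "IsNumeric": true },
    { "Variable": "$.key", "NumericGreaterThanEqualsPath": "$.find" }
  ],
  "Next": "ChoiceMatched"
}
```

With this rule, an input such as `{ "key": 0, "find": "ABC" }` would be expected to fall through to `"Default"`, because the second `IsNumeric` check is false regardless of how the path comparison treats "ABC".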
0 answers · 0 votes · 13 views · asked 11 days ago

Querying Latest Available Partition

I am building an ETL pipeline using primarily state machines, Athena, and the Glue catalog. In general, things work in the following way:

1. A table, partitioned by "version", exists in the Glue Catalog. The table represents the output destination of some ETL process.
2. A step function (managed by some other process) executes "INSERT INTO" Athena queries. The step function supplies a "version" that is used as part of the "INSERT INTO" query so that new data can be appended to the table defined in (1). The table contains all "versions"; it is a historical table that grows over time.

My question is: what is a good way of exposing a view/table that allows someone (or something) to query only the latest "version" partition of a given historically partitioned table?

I've looked into other table types AWS offers, including Governed tables and Iceberg tables. Each seems to have some incompatibility with our existing or planned future architecture:

1. Governed tables do not support writes via Athena insert queries. Only Glue ETL/Spark seems to be supported at the moment.
2. Iceberg tables do not support Lake Formation data filters (which we'd like to use in the future to control data access).
3. Iceberg tables also seem to have poor performance. Anecdotally, it can take several seconds to insert a handful of rows into a given Iceberg table. I'd worry about future performance when we want to insert a million rows.

Any guidance would be appreciated!
1 answer · 0 votes · 51 views · asked a month ago