AWS Transform for mainframe: Understanding Your S3 Output Structure

10 minute read
Content level: Intermediate

AWS Transform for mainframe automates the analysis, transformation, and modernization of mainframe applications. Each job execution generates a structured set of artifacts in Amazon S3 that are essential for downstream modernization activities — including code conversion, data migration, testing, and validation. As teams scale their modernization efforts across multiple applications and job runs, navigating these outputs efficiently becomes critical.

After running an AWS Transform for mainframe job, understanding where to find your outputs and what each folder contains is important for successful modernization. This document serves as a technical reference for the S3 bucket structure and artifacts generated during AWS Transform for mainframe workflows. Use this guide to locate specific S3 outputs, understand artifact contents, and integrate transformation results into your modernization workflow. Each transformation job produces a standardized set of outputs based on the job objectives, organized into designated folders within the configured S3 bucket, for example s3://[aws-transform-bucket-name]/[job-id]/.
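As a quick illustration of the path convention, the S3 URI for any output folder of a job can be composed like this (the bucket name and job ID below are placeholders, not real values):

```python
def artifact_prefix(bucket: str, job_id: str, folder: str) -> str:
    """Compose the S3 URI for one output folder of a transform job.

    The bucket and job_id arguments are placeholders: substitute your
    configured bucket and the job ID of your transformation run.
    """
    return f"s3://{bucket}/{job_id}/{folder.rstrip('/')}/"

# Hypothetical bucket and job ID:
print(artifact_prefix("aws-transform-bucket-name", "job-1234", "codetransform"))
# -> s3://aws-transform-bucket-name/job-1234/codetransform/
```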

Folder | Purpose | Artifacts
1/business-documentation/ | Business rules and logic extraction | HTML, JSON files
1/data_analysis/ | Data lineage and dictionary | Data relationships, COBOL copybooks, Db2 schemas
1/decomposition/ | Domain decomposition | dependency_graph.json
1/documentation/ | Compressed documentation archive | Zipped PDF and XML files
codetransform/ | Refactored Java application | generated.zip
doc/ | Environment setup guide | IaC templates (CloudFormation, CDK, Terraform)
documentation/ | Technical documentation | PDF and XML files per program
inputs/ | Source code and configuration | COBOL, JCL, copybooks, VSAM files
logs/ | Transform processing logs | Processing logs per activity type
reforge/ | Restructured Java code | reforge.zip (maven_project, logs)
results/ | Code analysis outputs | analyze_code_results.zip, metrics
runtime/ | AWS Blu Age runtime | gapwalk zip file
smf_analysis/ | Mainframe telemetry analysis | CSV files (batch jobs, CICS transactions)
test_automation_script_generation/ | Automated test scripts | Test scripts, comparison tools
test_data_collection/ | Test data extraction scripts | JCL scripts (before/after)
test_plan/ | Test strategy | JSON test scenarios

Transformation output structure details

1/ - Transformation Run Directory

Contains outputs from the initial transformation run. Subsequent runs are numbered sequentially (e.g., 2/, 3/).
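When a job has been run several times, a quick way to pick out the most recent run directory is to filter the numeric top-level prefixes. This sketch assumes you have already listed the prefixes with your S3 client of choice:

```python
def latest_run(prefixes):
    """Return the highest-numbered run directory from a list of
    top-level prefixes (e.g., the result of an S3 prefix listing)."""
    runs = []
    for p in prefixes:
        name = p.strip("/")
        if name.isdigit():  # run directories are purely numeric: 1/, 2/, ...
            runs.append(int(name))
    return f"{max(runs)}/" if runs else None

print(latest_run(["1/", "2/", "codetransform/", "logs/", "3/"]))  # -> 3/
```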

Business Documentation (1/business-documentation/)

Contains business logic extraction results in both HTML and JSON formats, capturing business rules, process flows, and functional specifications at both application and individual program levels. Use these artifacts to validate extracted business rules against current requirements, identify gaps or inconsistencies in legacy implementations, understand business context during code modifications, and generate user stories or functional specifications for the modernized application.

Data Analysis (1/data_analysis/)

Provides data lineage and data dictionary outputs that map data flows and structures across the application. Data lineage includes dataset impact analysis with CRUD operations, Db2 table relationships, program-to-data mappings, and JCL-to-data dependencies. The data dictionary contains COBOL copybook field definitions with business descriptions and Db2 table schemas with primary/foreign keys. Use these outputs to design target database schemas, plan data migration strategies and transformations, identify critical datasets for test data preparation, and trace data lineage for regulatory compliance requirements.

Dependency Graph (1/decomposition/)

Contains domain decomposition artifacts showing a proposed segmentation of the application into logical business domains. The dependency_graph.json file provides a representation of component relationships and domain boundaries, along with domain definitions, seed files, component assignments, and cross-domain dependency mappings. Use these artifacts to evaluate and define microservices boundaries and API contracts, sequence modernization waves based on domain dependencies, understand component ownership and integration points, and plan deployment pipelines and environment configurations.
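The exact schema of dependency_graph.json is not documented here, so as a minimal sketch, assuming a simple nodes-and-edges layout with a domain label per node (an assumption to verify against your actual file), cross-domain dependencies could be surfaced like this:

```python
def cross_domain_edges(graph: dict) -> list:
    """List edges whose endpoints belong to different domains.

    Assumes a hypothetical schema: 'nodes' with 'id' and 'domain'
    fields, and 'edges' with 'source'/'target' fields. Adapt the key
    names to the real dependency_graph.json layout.
    """
    domains = {n["id"]: n["domain"] for n in graph["nodes"]}
    return [
        (e["source"], e["target"])
        for e in graph["edges"]
        if domains[e["source"]] != domains[e["target"]]
    ]

# Synthetic example data (not real output):
sample = {
    "nodes": [
        {"id": "PAYROLL1", "domain": "payroll"},
        {"id": "ACCT100", "domain": "accounting"},
    ],
    "edges": [{"source": "PAYROLL1", "target": "ACCT100"}],
}
print(cross_domain_edges(sample))  # -> [('PAYROLL1', 'ACCT100')]
```

Edges that cross domain boundaries are the candidates for explicit API contracts between future microservices.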

Documentation (1/documentation/)

Contains a compressed archive of the technical documentation generated for each COBOL program and JCL file. This zipped version mirrors the contents of the root-level documentation/ folder for convenient bulk download. Documentation is available in two detail levels: Summary (high-level overview with one-line summaries per file) and Detailed Functional Specification (details include logic and flow, dependencies, input and output processing, and transaction details). Use this archive for offline access or bulk distribution of documentation artifacts.
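After downloading the archive locally, unpacking it for offline review is straightforward with the standard library; this sketch assumes the archive has already been copied out of S3:

```python
import zipfile
from pathlib import Path

def extract_docs(archive: str, dest: str) -> list:
    """Unpack a downloaded documentation archive and return the
    extracted member names (PDF and XML files per program)."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_path)
        return zf.namelist()
```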

Transformed Code (codetransform/)

The generated.zip file contains the refactored Java application code packaged alongside the AWS Blu Age Runtime required for compilation. The refactored code is designed to maintain business-critical logic while transforming COBOL programs into Java applications suitable for cloud deployment. A key file included in the package is the transformation report located at /report/generate-status-report.xls, which provides detailed insight into the results of the transformation process. This workbook contains two important tabs to review: the weather tab and the generate tab. The weather tab provides a high-level overview of the overall transformation health, and the generate tab tracks the transformation status of each individual artifact. Validation of the transformed code against source application behavior is recommended prior to production deployment.

Setup Guide (doc/)

Contains step-by-step instructions for environment configuration to set up the AWS refactored development environment. Includes Infrastructure as Code (IaC) templates in multiple formats (CloudFormation, AWS CDK, and Terraform) that provide a baseline configuration for essential components such as compute resources, databases, storage, and security controls for deploying the modernized mainframe application. Customization to align with your organization's environment and security requirements may be necessary.

Documentation (documentation/)

Contains technical documentation generated for each COBOL program and JCL file. Documentation is available in two detail levels: Summary (high-level overview with one-line summaries per file) and Detailed Functional Specification (details include logic and flow, dependencies, input and output processing, and transaction details). Outputs include PDF files for review and XML files for automated processing. Optional customization is supported through glossary.csv (for terminology definitions) and pdf_config.json (for branded headers, footers, and logos). Use this documentation to understand legacy program logic, support knowledge transfer and onboarding, and provide reference material for troubleshooting and maintenance activities.
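As a small illustration of the customization inputs, a glossary.csv can be produced with the standard csv module. The term/definition column names below are an assumption, so confirm the expected header in the AWS Transform documentation before use:

```python
import csv

# Hypothetical glossary rows; the 'term'/'definition' column names are
# an assumption about the expected glossary.csv header.
rows = [
    {"term": "CUST-REC", "definition": "Customer master record"},
    {"term": "POLNUM", "definition": "Insurance policy number"},
]

with open("glossary.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["term", "definition"])
    writer.writeheader()
    writer.writerows(rows)
```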

Inputs (inputs/)

Contains the application source code and configuration files provided as input to the transformation job. This includes COBOL programs, JCL scripts, copybooks, VSAM files, data files, and mainframe environment-specific configuration files. For a complete list of supported file types, see Supported file types for transformation of mainframe applications.

Transformation Logs (logs/)

Contains logs of processing activities performed by AWS Transform. Use these logs to troubleshoot transformation issues and diagnose warnings and errors.
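The log format is not specified here, so as a rough triage sketch, assuming lines carry plain WARN/ERROR tokens (adjust the markers to the actual format), issues can be grouped for review like this:

```python
def collect_issues(log_text: str) -> dict:
    """Group WARN/ERROR lines from a transform log for quick triage.

    Assumes log lines contain a literal 'WARN' or 'ERROR' token; this
    marker is an assumption about the log format, not a documented one.
    """
    issues = {"WARN": [], "ERROR": []}
    for line in log_text.splitlines():
        for level in issues:
            if level in line:
                issues[level].append(line.strip())
    return issues

# Synthetic log sample:
sample_log = (
    "INFO parsed PAYROLL1.cbl\n"
    "WARN missing copybook CUSTREC\n"
    "ERROR failed to parse JOB1.jcl\n"
)
print(collect_issues(sample_log))
```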

Reforge Package (reforge/)

Contains reforged Java code restructured to align with modern Java idioms and conventions. The output is packaged as reforge.zip containing: maven_project/ (reforged source code), reforge.log (diagnostic logs), and tokenizer_map.json (mapping of token IDs to data for privacy protection). Files with unsuccessful compilation are saved as .java.incomplete, while successfully refactored files are backed up as .java.original. Use these artifacts to evaluate and replace refactored code with restructured Java implementations where applicable, and selectively apply improvements from incomplete reforges. Review of all reforged outputs is recommended to confirm expected behavior and code quality.
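After unpacking reforge.zip locally, the .java.original / .java.incomplete naming described above makes it easy to summarize which files reforged cleanly; a minimal sketch:

```python
from pathlib import Path

def reforge_status(project_dir: str) -> dict:
    """Summarize an extracted maven_project: files that failed to
    compile after reforging (.java.incomplete) versus files that were
    reforged successfully and backed up (.java.original)."""
    root = Path(project_dir)
    return {
        "incomplete": sorted(p.name for p in root.rglob("*.java.incomplete")),
        "backed_up": sorted(p.name for p in root.rglob("*.java.original")),
    }
```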

Code Analysis (results/)

Contains code analysis outputs including: analyze_code_results.zip (classification file, assets list, dependencies JSON, and list of missing files), classification_*.json (file type classifications), decomposition.zip (decomposition results and domain list), and generic-analysis-*.json (dependency analysis and complexity metrics). Analysis provides file-level details such as lines of code, cyclomatic complexity, comment lines, and effective lines of code. Use these outputs to assess modernization scope, and identify high-complexity components requiring additional testing.
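To act on the complexity metrics, one common step is flagging files above a complexity threshold for extra testing. The field names below ('name', 'cyclomatic_complexity') are assumptions about the analysis JSON, so map them to the actual keys in your generic-analysis-*.json output:

```python
def high_complexity(files: list, threshold: int = 20) -> list:
    """Flag files whose cyclomatic complexity exceeds a threshold,
    highest first. Key names are assumed, not documented."""
    flagged = [f for f in files if f["cyclomatic_complexity"] > threshold]
    return sorted(flagged, key=lambda f: f["cyclomatic_complexity"], reverse=True)

# Synthetic example data (not real output):
sample = [
    {"name": "PAYROLL1.cbl", "cyclomatic_complexity": 45},
    {"name": "UTIL01.cbl", "cyclomatic_complexity": 8},
]
print(high_complexity(sample))
# -> [{'name': 'PAYROLL1.cbl', 'cyclomatic_complexity': 45}]
```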

Runtime Environment (runtime/)

Contains the AWS Blu Age runtime environment packaged as a gapwalk zip file, which provides the necessary libraries and components to execute the refactored COBOL-to-Java application on AWS. The runtime environment supports functional equivalence between the legacy mainframe application and the modernized cloud application by providing COBOL-compatible data handling, transaction processing, and system services in the Java environment. Include these artifacts in your deployment packages.

Telemetry Analysis (smf_analysis/)

Contains System Management Facility (SMF) analysis results in CSV format, providing insights into batch jobs (type 30) and CICS transactions (type 110). Analysis includes tabular views with summary information, batch job/CICS transaction analysis with aggregated metrics, and code analysis comparison (when code analysis was performed prior to SMF analysis). The output displays the time range of analyzed records and provides discovery summaries to identify key jobs and transactions. Use these outputs to identify unused code for retirement, understand job performance and execution patterns, and support decision-making on target architecture based on usage patterns.
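One typical question against the batch-job CSVs is simple execution frequency, used to spot unused or rarely-used jobs. The column name below is an assumption, so check the header row of the actual smf_analysis/ output and adjust:

```python
import csv
import io
from collections import Counter

def job_frequency(csv_text: str, job_column: str = "job_name") -> Counter:
    """Count occurrences per batch job in an SMF analysis CSV.

    The 'job_name' column name is an assumption about the CSV header,
    not a documented field.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[job_column] for row in reader)

# Synthetic CSV sample:
sample_csv = "job_name,cpu_seconds\nPAYJOB1,12\nPAYJOB1,10\nRPTJOB9,3\n"
print(job_frequency(sample_csv))
```

Jobs that never appear in the analyzed time range are candidates for retirement rather than transformation.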

Test Automation (test_automation_script_generation/)

Contains automated test scripts designed to execute test cases on the modernized application. Scripts are organized by test case in individual S3 folders and include capabilities for environment setup, data preparation (utilizing test data from the test data collection step), test case execution, and result comparison. Also includes comparison tools (38 MB) for output validation, data migration utilities (50 MB) for test data setup, and technical/user documentation. Use these scripts to execute automated regression testing and integrate with CI/CD pipelines for continuous validation.

Test Data Collection (test_data_collection/)

Contains generated JCL scripts for extracting test data from the mainframe environment. Scripts are organized by test case with separate folders for "before" and "after" data collection, and are categorized by data type (sequential datasets, database tables, and transfers). Scripts support Db2 table unloading, VSAM file processing (REPRO utility), and sequential dataset processing, with customization based on provided templates and configuration files. Execute these scripts on the mainframe to collect test data and transfer the collected data to AWS test environments.

Test Plan (test_plan/)

Contains a test strategy in JSON format with test scenarios designed to cover the identified application functions based on code analysis and scheduler configurations. Includes test case definitions with preferred execution order based on dependencies, job group assignments from the scheduler, complexity scores (aggregate scores for test cases), business domain associations, cyclomatic complexity metrics, dataset and table dependencies with direction (input/output), lines of code metrics, and business function mappings. Test cases contain one or more JCL jobs in schedule execution order and are created from supported schedulers (CA7 and Control-M). Use this test plan to plan testing phases and resource allocation and to execute structured test scenarios.
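Since the plan encodes a preferred execution order, a small helper can sequence test cases for a run. The field names ('name', 'execution_order') are assumptions about the test plan JSON, so adapt them to the real schema:

```python
def execution_order(test_cases: list) -> list:
    """Return test case names sorted by the plan's preferred execution
    order. Field names are assumed, not documented."""
    ordered = sorted(test_cases, key=lambda tc: tc["execution_order"])
    return [tc["name"] for tc in ordered]

# Synthetic example data (not real output):
plan = [
    {"name": "TC-BILLING", "execution_order": 2},
    {"name": "TC-SETUP", "execution_order": 1},
]
print(execution_order(plan))  # -> ['TC-SETUP', 'TC-BILLING']
```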

Common Use Cases

  • I want to identify unused or rarely-used mainframe code: → Review smf_analysis/ CSV outputs for batch job and CICS transaction usage patterns
  • I need to trace which programs access specific datasets: → Use the program-to-data mappings in the 1/data_analysis/ data lineage outputs
  • I need to understand the format of a VSAM file: → Review the VSAM file definitions in the 1/data_analysis/ data dictionary
  • I need to understand the business logic before making changes: → Review 1/business-documentation/ for business rules and process flows
  • I need to review the refactored code: → Navigate to codetransform/generated.zip
  • I need to set up my AWS refactor environment: → Use Infrastructure as Code templates in doc/
  • I need to prepare test data from the mainframe: → Execute JCL scripts from test_data_collection/ on your mainframe environment

Additional Resources

AWS Transform: User Guide
re:Post selection: Generative AI for Mainframe Modernization

For technical support, contact your AWS account team or open a support case through the AWS Management Console.