I am trying to read an XML file from an S3 location using the defusedxml library. The code runs fine, but Bandit reports two MEDIUM-severity findings during code analysis, with the following messages:
blacklist: Using lxml.etree.parse to parse untrusted XML data is known to be vulnerable to XML attacks. Replace lxml.etree.parse with its defusedxml equivalent function.
Test ID: B320
Severity: MEDIUM
Confidence: HIGH
CWE: CWE-20
File: ./lambda_code/ccda_step2_validation/defusedxml/defusedxml/lxml.py
Line number: 135
More info: https://bandit.readthedocs.io/en/1.7.4/blacklists/blacklist_calls.html#b313-b320-xml-bad-etree
134 parser = getDefaultParser()
135 elementtree = _etree.parse(source, parser, base_url=base_url)
136 check_docinfo(elementtree, forbid_dtd, forbid_entities)
blacklist: Using lxml.etree.fromstring to parse untrusted XML data is known to be vulnerable to XML attacks. Replace lxml.etree.fromstring with its defusedxml equivalent function.
Test ID: B320
Severity: MEDIUM
Confidence: HIGH
CWE: CWE-20
File: ./lambda_code/ccda_step2_validation/defusedxml/defusedxml/lxml.py
Line number: 143
More info: https://bandit.readthedocs.io/en/1.7.4/blacklists/blacklist_calls.html#b313-b320-xml-bad-etree
142 parser = getDefaultParser()
143 rootelement = _etree.fromstring(text, parser, base_url=base_url)
144 elementtree = rootelement.getroottree()
Here is my pseudocode:
import logging

import boto3
from defusedxml.defusedxml.ElementTree import fromstring

# Assumed logger setup; LOGGER is defined elsewhere in the real code
LOGGER = logging.getLogger(__name__)
S3_CLIENT = boto3.client("s3")

is_valid_file = False
# bucketname and filename_with_key are set elsewhere
s3_file = S3_CLIENT.get_object(Bucket=bucketname, Key=filename_with_key)
# Read the entire content of the S3 object as bytes
s3_filedata = s3_file["Body"].read()
try:
    # tree = ET.ElementTree(ET.fromstring(s3_filedata))
    tree = fromstring(s3_filedata)  # fromstring returns the root element
    # search_document_header = tree.getroot()
    search_document_header = tree.findall(".")
    search_patient_section = tree.findall(".//{urn:hl7-org:v3}patientRole")
    if (
        "ClinicalDocument" in str(search_document_header)
        and "patientRole" in str(search_patient_section)
    ):
        is_valid_file = True
except Exception as e:
    is_valid_file = False
    LOGGER.error("in parse error: %s", e)
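
To make the check easier to reproduce without S3, here is a minimal sketch of the same validation on an in-memory byte string. The function name `is_valid_ccda`, the `HL7_NS` constant, and the sample document are assumptions for illustration only. It also assumes the root element carries the `urn:hl7-org:v3` namespace and that defusedxml is importable as a regular installed package (`defusedxml.ElementTree`) rather than from the vendored path above.

```python
try:
    from defusedxml.ElementTree import fromstring
except ImportError:
    # Fallback only so this sketch runs where defusedxml is not installed.
    # The stdlib parser is NOT hardened; do not use it for untrusted XML.
    from xml.etree.ElementTree import fromstring

# Namespace used by the patientRole search in the pseudocode above
HL7_NS = "{urn:hl7-org:v3}"


def is_valid_ccda(xml_bytes):
    """Return True if the root is ClinicalDocument and a patientRole exists."""
    try:
        root = fromstring(xml_bytes)  # returns the root Element
    except Exception:
        # Malformed XML or a construct rejected by defusedxml
        return False
    has_header = root.tag == HL7_NS + "ClinicalDocument"
    has_patient = root.find(".//" + HL7_NS + "patientRole") is not None
    return has_header and has_patient


# Hypothetical sample standing in for the S3 object body
sample = (
    b'<ClinicalDocument xmlns="urn:hl7-org:v3">'
    b"<recordTarget><patientRole/></recordTarget>"
    b"</ClinicalDocument>"
)
print(is_valid_ccda(sample))
```

Comparing `root.tag` and using `find()` checks the elements directly instead of searching the string representation of the result lists, which is the main difference from my pseudocode.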