
ExtractXml

Extracts values from XML content and writes them to metadata or annotations.

Parameters

| Name | Description | Allowed Values | Required | Default |
|---|---|---|---|---|
| allKeysDelimiter | Delimiter to use if handleMultipleKeys is ALL or DISTINCT | string | | , |
| contentIndexes | List of content indexes to include or exclude | integer (list) | | |
| contentTags | List of content tags to include or exclude, matching any | string (list) | | |
| errorOnKeyNotFound | Error if a key is not found. | boolean | | false |
| excludeContentIndexes | Exclude specified content indexes | boolean | | false |
| excludeContentTags | Exclude specified content tags | boolean | | false |
| excludeFilePatterns | Exclude specified file patterns | boolean | | false |
| excludeMediaTypes | Exclude specified media types | boolean | | false |
| extractTarget | Extract to metadata or annotations. | ANNOTATIONS, METADATA | | METADATA |
| filePatterns | List of file patterns to include or exclude, supporting wildcards (*) | string (list) | | |
| handleMultipleKeys | Concatenate all or distinct values extracted from multiple content, or just use the first or last content. | ALL, DISTINCT, FIRST, LAST | | ALL |
| mediaTypes | List of media types to consider, supporting wildcards (*) | string (list) | | |
| retainExistingContent | Retain the existing content | boolean | | false |
| xpathToKeysMap | Map of XPath expressions to keys. Values will be extracted using XPath and added to the corresponding metadata or annotation keys. | string (map) | | |

Input

Content

Input content to act on may be selected (or inversely selected using the exclude parameters) with contentIndexes, mediaTypes, and/or filePatterns. If any of these are set and the content is not matched, the content is passed through unchanged.
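The selection rules above can be sketched in Python. This is only an illustration under stated assumptions, not the action's implementation: the function name `is_selected`, the `params` dictionary, and the choice to combine the filters with AND are all hypothetical, and wildcard matching is approximated with the standard library's `fnmatch`.

```python
from fnmatch import fnmatchcase


def is_selected(index, media_type, name, params):
    """Decide whether one content item is acted on (hypothetical helper).

    Assumption: every filter that is set must pass, and an exclude flag
    inverts the result of its own filter only.
    """
    checks = []
    if params.get("contentIndexes"):
        hit = index in params["contentIndexes"]
        checks.append(hit != params.get("excludeContentIndexes", False))
    if params.get("mediaTypes"):
        hit = any(fnmatchcase(media_type, p) for p in params["mediaTypes"])
        checks.append(hit != params.get("excludeMediaTypes", False))
    if params.get("filePatterns"):
        hit = any(fnmatchcase(name, p) for p in params["filePatterns"])
        checks.append(hit != params.get("excludeFilePatterns", False))
    # With no filters set, every content item is selected.
    return all(checks)


# Select XML content, but skip anything whose name ends in .tmp.
params = {
    "mediaTypes": ["*/xml"],
    "filePatterns": ["*.tmp"],
    "excludeFilePatterns": True,
}
print(is_selected(0, "application/xml", "order.xml", params))
print(is_selected(1, "application/xml", "order.tmp", params))
```

Content for which such a check fails would be passed through unchanged, as described above.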

Output

Content is passed through unchanged.

Metadata

If extractTarget is METADATA, each XPath expression (a key of xpathToKeysMap) is evaluated against the input content, and the extracted value is written to the metadata key named by the corresponding xpathToKeysMap value.
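As a rough illustration of this XPath-to-key mapping, the sketch below uses Python's standard `xml.etree.ElementTree`. The function name `extract_values` and the sample document are hypothetical. ElementTree supports only a subset of XPath (for example, no absolute paths on an element), so the expressions accepted by the action itself may differ.

```python
import xml.etree.ElementTree as ET


def extract_values(xml_text, xpath_to_keys):
    """Return {key: [values]} for one XML document (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    result = {}
    for xpath, key in xpath_to_keys.items():
        # Collect the text of every element matched by the expression.
        values = [el.text for el in root.findall(xpath) if el.text is not None]
        if values:
            result[key] = values
    return result


sample = (
    "<order><id>42</id>"
    "<item><sku>A1</sku></item>"
    "<item><sku>B2</sku></item></order>"
)
# Keys are XPath expressions, values are the target metadata keys.
mapping = {"./id": "orderId", "./item/sku": "skus"}
extracted = extract_values(sample, mapping)
print(extracted)
```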

Values extracted from multiple contents are handled according to handleMultipleKeys.

Annotations

If extractTarget is ANNOTATIONS, each XPath expression (a key of xpathToKeysMap) is evaluated against the input content, and the extracted value is written to the annotation key named by the corresponding xpathToKeysMap value.

Values extracted from multiple contents are handled according to handleMultipleKeys.
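The four handleMultipleKeys modes can be sketched as follows. This is a minimal illustration, assuming one extracted value per content for a given key. The function name `combine` is hypothetical, and the assumption that DISTINCT keeps values in first-seen order is not confirmed by this page.

```python
def combine(values, mode="ALL", delimiter=","):
    """Merge the values one key received from several contents.

    `values` holds one extracted value per content, in content order.
    """
    if not values:
        return None
    if mode == "FIRST":
        return values[0]
    if mode == "LAST":
        return values[-1]
    if mode == "DISTINCT":
        # dict.fromkeys drops duplicates while keeping first-seen order
        # (the ordering is an assumption of this sketch).
        values = list(dict.fromkeys(values))
    elif mode != "ALL":
        raise ValueError(f"unknown mode: {mode}")
    # ALL and DISTINCT join with allKeysDelimiter.
    return delimiter.join(values)


per_content = ["red", "blue", "red"]
for mode in ("ALL", "DISTINCT", "FIRST", "LAST"):
    print(mode, combine(per_content, mode))
```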

Errors

  • On failure to parse any content from XML
  • On failure to evaluate any XPath expression in xpathToKeysMap keys
  • On errorOnKeyNotFound being true when no values can be extracted from the input content for any xpathToKeysMap key
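The three error conditions above can be sketched in Python with the standard `xml.etree.ElementTree`. The `ExtractError` class and the `extract_or_fail` function are hypothetical names for this sketch. It also assumes that a single key without a value is enough to trigger the errorOnKeyNotFound error, which is one reading of the wording above.

```python
import xml.etree.ElementTree as ET


class ExtractError(Exception):
    """Hypothetical error type used only in this sketch."""


def extract_or_fail(xml_text, xpath_to_keys, error_on_key_not_found=False):
    # 1. Content that cannot be parsed as XML is an error.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        raise ExtractError(f"cannot parse content as XML: {exc}") from exc

    found = {}
    for xpath, key in xpath_to_keys.items():
        # 2. An expression that cannot be evaluated is an error.
        #    ElementTree signals unsupported paths with SyntaxError.
        try:
            values = [el.text for el in root.findall(xpath) if el.text]
        except SyntaxError as exc:
            raise ExtractError(f"cannot evaluate XPath {xpath!r}: {exc}") from exc

        if values:
            found[key] = values
        elif error_on_key_not_found:
            # 3. A key with no value is an error only when requested.
            raise ExtractError(f"no value found for key {key!r}")
    return found


# With the default (false), a missing key is silently skipped.
print(extract_or_fail("<a><b>1</b></a>", {"./b": "present", "./c": "absent"}))
```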
