How to provide Hyphenation Patterns (XML) to OpenSearch?

I would like to use the "hyphenation decompounder" token filter of OpenSearch to split German compound words into separate tokens (https://www.elastic.co/guide/en/elasticsearch/reference/7.10/analysis-hyp-decomp-tokenfilter.html).

To use it, you need to provide an XML file via the "hyphenation_patterns_path" parameter.
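For illustration, here is a minimal sketch of index settings that wire up such a filter, built as a Python dict and printed as JSON. The filter name `german_decompounder`, the analyzer name `german_compound`, the path `analysis/de_DR.xml`, and the tiny word list are all assumptions made for the example. On the managed service the path value would likely have to reference an uploaded package instead of a local file, which is not confirmed here.

```python
import json


def build_decompounder_settings(patterns_path, word_list):
    """Return an index-settings body using a hyphenation_decompounder filter."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Hypothetical filter name, chosen for this example.
                    "german_decompounder": {
                        "type": "hyphenation_decompounder",
                        # XML file with the hyphenation patterns.
                        "hyphenation_patterns_path": patterns_path,
                        # Subwords the filter is allowed to emit.
                        "word_list": word_list,
                    }
                },
                "analyzer": {
                    # Hypothetical analyzer name, chosen for this example.
                    "german_compound": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "german_decompounder"],
                    }
                },
            }
        }
    }


# Placeholder path and word list; replace with real values.
body = build_decompounder_settings(
    "analysis/de_DR.xml", ["kaffee", "zucker", "tasse"]
)
print(json.dumps(body, indent=2))
```

The printed JSON is what would be sent as the request body when creating the index.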

Packages are supposedly the go-to tool for providing such a file to the OpenSearch Service. However, the package upload is rejected with a "Copy failed" error, and the details read "Validation failure: package contains unsupported content." Apparently this is because the XML file is not a plain word list.

But how else can the XML file be provided to OpenSearch? Or is it just not possible at all?

This is the XML file in question: https://github.com/uschindler/german-decompounder/blob/master/de_DR.xml

Kind regards, Stefan

stesie
asked a year ago, 268 views
1 Answer
What worked for us was removing the following line from the XML file:

<!DOCTYPE hyphenation-info SYSTEM "hyphenation.dtd">
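If you want to script that step, the following is a minimal sketch that strips the DOCTYPE declaration shown above from the XML text. It assumes the declaration sits on a single line and has no internal subset (no `[...]` block); the function name `strip_doctype` and the sample document are made up for the example.

```python
import re

# Matches the single-line DOCTYPE declaration plus its trailing line break.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+hyphenation-info[^>]*>[ \t]*\r?\n?")


def strip_doctype(xml_text):
    """Return xml_text with the first hyphenation-info DOCTYPE removed."""
    return DOCTYPE_RE.sub("", xml_text, count=1)


# Tiny stand-in for the real patterns file.
sample = (
    '<?xml version="1.0" encoding="utf-8"?>\n'
    '<!DOCTYPE hyphenation-info SYSTEM "hyphenation.dtd">\n'
    "<hyphenation-info>\n"
    "</hyphenation-info>\n"
)
cleaned = strip_doctype(sample)
print(cleaned)
```

For the real file, you would read `de_DR.xml` as text, pass it through `strip_doctype`, and write the result to a new file before uploading it as a package.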
answered a year ago
