Changes from M28 to M29
Applications
-
Tikal
- The scoping report option now outputs character counts in addition to word counts by default.
Filters
-
IDML Filter
- Fixed a concurrency issue that could cause crashes when multiple instances of the filter were used simultaneously.
-
OpenXML Filter
- The way formatting information is converted to codes has changed. The filter will now attempt to streamline code generation by considering whether the formatting applied to a text run can be considered a “nested” format within the existing formatting. For example, a bold, italic run would be considered “nested” within a bold run. This allows for a more natural code mapping that should be more intuitive for translators, and is also more closely aligned with other tools.
- Style inheritance is now considered when calculating the formatting in effect for a run of text.
- Right-to-left (RTL) support has been added for paragraphs, table content in DOCX files and some DrawingML constructs.
- Fixed issue #486. Simple and complex fields are now represented as a single code for the entire field.
- Fixed issue #487. Runs that differ only in script specified for non-overlapping codepoint ranges can now be merged. This reduces the number of inline codes produced in some cases.
- Fixed issue #502. Cells that are in rows and columns that are hidden will no longer be exposed for translation by default. This brings the behavior of the Excel filter into alignment with the behavior of the other OpenXML filters. A new option, “Translate Hidden Rows and Columns”, has been added to the configuration for the Excel portion of the OpenXML filter.
- The “Clean Tags Aggressively” option will now strip <w:bCs> and <w:szCs> tags from Word documents.
- Fixed a crash that could occur when parsing files with enormous attribute values.
- The non-breaking hyphen is now converted to a character, rather than treated as a tag.
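The "nested" formatting idea described for the OpenXML filter above can be illustrated with a small sketch. This is not the filter's actual implementation. It assumes each run is a pair of text and a set of formatting names, and the function name `to_codes` and the tag-like code syntax are hypothetical, chosen only for illustration.

```python
def to_codes(runs):
    """Turn (text, formats) runs into a string with paired codes, nesting where possible.

    Hypothetical illustration only: `formats` is a set of format names such as {"b", "i"}.
    """
    out = []
    stack = []  # formats currently open, outermost first
    for text, formats in runs:
        # Close innermost formats until every format still open applies to this run.
        while stack and not set(stack) <= formats:
            out.append(f"</{stack.pop()}>")
        # Open only what this run adds on top of the formats already open.
        for name in sorted(formats - set(stack)):
            out.append(f"<{name}>")
            stack.append(name)
        out.append(text)
    # Close anything left open at the end of the paragraph.
    while stack:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


runs = [("bold ", {"b"}), ("bold italic", {"b", "i"}), (" bold", {"b"})]
print(to_codes(runs))
```

In this sketch the bold, italic run only opens one extra code inside the bold code that is already open, instead of closing and reopening bold around it.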
-
ITS Filter
- Added type for text units coming from attributes (value: x-<attribute-name>).
-
Table Filter
- Fixed issue #511: empty targets with delimiters are now merged properly.
-
TXML Filter
- Fixed issue #501, where segment elements commented out were deleted from the output file.
-
XLIFF Filter
- Fixed issue #500, where alt-trans proposals with a match-quality score in decimal form (“100.00”) were treated as having a score of 0.
- Added support for changing sdlxliff original attribute values based on the okf_xliff-sdl filter configuration. The conf and locked attributes are also supported.
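As an illustration of the kind of parsing problem behind issue #500, here is a minimal sketch of a tolerant score parser. It is not the XLIFF filter's code. The function name `parse_match_quality`, the handling of a trailing percent sign, and the clamping to the 0-100 range are all assumptions made for the example.

```python
import math


def parse_match_quality(value: str) -> int:
    """Parse a score such as "100", "100.00" or "85%" into an integer from 0 to 100.

    Hypothetical helper: unparseable input yields 0.
    """
    cleaned = value.strip().rstrip("%")
    try:
        score = float(cleaned)  # accepts both "100" and "100.00"
    except ValueError:
        return 0
    if not math.isfinite(score):  # reject "nan" and "inf"
        return 0
    return max(0, min(100, int(round(score))))


print(parse_match_quality("100.00"))
```

Parsing the value as a floating-point number first, rather than as an integer, is what keeps a decimal form from falling through to the "unparseable" branch.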
Libraries
-
XLIFFWriter
- Added support for state-qualifier output in main <target>.
Connectors
-
Pensieve
- **IMPORTANT: Code.codesToString() changes.** The Pensieve TM format has changed and is not backwards compatible. You will need to export your TMs and re-import them with M29.
Steps
-
Added character count Steps
- The Character Count step calculates character counts per the GMX-V 2.0 standard and stores them in a Metrics annotation (like the Word Count step). There are also steps for counting all GMX non-translatable categories (ProtectedCharacterCount, etc.) and Okapi categories (Concordance, FuzzyMatch, MT, etc.).
-
GMX “-Only” word count Steps
- The AlphanumericOnly, NumericOnly, and MeasurementOnly word count steps now follow the GMX standard in that they only give non-zero counts for TUs that consist solely of tokens of the relevant type. (Previously they merely counted relevant tokens.)
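The difference between the old and new behavior of the "-Only" steps can be sketched as follows for the numeric case. This is only an approximation: whitespace tokenization and the `NUMERIC` pattern are simplifying assumptions, not the GMX-V tokenization rules, and both function names are hypothetical.

```python
import re

# Simplified notion of a numeric token, e.g. "12", "3.5", "1,000" (an assumption).
NUMERIC = re.compile(r"^\d+([.,]\d+)*$")


def numeric_token_count(text: str) -> int:
    """Previous behavior: count numeric tokens wherever they appear."""
    return sum(1 for token in text.split() if NUMERIC.match(token))


def numeric_only_count(text: str) -> int:
    """New behavior: non-zero only if every token in the text unit is numeric."""
    tokens = text.split()
    if tokens and all(NUMERIC.match(token) for token in tokens):
        return len(tokens)
    return 0


print(numeric_token_count("Page 12 of 30"), numeric_only_count("Page 12 of 30"))
print(numeric_token_count("12 3.5 1,000"), numeric_only_count("12 3.5 1,000"))
```

A text unit that mixes words and numbers contributes to the plain token count but contributes nothing to the "-Only" count, while a purely numeric text unit contributes the same amount to both.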
-
Translation Comparison Step
- Added an option to use the target of the alt-trans element for a given origin value when processing an XLIFF file as the second file. This makes it possible to compare an MT candidate placed in an alt-trans entry with the actual translation in the main target element.
-
Scoping Report Step
- The Scoping Report step can now report character counts when the relevant annotations are present. Use both the Word Count and Character Count steps to get full detail. The default template has been updated to include character counts for the included categories.
-
Post-segmentation Inline Codes Removal Step
- Added step that attempts to simplify (trim and merge) as many inline codes as possible by looking at each linguistically distinct segment in a TextUnit.
Connectors
-
KantanMT Support
- Added a new connector to support KantanMT.
-
Microsoft Translation Hub
- Fixed an issue when working with trained engines with certain target languages.