Special characters and umlauts
Posted: Tue Oct 10, 2023 7:29 pm
Hi everybody,
We have a LANSA RDMLX function running on IBM i that reads data from an IBM i database file and creates an XML file that is stored in the IFS.
We collect the lines for the XML file in a working list, and at the end of the program the file is created using TRANSFORM_LIST like this:
Use Builtin(TRANSFORM_LIST) With_Args(#L_STRING #W_PATHFILE O I Y '.') To_Get(#RETCOD)
Finally, we convert the character set via QSHELL so that the file is readable on a PC:
#W_COMMAND := 'STRQSH CMD(' + '''' + 'cd ' + #PFADNAME.trim.RightTrim + '; iconv -f 273 -t 819 ' + #DATEI250.trim.RightTrim + ' > ' + #PFADNAME2.trim.RightTrim + '/' + #DATEI250.trim.RightTrim + '''' + ')'
Use Builtin(SYSTEM_COMMAND) With_Args('X' #W_COMMAND) To_Get(#STD_NUM)
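To make the conversion step concrete, here is a minimal Python sketch of what iconv -f 273 -t 819 does to the bytes. It is only an illustration and not part of the LANSA function. It assumes that Python's cp273 codec corresponds to CCSID 273 and latin-1 to CCSID 819; the function name is hypothetical.

```python
def convert_273_to_819(data: bytes) -> bytes:
    """Re-encode EBCDIC 273 bytes as ISO-8859-1, like iconv -f 273 -t 819."""
    # Assumption: "cp273" matches CCSID 273 and "latin-1" matches CCSID 819.
    text = data.decode("cp273")
    return text.encode("latin-1")


# Simulate a value as it would be stored in the EBCDIC database file.
ebcdic = "Stück".encode("cp273")
converted = convert_273_to_819(ebcdic)
print(converted.decode("latin-1"))
```

Umlauts survive this step because both code pages contain them. A character that does not exist in ISO-8859-1 cannot be carried over this way.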
This has generally worked fine for many years, but the customer uses more and more special characters and umlauts, and then wrong characters appear in their place in the XML file. To work around this, we have special routines that replace characters like this:
#TEXT6500 := #TEXT6500.ReplaceAll( "ä" "ae" )
#TEXT6500 := #TEXT6500.ReplaceAll( "Ä" "AE" )
#TEXT6500 := #TEXT6500.ReplaceAll( "ö" "oe" )
#TEXT6500 := #TEXT6500.ReplaceAll( "Ö" "OE" )
#TEXT6500 := #TEXT6500.ReplaceAll( "ü" "ue" )
#TEXT6500 := #TEXT6500.ReplaceAll( "Ü" "UE" )
#TEXT6500 := #TEXT6500.ReplaceAll( "ß" "ss" )
#TEXT6500 := #TEXT6500.ReplaceAll( "°" " " )
#TEXT6500 := #TEXT6500.ReplaceAll( "[" " " )
#TEXT6500 := #TEXT6500.ReplaceAll( "]" " " )
#TEXT6500 := #TEXT6500.ReplaceAll( "{" " " )
#TEXT6500 := #TEXT6500.ReplaceAll( "}" " " )
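For comparison, the same replacements can be written as a single lookup table. The following Python sketch is only an illustration of the idea, not LANSA code; the names REPLACEMENTS and transliterate are hypothetical.

```python
# Hypothetical table mirroring the ReplaceAll statements above.
REPLACEMENTS = {
    "ä": "ae", "Ä": "AE",
    "ö": "oe", "Ö": "OE",
    "ü": "ue", "Ü": "UE",
    "ß": "ss",
    "°": " ", "[": " ", "]": " ", "{": " ", "}": " ",
}
TABLE = str.maketrans(REPLACEMENTS)


def transliterate(text: str) -> str:
    """Apply all replacements in a single pass over the text."""
    return text.translate(TABLE)


print(transliterate("Größe [cm]"))
```

A table like this keeps all mappings in one place, but it has the same weakness as the individual statements: every new character needs a new entry.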
But every new special character they use leads to new problems that make the XML file corrupt and unreadable. The latest example is the euro sign "€":
<eb:Description>SERVICE BUNDHOSE MIT STRETCH GRAU SCHWARZ ab 300 Stueck 75 € ** ab 400 Stueck 66 € **</eb:Description>
This leads to the following in the XML file:
<eb:Description>SERVICE BUNDHOSE MIT STRETCH GRAU SCHWARZ ab 300 Stueck 75 ¤ ** ab 400 Stueck 66 ¤ **</eb:Description>
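One possible explanation, which is an assumption and not confirmed anywhere in this post, is that the euro sign is stored as a byte that a euro-enabled EBCDIC code page reads as "€" but that code page 273 reads as "¤". The Python sketch below illustrates this with byte 0x9F. Python has no codec for the German euro code page, so cp1140 is used as a stand-in; the helper can_encode is hypothetical.

```python
# Assumption: the euro sign sits in the file as EBCDIC byte 0x9F.
STORED_BYTE = b"\x9f"

# Code page 273 has no euro sign; it decodes 0x9F as the currency sign.
print(STORED_BYTE.decode("cp273"))   # ¤
# cp1140 is only a stand-in for a euro-enabled EBCDIC code page.
print(STORED_BYTE.decode("cp1140"))  # €


def can_encode(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the target codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# ISO-8859-1 (CCSID 819) has no euro sign; ISO-8859-15 and UTF-8 do.
for codec in ("latin-1", "iso8859_15", "utf-8"):
    print(codec, can_encode("€", codec))
```

If this assumption holds, the target code page 819 could not represent the euro sign even when the source bytes are read correctly.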
We worked around this with a new replace statement, #TEXT6500 := #TEXT6500.ReplaceAll( "¤" "EUR" ), but isn't there an easier way than replacing every single character and umlaut, so that the texts are transferred exactly as they are stored in the IBM i file?
The database files are old files (defined with DDS, not in LANSA), so the fields are Alphanumeric (which cannot be changed) and Unicode cannot be used. TEXT6500 in the replace statements is defined as String (length 6500) with SUNI in its input and output attributes.
I also tried using an NVarChar text field instead of TEXT6500, but the results were the same.
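As one possible alternative to the per-character replacements described above, XML allows any character to be written as a numeric character reference, which keeps the file pure ASCII. The Python sketch below only illustrates that idea under the assumption that the text has already been decoded correctly; it is not LANSA code, and the function name is hypothetical.

```python
from xml.sax.saxutils import escape


def to_ascii_xml_text(text: str) -> str:
    """Escape XML markup characters, then replace every non-ASCII
    character with a numeric character reference such as &#8364;."""
    escaped = escape(text)  # handles &, < and >
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_xml_text("ab 300 Stück 75 € & mehr"))
# prints: ab 300 St&#252;ck 75 &#8364; &amp; mehr
```

An XML parser on the receiving side turns the references back into the original characters, so no character has to be replaced by a substitute such as "EUR".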
Many thanks in advance and best regards,
Joerg