Closed
Description
This is a spin-off of #1785.
Currently, fo-dicom correctly decodes strings when Specific Character Set has multiple values, but when encoding, only the first of the specified character sets is used. This only affects newly added strings, because strings loaded from existing data already carry their encoded byte representation.
Here is an example test that should pass:
    [Fact]
    public void Test_SavingPatientNameWithMultiEncoding()
    {
        var patientName = "Yamada^Tarou=山田^太郎=やまだ^たろう";
        var dataset = new DicomDataset
        {
            { DicomTag.SOPClassUID, DicomUID.SecondaryCaptureImageStorage },
            { DicomTag.SOPInstanceUID, DicomUIDGenerator.GenerateDerivedFromUUID() },
            { DicomTag.SpecificCharacterSet, "\\ISO 2022 IR 87" },
            { DicomTag.PatientName, patientName }
        };
        var dicomFile = new DicomFile(dataset);
        var stream = new MemoryStream();
        dicomFile.Save(stream);
        stream.Seek(0, SeekOrigin.Begin);
        var inFile = DicomFile.Open(stream);
        Assert.Equal(patientName, inFile.Dataset.GetString(DicomTag.PatientName));
    }
A few considerations taken from the other issue:
- to find errors in encodings, the used encoding shall be constructed with EncoderFallback.ExceptionFallback (as is already done for decoding)
- tags with VR PN shall be handled by encoding each component separately (if encoding with the first value fails)
- any encoded string part that does not use the first character set shall be surrounded by the proper escape sequences (as already used in decoding)
- we have to decide if we want to support strings or PN components which need more than one encoding to encode, as these are difficult to implement and practically non-existent
Note: There is already a TODO comment in the problematic code.
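The first two considerations above could be sketched roughly as follows. This is only an illustration of the selection step, not fo-dicom code: the class PersonNameEncodingSketch and its methods are hypothetical names, the candidate list is assumed to be supplied by the caller in Specific Character Set order, and the escape sequences from the third consideration are deliberately left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not part of fo-dicom.
public static class PersonNameEncodingSketch
{
    // Returns the named encoding configured to throw EncoderFallbackException
    // instead of silently substituting '?' for unsupported characters.
    public static Encoding Strict(string name) =>
        Encoding.GetEncoding(name, EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);

    // Index of the first candidate that can encode the text, or -1 if none can.
    public static int FindEncodingIndex(string text, IReadOnlyList<Encoding> candidates)
    {
        for (var i = 0; i < candidates.Count; i++)
        {
            try
            {
                candidates[i].GetBytes(text);
                return i;
            }
            catch (EncoderFallbackException)
            {
                // This candidate cannot represent the text; try the next one.
            }
        }
        return -1;
    }

    // Splits a PN value at the '=' and '^' delimiters and picks an encoding
    // for each part separately (assumption: no part needs more than one encoding).
    public static int[] FindEncodingPerPart(string personName, IReadOnlyList<Encoding> candidates)
    {
        var parts = personName.Split('=', '^');
        var result = new int[parts.Length];
        for (var i = 0; i < parts.Length; i++)
        {
            result[i] = FindEncodingIndex(parts[i], candidates);
        }
        return result;
    }
}
```

For example, with the candidates Strict("us-ascii") and Strict("iso-8859-1"), the value "Buc^Jérôme" yields index 0 for "Buc" and index 1 for "Jérôme". The sketch uses these two built-in encodings as stand-ins; the Japanese encodings needed for the test above are not built in on .NET Core and later, where they first require Encoding.RegisterProvider(CodePagesEncodingProvider.Instance). A result of -1 marks a part that would need more than one encoding, which is the open question in the last consideration.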