A collection of business objects in XML should be deserialized into a collection of DTO objects using the .NET XmlSerializer.
Although in such a scenario it is normal to have an XSD schema that describes the XML, we will assume that you can't obtain one and have to manually write DTOs that reflect the data in the XML.
Here is a collection of business objects in plain XML:
<Products>
  <Product>
    <Id>1</Id>
    <Name>Tomato</Name>
  </Product>
  <Product>
    <Id>2</Id>
    <Name>Apple</Name>
  </Product>
</Products>
First we have to tweak the XML above, since the plain, attribute-free DTOs used below cannot map it as it stands.
With that default mapping, the collection node must be enveloped in an extra node that serves as a container, like this:
<ProductCollection>
  <Products>
    <Product>
      <Id>1</Id>
      <Name>Tomato</Name>
    </Product>
    <Product>
      <Id>2</Id>
      <Name>Apple</Name>
    </Product>
  </Products>
</ProductCollection>
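If the XML cannot be changed, one possible alternative is to map the original, unwrapped document with attributes. The following is only a sketch and not part of the original sample: the class name ProductList, its field Items and the helper UnwrappedDemo.Parse are made up for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical container: the root element itself is <Products>.
[XmlRoot("Products")]
public class ProductList
{
    // XmlElement on an array maps each <Product> as a direct child of the root,
    // so no extra wrapper node is needed.
    [XmlElement("Product")]
    public Product[] Items;
}

public static class UnwrappedDemo
{
    public static ProductList Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(ProductList));
        using (var reader = new StringReader(xml))
        {
            return (ProductList)serializer.Deserialize(reader);
        }
    }
}
```

The rest of this text keeps using the wrapped form, which needs no attributes at all.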
Writing DTO
Here are the DTOs:
[Serializable()]
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

[Serializable()]
public class ProductCollection
{
    public Product[] Products;
}
It is not common to write your own DTOs. Normally you would have an XSD schema provided and would consider using XSD.EXE to generate your classes.
Observe that the classes are free of XML attributes; [Serializable()] is not one of them, and XmlSerializer does not need it.
DO NOT needlessly decorate your DTOs with XML attributes unless necessary.
The code below behaves exactly the same. The added attributes serve only as comments for the developer, not for XmlSerializer, which is perfectly capable of reflecting simple DTOs on its own.
[Serializable()]
public class Product
{
    [XmlElement("Id")]
    public int Id { get; set; }
    [XmlElement("Name")]
    public string Name { get; set; }
}

[Serializable()]
[XmlRoot("ProductCollection")]
public class ProductCollection
{
    [XmlArray("Products")]
    [XmlArrayItem("Product")]
    public Product[] Products;
}
Here is one proper use of attributes. Let's assume that the DTOs and the XML input name a field, or even the parent node, differently. In this sample the XML field "Name" is mapped to the DTO property "ProductName", and the XML parent node "ProductCollection" is mapped to our DTO class ProductsContainer.
[Serializable()]
public class Product
{
    public int Id { get; set; }
    [XmlElement("Name")]
    public string ProductName { get; set; }
}

[Serializable()]
[XmlRoot("ProductCollection")]
public class ProductsContainer
{
    public Product[] Products;
}
Writing deserialization
Let's write the simplest deserialization for the sample above. Note that this code matches the last version of the DTOs, the one with name mappings.
private static ProductsContainer GetProductsFrom(string xml)
{
    using (TextReader textReader = new StringReader(xml))
    {
        var serializer = new XmlSerializer(typeof(ProductsContainer));
        ProductsContainer results = serializer.Deserialize(textReader) as ProductsContainer;
        return results;
    }
}
The serializer expects two things: a prepared reader or stream with the XML content (from a file, or from a plain string as in my case), which is passed to Deserialize, and the type of the object we are trying to deserialize, which is passed to the constructor.
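To try the pieces together, here is a self-contained sketch that combines the name-mapped DTOs with the method above. The wrapper class DeserializeDemo and the constant SampleXml are assumptions added for illustration; [Serializable()] is left out because XmlSerializer does not require it.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Product
{
    public int Id { get; set; }
    [XmlElement("Name")]                 // XML <Name> maps to ProductName
    public string ProductName { get; set; }
}

[XmlRoot("ProductCollection")]           // XML root maps to ProductsContainer
public class ProductsContainer
{
    public Product[] Products;
}

public static class DeserializeDemo
{
    // Hypothetical sample input, same shape as the wrapped XML shown earlier.
    public const string SampleXml = @"<ProductCollection>
  <Products>
    <Product><Id>1</Id><Name>Tomato</Name></Product>
    <Product><Id>2</Id><Name>Apple</Name></Product>
  </Products>
</ProductCollection>";

    public static ProductsContainer GetProductsFrom(string xml)
    {
        using (TextReader textReader = new StringReader(xml))
        {
            // The type goes to the constructor, the reader to Deserialize.
            var serializer = new XmlSerializer(typeof(ProductsContainer));
            return (ProductsContainer)serializer.Deserialize(textReader);
        }
    }
}
```

Calling DeserializeDemo.GetProductsFrom(DeserializeDemo.SampleXml) should return a container with two products whose ProductName values are "Tomato" and "Apple".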
Handling nullable or empty values in XML
In real life we can receive empty content for a field, or a field can even be missing completely.
For example:
<ProductCollection>
  <Products>
    <Product>
      <Id></Id>
      <Name>Tomato</Name>
    </Product>
    <Product>
      <Id>2</Id>
      <Name></Name>
    </Product>
  </Products>
</ProductCollection>
In this sample the Id of the first element and the Name of the second element are empty.
Of course an empty string is a completely valid value for a string element, although one could ask how to get NULL rather than "" as the result.
The empty Id is a problem, since an empty string cannot be converted to an int.
Let's tweak the Product DTO to allow a nullable integer:
...
public class Product
{
    public int? Id { get; set; }
...
That takes care of our side, but it does not solve the problem. The serializer will not treat empty content, or for that matter any other content that is not convertible to int, as NULL. Why would it? Change your mindset: what it sees is not an empty value but a non-convertible one, and it will complain about it. If you had this instead:
...
<Product>
  <Id>foo</Id>
...
it would still raise an error.
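This behaviour can be checked with a small sketch. The helper EmptyValueDemo.FailsToDeserialize is hypothetical; it relies on XmlSerializer reporting conversion failures as an InvalidOperationException that wraps the underlying format error.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
}

public class ProductCollection
{
    public Product[] Products;
}

public static class EmptyValueDemo
{
    // Returns true when the serializer rejects the document.
    public static bool FailsToDeserialize(string xml)
    {
        var serializer = new XmlSerializer(typeof(ProductCollection));
        try
        {
            using (var reader = new StringReader(xml))
            {
                serializer.Deserialize(reader);
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            // Empty or non-numeric content cannot be converted to int,
            // even though the property is declared as int?.
            return true;
        }
    }
}
```

Both an empty Id and an Id containing foo should make FailsToDeserialize return true, while a numeric Id should not.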
Your XML provider has to handle this too. Again, this is commonly stated in an XSD schema that both sides accept as a contract.
If there is no XSD, your provider must explicitly mark the value as nil, like this:
...
<Product>
  <Id xsi:nil="true"></Id>
...
Additionally, the root element must declare the XML Schema instance namespace, like this:
...
<ProductCollection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
...
Otherwise the xsi prefix is undeclared and the XML Serializer won't know what the xsi:nil attribute stands for.
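A minimal sketch of the nil case follows, assuming the same Product DTO with a nullable Id. The class NilDemo and its members are illustrative names only; the XML uses single-quoted attributes simply to keep the C# string readable.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
}

public class ProductCollection
{
    public Product[] Products;
}

public static class NilDemo
{
    // Hypothetical input: the Id element is explicitly marked as nil.
    public const string SampleXml =
        "<ProductCollection xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>" +
        "<Products><Product>" +
        "<Id xsi:nil='true'></Id>" +
        "<Name>Tomato</Name>" +
        "</Product></Products>" +
        "</ProductCollection>";

    public static ProductCollection Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(ProductCollection));
        using (var reader = new StringReader(xml))
        {
            return (ProductCollection)serializer.Deserialize(reader);
        }
    }
}
```

With this input, NilDemo.Parse(NilDemo.SampleXml).Products[0].Id should be null instead of raising an error.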
Handling missing value
So what about completely missing fields, like this:
<ProductCollection>
  <Products>
    <Product>
      <Id>1</Id>
      <Name>Tomato</Name>
    </Product>
    <Product>
      <Name></Name>
    </Product>
  </Products>
</ProductCollection>
Allowing nullable types in your DTO will do it. We already did that above with this:
...
public class Product
{
    public int? Id { get; set; }
...
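When the XML Serializer finds that a field is missing, it raises no error and simply leaves the property at its default value. For int? that default is null, so a missing Id can be told apart from a real value, whereas a plain int would silently stay 0.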
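The missing-field case can be sketched the same way. MissingDemo and its members are names assumed for this illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
}

public class ProductCollection
{
    public Product[] Products;
}

public static class MissingDemo
{
    // Hypothetical input: the second product has no <Id> element at all.
    public const string SampleXml = @"<ProductCollection>
  <Products>
    <Product><Id>1</Id><Name>Tomato</Name></Product>
    <Product><Name></Name></Product>
  </Products>
</ProductCollection>";

    public static ProductCollection Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(ProductCollection));
        using (var reader = new StringReader(xml))
        {
            return (ProductCollection)serializer.Deserialize(reader);
        }
    }
}
```

Here the first product should come back with Id 1, and the second with a null Id and an empty Name.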
Deserializing 1:MANY entity
How about a slightly more complicated example? Let's assume an Order contains a collection of products. Here is the XML input:
<?xml version="1.0" encoding="utf-8" ?>
<Order>
  <Id>1</Id>
  <Name>Order one</Name>
  <Products>
    <Product>
      <Id>1</Id>
      <Name>Apple</Name>
    </Product>
  </Products>
</Order>
And here are the DTOs. Very clean and logical:
[Serializable()]
[XmlRoot("Order")]
public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Product[] Products;
}

[Serializable()]
public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
}
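As a quick check of the one-to-many mapping, the sketch below deserializes an order with one product. OrderDemo and SampleXml are illustrative names, and the XML declaration is omitted from the string for brevity.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Order")]
public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Product[] Products;           // maps <Products><Product>...</Product></Products>
}

public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
}

public static class OrderDemo
{
    // Hypothetical input with the same shape as the Order XML above.
    public const string SampleXml = @"<Order>
  <Id>1</Id>
  <Name>Order one</Name>
  <Products>
    <Product><Id>1</Id><Name>Apple</Name></Product>
  </Products>
</Order>";

    public static Order Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Order));
        using (var reader = new StringReader(xml))
        {
            return (Order)serializer.Deserialize(reader);
        }
    }
}
```

OrderDemo.Parse(OrderDemo.SampleXml) should yield an Order named "Order one" whose Products array holds a single product named "Apple".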
Handling XML node attributes
In real life the price of our product could be described in XML like this:
<Product>
  <Id>1</Id>
  <Name>Apple</Name>
  <Price Currency="€" Taxdeductable="true">2.23</Price>
</Product>
The currency and the tax-deductible flag are described as attributes, while the amount is the text content of the element. This is how we describe that in the DTOs:
[Serializable()]
[XmlRoot("Order")]
public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Product[] Products;
}

[Serializable()]
public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public PriceElement Price { get; set; }
}

[Serializable()]
public class PriceElement
{
    [System.Xml.Serialization.XmlTextAttribute()]
    public double Value { get; set; }
    [XmlAttribute("Currency")]
    public string Currency { get; set; }
    [XmlAttribute("Taxdeductable")]
    public bool Taxdeductable { get; set; }
}
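The attribute mapping can be tried on a single Product element. This is only a sketch: PriceDemo is a made-up helper, and it deserializes a standalone Product as the root instead of a whole Order to keep the example short.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public PriceElement Price { get; set; }
}

public class PriceElement
{
    [XmlText]                            // the element's text content: 2.23
    public double Value { get; set; }
    [XmlAttribute("Currency")]
    public string Currency { get; set; }
    [XmlAttribute("Taxdeductable")]
    public bool Taxdeductable { get; set; }
}

public static class PriceDemo
{
    // Hypothetical input; single quotes keep the C# string literal simple.
    public const string SampleXml = @"<Product>
  <Id>1</Id>
  <Name>Apple</Name>
  <Price Currency='€' Taxdeductable='true'>2.23</Price>
</Product>";

    public static Product Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Product));
        using (var reader = new StringReader(xml))
        {
            return (Product)serializer.Deserialize(reader);
        }
    }
}
```

After PriceDemo.Parse(PriceDemo.SampleXml), the Price property should carry the value 2.23 together with the currency string and the boolean flag read from the attributes.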
Handling XML namespaces
In our samples we work with orders. It is quite realistic that we could get orders from more than one company.
It is also quite possible that both companies provide us with entities named Order. In such a case we use XML namespaces to tell the two Order entities apart,
since otherwise two Order elements with different structures would be indistinguishable by name.
Here is the XML document:
<?xml version="1.0" encoding="utf-8" ?>
<Catalog>
  <myorders:Order xmlns:myorders="http://www.myorders.com/ordercatalog">
    <myorders:Id>1</myorders:Id>
    <myorders:Name>Order one</myorders:Name>
    <myorders:Products>
      <myorders:Product>
        <myorders:Id>1</myorders:Id>
        <myorders:Name>Apple</myorders:Name>
        <myorders:Price Taxdeductable="true">2.23</myorders:Price>
      </myorders:Product>
    </myorders:Products>
  </myorders:Order>
  <croatiaorders:Order xmlns:croatiaorders="http://wwww.croatiaorders.com">
    <Id>23</Id>
    <Name>Pear</Name>
  </croatiaorders:Order>
</Catalog>
Observe that the myorders prefix is applied to all elements of myorders:Order. That way all of its elements are consistently assigned to the myorders XML namespace.
The second order is not fully prefixed. Only the parent element croatiaorders:Order has a prefix. Its items Id and Name have none, so they belong to no namespace. This is perfectly legal XML, but it is not what XmlSerializer expects by default, as we shall see.
Our existing entity Order now belongs to the XML namespace identified by http://www.myorders.com/ordercatalog.
The second order comes from a company whose XML namespace is identified by http://wwww.croatiaorders.com
A new Catalog entity is added as a simple container for both orders.
Let's look at the changes in the DTOs:
[Serializable()]
[XmlRoot("Catalog")]
public class Catalog
{
    [XmlElement(Namespace = "http://www.myorders.com/ordercatalog")]
    public Order Order { get; set; }
    [XmlElement("Order", Namespace = "http://wwww.croatiaorders.com")]
    public CroatiaOrdersOrder CroatiaOrdersOrder { get; set; }
}

[Serializable()]
public class Order
{
    public int Id { get; set; }
    public string Name { get; set; }
    public Product[] Products;
}

[Serializable()]
public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
    public PriceElement Price { get; set; }
}

[Serializable()]
public class PriceElement
{
    [System.Xml.Serialization.XmlTextAttribute()]
    public double Value { get; set; }
    [XmlAttribute("Currency")]
    public string Currency { get; set; }
    [XmlAttribute("Taxdeductable")]
    public bool Taxdeductable { get; set; }
}

[Serializable()]
public class CroatiaOrdersOrder
{
    [XmlElement(Namespace = "")]
    public int Id { get; set; }
    [XmlElement(Namespace = "")]
    public string Name { get; set; }
}
Our original Order (myorders.com) is decorated with:
[XmlElement(Namespace = "http://www.myorders.com/ordercatalog")]
Note that the child elements of the myorders.com Order inherit that namespace by default. No further decorating with attributes is necessary.
So if in XML you consistently use your namespace in all child elements, like we did for myorders:Order, we're done.
On the other hand, if in XML you omit the prefix for child elements of a parent that has a declared XML namespace, like we did for CroatiaOrders, then you must explicitly decorate each child element with:
[XmlElement(Namespace="")]
in order for it to work with XmlSerializer. Otherwise you won't get errors, only NULL (default) values.
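The namespace handling can be tried end to end with the sketch below. It is a reduced version of the sample: the Price element is left out to keep it short, and NamespaceDemo and SampleXml are names assumed for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("Catalog")]
public class Catalog
{
    [XmlElement(Namespace = "http://www.myorders.com/ordercatalog")]
    public Order Order { get; set; }
    [XmlElement("Order", Namespace = "http://wwww.croatiaorders.com")]
    public CroatiaOrdersOrder CroatiaOrdersOrder { get; set; }
}

public class Order
{
    // Children inherit the namespace of the enclosing Order element.
    public int Id { get; set; }
    public string Name { get; set; }
    public Product[] Products;
}

public class Product
{
    public int? Id { get; set; }
    public string Name { get; set; }
}

public class CroatiaOrdersOrder
{
    // Unprefixed children live in no namespace, so it is reset explicitly.
    [XmlElement(Namespace = "")]
    public int Id { get; set; }
    [XmlElement(Namespace = "")]
    public string Name { get; set; }
}

public static class NamespaceDemo
{
    // Hypothetical input: a trimmed copy of the Catalog document above.
    public const string SampleXml = @"<Catalog>
  <myorders:Order xmlns:myorders='http://www.myorders.com/ordercatalog'>
    <myorders:Id>1</myorders:Id>
    <myorders:Name>Order one</myorders:Name>
    <myorders:Products>
      <myorders:Product>
        <myorders:Id>1</myorders:Id>
        <myorders:Name>Apple</myorders:Name>
      </myorders:Product>
    </myorders:Products>
  </myorders:Order>
  <croatiaorders:Order xmlns:croatiaorders='http://wwww.croatiaorders.com'>
    <Id>23</Id>
    <Name>Pear</Name>
  </croatiaorders:Order>
</Catalog>";

    public static Catalog Parse(string xml)
    {
        var serializer = new XmlSerializer(typeof(Catalog));
        using (var reader = new StringReader(xml))
        {
            return (Catalog)serializer.Deserialize(reader);
        }
    }
}
```

Parsing SampleXml should fill both orders: the myorders one with its product, and the croatiaorders one with Id 23 and Name "Pear".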
And if you run into problems, here is some good reading:
http://www.codeproject.com/Articles/14064/Using-the-XmlSerializer-Attributes
Troubleshooting Common Problems with the XmlSerializer