Serialization Studio for .NET

Serialization Studio (SS) is a program that generates serializers for the .NET Framework in the form of C# source code.
Why do we need it?
Standard .NET binary serialization works quite nicely for the vast majority of applications. However, there are a number of cases where standard serialization does not work or does not work well:
Serialization speed is not sufficient
Serialization of large DataSet and DataTable objects fails or is too slow
Cross platform (or cross device) communication
Binary serialization is not implemented for Compact Framework
Serialization of objects derived from base classes
Size of the resulting object is too large
Features
1000%-50000% faster than standard binary formatter
The resulting output is several times smaller
Platform and device independent – runs on PC, mobile devices, Mono
Ability to serialize classes without the [Serializable] attribute
Can serialize classes with inheritance (like the one below)
public class FromHashtable : Hashtable // the standard formatter crashes
{
    // MyVars
}
The binary formatter stores everything in the stream. Do we need everything in that stream, and what is the benefit of storing it? If all the information is in the stream, it is easier to extract. Imagine that we are deserializing an object. Since everything was stored in the stream (including the types and all the object attributes), we can deserialize the object without any external means; the object, in effect, deserializes itself. That is good, but it comes at a price: the stream becomes much bigger and the processing time increases.
Suppose the object is eventually deserialized. What can we do with it? Nothing, unless the program already knows the internal structure of the object. To do anything with an object, such as executing a method, the program needs the method description (the code or the signature), which is kept in the metadata. It follows that all method signatures must be known before the program is built, so normal execution is impossible without the class descriptions on both sides of the channel. Sending that information along with the object is a pure waste of time and channel bandwidth.
Since both sides already know the class structure of the information to be received, we index the type information instead of sending it as is, and then send only the index and the actual data.
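The indexing idea can be illustrated with a minimal sketch built on the standard BinaryWriter and BinaryReader. It is not the code the Studio generates; the TypeIndex enum, the Employee fields and the IndexedWriterSketch class are hypothetical names chosen for this example. It writes the same object twice, once with a one-byte type index and once with the full type name, so the size difference can be compared.

```csharp
using System;
using System.IO;

// Hypothetical type index known to both sender and receiver (an assumption for this sketch).
public enum TypeIndex : byte { Null = 0, Employee = 1 }

// Hypothetical sample class; Name must not be null in this sketch.
public class Employee
{
    public int Id;
    public string Name;
}

public static class IndexedWriterSketch
{
    // Writes a one-byte type index followed by the raw field values.
    public static byte[] WriteIndexed(Employee e)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write((byte)TypeIndex.Employee);
            bw.Write(e.Id);
            bw.Write(e.Name);
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Writes the full type name before the same field values, for comparison.
    public static byte[] WriteWithTypeName(Employee e)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write(typeof(Employee).AssemblyQualifiedName);
            bw.Write(e.Id);
            bw.Write(e.Name);
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Reads back what WriteIndexed produced; the receiver resolves the index itself.
    public static Employee ReadIndexed(byte[] data)
    {
        using (var br = new BinaryReader(new MemoryStream(data)))
        {
            if (br.ReadByte() != (byte)TypeIndex.Employee)
                throw new InvalidDataException("Unexpected type index.");
            var e = new Employee();
            e.Id = br.ReadInt32();
            e.Name = br.ReadString();
            return e;
        }
    }
}
```

In this sketch the indexed form carries one byte of type information, while the second form carries the whole assembly-qualified name in every message.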
Building the serializer
The assembly (DLL) has to be provided to the Serialization Studio. The Studio traverses the object and builds the object graph, then generates the C# code of the serializer.
Building of the serializer can be precisely controlled at the node level:
Only public – ignore non-public members
Only serializable – serialize only objects marked as serializable
Each option can be set per node (or globally) using the tick boxes
The Studio breaks the object into primitives and then tries to match a surrogate to each of them, for instance to an integer, a char or an ArrayList.
All complex classes consist of primitives. Does it mean that any class can be serialized?
Unfortunately, that is not the case. If the class contains a member declared as an interface instead of a class, the Studio has no way to traverse that object. It can only be serialized at run time.
Typically, serialization (even the standard binary one) does not go that far. If the object implements ISerializable, a surrogate may be used for serialization, and there is no point in traversing the object further because the object itself provides the methods for serialization. Serialization Studio uses this approach as well.
The TypeHolder
The TypeHolder is the internal Studio object that manages a particular class and its interaction with other (parent and child) holders. The surrogate generator (if any) is also attached to the holder.
When the Studio builds the serializer, it creates a collection of TypeHolders.
Generated serializer structure
The serializer consists of several sections:
Enumerations – the indexes of the classes to serialize (expressed as mnemonics)
Standard body – common methods
Methods generated – serialization and deserialization of a particular class
SerializeUnknown – selector of which serializer to apply
DeserializeUnknown – selector of which deserializer to apply
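The layout listed above can be sketched roughly as follows. This is an illustrative outline written for this document, not actual Studio output: the SamplePoint class, the members of ObjType, the method signatures and the RoundTrip helper are all assumptions, and the standard BinaryWriter and BinaryReader stand in for the Studio's own BW and BR.

```csharp
using System;
using System.IO;

// Hypothetical class to serialize.
public class SamplePoint
{
    public int X;
    public int Y;
}

public static class GeneratedSerializerSketch
{
    // Enumerations: mnemonic indexes of the classes to serialize.
    public enum ObjType : byte { NULL = 0, System_Int32 = 1, SamplePoint = 2 }

    // Methods generated: serialization and deserialization of a particular class.
    static void Serialize(BinaryWriter bw, SamplePoint p)
    {
        bw.Write(p.X);
        bw.Write(p.Y);
    }

    static SamplePoint DeserializeSamplePoint(BinaryReader br)
    {
        var p = new SamplePoint();
        p.X = br.ReadInt32();
        p.Y = br.ReadInt32();
        return p;
    }

    // SerializeUnknown: selects the serializer from the runtime type and writes its index first.
    public static void SerializeUnknown(BinaryWriter bw, object o)
    {
        if (o == null) { bw.Write((byte)ObjType.NULL); return; }
        if (o is int) { bw.Write((byte)ObjType.System_Int32); bw.Write((int)o); return; }
        SamplePoint p = o as SamplePoint;
        if (p != null) { bw.Write((byte)ObjType.SamplePoint); Serialize(bw, p); return; }
        throw new NotSupportedException("No serializer for " + o.GetType().FullName);
    }

    // DeserializeUnknown: reads the index and selects the matching deserializer.
    public static object DeserializeUnknown(BinaryReader br)
    {
        switch ((ObjType)br.ReadByte())
        {
            case ObjType.NULL: return null;
            case ObjType.System_Int32: return br.ReadInt32();
            case ObjType.SamplePoint: return DeserializeSamplePoint(br);
            default: throw new InvalidDataException("Unknown type index.");
        }
    }

    // Standard body: a common helper that serializes to memory and reads the value back.
    public static object RoundTrip(object o)
    {
        using (var ms = new MemoryStream())
        {
            var bw = new BinaryWriter(ms);
            SerializeUnknown(bw, o);
            bw.Flush();
            ms.Position = 0;
            return DeserializeUnknown(new BinaryReader(ms));
        }
    }
}
```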
Building your own surrogates
A surrogate is a piece of code that serializes a particular class as a whole without breaking it into primitives. A set of surrogates already exists for most objects. However, if you need to create your own, you can derive your surrogate from the BaseComplexSurrogateTypeHolder class.
Build an assembly from it and include this assembly as an add-in component in the Studio. A sample external surrogate for TimeSpan is provided with the sample code.
Below is a sample surrogate (the TimeSpan surrogate mentioned above).
The Serialize and Deserialize method bodies returned by MethodBodySer and MethodBodyDeSer are your code that does the job. This code will be injected into the serializer body where it is required. Of course, writing a few lines does not make a serializer, but the rest is done by the FSStudio framework.
namespace DotNetRemoting
{
    /// <summary>
    /// It is a demo surrogate for Serialization Studio.
    /// To add it to Serialization Studio,
    /// open Tools/Configure External Holders/..
    /// and select the assembly (dll) built with this project.
    /// </summary>
    [Holder(typeof(System.TimeSpan))]
    public class TimeSpanHolder : BaseComplexSurrogateTypeHolder
    {
        public override string Description
        {
            get { return "Surrogate for System.TimeSpan"; }
        }
        // Source text of the serialization method injected into the generated serializer.
        // Note: TimeSpan.Ticks is an Int64, so the Int32 cast in this sample
        // truncates intervals longer than about 214 seconds.
        protected override string MethodBodySer
        {
            get { return @" void Serialize(TimeSpan ts)
{
    if (ts != null) // not really applicable to a value type object .. just a pattern ..
    {
        BW.Write((byte)ObjType.System_Object);
    }
    else
    {
        BW.Write((byte)ObjType.NULL);
        return;
    }
    BW.Write((Int32)ts.Ticks);
}";
            }
        }
        // Source text of the matching deserialization method.
        protected override string MethodBodyDeSer
        {
            get { return @"
void Deserialize(ref TimeSpan ts)
{
    byte b = BR.ReadByte();
    if (b == (byte)ObjType.NULL)
    {
        return;
    }
    Int32 Ticks = BR.ReadInt32();
    ts = new TimeSpan(Ticks);
}";
            }
        }
        public TimeSpanHolder(Type t, ExternDataContainer dc, bool Public, string FieldName)
            : base(t, dc, Public, FieldName)
        {
        }
    }
}
Advanced options
If you are developing a very sophisticated surrogate, you may need extra information about the types that participate in the process.
ThisType - the type the surrogate is being built for
TypeString – the full name of this type
Name – internal name used for enumerators
FieldName – the name of the field (could be null if it is a root class)
GetFieldName() – method to retrieve field name (almost same as above)
These members are accessible from your TypeHolder because it is derived from BaseTypeHolder, which defines them.
Type[] GenericArguments - if the type is generic, generic arguments are kept in this collection
BW and BR – the binary writer and binary reader (they are not the standard BinaryWriter and BinaryReader, though they have many similarities)
Adding the TypeHolder to the Studio as add-in
Compile the TypeHolder into a DLL. Start the Studio. Open Tools/Configure External Holders and browse for the DLL.
Add the DLL to the list.
Using generated serializer
The serializer generated by the Studio is an integral part of the DotNetRemoting Framework.
However, this serializer can also be used as a surrogate for a standard formatter (the binary formatter, for example).
You just need to create an FSSurrogate object and call the method that creates a surrogate selector compatible with the formatter. FSSurrogate takes the generated serializer as a constructor parameter.
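using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Wrap the generated serializer and plug it into the standard formatter
FSSurrogate surr_ = new FSSurrogate(new FSDefault());
BinaryFormatter bf = new BinaryFormatter();
bf.SurrogateSelector = surr_.GetSurrogateSelector();
// Serialize as usual; the formatter now delegates to the surrogate
MemoryStream ms = new MemoryStream();
Employee _emp = new Employee();
bf.Serialize(ms, _emp);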
Performance
The speed-up over the standard binary serializer may vary from 1000% to 50000%.
Recommendations
Avoid using private members. Private variables can be accessed only through reflection, and that slows down the serializer considerably.
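The difference between the two access paths can be sketched as follows. The Account class and MemberAccessSketch are hypothetical names used only for illustration; the sketch shows the standard System.Reflection calls that code outside a class has to use to read a private field, as opposed to the direct field read available for a public member. It does not show how the generated serializer itself is written.

```csharp
using System;
using System.Reflection;

// Hypothetical class with one public and one private field.
public class Account
{
    public int PublicBalance = 10;
    private int privateBalance = 20;

    // Uses the private field so it is not reported as unused.
    public void Deposit(int amount) { privateBalance += amount; }
}

public static class MemberAccessSketch
{
    // Public member: a plain field read that the compiler resolves at build time.
    public static int ReadPublic(Account a)
    {
        return a.PublicBalance;
    }

    // Private member: a field lookup by name plus a boxed read at run time.
    public static int ReadPrivate(Account a)
    {
        FieldInfo f = typeof(Account).GetField(
            "privateBalance", BindingFlags.Instance | BindingFlags.NonPublic);
        return (int)f.GetValue(a);
    }
}
```

The reflection path performs a name lookup and boxes the value on every call (unless the FieldInfo is cached), which is the extra cost the recommendation refers to.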
Supported data types
All primitive types
Arrays (of any type serialized)
ArrayList
Hashtable
List<T>
Dictionary<TKey, TValue>
DataSet
DataTable
DateTime
Limitations
This version of the Studio cannot serialize objects with:
Multidimensional arrays
Circular references
Platforms
Windows, Windows Mobile, Linux, Unix
If the serializer is used directly (not as a surrogate), the field GenericSerializer should be set to an instance of IFS_Formatter. IFS_Formatter is a wrapper for the generic serializer. On a PC, the generic serializer is usually a binary formatter.
Please see the samples provided. For the Compact Framework, Compact Formatter by Angelo Scotto is a good choice.