• Example 1: Sorting an array of key-value pairs and creating its read-only wrapper
• Example 2: Cloning an array of referenced integers shallowly and deeply
• Example 3: Creating two sets of integers and comparing them by reference and by content
• Example 4: Serializing and deserializing LATINO objects
• Example 5: Performing basic operations on sets
• Example 6: Performing basic operations on sparse vectors
• Example 7: Performing basic operations on sparse matrices
• Example 8: Accessing sparse vector elements directly
• Example 9: Using the logging tool
• Download LATINO Core
Introduction
LATINO Core implements several basic data structures and utilities that are used throughout the library. The reason for providing our own basic data structures is mostly historical. The .NET Framework did not provide the desired functionality back in 2004, when the development of LATINO started. At that time, the Framework was still missing important data structures and support for basic design patterns. For example, HashSet<T> (a set data structure based on the hash table) came out with .NET Framework 3.5, which was released in November 2007. Also, generics and read-only wrappers for collections were introduced only in 2005, while .NET has been around since 2002. Note that even today, the core .NET data structures lack several useful properties of the LATINO Core data structures, such as read-only wrappers, an explicit distinction between shallow and deep cloning, and a distinction between content-wise and reference comparison. Most importantly, LATINO Core implements sparse vectors and sparse matrices, which are the fundamental data structures for text and data mining applications.

The following list briefly describes the most important LATINO Core data structures and interfaces:
• Set<T> (class): A set data structure based on the hash table.
• MultiSet<T> (class): A set data structure that can contain duplicate elements.
• Ref<T> (class): A wrapper that enables referencing value types such as int and double.
• Pair<FirstT,SecondT> (struct): A pair data structure.
• KeyDat<KeyT,DatT> (struct): A dictionary item data structure. A key-value pair comparable by the key.
• IdxDat<T> (struct): An indexed item data structure. An index-value pair comparable by the index.
• BinaryVector (class): A sparse binary vector data structure. Contains indices of non-zero elements.
• ArrayList<T> (class): A dynamic array data structure. Extends .NET's List<T> with LATINO Core interfaces.
• SparseVector<T> (class): A sparse vector data structure. Contains a dense array of index-value pairs.
• SparseMatrix<T> (class): A sparse matrix data structure. Contains a dense vector of sparse vectors representing rows in the sparse matrix.
• BinarySerializer (class): A utility class providing functionality for binary serialization and deserialization of objects.
• Logger (class): A utility class providing logging functionality similar to that of log4net (but more lightweight).
• Utils (static class): A utility class providing various useful general-purpose functions.
• IReadOnlyAdapter<T> (interface): Provides a read-only wrapper of a LATINO object.
• IDeeplyCloneable<T> (interface): Provides a deep clone of a LATINO object.
• IContentEquatable<T> (interface): Enables content-wise comparison of two LATINO objects.
• ISerializable (interface): Enables straightforward binary serialization and deserialization of LATINO objects.

In the following, we give several examples of using the LATINO Core data structures and interfaces.
Example 1: Sorting an array of key-value pairs and creating its read-only wrapper
In this example, we create an ArrayList of KeyDat<double,int>, populate it, sort it in descending order, output it to the console, and create its read-only wrapper. A read-only wrapper prevents you from modifying the underlying data structure. However, it acts as a "protector" rather than a "restrictor". You can always access the underlying data structure through the wrapper's Inner property.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create an ArrayList of KeyDat<double, int>
                ArrayList<KeyDat<double, int>> list = new ArrayList<KeyDat<double, int>>();
                // populate it
                list.AddRange(new KeyDat<double, int>[] {
                    new KeyDat<double, int>(0.1, 1),
                    new KeyDat<double, int>(0.3, 3),
                    new KeyDat<double, int>(0.2, 2)});
                // sort it in descending order
                list.Sort(DescSort<KeyDat<double, int>>.Instance);
                // output it to the console
                Console.WriteLine(list); // says: ( ( 0.3 3 ) ( 0.2 2 ) ( 0.1 1 ) )
                // create a read-only wrapper
                ArrayList<KeyDat<double, int>>.ReadOnly listReadOnly
                    = new ArrayList<KeyDat<double, int>>.ReadOnly(list);
                //listReadOnly.Add(new KeyDat<double, int>(0.4, 4)); // this is not possible
                // output the read-only list to the console
                Console.WriteLine(listReadOnly); // says: ( ( 0.3 3 ) ( 0.2 2 ) ( 0.1 1 ) )
                // create a writable copy
                ArrayList<KeyDat<double, int>> listCopy = listReadOnly.GetWritableCopy();
                // modify the copy
                listCopy.Add(new KeyDat<double, int>(0.4, 4));
                // output the original list to the console to show that it was not changed
                Console.WriteLine(list); // says: ( ( 0.3 3 ) ( 0.2 2 ) ( 0.1 1 ) )
                // use Inner to modify the wrapped data structure
                listReadOnly.Inner.Add(new KeyDat<double, int>(0.4, 4));
                // output the original list to the console to show that it was changed
                Console.WriteLine(list); // says: ( ( 0.3 3 ) ( 0.2 2 ) ( 0.1 1 ) ( 0.4 4 ) )
            }
        }
    }
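For comparison, the standard .NET collections offer a similar "protector" through List<T>.AsReadOnly(). The following is a minimal sketch that uses only the .NET base class library, not LATINO. The class name ReadOnlyWrapperSketch and its helper method are made up for this illustration. It shows that such a wrapper exposes no public mutating members but still reflects changes made to the underlying list, while a writable copy is independent of it.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical illustration using only .NET collections (not LATINO).
public static class ReadOnlyWrapperSketch
{
    // Adds an item to the underlying list and returns the count
    // as seen through a read-only wrapper created beforehand.
    public static int CountSeenThroughWrapperAfterAdd(List<int> list, int item)
    {
        ReadOnlyCollection<int> wrapper = list.AsReadOnly(); // wraps, does not copy
        list.Add(item);                                       // modify the underlying list
        return wrapper.Count;                                 // the wrapper reflects the change
    }

    public static void Main()
    {
        List<int> list = new List<int>(new int[] { 3, 2, 1 });
        ReadOnlyCollection<int> wrapper = list.AsReadOnly();
        // wrapper.Add(4); // does not compile: the wrapper has no public Add
        list.Add(4);
        Console.WriteLine(wrapper.Count); // says: 4
        // a writable copy is independent of the original list
        List<int> copy = new List<int>(wrapper);
        copy.Add(5);
        Console.WriteLine(list.Count); // says: 4
    }
}
```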
Example 2: Cloning an array of referenced integers shallowly and deeply
In this example, we create an ArrayList of Ref<int>, populate it, clone it shallowly and deeply, and output the clones to the console.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create and populate an ArrayList of Ref<int>
                ArrayList<Ref<int>> array = new ArrayList<Ref<int>>(new Ref<int>[] {
                    new Ref<int>(1),
                    new Ref<int>(2),
                    new Ref<int>(3)});
                // create a shallow clone
                ArrayList<Ref<int>> arrayClone = array.Clone();
                // change the original array
                array[0].Val = 4;
                // output the two arrays to the console
                Console.WriteLine(array); // says: ( 4 2 3 )
                Console.WriteLine(arrayClone); // says: ( 4 2 3 )
                // restore the original array
                array[0].Val = 1;
                // create a deep clone
                ArrayList<Ref<int>> deepClone = array.DeepClone();
                // change the original array
                array[0].Val = 4;
                // output the two arrays to the console
                Console.WriteLine(array); // says: ( 4 2 3 )
                Console.WriteLine(deepClone); // says: ( 1 2 3 )
            }
        }
    }
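To make the difference between the two kinds of clones concrete, here is a minimal sketch written against plain .NET collections rather than LATINO. The Box class is a hypothetical stand-in for Ref<int>, and ShallowClone and DeepClone are illustrative helpers, not the library's implementation. A shallow clone copies only the references, so both lists share the same objects. A deep clone also copies the objects themselves.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Ref<int>: a reference-type box around an int.
public class Box
{
    public int Val;
    public Box(int val) { Val = val; }
}

public static class CloneSketch
{
    // Shallow clone: a new list that shares the same Box objects.
    public static List<Box> ShallowClone(List<Box> source)
    {
        return new List<Box>(source);
    }

    // Deep clone: a new list with a fresh copy of every Box.
    public static List<Box> DeepClone(List<Box> source)
    {
        List<Box> clone = new List<Box>(source.Count);
        foreach (Box box in source)
        {
            clone.Add(new Box(box.Val));
        }
        return clone;
    }

    public static void Main()
    {
        List<Box> original = new List<Box> { new Box(1), new Box(2), new Box(3) };
        List<Box> shallow = ShallowClone(original);
        List<Box> deep = DeepClone(original);
        original[0].Val = 4;
        Console.WriteLine(shallow[0].Val); // says: 4 (shares the modified Box)
        Console.WriteLine(deep[0].Val);    // says: 1 (has its own copy)
    }
}
```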
Example 3: Creating two sets of integers and comparing them by reference and by content
In this example, we create two instances of Set<int>, populate them, output them to the console, and compare them by reference and by content.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create two instances of Set<int> and populate them
                Set<int> set = new Set<int>(new int[] { 1, 2, 3 });
                Set<int> otherSet = new Set<int>(new int[] { 1, 2, 3 });
                // output them to the console
                Console.WriteLine(set); // says: { 1 2 3 }
                Console.WriteLine(otherSet); // says: { 1 2 3 }
                // compare them by reference
                Console.WriteLine(set == otherSet); // says: False
                // compare them by content
                Console.WriteLine(set.ContentEquals(otherSet)); // says: True
            }
        }
    }
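The same distinction can be sketched with the standard HashSet<T>, which separates the two notions as well: object.ReferenceEquals checks identity, while SetEquals compares the elements. The EqualitySketch class below is made up for this illustration and does not use LATINO.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration using only .NET collections (not LATINO).
public static class EqualitySketch
{
    // True only when both variables refer to the very same object.
    public static bool SameReference(HashSet<int> a, HashSet<int> b)
    {
        return object.ReferenceEquals(a, b);
    }

    // True when both sets contain exactly the same elements.
    public static bool SameContent(HashSet<int> a, HashSet<int> b)
    {
        return a.SetEquals(b);
    }

    public static void Main()
    {
        HashSet<int> set = new HashSet<int>(new int[] { 1, 2, 3 });
        HashSet<int> otherSet = new HashSet<int>(new int[] { 3, 2, 1 });
        Console.WriteLine(SameReference(set, otherSet)); // says: False
        Console.WriteLine(SameContent(set, otherSet));   // says: True
    }
}
```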
Example 4: Serializing and deserializing LATINO objects
In this example, we create an ArrayList<BinaryVector>, populate it, write it to a file, read it from the file, output it to the console, and compare it to the original array.

    using System;
    using System.IO;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create an ArrayList<BinaryVector>
                ArrayList<BinaryVector> array = new ArrayList<BinaryVector>();
                // populate it
                array.AddRange(new BinaryVector[] {
                    new BinaryVector(new int[] { 1, 3, 5 }),
                    new BinaryVector(new int[] { 2, 4, 6 })});
                // output it to the console
                Console.WriteLine(array); // says: ( ( 1 3 5 ) ( 2 4 6 ) )
                // serialize it to a file
                BinarySerializer fileWriter = new BinarySerializer("array.bin", FileMode.Create);
                array.Save(fileWriter);
                fileWriter.Close();
                // read it from the file
                BinarySerializer fileReader = new BinarySerializer("array.bin", FileMode.Open);
                ArrayList<BinaryVector> otherArray = new ArrayList<BinaryVector>(fileReader);
                fileReader.Close();
                // output it to the console
                Console.WriteLine(otherArray); // says: ( ( 1 3 5 ) ( 2 4 6 ) )
                // compare it to the original array
                Console.WriteLine(array == otherArray); // says: False
                Console.WriteLine(array.ContentEquals(otherArray)); // says: True
            }
        }
    }
Example 5: Performing basic operations on sets
In this example, we create two instances of Set<int> and compute their union, intersection, difference, and Jaccard similarity.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create a Set<int>
                Set<int> set = new Set<int>(new int[] { 1, 2, 3 });
                // create another Set<int>
                Set<int> otherSet = new Set<int>(new int[] { 2, 3, 4 });
                // compute the union and output it to the console
                Console.WriteLine(Set<int>.Union(set, otherSet)); // says: { 1 2 3 4 }
                // compute the intersection and output it to the console
                Console.WriteLine(Set<int>.Intersection(set, otherSet)); // says: { 2 3 }
                // compute the difference and output it to the console
                Console.WriteLine(Set<int>.Difference(set, otherSet)); // says: { 1 }
                // compute the Jaccard similarity
                Console.WriteLine(Set<int>.JaccardSimilarity(set, otherSet)); // says: 0.5
            }
        }
    }
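The Jaccard similarity of two sets is the size of their intersection divided by the size of their union, which is why the example above yields 2 / 4 = 0.5. The sketch below recomputes that value with the standard HashSet<T>. The JaccardSketch class is hypothetical, and treating two empty sets as identical (similarity 1) is an assumption of this sketch, not something confirmed for LATINO.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical illustration using only .NET collections (not LATINO).
public static class JaccardSketch
{
    // Computes |A intersect B| / |A union B|.
    // Assumption of this sketch: two empty sets have similarity 1.
    public static double Jaccard(HashSet<int> a, HashSet<int> b)
    {
        HashSet<int> intersection = new HashSet<int>(a);
        intersection.IntersectWith(b);
        HashSet<int> union = new HashSet<int>(a);
        union.UnionWith(b);
        if (union.Count == 0) { return 1.0; }
        return (double)intersection.Count / union.Count;
    }

    public static void Main()
    {
        HashSet<int> set = new HashSet<int>(new int[] { 1, 2, 3 });
        HashSet<int> otherSet = new HashSet<int>(new int[] { 2, 3, 4 });
        double similarity = Jaccard(set, otherSet);
        // two shared elements out of four distinct ones
        Console.WriteLine(similarity.ToString(CultureInfo.InvariantCulture)); // says: 0.5
    }
}
```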
Example 6: Performing basic operations on sparse vectors
In this example, we create a SparseVector<double>, populate it, traverse it, manipulate it, get the number of non-empty elements and the number of dimensions, and output it to the console.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create a SparseVector<double>
                SparseVector<double> vec = new SparseVector<double>(new IdxDat<double>[] {
                    new IdxDat<double>(0, 0.0),
                    new IdxDat<double>(2, 0.2),
                    new IdxDat<double>(4, 0.4),
                    new IdxDat<double>(6, 0.6)});
                // output it to the console
                Console.WriteLine(vec.ToString()); // says: ( ( 0 0 ) ( 2 0.2 ) ( 4 0.4 ) ( 6 0.6 ) )
                // traverse it (says: (0,0) (2,0.2) (4,0.4) (6,0.6))
                foreach (IdxDat<double> item in vec)
                {
                    Console.Write("({0},{1}) ", item.Idx, item.Dat);
                }
                Console.WriteLine();
                // add an element
                vec[1] = 0.1;
                // change a value
                vec[0] = 42;
                // remove an element
                vec.RemoveAt(4);
                // output the vector to the console
                Console.WriteLine(vec.ToString()); // says: ( ( 0 42 ) ( 1 0.1 ) ( 2 0.2 ) ( 6 0.6 ) )
                // get the number of non-empty elements
                Console.WriteLine(vec.Count); // says: 4
                // get the number of dimensions
                Console.WriteLine(vec.LastNonEmptyIndex + 1); // says: 7
            }
        }
    }
Example 7: Performing basic operations on sparse matrices
In this example, we create a SparseMatrix<double>, populate it, traverse it, change a value, remove an element, get the number of dimensions, transpose it, and output it to the console.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create a SparseMatrix<double>
                SparseMatrix<double> mat = new SparseMatrix<double>();
                mat[0] = new SparseVector<double>(new IdxDat<double>[] {
                    new IdxDat<double>(0, 1.1),
                    new IdxDat<double>(1, 1.2)});
                mat[1] = new SparseVector<double>(new IdxDat<double>[] {
                    new IdxDat<double>(1, 2.2)});
                mat[2] = new SparseVector<double>(new IdxDat<double>[] {
                    new IdxDat<double>(1, 3.2),
                    new IdxDat<double>(2, 3.3)});
                // output it to the console
                Console.WriteLine(mat.ToString("E"));
                Console.WriteLine();
                // traverse the matrix
                foreach (IdxDat<SparseVector<double>> row in mat)
                {
                    foreach (IdxDat<double> item in row.Dat)
                    {
                        Console.Write("({0},{1},{2}) ", row.Idx, item.Idx, item.Dat);
                    }
                    Console.WriteLine();
                }
                Console.WriteLine();
                // change a value
                mat[1, 1] = -2.2;
                Console.WriteLine(mat.ToString("E"));
                Console.WriteLine();
                // remove an element
                mat.RemoveAt(1, 1);
                Console.WriteLine(mat.ToString("E"));
                Console.WriteLine();
                // get the number of dimensions
                Console.WriteLine("rows: {0}, cols: {1}",
                    mat.GetLastNonEmptyRowIdx() + 1, mat.GetLastNonEmptyColIdx() + 1);
                Console.WriteLine();
                // transpose the matrix
                SparseMatrix<double> matTr = mat.GetTransposedCopy();
                Console.WriteLine(matTr.ToString("E"));
            }
        }
    }

The above example outputs the following to the console:
    1.1 1.2 -
    - 2.2 -
    - 3.2 3.3

    (0,0,1.1) (0,1,1.2)
    (1,1,2.2)
    (2,1,3.2) (2,2,3.3)

    1.1 1.2 -
    - -2.2 -
    - 3.2 3.3

    1.1 1.2 -
    - - -
    - 3.2 3.3

    rows: 3, cols: 3

    1.1 - -
    1.2 - 3.2
    - - 3.3
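Transposing a row-oriented sparse matrix amounts to turning every stored element (row, column, value) into (column, row, value). The sketch below illustrates this with nested SortedDictionary instances from the .NET base class library. The TransposeSketch class and its storage layout are assumptions made for illustration and say nothing about how GetTransposedCopy is implemented in LATINO.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration: rows are keyed by row index,
// and each row maps a column index to a value.
public static class TransposeSketch
{
    public static SortedDictionary<int, SortedDictionary<int, double>> Transpose(
        SortedDictionary<int, SortedDictionary<int, double>> mat)
    {
        var result = new SortedDictionary<int, SortedDictionary<int, double>>();
        foreach (KeyValuePair<int, SortedDictionary<int, double>> row in mat)
        {
            foreach (KeyValuePair<int, double> item in row.Value)
            {
                SortedDictionary<int, double> newRow;
                if (!result.TryGetValue(item.Key, out newRow))
                {
                    newRow = new SortedDictionary<int, double>();
                    result[item.Key] = newRow;
                }
                newRow[row.Key] = item.Value; // element (r, c) becomes (c, r)
            }
        }
        return result;
    }

    public static void Main()
    {
        // the matrix from the example above after the element (1,1) was removed
        var mat = new SortedDictionary<int, SortedDictionary<int, double>>();
        mat[0] = new SortedDictionary<int, double> { { 0, 1.1 }, { 1, 1.2 } };
        mat[2] = new SortedDictionary<int, double> { { 1, 3.2 }, { 2, 3.3 } };
        // print the stored elements of the transposed matrix as (row,col,value)
        foreach (KeyValuePair<int, SortedDictionary<int, double>> row in Transpose(mat))
        {
            foreach (KeyValuePair<int, double> item in row.Value)
            {
                Console.Write("({0},{1},{2}) ", row.Key, item.Key, item.Value);
            }
            Console.WriteLine();
        }
    }
}
```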
Example 8: Accessing sparse vector elements directly
In this example, we use the direct-access functions to traverse and manipulate a sparse vector. Using InnerIdx, InnerDat, SetDirect, and RemoveDirect is faster than using the indexer and RemoveAt (O(1) vs. O(log n)). While it is safe to use SetDirect and RemoveDirect, you need to be very careful with InnerIdx and InnerDat, as they allow you to put the vector into an invalid state.

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create a SparseVector<double>
                SparseVector<double> vec = new SparseVector<double>(new IdxDat<double>[] {
                    new IdxDat<double>(0, 0.0),
                    new IdxDat<double>(2, 0.2),
                    new IdxDat<double>(4, 0.4),
                    new IdxDat<double>(6, 0.6)});
                // output it to the console
                Console.WriteLine(vec.ToString()); // says: ( ( 0 0 ) ( 2 0.2 ) ( 4 0.4 ) ( 6 0.6 ) )
                // traverse it (says: (0,0) (2,0.2) (4,0.4) (6,0.6))
                for (int i = 0; i < vec.Count; i++)
                {
                    Console.Write("({0},{1}) ", vec.GetIdxDirect(i), vec.GetDatDirect(i));
                }
                Console.WriteLine();
                // add an element
                vec.InnerIdx.Insert(1, 1); // !!! be careful !!!
                vec.InnerDat.Insert(1, 0.1); // !!! be careful !!!
                // output the vector to the console
                Console.WriteLine(vec.ToString()); // says: ( ( 0 0 ) ( 1 0.1 ) ( 2 0.2 ) ( 4 0.4 ) ( 6 0.6 ) )
                // change a value
                vec.SetDirect(0, 42);
                // output the vector to the console
                Console.WriteLine(vec.ToString()); // says: ( ( 0 42 ) ( 1 0.1 ) ( 2 0.2 ) ( 4 0.4 ) ( 6 0.6 ) )
                // remove an element
                vec.RemoveDirect(2);
                // output the vector to the console
                Console.WriteLine(vec.ToString()); // says: ( ( 0 42 ) ( 1 0.1 ) ( 4 0.4 ) ( 6 0.6 ) )
            }
        }
    }
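The O(1) vs. O(log n) difference is easier to see in a small model of such a vector. The sketch below is a hypothetical, simplified sparse vector built on two parallel lists from the .NET base class library. It borrows the names GetIdxDirect and GetDatDirect for readability only and is not the LATINO implementation. Positional access reads the lists directly, while the indexer has to locate the element with a binary search over the sorted indices first.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified sparse vector: two parallel lists holding
// the sorted indices and their values. Not the LATINO implementation.
public class SparseVectorSketch
{
    private readonly List<int> idx = new List<int>();       // kept sorted
    private readonly List<double> dat = new List<double>();

    public int Count { get { return idx.Count; } }

    // Positional ("direct") access: O(1), no search needed.
    public int GetIdxDirect(int pos) { return idx[pos]; }
    public double GetDatDirect(int pos) { return dat[pos]; }

    // Access by vector index: a binary search locates the element in O(log n).
    public double this[int index]
    {
        get
        {
            int pos = idx.BinarySearch(index);
            return pos >= 0 ? dat[pos] : 0.0; // missing elements read as zero
        }
        set
        {
            int pos = idx.BinarySearch(index);
            if (pos >= 0)
            {
                dat[pos] = value;
            }
            else
            {
                // ~pos is the insertion point that keeps idx sorted
                idx.Insert(~pos, index);
                dat.Insert(~pos, value);
            }
        }
    }
}

public static class SparseVectorSketchDemo
{
    public static void Main()
    {
        SparseVectorSketch vec = new SparseVectorSketch();
        vec[4] = 0.4;
        vec[2] = 0.2;
        vec[6] = 0.6;
        // items come out in index order (2, 4, 6) regardless of insertion order
        for (int i = 0; i < vec.Count; i++)
        {
            Console.Write("({0},{1}) ", vec.GetIdxDirect(i), vec.GetDatDirect(i));
        }
        Console.WriteLine();
    }
}
```

Writing to the two lists independently, as InnerIdx and InnerDat allow, can leave them with different lengths or with unsorted indices, which is the kind of invalid state the warning above refers to.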
Example 9: Using the logging tool
LATINO implements a lightweight log4net-like logging tool. It offers a named logger hierarchy with limited functionality for defining appenders (outputs) and layouts (output formats).

    using System;
    using Latino;

    namespace Latino.Tutorials
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create two loggers
                Logger logger1 = Logger.GetLogger("Latino.Tutorials.Example9.Logger1");
                Logger logger2 = Logger.GetLogger("Latino.Tutorials.Example9.Logger2");
                // output a message and a warning
                logger1.Info("Main", "This message is brought to you by Logger 1.");
                logger2.Warn("Main", "This warning is brought to you by Logger 2.");
                // change the output format of Logger 2
                logger2.CustomOutput = new Logger.CustomOutputDelegate(
                    delegate(string loggerName, Logger.Level level, string funcName,
                        Exception exception, string message, object[] msgArgs)
                    {
                        Console.WriteLine("{0} says: \"{1}\"", loggerName, string.Format(message, msgArgs));
                    });
                logger2.LocalOutputType = Logger.OutputType.Custom;
                // output the message and warning again
                logger1.Info("Main", "This message is brought to you by Logger 1.");
                logger2.Warn("Main", "This warning is brought to you by Logger 2.");
                // set both loggers to output only warnings, errors, and fatal errors
                Logger.GetRootLogger().LocalLevel = Logger.Level.Warn;
                logger1.Trace("Main", "This trace message is brought to you by Logger 1."); // this will not be displayed
                logger1.Debug("Main", "This debug message is brought to you by Logger 1."); // this will not be displayed
                logger2.Info("Main", "This message is brought to you by Logger 2."); // this will not be displayed
                logger2.Warn("Main", "This warning is brought to you by Logger 2.");
            }
        }
    }

The above example outputs the following to the console:
    2012-04-08 20:56:29 Latino.Tutorials.Example9.Logger1 Main INFO: This message is brought to you by Logger 1.
    2012-04-08 20:56:29 Latino.Tutorials.Example9.Logger2 Main WARN: This warning is brought to you by Logger 2.
    2012-04-08 20:56:29 Latino.Tutorials.Example9.Logger1 Main INFO: This message is brought to you by Logger 1.
    Latino.Tutorials.Example9.Logger2 says: "This warning is brought to you by Logger 2."
    Latino.Tutorials.Example9.Logger2 says: "This warning is brought to you by Logger 2."
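In the example above, setting the level on the root logger affects both named loggers. One common way to get this behavior in log4net-style hierarchies is to resolve a logger's effective level by walking up its dot-separated name until a logger with an explicitly set level is found. The sketch below illustrates that idea under stated assumptions: the Level enum, the LoggerHierarchySketch class, and the lookup rule are all hypothetical and are not taken from the LATINO source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical levels, ordered from least to most severe.
public enum Level { Trace, Debug, Info, Warn, Error, Fatal }

// Hypothetical sketch of level inheritance in a named logger hierarchy.
public static class LoggerHierarchySketch
{
    // Levels set explicitly on a logger name; "" stands for the root logger.
    private static readonly Dictionary<string, Level> localLevels =
        new Dictionary<string, Level> { { "", Level.Trace } };

    public static void SetLocalLevel(string name, Level level)
    {
        localLevels[name] = level;
    }

    // Walks from the logger's name towards the root until a level is found.
    // The loop always ends because the root entry "" is always present.
    public static Level GetEffectiveLevel(string name)
    {
        while (true)
        {
            Level level;
            if (localLevels.TryGetValue(name, out level)) { return level; }
            int dot = name.LastIndexOf('.');
            name = dot >= 0 ? name.Substring(0, dot) : "";
        }
    }

    // A message is emitted only if it is at least as severe as the effective level.
    public static bool IsEnabled(string name, Level level)
    {
        return level >= GetEffectiveLevel(name);
    }

    public static void Main()
    {
        SetLocalLevel("", Level.Warn); // corresponds to setting the root logger's level
        Console.WriteLine(IsEnabled("Latino.Tutorials.Example9.Logger1", Level.Info)); // says: False
        Console.WriteLine(IsEnabled("Latino.Tutorials.Example9.Logger2", Level.Warn)); // says: True
    }
}
```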
Download LATINO Core
• Latino Core May-2012 (latest)