Provides fast random and sequential access to vast sorted lists. Creates and maintains them with unmatched ease, speed and flexibility. Simultaneously provides the benefits of a binary tree and a linked list.
You can find an entry with a keyed, binary-style search, then move sequentially in either direction (or both directions) from that point. You can do adds and removes at any point.
Finds entries rapidly by key. Can find the highest value less than a key or the lowest value greater than a key.
You can use multiple keys if they produce the same sort order. You can find a row by index number and then find a row by name with no need to resort the list or do other real work.
Given any entry, you can always get the next and previous entries. Removes and updates can be done freely with no need for an iterator or a keyed find.
Great for use with the rows in JTable's and JList's.
A SortedList entry can be part of another structure. It can be located through that structure and either removed directly from the SortedList or used to view the SortedList sequentially in either direction (or both directions).
Sequential searches are fast in both directions. Given attendance numbers at a theater, sorted by date, you can quickly determine the start and end of an attendance dip. (Find a date in the dip and scan in both directions for the first entries with normal attendance.)
Excellent for sparse array storage. If theater attendance numbers (above) are stored by 100's and attendance is steady, each entry can store a whole string of days with the same number (between 200 and 300, say). One sequential step can move ahead 50 days instead of just one. (This feature is apt to be more important for engineering and scientific applications.)
Works fine with small lists too. A million SortedLists with 10 entries each use no more memory than 10 SortedLists with a million entries each. So there are no awkward moments when a 10-entry list gets an unexpected 999,990 new rows, and no wasted memory when a million-entry list loses 999,990 rows.
All source code is thoroughly Javadoc'ed.
Huge Display Table Performance
A simple use for SortedList is holding data for display tables. It can do quick finds on indexes and do rapid sequential updates to reset the index numbers when adds and deletes are done. The following indicates the typical performance:
Times were obtained on an ancient machine. Your results will be better. java.util.TreeMap is used for comparison. One million rows of random data, sorted on the way in, were added to a list, then numbered. The result was a displayable list. Next 10 rows were added in the middle and all the index numbers in the second half of the list upped by 10. The list was again displayable, but with 10 rows out of sort order. Then everything was resorted and the list completely renumbered. It was again displayable. Finally, 100,000 random finds were done against the list. (Sample source code for an actual JTable application is included below.)
Times in Seconds
SortedList TreeMap
Fill the table, sort, and number: 34.840 38.505
Add 10 rows and adjust 500,000 indexes: 0.140 0.421
Resort mostly ordered table: 0.922 18.246
Renumber table: 0.270 2.344
Find 100000 random indexes: 1.052 0.620
This shows that SortedList can both add a row to a large list and resort the list fast enough that a user will not notice a delay--with a million rows on a P4, 1.0 GHz, 512MB Dell. Note the time to renumber the table. The ability of a threaded binary tree to blast through a list sequentially lets it do other time critical work as well..
Notes on java.util.TreeMap: Two TreeMap instances were needed, one to handle the row sort and one to handle the index sort; it was rather more complicated to use than Sorted List. SortedList, aside from being easier, sorts ordered data much faster. It also is much faster with sequential updates. (But TreeMap does do faster finds.)
Notes on java.util.ArrayList:: In my test ArrayList was by far and away the winner in everything but the sorts. (It was not worth showing the numbers.) However, large arrays can cause memory fragmentation if you repeatedly allocate and deallocate space for them. And ArrayList will allocate space on its own initiative, and most sorts will too. If you use arrays and are continually creating and disposing of display tables (with lots of rows) you are likely to have your system hang, use a lot more memory than necessary, and even produce out-of-memory exceptions. Worse, this behavior will be annoyingly random.
Usage case: Histograms
The toughest job for SortedList was histograms. These were not pretty displays; they were used for tracking work loads. The minimum time span was a minute, and the time span of a histogram could be years. There were a lot of these histograms, and they added to be added, subtracted, multiplied, divided, merged, aligned, summarized, synchronized, pattern-matched, etc., etc., etc.
SortedList allowed me to merge many minutes into one entry, which as few levels lasted only one minute, most lasted hours or days, and some lasted years saved immense amounts of space while still allowing detail where it was needed. I could jump to the relevant date in a list and then work sequentially, rearranging the levels as I went. (This rearranging meant removing some entries, changing the time span on others, adding to entries, and changing the levels on almost all of them.)
Usage case: Tracking List Selections
Using SortedList on this problem was overkill, but I have had similar big problems where SortedList was the only thing that could do the job. The critical feature in all cases was the ability to have two sort keys (which both resulted in the same sort) and to be able to do finds based on either with no cost for the switch.
I tracked row selections in a JTable by storing each continuous set of selected rows in a SortedList entry which comprised a starting row and the number of rows selected--plus the starting row number counting only selected rows. If rows 3, 6, 7, 8, 9, 12, 13, and 17 were selected, the list would look like this:
row # sel # count
3 0 1
6 1 4
12 5 2
17 7 1
Note that while the two start row numbers could be wildly different, they would always sort the entries in the same order.
A find on a normal row number could quickly gives the selected row number. Likewise, a find on a selected row number quickly gives the normal row number. A <= find on row 10 gives the second entry, which indicates (6 + 4 - 1 < 10) that row 10 is not selected. A <= find on selected row 6 gives the third entry, which indicates (12 + 6 - 5 = 13) it is the 13th row in the list.
TTreeMap
SortedList uses internal linkage, which helps performance and makes everything simpler. However, it also means that entries in a SortedList must extend SortedListNode. Objects that cannot cannot be used in a SortedList. Also, objects already in one such list cannot be used in another. The TTreeMap class solves this problem. It uses SortedList to create a java.util.Map implementation that works almost as well as a SortedList. The disadvantage of TTreeMap is that it does require an iterator to traverse the list. The advantage is that this iterator is a full-feature java.util.ListIterator, allowing movement in either direction. Also, TTreeMap allows adds and deletes while the ListIterator is in use. (Some care must be used not to delete the iterator's current entry.)
Performance
SortedList uses carefully worked out algorithms to keep its binary tree balanced and efficient. Adds (for data in random order) and deletes are marginally slower than with an ordinary binary tree, such as Java.util.TreeMap, simply due to the additional pointers for for the threads--which provide dramatically better sequential access. Adds where the data is already sorted, almost sorted, or sorted in reverse order are very fast. The table in the "Huge Display Table" section shows this.
Two sample programs are included as an example of a java.util.TableModel implementation, one using SortedList and the other TTreeMap. Three test/timing programs are also included. SortedList is not thread-safe, but can be used in a read-only manner across multiple threads.
/** Displays 1000000 rows in a JTable using SortedList
* to track the data. */
public class SortedListJTableExample extends JFrame {
/** A row in the list. */
private static class Row extends SortedListNode {
public int rowIndex;
public String rowName;
public Row( String name ) { rowName = name; }
public Row getKey() { return this; }
}
/** Creates a SortedListJTableExample window and displays it. */
public static void main(String[] args) {
new SortedListJTableExample().setVisible( true );
}
// *** Instance Fields ***
/** The table model connecting the JTable and SortedList. */
private AbstractTableModel tableModel =
new AbstractTableModel() {
public int getRowCount() {
Row row = (Row) sortedList.last();
if (row == null) return 0;
return (row.rowIndex + 1);
}
public Object getValueAt( int row, int column ) {
Row r = (Row)
sortedList.findByKey( new Integer( row ) );
if (r == null) return null;
if (column == 0) return new Integer( r.rowIndex );
return r.rowName;
}
};
/** A comparator for sorting and searching by name. */
private Comparator nameComp = new Comparator() {
public int compare( Object o1, Object o2 ) { ... }
};
/** A comparator for sorting and searching by index. */
private Comparator indexComp = new Comparator() {
public int compare( Object o1, Object o2 ) { ... }
};
/** The display table. */
private JTable listTable;
/** The display data. */
private SortedList sortedList;
// *** Instance Method ***
/** Creates a new SortedListJTableExample instance. */
public SortedListJTableExample() {
// Create and Populate a Sorted List of 40 character entries.
// Set name comparator for alphabetic sort.
sortedList = new SortedList( nameComp );
char[] letters = new char[] { ... };
Random dieroll = new Random();
for (int i = 0; i < 1000000; i++) {
StringBuilder builder = new StringBuilder();
for (int j = 1; j < 40; j++)
builder.append( letters[dieroll.nextInt(
letters.length )] );
Row nrow = new Row( builder.toString() );
sortedList.add( nrow );
}
// Number the List.
int counter = 0;
for (Iterator i = sortedList.iterator(); i.hasNext();)
((Row) i.next()).rowIndex = counter++;
// Set index comparator (at no cost) so tableModel will work.
// Can be switched back to nameComp (at no cost) at any time
// to find a row by text value.
sortedList.setComparator( indexComp );
// Set Up a JTable. Use "tableModel" from above.
listTable = new JTable( tableModel );
JScrollPane jsp = new JScrollPane( listTable );
getContentPane().add( jsp, BorderLayout.CENTER );
// Add an "Add Row" button.
JButton addButton = new JButton( "Add Row" );
addButton.addActionListener( new ActionListener() {
public void actionPerformed( ActionEvent evt ) {
// Get index and text for add.
int index = listTable.getSelectedRow();
String text = nameField.getText();
// Increment row numbers after add.
Row current = (Row) sortedList.findByKey(
new Integer( index ) );
int counter = index;
while (current != null) {
current.rowIndex = ++counter;
current = (Row) current.next();
}
// Add new row.
Row row = new Row( text );
row.rowIndex = index;
sortedList.addNode( row );
tableModel.fireTableRowsInserted( index, index );
}
} );
// The JTable is now available to the user with 1000000
// entries.
}
}
rrc76.utilities.utils.tree Class SortedList
java.lang.Object rrc76.utilities.utils.tree.SortedList
All Implemented Interfaces: 'java.lang.Iterable'
public class SortedList extends java.lang.Object implements java.lang.Iterable
A doubly threaded binary tree. This class will efficiently do the work of a binary tree and a doubly linked list. All entries must extend SortedListNode. (To gain this class's capabilities using entries that do not, use TTreeMap, a Map implementation.)
SortedList is written for performance. It uses internal linkage, which provides a major boost to speed. Adds and removes are efficient. Keys and comparators can be changed without resorting, provided they result in the same sort order. Given an entry in a SortedList, even (especially) when obtained independently of that SortedList, the next and previous entries can be obtained immediately (at no cost) and it can be removed from the list in one simple operation.
SortedList is not thread safe. Use the balance method to render an instance immutable for multi-threaded reads and queries.
Uses:
Use this class to find members of a list by key value (it can do <, <=, >=, >, first, and last as well as equals) and move sequentially either way in the list and add or remove elements during each step. An example: my heaviest use of this class involved blocks of time. Adding a new block meant finding (by key) the first block that overlapped the new block and removing that plus all subsequent overlapped blocks (stepping through the list until clear of the new block's span, removing each entry along the way) and finally adding in the new block (which would be modified to include the non-overlapped portions of the removed end blocks).
Sometimes time would be measured on 2 or even 3 different scales. Because they all sorted the same way, each time I had to find a block, I could, at no cost, swap in the comparator for the scale for that find.
SortedList also works well for large indexed lists such as those providing data for a JTable or other display list. Adding a row means incrementing the key (the index number) for all later rows, then adding in the new row. The linked list nature of SortTable makes moving through the list almost as fast as moving through an array. No sorting needs to be done. And no large memory reallocations--such as big arrays would require--need to be done. Removing rows is similarly fast and effortless.
Generally, SortedList is excellent as a sparse array where non-null sets of entries need to be accessed sequentially in both directions and values are changed (added, deleted, and modified) frequently.
Capabilities:
SortedList can do array-style operations efficiently by making the index the key. The values of keys can be changed with a single pass through the list, provided the sort order does not change. So a list can be sorted by one--possibly rather complex--key. Then a pass through the list, giving each element key the value of a counter incremented for each new element (that is: 1, 2, 3, 4, ...), will provide fast access by simple numeric index, without any need for resorting. Some care should be used when doing this.
Multiple comparators can be used to interpret the sort key from the SortedListNode instance. The key itself can contain several values. (Indeed, to give the comparators all possible options, the key might just be the SortedListNode instance itself.) As long as all the comparators sort the keys in the same order, a SortedList instance's comparator can be changed at will at virtually no cost. This allows searching by different keys on a search by search basis. For instance, rows might be sorted by name, and then numbered in that order. If the key contained the name and the number one search could be done for the 1293rd entry, then another search done for "Minneapolis" with no real cost for the key switch. All this done just by switching the comparator.
The SortedListNode class provides methods to get the next and previous nodes. These provide the most efficient way to move through a tree one node at a time. An entry obtained by any means, possibly from an independent list or reference, can thus provide immediate, linked-list access to other entries in the SortedList without need to do any (relatively) costly binary searches.
Constructor Detail
SortedList()
public SortedList()
Creates a new SortedList with no Comparator. Any keys must implement Comparable!
SortedList()
public SortedList(java.util.Comparator c)
Creates new SortedList.
Parameters:
c - the comparator for sorting. It should sort the keys returned by SortedListNode's getKey method along with the keys passed to the find-by-key methods.Method Detail
setComparator()
public void setComparator(java.util.Comparator c)
Replaces the Comparator. This method is perfectly safe to use if the SortedList is empty. Otherwise, the new Comparator must sort the existing list members exactly as the old Comparator did or there will be problems. This should be done only when efficiency is important. (The simplest safe way to resort is to create a new SortedList and fill it with data from the old one. Alternatively, the clear method might be called to empty the SortedList, setComparator called, and the data re-added. See the clear method for an easy way to do this.)
Parameters:
c - the new comparator. It should sort the keys returned by SortedListNode's getKey method. May be null if the keys implement Comparable and their natural sort obeys the rules a Comparator must obey.getComparator()
public java.util.Comparator getComparator()
Gets the Comparator, if any. This is a Comparator that works on the keys returned by SortedListNode's getKey method.
addNode()
public SortedListNode addNode(SortedListNode n)
Adds a node to the tree. Does nothing if a node with the same key is already there. This is the preferred method for adds.
Parameters:
n - the node to be added. Returns:
normally null. If a node with the same key already exists, that pre-existing node will be returned (and the list left untouched). The caller can thus skip a preliminary findByKey to check for trouble by simply doing the add. If there is trouble (a non-null return), the caller has all the necessary information to handle it.
findByKey()
public SortedListNode findByKey(java.lang.Object key)
Finds the node with the given key.
Parameters:
key - the key. Must not be null unless there is a comparator that can handle null keys.Returns: the sought node. Will be null if no node matched the key.
findByKey()
public SortedListNode findByKey(java.lang.Object key,
int mode)
Finds a node by key.
Parameters:
key - the key. Not used if mode is 3 or -3 and may be null in this case. Otherwise, must not be null unless there is a comparator that can handle null keys.mode - how to use the key:
-3 get first node in the SortedList.
-2 get node with highest key value less than specified key.
-1 get node with highest key value less than or equal to specified key.
0 get node with key value equal to specified key.
1 get node with lowest key value equal to or greater than specified key.
2 get node with lowest key value greater than specified key.
3 get last node in SortedList.Returns: the sought node. Will be null if no node met the given conditions.
position()
public SortedListNode position(SortedListNode n,
long move)
Positions within the list. Note that for simple positioning (where move == 1) it would be more efficient to use the SortedListNode methods.
Parameters:
n - the node to start with. If null, method moves from the start of the list if move is positive and from the end if negative. position( null, 4 ) returns the fourth element in the list (with an index of 3, if SortedList did indexes).move - the number of nodes to move from the starting node. A 1 will get the next node, a -1 the previous one. Returns: the desired node. May be null if an attempt was made to move beyond the start or finish of the list.
removeNode()
public void removeNode(SortedListNode n)
Removes the given node from the tree. This method is very efficient, but dangerous. If the node is actually not in the tree, this tree may be trashed. If the node is in another tree, that tree may be trashed also (and the node will be removed from it). Use the remove( SortedListNode ) method from the Collection-inspired methods instead to be safe, if less efficient.
Parameters:
n - the node to remove. May be null, in which case nothing will be done. If not null, this node must be in the tree!balance()
public void balance()
Balances the tree. Does not normally need to be called. However, if a tree is not going to be modified, but only queried, calling this method will make the queries as efficient as possible and prevent queries from modifying the tree. Always call this method before accessing this SortedList (read only) from more than one thread. SortedList only rebalances the tree when it sees that it is out of balance. This often happens when querying or reading the tree. Calling this method stops that--until the tree is next modified--by making sure the tree is balanced.
first()
public SortedListNode first()
Gets the first node instance in this SortedList.
last()
public SortedListNode last()
Gets the last node instance in this SortedList.
size()
public int size()
Returns the number of entries in the SortedList. Works by simply counting all the entries. An inefficient method inspired by java.util.Collection.
isEmpty()
public boolean isEmpty()
Returns true if this SortedList is empty and false if not. This method is efficient and inspired by java.util.Collection.
contains()
public boolean contains(java.lang.Object o)
Returns true if this SortedList contains the specified object, which it will not unless the object is an instance of SortedListNode. This method is efficient and inspired by java.util.Collection.
iterator()
public java.util.Iterator iterator()
Returns an Iterator containing all the nodes in this SortedList in their sorted order. The remove method works. This method is efficient and inspired by java.util.Collection. However, using the first method and SortedListNode's next method will work somewhat faster.
This iterator allows concurrent access, but avoid removing the next node. Also, do not use SortedList's remove or removeNode methods on the last returned node, then call the iterator's remove method.
Specified by:
iterator in interface java.lang.Iterable
toArray()
public java.lang.Object[] toArray()
Returns the contents of this SortedList in an array. As efficient as possible. Inspired by java.util.Collection.
toArray()
public java.lang.Object[] toArray(java.lang.Object[] a)
Returns the contents of this SortedList in an array of the specified type. As efficient as possible. Inspired by java.util.Collection.
add()
public boolean add(SortedListNode node)
Adds the specified node to this SortedList if it is not there already. Inspired by java.util.Collection. Use addNode instead unless you prefer a boolean return value.
Parameters:
* node - the object to be added.
Returns:
true if the SortedList changed as a result of this call, false if it did not. A false return indicates that the element--or one with an identical key--is already in this SortList.
remove()
public boolean remove(SortedListNode node)
Removes the specified object from this SortedList. This is preferred to removeNode because it is safer, if slower. (Unlike removeNode, it checks to be sure the specified node is in this SortedList.) Inspired by java.util.Collection.
Parameters:
node - the object to remove from this SortedList.
Returns:
true if the object was in the list (and hence removed), false if it was not. Either way, the caller may be sure the specified node is not now in this SortedList.
clear()
public void clear()
Clears out this SortedList. Extremely efficient. Does not touch the SortedListNode instances! This means that the SortedListNode methods will still work. A node will remain part of the old linked list until added to this or another SortedList instance. So long as a given node and its next and previous nodes have not been so added, the SortedListNode methods will work. Further, the next method will work if the previous node has been added elsewhere, and the previous method will work if the next node has been added elsewhere.
This means a SortedList can be resorted simply and easily. Obtain the first element of the list (first) and save it as the current node. Call clear. Change the Comparator (setComparator). Get the next element of the list by calling SortedListNode's next on the current node and make it the new current node. Add the old current node to the tree. Repeat till next returns null. Done.
equals()
public boolean equals(java.lang.Object o)
Compares an object to this sorted list for equality. A SortedList can only be equal to something if it implements java.util.Iterable and returns the same objects (as determined by equals) in the same order as they exist in this SortedList.
Parameters:
o - the object to be compared.hashCode()
public int hashCode()
Returns a hash code.
rrc76.utilities.utils.tree Class SortedListNode
java.lang.Object rrc76.utilities.utils.tree.SortedListNode
public abstract class SortedListNodeextends java.lang.Object
A node in a SortedList object. Extend this class to make useful entries for a SortedList threaded binary tree. The only method that can or needs needs to be overridden here is getKey, which will determine this node's position within the list.
The other methods provide the next and previous elements in the list. They are the most efficient ways to do this. They will continue to work in a set of nodes from a SortedList that has been cleared. (They will not work in node that have been individually removed from a tree.)
Method Detail
next()
public final SortedListNode next()
Gets the next node.
next()
public final SortedListNode next(boolean right)
Gets the next node.
Parameters:
right - if true get the next node. If false, gets the previous node.previous()
public final SortedListNode previous()
Gets the previous node.
getKey()
public abstract java.lang.Object getKey()
Gets this node's sort key. Never returns null. The value returned will determine this node's position within its SortedList instance.
This value must not normally change lest the SortedList instance malfunction. When it does change, this node should be removed from the SortedList and re-added. However, a key or keys may be changed, without affecting ordering or structure, if the sort order of the new keys is the same as that of the old keys. This can quickly make a whole new set of information available to people querying the list. (Similar results can be had, even more efficiently, by changing comparators in the SortedList instance itself.)
Questions & Comments