Node.rb
An object library to parse
outlines and handle hierarchies
Copyright (C) 2005 by Steve Litt
NO WARRANTY!
There is no warranty for anything contained in the
Node.rb distribution or documentation or its web pages, to the extent
permitted by applicable law. Except when otherwise stated in
writing
the copyright holders and/or other parties provide the program,
documentation
and web pages "as is" without warranty of any kind, either expressed or
implied, including, but not limited to, the implied warranties of
merchantability
and fitness for a particular purpose. The entire risk as to the
quality
and performance of the program is with you. Should the program,
documentation
or web pages prove defective, you assume the cost of all necessary
servicing,
repair or correction.
Node.rb is a Ruby object library
designed
to quickly and easily parse outline files into node hierarchies, and
to manipulate those node hierarchies. It is similar to DOM but lighter
weight, and it can handle multiple top level items in parsed files.
It is a functional equivalent of the Node.pm
library, although version 0.01 is not as complete and has not been
extensively checked for bugs.
Copyright and License
This software is copyright (C) 2005 by Steve Litt, all rights reserved.
I have licensed it under the Litt Scripting Language Development Tool
License
(LSLDTL). The LSLDTL is the GNU GPL with an exception and an exception to
that exception. The intent of the LSLDTL is to provide you with a tool
that is copylefted free software, but does not require programs you
write that include the tool to be either copylefted, free software, or
GPL compatible. I have tried to craft this license so that bad guys
cannot proprietarize modifications of the tool itself.
The LSLDTL contains provisions for the user to issue modifications of
the code as pure GPL, in case you're uncomfortable with the LSLDTL.
Keep
in mind, however, that if you do that, nobody (including yourself) will
be able to use that modified tool to create software that is either
proprietary or GPL incompatible.
Project Charter
The purpose of Node.rb is the quick and easy parsing of outline files,
and quick and easy manipulation of hierarchies. This might not sound
especially impressive, but consider the uses:
- configuration file parsing
- Menu creation and service
- Conversion between markup languages
- XML creation
- HTML structuring
- Creation of book structure
- outline processing
- Hierarchy drilldown
- Hierarchy flattening
Node.rb is created as an object oriented tool. At the lowest level are
Node objects, which have a name, type, value, and a list of named
attributes, as well as containing pointers to parent, previous sibling,
next sibling, first child and last child. Hierarchies are built from
trees of these Node objects.
The design priorities of Node.rb are:
- Quick, simple and easy development of hierarchy/outline
manipulation algorithms
- Easy learning curve relative to the tool's power
- Ability to implement very complex hierarchical requirements in an
easy to maintain manner
To simplify working with such hierarchies, the Walker object walks the
tree recursively, handling all of a Node's descendants before going on to
its younger siblings. In the preceding sentence the word "recursively"
is used loosely, because although the order of Node visitation mimics
what you would see in a recursive algorithm, in fact the algorithm is a
simple loop, conserving memory.
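The loop-based walk described above can be sketched as follows. This is only an illustration under stated assumptions: SketchNode, add_child and walk_in_loop are hypothetical names made up for this example, not the actual Node.rb source, and the node carries only the pointers the loop needs.

```ruby
# Hypothetical minimal node for this sketch; the real Node class has more fields.
class SketchNode
  attr_accessor :value, :parent, :firstchild, :nextsibling

  def initialize(value)
    @value = value
  end
end

# Helper for this sketch only: append child as the last child of parent.
def add_child(parent, child)
  child.parent = parent
  if parent.firstchild.nil?
    parent.firstchild = child
  else
    last = parent.firstchild
    last = last.nextsibling while last.nextsibling
    last.nextsibling = child
  end
  child
end

# Visit top and all of its descendants depth first, using a plain loop
# and the parent/firstchild/nextsibling pointers instead of recursion.
def walk_in_loop(top)
  visited = []
  node = top
  while node
    visited << node.value
    if node.firstchild
      node = node.firstchild                 # go deep before going broad
    else
      # Climb until reaching top or a node that has a younger sibling.
      node = node.parent while !node.equal?(top) && node.nextsibling.nil?
      node = node.equal?(top) ? nil : node.nextsibling
    end
  end
  visited
end

top = SketchNode.new("top")
a   = add_child(top, SketchNode.new("a"))
add_child(a, SketchNode.new("a1"))
add_child(a, SketchNode.new("a2"))
add_child(top, SketchNode.new("b"))

order = walk_in_loop(top)   # => ["top", "a", "a1", "a2", "b"]
```

Because the loop keeps only the current node, memory use does not grow with the depth of the tree, which is the point the paragraph above makes.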
Having the Walker "walk the Node tree" is a nice intellectual exercise,
but it accomplishes nothing unless it takes action. The action taken
is determined by two callback routines passed to the Walker's new() method.
The EMDL language is intended to facilitate fast and easy manipulation
of large menu systems using a standard text editor capable of automatic
indentation (the ability to indent to the same level as the preceding
line, plus ways to cut, paste, indent and exdent multiple lines). The Vim
editor is one such editor.
Maintainer's Guide
All current and future maintainers of Node.rb should
be very cognizant of the project's priorities. Node.rb must be easy for
the human maintaining or manipulating hierarchies.
Project Specifications
Node.rb implements three types of objects:
- Node: Data for a single entity in a hierarchy
- Parser: An object to convert a tab indented outline into a
hierarchy of Node objects
- Walker: An object that "walks" a hierarchy of Node objects,
taking action on each Node as specified by its callback functions.
It should be noted that Node.pm was inspired by the Apache Software
Foundation's DOM (Document Object Model) software. It differs from DOM
in that it is less inclusive, and:
- Much easier to add to Ruby than DOM
- Much smaller footprint than DOM
- Includes a Walker object to simplify tree manipulation
- Ability to parse input with multiple top level entities
- Parses tab indented outlines instead of XML
- No facilities for DTDs or Schemas
Node Object
The Node object represents one entity in a hierarchy or tree. It
contains data, navigational pointers and methods:
- Data
  - Name
  - Type
  - Value
  - Zero, one or more attributes in key/value pairs
- Navigational Pointers
  - Parent
  - Previous sibling
  - Next sibling
  - First child
  - Last child
- Methods
  - Writeable attributes
  - Read only attributes (protected: writeable within the class itself)
    - Navigation
      - .parent
      - .prevsibling
      - .nextsibling
      - .firstchild
      - .lastchild
  - Methods
    - Attribute methods:
      - hasAttribute()
      - getAttribute()
      - setAttribute()
      - removeAttribute() ### TODO
      - hasAttributes()
      - getAttributes()
      - setAttributes()
    - Creation
    - Insertion/Deletion
      - insertSiblingBeforeYou(nodeToInsert)
      - insertSiblingAfterYou(nodeToInsert)
      - insertFirstChild(nodeToInsert)
      - insertLastChild(nodeToInsert)
      - deleteSelf()
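To make the pointer bookkeeping concrete, here is a minimal sketch of a node class that borrows a few of the method names listed above. It is an assumption-laden illustration, not the Node.rb source: the class name SketchNode, the new(name, type, value) signature, and every method body are guesses at how such a class could work.

```ruby
# Hypothetical stand-in for Node; method names mirror the list above,
# but the bodies and the constructor signature are assumptions.
class SketchNode
  attr_accessor :name, :type, :value
  attr_reader :parent, :prevsibling, :nextsibling, :firstchild, :lastchild

  def initialize(name, type, value)
    @name, @type, @value = name, type, value
    @attributes = {}
  end

  def setAttribute(key, val)
    @attributes[key] = val
  end

  def getAttribute(key)
    @attributes[key]
  end

  def hasAttribute(key)
    @attributes.key?(key)
  end

  # Make node the youngest child of this node.
  def insertLastChild(node)
    node.parent = self
    node.prevsibling = @lastchild
    node.nextsibling = nil
    if @lastchild
      @lastchild.nextsibling = node
    else
      @firstchild = node
    end
    @lastchild = node
    node
  end

  # Make node the next sibling of this node.
  def insertSiblingAfterYou(node)
    node.parent = @parent
    node.prevsibling = self
    node.nextsibling = @nextsibling
    if @nextsibling
      @nextsibling.prevsibling = node
    elsif @parent
      @parent.lastchild = node
    end
    @nextsibling = node
    node
  end

  protected

  # Pointers are read only from outside, writeable within the class itself.
  attr_writer :parent, :prevsibling, :nextsibling, :firstchild, :lastchild
end

top = SketchNode.new("top", "Node", "outline")
a = top.insertLastChild(SketchNode.new(nil, "Node", "first"))
c = top.insertLastChild(SketchNode.new(nil, "Node", "third"))
b = a.insertSiblingAfterYou(SketchNode.new(nil, "Node", "second"))
b.setAttribute("_lineno", 2)
```

Keeping the pointer writers protected means application code can read .parent or .nextsibling but can change the tree only through the insertion methods, so the five pointers stay consistent with one another.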
OutlineParser Object
The OutlineParser object is an object whose task is to parse a tab
indented outline and place its information into a tree of Node objects.
Because outlines frequently have multiple top level entries, but a true
tree can have only one, the OutlineParser object creates a new Node
object that becomes the top level, with the outline's top level entries
becoming children of that Node object created by the OutlineParser.
Each line of the outline is converted to a Node object with the
following data:
- Name is undefined
- Type is "Node"
- Value is the text of the line itself
- Attribute _lineno is the line's number in the source outline file
- This is vital for error messages
The outline file's hierarchy is mirrored in the Node tree. If line P
has a child C in the outline, then node P has a child node C in the
Node hierarchy.
OutlineParser Object Methods
Properties of the Parse
- .commentchar: Single character signifying a line is a comment. This character must be the first nonblank character on the line in order to render the line a comment. Comment lines are not converted to nodes, nor are they checked for correct indentation or other syntax. Default is #. Read/write.
- .skipblanks: Sets the OutlineParser to either ignore (true) or process (false) blank lines, meaning lines with no characters and also lines with only whitespace. Default is true.
Action Methods
- new(): Instantiates a new OutlineParser object, and passes it back as a return.
- parse(): Perform the parse. If parsing a file, pass the filename in via an argument. All other parse properties have been previously defined. The top level node (the one created by OutlineParser) is passed back as the function return. From that top level node, the application programmer can navigate or walk anywhere in the Node tree.
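The parsing behavior described in this section can be sketched in a few lines of Ruby. This is a simplified illustration, not the OutlineParser source: SketchNode and parse_outline are hypothetical names, the sketch takes the outline as a string rather than a filename, and it stores children in an array instead of sibling pointers to stay short.

```ruby
# Hypothetical simplified node: children kept in an array for brevity.
class SketchNode
  attr_reader :value, :children, :attributes
  attr_accessor :parent

  def initialize(value)
    @value = value
    @children = []
    @attributes = {}
  end
end

# Parse tab indented outline text into a tree under one synthetic top node.
# commentchar and skipblanks mimic the properties described above.
def parse_outline(text, commentchar: "#", skipblanks: true)
  top = SketchNode.new("top")   # synthetic top level node
  stack = [top]                 # stack[level] is the parent for that level
  text.each_line.with_index(1) do |line, lineno|
    line = line.chomp
    next if line.lstrip.start_with?(commentchar)    # comment line
    next if skipblanks && line.strip.empty?         # blank line
    level = line[/\A\t*/].length                    # count leading tabs
    raise "line #{lineno}: indented too far" if level > stack.length - 1
    node = SketchNode.new(line.sub(/\A\t*/, ""))
    node.attributes["_lineno"] = lineno             # kept for error messages
    node.parent = stack[level]
    stack[level].children << node
    stack = stack[0..level] + [node]                # node is parent for level + 1
  end
  top
end

outline = "# a comment\nFruits\n\tApple\n\tPear\n\nVegetables\n\tKale\n"
top = parse_outline(outline)
fruits, vegetables = top.children
```

Here the two top level lines, Fruits and Vegetables, both become children of the synthetic top node, and each node remembers the source line it came from in its _lineno attribute.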
Walker Object
The Walker object "walks" an entire Node hierarchy: the Node passed as
the first argument to new(), plus all of its descendants. This walk always
goes deep before going broad. In other words, all children are processed
before going to the next sibling.
The Walker object visits every Node in the hierarchy, which in itself
does nothing. The Walker performs actions on these Node objects via two
callback routines, the entry callback and the exit callback. The
entry callback is called immediately upon the Walker's first visiting
the Node. The exit callback is called upon exiting the Node for the
last time. Unlike Node.pm, Node.rb runs the exit process on all nodes,
whether they have children or not.
The entry and exit callbacks are arg2 and arg3, respectively, of the
new() function. Callbacks MUST be methods of a Ruby object. They CANNOT
be free standing subroutines. The reason for this design is so the
callbacks can keep persistent information, and so they can trade
information with the "outside world" without resorting to global
variables.
The following is a simple example of the use of a Walker object,
assuming that object cb
contains methods entry() and exit(),
each of which receives the Node object that called the callback
and the level of that node (the object itself is available as self):
walker = Walker.new(node, cb.method(:entry), cb.method(:exit))
walker.walk()
The preceding instantiates the Walker object with a Node and both
callback routines. To pass only one of the two callback routines, pass the
other one as nil. There
would never be a reason to run a Walker object without any callbacks --
doing so would perform no work.
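The following sketch shows why callbacks that are methods of an object are convenient: the object can accumulate state across the whole walk. Everything here is hypothetical and self-contained. SketchNode and SketchWalker are stand-ins written for this example, the callback signature (node, level) is an assumption, and the stand-in walker uses recursion for brevity even though the real Walker is described as a loop.

```ruby
# Hypothetical simplified node with its children in an array.
class SketchNode
  attr_reader :value, :children

  def initialize(value, children = [])
    @value = value
    @children = children
  end
end

# Hypothetical stand-in for Walker: new(top, entry_callback, exit_callback).
class SketchWalker
  def initialize(top, entry_cb, exit_cb)
    @top, @entry_cb, @exit_cb = top, entry_cb, exit_cb
  end

  def walk
    visit(@top, 0)
  end

  private

  def visit(node, level)
    @entry_cb.call(node, level) if @entry_cb    # first visit to the node
    node.children.each { |child| visit(child, level + 1) }
    @exit_cb.call(node, level) if @exit_cb      # leaving the node for good
  end
end

# The callback object keeps persistent information without global variables.
class Callbacks
  attr_reader :lines, :count

  def initialize
    @lines = []
    @count = 0
  end

  def entry(node, level)
    @lines << "#{'  ' * level}#{node.value}"
  end

  def exit(node, level)
    @count += 1
  end
end

tree = SketchNode.new("top", [
  SketchNode.new("a", [SketchNode.new("a1")]),
  SketchNode.new("b")
])

cb = Callbacks.new
SketchWalker.new(tree, cb.method(:entry), cb.method(:exit)).walk

# Passing nil for the entry callback runs only the exit callback.
exit_only = Callbacks.new
SketchWalker.new(tree, nil, exit_only.method(:exit)).walk
```

After the first walk, cb.lines holds an indented listing of the tree and cb.count holds the number of nodes exited, both collected without any global variables.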
Downloads
- Version 0.03 pre-alpha, 12/17/2005
- Checks for parent before doing certain operations, preventing
crashes
- Node.rb (The Tool itself)
- test.otl (test data for the test
program)
- testnode_parse.rb (A test
program)
- Please note that all three files should be placed in the same
directory.
- Run the program like this:
- ./testnode_parse.rb < test.otl | less
- Version 0.02 pre-alpha, 12/7/2005
- Implements WalkerReverse for reverse node walking
- Node.rb (The Tool itself)
- test.otl (test data for the test
program)
- testnode_parse.rb (A test
program)
- Please note that all three files should be placed in the same
directory.
- Run the program like this:
- ./testnode_parse.rb < test.otl | less
- Version 0.01 pre-alpha, 12/7/2005
- Node.rb (The tool itself)
- test.otl
(test data for the test program)
- testnode_parse.rb
(A test program)
- Please note that all three files should be placed in the same
directory.
- Run the program like this:
- ./testnode_parse.rb < test.otl | less
Maintainers List
Needed Programming and Documentation Tasks
Several minor functionalities from Node.pm must be included. A
WalkerReverse class must be created. Also, this tool must be tested
extensively before it can be confidently used.
How to Participate
Email Steve Litt if you'd
like to participate. I'll work with you as much or as little as you
want.
Mailing List
There's no mailing list yet. For now, communicate directly with Steve
Litt. Once there are several participants, I'll make a mailing
list.
FAQ (Frequently Asked Questions) list
None exists. The project is too new to really know what to put in it.
HTMLized versions of the project documentation
Besides this page, there's EXTENSIVE documentation of Node.pm, the Perl
version of this tool, at http://www.troubleshooters.com/lpm/200407/200407.htm
and http://www.troubleshooters.com/projects/Node/index.htm.
Links to related projects.
Dedication: We Stand On Their Shoulders
- Richard Stallman and the Free
Software
Foundation:
Without them I shudder to think what the software world would be like
today.
- Linus Torvalds and the various Linux projects: Without Linux, I
wouldn't
need VimOutliner -- I'd just need a lot more money to purchase
proprietary
software and a lot more patience to deal with Blue Screens of Death and
resulting data loss.
- Greater Orlando Linux User Group
(GOLUG): The peer to peer brain network of which
I'm a small part.
- The VimOutliner project,
whose members gave me ideas and encouragement in developing Node.pm.
- The Apache Software Foundation,
whose DOM spec inspired me to create Node.pm.
Progress
On 12/7/2005 I released this software, after testing it in its current
form for some six months.