News
2007-02-13: version 1.0
Download it here.
The library is no longer maintained. Unfortunately I have too many other things to do, and I cannot
keep working on the library unless it is really needed.
2005-01-13: version 1.0beta is out.
Download it here.
The improvements are too many to list; they include support for Fortran 90, the full DOM2 API and
XPath. F77/F90xml will be the reference library for integration and communication between Quantum Chemistry packages
under the COST D23 project
"A meta-laboratory for code integration in ab-initio methods".
See
the AbiGrid website for more information.
Documentation for the library is provided in the package. I really need a hand to
clean up. The tasks are simple: cleaning up the documentation and the... well, website,
and providing examples. Some skill in packaging and autotools is useful. I'm
not really good with autotools, so I cannot guarantee a fully compliant package. If you
can do better, you are very welcome.
2004-06-07: After a computing odyssey, version 1.0alpha is out.
Check it out.
Yes, I need help. Are you a talented F77 programmer who is not scared of using my library and
writing a program no longer than 60 lines of code? Then this announcement is for you.
I need a testsuite for the library, now that the API is stable. The test directory already
holds various tests for different calls. If you have time to spend and want to contribute,
please consider this. The library code is trivial, but the testsuite is the most time consuming
task.
Subscribe to the
mailing list
or email me directly. You are welcome!
2004-05-03: As reported on the mailing list, the project has been forced to a stop
because my laptop was stolen. I have a backup of the last pre-1.0 release, but it is
currently on my digital camera. Since the cable (which was also stolen) seems to be proprietary,
I'm forced to wait for a replacement cable from AIPTEK (whom I wish to thank: they will send me another
cable at no cost) and for another laptop (I have problems accessing the camera using older
versions of Linux).
So please pardon me for the delay; I'm doing everything possible to release the
1.0 version as soon as possible.
As a reminder, take a look at the
draft for the 1.0 release.
2004-03-29: A mailing list for the project has been opened.
You can subscribe using
this web form.
Introduction
These days, XML plays a central role throughout the internet infrastructure.
Grid computing is a reality.
At the same time, many calculations in scientific fields such
as physics, chemistry, astronomy and so on are still bound to Fortran
(90 in the best cases, 77 or even 66 in the worst ones, and I'm not joking,
believe me) for reasons that range from compatibility, history,
library availability and efficiency to, last but most important, the human
factor.
Lots of scientific developers need to cope with code from the past.
Given that they are (usually) not computer scientists, they learn only the
one language they need in their daily job, namely Fortran.
Given these premises, the lack of a Fortran library to handle XML is a
problem (indeed, input handling in Fortran is conceptually a pain,
and only with F90 did the namelist become a standard, language-driven
way of reading plain text parameters). Moreover, even when
input/output wrappers are needed to convert from XML to namelists (or vice versa),
the human factor makes people stick to Fortran instead of using more advanced
languages or resources, like python, perl or <place your favourite
language here>.
All the currently available solutions for reading XML from Fortran
target SAX parsing, using F90 as a (complete) programming
language. This produces libraries that are
- difficult to use
- unable to perform validation
- read only
- very rigid
- afflicted by the Fortran bane, namely string handling
- bound to F90 compilers (unfortunately GNU doesn't provide a g90
compiler)
The first five points are indeed critical. You'll have no hope for
advanced features. Fortran has limits in strings, file handling and
object management. Period.
The last point is more controversial. Most people don't care, since there
are lots of professional F90 compilers. Some of them are free (as in
beer), but you never know for how long they will remain so. Scientific
computing is moving to Linux and free software for economic and
technical reasons, but the lack of a free (as in speech) Fortran 90
compiler is a problem. Also, most of the codes that exist today are
still in F77, and the groups that still develop them usually choose
either to stick to pure F77 or to mix F77 and F90. They never port
everything to F90.
So we need a library that provides DOM parsing, sticks to F77, is
extensible and compiles with GNU g77. In pure F77, this is impossible.
In C it becomes more feasible.
Description
NOTE: These notes are outdated. They refer to the 0.x versions. A new
architecture is currently being planned and discussed for the 1.0 version. If you
want to participate, read the draft and join the
mailing list.
F77xml is a C library designed to provide DOM parsing functionality to
Fortran 77.
It acts as a wrapper around the gdome2 library. At the moment, the API is very
unstable, the code is probably full of memory leaks, and there are some
conceptual problems still to be faced, but you can already read an xml file
and add elements, text or attributes to it.
The main problem I have to face in developing this library is
sticking to a maximum of 6 characters for each function name. This is a
major problem when you need to map, for example, Element::firstChild to
something that is still comprehensible in 6 characters. Also, a lot of
namespace pollution problems arise. How can this be solved?
It's not black magic. I introduced two concepts in the usage of the
library:
- signatures
- multiplexers
"Signature" is a well known term to anyone accustomed to C++, for
example, and the meaning here is very close. Consider these functions from gdome2
GdomeNode* gdome_el_firstChild (GdomeElement *self, GdomeException *exc);
GdomeNode* gdome_el_appendChild (GdomeElement *self, GdomeNode *newChild, GdomeException *exc);
Here you can see an analogy. firstChild accepts an element and returns a
node and an error. appendChild accepts an element, a node and an error,
and it returns a node (indeed the same as newChild, so it's redundant
and can be neglected). In conclusion, if we consider a node as having a
"code" (in C speak, its pointer; in F77 speak, we'll see later), both
functions need to handle two codes and one error. They are of signature
cce (code, code, error). It does not matter that in the first case the parameters
are (input, output, output) and in the second case (input, input, output).
They are simply parameters, passed by reference, and thus can act both
as input and as output.
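One way to picture a "code" on the C side is as an integer that stands for a pointer, handed to Fortran in place of the pointer itself. The following is only a minimal sketch under that assumption: the table layout and the names (cache_add_pointer, cache_query_code, cache_query_pointer, NULL_CODE) are hypothetical and are not the library's actual Cache API. Unlike the Cache described in the ChangeLog, this sketch simply returns the existing code when the same pointer is registered twice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names and sizes, for illustration only. */
enum { NULL_CODE = 0, CACHE_SIZE = 64 };

/* Slot 0 is never used, so that 0 can mean "no object". */
static void *cache_slots[CACHE_SIZE];

/* Return the code already assigned to ptr, or NULL_CODE if it is unknown. */
int cache_query_code(const void *ptr)
{
    int code;
    if (ptr == NULL) return NULL_CODE;
    for (code = 1; code < CACHE_SIZE; code++)
        if (cache_slots[code] == ptr) return code;
    return NULL_CODE;
}

/* Return the pointer behind a code, or NULL for an invalid or free code. */
void *cache_query_pointer(int code)
{
    if (code <= 0 || code >= CACHE_SIZE) return NULL;
    return cache_slots[code];
}

/* Register ptr and return its code. The same pointer always yields the
   same code. Returns NULL_CODE when the table is full. */
int cache_add_pointer(void *ptr)
{
    int code;
    assert(ptr != NULL);            /* registering NULL is a caller bug */
    code = cache_query_code(ptr);
    if (code != NULL_CODE) return code;
    for (code = 1; code < CACHE_SIZE; code++) {
        if (cache_slots[code] == NULL) {
            cache_slots[code] = ptr;
            return code;
        }
    }
    return NULL_CODE;
}
```

With something like this, a Fortran program only ever stores plain integers, and the C side turns them back into pointers before calling gdome2.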
If you take, for example,
void gdome_el_setAttribute (GdomeElement *self, GdomeDOMString *name, GdomeDOMString *value, GdomeException *exc);
the signature here is csse (code, string, string, error), which is
different from the previous cases.
The first cases are named "p3" (which means: 3 parameters), the second
case "p4" (four parameters). Please note that there isn't always a
straightforward match between the gdome choice of parameters and the
F77xml signature. In other words, given a gdome function, you cannot
derive the F77xml signature a priori, and vice versa.
Also, please note that the need to distinguish signature "cce" from
"sce" still holds. Both have 3 parameters, but the parameters are of
different types. For this reason, the former are "p3t1" signature
functions, and the latter "p3t2". Prepending an "x" gives
the name of the associated multiplexer.
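The naming rule can be written down in a few lines of C. This is a sketch of the convention as described here, not code from the library: mux_name is a hypothetical helper, and the type variant number (the "t" part) has to be supplied by the caller because it cannot be derived from the signature string alone.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build a multiplexer name such as "xp3t1" from a
   signature string such as "cce" and a type variant number.
   Returns 0 on success, -1 on bad arguments, a too-small buffer, or a
   name longer than the 6 characters allowed for F77 names. */
int mux_name(const char *sig, int variant, char *out, size_t outlen)
{
    int n;
    assert(sig != NULL && out != NULL);
    if (sig[0] == '\0' || variant < 1) return -1;
    /* "x" + "p<number of parameters>" + "t<type variant>" */
    n = snprintf(out, outlen, "xp%dt%d", (int)strlen(sig), variant);
    if (n < 0 || (size_t)n >= outlen) return -1;
    return (n <= 6) ? 0 : -1;
}
```

For the examples in the text, mux_name("cce", 1, ...) gives the name used for firstChild and mux_name("csse", 1, ...) the one used for setAttribute.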
What is a multiplexer? We still need to solve the problem of calling such
different functions while keeping pollution of the Fortran
namespace low. So why not pass the function name as a parameter? This is
the F77xml solution. To call, for example, firstChild you need to write
func="Element::firstChild"
call xp3t1(func, ...)
and to call setAttribute, simply
func="Element::setAttribute"
call xp4t1(func, ...)
(obviously, func is a character*n)
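On the C side, a multiplexer of this kind can be sketched as a lookup of the function name in a table of handlers that all share one signature. The code below is an illustration under several assumptions, not the library's implementation: a tiny array stands in for the gdome2 tree, the handler names and error values are made up, placing Element::parentNode under p3t1 is a guess, and the entry point follows the usual g77 convention (trailing underscore, arguments by reference, the character length appended as a hidden integer passed by value).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a DOM tree (illustrative only): node codes are 1..3 and
   0 is the null code. The real library wraps gdome2 objects instead. */
enum { NNODES = 4 };
static const int first_child[NNODES] = { 0, 2, 3, 0 };
static const int parent_of[NNODES]   = { 0, 0, 1, 2 };

/* Every "cce" handler takes two codes and an error, all by reference. */
typedef void (*cce_handler)(int *c1, int *c2, int *err);

static void el_first_child(int *self, int *out, int *err)
{
    if (*self <= 0 || *self >= NNODES) { *err = 1; return; }
    *out = first_child[*self];
    *err = 0;
}

static void el_parent_node(int *self, int *out, int *err)
{
    if (*self <= 0 || *self >= NNODES) { *err = 1; return; }
    *out = parent_of[*self];
    *err = 0;
}

struct mux_entry { const char *name; cce_handler fn; };

static const struct mux_entry p3t1_table[] = {
    { "Element::firstChild", el_first_child },
    { "Element::parentNode", el_parent_node },  /* assumed to be p3t1 */
};

/* Compare a blank-padded Fortran string with a C string. */
static int fstr_equal(const char *f, int flen, const char *c)
{
    size_t clen = strlen(c);
    while (flen > 0 && f[flen - 1] == ' ') flen--;
    return (size_t)flen == clen && memcmp(f, c, clen) == 0;
}

/* What "call xp3t1(func, c1, c2, err)" would reach under g77 conventions. */
void xp3t1_(const char *func, int *c1, int *c2, int *err, int func_len)
{
    size_t i;
    assert(func != NULL && func_len >= 0);
    for (i = 0; i < sizeof p3t1_table / sizeof p3t1_table[0]; i++) {
        if (fstr_equal(func, func_len, p3t1_table[i].name)) {
            p3t1_table[i].fn(c1, c2, err);
            return;
        }
    }
    *err = -1;  /* unknown name: this can only be detected at run time */
}
```

Only one external symbol per signature is exported to Fortran, however many DOM methods share that signature.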
OK, this is not so user friendly, because you need to look up the multiplexer
subroutine (xp3t1, xp4t1) every time. You already need to know
the number and the order of the parameters, but having to look up the type as well is
annoying. Also, declaring func every time as a new string is very
error prone (if you write the wrong name for the multiplexed function, the multiplexer
complains only at execution time, so the program will apparently compile correctly)
and requires unneeded typing. For this reason, another tool comes in handy: the
f77xml preprocessor.
Currently still in development, the f77xml
preprocessor reads your sources and does the lookup work for you,
substituting the more user friendly syntax
call f77xml::Element::firstChild(...)
with the more compiler friendly syntax
func="Element::firstChild"
call xp3t1(func, ...)
At the time of writing, the f77xml preprocessor is partially written, so
you need to be compiler friendly, but the whole architecture works.
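The substitution itself is a simple line rewrite. The sketch below is written in C only to stay consistent with the library code; the actual preprocessor is written in python, and nothing here is taken from it. The function rewrite_line and its lookup table are hypothetical, and the sketch assumes the call sits alone on its line after optional blanks and has at least one argument. It ignores labels, continuation lines and the 72-column limit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup table: friendly name -> multiplexer subroutine. */
struct pp_entry { const char *name; const char *mux; };

static const struct pp_entry pp_table[] = {
    { "Element::firstChild",   "xp3t1" },
    { "Element::setAttribute", "xp4t1" },
};

/* Rewrite one source line into out.
   Returns 1 if the line was rewritten, 0 if it contains no f77xml call
   (it is copied unchanged), -1 on an unknown name, a malformed call or
   a too-small buffer. */
int rewrite_line(const char *line, char *out, size_t outlen)
{
    static const char prefix[] = "call f77xml::";
    const char *start, *name, *paren;
    size_t i, namelen;
    int n, indent;

    assert(line != NULL && out != NULL);
    start = strstr(line, prefix);
    if (start == NULL) {
        n = snprintf(out, outlen, "%s", line);
        return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
    }
    name = start + sizeof prefix - 1;
    paren = strchr(name, '(');
    if (paren == NULL) return -1;
    namelen = (size_t)(paren - name);
    indent = (int)(start - line);   /* assumed to be blanks only */

    for (i = 0; i < sizeof pp_table / sizeof pp_table[0]; i++) {
        if (strlen(pp_table[i].name) == namelen &&
            strncmp(pp_table[i].name, name, namelen) == 0) {
            /* Emit the assignment, then the multiplexer call with func
               as first argument followed by the original arguments. */
            n = snprintf(out, outlen,
                         "%*sfunc=\"%s\"\n%*scall %s(func, %s",
                         indent, "", pp_table[i].name,
                         indent, "", pp_table[i].mux, paren + 1);
            return (n >= 0 && (size_t)n < outlen) ? 1 : -1;
        }
    }
    return -1;  /* unknown name, reported before the compiler ever runs */
}
```

A side effect of doing the lookup in a preprocessor is that a misspelled function name can be reported when the sources are processed, instead of at execution time.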
Download
The current version (1.0 alpha) is
here.
Version 0.4.0 is
here.
Version 0.3.1 is
here.
You also need:
- gdome2-0.8.0
- glib-1.2.10
- libxml2-2.5.11
- python 2.3
- g77+gcc
Please note that the reported version numbers are only indicative.
F77xml probably compiles against older or more recent versions of these libraries, but I
have not tested this.
You need python because:
- Much of the API-related multiplexer code is very standard and
can be generated automatically from an API description (so you need python to
compile f77xml)
- the preprocessor is written in python (so you need python
to develop using f77xml). This will change if a C preprocessor turns out to be needed.
ChangeLog
version 0.4.0:
* added Element::parentNode
* Cache now does an assert when the same pointer is introduced twice
with addPointer. This is a quick fix (not particularly elegant,
I know) to always give back the same object code when the
same object is referenced. The Cache clients _must_ check
whether the pointer is already present before feeding it to the cache
and, if necessary, free the allocated resource. A future release will
include reference counting, thus freeing the clients from this check
and making usage transparent.
* Simplified PointerType. There's no need for children of Type_GdomeNode,
since we can use the virtualized destructor of Node for each child
* f77xml_Cache_query marked as deprecated in favour of queryPointer
* added f77xml_Cache_queryCode
* defined NullCode
* added test5
* added autotest script
* Added autoconf/make/libtool stuff
* moved src directory to libf77xml
* fixed offset in xp4t1
* moved signature for xp4t1 from cose to the correct csoe
* implemented el_getAttribute under p5t1
* added test4 to get and set attributes
* added Document::createComment
* first implementation of preprocessor parser in directory fpp
License
The license is LGPL.
FAQ
Q: Where are the previous versions?
A: in my CVS. I named the first public release 0.3.1 just for coherence
between my tags and the version available on the net. If you need previous
versions drop me a line, but don't take the changelog as a guide, for these
versions. It isn't.
Q: Is the API stable?
A: Absolutely not. I won't manage to have a stable API until version 1.0 rolls
out.
Q: If I insert new nodes, my file is not modified. Why?
A: The file is not modified because there is no API to write files yet. Currently,
the modified document is written to a file named "pippo" (a popular Italian root password,
and the Italian name for Goofy). Check this file to see your changes.
Contact
Contact me at m u n e h i r o @ f e r r a r a . l i n u x . i t (remove the
spaces; they are just a small precaution against spam). Collaboration,
suggestions and usage reports are very much appreciated.