Post by Eve M. BehrThis is going to be a bit frustrating for you, but unfortunately I
don't have the programming background to understand quite what you're
[...]
Post by Eve M. BehrPost by Peter Flynn1. a copy of the TEI Lite DTD and/or Schema that the documents reference
I might have the DTD but I have no idea what to do with it.
You shouldn't need to do anything with it except to ensure that it is in
the location specified at the top of the XML files. If your XML files say
<!DOCTYPE TEI.2 SYSTEM "teilitex.dtd">
then software will expect the DTD (that file) to be in the same
directory as the XML file. If the XML file says
<!DOCTYPE TEI.2 SYSTEM "file:///C:/mydtds/teilitex.dtd">
then the DTD must be in C:\mydtds
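How the location on the DOCTYPE line gets resolved can be sketched in a few lines of Python (Python is not used anywhere in this thread; it is only a convenient way to illustrate the rule). The location of the XML file below is a made-up example, and real XML software may differ in the details.

```python
from urllib.parse import urljoin


def resolve_dtd(xml_file_uri: str, system_id: str) -> str:
    """Return the URI where software would look for the DTD.

    A relative system identifier is resolved against the location of the
    XML file; an absolute one is used unchanged.
    """
    return urljoin(xml_file_uri, system_id)


# Hypothetical location of the XML file, for illustration only.
xml_file = "file:///C:/books/mybook.xml"

# Bare file name: looked for next to the XML file.
same_dir = resolve_dtd(xml_file, "teilitex.dtd")
# Full file: URI: looked for exactly where it says.
fixed_dir = resolve_dtd(xml_file, "file:///C:/mydtds/teilitex.dtd")

print(same_dir)
print(fixed_dir)
```

The first result points into the directory of the XML file, the second into C:/mydtds, whatever directory the XML file itself lives in.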
Post by Eve M. BehrI pasted
the text into a notepad file and named it Teilite.dtd.
OK so long as that's what the XML files refer to it as (including the
capital T...URIs are case-sensitive).
Post by Eve M. BehrNo idea how it
fits into the picture.
Software reads that first line, looks to find the DTD where it says, and
then reads it. That's all you need.
Post by Eve M. BehrI have no idea what a Schema is.
Forget it for the moment. It's an alternative way of expressing the same
information.
Post by Eve M. BehrPost by Peter Flynn2. a knowledge of XML
I'm able to create tables that apparently pass validation (so the
project leader says--
Excellent.
Post by Eve M. Behrthough he is the one who is telling the team to trust him and forget
about being able to see our work.
Silly man. Technically he's right: if the documents you create are
correctly constructed (valid) *and* you've put the right data in the
right places, then all will indeed be well when it gets formatted.
But most people like to see how it is progressing. One (crude) way is to
write a CSS stylesheet to explain to your browser what gets formatted
and how. This requires a knowledge of CSS. To get a more complex format
you would need to write an XSLT stylesheet...more learning, more work.
Post by Eve M. BehrHe says it's impossible to render XML human-readable.
There is a technical term for this: bullshit. I render XML
human-readable all day, every day. Look at the XML FAQ for an example.
OK, a vast table of impenetrably complex data intended for the internal
management of, say, a numerically-controlled turret lathe, is not going
to be meaningful if you render it human-readable. But normal text
documents are easily done.
Post by Eve M. BehrWhen I open it up with XML Marker it shows
nicely behaved collapsible trees with no error indications.
So far so good.
Post by Eve M. BehrWhen the tags are converted to HTML tags the tables show as nicely
formatted tables.
So far so better.
Post by Eve M. BehrPost by Peter Flynn3. an understanding of the TEI Lite markup
TEI Lite looks like the sort of style sheets I'm familiar with from
working with Brian Reid's Scribe and with LaTeX. I like
it.
Yep, it's fine. It has its limitations: it's intended for the
transcription of literary and historical documents, so it's not a good
choice for the documentation of a nuclear power station. But a large
number of other common DTDs work in a very similar manner, just with
different names for the element types.
Post by Eve M. BehrPost by Peter Flynn4. an XML editor, preferably one with an XSLT IDE
I'm not sure I'd know what an XSLT IDE is.
An Integrated Development Environment for the Extensible Stylesheet
Language (Transformations). Basically a glorified editor that lets you
write stylesheets and see them applied to a document, all from the one
screen. Just a convenience.
Post by Eve M. BehrI did just download Emacs
as per http://www.tei-c.org/Software/tei-emacs/ but so far it just
looks like a notepad type editor (like a programmer IDE interface that
hilights variables and stuff -- I've done VBA coding). Aside from
that I have XML Marker and XML Notepad.
But if they've included the right add-ons in that distribution (which I
*think* they did, knowing the people who did it), it ought to understand
the markup. If you open your XML file, does it colorise the markup and
switch to XML-mode (check the status line at the bottom of the window)?
Does it have a "Markup" menu? It should. Does it complain that it can't
find TEI.2? (That means your DTD is not where the XML file says it ought
to be.) If it doesn't automatically spot the fact that it's XML, then
either the distribution package doesn't have the right bells and
whistles, or it has not installed itself properly. Last time I used it,
however, it all seemed to work fine. I use Emacs all the time for all my
XML editing.
Post by Eve M. BehrPost by Peter Flynn5. a knowledge of XSLT
6. an XSLT processor (eg Saxon)
I tried reading the Saxon webpage and unfortunately none of it makes
any sense at all.
You don't need to understand any of it, just install the software. It's
written in Java, so you'll need an up-to-date Java (download it from
java.sun.com).
Post by Eve M. BehrI understand it's a command line language but
that's about all I could fathom from the page.
Not a language. Saxon is just a processor. It takes an XML file and an
XSLT stylesheet and spits out whatever format the stylesheet says to.
Post by Eve M. BehrI have lots of time, but lots of uses for it also (like preparing and
reading stuff that is human readable).
I have a shortage of time, and like to spend what I have reading and
drinking wine :-)
Post by Eve M. BehrPost by Peter FlynnXSLT is an XML processing language expressed in XML itself, for
transforming XML into other formats, including HTML, other XML formats,
and plaintext (eg LaTeX, CSV, etc).
So basically, you have to convert the XML into something else?
Not necessarily, but usually.
Post by Eve M. BehrWhy
not simply write in that something else, like LaTeX which already
imposes structure and can swap stylesheets.
Because XML is much more robust than LaTeX. XML is a *provable* format:
any validator can test an XML file against the DTD and say if it's right
or not. The only program that will really check a LaTeX file is TeX
itself, and all it can do is typeset it (which it does really well).
LaTeX's structuring is non-rigorous (limp and floppy). In LaTeX, I can write
\section{stuff}
blah blah
\subsubsection{more stuff}
blah blah
and get
1 stuff
blah blah
1.0.1 more stuff
blah blah
which is obviously wrong, but LaTeX won't stop you. In any properly
constructed DTD you simply can't do that. And with XML you can do all kinds
of other stuff, like retrieve the third para of the fourteenth
subsection of section 3 of chapter 4, which is virtually impossible in
LaTeX.
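As a rough illustration of that kind of retrieval, here is a minimal Python sketch using the standard library's ElementTree. The element names (book, chapter, section, subsection, para) and the generated sample document are invented for the example; a TEI document would use its own element names.

```python
import xml.etree.ElementTree as ET


def build_sample() -> ET.Element:
    """Build a hypothetical book: 4 chapters x 3 sections x 14 subsections x 3 paras."""
    book = ET.Element("book")
    for c in range(1, 5):
        chapter = ET.SubElement(book, "chapter")
        for s in range(1, 4):
            section = ET.SubElement(chapter, "section")
            for ss in range(1, 15):
                subsection = ET.SubElement(section, "subsection")
                for p in range(1, 4):
                    para = ET.SubElement(subsection, "para")
                    para.text = f"chapter {c}, section {s}, subsection {ss}, para {p}"
    return book


book = build_sample()

# Positions in the path are 1-based, as in XPath:
# third para of the fourteenth subsection of section 3 of chapter 4.
target = book.find("chapter[4]/section[3]/subsection[14]/para[3]")
print(target.text)
```

The path is the whole query: because the structure is explicit in the markup, no searching through formatting commands is needed.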
Post by Eve M. BehrAnd has a compiler to
produce human readable text. Alternatively, how is this all that much
different from my writing replace macros in Word to convert from TEI
(e.g., <cell></cell> to HTML <td></td>).
Because -- for example -- perhaps some kinds of table aren't really
meant to be rendered as tables? See the last paragraph of
http://xml.silmaril.ie/appendix/glossary/#tables
<table>
<tr>
<th>Chocolate</th>
<td>A major food group</td>
</tr>
<tr>
<th>XML</th>
<td>A non-edible language</td>
</tr>
</table>
is probably best rendered as
\begin{description}
\item[Chocolate] A major food group
\item[XML] A non-edible language
\end{description}
Don't fall into the trap of assuming that things only ever have one
possible rendering. You also might not want to print your document, but
speak it through an audio generator. Or format it for some other
purpose. LaTeX can be considered an extreme case of premature binding.
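To make the table example concrete, here is a small Python sketch (an illustration only, not the method used by any TEI project) that reads the two-column table and writes the description list instead. It assumes every row has exactly one th and one td, and it does no escaping of characters that are special in LaTeX.

```python
import xml.etree.ElementTree as ET

TABLE = """<table>
<tr><th>Chocolate</th><td>A major food group</td></tr>
<tr><th>XML</th><td>A non-edible language</td></tr>
</table>"""


def table_to_description(table_xml: str) -> str:
    """Render a term/definition table as a LaTeX description list."""
    table = ET.fromstring(table_xml)
    lines = [r"\begin{description}"]
    for row in table.findall("tr"):
        term = row.findtext("th", default="").strip()
        definition = row.findtext("td", default="").strip()
        # No LaTeX escaping is attempted here; real text would need it.
        lines.append(rf"\item[{term}] {definition}")
    lines.append(r"\end{description}")
    return "\n".join(lines)


latex = table_to_description(TABLE)
print(latex)
```

The same source could just as well be sent to an HTML table by a different function, which is the point: the rendering is chosen afterwards, not fixed in the master copy.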
There's also the problem that different markup elements mean different
things in different places. <head> inside <table> is actually a caption,
whereas <head> inside a <div3> is a subsubsection title. Trying to do
that with Word macros leads to insanity.
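The context problem can be shown with a short Python sketch. The sample fragment and the ROLE_BY_PARENT lookup table are invented for the illustration; an XSLT stylesheet would express the same idea with separate templates for each context.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<div3>
<head>More stuff</head>
<table>
<head>Food groups</head>
<row><cell>Chocolate</cell></row>
</table>
</div3>"""

# What a <head> means depends on the element that contains it.
ROLE_BY_PARENT = {"table": "caption", "div3": "subsubsection title"}


def head_roles(xml_text: str) -> list[tuple[str, str]]:
    """Return (text, role) for every <head>, decided by its parent element."""
    root = ET.fromstring(xml_text)
    # ElementTree elements do not know their parent, so build a lookup first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    roles = []
    for head in root.iter("head"):
        parent = parent_of[head]
        roles.append((head.text, ROLE_BY_PARENT.get(parent.tag, "unknown")))
    return roles


roles = head_roles(SAMPLE)
for text, role in roles:
    print(f"{text}: {role}")
```

A search-and-replace macro sees only the string "head" and cannot tell the two cases apart; a processor that reads the tree can.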
Post by Eve M. BehrI love LaTeX, and I'm beginning to think I might want to invest my
time using it in its native form.
I use it all the time...for formatting my XML. But my master copies are
always tucked away safely in XML, not LaTeX.
Post by Eve M. BehrI have MikTeX and feel really
spoiled because when I did LaTeX officially there were no screen
previews (you had to kill trees before you could see what your stuff
looked like).
That must have been nearly as long ago as when I started using it. My
first previewer was a Tektronix graphics storage tube. Even the very
first DOS versions of TeX had DVI viewers for the CGA and Hercules
graphics cards.
Post by Eve M. BehrBut some of the projects I want to work on for
Gutenberg are insisting on this TEI standard so ...
Preservability and reusability. Trust me on this.
[xml example]
Post by Eve M. BehrThis I get. This is like the text I actually type in LaTeX. Looks
different but essentially the same sort of thing.
OK.
[xslt example]
Post by Eve M. BehrThis I recognize also, though I only dipped into the Scribe/LaTeX
databases because I was unhappy with what the designers thought
constituted a proper manuscript. Once I was happy, I spent most of my
time cranking out manuscript.
OK too.
Post by Eve M. BehrPost by Peter Flynn$ java -jar /usr/local/saxon/b8.5/saxon8.jar -o test.html test.xml test.xsl
Here you lose me completely. Looks like a conversion utility though.
That's exactly what it is. With an IDE (remember IDEs) you don't have to
type this stuff, just click on the button.
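For anyone who does want to see what the pieces of that command mean, here is a small Python sketch that assembles it part by part. The function name saxon_command is made up for the illustration, and the option layout is simply the Saxon 8 command quoted above; other Saxon releases may use a different syntax, so check the documentation for the version you install.

```python
def saxon_command(jar: str, source: str, stylesheet: str, output: str) -> list[str]:
    """Assemble the Saxon 8 command line quoted above, piece by piece."""
    return [
        "java", "-jar", jar,  # run the Saxon program with Java
        "-o", output,         # the file to write the result to
        source,               # the XML document to transform
        stylesheet,           # the XSLT stylesheet to apply
    ]


command = saxon_command(
    "/usr/local/saxon/b8.5/saxon8.jar", "test.xml", "test.xsl", "test.html"
)
print(" ".join(command))
```

Passing the list to subprocess.run(command, check=True) would actually run the transformation, provided Java and the Saxon jar are installed at that path.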
[xslt latex example]
Post by Eve M. BehrHere's what I need explained. Obviously these are two files (the
style sheet and the document). Where do I put them and what are their
extensions?
yourfile.xml is the XML file
whatever.xsl is an XSLT stylesheet (maybe called tei2html.xsl or
tei2latex.xsl)
Post by Eve M. BehrAnd how do I open up the document in a way that it sees
the stylesheet (Explorer won't do it and neither will Notepad).
<?xml-stylesheet href="tei2latex.xsl" type="text/xsl"?>
inserted in your XML document (at the top, between the DOCTYPE line and
the <TEI.2> start-tag).
Explorer, Firefox, and most other modern browsers will work with this,
but their implementations are incomplete and clumsy. Only a few XML
editors can render using XSLT in real time (because XML processing
normally needs the whole document to be complete -- for example to
resolve ID/IDREF links [similar to \label and \ref] -- and in an
*editor*, designed for writing a document, a file might simply not yet
be complete...and thus partially unrenderable).
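A quick way to check whether a file really carries that processing instruction is sketched below in Python, using the standard library's minidom. The function name and the two sample documents are invented for the illustration.

```python
from xml.dom import minidom


def stylesheet_links(xml_text: str) -> list[str]:
    """Return the contents of every xml-stylesheet PI found at the top level."""
    doc = minidom.parseString(xml_text)
    links = []
    for node in doc.childNodes:
        is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == "xml-stylesheet":
            links.append(node.data)
    return links


WITH_PI = """<?xml version="1.0"?>
<?xml-stylesheet href="tei2latex.xsl" type="text/xsl"?>
<doc><title>Hello world!</title></doc>"""

WITHOUT_PI = """<?xml version="1.0"?>
<doc><title>Hello world!</title></doc>"""

found = stylesheet_links(WITH_PI)
missing = stylesheet_links(WITHOUT_PI)
print(found)
print(missing)
```

An empty result means the browser has nothing to go on, which is when it falls back to showing the bare document tree.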
Post by Eve M. BehrPost by Peter Flynna. use a browser that reads XML (Firefox, MSIE, DocZilla, etc)
b. open the file in an editor
c. have the file served to you by an XML server (eg AxKit, Cocoon,
PropelX, etc)
But as you have already deduced, you need a stylesheet to express how
you want it to look, because XML typically does not carry styling
information, only content-descriptive markup.
OK, again, assume I have the stylesheet, how do I point the document
at it in a way that the browser will understand it.
Use the <?xml-stylesheet...?> Processing Instruction as above, or use an
external processor like Saxon, as above or in an IDE. Or write some
chunk of code in XML.NET which opens the document, binds the stylesheet,
and runs whatever it is that .NET runs to do XSLT (forgive me, I'm not a
Windows user).
Post by Eve M. BehrThe browser when
it reads the XML file either gives me the collapsible tree or the text
all muddled together. Nothing that looks like anything.
<?xml-stylesheet...?> is your friend...but be warned that browser XSLT
is flaky. For an example, see http://xml.silmaril.ie/hotels.xml
Post by Eve M. BehrPost by Peter FlynnThis can be done with CSS, provided you don't want to change the order
of the document, or do anything cute like generating a table of
contents. For that you have to use XSLT, because it processes the file
and can therefore reach into it and cherry-pick the bits you want where
you want them, which CSS can't do. You can of course also apply CSS post
hoc to the generated HTML.
Right now I'd be happy to see a section of text with something bolded
and perhaps an inset quote.
Try that hotels.xml link above.
Post by Eve M. BehrPost by Peter FlynnPost by Eve M. BehrI have downloaded a stylesheet that seems to have the correct tags, but
nothing I do with my browser will pull up a TEI book so that I can see
the tables and the poetry and the formatted block quotes as they
should be.
Is it a CSS stylesheet or an XSLT stylesheet?
It's a CSS stylesheet. I haven't encountered XSLT stylesheets. But
the CSS stylesheet seemed to have all the tags I had in my document.
Post by Peter FlynnAdd ONE of the following to the XML document if it's not already there,
<?xml-stylesheet href="foo.xsl" type="text/xsl"?>
<?xml-stylesheet href="foo.css" type="text/css"?>
Tried that. Again, both Notepad, and Explorer just mashed the text
together or showed raw code.
Forget Notepad utterly. It's completely dumb and won't do anything
except write shopping lists. Explorer should do it, but it may need the
documents to be served from a server, not opened locally. Install
Firefox and try that instead.
Post by Eve M. BehrOK, skip "right" one. How about any stylesheet that will produce
something recognizably formatted.
If you have a CSS stylesheet that someone claims will format your
document, ask them how to do it. If you want to try it yourself, create
myfile.xml:
<?xml version="1.0"?>
<?xml-stylesheet href="test.css" type="text/css"?>
<doc>
<title>Hello world!</title>
<text>A test</text>
</doc>
and test.css
title { display:block; font-weight:bold; font-size:24pt }
text { display:block; margin-top:12pt; font-size:12pt; }
and open myfile.xml. But as I said, you may have to do this through a
web server if your browser doesn't handle XML when opened from disk.
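If the browser does insist on a server, one low-effort option is a throwaway local one. The sketch below uses Python's built-in http.server module; the function name make_server, the port number, and the directory in the usage note are assumptions for the illustration, and nothing in this thread depends on Python.

```python
import functools
import http.server


def make_server(directory: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Create (but do not start) a web server for one directory.

    The server listens on 127.0.0.1 only, so it is reachable from this
    machine and not from the network.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server("C:/tei").serve_forever() (with whatever directory holds myfile.xml and test.css) and then opening http://127.0.0.1:8000/myfile.xml in the browser serves both files over HTTP instead of from disk; Ctrl-C stops it.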
Post by Eve M. BehrLike with LaTeX, when I started I
was just happy if I could produce stuff with recognizable headings and
equations that looked like equations with the proper structure. It
was only later that I worked out stylesheets that didn't reflect
Lamport's silly prejudices and instead reflected *my* silly
prejudices.
We all do this...
Post by Eve M. BehrPost by Peter FlynnA browser will honor the <?xml-stylesheet...?> PI and apply the styling
to the document as much as browsers are capable of doing so (they are
notoriously flaky handling XML, which is why all major TEI projects use
XML servers which do the transformation server-side).
Post by Eve M. BehrEven the person who wrote "A Gentle Guide to XML" admitted that
the Guide had to be written in HTML because he had no idea how to
produce something readable in anything else.
Is this an Urban Myth? I can't see Michael or Lou authoring in HTML.
Could have been a different "Gentle Guide".
From www.tei-c.org it should be one of the chapters in the TEI
Guidelines. I'm not aware of any other one.
Post by Eve M. BehrJust something I stumbled along in my "wee hours I'm not
going to bed till I figure something out" sessions.
Post by Peter FlynnPost by Eve M. BehrSurely the goal of the Text Encoding Initiative and SGML isn't that
Certainly not. The XML/SGML is the master storage format. You can
download it and write your own style for display, or use whatever a
particular project provides online (see for example http://celt.ucc.ie)
OK, I tried this site and downloaded their Doczilla in response to
their statement "Your designated program for SGML files will open the
file." combined with the fact that Doczilla seemed to be what they
were talking about. Doczilla did open up the SGML file I downloaded
for testing. But, again, it only showed me lots and lots of code.
I must check that. It worked last time I tried it with the Panorama
plugin for Netscape 3 (which has the same fundamentals as DocZilla) but
it's possible that things have deteriorated: those web pages are in
transition.
Post by Eve M. BehrI assume that the site wants you to use their converted HTML files and
leave the SGML files for computer experts.
No, for experts in Irish literature and history. But the SGML files are
in the process of being converted to XML, and a new server is being
installed which will do much more, using XSLT, than was possible in SGML
using Omnimark.
Post by Eve M. BehrPost by Peter FlynnPost by Eve M. BehrInstead of the poem within.
For an example of TEI encoding a Shakespearian sonnet in the original
Klingon, with markup in Elvish, see http://research.silmaril.ie/xml/
Post by Eve M. BehrHow do I get access to the poem as it was intended to be seen
Do we know how "it was intended to be seen"? By whom? The author (now
dead)? The publisher (which one)? The TEI project (I doubt they have a
view of how it "ought" to be seen).
By someone who isn't a computer programmer? With text being bolded
instead of tags indicating it should be bolded
That's what stylesheets do. Exactly the same principle as in LaTeX.
Post by Eve M. Behrand just "use your
imagination".
I don't think I've ever said that. I assume this is someone else.
Post by Eve M. BehrI hate to start sounding snippy again, but I am
beginning to hate the sight of XML tags.
I think you have been neglected here. Let's try to repair it.
Post by Eve M. BehrI'm sure people have produced stylesheets, but I clearly am extracting
the information wrong. (I can't find style sheet files, just pages of
code and I try to guess where the file starts and ends).
I'm not sure where you're looking, but try
http://www.tei-c.org/Stylesheets/teic/
But you will need to install Saxon or a similar processor to do this.
I've never heard of anyone using CSS to format TEI (like using a
teaspoon to move a mountain) but it's possible, just infinitely fiddly
and tedious.
Post by Eve M. BehrWell Peter. Through the course of typing this response (the last
three hours) I have tried various things. I even thought I finally
had the solution. Unfortunately, all I can get is "This XML file does
not appear to have any style information associated with it. The
document tree is shown below."
<?xml-stylesheet...?>
Post by Eve M. BehrAnd then the hated tree structure below
it. The XML FAQ seemed to be the most helpful, however all their
examples are to documents on their website and not on a hard drive.
Right. That's what XML was designed for, although it works perfectly
well off disk. But your *browser* may not. My Firefox seems to be OK.
Post by Eve M. BehrThey also don't show their style sheet. I'll have to look at other
pages and hope I land on one that uses an intro line that corresponds
to my setup.
Try those two test files above.
Post by Eve M. BehrI think I'll take a break and do something easy (like transcribe some
maths into LaTeX). But I will figure it out somehow. And when I do I
will then be faced with the task of explaining it to people who would
like to help with the project, but have even less computer background
than I do.
More documentation is *always* needed. But XML requires absolute
precision: what you specify (names, directories, etc) must be 100%
correct or nothing at all will work.
Post by Eve M. BehrOnce I figure some more out I might be able to ask decent questions.
By all means...
///Peter, away on vacation for a few weeks now...byebye