Discussion:
Is SGML dead?
Tin Gherdanarra
2006-12-04 16:16:53 UTC
I've just completed an SGML-project and have started to like
SGML.

To me it looks as if SGML has been superseded by XML. SGML
is treated as legacy data, with just one application: Being
converted to XML. This observation stems from anecdotal
and not very trustworthy evidence.

I'd love to dig deeper into the matter in order to help
companies convert their SGML repositories into XML, but
I don't know if this is a widespread need that justifies
learning a lot about SGML. What's also interesting for me:
What I've learned in my SGML-project, and what other
people have told me, is that real-world SGML repositories cannot
be parsed with standard SGML-processors. This could mean
that there is a market for home-grown SGML-tools that can
be modified to my heart's content in order to adapt them
to weirdly embellished repositories. Large, old data repositories
in general have a lot of cruft and hairs.
Others have dealt with such SGML-problems before, and I'd like
to do this, too, but I have the feeling it is too late
for this. (I'm a compiler/parser-nerd). So my question is:
Have I missed the SGML-bandwagon? Please check:

[ ] "Just what the world needs -- more SGML-nerds."
[ ] "Just what the world needs! More SGML-nerds!"

Google and various job-billboards don't show any job-opportunities
for people like me, or so it seems, but maybe this industry
works by word-of-mouth?

Kind regards
Tin
Luch
2006-12-04 17:36:50 UTC
My company's software deals with the aircraft maintenance business. The
aircraft manufacturers (Airbus, Boeing, Bombardier, Embraer) still use
SGML to write (or distribute / deploy) their manuals.
And yes, our software as a first step converts it to XML so we can work
with it better.

My suggestion: Get a job with one of them to help them convert over to
using XML themselves.
Tin Gherdanarra
2006-12-04 20:46:55 UTC
Post by Luch
My company's software deals with the aircraft maintenance business. The
aircraft manufacturers (Airbus, Boeing, Bombardier, Embraer) still use
SGML to write (or distribute / deploy) their manuals.
This sounds interesting because it supports my hypothesis that
SGML is good for data entry clerks. XML isn't. Unfortunately, markup
fell into disgrace as soon as supposedly half-literate cubiclists
got PCs, Excel and mice.
Post by Luch
And yes, our software as a first step converts it to XML so we can work
with it better.
... because XML is more Java-readable.
Post by Luch
My suggestion: Get a job with one of them to help them convert over to
using XML themselves.
Thanks (:
Tad McClellan
2006-12-05 03:16:47 UTC
Post by Tin Gherdanarra
Post by Luch
My company's software deals with the aircraft maintenance business. The
aircraft manufacturers (Airbus, Boeing, Bombardier, Embraer) still use
SGML to write (or distribute / deploy) their manuals.
This sounds interesting because it supports my hypothesis that
SGML is good for data entry clerks. XML isn't.
You are remarkably insightful.
Post by Tin Gherdanarra
Unfortunately, markup
fell into disgrace as soon as supposedly half-literate cubiclists
got PCs, Excel and mice.
You are astoundingly insightful!


(too bad you've missed the SGML boat, we could have used you. :-)
--
Tad McClellan SGML consulting
***@augustmail.com Perl programming
Fort Worth, Texas
Peter Flynn
2006-12-04 21:59:02 UTC
Post by Tin Gherdanarra
I've just completed an SGML-project and have started to like
SGML.
Look for hairs on the palms of your hands :-)
http://www.flightlab.com/~joe/sgml/faq-not.txt
Post by Tin Gherdanarra
To me it looks as if SGML has been superseded by XML.
Not quite superseded, but XML is where the development and growth is.
There are still lots of applications in SGML, with the owners having
no immediate intention of changing to XML.
Post by Tin Gherdanarra
SGML is treated as legacy data, with just one application: Being
converted to XML. This observation stems from anecdotal and not very
trustworthy evidence.
That's why it's slightly inaccurate.
Post by Tin Gherdanarra
I'd love to dig deeper into the matter in order to help companies
convert their SGML repositories into XML, but I don't know if this is
a widespread need that justifies learning a lot about SGML.
It needs quite a lot of SGML to do this successfully. There are rather a
lot of hidden wrinkles in SGML.
Post by Tin Gherdanarra
From what I've learned in my SGML-project, and what other
people have told me, is that real-world SGML repositories cannot
be parsed with standard SGML-processors.
Untrue. If it can't be parsed with (for example) nsgmls, then it's
probably not SGML, or it uses one or more of the very arcane features of
SGML which have never been implemented in a common parser.
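To make the conversion route concrete: nsgmls writes a line-oriented (ESIS) event stream, and a small script can turn that stream into XML. The Python sketch below is only an illustration under stated assumptions. It assumes the nsgmls output conventions as I understand them: "(" starts an element, ")" ends one, "-" carries data, "A" lines give attributes for the next start tag, and data uses backslash escapes such as \n and \\. It handles only that subset, drops SDATA delimiters, ignores every other line type, and lowercases names by choice. The function name esis_to_xml is hypothetical.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Escapes assumed in ESIS data: \\ (backslash), \n (record end),
# \| (SDATA delimiter, dropped here), \nnn (octal character code).
_ESCAPE_RE = re.compile(r"\\(\\|n|\||[0-7]{3})")


def _unescape(data):
    def repl(match):
        token = match.group(1)
        if token == "\\":
            return "\\"
        if token == "n":
            return "\n"
        if token == "|":
            return ""
        return chr(int(token, 8))
    return _ESCAPE_RE.sub(repl, data)


def esis_to_xml(esis):
    """Convert a subset of nsgmls-style ESIS text to an XML string."""
    out = []
    pending = []  # attributes waiting for the next start tag
    for line in esis.splitlines():
        if not line:
            continue
        command, rest = line[0], line[1:]
        if command == "A":
            parts = rest.split(" ", 2)
            name, kind = parts[0], parts[1]
            if kind == "IMPLIED":
                continue  # no value was supplied; omit the attribute
            value = parts[2] if len(parts) > 2 else ""
            pending.append((name.lower(), _unescape(value)))
        elif command == "(":
            attrs = "".join(
                " {}={}".format(n, quoteattr(v)) for n, v in pending
            )
            out.append("<{}{}>".format(rest.lower(), attrs))
            pending = []
        elif command == ")":
            out.append("</{}>".format(rest.lower()))
        elif command == "-":
            out.append(escape(_unescape(rest)))
        # Other line types (entities, processing instructions, the final
        # conformance marker) are ignored in this sketch.
    return "".join(out)
```

A real conversion would also have to deal with external entities, notations, and elements declared EMPTY, which is where the hidden wrinkles tend to show up.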
Post by Tin Gherdanarra
This could mean
that there is a market for home-grown SGML-tools that can
be modified to my heart's content in order to adapt it
to weirdly embellished repositories. Large, old data repositories
in general have a lot of cruft and hairs.
Those tend to be the result of having to retro-fit SGML to
poorly-designed original data models, or the result of later tinkering,
rather than through any fault in SGML itself (of which it has plenty,
but most of them are addressed in XML).
Post by Tin Gherdanarra
Others have dealt with such SGML-problems before, and I'd like
to do this, too, but I have the feeling it is too late
Have I missed the SGML-bandwagon?
Not quite, but I suspect the effort wouldn't be worth it unless you
tapped some hitherto unsuspected rich vein of SGML.
Post by Tin Gherdanarra
Google and various job-billboards don't show any job-opportunities
for people like me, or so it seems, but maybe this industry
works by word-of-mouth?
That, and places like this newsgroup. Plus the major conferences where
SGML-users tend to congregate: the XML conference (this week in Boston);
the Extreme Markup conference (summer, Montreal); the TEI meetings; the
joint annual ACH/ALLC conferences, etc.

Welcome aboard.

///Peter
--
XML FAQ: http://xml.silmaril.ie/
William F Hammond
2006-12-05 22:21:00 UTC
Post by Peter Flynn
Not quite superseded, but XML is where the development and growth is.
There are still lots of applications in SGML, with the owners having
no immediate intention of changing to XML.
I would say that SGML has only been superseded by XML for public
serving on the web. All other use is still sane and often quite
valuable.

Meanwhile a number of newbies have joined the XML bandwagon displaying
a kind of religious fervor that is dismissive of SGML.

Alas, where will they be when SDATA is finally brought into XML? :-)
(See discussions in and around the w3c-math list about named character
entities -- a growing chorus demanding that web browsers recognize
a barrel of CDATA character entities by name.)
Post by Peter Flynn
SGML is treated as legacy data, with just one application: Being
converted to XML. This observation stems from anecdotal and not very
trustworthy evidence.
That's why it's slightly inaccurate.
I'd love to dig deeper into the matter in order to help companies
convert their SGML repositories into XML, but I don't know if this is
a widespread need that justifies learning a lot about SGML.
It needs quite a lot of SGML to do this successfully. There are rather
a lot of hidden wrinkles in SGML.
Conversion from SGML to XML can be a useful stage at the front end of
a formatting or translating pipeline. But an SGML source has higher
content value than any XML derived from it.

A wise person will always save original source.
Post by Peter Flynn
From what I've learned in my SGML-project, and what other
people have told me, is that real-world SGML repositories cannot
be parsed with standard SGML-processors.
Untrue. If it can't be parsed with (for example) nsgmls, then it's
probably not SGML, or it uses one or more of the very arcane features
of SGML which have never been implemented in a common parser.
Agreed.

-- Bill
Peter Flynn
2006-12-06 22:56:01 UTC
Post by William F Hammond
Post by Peter Flynn
Not quite superseded, but XML is where the development and growth is.
There are still lots of applications in SGML, with the owners having
no immediate intention of changing to XML.
I would say that SGML has only been superseded by XML for public
serving on the web. All other use is still sane and often quite
valuable.
Meanwhile a number of newbies have joined the XML bandwagon displaying
a kind of religious fervor that is dismissive of SGML.
Alas, where will they be when SDATA is finally brought into XML? :-)
That's not going to happen, I'm afraid. There was an extensive
discussion at Extreme Markup this summer about the bits of SGML that we
ought not to have left out of XML, and I was surprised (and a little
saddened) by the vituperative nature of the response from those in a
position to do something about it -- people whose opinion I respect, but
who have an insufficient background in normal document publishing to
understand the importance of some of the features required. Their view
was that the W3C is run by corporate commercial entities who see XML as
a data-vehicle, and are blind to the needs of the author, publisher, and
editor. A lot of BS about "making a business case" for such changes as
if XML was a sellable product, when the case was already made for them
20 years ago and more.
Post by William F Hammond
(See discussions in and around the w3c-math list about named character
entities -- a growing chorus demanding that web browsers recognize
a barrel of CDATA character entities by name.)
That won't happen either, until browsers start using DTDs or Hell
freezes over, whichever is the first. The browser authors see no case
for it, and don't understand XML sufficiently to make a judgment anyway:
no change from 1993 there, then
(http://www.oasis-open.org/cover/sgmlwww.html)

And the argument most often adduced (that it would take too many cycles
or too much bandwidth) is completely bogus: if Panorama was able to
download and tokenise the entire TEI DTD in real time on a PII under
Win3 in an acceptable timeframe (which it demonstrably was), then under
modern conditions with bandwidth and cycles coming out of our ears it
should be possible to perform a similarly realtime non-validating parse
of MathML with a few hundred character entities.

The answer is server-side generation of HTML using numeric references,
provided the browser makers can be persuaded to recognize them, and
operating-system makers be persuaded to ship suitable fonts as standard.
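A rough sketch of that server-side step, in Python: named character entities are rewritten as numeric references before the HTML goes out, so the browser never needs a DTD. This assumes the names come from Python's standard html.entities table (the HTML 4 set), which covers far fewer names than the MathML sets; a real pipeline would load its own table. The function name to_numeric_references is hypothetical.

```python
import re
from html.entities import name2codepoint

# Entities that must stay symbolic so the markup itself is not disturbed.
_KEEP = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_numeric_references(text):
    """Replace known named entities with decimal numeric references."""
    def replace(match):
        name = match.group(1)
        if name in _KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or reserved names alone
        return "&#{};".format(name2codepoint[name])
    return _ENTITY_RE.sub(replace, text)
```

Unknown names are passed through untouched rather than guessed at, so a missing table entry shows up in the output instead of being silently mangled.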
Post by William F Hammond
Conversion from SGML to XML can be a useful stage at the front end of
a formatting or translating pipeline. But an SGML source has higher
content value than any XML derived from it.
Not necessarily. In the documents I deal with, it's the element markup
which carries probably 95% of the value of the document, and that is
unchanged between SGML and XML (apart from NETs). But yes, some of the
document management features of SGML source will always out-perform XML.
Post by William F Hammond
A wise person will always save original source.
"But I've got the _original_ Word files...you know, the ones I got from
OCR which you converted to {SG|X}ML for me..."

*shudder* :-)

///Peter, *not* crossposting to c.t.x; no need to frighten the horses
William F Hammond
2006-12-07 21:24:22 UTC
Post by William F Hammond
Alas, where will they be when SDATA is finally brought into XML? :-)
That's not going to happen, I'm afraid. . . .
The train has, I think, pulled somewhat out of the station though it
is not very far down the track. Evidence:

file://local.firefox.tree:/res/entityTables/

http://dig.csail.mit.edu/breadcrumbs/node/166
The answer is server-side generation of HTML using numeric references,
provided the browser makers can be persuaded to recognize them, and
operating-system makers be persuaded to ship suitable fonts as
standard.
But, yes, this is what I recommend for the end of the online branch
of an authoring pipeline.
Post by William F Hammond
Conversion from SGML to XML can be a useful stage at the front end of
a formatting or translating pipeline. But an SGML source has higher
content value than any XML derived from it.
Not necessarily. In the documents I deal with, it's the element markup
which carries probably 95% of the value of the document, and that is
unchanged between SGML and XML (apart from NETs). But yes, some of the
document management features of SGML source will always out-perform XML.
I didn't say 'necessarily'. The use of 'can be' allows for exceptions.
When I referred to xml derived from sgml, I didn't mean to indicate
human intervention to repair nonsense. And the word 'higher' should be
construed here in the sense of weak inequality, i.e., at least as good
as.

Human intervention changes the picture. And human intervention always
introduces the possibility of human error.
///Peter, *not* crossposting to c.t.x; no need to frighten the horses
(c.t.x = comp.text.tex)

It's still by far the largest community using markup.

Cheers.

-- Bill
Peter Flynn
2006-12-10 20:03:00 UTC
Post by William F Hammond
Human intervention changes the picture. And human intervention always
introduces the possibility of human error.
Never was a truer word...
Post by William F Hammond
Post by Peter Flynn
///Peter, *not* crossposting to c.t.x; no need to frighten the horses
(c.t.x = comp.text.tex)
Noooo...comp.text.xml

///Peter
