Discussion:
Help, please - re DTD element declaration. How to define: content required
Ingrid
2004-08-18 22:04:15 UTC
Can anybody let me know if - in a DTD - it can be specified that an
element must have content, i.e. that in the instance document data must
have been entered between the opening and closing tags for the document
to validate?

In my experience if the element is declared to contain #PCDATA,
parsers will not throw up an error if nothing has been filled in.

I know that in xsd Schema I can write my own regular expression and
add it as a datatype to the element definition, demanding, say,
character content between the opening and closing tags for validation.

Ingrid
Tad McClellan
2004-08-18 23:01:17 UTC
Post by Ingrid
Can anybody let me know if - in a DTD - it can be specified that an
element must have content,
No, it cannot.
Post by Ingrid
In my experience if the element is declared to contain #PCDATA,
parsers will not throw up an error if nothing has been filled in.
#PCDATA allows zero characters, so no, you cannot specify that
the empty string is disallowed as "content".
--
Tad McClellan SGML consulting
***@augustmail.com Perl programming
Fort Worth, Texas
Jan Roland Eriksson
2004-08-19 00:52:19 UTC
Post by Ingrid
Can anybody let me know if - in a DTD - it can be specified that an
element must have content
No; a 'Document Type Definition' can only specify the correct flow of
elements into other elements, i.e. a DTD specifies the correct syntax
for the markup itself. DTDs are not concerned with the flow of actual
data.

So in essence, the name 'Document Type Definition' is a long-standing
misnomer (for more than one reason); now let's move on to the Olympic
games, I hear that HyTime architectures have hit the show :-)

[...]
Post by Ingrid
In my experience if the element is declared to contain #PCDATA,
parsers will not throw up an error if nothing has been filled in.
That is correct parser behaviour. And think about it: data can be all or
nothing. Data that is nothing sits at the zero point on the number
line. Why do you want to treat 'zero' differently from e.g.
-1 or +1?
Post by Ingrid
I know that in xsd Schema I can write my own regular expression...
It was never the intention of the original creators of SGML to make a
programming language. SGML is a "meta" language by definition, and every
application of SGML needs an "SGML declaration" to be defined first if
there is to be a markup language of any practical use at all. XML is
just one of many possible applications of SGML.

I'm tired and I need to hit the sack, see you around...
--
Rex
Ingrid
2004-08-19 10:24:47 UTC
Thanks for clarifying!!

So I guess in the debate Schema vs. DTD - Schema wins hands down when
it comes to data verification in instance documents.

I need to create a transcription template (kind of like a form) for
the electronic representation of a manuscript. Traditionally this has
always been done via a TEI/EAD DTD... But I want to make maximum use of
parser validation of the data entered into the XML instance document by
transcribers, so Schema seems to be a good option.

Any thoughts?

Ingrid
Jan Roland Eriksson
2004-08-19 12:04:43 UTC
Post by Ingrid
So I guess in the debate Schema vs. DTD - Schema wins hands down when
it comes to data verification in instance documents.
If you want to call it a "win"; well yes, but it comes at some
considerable cost in the form of added complexity.
Post by Ingrid
I need to create a transcription template...
...want to make use of maximum parser validation of the data...
[...]
Post by Ingrid
Any thoughts?
Maybe I'm missing something but why do you need a "one shot" solution?

To me it looks like a two-step problem to solve. First parse and
validate the document instance to produce a parse tree, then let some
form of lexical analyzer have a go at the data in that tree.

Or am I wrong?
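
A rough sketch of that two-step idea, in Python with only the standard
library, might look like the following. It is an illustration under
stated assumptions, not a recommended implementation: the element names
and the REQUIRED_TEXT set are hypothetical, and xml.etree.ElementTree
only checks well-formedness, so in practice step 1 would be done by a
validating parser working against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription fragment; element names are illustrative only.
SAMPLE = """<transcription>
  <title>Letter to a friend</title>
  <author></author>
  <line>   </line>
  <line>Dear Sir,</line>
</transcription>"""

# Elements that must carry character data (an assumption for this sketch).
REQUIRED_TEXT = {"title", "author", "line"}


def find_empty_elements(xml_text, required=REQUIRED_TEXT):
    """Step 1: parse the instance. Step 2: report required elements without text."""
    # Step 1: build the parse tree (raises ParseError if not well-formed).
    # Note: this stdlib parser does NOT validate against a DTD.
    root = ET.fromstring(xml_text)

    # Step 2: walk the tree and inspect the character data itself.
    problems = []
    for elem in root.iter():
        if elem.tag in required:
            # itertext() gathers text from the element and its descendants.
            text = "".join(elem.itertext())
            if not text.strip():
                problems.append(elem.tag)
    return problems


if __name__ == "__main__":
    print(find_empty_elements(SAMPLE))
```

Here whitespace-only content is treated as empty, which is a design
choice of the sketch; a DTD would accept both of the flagged elements.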
--
Rex
Ingrid
2004-08-20 09:47:45 UTC
Agreed!

However, a lot of people seem to think that it will only be a matter
of time until Schema replaces DTD use... and since Schemas have
extended datatyping capabilities, I just want to see how far I can go
with it in terms of validating data input..

I do realise that Schemas are by no means sufficient to offer a 'one
shot' solution to my task - and wonder if it is indeed worth the
bother learning such a complex document type definition language, as
DTD use is so well established and DTDs are comparatively
straightforward to create ...
Jan Roland Eriksson
2004-08-23 21:33:31 UTC
Post by Ingrid
Agreed!
However, a lot of people seem to think that it will only be a matter
of time until Schema replaces DTD use...
That "attitude" always comes along with anything that is considered to
be the "new thing". It usually stays in that "hyped" status until a
sufficient number of people have learned about the drawbacks, complexity
and extra work associated with the new thing.
Post by Ingrid
...and since Schemas have extended datatyping capabilities,
I just want to see how far I can go with it in terms of validating
data input..
Acquiring real knowledge of how to use a "new thing" is the only way to
find out if it's suitable for solving one's problem. I'm by no means an
"expert" on Schema myself, but AFAIK it will only allow you to do some
basic type tests on data, e.g. differentiating between purely numeric
entries and character string entries, or checking that character strings
fall inside some predefined range of character values or string lengths.
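
To give a feel for the kind of tests meant here, the sketch below
emulates a few XML Schema facets (xs:minLength, xs:pattern, and
xs:minInclusive/xs:maxInclusive on an integer) as plain Python
functions. It is only an approximation for illustration: the function
names and sample values are made up, no schema processor is involved,
and Python's regular expression dialect is not identical to the one
used by xs:pattern.

```python
import re


def check_min_length(value, minimum=1):
    """Rough analogue of an xs:minLength facet on a string type."""
    return len(value) >= minimum


def check_pattern(value, pattern):
    """Rough analogue of xs:pattern: the whole value must match."""
    return re.fullmatch(pattern, value) is not None


def check_integer_range(value, low, high):
    """Rough analogue of xs:integer with minInclusive/maxInclusive."""
    text = value.strip()
    # Accept only an optional sign followed by ASCII digits.
    if re.fullmatch(r"[+-]?[0-9]+", text) is None:
        return False
    return low <= int(text) <= high


if __name__ == "__main__":
    # Hypothetical values a transcriber might enter.
    print(check_min_length(""))                           # empty content
    print(check_pattern("fol. 12r", r"fol\. [0-9]+[rv]"))  # folio reference
    print(check_integer_range("1604", 1000, 2004))        # year in range
```

With a minimum length of 1, the first check is the "content required"
rule from the start of this thread, which a DTD cannot express.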

If such capabilities are enough to solve your problem, you may want to
start working with it.

If you OTOH need things like spell checking and/or grammar checking
against the specific use of word sequences in some human readable
language, you will still need a second processor even after a schema
based parsing stage.

I would venture to say that DTD-based parsing may leave its output
easier to access for such a second-stage processor, but I leave it to
more experienced regulars here in the NG to correct me on that point.
Post by Ingrid
I do realise that Schemas are by no means sufficient to offer a 'one
shot' solution to my task - and wonder if it is indeed worth the
bother learning such a complex document type definition language, as
DTD use is so well established and DTDs are comparatively
straightforward to create ...
If you are on a specific time schedule for your task, I would say "stay
with DTDs" and spend your time finding or writing the one you need. Then
go on to find the real data processor you need to do the data checking
for you.

If you end up with spare time on your hands, it may then be of interest
to learn a bit more about what can be done with DTDs :-)

http://www.isogen.com/papers/archintro.html

If you still have time left over, learning about XML schema may be the
next step.

My 2 cents...
--
Rex