Topic Maps Generation
Aayush Puri
2004-10-25 10:39:51 UTC
Hi,
I am trying to figure out whether there is any way to automate the
generation of Topic Maps from a set of data (assuming we are given
the set of topics). I am having difficulty finding a way to set
"associations" between the "topics". Can we apply some kind of
machine learning algorithm to have a system build a better topic map?
Any pointers or suggestions?

-Aayush
Richard Light
2004-10-25 11:58:29 UTC
Aayush,

It would help if you provided some examples of the data you want to
convert to a Topic Map. I have had no great difficulty in generating
XTM Topic Maps from XML sources using XSLT - the issue is understanding
what the relationships between the items of data actually are, in order
to render them as associations.

Richard Light
--
Richard Light
SGML/XML and Museum Information Consultancy
***@light.demon.co.uk
Lars Marius Garshol
2004-10-25 19:02:06 UTC
Hi Aayush,

There are many ways to do this, but as Richard says it all depends on
the input data, and to some extent it also depends on your ontology
(topic and association types). It's possible to use NLP techniques and
existing NLP software to do this, but that doesn't remove the need for
an understanding of the data.

Can you tell us more about your project? That would make it easier to
help you.
--
Lars Marius Garshol, Ontopian <URL: http://www.ontopia.net >
GSM: +47 98 21 55 50 <URL: http://www.garshol.priv.no >
Aayush Puri
2004-10-26 06:31:49 UTC
Thanks a lot for your feedback, Lars and Richard.
I guess I didn't explain my problem well. Let me give it another
shot.
What I am supposed to do is design and implement a system that can
generate topic maps from a given set of data. Let's suppose that the
data is stored in some database. I also have a complete list of the
topics given to me. Now I need to "study" the given datasets and
derive the relationships between the given topics. As I expected, and
as Lars suggested, some NLP techniques may need to be used.
Richard, I know that it may be fairly simple to transform XML to XTM,
but my problem is this: after reading some data source, how do I set
the relationships (associations) between the topics?
Suppose I read the text "Shakespeare authored Merchant of Venice", and
the list of given topics contains "Shakespeare" and "Merchant of
Venice". After reading that sentence, I must be able to set
associations of "authored_by" and "authored", so that the topic map can
say not only that "Shakespeare authored Merchant of Venice" but also
that "Merchant of Venice was authored by Shakespeare".

I hope I was able to clarify things a bit. Looking forward to your
replies.

Best Wishes,
-Aayush
Richard Light
2004-10-26 07:05:03 UTC
Aayush,

I'm now even less clear about what form your data takes (!) Your
original question concerned a "database": to me that would suggest
relational tables, normalization of data, etc. That is a very different
thing from a text in which you read that "Shakespeare authored Merchant
of Venice".

If it really is just textual sources that you are working from, I would
concur with Lars' suggestion of an NLP approach (and good luck!).
However, if you do have a database structure it should be possible to
work at a higher, more effective level.

A database schema embodies a set of assertions about the data it holds.
If, for example, you had a two-column table with fields "Playwright" and
"Play" containing the row

"Shakespeare", "Merchant of Venice"

then from the database schema itself you can define the topics
"playwright" and "play" in your topic map. The row itself gives you two
more topics: "Shakespeare" and "Merchant of Venice", and the database
schema implies an association between the two. The two-way linking you
are asking about at the end of your reply is achieved within a single
association, by defining the role that each topic plays in the
association: Shakespeare is linked in the role "playwright", and
Merchant of Venice in the role "play".
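
As a rough illustration of that point, here is a minimal Python sketch of
one association whose members are stored under their roles. The class name
Association, its methods, and the association type name "authorship" are
assumptions made for this example; they do not come from any Topic Maps
library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One association; each member topic is stored under the role it plays."""
    assoc_type: str
    members: tuple  # tuple of (role, topic) pairs

    def players(self, role):
        """Return the topics playing the given role."""
        return [topic for r, topic in self.members if r == role]

    def other_players(self, topic):
        """Return (role, topic) pairs for every member except the given topic."""
        return [(r, t) for r, t in self.members if t != topic]


# "authorship" is a hypothetical type name; the roles come from the column names.
authorship = Association(
    assoc_type="authorship",
    members=(("playwright", "Shakespeare"), ("play", "Merchant of Venice")),
)

# The same single association can be read from either end.
print(authorship.other_players("Shakespeare"))
print(authorship.other_players("Merchant of Venice"))
```

Starting from Shakespeare you reach the play, and starting from the play
you reach the playwright, without storing a second "authored_by" link.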

The point here is that you can apply the same logic to every single row
in the table, and programmatically derive N associations (and
2N-minus-duplicates topics) from an N-row table. This is how deriving
topic maps from a database can potentially become quite efficient.
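
The row-by-row derivation could be sketched roughly as follows. This is
only an illustration under assumptions: the function name
rows_to_topic_map, the "authorship" type name, and the two extra sample
rows are invented for the example, and a real system would read the rows
from the database rather than from a list.

```python
def rows_to_topic_map(columns, rows, assoc_type):
    """Derive topics and associations from a two-column table.

    The column names become type topics and role names, every cell value
    becomes an instance topic, and every row becomes one association.
    """
    topics = {}        # topic name -> type (None for the type topics themselves)
    associations = []
    for column in columns:
        topics[column] = None
    for row in rows:
        members = []
        for column, value in zip(columns, row):
            topics.setdefault(value, column)  # repeated values collapse into one topic
            members.append((column, value))
        associations.append((assoc_type, tuple(members)))
    return topics, associations


columns = ("playwright", "play")
rows = [  # sample rows invented for illustration
    ("Shakespeare", "Merchant of Venice"),
    ("Shakespeare", "Hamlet"),
    ("Marlowe", "Doctor Faustus"),
]
topics, associations = rows_to_topic_map(columns, rows, "authorship")
instance_topics = [name for name, kind in topics.items() if kind is not None]

print(len(associations))     # one association per row
print(len(instance_topics))  # 2N minus duplicates
```

With these three rows you get three associations and five instance topics,
because Shakespeare appears twice but is only one topic.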

Richard
--
Richard Light
SGML/XML and Museum Information Consultancy
***@light.demon.co.uk
Aayush Puri
2004-10-27 04:46:09 UTC
Richard,
Thanks for your reply. I am now starting to understand the
challenges in generating topic maps from databases and from textual
sources. I should have stated clearly that in my case the database
would mostly consist of textual sources. So I guess I need to go
the NLP way.
BTW, do you have any suggestions in this regard? NLP is quite new to
me, so I just want to get the job done rather than get into the
details of NLP.

Richard, once again, thank you for your suggestions.

Best Wishes,
-Aayush

Richard Light
2004-10-27 07:37:00 UTC
Not my field either, but in general terms you are probably going to have
to parse (in the original sense) your textual sources at some level
(subject, verb, object, that sort of thing) in order to infer the
relationships between the concepts expressed therein.
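
To give a feel for the simplest possible version of this, here is a hedged
Python sketch that uses the given topic list and plain pattern matching
instead of a real parser. The function name extract_associations is made
up for this example, and the approach is a stand-in for proper NLP: it
only handles sentences of the form "<topic> <verb phrase> <topic>".

```python
import re


def extract_associations(sentence, known_topics):
    """Very naive subject-verb-object extraction.

    Finds known topic names in the sentence and treats the words between
    two consecutive topics as the relationship. This is pattern matching,
    not parsing: passives, pronouns and nested clauses are not handled.
    """
    if not known_topics:
        return []
    # Try longer names first so that a longer topic name wins over a shorter one.
    ordered = sorted(known_topics, key=len, reverse=True)
    pattern = "|".join(re.escape(topic) for topic in ordered)
    matches = list(re.finditer(pattern, sentence))
    triples = []
    for left, right in zip(matches, matches[1:]):
        between = sentence[left.end():right.start()].strip()
        if between:
            triples.append((left.group(), between, right.group()))
    return triples


topics = ["Shakespeare", "Merchant of Venice"]
print(extract_associations("Shakespeare authored Merchant of Venice", topics))
```

For the example sentence this yields the triple (Shakespeare, authored,
Merchant of Venice), which can then be turned into a single association
with two roles. Anything beyond such simple sentences would need a real
parser.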

An alternative technique is to go in and mark up the text explicitly -
"this is a person, that is a place", etc. - and use XML-based processing
to extract the topics and associations.
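
A small Python sketch of the markup route, using the standard library's
xml.etree.ElementTree. The element names (text, statement, person, work)
and the type attribute are hypothetical markup invented for this example;
your own tag set would depend on your data.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names say what kind of thing each phrase is,
# and the enclosing <statement> carries the association type.
marked_up = """
<text>
  <statement type="authorship">
    <person>Shakespeare</person> authored <work>Merchant of Venice</work>
  </statement>
</text>
"""


def extract(xml_text):
    """Collect topics (typed by element name) and one association per statement."""
    root = ET.fromstring(xml_text)
    topics = set()
    associations = []
    for statement in root.iter("statement"):
        # Each child element is one member; its tag doubles as the role.
        members = [(child.tag, child.text.strip()) for child in statement]
        topics.update(members)
        associations.append((statement.get("type"), tuple(members)))
    return topics, associations


topics, associations = extract(marked_up)
print(sorted(topics))
print(associations)
```

Here the human doing the markup supplies the understanding of the data,
and the program only has to walk the tree.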

Either way it's going to be (a) non-trivial and (b) dependent at the
detailed level on the exact nature of your data.

Richard
--
Richard Light
SGML/XML and Museum Information Consultancy
***@light.demon.co.uk
Aayush Puri
2004-10-28 12:22:39 UTC
Thanks a lot, Richard. You helped clarify a lot of stuff!

Best Wishes,
-Aayush