Discussion:
Muenchian grouping on external XML files
Paul Spence
2004-01-19 18:31:53 UTC
Is there any way I can use keys when calling on external XML files using
the document function? I am trying to use the Muenchian grouping
technique to list unique content values for an element, but the values
exist in a document that is external to the one I'm processing.

More detail below (apologies for long post):

I am developing an application that publishes XML files (marked up in
TEI) as static HTML files. I am processing one set of XML files which
create the general web pages themselves, e.g. snippet from 'genweb.xml'

-----------------XML FILE--------------
<TEI.2>
...
<text>
...
<p>This is an index of people:</p>
<divGen id="personIndex" />
<p>And some more text ...</p>
</text>
</TEI.2>
-------------------------------

My TEI publishing process uses the <divGen> element in this XML file to
call for an index to be created at a particular point: in this case, an
index of people.

I have another series of (primary source) XML files that contains
various references to people. Snippet from an example 'primary.xml':

-----------------XML FILE--------------
<TEI.2>
...
<text>
...
<p>Some text with mentions of <rs type="person">person 1</rs> and
<rs type="person">person 2</rs>.</p>
<p>And some more text ...</p>
<p>Another mention of <rs type="person">person 1</rs></p>
</text>
</TEI.2>
-------------------------------

I group these primary source files together as a collection in
'master_primary.xml'. I then want to create an index of people mentioned
throughout the collection, with links to the individual mention.
Something like this:

-----------------HTML output--------------
Index of people mentioned

[each asterisk takes you to an individual mention of the person in
question]

person 1: * * *
person 2: * * * * *
-------------------------------

I normally do this with Muenchian grouping, but can't get it working on
external XML files. See following XSLT snippet, which I think crashes
because of the match pattern in <xsl:key>:

-----------------XSLT FILE--------------
<xsl:key name="myKeyName" match="document('master_primary.xml')//rs"
use="normalize-space(.)" />

<xsl:template match="divGen">
<xsl:if test="@id='personIndex'">
<xsl:for-each
select="document('../../xml/02_master/master_object.xml')//rs[generate-id(.)=generate-id(key('myKeyName',
normalize-space(.))[1])][@type='person']">
NOW DO SOMETHING ELSE ...
</xsl:for-each>
</xsl:if>
</xsl:template>
---------------------------------------------

I can think of other methods, using intermediate XML output, but would
prefer to avoid them if possible.

I am using Saxon processor version 6.5.2.

Thanks in advance,
Paul

---------------------------------------
Paul Spence
Centre for Computing in the Humanities
King's College London


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
G. Ken Holman
2004-01-19 20:30:40 UTC
Post by Paul Spence
Is there any way I can use keys when calling on external XML files using
the document function?
Key tables only have document-wide scope.
Post by Paul Spence
I am trying to use the Muenchian grouping
technique to list unique content values for an element, but the values
exist in a document that is external to the one I'm processing.
The technique uses generate-id() which is unique for all nodes in the set
of source node trees, thus you'll never equate the node from one tree to
the node in another tree by using generate-id().
Not enough, though, to work with for a working example.
Post by Paul Spence
-----------------XSLT FILE--------------
<xsl:key name="myKeyName" match="document('master_primary.xml')//rs"
use="normalize-space(.)" />
<xsl:template match="divGen">
<xsl:for-each
select="document('../../xml/02_master/master_object.xml')//rs[generate-id(.)=generate-id(key('myKeyName',
The above test will *always* produce an empty set: the generated
identifiers for nodes in master_primary.xml (in your key table) will never
equal any of the generated identifiers in your master_object.xml files.
Post by Paul Spence
I can think of other methods, using intermediate XML output, but would
prefer to avoid them if possible.
You might be stuck doing so ... grouping techniques employ node identity,
not node equality, and your nodes are from different documents.
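
For the case where every mention sits in one external document, a sketch along the following lines might avoid the intermediate output. It relies on two XSLT 1.0 rules: a match pattern may not call document(), and key() returns nodes from the document that contains the current context node. The stylesheet therefore declares the key on plain rs elements and switches context to the external document before grouping. The key name "people", the use of 'master_primary.xml' as the single source, and the link scheme are assumptions for illustration, not taken from the original post.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- A key pattern may not call document(); it matches rs elements in
       whichever document key() is later evaluated against. -->
  <xsl:key name="people" match="rs[@type='person']"
           use="normalize-space(.)"/>

  <xsl:template match="divGen[@id='personIndex']">
    <!-- Assumption: all mentions live in this one external document,
         resolved relative to the stylesheet. -->
    <xsl:for-each select="document('master_primary.xml')">
      <!-- The context node is now the external root, so key() searches
           that document rather than the one containing divGen. -->
      <xsl:for-each select="//rs[@type='person']
                                [generate-id(.) =
                                 generate-id(key('people', normalize-space(.))[1])]">
        <xsl:sort select="normalize-space(.)"/>
        <p>
          <xsl:value-of select="normalize-space(.)"/>
          <xsl:text>: </xsl:text>
          <!-- One asterisk per mention; the href scheme is hypothetical. -->
          <xsl:for-each select="key('people', normalize-space(.))">
            <a href="#{generate-id(.)}">*</a>
            <xsl:text> </xsl:text>
          </xsl:for-each>
        </p>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Both generate-id() calls here are applied to nodes of the same external tree, so the identity comparison can succeed. If the mentions are spread over several separately loaded documents, this sketch groups each document on its own and an intermediate merge may still be needed.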

I hope this helps.

.......................... Ken

--
North America (Washington, DC): 3-day XSLT/2-day XSL-FO 2004-03-15
- (San Francisco, CA): 3-day XSLT/2-day XSL-FO 2004-03-22
Asia (Hong Kong, China): 3-day XSLT/2-day XSL-FO 2004-05-17
Europe (Bremen, Germany): 3-day XSLT/2-day XSL-FO 2004-05-24
Instructor-led on-site corporate, government & user group training
for XSLT and XSL-FO world-wide: please contact us for the details

G. Ken Holman mailto:***@CraneSoftwrights.com
Crane Softwrights Ltd. http://www.CraneSoftwrights.com/s/
Box 266, Kars, Ontario CANADA K0A-2E0 +1(613)489-0999 (F:-0995)
Male Breast Cancer Awareness http://www.CraneSoftwrights.com/s/bc


Thomas V. Nielsen
2004-01-19 21:56:59 UTC
I think something like this must have been done a thousand times before,
but right now my brain and my search capabilities are too tired.

I have an XML document, e.g.

<content>
<item>bedrock</item>
<item>alphaville</item>
<item>Zebra</item>
<item>365 BC</item>
...
</content>


What I would like is to transform this document into an HTML table like this,
with four columns:

-----------------------------
| # | E | J | Q |
|365 BC| | | R |
| A | F | K | S |
|alpha | | L | T |
| B | G | M | U |
|bedroc| H | N | V |
| C | I | O | X |
| D | | P | Z |
-----------------------------

Right now I can't figure out either how to sort the items into groups, or
how many groups to put into each column to make it look nice.

Any hints are appreciated.

<thomas/>


G. Ken Holman
2004-01-19 22:35:22 UTC
Post by Thomas V. Nielsen
I think something like this must have been done a thousand times before,
I've done this for my own stuff.
Post by Thomas V. Nielsen
I have an XML document, e.g.
...
What I would like is to transform this document into an HTML table like this,
with four columns
-----------------------------
| # | E | J | Q |
|365 BC| | | R |
| A | F | K | S |
|alpha | | L | T |
| B | G | M | U |
|bedroc| H | N | V |
| C | I | O | X |
| D | | P | Z |
-----------------------------
Right now I can't figure out either how to sort the items into groups,
That part is below. Note that the entities are not required, but I find them
helpful for some grouping tasks. Each of these is used only once here,
but should they be needed elsewhere, the entities are handy and ready
to use.
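
For comparison, here is a sketch of the same first-letter key with the entity references expanded by hand, which may be easier to read when the expression is needed in only one place. The template that lists the "A" bucket is purely illustrative and not part of the posted stylesheet.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Upper-case the first character; any leading digit maps to "#". -->
  <xsl:key name="firsts" match="item"
           use="translate(substring(.,1,1),
                          'abcdefghijklmnopqrstuvwxyz0123456789',
                          'ABCDEFGHIJKLMNOPQRSTUVWXYZ##########')"/>

  <!-- Illustrative use only: list the items filed under "A". -->
  <xsl:template match="/">
    <xsl:for-each select="key('firsts','A')">
      <xsl:sort/>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```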
Post by Thomas V. Nielsen
or find out how many groups to put into each column to make it look nice.
*That* part will take two passes ... I think it would be a lot easier than
trying a recursive call.

I used text output; you can change it to put out XML for the second pass.

I hope this helps.

............................... Ken

T:\ftemp>type thomas.xml
<content>
<item>bedrock</item>
<item>alphaville</item>
<item>Zebra</item>
<item>365 BC</item>
<item>alpha2</item>
<item>alphaz</item>
...
</content>

T:\ftemp>type thomas.xsl
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE xsl:stylesheet [
<!ENTITY lower 'abcdefghijklmnopqrstuvwxyz0123456789'>
<!ENTITY upper 'ABCDEFGHIJKLMNOPQRSTUVWXYZ##########'>
<!ENTITY first 'translate(substring(.,1,1),"&lower;","&upper;")'>
]>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:i="internal"
version="1.0">

<xsl:key name="firsts" match="item" use="&first;"/>

<xsl:output method="text"/>

<i:index>
<l>#</l> <l>A</l> <l>B</l> <l>C</l> <l>D</l> <l>E</l>
<l>F</l> <l>G</l> <l>H</l> <l>I</l> <l>J</l> <l>K</l> <l>L</l>
<l>M</l> <l>N</l> <l>O</l> <l>P</l> <l>Q</l> <l>R</l> <l>S</l>
<l>T</l> <l>U</l> <l>V</l> <l>W</l> <l>X</l> <l>Y</l> <l>Z</l>
</i:index>

<xsl:template match="/">
<xsl:variable name="input" select="/"/>
<xsl:for-each select="document('')/*/i:index/l">
<xsl:value-of select="."/><xsl:text>
</xsl:text>
<xsl:variable name="index" select="."/>
<xsl:for-each select="$input">
<xsl:for-each select="key('firsts',$index)">
<xsl:sort/>
<xsl:value-of select="."/><xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:for-each>
</xsl:for-each>
</xsl:template>

</xsl:stylesheet>
T:\ftemp>saxon thomas.xml thomas.xsl
#
365 BC
A
alpha2
alphaville
alphaz
B
bedrock
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
Zebra

T:\ftemp>rem Done!
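
A possible shape for the second pass, offered only as a sketch: it assumes the first pass has been changed to emit each output line as a line element inside a lines wrapper (both element names are hypothetical), and it fills a four-column HTML table column by column, as in the requested layout. The row count is the number of lines divided by the number of columns, rounded up.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Assumed first-pass output: <lines><line>#</line><line>365 BC</line>...</lines> -->
  <xsl:variable name="cols" select="4"/>
  <xsl:variable name="lines" select="/lines/line"/>
  <xsl:variable name="rows" select="ceiling(count($lines) div $cols)"/>

  <xsl:template match="/">
    <table>
      <!-- The first $rows lines each start one table row. -->
      <xsl:for-each select="$lines[position() &lt;= $rows]">
        <xsl:variable name="row" select="position()"/>
        <tr>
          <!-- Pick lines row, row+rows, row+2*rows, ... for this row's cells. -->
          <xsl:for-each select="$lines[(position() - $row) mod $rows = 0]">
            <td><xsl:value-of select="."/></td>
          </xsl:for-each>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

With this arrangement the last column may be shorter than the others, so the final rows can have fewer than four cells. Keeping a letter heading and its entries in the same column would need extra logic that this sketch does not attempt.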

