Discussion:
Ignoring specific characters like &gt; in the XML while doing XSLT
Dipesh Khakhkhar
2003-10-16 15:55:16 UTC
Hi,

My input XML contains some special characters like &gt;
and when I do the transformation this changes to ">", i.e. the greater-than
symbol, and somehow there is an error in the output. I am getting text output.

I removed that from the file and ran my XSL, and it gave me the correct output.

How do I ignore special characters like those? I mean, I don't want XSLT to
change them.

Any help would be appreciated.


Regards,
Dipesh


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
Wendell Piez
2003-10-16 19:58:48 UTC
Dipesh,

The character sequence "& g t ;" (no spaces), by definition, is an entity
reference in XML, and by definition it represents the character ">".

(Hey, where's Dave C or Mike B? This thing is an entity reference that
happens to be built in, not a character reference, right?)

When you say "I am getting text output" are you trying to tell us you have

<xsl:output method="text"/>

? since if so, this method is specifically required not to escape
characters such as "<" and ">" and "&" into their well-formed XML
representations "&lt;" and "&gt;" and "&amp;" but to leave them as "<" and
">" and "&" -- since it's making plain text (not XML), and these are the
plain text characters those references refer to.

Try outputting XML (method="xml") instead of text, and you'll find the
serializer will escape the thing back again. (Of course you may not like
the output for another reason.)
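
To make the difference concrete, here is a minimal sketch. The element name "item" and its content are invented for illustration and do not come from Dipesh's file. Assume the input document is <item>a &gt; b</item>.

```xml
<?xml version="1.0"?>
<!-- Hypothetical stylesheet; "item" is an invented element name. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- method="text": the serializer writes the character itself,
       so the result is:  a > b  -->
  <xsl:output method="text"/>

  <!-- By the time this template runs, the parser has already turned
       the reference into the single character ">". -->
  <xsl:template match="item">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Changing the xsl:output line to method="xml" leaves the template untouched, but the serializer then escapes on the way out. Most processors will write the text as a &gt; b, usually preceded by an XML declaration unless omit-xml-declaration="yes" is set.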

Post by Dipesh Khakhkhar
How do I ignore special characters like those? I mean, I don't want XSLT to
change them.

The XSLT processor isn't changing it; it's the parser sitting in front that
is resolving it -- from this point of view, it isn't a change, it's only
making it into what it "really is" (what it is always supposed to represent).

Cheers,
Wendell


======================================================================
Wendell Piez mailto:***@mulberrytech.com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


Dipesh Khakhkhar
2003-10-17 14:49:04 UTC
Hi Wendell,

Thanks for replying and explaining about DOE.

My output method is text and I have defined it like this:

<xsl:output method="text" encoding="UTF-8"/>

And in the input XML file I am getting a value like this at one place:

<myTag NAME="Identifier">HL-DT-ST CD-ROM GCR-8480Bema"&gt;</ATTRIBUTE>

and there is one end-of-line character after &gt; and then a question mark.
I guess this file is generated by some tool and there must be some
goofing somewhere which is producing output like this.

So do you mean that with the output method as text, I won't be able to use DOE?

I can't change my output method from text to xml. Are there any other ways to
escape those?

Thanks once again for replying.

Regards,
Dipesh


Wendell Piez
2003-10-17 18:39:11 UTC
Dipesh,

The thing is, using output method="text", it's as if DOE
(disable-output-escaping) is always on, so switching it on does nothing.
[XSLT 16.4: "The text output method ignores the disable-output-escaping
attribute, since it does not perform any output escaping."]
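
As a small illustration (a sketch only; the element name "item" is made up), both xsl:value-of instructions below write exactly the same characters when the output method is text. For the input <item>a &gt; b</item> the result is the line "a > b" twice.

```xml
<?xml version="1.0"?>
<!-- Hypothetical stylesheet; "item" is an invented element name. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="item">
    <!-- Without DOE: the text method does no escaping anyway. -->
    <xsl:value-of select="."/>
    <xsl:text>&#10;</xsl:text>
    <!-- With DOE: the attribute is ignored under method="text". -->
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```
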

Post by Dipesh Khakhkhar
My output method is text and I have defined it like this.
<xsl:output method="text" encoding="UTF-8"/>
I guess this file is generated by some tool and there must be some
goofing somewhere which is producing output like this.
So do you mean that with the output method as text, I won't be able to use DOE?
I can't change my output method from text to xml. Are there any other ways to
escape those?

If your output is otherwise plain text but you want ">" to appear as
"&gt;", you can use a recursive string-replacement routine in your
stylesheet (check exslt.org) to escape the problematic characters.
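
Such a routine might look like the following in XSLT 1.0. This is a sketch under assumptions, not a tested drop-in: the template name "replace", its parameter names, and the matched element "item" are all hypothetical. For the input <item>a &gt; b</item> it writes a &gt; b as plain text.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical helper: writes $text with every occurrence
       of $from replaced by $to. -->
  <xsl:template name="replace">
    <xsl:param name="text"/>
    <xsl:param name="from"/>
    <xsl:param name="to"/>
    <xsl:choose>
      <!-- Guard against an empty $from, which would recurse forever. -->
      <xsl:when test="$from != '' and contains($text, $from)">
        <xsl:value-of select="substring-before($text, $from)"/>
        <xsl:value-of select="$to"/>
        <!-- Recurse on whatever follows the first match. -->
        <xsl:call-template name="replace">
          <xsl:with-param name="text" select="substring-after($text, $from)"/>
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$to"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- "item" is an invented element name for this example. -->
  <xsl:template match="item">
    <xsl:call-template name="replace">
      <xsl:with-param name="text" select="."/>
      <!-- '&gt;' is the one-character string ">" once the stylesheet is parsed;
           '&amp;gt;' is the four-character string to write in its place. -->
      <xsl:with-param name="from" select="'&gt;'"/>
      <xsl:with-param name="to" select="'&amp;gt;'"/>
    </xsl:call-template>
  </xsl:template>

</xsl:stylesheet>
```

If "&" and "<" need the same treatment, the calls have to be chained, with "&" replaced first so that the ampersands introduced by the later replacements are not escaped a second time. Where the processor supports EXSLT, its str:replace function may save writing the recursion by hand.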

Cheers,
Wendell

