XML Entities

[Last Updated: May 25, 2017]

An entity is a named piece of content or markup that is declared once and then referenced by name wherever it is needed in the XML.

There are several types of entities. We start with an example of an internal entity to introduce the concept.


Internal Entity Example


Here's the example code for those who want to do a quick copy-paste:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myDoc [
<!ENTITY author "Joe">
]>

<myDoc>
	<date>7-5-2016</date>
	<otherInfo>Author: &author;</otherInfo>
</myDoc>

Save the file above as, say, test.xml somewhere on your drive and open it in a browser (I used Chrome).

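Output: the browser displays the document tree with &author; replaced by Joe, so the otherInfo element reads "Author: Joe".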
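The same expansion can be checked outside a browser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module; the variable names are illustrative only and the document is the test.xml content held in memory.

```python
import xml.etree.ElementTree as ET

# Same content as test.xml above, held in memory as bytes.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myDoc [
<!ENTITY author "Joe">
]>

<myDoc>
    <date>7-5-2016</date>
    <otherInfo>Author: &author;</otherInfo>
</myDoc>
"""

# The parser expands the internal entity while building the tree,
# so the application never sees the &author; reference itself.
root = ET.fromstring(DOC)
other_info = root.find("otherInfo").text
print(other_info)  # Author: Joe
```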



Types of entities

  1. Predefined Entities: The XML specification defines five predefined entities that represent special characters.
    Entity    Character
    &quot;    "
    &amp;     &
    &lt;      <
    &gt;      >
    &apos;    '

  2. Named Entities: These are the named character references defined by HTML5, such as &nbsp; or &copy;. Apart from the five predefined entities, XML does not recognize them unless a DTD declares them.
  3. Numbered Entities: These are numeric character references that identify a single character by its Unicode code point, in decimal (&#169;) or hexadecimal (&#xA9;).
  4. Internal Entities: An internal entity (as in the example above) is one whose content is defined locally, in the declaration itself. Its basic purpose is to avoid duplication: the same entity reference can be used many times.
  5. External Entities: Unlike an internal entity, an external entity has its content defined in a separate file. Please see an example below.
  6. Unparsed Entities: An unparsed entity is a resource whose content may or may not be text, e.g. an image or an audio file, so the XML parser does not parse it. These entities are identified by name, and each one has an associated notation. Please see an example below.
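  7. Parameter Entities: These entities can be used only inside a DTD. For example:
    <!ENTITY % authorContent "(#PCDATA)">

    Dereferencing:
    <!ELEMENT author %authorContent;>

    The replacement text has to produce a valid declaration, which is why this entity holds a content model rather than a value such as "Joe". A parameter entity reference inside a declaration, as here, is allowed only in an external DTD subset; in the internal subset it may appear only between declarations.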
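To make the first three items of the list concrete, here is a small sketch that assumes Python's standard xml.etree.ElementTree and xml.sax.saxutils modules; the sample strings and variable names are made up for illustration. It shows that predefined entities and numeric character references need no DTD, while an HTML-only named entity is rejected.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Predefined entities and numeric character references work without a DTD.
root = ET.fromstring(
    "<note>&lt;b&gt; &amp; &quot;hi&quot; &apos;x&apos; &#169; &#xA9;</note>"
)
decoded = root.text

# Going the other way: escape() handles &, < and > by default; the extra
# mapping adds the two quote characters.
encoded = escape('<b> & "hi"', {'"': "&quot;", "'": "&apos;"})

# An HTML named entity is undefined in XML unless a DTD declares it.
try:
    ET.fromstring("<note>&copy;</note>")
    html_entity_accepted = True
except ET.ParseError:
    html_entity_accepted = False

print(decoded)
print(encoded)
print(html_entity_accepted)
```

Both &#169; and &#xA9; decode to the same character, because they name the same code point in decimal and hexadecimal.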


External Entity Example:

An external entity works just like an internal entity, except that its content is defined in a separate file.

d:/test/test.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myDoc [
<!ENTITY myExtEntity SYSTEM "test2.xml" >
]>

<myDoc>
	<nestedTag>&myExtEntity;</nestedTag>
</myDoc>

d:/test/test2.xml

<?xml version="1.0" encoding="UTF-8"?>
<tag1>
  <tag2>Some text</tag2>
</tag1>

Note that a parser that reads external entities replaces the entity reference with the content of the external file (non-validating parsers are not required to do so). Modern browsers do not do this for local files, for security reasons. External entities are also the basis of a well-known security risk, the XML external entity (XXE) attack.
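Since browsers will not show the replacement, it can be observed with a parser that allows it explicitly. This sketch assumes Python's standard xml.sax module, where external general entities are switched off by default. The FAKE_FILES dictionary, the InMemoryResolver class and the element_names function are hypothetical helpers introduced here; the dictionary stands in for d:/test so that nothing is read from disk.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

MAIN_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE myDoc [
<!ENTITY myExtEntity SYSTEM "test2.xml" >
]>
<myDoc>
  <nestedTag>&myExtEntity;</nestedTag>
</myDoc>
"""

# Stand-in for the file system: system identifier -> entity content.
FAKE_FILES = {
    "test2.xml": b"""<?xml version="1.0" encoding="UTF-8"?>
<tag1>
  <tag2>Some text</tag2>
</tag1>
""",
}


class InMemoryResolver(EntityResolver):
    """Serves external entities from FAKE_FILES instead of opening files."""

    def resolveEntity(self, publicId, systemId):
        source = InputSource(systemId)
        source.setByteStream(io.BytesIO(FAKE_FILES[systemId]))
        return source


class ElementNames(ContentHandler):
    """Records the name of every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def element_names(resolve_external):
    parser = xml.sax.make_parser()
    # Explicit opt-in (or opt-out) for external general entities.
    parser.setFeature(feature_external_ges, resolve_external)
    parser.setEntityResolver(InMemoryResolver())
    handler = ElementNames()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(MAIN_DOC))
    return handler.names


print(element_names(True))   # tag1 and tag2 appear inside nestedTag
print(element_names(False))  # the entity reference is skipped
```

Keeping the feature off unless the input is trusted, and resolving only from a known set of sources as the resolver does here, is one way to limit exposure to XXE.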


Unparsed Entity Example:

An unparsed entity is one that the XML parser does not parse; the parser only records its name, location and notation.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE logo [
<!ELEMENT logo EMPTY>
<!ATTLIST logo src ENTITY #REQUIRED>
<!ENTITY myLogoImage SYSTEM "http://www.example.com/logo.png" NDATA png>
<!NOTATION png PUBLIC "png viewer">
]>

<logo src="myLogoImage" />

How are unparsed entities useful?

The XML specification does not require any particular behavior from a client application for an unparsed entity. It is up to the application whether to load the resource, or to make any connection at all to the server where the actual image resides. At most, the parser should tell the application on whose behalf it is parsing that there is an unparsed entity at a particular URI with a particular notation, and let the application decide what, if anything, it wants to do with that information.
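The hand-off described above can be sketched with Python's standard xml.sax module, whose DTDHandler interface receives notation and unparsed entity declarations. The DeclarationCollector class is a hypothetical name used for illustration. The parser only reports the URI; it makes no request to www.example.com.

```python
import io
import xml.sax
from xml.sax.handler import DTDHandler

DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE logo [
<!ELEMENT logo EMPTY>
<!ATTLIST logo src ENTITY #REQUIRED>
<!ENTITY myLogoImage SYSTEM "http://www.example.com/logo.png" NDATA png>
<!NOTATION png PUBLIC "png viewer">
]>
<logo src="myLogoImage" />
"""


class DeclarationCollector(DTDHandler):
    """Stores what the parser reports about notations and unparsed entities."""

    def __init__(self):
        self.unparsed = {}
        self.notations = {}

    def notationDecl(self, name, publicId, systemId):
        self.notations[name] = (publicId, systemId)

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed[name] = (systemId, ndata)


collector = DeclarationCollector()
parser = xml.sax.make_parser()
parser.setDTDHandler(collector)
parser.parse(io.BytesIO(DOC))

# The application now decides what to do with the URI and the notation name.
print(collector.unparsed)
print(collector.notations)
```

An application could look up the src attribute value, myLogoImage, in collector.unparsed and then choose whether to fetch the image.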
