<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The discovery blog &#187; Steve Mallen</title>
	<atom:link href="http://blogs.semantico.com/discovery-blog/author/steve/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.semantico.com/discovery-blog</link>
	<description>Semantico looks at online publishing</description>
	<lastBuildDate>Thu, 02 Sep 2010 10:22:31 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<image>
			<title>The discovery blog</title>
			<url>http://blogs.semantico.com/discovery-blog/wp-content/uploads/2008/11/logo64.png</url>
			<link>http://blogs.semantico.com/discovery-blog</link>
			<width>64</width>
			<height>64</height>
			<description>Semantico looks at online publishing</description>
		</image>
		<item>
		<title>Seven tips for better XML data quality</title>
		<link>http://blogs.semantico.com/discovery-blog/2009/09/seven-tips-for-better-xml-data-quality/</link>
		<comments>http://blogs.semantico.com/discovery-blog/2009/09/seven-tips-for-better-xml-data-quality/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 10:42:22 +0000</pubDate>
		<dc:creator>Steve Mallen</dc:creator>
				<category><![CDATA[Project Management]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://blogs.semantico.com/discovery-blog/?p=638</guid>
		<description><![CDATA[Data is at the heart of every online publishing application. Ensuring that data is accurately and comprehensively marked up is the key to providing a good online experience for those searching and viewing your content.
The following is a small collection of tips for making sure your XML data is in tip-top condition.

1. Use Semantic Markup
Specify [...]]]></description>
			<content:encoded><![CDATA[<p><img class="size-full wp-image-644 alignleft" src="http://blogs.semantico.com/discovery-blog/wp-content/uploads/2009/09/iStock_000004306014XSmall.jpg" alt="XML data" width="247" height="185" />Data is at the heart of every online publishing application. Ensuring that data is accurately and comprehensively marked up is the key to providing a good online experience for those searching and viewing your content.</p>
<p>The following is a small collection of tips for making sure your XML data is in tip-top condition.</p>
<p><span id="more-638"></span></p>
<h3>1. Use Semantic Markup</h3>
<p>Specify &#8220;what&#8221;, not &#8220;how&#8221;, in your data.  If an item is italic in print, consider using a tag that describes &#8220;why&#8221; it is italic.  Perhaps it is a title in a reference.  Using &lt;title&gt; rather than &lt;i&gt; describes the intent rather than the output, so the data can be re-used in many different contexts.  If titles later need to be shown in bold, only the stylesheet has to change, not the data.  The names of all tags and attributes should reflect the information they contain rather than how they&#8217;ll be presented to an end user.</p>
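<p>As a rough illustration, the two fragments below mark up the same reference.  The element names (&lt;reference&gt;, &lt;author&gt;, &lt;title&gt;) are invented for this example and are not taken from any particular standard.</p>
<pre><code>&lt;!-- Presentational: records how the text looks in print --&gt;
&lt;p&gt;Harold, E. R. &lt;i&gt;Effective XML&lt;/i&gt;&lt;/p&gt;

&lt;!-- Semantic: records what each piece of text is --&gt;
&lt;reference&gt;
  &lt;author&gt;Harold, E. R.&lt;/author&gt;
  &lt;title&gt;Effective XML&lt;/title&gt;
&lt;/reference&gt;</code></pre>
<p>With the second form, a stylesheet can decide whether titles appear in italics, bold or anything else, and a search index can treat authors and titles as separate fields.</p>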
<h3>2. Mark up each unit of information</h3>
<p>Insufficient markup is one of the biggest problems with online content.  If a piece of data is not marked up, software has no reliable way to detect and process it.  Tag each unit of information explicitly, and avoid structure that is only implied by punctuation, word order or layout.</p>
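<p>A small sketch of the difference, using a made-up dictionary entry.  The element names (&lt;entry&gt;, &lt;headword&gt;, &lt;pos&gt;, &lt;definition&gt;) are assumptions chosen for illustration only.</p>
<pre><code>&lt;!-- Implicit structure: the parts can only be guessed from position and punctuation --&gt;
&lt;entry&gt;aardvark n. a nocturnal burrowing mammal&lt;/entry&gt;

&lt;!-- Explicit structure: each unit of information has its own tag --&gt;
&lt;entry&gt;
  &lt;headword&gt;aardvark&lt;/headword&gt;
  &lt;pos&gt;n.&lt;/pos&gt;
  &lt;definition&gt;a nocturnal burrowing mammal&lt;/definition&gt;
&lt;/entry&gt;</code></pre>
<p>Only the second form lets an application search headwords separately from definitions, or filter entries by part of speech.</p>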
<h3>3. Validate your XML</h3>
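<p>Use a DTD or XML Schema to validate your markup.  Validation catches structural errors, such as missing required elements or misspelled tag names, before the data reaches your application.  This is vital for good QA.</p>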
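<p>As a minimal sketch, here is a document that carries its own DTD in an internal subset.  The content model is hypothetical: it assumes a reference must contain one or more authors followed by exactly one title.</p>
<pre><code>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE reference [
  &lt;!-- A reference holds one or more authors, then exactly one title --&gt;
  &lt;!ELEMENT reference (author+, title)&gt;
  &lt;!ELEMENT author (#PCDATA)&gt;
  &lt;!ELEMENT title (#PCDATA)&gt;
]&gt;
&lt;reference&gt;
  &lt;author&gt;Harold, E. R.&lt;/author&gt;
  &lt;title&gt;Effective XML&lt;/title&gt;
&lt;/reference&gt;</code></pre>
<p>A validating parser should accept this document and reject one where the &lt;title&gt; is missing or placed before the &lt;author&gt;.  One way to check it, assuming libxml2 is installed and the file is saved as reference.xml, is <code>xmllint --valid --noout reference.xml</code>.</p>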
<h3>4. Use opaque IDs</h3>
<p>An ID should uniquely identify a piece of content, and be persistent.  Do not encode other information, such as a title, date or position, in the ID: that information might change in future, forcing the ID to change with it.</p>
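<p>For illustration, compare the two IDs below.  Both the element name and the ID values are invented for this example.</p>
<pre><code>&lt;!-- Fragile: the ID encodes the title and edition, so a retitled
     or revised work would need a new ID and every link to it would break --&gt;
&lt;book id="effective-xml-1st-edition"&gt;...&lt;/book&gt;

&lt;!-- Opaque: the ID carries no meaning and never needs to change --&gt;
&lt;book id="b104872"&gt;...&lt;/book&gt;</code></pre>
<p>Anything a reader or application needs to know about the book belongs in elements and attributes, where it can be updated without affecting the ID.</p>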
<h3>5. Use elements for data, attributes for metadata</h3>
<p>This isn&#8217;t a strict rule, but whenever in doubt, use an element.  Attributes should not contain structured data &#8211; structured information is easier to process as elements.  Some good candidates for attributes are: IDs, URLs, revision dates, types.</p>
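<p>A sketch of the distinction, with hypothetical element and attribute names.  In the first fragment a list of authors is packed into one attribute and has to be split apart by custom code; in the second, the attributes hold simple metadata and the structured data lives in elements.</p>
<pre><code>&lt;!-- Structured data squeezed into an attribute: hard to process --&gt;
&lt;article authors="Chaudhri, Akmal B.; Rashid, Awais"/&gt;

&lt;!-- Attributes for metadata (ID, type, revision date), elements for data --&gt;
&lt;article id="a2031" type="review" revised="2009-09-24"&gt;
  &lt;author&gt;
    &lt;surname&gt;Chaudhri&lt;/surname&gt;
    &lt;forenames&gt;Akmal B.&lt;/forenames&gt;
  &lt;/author&gt;
  &lt;author&gt;
    &lt;surname&gt;Rashid&lt;/surname&gt;
    &lt;forenames&gt;Awais&lt;/forenames&gt;
  &lt;/author&gt;
&lt;/article&gt;</code></pre>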
<h3>6. Separate print and online content</h3>
<p>Things like page numbers are irrelevant for online use.  Consider using attributes to indicate whether content is meant for print or online use.</p>
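<p>One possible way to do this is sketched below.  The <code>medium</code> attribute and the element names are assumptions made up for this example, not part of any standard.</p>
<pre><code>&lt;section id="s0412"&gt;
  &lt;!-- Only meaningful in the printed edition --&gt;
  &lt;pagebreak medium="print" number="87"/&gt;
  &lt;para&gt;Validation is described
    &lt;xref medium="print"&gt;on page 12&lt;/xref&gt;
    &lt;xref medium="online" target="s0103"&gt;in the section on validation&lt;/xref&gt;.
  &lt;/para&gt;
&lt;/section&gt;</code></pre>
<p>An online stylesheet can then drop everything marked <code>medium="print"</code>, and a print stylesheet can do the reverse, while both are generated from the same source file.</p>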
<h3>7. Don&#8217;t reinvent the wheel</h3>
<p>Use an established standard such as TEI, NLM, DocBook or Dublin Core rather than making up your own.</p>
<p>Good, structured data is the key to a functionally rich user experience online.  If you want to find out more, I can heartily recommend these titles:</p>
<p><a href="http://books.google.co.uk/books?id=GBT61nOT058C&amp;pg=PR4&amp;lpg=PR4&amp;dq=%22Effective+XML:+50+Specific+Ways+to+Improve+Your+XML%22,+by+Elliotte+Rusty+Harold&amp;source=bl&amp;ots=UFn6NATh0A&amp;sig=bdw1P2-igD48WzgoqsIsR3IQJWw&amp;hl=en&amp;ei=yHi_SrePK5rLjAfN8YhA&amp;sa=X&amp;oi=book_result&amp;ct=result&amp;resnum=1#v=onepage&amp;q=&amp;f=false" target="_self">&#8220;Effective XML: 50 Specific Ways to Improve Your XML&#8221;, by Elliotte Rusty Harold</a><br />
<a href="http://books.google.co.uk/books?id=7LNhdOeQulQC&amp;pg=PP1&amp;dq=%22XML+Data+Management:+Native+XML+and+XML-Enabled+Database+Systems%22,+by+Akmal+B.+Chaudhri%3B+Awais+Rashid%3B+Roberto+Zicari&amp;ei=N3m_StiGIqfkyQTfxaTKDw#v=onepage&amp;q=&amp;f=false" target="_self">&#8220;XML Data Management: Native XML and XML-Enabled Database Systems&#8221;, by Akmal B. Chaudhri; Awais Rashid; Roberto Zicari</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.semantico.com/discovery-blog/2009/09/seven-tips-for-better-xml-data-quality/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
