<?xml version="1.0" encoding="iso-8859-1"?>
<?xml-stylesheet type="text/xsl" href="/xslt/html.xsl"?>
<page
	title="software - xmltv_merge - merge XMLTV files into one | robert klep"
	description="xmltv_merge - merge multiple XMLTV-files into one"
	keywords="xmltv, mythtv, mythfilldatabase, merge, xml, lxml, python"
	>
	<content>

		<paragraph title="NAME">
		<link url="dl/xmltv_merge">xmltv_merge</link> &#8211; merge multiple XMLTV-files into one
		</paragraph>
		<paragraph title="SYNOPSIS">
			<code>xmltv_merge xmltvfile [xmltvfile ...]</code>
		</paragraph>
		<paragraph title="DESCRIPTION">
     xmltv_merge merges the contents of multiple XMLTV files. This is
     useful if you grab XMLTV data from multiple sources and want to create
     one combined file, for example for use with `mythfilldatabase'.
		</paragraph>
		<paragraph>
     It also tries to output a file that adheres to the XMLTV DTD, so it
     can double as a filter that creates conforming files from
     non-conforming XMLTV files.
		</paragraph>
		<paragraph>
     xmltv_merge can, roughly, handle multiple XMLTV files containing
     data for the same channel. It merges all information for a particular
     program with a simple algorithm: for each field in the program
     structure, it keeps the value with the longest string.
		</paragraph>
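		<paragraph>
     The longest-string rule can be sketched as follows. This is only an
     illustration, not the code from the script: it assumes each program is
     a plain dictionary of strings (the script itself works on XML via
     lxml), and the name merge_program and the tie-breaking rule are
     hypothetical.
			<code><![CDATA[
```python
def merge_program(first, second):
    """Merge two program records field by field, keeping the longer string.

    Both arguments are plain dicts mapping field names to strings; this
    representation is an assumption made for the sketch.
    """
    merged = {}
    # Visit the fields of the first record, then any extra fields that
    # only the second record has.
    fields = list(first) + [name for name in second if name not in first]
    for name in fields:
        a = first.get(name, "")
        b = second.get(name, "")
        # Assumption: on a tie, the value from the first record is kept.
        merged[name] = a if len(a) >= len(b) else b
    return merged


prog_a = {"title": "News", "desc": "Evening news."}
prog_b = {"title": "The News", "desc": "", "category": "News"}
merged = merge_program(prog_a, prog_b)
print(merged)
```
]]></code>
     Here the merged record takes the longer title from the second record,
     the description from the first, and the category that only the second
     record supplies.
		</paragraph>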
		<paragraph>
     However, it does so only if it can match the program across the
     different XMLTV sources. If one source lists a program `A' and
     another source lists a program `B', both with the same start time on
     the same channel, xmltv_merge uses a couple of methods to determine
     whether A is the same as B (see the similarity() function in the
     script). If they don't match, the program data isn't merged and the
     entry from the XMLTV source listed first on the command line `wins'.
		</paragraph>
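		<paragraph>
     To give an idea of what such a match test might look like, here is a
     minimal sketch. It is a stand-in, not the similarity() function from
     the script: it assumes that comparing normalised titles with Python's
     difflib is good enough, and the name titles_match and the threshold of
     0.8 are hypothetical.
			<code><![CDATA[
```python
from difflib import SequenceMatcher


def titles_match(title_a, title_b, threshold=0.8):
    """Return True if two program titles look like the same program.

    Assumption: lowercasing and collapsing whitespace, then comparing with
    difflib's ratio, stands in for the script's own matching methods.
    """
    a = " ".join(title_a.lower().split())
    b = " ".join(title_b.lower().split())
    return SequenceMatcher(None, a, b).ratio() >= threshold


same = titles_match("The  Simpsons", "the simpsons")
different = titles_match("News", "Football")
print(same, different)
```
]]></code>
     With a test like this, two entries that differ only in case or spacing
     would be merged, while entries with unrelated titles would be left
     alone so that the first source wins.
		</paragraph>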
		<paragraph>
     Also, xmltv_merge doesn't (yet) fix overlaps, gaps or differing
     start/end times for the same program. If one source lists program `A'
     as starting at 09:30 and another source lists the same program `A' as
     starting at 09:27, both end up in the output as separate programs.
		</paragraph>
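		<paragraph>
     The effect can be illustrated with a small sketch. It assumes that
     candidate matches are found by grouping programs on the exact
     combination of channel and start time; the function name
     group_by_slot, the channel id and the dictionary layout are
     hypothetical and not taken from the script.
			<code><![CDATA[
```python
def group_by_slot(programs):
    """Group program records by their (channel, start) pair.

    Assumption: only programs sharing the exact same key are ever
    considered for merging.
    """
    slots = {}
    for prog in programs:
        key = (prog["channel"], prog["start"])
        slots.setdefault(key, []).append(prog)
    return slots


programs = [
    # Same program from two sources, three minutes apart.
    {"channel": "example.tv", "start": "20080101093000 +0100", "title": "A"},
    {"channel": "example.tv", "start": "20080101092700 +0100", "title": "A"},
]
slots = group_by_slot(programs)
print(len(slots))
```
]]></code>
     Because the two start times differ, the entries land in different
     groups and are never compared, so both are written to the output.
		</paragraph>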
		<paragraph title="PREREQUISITES">
     Apart from a decent Python installation (this script was developed
     using Python 2.5.1 on Mac OS X 10.5), you'll need the lxml module from
		 <link url="http://codespeak.net/lxml/">http://codespeak.net/lxml/</link>.
		</paragraph>
		<paragraph title="TODO LIST">
			<ul>
				<li>handle overlaps/gaps</li>
				<li>make similarity thresholds configurable</li>
				<li>code cleanups</li>
			</ul>
		</paragraph>

	</content>
</page>

