<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>philihp.com &#187; Optimization</title>
	<atom:link href="http://www.philihp.com/blog/tag/optimization/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.philihp.com/blog</link>
	<description>I do things, and then I tell the internet about them.</description>
	<lastBuildDate>Mon, 06 Feb 2012 05:40:30 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SQL Optimization: Union vs. Union All</title>
		<link>http://www.philihp.com/blog/2010/sql-optimization-union-vs-union-all/</link>
		<comments>http://www.philihp.com/blog/2010/sql-optimization-union-vs-union-all/#comments</comments>
		<pubDate>Thu, 27 May 2010 22:08:45 +0000</pubDate>
		<dc:creator>philihp</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Optimization]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Union]]></category>

		<guid isPermaLink="false">http://www.philihp.com/blog/?p=582</guid>
		<description><![CDATA[Everyone should learn the difference between Union and Union All. Knowing it will make you a better programmer, and it&#8217;s fairly trivial to understand. SELECT * FROM apples UNION SELECT * FROM oranges When you know for a fact that there will never be any common rows between the apples table and the oranges table, [...]]]></description>
			<content:encoded><![CDATA[<p>Everyone should learn the difference between Union and Union All. Knowing it will make you a better programmer, and it&#8217;s fairly trivial to understand.</p>
<pre>SELECT * FROM apples
UNION
SELECT * FROM oranges</pre>
<p>When you know for a fact that there will never be any common rows between the <code>apples</code> table and the <code>oranges</code> table, this query will be slightly faster at low cardinality, and dramatically faster at high cardinality, if you use &#8220;UNION ALL&#8221; instead:</p>
<pre>SELECT * FROM apples
UNION ALL
SELECT * FROM oranges</pre>
<p>The difference between the two queries is this: UNION ALL simply concatenates the results of the two queries into the result set. Plain UNION concatenates them too, but then removes duplicates (a distinct sort). Leaving out this second step can vastly reduce the time it takes for your query to run.</p>
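<p>To see the difference in the rows returned, here is a minimal sketch using Python&#8217;s built-in <code>sqlite3</code> module. The one-column <code>apples</code> and <code>oranges</code> tables and their contents are made up for illustration, and they deliberately share one row:</p>

```python
import sqlite3

# Hypothetical tables: both contain the row 'shared'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apples (name TEXT);
    CREATE TABLE oranges (name TEXT);
    INSERT INTO apples VALUES ('fuji'), ('shared');
    INSERT INTO oranges VALUES ('navel'), ('shared');
""")

# UNION concatenates, then removes duplicate rows.
union_rows = conn.execute(
    "SELECT * FROM apples UNION SELECT * FROM oranges"
).fetchall()

# UNION ALL only concatenates, so 'shared' appears twice.
union_all_rows = conn.execute(
    "SELECT * FROM apples UNION ALL SELECT * FROM oranges"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

<p>If the combined result can contain duplicates and you need them gone, plain UNION is still the right choice; UNION ALL only gives the same answer when no row can appear twice.</p>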
]]></content:encoded>
			<wfw:commentRss>http://www.philihp.com/blog/2010/sql-optimization-union-vs-union-all/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dramatically Increasing SAS DI Studio performance of SCD Type-2 Loader Transforms</title>
		<link>http://www.philihp.com/blog/2010/dramatically-increasing-sas-di-studio-performance-of-scd-type-2-loader-transforms/</link>
		<comments>http://www.philihp.com/blog/2010/dramatically-increasing-sas-di-studio-performance-of-scd-type-2-loader-transforms/#comments</comments>
		<pubDate>Tue, 19 Jan 2010 21:55:35 +0000</pubDate>
		<dc:creator>philihp</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[3.4]]></category>
		<category><![CDATA[9.1.3]]></category>
		<category><![CDATA[Data Integration Studio]]></category>
		<category><![CDATA[DI Studio]]></category>
		<category><![CDATA[Index]]></category>
		<category><![CDATA[Optimization]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[SCD Type-2 Loader]]></category>
		<category><![CDATA[Slowly Changing Dimension]]></category>

		<guid isPermaLink="false">http://www.philihp.com/blog/?p=535</guid>
		<description><![CDATA[In SAS DI Studio 3.4 (and I imagine in future versions), the prepackaged code for the SCD Type-2 Loader works like this: Does the dataset exist? If not, create an empty dataset with structure and indexes as defined from metadata. Then detect differences between it and the source dataset and the target dataset, expire any [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.sas.com/technologies/dw/entdiserver/index.html"><img src="http://linode.philihp.com/blog/wp-content/uploads/2010/01/SCDLoader2.png" alt="SCD Loader" title="SCD Loader" width="316" height="94" class="alignright size-full wp-image-544" /></a>In SAS DI Studio 3.4 (and I imagine in future versions), the prepackaged code for the SCD Type-2 Loader works like this: Does the dataset exist? If not, create an empty dataset with structure and indexes as defined from metadata. Then detect differences between the source dataset and the target dataset, expire any observations that are modified or deleted by setting their valid-to-date to now, and append any modified or new observations with a valid-from-date. The expire bit is done in-place with a data step modify statement, and the append is done with PROC APPEND. I assume this is done to reduce the amount of locking necessary on the dimension dataset. Because new observations are appended, the dataset never actually gets sorted by the business key, so the cost of the expire bit can grow rapidly over time; every time the transform wants to change a single observation&#8217;s valid-to-date, it scans the entire table. And it doesn&#8217;t help that compression is off by default.</p>
<p>In the instance below, we had an uncompressed dataset with a modest 220,000 rows. The dataset had a variable &#8220;description&#8221; defined as a 4000-length string which was usually but not always null. Most steps of the SCD Loader run in a few seconds, but the following is usually the bottleneck. On a run one night to populate a new variable, this data step ran in a little over 9 hours:</p>
<pre>     data PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST;
        modify PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST
           work.etls_close
            (rename = (ETLS_KEY = SW_ENTITY_REL_X_HOST_ID
           ETLS_FROMDATE = VALID_FROM_DTTM))
            updatemode=nomissingcheck;
        by SW_ENTITY_REL_X_HOST_ID VALID_FROM_DTTM;
        VALID_TO_DTTM = ETLS_CLSDATE;
        if %sysrc(_SOK) eq _iorc_ then
           replace;
        _iorc_ = 0;
        _error_ = 0;
     run;</pre>
<pre>NOTE: There were 1 observations read from the data set PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST.
NOTE: The data set PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST has been updated.  There were 45924 observations rewritten, 0
     observations added and 0 observations deleted.
NOTE: There were 45924 observations read from the data set WORK.ETLS_CLOSE.
NOTE: DATA statement used (Total process time):
     real time           9:06:00.46
     cpu time            9:05:58.43</pre>
<p>Turning compression on reduced the size of this dataset from 1.1 gigs to 4.5 megs, mostly from compressing the 4000-char string that was usually empty; with compression off, an X-length string always takes up X bytes because it&#8217;s faster to seek to a certain observation if all observations are the same size. Additionally, in metadata an index was defined on the two variables used above in the BY statement. I ran this PROC DATASETS statement to create the index by hand (the Loader would create them if the table didn&#8217;t exist, but it assumes indexes exist if the table exists).</p>
<pre>proc datasets library=prestage nolist;
 modify load_sw_entity_release_x_host;
 index create ndx=(sw_entity_rel_x_host_id valid_from_dttm) / unique;
quit;</pre>
<p>With this metadata in place, the SCD Loader uses the index in the datastep and generates the following datastep instead:</p>
<pre>     data PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST;
        set work.etls_close(rename = (ETLS_KEY = SW_ENTITY_REL_X_HOST_ID
                                      ETLS_FROMDATE = VALID_FROM_DTTM));
        <b>modify PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST
            key=ndx / unique;</b>
        VALID_TO_DTTM = ETLS_CLSDATE;
        if %sysrc(_SOK) eq _iorc_ then
           replace;
        _iorc_ = 0;
        _error_ = 0;
     run;</pre>
<p>By using the index, the runtime was reduced to under two seconds.</p>
<pre>NOTE: There were 45924 observations read from the data set WORK.ETLS_CLOSE.
NOTE: The data set PRESTAGE.LOAD_SW_ENTITY_RELEASE_X_HOST has been updated.  There were 45924 observations rewritten, 0
     observations added and 0 observations deleted.
NOTE: DATA statement used (Total process time):
     real time           1.41 seconds
     cpu time            1.21 seconds</pre>
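<p>As a rough analogy (not a model of SAS internals), the sketch below contrasts the two access patterns in Python: finding each row to expire by scanning the whole table, versus looking it up through a keyed index built once. The table contents, sizes, and function names are hypothetical.</p>

```python
# Hypothetical dimension table rows: [business_key, valid_from, valid_to].
table = [[key, 1, None] for key in range(2000)]
# Hypothetical stand-in for work.etls_close: (business_key, valid_from, close_date).
to_close = [(key, 1, 99) for key in range(0, 2000, 4)]


def expire_by_scan(rows, closes):
    """Locate each row by scanning from the top; return rows examined."""
    examined = 0
    for key, valid_from, close_date in closes:
        for row in rows:
            examined += 1
            if row[0] == key and row[1] == valid_from:
                row[2] = close_date
                break
    return examined


def expire_by_index(rows, closes):
    """Locate each row through a (key, valid_from) index; return lookups made."""
    index = {(row[0], row[1]): row for row in rows}  # built once, like a unique index
    lookups = 0
    for key, valid_from, close_date in closes:
        lookups += 1
        row = index.get((key, valid_from))
        if row is not None:
            row[2] = close_date
    return lookups


scan_table = [row[:] for row in table]
index_table = [row[:] for row in table]
scan_cost = expire_by_scan(scan_table, to_close)
index_cost = expire_by_index(index_table, to_close)
print(scan_cost, index_cost)
```

<p>Both versions update the same rows; only the amount of work needed to find them differs, and the gap widens as the table grows.</p>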
]]></content:encoded>
			<wfw:commentRss>http://www.philihp.com/blog/2010/dramatically-increasing-sas-di-studio-performance-of-scd-type-2-loader-transforms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Reverse the Bits in a Byte in 3 operations</title>
		<link>http://www.philihp.com/blog/2009/reverse-the-bits-in-a-byte-in-3-operations/</link>
		<comments>http://www.philihp.com/blog/2009/reverse-the-bits-in-a-byte-in-3-operations/#comments</comments>
		<pubDate>Wed, 18 Mar 2009 15:39:44 +0000</pubDate>
		<dc:creator>philihp</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Bitwise]]></category>
		<category><![CDATA[Optimization]]></category>

		<guid isPermaLink="false">http://www.philihp.com/blog/?p=404</guid>
		<description><![CDATA[This is among many brilliant hacks from this page. b = (b * 0x0202020202ULL &#038; 0x010884422010ULL) % 1023; The multiply operation creates five separate copies of the 8-bit byte pattern to fan-out into a 64-bit value. The AND operation selects the bits that are in the correct (reversed) positions, relative to each 10-bit groups of [...]]]></description>
			<content:encoded><![CDATA[<p>This is among many brilliant hacks from <a href="http://graphics.stanford.edu/~seander/bithacks.html#ReverseByteWith64BitsDiv">this</a> page.</p>
<pre>b = (b * 0x0202020202ULL &#038; 0x010884422010ULL) % 1023;</pre>
<blockquote><p>The multiply operation creates five separate copies of the 8-bit byte pattern to fan-out into a 64-bit value. The AND operation selects the bits that are in the correct (reversed) positions, relative to each 10-bit groups of bits. The multiply and the AND operations copy the bits from the original byte so they each appear in only one of the 10-bit sets. The reversed positions of the bits from the original byte coincide with their relative positions within any 10-bit set. The last step, which involves modulus division by 2^10 &#8211; 1, has the effect of merging together each set of 10 bits (from positions 0-9, 10-19, 20-29, &#8230;) in the 64-bit value. They do not overlap, so the addition steps underlying the modulus division behave like or operations. </p></blockquote>
<p>The genius here, I think, is the modulus division to compress eight specific bits from a 64-bit number down into an 8-bit byte.</p>
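<p>A quick way to convince yourself that it works is to check all 256 byte values against a straightforward reversal. This is a small Python sketch; the function names are mine, and because Python integers are arbitrary precision the <code>ULL</code> suffixes are not needed:</p>

```python
def reverse_byte(b):
    """Reverse the bits of an 8-bit value with multiply, AND, and modulus."""
    return (b * 0x0202020202 & 0x010884422010) % 1023


def reverse_byte_naive(b):
    """Reference version: reverse the 8-character binary string."""
    return int(format(b, "08b")[::-1], 2)


# Compare the two implementations over every possible byte value.
mismatches = [b for b in range(256) if reverse_byte(b) != reverse_byte_naive(b)]
print(mismatches)
```

<p>For example, <code>reverse_byte(1)</code> gives 128: bit 0 of the input ends up as bit 7 of the output.</p>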
]]></content:encoded>
			<wfw:commentRss>http://www.philihp.com/blog/2009/reverse-the-bits-in-a-byte-in-3-operations/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Avoid Correlated Subqueries</title>
		<link>http://www.philihp.com/blog/2008/avoid-correlated-subqueries/</link>
		<comments>http://www.philihp.com/blog/2008/avoid-correlated-subqueries/#comments</comments>
		<pubDate>Tue, 16 Sep 2008 03:58:00 +0000</pubDate>
		<dc:creator>philihp</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Optimization]]></category>
		<category><![CDATA[SAS]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://www.philihp.com/blog/2008/09/avoid-correlated-subqueries/</guid>
		<description><![CDATA[If your SQL code has a nested select that references a column in an outer select, such as the following, it may be possible to rewrite it to perform orders of magnitude faster. proc sql; create table new_rates as select * from work.exchange_rate n where not exists( select * from imf.exchange_rate o where n.effective_date=o.effective_date and n.iso_char_code=o.iso_char_code ); NOTE: [...]]]></description>
			<content:encoded><![CDATA[<p>If your SQL code has a nested select that references a column in an outer select (a correlated subquery), such as the following, it may be possible to rewrite it to perform orders of magnitude faster.<br />
<code>proc sql;<br />
   create table new_rates as<br />
   select *<br />
     from work.exchange_rate n<br />
     where not exists(<br />
       select * from imf.exchange_rate o<br />
       where n.effective_date=o.effective_date and n.iso_char_code=o.iso_char_code );</p>
<p><span style="color: #0000FF">NOTE: Table WORK.NEW_RATES created, with 49 rows and 4 columns.</span></p>
<p> quit;</p>
<p><span style="color: #0000FF">NOTE: PROCEDURE SQL used (Total process time):<br />
     <strong>real time           8.83 seconds</strong><br />
     cpu time            8.65 seconds</span></code><br />
Here, the table imf.exchange_rate has 13416 rows, covering daily exchange rates at close for 39 different currencies over nearly 1 year: a modest size. It has no indexes, and has not been sorted (or marked as sorted). work.exchange_rate is a smaller version of it, covering only exchange rates for the last month, with 980 rows. The query is trying to return any exchange rates that we didn&#8217;t have before.</p>
<p>Should be simple, right? There&#8217;s no reason for it to take this long. By rewriting the query to do a left join, below, SAS merges the tables behind the scenes, then finishes the query in a single scan. Rows from n with no match come back with missing (blank) values in the o columns, so the where clause keeps exactly the rates we didn&#8217;t have before.<br />
<code>proc sql;<br />
   create table new_rates as<br />
   select n.*<br />
     from work.exchange_rate n<br />
       left join imf.exchange_rate o<br />
         on (n.effective_date = o.effective_date and n.iso_char_code = o.iso_char_code)<br />
     where o.iso_char_code = '';</p>
<p><span style="color: #0000FF">NOTE: Table WORK.NEW_RATES created, with 49 rows and 4 columns.</span></p>
<p> quit;</p>
<p><span style="color: #0000FF">NOTE: PROCEDURE SQL used (Total process time):<br />
     <strong>real time           0.13 seconds</strong><br />
     cpu time            0.13 seconds</span></code></p>
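<p>The same rewrite can be tried outside SAS. Here is a minimal sketch using Python&#8217;s built-in <code>sqlite3</code> module to show that the two forms return the same rows. The table names <code>old_rates</code> and <code>new_rates</code> and all of the rate values are made up for illustration, and the filter uses <code>IS NULL</code> because other engines do not treat a missing value as an empty string the way SAS does:</p>

```python
import sqlite3

# Hypothetical data: old_rates plays the role of imf.exchange_rate,
# new_rates plays the role of work.exchange_rate.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_rates (effective_date TEXT, iso_char_code TEXT, rate REAL);
    CREATE TABLE new_rates (effective_date TEXT, iso_char_code TEXT, rate REAL);
    INSERT INTO old_rates VALUES
        ('2008-09-01', 'EUR', 1.46),
        ('2008-09-01', 'JPY', 0.0092);
    INSERT INTO new_rates VALUES
        ('2008-09-01', 'EUR', 1.46),
        ('2008-09-02', 'EUR', 1.45),
        ('2008-09-02', 'JPY', 0.0093);
""")

# Correlated subquery: the inner select refers to the outer row n.
not_exists_rows = conn.execute("""
    SELECT n.* FROM new_rates n
    WHERE NOT EXISTS (
        SELECT 1 FROM old_rates o
        WHERE n.effective_date = o.effective_date
          AND n.iso_char_code = o.iso_char_code)
    ORDER BY n.effective_date, n.iso_char_code
""").fetchall()

# Left join form: unmatched rows have NULL in every column of o.
left_join_rows = conn.execute("""
    SELECT n.* FROM new_rates n
    LEFT JOIN old_rates o
      ON n.effective_date = o.effective_date
     AND n.iso_char_code = o.iso_char_code
    WHERE o.iso_char_code IS NULL
    ORDER BY n.effective_date, n.iso_char_code
""").fetchall()

print(not_exists_rows == left_join_rows)
```

<p>Whether the join is actually faster depends on the engine; some optimizers already turn the correlated form into a join on their own, so it is worth timing both.</p>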
]]></content:encoded>
			<wfw:commentRss>http://www.philihp.com/blog/2008/avoid-correlated-subqueries/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

