<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6010613752270306441</id><updated>2011-11-27T16:25:29.996-08:00</updated><category term='Surrogate key processing'/><category term='ENCODE STAGE'/><category term='Change Capture Stage-2'/><category term='Aggregator Stage-5'/><category term='Change Apply Stage-2'/><category term='COPY STAGE'/><category term='Change Capture Stage'/><category term='Aggregator Stage-6'/><category term='Aggregator Stage-4'/><category term='Aggregator Stage-3'/><category term='EXPAND STAGE'/><category term='Aggregator Stage-2'/><category term='DIFFERENCE STAGE'/><category term='Ways to execute DS Jobs'/><category term='Aggregator Stage'/><category term='Change Apply Stage-1'/><category term='COMPRESS STAGE'/><category term='DECODE STAGE'/><title type='text'>DATASTAGE</title><subtitle type='html'>Introduction to Designer,Director,Manager and Administrator,Concept oriented explanation on Processing Stages,ALl Types of File Stages,Database Stages,Developement /Debug Stages, Restructure, Project LifeCycle, Routines,FaQs,Interview Questions,Real time Stuff On Concept,And so on and On,,,......</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Eye On This 
Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>18</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-8010664580880792433</id><published>2008-05-04T08:51:00.000-07:00</published><updated>2008-05-04T08:57:36.918-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='EXPAND STAGE'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="color:#660000;"&gt;&lt;strong&gt;EXPAND STAGE&lt;/strong&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The Expand stage is a processing stage. It can have a single input link and a single output link. Follow this link for a list of steps you must take when deploying an Expand stage in your job.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;The Expand stage uses the UNIX uncompress or GZIP utility to expand a data set. 
It converts a previously compressed data set back into a sequence of records from a stream of raw binary data.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;PROPERTIES&lt;br /&gt;&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="color:#660000;"&gt;&lt;strong&gt;OPTION CATEGORY&lt;/strong&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;Command.&lt;/span&gt;&lt;/strong&gt; Specifies whether the stage will use uncompress (the default) or GZIP.&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-8010664580880792433?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/8010664580880792433/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=8010664580880792433' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8010664580880792433'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8010664580880792433'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/05/processing-stage_04.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-8367754568416172354</id><published>2008-05-04T08:31:00.000-07:00</published><updated>2008-05-04T08:50:58.103-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ENCODE STAGE'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;p&gt; &lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;span style="color:#660000;"&gt;ENCODE STAGE&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;The Encode stage is a processing stage. It encodes a data set using a UNIX encoding command, such as gzip, that you supply. The stage converts a data set from a sequence of records into a stream of raw binary data. The companion Decode stage reconverts the data stream to a data set.  &lt;/p&gt;&lt;p&gt;Follow this link for a list of steps you must take when deploying an Encode stage in your job.&lt;br /&gt;An encoded data set is similar to an ordinary one, and can be written to a data set stage. You cannot use an encoded data set as an input to stages that perform column-based processing or re-order rows, but you can input it to stages such as Copy. You can view information about the data set in the data set viewer, but not the data itself. You cannot repartition an encoded data set, and you will be warned at runtime if your job attempts to do that.&lt;br /&gt;As the output is always a single stream, you do not have to define metadata for the output link.&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;span style="color:#660000;"&gt;PROPERTIES&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;span style="color:#660000;"&gt;OPTION CATEGORY&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;Command Line. Specifies the command line used for encoding the data set. The command line must configure the UNIX command to accept input from standard input and write its results to standard output. 
The command must be located in your search path and be accessible by every processing node on which the Encode stage executes.&lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;p&gt; &lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-8367754568416172354?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/8367754568416172354/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=8367754568416172354' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8367754568416172354'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8367754568416172354'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/05/processing-stage.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-6902669857923555254</id><published>2008-05-04T07:13:00.000-07:00</published><updated>2008-05-04T07:36:10.213-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Ways to execute DS Jobs'/><title type='text'>Ways to execute DS Jobs</title><content type='html'>&lt;div align="center"&gt;&lt;strong&gt;&gt;&gt;PREVIOUS&gt;&gt;&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;&lt;p&gt;&lt;strong&gt;&lt;span 
style="color:#660000;"&gt;1.1. What are the ways to execute datastage jobs?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;A job can be run using a few different methods: &lt;/p&gt;&lt;p&gt;from Datastage Director (menu Job -&gt; Run now...)&lt;br /&gt;from command line using a dsjob command&lt;br /&gt;Datastage routine can run a job (DsRunJob command)&lt;br /&gt;by a job sequencer&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;&lt;p align="center"&gt;&lt;strong&gt;&gt;&gt;NEXT&gt;&gt;&lt;/strong&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-6902669857923555254?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/6902669857923555254/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=6902669857923555254' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/6902669857923555254'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/6902669857923555254'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/05/ways-to-execute-ds-jobs.html' title='Ways to execute DS Jobs'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-1685452995483694106</id><published>2008-03-16T14:07:00.000-07:00</published><updated>2008-05-04T07:11:46.368-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DIFFERENCE STAGE'/><title type='text'>DIFFERENCE STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;span style="color:#000099;"&gt;&lt;/span&gt;&lt;strong&gt;&lt;/strong&gt;&lt;div align="center"&gt;&lt;br /&gt;&lt;strong&gt;&gt;&gt;PREVIOUS&gt;&gt;&lt;/strong&gt;&lt;/div&gt;&lt;div align="center"&gt;&lt;br /&gt;&lt;br /&gt; &lt;/div&gt;&lt;/span&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b&gt;&lt;span style="font-family:';"&gt;&lt;span style="COLOR: rgb(153,51,0)"&gt;DIFFERENCE STAGE&lt;/span&gt;&lt;?xml:namespace prefix = o /&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;The Difference stage is a processing stage. It performs a record-by-record comparison of two input data sets, which are different versions of the same data set designated the before and after data sets. An example before and after data set are given in &lt;i&gt;Parallel Job Developer's Guide&lt;/i&gt;.. Follow this link for a list of steps you must take when deploying a Difference stage in your job.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;The Difference stage outputs a single data set whose records represent the difference between them. The stage assumes that the input data sets have been key-partitioned and sorted in ascending order on the key columns you specify for the Difference stage comparison. 
You can achieve this by using the Sort stage or by using the built in sorting and partitioning abilities of the Difference stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;The comparison is performed based on a set of difference key columns. Two records are copies of one another if they have the same value for all difference keys. You can also optionally specify change values. If two records have identical key columns, you can compare the value columns to see if one is an edited copy of the other.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;The stage generates an extra column, DiffCode, which indicates the result of each record comparison.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;The &lt;a href="http://dwh-career.blogspot.com/"&gt;Difference stage&lt;/a&gt; is similar, but not identical, to the Change Capture stage. The Change Capture stage is intended to be used in conjunction with the Change Apply stage; it produces a change data set which contains changes that need to be applied to the before data set to turn it into the after data set. The Difference stage outputs the before and after rows to the output data set, plus a code indicating if there are differences. Usually, the before and after data will have the same column names, in which case the after data set effectively overwrites the before data set and so you only see one set of columns in the output. You are warned that DataStage is doing this. 
If your before and after data sets have different column names, columns from both data sets are output; note that any key and value columns must have the same name.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;The stage generates an extra column, Diff, which indicates the result of each record comparison.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b&gt;&lt;span style="font-family:';"&gt;&lt;span style="COLOR: rgb(0,0,153)"&gt;&lt;br /&gt;PROPERTIES&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b&gt;&lt;span style="font-family:';"&gt;&lt;span style="COLOR: rgb(153,51,0)"&gt;DIFFERENCE KEY CATEGORIES&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';"&gt;Key&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;. Specifies the name of a difference key input column. This property can be repeated to specify multiple difference key input columns. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;span style="FONT-WEIGHT: normal"&gt; dialog box&lt;/span&gt;&lt;/span&gt; to select several columns at once if required. &lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-family:';font-size:100%;"&gt;Key has this dependent property:&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Case Sensitive&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Use this property to specify whether each key is case sensitive or not. It is set to True by default; for example, the values “CASE” and “case” would not be judged equivalent. 
This property is only available if the All non-Key columns are values property is set to True.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;h5&gt;&lt;span style="COLOR: rgb(153,51,0);font-size:100%;" &gt;&lt;/span&gt;&lt;/h5&gt;&lt;h5&gt;&lt;span style="COLOR: rgb(153,51,0);font-size:100%;" &gt;Difference Values Category&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;All non-Key Columns are Values&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Set this to True to indicate that any columns not designated &lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;as difference key &lt;a href="http://dwh-career.blogspot.com/"&gt;columns&lt;/a&gt; are value columns. It is False by default. The property has this dependent property:&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Case Sensitive&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Use this property to specify whether each value is case sensitive or not. 
It is set to True by default; for example, the values “CASE” and “case” would not be judged equivalent.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;h5&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;span style="color:#660000;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h5&gt;&lt;h5&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;span style="color:#660000;"&gt;&lt;/span&gt;&lt;/span&gt; &lt;/h5&gt;&lt;h5&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;span style="color:#660000;"&gt;Options Category&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h5&gt;&lt;p class="MsoNormal"&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Tolerate Unsorted Inputs&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Specifies that the input data sets are not sorted. This property allows you to process groups of &lt;a href="http://dwh-career.blogspot.com/"&gt;records&lt;/a&gt; that may be arranged by the difference key columns but not sorted. The stage processes the input records in the order in which they appear on its input. 
It is False by default.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Log Statistics&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; This property configures the stage to display result information containing the number of input records and the number of copy, delete, edit, and insert records. It is False by default.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Drop Output for Insert&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;. &lt;/span&gt;&lt;/strong&gt;Specifies to drop (not generate) an output record for an insert result. 
By default, an output record is always created by the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Drop Output for Delete&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Specifies to drop (not generate) the output record for a delete result. By default, an output record is always created by the &lt;a href="http://dwh-career.blogspot.com/"&gt;stage&lt;/a&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Drop Output for Edit&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Specifies to drop (not generate) the output record for an edit result. 
By default, an output record is always created by the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Drop Output for Copy&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Specifies to drop (not generate) the output record for a copy result. By default, an output record is always created by the stage.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Copy Code&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Allows you to specify an alternative value for the code that indicates the after record is a copy of the before record. 
By default this code is 0.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Deleted Code&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Allows you to specify an alternative value for the code that indicates that a record in the before set has been deleted from the after set. By default this code is 2.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Edit Code&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Allows you to specify an alternative value for the code that indicates the after record is an edited version of the before record. 
By default this code is 3.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';"&gt;&lt;/span&gt;&lt;/span&gt; &lt;/p&gt;&lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-family:';color:#000099;"&gt;&lt;strong&gt;Insert Code&lt;/strong&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;&lt;strong&gt;&lt;span style="color:#000099;"&gt;.&lt;/span&gt;&lt;/strong&gt; Allows you to specify an alternative value for the code that indicates a new record &lt;/span&gt;&lt;span style="font-family:';font-size:100%;"&gt;has been inserted in the after set that did not exist in the before set. By default this code is 1.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div align="center"&gt;&lt;strong&gt;&gt;&gt;NEXT&gt;&gt;&lt;/strong&gt;&lt;/div&gt;&lt;div align="center"&gt;&lt;strong&gt;&lt;/strong&gt;&lt;/div&gt;&lt;div align="center"&gt;&lt;/div&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-1685452995483694106?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/1685452995483694106/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=1685452995483694106' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1685452995483694106'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1685452995483694106'/><link rel='alternate' type='text/html' 
href='http://datastagecareer.blogspot.com/2008/03/difference-stage.html' title='DIFFERENCE STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-1244487776660568506</id><published>2008-03-16T14:05:00.000-07:00</published><updated>2008-03-16T14:06:35.669-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='DECODE STAGE'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;DECODE STAGE&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The Decode stage is a processing stage. It decodes a data set using a UNIX decoding command, such as gzip, that you supply. It converts a data stream of raw binary data into a data set. Its companion stage, Encode, converts a data set from a sequence of records to a stream of raw binary data.  
Follow this link for a list of steps you must take when deploying a Decode stage in your job.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;As the input is always a single stream, you do not have to define meta data for the input link.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;PROPERTIES&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;OPTION CATEGORY&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Command Line&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies the command line used for decoding the data set. The command line must configure the UNIX command to accept input from standard input and write its results to standard output. 
The command must be located in the search path of your application and be accessible by every processing node on which the Decode stage executes.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-1244487776660568506?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/1244487776660568506/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=1244487776660568506' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1244487776660568506'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1244487776660568506'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_4714.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-6500946461281166400</id><published>2008-03-16T12:34:00.000-07:00</published><updated>2008-03-16T12:36:27.156-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='COPY STAGE'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;  &lt;p style="color: rgb(153, 51, 0);"&gt;&lt;span style="font-size:130%;"&gt;&lt;b 
style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;COPY STAGE&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;The Copy stage is a processing stage. It can have a single input link and any number of output links. Follow this link for a list of steps you must take when deploying a Copy stage in your job.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;The Copy stage copies a single input data set to a number of output data sets. Each record of the input data set is copied to every output data set without modification. This lets you make a backup copy of a data set on disk while performing an operation on another copy, for example.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;When you use a Copy stage with a single input and a single output, you should set the Force property in the stage editor to True. 
This prevents DataStage from deciding that the Copy operation is superfluous and optimizing it out of the job.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;PROPERTIES&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Options Category&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;h5 style="color: rgb(102, 102, 102);"&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="color: rgb(51, 51, 51);" class="hcp1"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Force&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(51, 51, 51);"&gt;. Set True to specify that DataStage should not try to optimize the job by removing a Copy operation where there is one input and one output. 
Set False by default.&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-6500946461281166400?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/6500946461281166400/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=6500946461281166400' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/6500946461281166400'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/6500946461281166400'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_6105.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-2754634183304848009</id><published>2008-03-16T08:45:00.000-07:00</published><updated>2008-03-16T12:33:35.608-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='COMPRESS STAGE'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="font-size:130%;"&gt;&lt;span style="color: rgb(153, 0, 
0);"&gt;COMPRESS STAGE&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="font-size:130%;"&gt;&lt;span style="color: rgb(153, 0, 0);"&gt;&lt;/span&gt;&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The Compress stage is a processing stage. It can have a single input link and a single output link.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The Compress stage uses the UNIX compress or GZIP utility to compress a data set. It converts a data set from a sequence of records into a stream of raw binary data.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;A compressed data set is similar to an ordinary data set and can be stored in a persistent form by a Data Set stage. However, a compressed data set cannot be processed by many stages until it is expanded, that is, until its rows are returned to their normal format. Stages that do not perform column-based processing or reorder the rows can operate on compressed data sets. 
For example, you can use the Copy stage to create a copy of the compressed data set.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Because compressing a data set removes its normal record boundaries, the compressed data set must not be repartitioned before it is expanded.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;PROPERTIES&lt;/b&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Options Category&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;h5&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Command&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;. 
Specifies whether the stage will use compress (the default) or GZIP.&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt; &lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-2754634183304848009?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/2754634183304848009/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=2754634183304848009' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/2754634183304848009'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/2754634183304848009'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_4358.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-3565705386150034524</id><published>2008-03-16T08:36:00.000-07:00</published><updated>2008-03-16T08:39:08.485-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Change Capture Stage-2'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;span style="color: rgb(153, 51, 0);font-size:130%;" &gt;&lt;span style="font-weight: bold;"&gt;&lt;br 
/&gt;Change Capture Stage&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;PROPERTIES TAB&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style="color: rgb(0, 0, 153);font-size:100%;" &gt;Change Keys Category&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Key&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies the name of a difference key input column. This property can be repeated to specify multiple difference key input columns. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;span style="font-weight: normal;"&gt; dialog box&lt;/span&gt;&lt;/span&gt; to select several columns at once if required. Key has the following dependent properties:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Case Sensitive&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Use this property to specify whether each key is case sensitive or not. 
It is set to True by default; for example, the values “CASE” and “case” would not be judged equivalent.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Sort Order&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specify ascending or descending sort order.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Nulls Position&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;. Specify whether null values should be placed first or last.&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(0, 0, 153);"&gt;Change Value category&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Value&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies the name of a value input column. Value columns are compared for rows whose key columns match, to determine whether one row is an edited copy of the other. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;span style="font-weight: normal;"&gt; dialog box&lt;/span&gt;&lt;/span&gt; to select several columns at once if required. 
Value has the following dependent property:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Case Sensitive&lt;/span&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;. Use this property to specify whether each value is case sensitive or not. It is set to True by default; for example, the values “CASE” and “case” would not be judged equivalent.&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(0, 0, 153);"&gt;Options Category&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Change Mode&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This mode determines how keys and values are specified. Choose Explicit Keys &amp;amp; Values to specify the keys and values yourself.&lt;span class="hcp1"&gt; &lt;/span&gt;Choose All keys, Explicit values to specify that&lt;span class="hcp1"&gt; &lt;/span&gt;value columns must be defined, but all other columns are key columns unless excluded. Choose Explicit Keys, All Values to specify that key columns must be defined but all other columns are value columns unless they are excluded.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Log Statistics&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. 
This property configures the stage to display result information containing the number of input records and the number of copy, delete, edit, and insert records. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Drop Output for Insert&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies to drop (not generate) an output record for an insert result. By default, an output record is always created by the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Drop Output for Delete&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies to drop (not generate) the output record for a delete result. By default, an output record is always created by the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Drop Output for Edit&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies to drop (not generate) the output record for an edit result. By default, an output record is always created by the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Drop Output for Copy&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies to drop (not generate) the output record for a copy result. 
By default, an output record is always created by the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Code Column Name&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify a different name for the output column carrying the change code generated for each record by the stage. By default the column is called change_code.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Copy Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify an alternative value for the code that indicates the after record is a copy of the before record. By default this code is 0.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Deleted Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify an alternative value for the code that indicates that a record in the before set has been deleted from the after set. By default this code is 2.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Edit Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify an alternative value for the code that indicates the after record is an edited version of the before record. By default this code is 3.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Insert Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. 
Allows you to specify an alternative value for the code that indicates a new record has been inserted in the after set that did not exist in the before set. By default this code is 1.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-3565705386150034524?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/3565705386150034524/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=3565705386150034524' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/3565705386150034524'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/3565705386150034524'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_600.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-4390746162702018998</id><published>2008-03-16T08:33:00.000-07:00</published><updated>2008-03-16T08:34:51.906-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Change Capture Stage'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; &lt;p&gt;&lt;span 
style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="font-size:130%;"&gt;&lt;span style="color: rgb(153, 51, 0);"&gt;Change Capture Stage&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The &lt;b style=""&gt;Change Capture Stage&lt;/b&gt; is a processing stage. The stage compares two data sets and makes a record of the differences. Example before and after data sets are given in the &lt;i&gt;Parallel Job Developer's Guide&lt;/i&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The Change Capture stage takes two input data sets, denoted before and after, and outputs a single data set whose records represent the changes made to the before data set to obtain the after data set. The stage produces a change data set, whose table definition is transferred from the after data set’s table definition with the addition of one column: a change code whose values encode the four actions insert, delete, copy, and edit. The preserve-partitioning flag is set on the change data set.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The comparison is based on a set of key columns; rows from the two data sets are assumed to be copies of one another if they have the same values in these key columns. You can also optionally specify change values. 
If two rows have identical key columns, you can compare the value columns to see if one is an edited copy of the other.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The stage assumes that the incoming data is hash-partitioned and sorted in ascending order (this is done automatically if (auto) is selected on the partitioning tab). The columns the data is hashed on should be the key columns used for the data compare. You can achieve the sorting and partitioning using the Sort stage or by using the built in sorting and partitioning abilities of the Change Capture stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;You can use the companion Change Apply stage to combine the changes from the Change Capture stage with the original before data set to reproduce the after data set.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The Change Capture stage is very similar to the Difference stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 153);"&gt;The stage editor has three pages:&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp2"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Stage&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt; page. 
This is always present and is used to specify general information about the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp2"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Inputs&lt;span style="font-weight: normal;"&gt; page&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This is where you specify details about the before and after data sets being compared.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp2"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Outputs&lt;span style="font-weight: normal;"&gt; page&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This is where you specify details about the processed data being output from the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The &lt;span class="hcp2"&gt;General&lt;/span&gt; tab allows you to specify an optional description of the stage. The &lt;span class="hcp2"&gt;Properties&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; lets you specify what the stage does. The &lt;span class="hcp2"&gt;Advanced&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; allows you to specify how the stage executes. 
The &lt;span class="hcp2"&gt;Link Ordering&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; allows you to specify which input link carries the before data set and which the after data set.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-4390746162702018998?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/4390746162702018998/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=4390746162702018998' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/4390746162702018998'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/4390746162702018998'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_4019.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-5321563928186417743</id><published>2008-03-16T08:29:00.000-07:00</published><updated>2008-03-16T08:32:27.076-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Change Apply Stage-2'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(153, 0, 0);"&gt;Change Apply Stage&lt;/span&gt;&lt;br 
/&gt;&lt;/span&gt; &lt;p style="color: rgb(0, 0, 153);"&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="font-size:130%;"&gt;The stage editor has three pages:&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="font-size:130%;"&gt;&lt;span style="color: rgb(153, 51, 0);"&gt;&lt;/span&gt;&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Stage&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt; page. This is always present and is used to specify general information about the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Inputs&lt;span style="font-weight: normal;"&gt; page&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. 
This is where you specify details about the before data set and the change data set that is applied to it.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Outputs&lt;span style="font-weight: normal;"&gt; page&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This is where you specify details about the processed data being output from the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(0, 0, 153);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(0, 0, 153);"&gt;PROPERTIES:&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(153, 51, 153);"&gt;Change Keys Category&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Key&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies the name of a difference key input column. This property can be repeated to specify multiple difference key input columns. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;span style="font-weight: normal;"&gt; dialog box&lt;/span&gt;&lt;/span&gt; to select several columns at once if required. 
Key has the following dependent properties:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Case Sensitive&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Use this property to specify whether each key is case sensitive or not. It is set to True by default; for example, the values “CASE” and “case” would not be judged equivalent.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Sort Order&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specify ascending or descending sort order.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Nulls Position&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. 
Specify whether null values should be placed first or last.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;br /&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(102, 51, 102);"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h5&gt;&lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(102, 51, 102);"&gt;Change Value category&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Value&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Specifies the name of a value input column. Value has the following dependent properties:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Case Sensitive&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Use this to property to specify whether each value is case sensitive or not. 
It is set to True by default; for example, the values “CASE” and “case” would not be judged equivalent.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;br /&gt;&lt;/h5&gt;&lt;h5&gt;&lt;span style="color: rgb(0, 0, 153);font-size:100%;" &gt;Options Category&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Change Mode&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This mode determines how keys and values are specified. Choose Explicit Keys &amp;amp; Values to specify the keys and values yourself.&lt;span class="hcp1"&gt; &lt;/span&gt;Choose All keys, Explicit values to specify that&lt;span class="hcp1"&gt; &lt;/span&gt;value columns must be defined, but all other columns are key columns unless excluded. Choose Explicit Keys, All Values to specify that key columns must be defined but all other columns are value columns unless they are excluded.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Log Statistics&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This property configures the stage to display result information containing the number of input records and the number of copy, delete, edit, and insert records. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Check Value Columns on Delete&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Set this property to False to specify that DataStage should not check value columns on deletes. 
Normally, Change Apply compares the value columns of delete change records to those in the before record to ensure that it is deleting the correct record.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Code Column Name&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify that a different name has been used for the change data set column carrying the change code generated for each record by the stage. By default the column is called change_code.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Copy Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify an alternative value for the code that indicates a record copy. By default this code is 0.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Delete Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify an alternative value for the code that indicates a record delete. By default this code is 2.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Edit Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Allows you to specify an alternative value for the code that indicates a record edit. By default this code is 3.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Insert Code&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. 
Allows you to specify an alternative value for the code that indicates a record insert. By default this code is 1.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/5321563928186417743/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=5321563928186417743' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/5321563928186417743'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/5321563928186417743'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_799.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-1461426398413102736</id><published>2008-03-16T08:23:00.000-07:00</published><updated>2008-03-16T08:28:59.175-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Change Apply Stage-1'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Change Apply 
Stage&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The &lt;b style=""&gt;Change Apply stage&lt;/b&gt; is an active stage. It takes the change data set, that contains the changes in the before and after data sets, from the Change Capture stage and applies the encoded change operations to a before data set to compute an after data set. An example before and change data set are given in &lt;i&gt;Parallel Job Developer's Guide&lt;/i&gt;.  Follow this link for a list of steps you must take when deploying a Change Apply stage in your job.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The before input to Change Apply must have the same columns as the before input that was input to Change Capture, and an automatic conversion must exist between the types of corresponding columns. In addition, results are only guaranteed if the contents of the before input to Change Apply are identical (in value and record order in each partition) to the before input that was fed to Change Capture, and if the keys are unique. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The change input to Change Apply must have been output from Change Capture without modification. Because preserve-partitioning is set on the change output of Change Capture, you will be warned at run time if the Change Apply stage does not have the same number of partitions as the Change Capture stage. 
Additionally, both inputs of Change Apply are designated as partitioned using the Same partitioning method.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The Change Apply stage reads a record from the change data set and from the before data set, compares their key column values, and acts accordingly:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If the before keys come before the change keys in the specified sort order, the before record is copied to the output. The change record is retained for the next comparison. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If the before keys are equal to the change keys, the behavior depends on the code in the change_code column of the change record:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Insert&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;: The change record is copied to the output; the stage retains the same before record for the next 
comparison. If key columns are not unique, and there is more than one consecutive insert with the same key, then Change Apply applies all the consecutive inserts before existing records. This record order may be different from the after data set given to Change Capture.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Delete&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;: The value columns of the before and change records are compared. If the value columns are the same or if the Check Value Columns on Delete is specified as False, the change and before records are both discarded; no record is transferred to the output. If the value columns are not the same, the before record is copied to the output and the stage retains the same change record for the next comparison.&lt;br /&gt;If key columns are not unique, the value columns ensure that the correct record is deleted. If more than one record with the same keys have matching value columns, the first-encountered record is deleted. This may cause different record ordering than in the after data set given to the Change Capture stage. A warning is issued and both change record and before record are discarded, i.e. 
no output record results.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Edit&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;: The change record is copied to the output; the before record is discarded. If key columns are not unique, then the first before record encountered with matching keys will be edited. This may be a different record from the one that was edited in the after data set given to the Change Capture stage. A warning is issued and the change record is copied to the output, but the stage retains the same before record for the next comparison.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Copy&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;: The change record is discarded. 
The before record is copied to the output.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If the before keys come after the change keys, behavior also depends on the change_code column:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Insert&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. The change record is copied to the output; the stage retains the same before record for the next comparison. (The same as when the keys are equal.)&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Delete&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. 
A warning is issued and the change record discarded while the before record is retained for the next comparison.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="bullnest" style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp3"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Edit&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt; or &lt;span class="hcp3"&gt;Copy&lt;/span&gt;. A warning is issued and the change record is copied to the output while the before record is retained for the next comparison. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If the before input of Change Apply is identical to the before input of Change Capture and either the keys are unique or copy records are used, then the output of Change Apply is identical to the after input of Change Capture. 
However, if the before input of Change Apply is not the same (different record contents or ordering), or the keys are not unique and copy records are not used, this is not detected and the rules described above are applied anyway, producing a result that might or might not be useful.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/1461426398413102736/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=1461426398413102736' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1461426398413102736'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1461426398413102736'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_2419.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-8346583571207214598</id><published>2008-03-16T08:13:00.000-07:00</published><updated>2008-03-16T08:20:13.251-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aggregator Stage-6'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span 
style="color: rgb(102, 0, 0);font-size:130%;" &gt;&lt;span style="font-weight: bold;"&gt;Aggregator Stage&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:&amp;quot;;font-size:130%;"  &gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:&amp;quot;;font-size:130%;"  &gt;OUTPUT TAB&lt;/span&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;br /&gt;The &lt;span class="hcp1"&gt;Outputs&lt;/span&gt; page allows you to specify details about data output from the Aggregator stage. The Aggregator stage can have only one output link. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The &lt;span class="hcp1"&gt;General&lt;/span&gt; tab allows you to specify an optional description of the output link. The &lt;span class="hcp1"&gt;Columns&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; specifies the column definitions of the outgoing data. The &lt;span class="hcp1"&gt;Mapping&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; allows you to specify the relationship between the processed data being produced by the Aggregator stage and the Output columns. 
The &lt;span class="hcp1"&gt;Advanced&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; allows you to change the default buffering settings for the output link.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;OUTPUT MAPPING TAB&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;For the Aggregator stage the &lt;b&gt;Mapping&lt;/b&gt; tab allows you to specify how the output columns are derived, i.e., what input columns map onto them or how they are generated.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;span style=";font-family:&amp;quot;;font-size:10;"  &gt;&lt;span style="font-size:100%;"&gt;In the example the left pane represents the data after it has been grouped and summarized. The Expression field shows how the column has been derived. The right pane represents the data being output by the stage after the grouping and summarizing. In this example key carries the value of the key field on which the data was grouped (for example, if you were grouping by date it would contain each date grouped on). 
Column am1mean carries the mean of all the  values in the group, am1min the minimum value, and am1sum the sum.&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/8346583571207214598/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=8346583571207214598' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8346583571207214598'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8346583571207214598'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_463.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-9199905039933396387</id><published>2008-03-16T08:00:00.000-07:00</published><updated>2008-03-16T08:10:24.313-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aggregator Stage-5'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 153);font-size:130%;" &gt;&lt;span style="font-weight: bold;"&gt;Aggregator Stage&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;p&gt;&lt;span 
style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=""&gt;&lt;span style="color: rgb(153, 51, 0);"&gt;Each of these properties has a dependent property as follows:&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=""&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style="font-weight: normal;font-family:Symbol;" &gt;·&lt;span style=""&gt;         &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=""&gt;Decimal Output&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-size:100%;" &gt;. By default all calculation or recalculation columns have an output type of double. This property allows you to specify that the column has an output type of decimal. You can also specify a precision and scale for the type (by default 8,2).&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-size:100%;" &gt;The &lt;span class="hcp1"&gt;Inputs&lt;/span&gt; page allows you to specify details about the incoming data set.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-size:100%;" &gt;The &lt;span class="hcp1"&gt;General&lt;/span&gt; tab allows you to specify an optional description of the input link. The &lt;span class="hcp1"&gt;Partitioning&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; allows you to specify how incoming data is partitioned before being grouped and/or summarized. 
The &lt;span class="hcp1"&gt;Columns&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; tab&lt;/span&gt;&lt;/span&gt; specifies the column definitions of   incoming data. The &lt;span class="hcp1"&gt;Advanced&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; tab &lt;/span&gt;&lt;/span&gt;allows you to change the default buffering settings for the input link.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-size:100%;" &gt;The &lt;span class="hcp1"&gt;Partitioning&lt;/span&gt; tab allows you to specify details about how the incoming data is partitioned or collected before it is grouped and/or summarized. It also allows you to specify that the data should be sorted before being operated on.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-size:100%;" &gt;By default the stage partitions in Auto mode. This attempts to work out the best partitioning method depending on execution modes of current and preceding stages and how many nodes are specified in the Configuration file. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-size:100%;" &gt;If the Aggregator stage is operating in sequential mode, it will first collect the data before writing it to the file using the default round Auto collection method.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;   &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The &lt;span class="hcp1"&gt;Partitioning&lt;/span&gt; tab allows you to override this default behavior.&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;The exact operation of this tab depends on:&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Whether the Aggregator stage is set to execute in parallel or sequential mode.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Whether the preceding stage in the job is set to execute in parallel or sequential mode.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If the Aggregator stage is set to execute in parallel, then you can set a partitioning method by selecting from the Partitioning mode drop-down list. 
This will override any current partitioning (even if the Preserve Partitioning option has been set on the previous stage).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If the Aggregator stage is set to execute in sequential mode, but the preceding stage is executing in parallel, then you can set a collection method from the &lt;b style=""&gt;Collection type&lt;/b&gt; drop-down list. This will override the default collection method.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The &lt;span class="hcp1"&gt;Partitioning&lt;/span&gt; tab also allows you to specify that data arriving on the input link should be sorted before being processed. The sort is always carried out within data partitions. If the stage is partitioning incoming data the sort occurs after the partitioning. If the stage is collecting data, the sort occurs before the collection. The availability of sorting depends on the partitioning or collecting method chosen (it is not available for the default auto modes).&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;If NLS is enabled an additional button opens a dialog box allowing you to select a locale specifying the collate convention for the sort. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;You can also specify sort direction, case sensitivity, whether sorted as ASCII or EBCDIC, and whether null columns will appear first or last for each column. Where you are using a keyed partitioning method, you can also specify whether the column is used as a key for sorting, for partitioning, or for both. 
Select the column in the Selected list and right-click to invoke the shortcut menu.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;span style=";font-family:&amp;quot;;font-size:10;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/9199905039933396387/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=9199905039933396387' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/9199905039933396387'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/9199905039933396387'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_4205.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-8396801812671466653</id><published>2008-03-16T07:55:00.000-07:00</published><updated>2008-03-16T07:59:20.031-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aggregator Stage-4'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt; &lt;h6&gt;&lt;br /&gt;&lt;/h6&gt;&lt;h6&gt;&lt;span style="color: rgb(153, 51, 0);font-size:130%;" 
&gt;Aggregator Stage&lt;/span&gt;&lt;br /&gt;&lt;/h6&gt;&lt;h6&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(0, 0, 153);"&gt;Calculation and Recalculation Dependent Properties&lt;/span&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h6&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;The following properties are dependents of both Column for Calculation and Summary Column for Recalculation. These specify the various aggregate functions and the output columns to carry the results.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Corrected Sum of Squares&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;Produces a corrected sum of squares for data in the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Maximum Value&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Gives the maximum value in the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Mean Value&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Gives the mean value in the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span 
style=";font-family:&amp;quot;;" &gt;Minimum Value&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Gives the minimum value in the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Missing Value&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; This specifies what constitutes a ‘missing’ value, for example -1 or NULL. Enter the value as a floating point number. Not available for Summary Column for Recalculation.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Missing Values Count&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Counts the number of aggregate columns with missing values in them and outputs the count to the specified output column. Not available for Summary Column for Recalculation.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Non-missing Values Count&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;Counts the number of aggregate columns with values in them and outputs the count to the specified output column. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;!--[if !supportEmptyParas]--&gt; &lt;!--[endif]--&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Percent Coefficient of Variation&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Calculates the percent coefficient of variation for the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Range&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;Calculates the range of values in the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Standard Deviation&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Calculates the standard deviation of values in the aggregate column and outputs it to the specified output column. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Standard Error&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;Calculates the standard error of values in the aggregate column and outputs it to the specified output column. 
&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Sum of Weights&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;Calculates the sum of values in the weight column specified by the Weight column property and outputs it to the specified output column. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Sum&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Sums the values in the aggregate column and outputs the sum to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Summary&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Specifies a subrecord to write the results of the calculate or recalculate operation to. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Uncorrected Sum of Squares&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. 
&lt;/span&gt;Produces an uncorrected sum of squares for data in the aggregate column and outputs it to the specified output column.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Variance&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;.&lt;/span&gt; Calculates the variance for the aggregate column and outputs it to the specified output column. This has a dependent property:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;span style="font-weight: bold;font-size:100%;" class="hcp1" &gt;&lt;span style=";font-family:&amp;quot;;" &gt;Variance divisor&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="font-weight: bold;"&gt;. &lt;/span&gt;Specifies the variance divisor. By default, uses a value of the number of records in the group minus the number of records with missing values minus 1 to calculate the variance. 
This corresponds to a vardiv setting of Default. If you specify NRecs, the operator uses the number of records in the group minus the number of records with missing values instead.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-8396801812671466653?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/8396801812671466653/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=8396801812671466653' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8396801812671466653'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/8396801812671466653'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_2080.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-2166653021754922587</id><published>2008-03-16T07:49:00.000-07:00</published><updated>2008-03-16T07:52:35.076-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aggregator Stage-3'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 0, 153);font-size:130%;" &gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;Aggregator 
Stage&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;/span&gt; &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;Options Category&lt;/span&gt;  &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Method&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. The Aggregator stage has two modes of operation: hash and sort. Your choice of mode depends primarily on the number of groupings in the input data set, taking into account the amount of memory available. You typically use hash mode for a relatively small number of groups; generally, fewer than about 1000 groups per megabyte of memory to be used.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;br /&gt;When using hash mode, you should hash partition the input data set by one or more of the grouping key columns so that all the records in the same group are in the same partition (this happens automatically if (auto) is set in the &lt;span class="hcp1"&gt;Partitioning&lt;/span&gt; tab). However, hash partitioning is not mandatory; you can use any partitioning method you choose if keeping groups together in a single partition is not important. For example, if you’re summing records in each partition and later you’ll add the sums across all partitions, you don’t need all records in a group to be in the same partition to do this. Note, though, that there will be multiple output records for each group.&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;br /&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;o:p&gt;&lt;/o:p&gt;If the number of groups is large, which can happen if you specify many grouping keys, or if some grouping keys can take on many values, you would normally use sort mode. 
However, sort mode requires the input data set to have been partition sorted with all of the grouping keys specified as hashing and sorting keys (this happens automatically if (auto) is set in the Partitioning tab). Sorting requires a pregrouping operation: after sorting, all records in a given group in the same partition are consecutive.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;br /&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;The method property is set to hash by default.&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;br /&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;b style=""&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;You may want to try both modes with your particular data and application to determine which gives the better performance. You may find that when calculating statistics on large numbers of groups, sort mode performs better than hash mode, assuming the input data set can be efficiently sorted before it is passed to group.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;&lt;br /&gt;Allow Null Outputs&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Set this to True to indicate that null is a valid output value when calculating minimum value, maximum value, mean value, standard deviation, standard error, sum, sum of weights, and variance. If False, the null value will have 0 substituted when all input values for the calculation column are null. 
It is False by default.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt; &lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-2166653021754922587?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/2166653021754922587/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=2166653021754922587' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/2166653021754922587'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/2166653021754922587'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_4430.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-901403011532742068</id><published>2008-03-16T07:42:00.000-07:00</published><updated>2008-03-16T07:47:43.671-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aggregator Stage-2'/><title type='text'>PROCESSING STAGE</title><content type='html'>&lt;span style="font-size:100%;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);"&gt;Aggregator Stage&lt;/span&gt;&lt;br /&gt;&lt;/span&gt; &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(153, 
0, 0);"&gt;Grouping Keys Category&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Group&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Specifies the input columns you are using as group keys. Repeat the property to select multiple columns as group keys. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; dialog box&lt;/span&gt;&lt;/span&gt; to select several group keys at once if required. This property has a dependent property:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Case Sensitive&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. Use this to specify whether each group key is case sensitive or not. This is set to True by default, i.e., the values “CASE” and “case” would end up in different groups.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;h5&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;span style="color: rgb(153, 51, 0);"&gt;Aggregations Category&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/h5&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Aggregation Type&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This property allows you to specify the type of aggregation operation your stage is performing. 
Choose from Calculate (the default), Recalculate, and Count Rows.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Column for Calculation&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. The Calculate aggregate type allows you to summarize the contents of a particular column or columns in your input data set by applying one or more aggregate functions to it.&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Select the column to be aggregated, then select dependent properties to specify the operation to perform on it, and the output column to carry the result. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; dialog box to&lt;/span&gt;&lt;/span&gt; select several columns at once if required.  &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Count Output Column&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. The Count Rows aggregate type performs a count of the number of records within each group. Specify the column on which the count is output. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Summary Column for Recalculation&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This aggregate type allows you to apply aggregate functions to a column that has already been summarized. This is like calculate but performs the specified aggregate operation on a set of data that has already been summarized. 
In practice this means you should have performed a calculate (or recalculate) operation in a previous Aggregator stage with the Summary property set to produce a subrecord containing the summary data that is then included with the data set. Select the column to be aggregated, then select dependent properties to specify the operation to perform on it, and the output column to carry the result. You can use the &lt;span class="hcp1"&gt;Column Selection&lt;/span&gt;&lt;span class="hcp1"&gt;&lt;span style="font-weight: normal;"&gt; dialog box&lt;/span&gt;&lt;/span&gt; to select several columns at once if required.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Default To Decimal Output&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. The output type of a calculation or recalculation column is double. Setting this property causes it to default to decimal. You can also set a default precision and scale. (You can also specify that individual columns have decimal output while others retain the default type of double.)&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span class="hcp1"  style="font-size:100%;"&gt;&lt;span style=";font-family:&amp;quot;;" &gt;Weighting column&lt;/span&gt;&lt;/span&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;. This is a dependent property of Count Output Column or Column for Calculation. Configures the stage to increment the count for the group by the contents of the weight column for each record in the group, instead of by 1. Not available for Summary Column for Recalculation. 
Setting this option affects only the following options:&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Percent Coefficient of Variation. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Mean Value&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Sum&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=";font-family:&amp;quot;;" &gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;Sum of Weights&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;br /&gt;&lt;/p&gt;&lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;Uncorrected Sum of Squares&lt;br /&gt;&lt;/p&gt;&lt;span style=";font-family:&amp;quot;;font-size:100%;"  &gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/6010613752270306441-901403011532742068?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/901403011532742068/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=901403011532742068' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/901403011532742068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/901403011532742068'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage_16.html' title='PROCESSING STAGE'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-7355108088832792916</id><published>2008-03-16T06:27:00.000-07:00</published><updated>2008-03-16T07:28:54.262-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Aggregator Stage'/><title type='text'>Processing Stage</title><content type='html'>&lt;p class="MsoNormal"&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=""&gt;&lt;span style="color: rgb(153, 0, 0);"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=""&gt;&lt;span style="color: rgb(153, 0, 0);font-size:130%;" &gt;Aggregator Stage&lt;/span&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;br 
/&gt;&lt;span style="font-size:100%;"&gt;&lt;b style=""&gt;&lt;span style=""&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;The Aggregator stage is a processing stage. It classifies data rows from a single input link into groups and computes totals or other aggregate functions for each group. The summed totals for each group are output from the stage via an output link. Follow this link for a list of steps you must take when deploying an Aggregator stage in your job.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;&lt;span style="font-weight: bold;"&gt;&lt;br /&gt;The stage editor has three pages:&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=""&gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp2"  style="font-size:100%;"&gt;&lt;span style=""&gt;Stage&lt;/span&gt;&lt;/span&gt;&lt;span style=""&gt; page. This is always present and is used to specify general information about the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=""&gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp2"  style="font-size:100%;"&gt;&lt;span style=""&gt;Inputs&lt;span style="font-weight: normal;"&gt; page&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=""&gt;. 
This is where you specify details about the data being grouped and/or aggregated.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p style="margin-left: 0.5in; text-indent: -0.25in;"&gt;&lt;!--[if !supportLists]--&gt;&lt;span style=";font-family:Symbol;font-size:100%;"  &gt;·&lt;span style=""&gt;         &lt;/span&gt;&lt;/span&gt;&lt;!--[endif]--&gt;&lt;span class="hcp2"  style="font-size:100%;"&gt;&lt;span style=""&gt;Outputs&lt;span style="font-weight: normal;"&gt; page&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style=""&gt;. This is where you specify details about the groups being output from the stage.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;The aggregator stage gives you access to grouping and summary operations. One of the easiest ways to expose patterns in a collection of records is to group records with similar characteristics, then compute statistics on all records in the group. You can then use these statistics to compare properties of the different groups. For example, records containing cash register transactions might be grouped by the day of the week to see which day had the largest number of transactions, the largest amount of revenue, etc.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;Records can be grouped by one or more characteristics, where record characteristics correspond to column values. In other words, a group is a set of records with the same value for one or more columns. For example, transaction records might be grouped by both day of the week and by month. These groupings might show that the busiest day of the week varies by season. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;In addition to revealing patterns in your data, grouping can also reduce the volume of data by summarizing the records in each group, making it easier to manage. 
If you group a large volume of data on the basis of one or more characteristics of the data, the resulting data set is generally much smaller than the original and is therefore easier to analyze using standard workstation or PC-based tools.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;At a practical level, you should be aware that, in a parallel environment, the way that you partition data before grouping and summarizing it can affect the results. For example, if you partitioned using the round robin method, records with identical values in the column you are grouping on would end up in different partitions. If you then performed a sum operation within these partitions, you would not be operating on all the relevant records. In such circumstances you may want to hash partition the data on one or more of the grouping keys to ensure that your groups are entire.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p&gt;&lt;span style=""&gt;It is important that you bear these facts in mind and take any steps you need to prepare your data set before presenting it to the aggregator stage. 
In practice this could mean you use Sort stages or additional Aggregate stages in the job. The &lt;span class="hcp1"&gt;Properties&lt;/span&gt; tab allows you to specify properties which determine what the stage actually does.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-7355108088832792916?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/7355108088832792916/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=7355108088832792916' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/7355108088832792916'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/7355108088832792916'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/processing-stage.html' title='Processing Stage'/><author><name>Eye On This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6010613752270306441.post-1859601121977950015</id><published>2008-03-16T04:29:00.000-07:00</published><updated>2008-03-16T04:46:01.373-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Surrogate key processing'/><title type='text'>More efficient surrogate key processing</title><content type='html'>&lt;span id="intelliTXT"&gt;What I'm basically trying to learn is if  you all find 
the performance of most first tier ETL tools in processing surrogate keys (lookup &amp;amp; generation) acceptable.&lt;br /&gt;&lt;br /&gt;Even after fine tuning or re-writing the process to be more like a fact build, do you find that a large percentage of the ETL time is spent on this process? This would imply there are significant performance gains to be made in more efficient surrogate key processing.&lt;br /&gt;&lt;br /&gt;We've found that most ETL tools are relatively inefficient in processing surrogate keys simply because they are generalist engines, and that surrogate key processing occupies 30-50% of the ETL time.&lt;br /&gt;&lt;br /&gt;For example, most ETL tools are only able to process surrogate keys in a multi-pass operation: a first pass to update the dimensions, then a second pass to update the facts. Processing the load in a single pass would significantly reduce complexity and improve performance.&lt;br /&gt;&lt;br /&gt;Secondly, ETL tools are generally unable to concurrently process multiple small sessions which update the same surrogate keys, for instance when multiple sources (or multiple versions of the same source, such as different regions) update in parallel, or when dimension updates for multiple transaction types are processed in parallel. Instead, most ETL tools will handle these as one large job, thus processing each source or transaction type in sequence without fully utilising a machine's multiple processors.&lt;br /&gt;&lt;br /&gt;Of course, if most companies can truly live with their current performance, it's not an issue. But I'm wondering if they do so because they feel their only alternative to accepting the inefficiency is to custom design a high performance solution which would be expensive and difficult to maintain. 
Or to throw more hardware at the process, which is also expensive.&lt;br /&gt;&lt;br /&gt;Based on a client's high-performance needs, we've designed a specialised engine that streamlines surrogate key processing for updating marts and warehouses. I'm trying to understand how much need there is for our product.&lt;br /&gt;&lt;br /&gt;Is the need only amongst companies with an exceptionally large dimension (such as customer) or those experiencing exceptionally high data volumes or arrival rates?&lt;br /&gt;&lt;br /&gt;Or is it that most companies just don't realise that they can significantly improve this portion of the ETL process and radically improve their data mart updating? Doing so would let them get more out of their existing infrastructure investment: for example, moving from a monthly to a weekly load, or adding new sources, marts or querying tools. What thousands of flowers can bloom once this technical bottleneck is removed?&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6010613752270306441-1859601121977950015?l=datastagecareer.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://datastagecareer.blogspot.com/feeds/1859601121977950015/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=6010613752270306441&amp;postID=1859601121977950015' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1859601121977950015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6010613752270306441/posts/default/1859601121977950015'/><link rel='alternate' type='text/html' href='http://datastagecareer.blogspot.com/2008/03/more-efficient-surrogate-key-processing.html' title='More efficient surrogate key processing'/><author><name>Eye On 
This Stuff</name><uri>http://www.blogger.com/profile/02519382966557475421</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
