DATASTAGE: PROCESSING STAGE

Sunday, March 16, 2008

PROCESSING STAGE

COMPRESS STAGE

The Compress stage is a processing stage. It can have a single input link and a single output link.

The Compress stage uses the UNIX compress or GZIP utility to compress a data set. It converts a data set from a sequence of records into a stream of raw binary data.
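To make the "sequence of records into a stream of raw binary data" idea concrete, here is a minimal Python sketch. It is only an analogy built on Python's standard `gzip` module, not how DataStage implements the stage. The newline-delimited record layout and the helper names `compress_records` and `expand_records` are assumptions made for illustration.

```python
import gzip

def compress_records(records):
    """Join records into one newline-delimited payload and gzip it.

    Hypothetical helper: the record layout is an assumption, not DataStage's format.
    """
    payload = "\n".join(records).encode("utf-8")
    return gzip.compress(payload)

def expand_records(blob):
    """Reverse compress_records: gunzip, then split back into records."""
    return gzip.decompress(blob).decode("utf-8").split("\n")

records = ["1,alice,NY", "2,bob,LA", "3,carol,SF"]

# The compressed form is one opaque byte string with no per-record structure.
blob = compress_records(records)

# Only after expanding are the individual rows available again.
restored = expand_records(blob)
```

Until `expand_records` is called, nothing in `blob` can be addressed by row or column, which is why column-based stages cannot work on compressed data.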

A compressed data set is similar to an ordinary data set and can be stored in a persistent form by a Data Set stage. However, a compressed data set cannot be processed by many stages until it is expanded, that is, until its rows are returned to their normal format. Only stages that neither perform column-based processing nor reorder the rows can operate on compressed data sets. For example, you can use the Copy stage to create a copy of the compressed data set.

Because compressing a data set removes its normal record boundaries, the compressed data set must not be repartitioned before it is expanded.
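The effect of losing record boundaries can be sketched in Python. This is a rough analogy using the standard `gzip` module, and cutting the byte stream in half is only an assumed stand-in for what repartitioning would do to a compressed data set. The helper name `try_expand` is hypothetical.

```python
import gzip
import zlib

# Build a compressed stream from 1000 sample rows (made-up data for illustration).
rows = [f"{i},customer_{i}" for i in range(1000)]
blob = gzip.compress("\n".join(rows).encode("utf-8"))

def try_expand(piece):
    """Return the expanded bytes, or None if the piece cannot be expanded."""
    try:
        return gzip.decompress(piece)
    except (OSError, EOFError, zlib.error):
        return None

# Cut the stream at an arbitrary byte offset, as if it had been redistributed.
mid = len(blob) // 2
first_half, second_half = blob[:mid], blob[mid:]

whole_ok = try_expand(blob) is not None
first_ok = try_expand(first_half) is not None
second_ok = try_expand(second_half) is not None
```

The whole stream expands, but neither piece does on its own: the first is truncated and the second no longer starts with a valid header. A record-oriented data set can be split between rows, whereas a compressed stream has no such safe split points.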

PROPERTIES

Options Category

Command. Specifies whether the stage will use compress (the default) or GZIP.
