Informatica PowerCenter®
Transformation Guide
Version 8.1.1
April 2007

Copyright (c) 1998–2007 Informatica Corporation.
All rights reserved. Printed in the USA.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing
restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be
reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as
provided in DFARS 227.7202-1(a) and 227.7202-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR
52.227-14 (ALT III), as applicable.

The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing.
Informatica Corporation does not warrant that this documentation is error free.

Informatica, PowerCenter, PowerCenterRT, PowerCenter Connect, PowerCenter Data Analyzer, PowerExchange, PowerMart, SuperGlue, Metadata Manager,
Informatica Data Quality and Informatica Data Explorer are trademarks or registered trademarks of Informatica Corporation in the United States and in
jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software and/or documentation are subject to copyright held by third parties, including without limitation: Copyright DataDirect Technologies,
1999-2002. All rights reserved. Copyright © Sun Microsystems. All Rights Reserved. Copyright © RSA Security Inc. All Rights Reserved. Copyright © Ordinal
Technology Corp. All Rights Reserved.

Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University and
University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.

Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU Lesser General
Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials are provided free of charge by
Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness
for a particular purpose.

Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark of Meta Integration
Technology, Inc.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The Apache Software is Copyright (c) 1999-2005 The
Apache Software Foundation. All rights reserved.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit and redistribution of this software is subject to terms available
at http://www.openssl.org. Copyright 1998-2003 The OpenSSL Project. All Rights Reserved.

The zlib library included with this software is Copyright (c) 1995-2003 Jean-loup Gailly and Mark Adler.

The Curl license provided with this Software is Copyright 1996-2004, Daniel Stenberg, <Daniel@haxx.se>. All Rights Reserved.

The PCRE library included with this software is Copyright (c) 1997-2001 University of Cambridge. Regular expression support is provided by the PCRE library
package, which is open source software, written by Philip Hazel. The source for this library may be found at ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre.

InstallAnywhere is Copyright 2005 Zero G Software, Inc. All Rights Reserved.

Portions of the Software are Copyright (c) 1998-2005 The OpenLDAP Foundation. All rights reserved. Redistribution and use in source and binary forms, with
or without modification, are permitted only as authorized by the OpenLDAP Public License, available at http://www.openldap.org/software/release/license.html.

This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775 and other U.S. Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied,
including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information provided in this
documentation may include technical inaccuracies or typographical errors. Informatica may make improvements and/or changes in the products described in
this documentation at any time without notice.
Table of Contents
List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxv

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxix
About This Book . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxx
     Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxx
Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxi
     Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . xxxi
     Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxi
     Visiting the Informatica Knowledge Base . . . . . . . . . . . . . . . . . . . . . . xxxi
     Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxxi


Chapter 1: Working with Transformations . . . . . . . . . . . . . . . . . . . . . . 1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Creating a Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Configuring Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Working with Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
     Creating Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
     Configuring Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
     Linking Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Multi-Group Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Working with Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
     Using the Expression Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Using Local Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
     Temporarily Store Data and Simplify Complex Expressions . . . . . . . . . . 14
     Store Values Across Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
     Capture Values from Stored Procedures . . . . . . . . . . . . . . . . . . . . . . . . 15
     Guidelines for Configuring Variable Ports . . . . . . . . . . . . . . . . . . . . . . 16
Using Default Values for Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
     Entering User-Defined Default Values . . . . . . . . . . . . . . . . . . . . . . . . . 20
     Entering User-Defined Default Input Values . . . . . . . . . . . . . . . . . . . . . 22
     Entering User-Defined Default Output Values . . . . . . . . . . . . . . . . . . . 25
     General Rules for Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
     Entering and Validating Default Values . . . . . . . . . . . . . . . . . . . . . . . . 28
Configuring Tracing Level in Transformations . . . . . . . . . . . . . . . . . . . . . . . 30
Reusable Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
     Instances and Inherited Changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
     Mapping Variables in Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
     Creating Reusable Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
     Promoting Non-Reusable Transformations . . . . . . . . . . . . . . . . . . . . . . 32
     Creating Non-Reusable Instances of Reusable Transformations . . . . . . . . 33
     Adding Reusable Transformations to Mappings . . . . . . . . . . . . . . . . . . . 33
     Modifying a Reusable Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 34


Chapter 2: Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . 37
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
     Ports in the Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 38
     Components of the Aggregator Transformation . . . . . . . . . . . . . . . . . . . 38
     Aggregate Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Aggregate Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
     Aggregate Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
     Nested Aggregate Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
     Conditional Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
     Non-Aggregate Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
     Null Values in Aggregate Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Group By Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
     Non-Aggregate Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
     Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Using Sorted Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
     Sorted Input Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
     Pre-Sorting Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Creating an Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51


Chapter 3: Custom Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
     Working with Transformations Built On the Custom Transformation . . . 54
     Code Page Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
     Distributing Custom Transformation Procedures . . . . . . . . . . . . . . . . . . 56
Creating Custom Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
     Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
     Custom Transformation Components . . . . . . . . . . . . . . . . . . . . . . . . . . 58
Working with Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
     Creating Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
     Editing Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
     Defining Port Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Working with Port Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Custom Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
     Setting the Update Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
     Working with Thread-Specific Procedure Code . . . . . . . . . . . . . . . . . . . 66
Working with Transaction Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
     Transformation Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
     Generate Transaction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
     Working with Transaction Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . 69
Blocking Input Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
     Writing the Procedure Code to Block Data . . . . . . . . . . . . . . . . . . . . . . 70
     Configuring Custom Transformations as Blocking Transformations . . . . 70
     Validating Mappings with Custom Transformations . . . . . . . . . . . . . . . 71
Working with Procedure Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Creating Custom Transformation Procedures . . . . . . . . . . . . . . . . . . . . . . . 73
     Step 1. Create the Custom Transformation . . . . . . . . . . . . . . . . . . . . . . 73
     Step 2. Generate the C Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
     Step 3. Fill Out the Code with the Transformation Logic . . . . . . . . . . . 76
     Step 4. Build the Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
     Step 5. Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
     Step 6. Run the Session in a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . 87


Chapter 4: Custom Transformation Functions . . . . . . . . . . . . . . . . . 89
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
     Working with Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Function Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
Working with Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
     Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Generated Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
     Initialization Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
     Notification Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
     Deinitialization Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
API Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
     Set Data Access Mode Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
     Navigation Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
     Property Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
     Rebind Datatype Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
     Data Handling Functions (Row-Based Mode) . . . . . . . . . . . . . . . . . . . 117
     Set Pass-Through Port Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
     Output Notification Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
     Data Boundary Output Notification Function . . . . . . . . . . . . . . . . . . . 121
     Error Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
     Session Log Message Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
     Increment Error Count Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
     Is Terminated Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
     Blocking Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
     Pointer Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
     Change String Mode Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
     Set Data Code Page Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
     Row Strategy Functions (Row-Based Mode) . . . . . . . . . . . . . . . . . . . . 128
     Change Default Row Strategy Function . . . . . . . . . . . . . . . . . . . . . . . 129
Array-Based API Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
     Maximum Number of Rows Functions . . . . . . . . . . . . . . . . . . . . . . . . 130
     Number of Rows Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
     Is Row Valid Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
     Data Handling Functions (Array-Based Mode) . . . . . . . . . . . . . . . . . . 132
     Row Strategy Functions (Array-Based Mode) . . . . . . . . . . . . . . . . . . . . 135
     Set Input Error Row Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
Java API Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
C++ API Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139


Chapter 5: Expression Transformation . . . . . . . . . . . . . . . . . . . . . . 141
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
     Calculating Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
     Adding Multiple Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
Creating an Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 143




Chapter 6: External Procedure Transformation . . . . . . . . . . . . . . . . 145
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
     Code Page Compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
     External Procedures and External Procedure Transformations . . . . . . . . 147
     External Procedure Transformation Properties . . . . . . . . . . . . . . . . . . . 147
     Pipeline Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
     COM Versus Informatica External Procedures . . . . . . . . . . . . . . . . . . . 148
     The BankSoft Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
Developing COM Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
     Steps for Creating a COM Procedure . . . . . . . . . . . . . . . . . . . . . . . . . 149
     COM External Procedure Server Type . . . . . . . . . . . . . . . . . . . . . . . . 149
     Using Visual C++ to Develop COM Procedures . . . . . . . . . . . . . . . . . 149
     Developing COM Procedures with Visual Basic . . . . . . . . . . . . . . . . . 156
Developing Informatica External Procedures . . . . . . . . . . . . . . . . . . . . . . . 159
     Step 1. Create the External Procedure Transformation . . . . . . . . . . . . . 159
     Step 2. Generate the C++ Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
     Step 3. Fill Out the Method Stub with Implementation . . . . . . . . . . . . 164
     Step 4. Building the Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
     Step 5. Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
     Step 6. Run the Session in a Workflow . . . . . . . . . . . . . . . . . . . . . . . . 167
Distributing External Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
     Distributing COM Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
     Distributing Informatica Modules . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
Development Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
     COM Datatypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
     Row-Level Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
     Return Values from Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
     Exceptions in Procedure Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
     Memory Management for Procedures . . . . . . . . . . . . . . . . . . . . . . . . . 173
     Wrapper Classes for Pre-Existing C/C++ Libraries or VB Functions . . . 173
     Generating Error and Tracing Messages . . . . . . . . . . . . . . . . . . . . . . . 173
     Unconnected External Procedure Transformations . . . . . . . . . . . . . . . . 175
     Initializing COM and Informatica Modules . . . . . . . . . . . . . . . . . . . . 175
     Other Files Distributed and Used in TX . . . . . . . . . . . . . . . . . . . . . . . 179
Service Process Variables in Initialization Properties . . . . . . . . . . . . . . . . . 180
External Procedure Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
     Dispatch Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
     External Procedure Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
     Property Access Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
     Parameter Access Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
     Code Page Access Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
     Transformation Name Access Functions . . . . . . . . . . . . . . . . . . . . . . . 185
     Procedure Access Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
     Partition Related Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
     Tracing Level Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187


Chapter 7: Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
Filter Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
Creating a Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196


Chapter 8: HTTP Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
     Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
     Connecting to the HTTP Server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Creating an HTTP Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
Configuring the Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
Configuring the HTTP Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
     Selecting a Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
     Configuring Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
     Configuring a URL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
     GET Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
     POST Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
     SIMPLE POST Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211


Chapter 9: Java Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
     Steps to Define a Java Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 214
     Active and Passive Java Transformations . . . . . . . . . . . . . . . . . . . . . . . 215
     Datatype Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
Using the Java Code Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
Configuring Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
     Creating Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
     Setting Default Port Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
Configuring Java Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . 221
     Working with Transaction Control . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
     Setting the Update Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
Developing Java Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
     Creating Java Code Snippets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
     Importing Java Packages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
     Defining Helper Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
     On Input Row Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
     On End of Data Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
     On Receiving Transaction Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
Configuring Java Transformation Settings . . . . . . . . . . . . . . . . . . . . . . . . . 229
     Configuring the Classpath . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
     Enabling High Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
Compiling a Java Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
Fixing Compilation Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
     Locating the Source of Compilation Errors . . . . . . . . . . . . . . . . . . . . . 232
     Identifying Compilation Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234


Chapter 10: Java Transformation API Reference . . . . . . . . . . . . . . . 237
Java Transformation API Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
commit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
failSession . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
generateRow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
getInRowType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
incrementErrorCount . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
isNull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
logInfo . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
logError . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
rollBack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
setNull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
setOutRowType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
     Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249


Chapter 11: Java Transformation Example . . . . . . . . . . . . . . . . . . . 251
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
Step 1. Import the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
Step 2. Create Transformation and Configure Ports . . . . . . . . . . . . . . . . . . 254
Step 3. Enter Java Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
     Import Packages Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
     Helper Code Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
     On Input Row Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
Step 4. Compile the Java Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Step 5. Create a Session and Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
     Sample Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262


Chapter 12: Java Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
     Expression Function Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
Using the Define Expression Dialog Box . . . . . . . . . . . . . . . . . . . . . . . . . . 266
     Step 1. Configure the Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
     Step 2. Create and Validate the Expression . . . . . . . . . . . . . . . . . . . . . 267
     Step 3. Generate Java Code for the Expression . . . . . . . . . . . . . . . . . . 267
     Steps to Create an Expression and Generate Java Code . . . . . . . . . . . . 268
     Java Expression Templates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269
Working with the Simple Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
     invokeJExpression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
     Simple Interface Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
Working with the Advanced Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
     Steps to Invoke an Expression with the Advanced Interface . . . . . . . . . 273
     Rules and Guidelines for Working with the Advanced Interface . . . . . . 273
     EDataType Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
     JExprParamMetadata Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
     defineJExpression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
     JExpression Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
     Advanced Interface Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
JExpression API Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
     invoke . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279
     getResultDataType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
     getResultMetadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
     isResultNull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280
     getInt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
     getDouble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
     getLong . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
     getStringBuffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
     getBytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282


Chapter 13: Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 283
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
     Working with the Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . 284
Joiner Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
Defining a Join Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
Defining the Join Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
     Normal Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289
     Master Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
     Detail Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
     Full Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
Using Sorted Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
     Configuring the Sort Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292
     Adding Transformations to the Mapping . . . . . . . . . . . . . . . . . . . . . . . 293
     Configuring the Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . 293
     Defining the Join Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294
Joining Data from a Single Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296
     Joining Two Branches of the Same Pipeline . . . . . . . . . . . . . . . . . . . . . 296
     Joining Two Instances of the Same Source . . . . . . . . . . . . . . . . . . . . . . 297
     Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
Blocking the Source Pipelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
     Unsorted Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
     Sorted Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
Working with Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300
     Preserving Transaction Boundaries for a Single Pipeline . . . . . . . . . . . . 301
     Preserving Transaction Boundaries in the Detail Pipeline . . . . . . . . . . . 301
     Dropping Transaction Boundaries for Two Pipelines . . . . . . . . . . . . . . 302
Creating a Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306


Chapter 14: Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . 307
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
Connected and Unconnected Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
     Connected Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 309
     Unconnected Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 310
Relational and Flat File Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
     Relational Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
     Flat File Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Lookup Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
     Lookup Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
     Lookup Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
     Lookup Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
     Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
     Metadata Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
Lookup Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
     Configuring Lookup Properties in a Session . . . . . . . . . . . . . . . . . . . . 320
Lookup Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
     Default Lookup Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
     Overriding the Lookup Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324
Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
     Uncached or Static Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
     Dynamic Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
     Handling Multiple Matches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
Configuring Unconnected Lookup Transformations . . . . . . . . . . . . . . . . . 331
     Step 1. Add Input Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
     Step 2. Add the Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . 332
     Step 3. Designate a Return Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
     Step 4. Call the Lookup Through an Expression . . . . . . . . . . . . . . . . . 333
Creating a Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336


Chapter 15: Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
     Cache Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
Building Connected Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
     Sequential Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
     Concurrent Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
Using a Persistent Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
     Using a Non-Persistent Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
     Using a Persistent Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
     Rebuilding the Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
Working with an Uncached Lookup or Static Cache . . . . . . . . . . . . . . . . . 344
Working with a Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . 345
     Using the NewLookupRow Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
     Using the Associated Input Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
     Working with Lookup Transformation Values . . . . . . . . . . . . . . . . . . . 349
     Using the Ignore Null Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
     Using the Ignore in Comparison Property . . . . . . . . . . . . . . . . . . . . . . 354
     Using Update Strategy Transformations with a Dynamic Cache . . . . . . 354
     Updating the Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . 356
     Using the WHERE Clause with a Dynamic Cache . . . . . . . . . . . . . . . 358
     Synchronizing the Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . 359
     Example Using a Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . 360
     Rules and Guidelines for Dynamic Caches . . . . . . . . . . . . . . . . . . . . . 361
Sharing the Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
     Sharing an Unnamed Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 363
     Sharing a Named Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
Lookup Cache Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369


Chapter 16: Normalizer Transformation . . . . . . . . . . . . . . . . . . . . . . 371
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
Normalizer Transformation Components . . . . . . . . . . . . . . . . . . . . . . . . . . 374
     Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
     Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
     Normalizer Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
Normalizer Transformation Generated Keys . . . . . . . . . . . . . . . . . . . . . . . 379
     Storing Generated Key Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379
     Changing the Generated Key Values . . . . . . . . . . . . . . . . . . . . . . . . . . 379
VSAM Normalizer Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
     VSAM Normalizer Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
     VSAM Normalizer Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383
     Steps to Create a VSAM Normalizer Transformation . . . . . . . . . . . . . . 385
Pipeline Normalizer Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
     Pipeline Normalizer Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
     Pipeline Normalizer Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
     Steps to Create a Pipeline Normalizer Transformation . . . . . . . . . . . . . 391
Using a Normalizer Transformation in a Mapping . . . . . . . . . . . . . . . . . . . 394
     Generating Key Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399


Chapter 17: Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 401
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
     Ranking String Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
     Rank Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
     Rank Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403
Ports in a Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
     Rank Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
Defining Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
Creating a Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406


Chapter 18: Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 409
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
Working with Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
     Input Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
     Output Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
     Using Group Filter Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
     Adding Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
Working with Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416
Connecting Router Transformations in a Mapping . . . . . . . . . . . . . . . . . . 418
Creating a Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420


Chapter 19: Sequence Generator Transformation . . . . . . . . . . . . . . 421
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
Common Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
     Creating Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
     Replacing Missing Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
Sequence Generator Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
     NEXTVAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424
     CURRVAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426
Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
     Start Value and Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
     Increment By . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428
     End Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
     Current Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
     Number of Cached Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430
     Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
Creating a Sequence Generator Transformation . . . . . . . . . . . . . . . . . . . . . 432


Chapter 20: Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 435
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
Sorting Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
Sorter Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
     Sorter Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
     Case Sensitive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
     Work Directory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
     Distinct Output Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
     Tracing Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441
     Null Treated Low . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
     Transformation Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
Creating a Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443


Chapter 21: Source Qualifier Transformation . . . . . . . . . . . . . . . . . 445
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
     Transformation Datatypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
     Target Load Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446
     Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
Source Qualifier Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . 449
Default Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451
     Viewing the Default Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
     Overriding the Default Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452
Joining Source Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
     Default Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
     Custom Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455
     Heterogeneous Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
     Creating Key Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456
Adding an SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458
Entering a User-Defined Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460
Outer Join Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
     Informatica Join Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
     Creating an Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
     Common Database Syntax Restrictions . . . . . . . . . . . . . . . . . . . . . . . . 469
Entering a Source Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470
Using Sorted Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
Select Distinct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474
     Overriding Select Distinct in the Session . . . . . . . . . . . . . . . . . . . . . . 474
Adding Pre- and Post-Session SQL Commands . . . . . . . . . . . . . . . . . . . . . 475
Creating a Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . 476
     Creating a Source Qualifier Transformation By Default . . . . . . . . . . . . 476
     Creating a Source Qualifier Transformation Manually . . . . . . . . . . . . . 476
     Configuring Source Qualifier Transformation Options . . . . . . . . . . . . . 476
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478


              Chapter 22: SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
              Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
              Script Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481
                   Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482
     Script Mode Rules and Guidelines . . . . . . . . . . . . 482
Query Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484
     Using Static SQL Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485
     Using Dynamic SQL Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486
     Query Mode Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
Connecting to Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 490
     Using a Static Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . 490
     Passing a Logical Database Connection . . . . . . . . . . . . . . . . . . . . . . . . 490
     Passing Full Connection Information . . . . . . . . . . . . . . . . . . . . . . . . . 490
     Database Connections Rules and Guidelines . . . . . . . . . . . . . . . . . . . . 493
Session Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
     Input Row to Output Row Cardinality . . . . . . . . . . . . . . . . . . . . . . . . 494
     Transaction Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498
     High Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498
Creating an SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500
SQL Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
     Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
     SQL Settings Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
     SQL Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506
SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509


Chapter 23: Using the SQL Transformation in a Mapping . . . . . . . . 511
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
Dynamic Update Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
     Defining the Source File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514
     Creating a Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
     Creating the Database Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515
     Configuring the Expression Transformation . . . . . . . . . . . . . . . . . . . . 516
     Defining the SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 516
     Configuring Session Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
     Target Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518
Dynamic Connection Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519
     Defining the Source File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
     Creating a Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
     Creating the Database Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520
     Creating the Database Connections . . . . . . . . . . . . . . . . . . . . . . . . . . 521
     Configuring the Expression Transformation . . . . . . . . . . . . . . . . . . . . 521
     Defining the SQL Transformation . . . . . . . . . . . . 522
     Configuring Session Attributes . . . . . . . . . . . . 524
     Target Data Results . . . . . . . . . . . . 524


Chapter 24: Stored Procedure Transformation . . . . . . . . . . . . 525
Overview . . . . . . . . . . . . 526
Connected and Unconnected Transformations . . . . . . . . . . . . 527
Input and Output Data . . . . . . . . . . . . 528
     Input/Output Parameters . . . . . . . . . . . . 528
     Return Values . . . . . . . . . . . . 528
     Status Codes . . . . . . . . . . . . 528
Running a Stored Procedure . . . . . . . . . . . . 529
     Stored Procedure Types . . . . . . . . . . . . 529
     Executing Stored Procedures with a Database Connection . . . . . . . . . . . . 529
Using a Stored Procedure in a Mapping . . . . . . . . . . . . 531
Writing a Stored Procedure . . . . . . . . . . . . 532
     Sample Stored Procedure . . . . . . . . . . . . 532
Creating a Stored Procedure Transformation . . . . . . . . . . . . 535
     Importing Stored Procedures . . . . . . . . . . . . 535
     Manually Creating Stored Procedure Transformations . . . . . . . . . . . . 537
     Setting Options for the Stored Procedure . . . . . . . . . . . . 538
     Using $Source and $Target Variables . . . . . . . . . . . . 539
     Changing the Stored Procedure . . . . . . . . . . . . 540
Configuring a Connected Transformation . . . . . . . . . . . . 541
Configuring an Unconnected Transformation . . . . . . . . . . . . 542
     Calling a Stored Procedure From an Expression . . . . . . . . . . . . 542
     Calling a Pre- or Post-Session Stored Procedure . . . . . . . . . . . . 545
Error Handling . . . . . . . . . . . . 548
     Pre-Session Errors . . . . . . . . . . . . 548
     Post-Session Errors . . . . . . . . . . . . 549
     Session Errors . . . . . . . . . . . . 549
Supported Databases . . . . . . . . . . . . 550
     SQL Declaration . . . . . . . . . . . . 550
     Parameter Types . . . . . . . . . . . . 550
     Input/Output Port in Mapping . . . . . . . . . . . . 550
     Type of Return Value Supported . . . . . . . . . . . . 551
Expression Rules . . . . . . . . . . . . 552
Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554


Chapter 25: Transaction Control Transformation . . . . . . . . . . . . . . 555
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556
Transaction Control Transformation Properties . . . . . . . . . . . . . . . . . . . . . 557
     Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557
     Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
Using Transaction Control Transformations in Mappings . . . . . . . . . . . . . . 560
     Sample Transaction Control Mappings with Multiple Targets . . . . . . . 561
Mapping Guidelines and Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564
Creating a Transaction Control Transformation . . . . . . . . . . . . . . . . . . . . . 565


Chapter 26: Union Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 567
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568
     Union Transformation Rules and Guidelines . . . . . . . . . . . . . . . . . . . . 568
     Union Transformation Components . . . . . . . . . . . . . . . . . . . . . . . . . . 568
Working with Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570
Creating a Union Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572
Using a Union Transformation in Mappings . . . . . . . . . . . . . . . . . . . . . . . 574


Chapter 27: Update Strategy Transformation . . . . . . . . . . . . . . . . . 575
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
     Setting the Update Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576
Flagging Rows Within a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
     Forwarding Rejected Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
     Update Strategy Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577
     Aggregator and Update Strategy Transformations . . . . . . . . . . . . . . . . 578
     Lookup and Update Strategy Transformations . . . . . . . . . . . . . . . . . . . 579
Setting the Update Strategy for a Session . . . . . . . . . . . . . . . . . . . . . . . . . 580
     Specifying an Operation for All Rows . . . . . . . . . . . . . . . . . . . . . . . . . 580
     Specifying Operations for Individual Target Tables . . . . . . . . . . . . . . . 581
Update Strategy Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583


Chapter 28: XML Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . 585
XML Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
XML Parser Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587
XML Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588


Index . . . . . . . . . . . . 589


List of Figures
    Figure   1-1. Sample Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
    Figure   1-2. Example of Input, Output, and Input/Output Ports . . . . . . . . . . . . . . . . . . . . . . . 8
    Figure   1-3. Sample Input and Output Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
    Figure   1-4. Expression Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
    Figure   1-5. Variable Ports Store Values Across Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
    Figure   1-6. Default Value for Input and Input/Output Ports . . . . . . . . . . . . . . . . . . . . . . . . . 19
    Figure   1-7. Default Value for Output Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
    Figure   1-8. Using a Constant as a Default Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
    Figure   1-9. Using the ERROR Function to Skip Null Input Values . . . . . . . . . . . . . . . . . . . . 24
    Figure   1-10. Entering and Validating Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
    Figure   1-11. Reverting to Original Reusable Transformation Properties . . . . . . . . . . . . . . . . . 35
    Figure   2-1. Sample Mapping with Aggregator and Sorter Transformations . . . . . . . . . . . . . . . 46
    Figure   3-1. Custom Transformation Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
    Figure   3-2. Editing Port Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
    Figure   3-3. Port Attribute Definitions Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
    Figure   3-4. Edit Port Attribute Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
    Figure   3-5. Custom Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
    Figure   3-6. Custom Transformation Ports Tab - Union Example . . . . . . . . . . . . . . . . . . . . . . 74
    Figure   3-7. Custom Transformation Properties Tab - Union Example . . . . . . . . . . . . . . . . . . 75
    Figure   3-8. Mapping with a Custom Transformation - Union Example . . . . . . . . . . . . . . . . . 87
    Figure   4-1. Custom Transformation Handles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
    Figure   6-1. Process for Distributing External Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
    Figure   6-2. External Procedure Transformation Initialization Properties . . . . . . . . . . . . . . . . 178
    Figure   6-3. External Procedure Transformation Initialization Properties Tab . . . . . . . . . . . . 180
    Figure   7-1. Sample Mapping with a Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 190
    Figure   7-2. Specifying a Filter Condition in a Filter Transformation . . . . . . . . . . . . . . . . . . 191
    Figure   8-1. HTTP Transformation Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
    Figure   8-2. HTTP Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
    Figure   8-3. HTTP Transformation Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
    Figure   8-4. HTTP Transformation HTTP Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
    Figure   8-5. HTTP Tab for a GET Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
    Figure   8-6. HTTP Tab for a POST Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
    Figure   8-7. HTTP Tab for a SIMPLE POST Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
    Figure   9-1. Java Code Tab Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
    Figure   9-2. Java Transformation Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
    Figure   9-3. Java Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
    Figure   9-4. Java Transformation Settings Dialog Box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
    Figure   9-5. Highlighted Error in Code Entry Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
    Figure   9-6. Highlighted Error in Full Code Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
    Figure   11-1. Java Transformation Example - Sample Mapping . . . . . . . . . . . . . . . . . . . . . . . 253
    Figure 11-2. Java Transformation Example - Ports Tab . . . . . . . . . . . . 255
    Figure 11-3. Java Transformation Example - Import Packages Tab . . . . . . . . . . . . 257
    Figure 11-4. Java Transformation Example - Helper Code Tab . . . . . . . . . . . . 258
    Figure 11-5. Java Transformation Example - On Input Row Tab . . . . . . . . . . . . 260
    Figure 11-6. Java Transformation Example - Successful Compilation . . . . . . . . . . . . 261
    Figure 12-1. Define Expression Dialog Box . . . . . . . . . . . . 267
    Figure 12-2. Java Expressions Code Entry Tab . . . . . . . . . . . . 268
    Figure 13-1. Mapping with Master and Detail Pipelines . . . . . . . . . . . . 284
    Figure 13-2. Joiner Transformation Properties Tab . . . . . . . . . . . . 286
    Figure 13-3. Mapping Configured to Join Data from Two Pipelines . . . . . . . . . . . . 295
    Figure 13-4. Mapping that Joins Two Branches of a Pipeline . . . . . . . . . . . . 297
    Figure 13-5. Mapping that Joins Two Instances of the Same Source . . . . . . . . . . . . 297
    Figure 13-6. Preserving Transaction Boundaries when You Join Two Pipeline Branches . . . . . . . . . . . . 301
    Figure 14-1. Session Properties for Flat File Lookups . . . . . . . . . . . . 321
    Figure 14-2. Return Port in a Lookup Transformation . . . . . . . . . . . . 333
    Figure 15-1. Building Lookup Caches Sequentially . . . . . . . . . . . . 340
    Figure 15-2. Building Lookup Caches Concurrently . . . . . . . . . . . . 341
    Figure 15-3. Mapping with a Dynamic Lookup Cache . . . . . . . . . . . . 346
    Figure 15-4. Dynamic Lookup Transformation Ports Tab . . . . . . . . . . . . 347
    Figure 15-5. Using Update Strategy Transformations with a Lookup Transformation . . . . . . . . . . . . 355
    Figure 15-6. Slowly Changing Dimension Mapping with Dynamic Lookup Cache . . . . . . . . . . . . 360
    Figure 16-1. Normalizer Transformation Ports . . . . . . . . . . . . 374
    Figure 16-2. Normalizer Ports Tab . . . . . . . . . . . . 375
    Figure 16-3. Normalizer Transformation Properties Tab . . . . . . . . . . . . 376
    Figure 16-4. Normalizer Tab . . . . . . . . . . . . 377
    Figure 16-5. COBOL Source Definition Example . . . . . . . . . . . . 381
    Figure 16-6. Sales File VSAM Normalizer Transformation . . . . . . . . . . . . 381
    Figure 16-7. VSAM Normalizer Ports Tab . . . . . . . . . . . . 383
    Figure 16-8. Normalizer Tab for a VSAM Normalizer Transformation . . . . . . . . . . . . 384
    Figure 16-9. Pipeline Normalizer Columns . . . . . . . . . . . . 387
    Figure 16-10. Pipeline Normalizer Ports . . . . . . . . . . . . 387
    Figure 16-11. Pipeline Normalizer Ports Tab . . . . . . . . . . . . 389
    Figure 16-12. Normalizer Tab . . . . . . . . . . . . 390
    Figure 16-13. Grouping Repeated Columns on the Normalizer Tab . . . . . . . . . . . . 391
    Figure 16-14. Group-Level Column on the Normalizer Tab . . . . . . . . . . . . 393
    Figure 16-15. Sales File COBOL Source . . . . . . . . . . . . 394
    Figure 16-16. Multiple Record Types Routed to Different Targets . . . . . . . . . . . . 395
    Figure 16-17. Router Transformation User-Defined Groups . . . . . . . . . . . . 396
    Figure 16-18. COBOL Source with a Multiple-Occurring Group of Columns . . . . . . . . . . . . 397
    Figure 16-19. Generated Keys in Target Tables . . . . . . . . . . . . 397
    Figure 16-20. Generated Keys Mapped to Target Keys . . . . . . . . . . . . 398
    Figure 17-1. Sample Mapping with a Rank Transformation . . . . . . . . . . . . 402
    Figure 18-1. Comparing Router and Filter Transformations . . . . . . . . . . . . 410
    Figure 18-2. Sample Router Transformation . . . . . . . . . . . . 411
    Figure 18-3. Using a Router Transformation in a Mapping . . . . . . . . . . . . 413
    Figure 18-4. Specifying Group Filter Conditions . . . . . . . . . . . . 414
    Figure 18-5. Router Transformation Ports Tab . . . . . . . . . . . . 416
    Figure 18-6. Input Port Name and Corresponding Output Port Names . . . . . . . . . . . . 417
    Figure 19-1. Connecting NEXTVAL to Two Target Tables in a Mapping . . . . . . . . . . . . 424
    Figure 19-2. Mapping with a Sequence Generator and an Expression Transformation . . . . . . . . . . . . 425
    Figure 19-3. Connecting CURRVAL and NEXTVAL Ports to a Target . . . . . . . . . . . . 426
    Figure 20-1. Sample Mapping with a Sorter Transformation . . . . . . . . . . . . 436
    Figure 20-2. Sample Sorter Transformation Ports Configuration . . . . . . . . . . . . 437
    Figure 20-3. Sorter Transformation Properties . . . . . . . . . . . . 439
    Figure 21-1. Source Definition Connected to a Source Qualifier Transformation . . . . . . . . . . . . 451
    Figure 21-2. Joining Two Tables with One Source Qualifier Transformation . . . . . . . . . . . . 455
    Figure 21-3. Creating a Relationship Between Two Tables . . . . . . . . . . . . 457
    Figure 22-1. SQL Transformation Script Mode Ports . . . . . . . . . . . . 481
    Figure 22-2. SQL Editor for an SQL Transformation Query . . . . . . . . . . . . 484
    Figure 22-3. SQL Transformation Static Query Mode Ports . . . . . . . . . . . . 486
    Figure 22-4. SQL Transformation Ports to Pass a Full Dynamic Query . . . . . . . . . . . . 487
    Figure 22-5. SQL Transformation Properties Tab . . . . . . . . . . . . 503
    Figure 22-6. SQL Settings Tab . . . . . . . . . . . . 505
    Figure 22-7. SQL Transformation SQL Ports Tab . . . . . . . . . . . . 507
    Figure 23-1. Dynamic Query Mapping . . . . . . . . . . . . 513
    Figure 23-2. Dynamic Query Expression Transformation Ports . . . . . . . . . . . . 516
    Figure 23-3. Dynamic Query SQL Transformation Ports Tab . . . . . . . . . . . . 517
    Figure 23-4. Dynamic Connection Mapping . . . . . . . . . . . . 519
    Figure 23-5. Dynamic Query Example Expression Transformation Ports . . . . . . . . . . . . 521
    Figure 23-6. Dynamic Connection Example SQL Transformation Ports . . . . . . . . . . . . 523
    Figure 24-1. Sample Mapping with a Stored Procedure Transformation . . . . . . . . . . . . 541
    Figure 24-2. Expression Transformation Referencing a Stored Procedure Transformation . . . . . . . . . . . . 542
    Figure 24-3. Stored Procedure Error Handling . . . . . . . . . . . . 548
    Figure 25-1. Transaction Control Transformation Properties . . . . . . . . . . . . 557
    Figure 25-2. Sample Transaction Control Mapping . . . . . . . . . . . . 559
    Figure 25-3. Effective and Ineffective Transaction Control Transformations . . . . . . . . . . . . 561
    Figure 25-4. Transaction Control Transformation Effective for a Transformation . . . . . . . . . . . . 561
    Figure 25-5. Valid Mapping with Transaction Control Transformations . . . . . . . . . . . . 562
    Figure 25-6. Invalid Mapping with Transaction Control Transformations . . . . . . . . . . . . 563
    Figure 26-1. Union Transformation Groups Tab . . . . . . . . . . . . 570
    Figure 26-2. Union Transformation Group Ports Tab . . . . . . . . . . . . 571
    Figure 26-3. Union Transformation Ports Tab . . . . . . . . . . . . 571
    Figure 26-4. Mapping with a Union Transformation . . . . . . . . . . . . 574
    Figure 27-1. Specifying Operations for Individual Target Tables . . . . . . . . . . . . 582


List of Tables
    Table 1-1. Transformation Descriptions . . . . . . . . . . . . 2
    Table 1-2. Multi-Group Transformations . . . . . . . . . . . . 9
    Table 1-3. Transformations Containing Expressions . . . . . . . . . . . . 11
    Table 1-4. Variable Usage . . . . . . . . . . . . 14
    Table 1-5. System Default Values and Integration Service Behavior . . . . . . . . . . . . 18
    Table 1-6. Transformations Supporting User-Defined Default Values . . . . . . . . . . . . 20
    Table 1-7. Default Values for Input and Input/Output Ports . . . . . . . . . . . . 22
    Table 1-8. Supported Default Values for Output Ports . . . . . . . . . . . . 26
    Table 1-9. Session Log Tracing Levels . . . . . . . . . . . . 30
    Table 3-1. Custom Transformation Properties . . . . . . . . . . . . 64
    Table 3-2. Transaction Boundary Handling with Custom Transformations . . . . . . . . . . . . 69
    Table 3-3. Module File Names . . . . . . . . . . . . 85
    Table 3-4. UNIX Commands to Build the Shared Library . . . . . . . . . . . . 86
    Table 4-1. Custom Transformation Handles . . . . . . . . . . . . 91
    Table 4-2. Custom Transformation Generated Functions . . . . . . . . . . . . 92
    Table 4-3. Custom Transformation API Functions . . . . . . . . . . . . 92
    Table 4-4. Custom Transformation Array-Based API Functions . . . . . . . . . . . . 94
    Table 4-5. INFA_CT_MODULE Property IDs . . . . . . . . . . . . 109
    Table 4-6. INFA_CT_PROC_HANDLE Property IDs . . . . . . . . . . . . 110
    Table 4-7. INFA_CT_TRANS_HANDLE Property IDs . . . . . . . . . . . . 111
    Table 4-8. INFA_CT_INPUT_GROUP and INFA_CT_OUTPUT_GROUP Handle Property IDs . . . . . . . . . . . . 112
    Table 4-9. INFA_CT_INPUTPORT and INFA_CT_OUTPUTPORT_HANDLE Handle Property IDs . . . . . . . . . . . . 113
    Table 4-10. Property Functions (MBCS) . . . . . . . . . . . . 115
    Table 4-11. Property Functions (Unicode) . . . . . . . . . . . . 115
    Table 4-12. Compatible Datatypes . . . . . . . . . . . . 116
    Table 4-13. Get Data Functions . . . . . . . . . . . . 118
    Table 4-14. Get Data Functions (Array-Based Mode) . . . . . . . . . . . . 133
    Table 6-1. Differences Between COM and Informatica External Procedures . . . . . . . . . . . . 148
    Table 6-2. Visual C++ and Transformation Datatypes . . . . . . . . . . . . 171
    Table 6-3. Visual Basic and Transformation Datatypes . . . . . . . . . . . . 171
    Table 6-4. External Procedure Initialization Properties . . . . . . . . . . . . 180
    Table 6-5. Descriptions of Parameter Access Functions . . . . . . . . . . . . 183
    Table 6-6. Member Variable of the External Procedure Base Class . . . . . . . . . . . . 185
    Table 8-1. HTTP Transformation Properties . . . . . . . . . . . . 202
    Table 8-2. HTTP Transformation Methods . . . . . . . . . . . . 205
    Table 8-3. GET Method Groups and Ports . . . . . . . . . . . . 206
    Table 8-4. POST Method Groups and Ports . . . . . . . . . . . . 206
    Table 8-5. SIMPLE POST Method Groups and Ports . . . . . . . . . . . . 207
    Table 9-1. Mapping from PowerCenter Datatypes to Java Datatypes . . . . . . . . . . . . 215
    Table 9-2. Java Transformation Properties . . . . . . . . . . . . 221
    Table 11-1. Input and Output Ports . . . . . . . . . . . . 254
    Table 12-1. Enumerated Java Datatypes . . . . . . . . . . . . 274
    Table 12-2. JExpression API Methods . . . . . . . . . . . . 276
    Table 13-1. Joiner Transformation Properties . . . . . . . . . . . . 286
    Table 13-2. Integration Service Behavior with Transformation Scopes for the Joiner Transformation . . . . . . . . . . . . 300
    Table 14-1. Differences Between Connected and Unconnected Lookups . . . . . . . . . . . . 309
    Table 14-2. Lookup Transformation Port Types . . . . . . . . . . . . 314
    Table 14-3. Lookup Transformation Properties . . . . . . . . . . . . 316
    Table 14-4. Session Properties for Flat File Lookups . . . . . . . . . . . . 322
    Table 15-1. Lookup Caching Comparison . . . . . . . . . . . . 339
    Table 15-2. Integration Service Handling of Persistent Caches . . . . . . . . . . . . 343
    Table 15-3. NewLookupRow Values . . . . . . . . . . . . 348
    Table 15-4. Dynamic Lookup Cache Behavior for Insert Row Type . . . . . . . . . . . . 357
    Table 15-5. Dynamic Lookup Cache Behavior for Update Row Type . . . . . . . . . . . . 358
    Table 15-6. Location for Sharing Unnamed Cache . . . . . . . . . . . . 364
    Table 15-7. Properties for Sharing Unnamed Cache . . . . . . . . . . . . 364
    Table 15-8. Location for Sharing Named Cache . . . . . . . . . . . . 367
    Table 15-9. Properties for Sharing Named Cache . . . . . . . . . . . . 367
    Table 16-1. Normalizer Transformation Properties . . . . . . . . . . . . 376
    Table 16-2. Normalizer Tab Columns . . . . . . . . . . . . 378
    Table 16-3. Normalizer Tab for a VSAM Normalizer Transformation . . . . . . . . . . . . 384
    Table 16-4. Pipeline Normalizer Tab . . . . . . . . . . . . 390
    Table 17-1. Rank Transformation Ports . . . . . . . . . . . . 404
    Table 17-2. Rank Transformation Properties . . . . . . . . . . . . 407
    Table 19-1. Sequence Generator Transformation Properties . . . . . . . . . . . . 427
    Table 20-1. Column Sizes for Sorter Data Calculations . . . . . . . . . . . . 440
    Table 21-1. Conversion for Datetime Mapping Parameters and Variables . . . . . . . . . . . . 447
    Table 21-2. Source Qualifier Transformation Properties . . . . . . . . . . . . 449
    Table 21-3. Locations for Entering Outer Join Syntax . . . . . . . . . . . . 463
    Table 21-4. Syntax for Normal Joins in a Join Override . . . . . . . . . . . . 463
    Table 21-5. Syntax for Left Outer Joins in a Join Override . . . . . . . . . . . . 465
    Table 21-6. Syntax for Right Outer Joins in a Join Override . . . . . . . . . . . . 467
    Table 22-1. Full Database Connection Ports . . . . . . . . . . . . 491
    Table 22-2. Native Connect String Syntax . . . . . . . . . . . . 491
    Table 22-3. Output Rows By Query Statement - Query Mode . . . . . . . . . . . . 495
    Table 22-4. NumRowsAffected Rows by Query Statement - Query Mode . . . . . . . . . . . . 495
    Table 22-5. Output Rows by Query Statement - Query Mode . . . . . . . . . . . . 497
    Table 22-6. SQL Transformation Connection Options . . . . . . . . . . . . 501
    Table 22-7. SQL Transformation Properties . . . . . . . . . . . . 503
    Table 22-8. SQL Settings Tab Attributes . . . . . . . . . . . . 505
    Table 22-9. SQL Transformation Ports . . . . . . . . . . . . 507
    Table 22-10. Standard SQL Statements . . . . . . . . . . . . 509
    Table 24-1. Connected and Unconnected Stored Procedure Transformation Tasks . . . . . . . . . . . . 527
    Table 24-2. Setting Options for the Stored Procedure Transformation . . . . . . . . . . . . 538
    Table 27-1. Constants for Each Database Operation . . . . . . . . . . . . 577
    Table 27-2. Specifying an Operation for All Rows . . . . . . . . . . . . 580
    Table 27-3. Update Strategy Settings . . . . . . . . . . . . 581


Preface


   Welcome to PowerCenter, the Informatica software product that delivers an open, scalable
   data integration solution addressing the complete life cycle for all data integration projects
   including data warehouses, data migration, data synchronization, and information hubs.
   PowerCenter combines the latest technology enhancements for reliably managing data
   repositories and delivering information resources in a timely, usable, and efficient manner.
   The PowerCenter repository coordinates and drives a variety of core functions, including
   extracting, transforming, loading, and managing data. The Integration Service can extract
   large volumes of data from multiple platforms, handle complex transformations on the data,
   and support high-speed loads. PowerCenter can simplify and accelerate the process of
   building a comprehensive data warehouse from disparate data sources.


About This Book
                The Transformation Guide is written for the developers and software engineers responsible for
                implementing your data warehouse. The Transformation Guide assumes that you have a solid
                understanding of your operating systems, relational database concepts, and the database
                engines, flat files, or mainframe system in your environment. This guide also assumes that
                you are familiar with the interface requirements for your supporting applications.


        Document Conventions
                This guide uses the following formatting conventions:

                 If you see…                          It means…

                 italicized text                      The word or set of words is especially emphasized.

                 boldfaced text                       Emphasized subjects.

                 italicized monospaced text           This is the variable name for a value you enter as part of an
                                                      operating system command. This is generic text that should be
                                                      replaced with user-supplied values.

                 Note:                                The following paragraph provides additional facts.

                 Tip:                                 The following paragraph provides suggested uses.

                 Warning:                             The following paragraph notes situations where you can overwrite
                                                      or corrupt data, unless you follow the specified procedure.

                 monospaced text                      This is a code example.

                 bold monospaced text                 This is an operating system command you enter from a prompt to
                                                      run a task.


Other Informatica Resources
       In addition to the product manuals, Informatica provides these other resources:
       ♦   Informatica Customer Portal
       ♦   Informatica web site
       ♦   Informatica Knowledge Base
       ♦   Informatica Technical Support


    Visiting Informatica Customer Portal
       As an Informatica customer, you can access the Informatica Customer Portal site at
       http://my.informatica.com. The site contains product information, user group information,
       newsletters, access to the Informatica customer support case management system (ATLAS),
       the Informatica Knowledge Base, Informatica Documentation Center, and access to the
       Informatica user community.


    Visiting the Informatica Web Site
       You can access the Informatica corporate web site at http://www.informatica.com. The site
       contains information about Informatica, its background, upcoming events, and sales offices.
       You will also find product and partner information. The services area of the site includes
       important information about technical support, training and education, and implementation
       services.


    Visiting the Informatica Knowledge Base
       As an Informatica customer, you can access the Informatica Knowledge Base at
       http://my.informatica.com. Use the Knowledge Base to search for documented solutions to
       known technical issues about Informatica products. You can also find answers to frequently
       asked questions, technical white papers, and technical tips.


    Obtaining Technical Support
       There are many ways to access Informatica Technical Support. You can contact a Technical
       Support Center by using the telephone numbers listed in the following table, you can send
       email, or you can use the WebSupport Service.
       Use the following email addresses to contact Informatica Technical Support:
       ♦   support@informatica.com for technical inquiries
       ♦   support_admin@informatica.com for general customer service requests


WebSupport requires a user name and password. You can request a user name and password at
                  http://my.informatica.com.

        North America / South America
        Informatica Corporation Headquarters
        100 Cardinal Way
        Redwood City, California 94063
        United States
        Toll Free: 877 463 2435
        Standard Rate: United States: 650 385 5800

        Europe / Middle East / Africa
        Informatica Software Ltd.
        6 Waltham Park
        Waltham Road, White Waltham
        Maidenhead, Berkshire SL6 3TN
        United Kingdom
        Toll Free: 00 800 4632 4357
        Standard Rate: Belgium: +32 15 281 702
                       France: +33 1 41 38 92 26
                       Germany: +49 1805 702 702
                       Netherlands: +31 306 022 797
                       United Kingdom: +44 1628 511 445

        Asia / Australia
        Informatica Business Solutions Pvt. Ltd.
        Diamond District
        Tower B, 3rd Floor
        150 Airport Road
        Bangalore 560 008
        India
        Toll Free: Australia: 1 800 151 830
                   Singapore: 001 800 4632 4357
        Standard Rate: India: +91 80 4112 5738


Chapter 1




Working with Transformations
   This chapter includes the following topics:
   ♦   Overview, 2
   ♦   Creating a Transformation, 5
   ♦   Configuring Transformations, 6
   ♦   Working with Ports, 7
   ♦   Multi-Group Transformations, 9
   ♦   Working with Expressions, 10
   ♦   Using Local Variables, 14
   ♦   Using Default Values for Ports, 18
   ♦   Configuring Tracing Level in Transformations, 30
   ♦   Reusable Transformations, 31


Overview
            A transformation is a repository object that generates, modifies, or passes data. The Designer
            provides a set of transformations that perform specific functions. For example, an Aggregator
            transformation performs calculations on groups of data.
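To make that concrete, here is a sketch of the kind of aggregate expression an
Aggregator transformation output port might use. The STORE_ID, QUANTITY, and
UNIT_PRICE port names are assumptions for illustration, not objects defined in
this guide. With STORE_ID selected as the group by port, the Integration
Service evaluates the expression once for each group:

     -- Aggregate expression on an Aggregator output port.
     -- Returns the total order value for each STORE_ID group.
     SUM(QUANTITY * UNIT_PRICE)
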
            Transformations in a mapping represent the operations the Integration Service performs on
            the data. Data passes through transformation ports that you link in a mapping or mapplet.
            Transformations can be active or passive. An active transformation can change the number of
            rows that pass through it, such as a Filter transformation that removes rows that do not meet
            the filter condition. A passive transformation does not change the number of rows that pass
            through it, such as an Expression transformation that performs a calculation on data and
            passes all rows through the transformation.
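As a sketch in the transformation language (the SALES and CUST_NAME port names
are assumptions for illustration), the difference shows up in the expressions
you configure. The first expression below is a filter condition: rows that do
not satisfy it are dropped, so the Filter transformation is active. The second
is an output port expression that is evaluated for every row and passes every
row through, so the Expression transformation is passive:

     -- Filter condition (active): rows where SALES <= 30 are removed
     SALES > 30

     -- Expression output port (passive): changes a value, not the row count
     UPPER(CUST_NAME)
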
Transformations can be connected to the data flow, or they can be unconnected. An
unconnected transformation is not connected to other transformations in the mapping.
Instead, it is called from within another transformation and returns a value to that
transformation.
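For example, you can call an unconnected Lookup transformation from an
expression in another transformation with the :LKP reference qualifier. A
minimal sketch, assuming a Lookup transformation named lkp_ITEM_PRICE that
takes an item ID and returns a price (the port and transformation names are
hypothetical, not from this guide):

     -- :LKP calls the unconnected Lookup transformation lkp_ITEM_PRICE,
     -- which returns a single value to this expression
     IIF(ISNULL(ITEM_ID), 0, :LKP.lkp_ITEM_PRICE(ITEM_ID))
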
            Table 1-1 provides a brief description of each transformation:

            Table 1-1. Transformation Descriptions

             Transformation                   Type           Description

             Aggregator                       Active/        Performs aggregate calculations.
                                              Connected

             Application Source Qualifier     Active/        Represents the rows that the Integration Service reads from an
                                              Connected      application, such as an ERP source, when it runs a session.

             Custom                           Active or      Calls a procedure in a shared library or DLL.
                                              Passive/
                                              Connected

             Expression                       Passive/       Calculates a value.
                                              Connected

             External Procedure               Passive/       Calls a procedure in a shared library or in the COM layer of
                                              Connected or   Windows.
                                              Unconnected

             Filter                           Active/        Filters data.
                                              Connected

             HTTP                             Passive/       Connects to an HTTP server to read or update data.
                                              Connected

             Input                            Passive/       Defines mapplet input rows. Available in the Mapplet Designer.
                                              Connected

             Java                             Active or      Executes user logic coded in Java. The byte code for the user logic
                                              Passive/       is stored in the repository.
                                              Connected

 Joiner                        Active/        Joins data from different databases or flat file systems.
                               Connected

 Lookup                        Passive/       Looks up values.
                               Connected or
                               Unconnected

 Normalizer                    Active/        Source qualifier for COBOL sources. Can also use in the pipeline to
                               Connected      normalize data from relational or flat file sources.

 Output                        Passive/       Defines mapplet output rows. Available in the Mapplet Designer.
                               Connected

 Rank                          Active/        Limits records to a top or bottom range.
                               Connected

 Router                        Active/        Routes data into multiple transformations based on group
                               Connected      conditions.

 Sequence Generator            Passive/       Generates primary keys.
                               Connected

 Sorter                        Active/        Sorts data based on a sort key.
                               Connected

 Source Qualifier              Active/        Represents the rows that the Integration Service reads from a
                               Connected      relational or flat file source when it runs a session.

 SQL                           Active or      Executes SQL queries against a database.
                               Passive/
                               Connected

 Stored Procedure              Passive/       Calls a stored procedure.
                               Connected or
                               Unconnected

 Transaction Control           Active/        Defines commit and rollback transactions.
                               Connected

 Union                         Active/        Merges data from different databases or flat file systems.
                               Connected

 Update Strategy               Active/        Determines whether to insert, delete, update, or reject rows.
                               Connected

 XML Generator                 Active/        Reads data from one or more input ports and outputs XML through a
                               Connected      single output port.

 XML Parser                    Active/        Reads XML from one input port and outputs data to one or more
                               Connected      output ports.

 XML Source Qualifier          Active/        Represents the rows that the Integration Service reads from an XML
                               Connected      source when it runs a session.




When you build a mapping, you add transformations and configure them to handle data
            according to a business purpose. Complete the following tasks to incorporate a
            transformation into a mapping:
            1.   Create the transformation. Create it in the Mapping Designer as part of a mapping, in
                 the Mapplet Designer as part of a mapplet, or in the Transformation Developer as a
                 reusable transformation.
            2.   Configure the transformation. Each type of transformation has a unique set of options
                 that you can configure.
            3.   Link the transformation to other transformations and target definitions. Drag one port
                 to another to link them in the mapping or mapplet.




Creating a Transformation
      You can create transformations using the following Designer tools:
      ♦    Mapping Designer. Create transformations that connect sources to targets.
           Transformations in a mapping cannot be used in other mappings unless you configure
           them to be reusable.
       ♦    Transformation Developer. Create individual transformations, called reusable
            transformations, that you can use in multiple mappings. For more information, see “Reusable
           Transformations” on page 31.
      ♦    Mapplet Designer. Create and configure a set of transformations, called mapplets, that
           you use in multiple mappings. For more information, see “Mapplets” in the Designer
           Guide.
      Use the same process to create a transformation in the Mapping Designer, Transformation
      Developer, and Mapplet Designer.

      To create a transformation:

      1.    Open the appropriate Designer tool.
       2.    In the Mapping Designer, open or create a mapping. In the Mapplet Designer, open or
             create a mapplet.
      3.    On the Transformations toolbar, click the button corresponding to the transformation
            you want to create.
            -or-
            Click Transformation > Create and select the type of transformation you want to create.




      4.    Drag across the portion of the mapping where you want to place the transformation.
            The new transformation appears in the workspace. Next, you need to configure the
            transformation by adding any new ports to it and setting other properties.




Configuring Transformations
            After you create a transformation, you can configure it. Every transformation contains the
            following common tabs:
            ♦   Transformation. Name the transformation or add a description.
             ♦   Ports. Add and configure ports.
            ♦   Properties. Configure properties that are unique to the transformation.
            ♦   Metadata Extensions. Extend the metadata in the repository by associating information
                with individual objects in the repository.
            Some transformations might include other tabs, such as the Condition tab, where you enter
            conditions in a Joiner or Normalizer transformation.
            When you configure transformations, you might complete the following tasks:
            ♦   Add ports. Define the columns of data that move into and out of the transformation.
            ♦   Add groups. In some transformations, define input or output groups that define a row of
                data entering or leaving the transformation.
            ♦   Enter expressions. Enter SQL-like expressions in some transformations that transform the
                data.
            ♦   Define local variables. Define local variables in some transformations that temporarily
                store data.
            ♦   Override default values. Configure default values for ports to handle input nulls and
                output transformation errors.
            ♦   Enter tracing levels. Choose the amount of detail the Integration Service writes in the
                session log about a transformation.




Working with Ports
      After you create a transformation, you need to add and configure ports using the Ports tab.
      Figure 1-1 shows a sample Ports tab:

      Figure 1-1. Sample Ports Tab




    Creating Ports
       You can create a new port in the following ways:
       ♦   Drag a port from another transformation. When you drag a port from another
           transformation, the Designer creates a port with the same properties and links the two
           ports. Click Layout > Copy Columns to enable copying ports.
      ♦   Click the Add button on the Ports tab. The Designer creates an empty port you can
          configure.


    Configuring Ports
      On the Ports tab, you can configure the following properties:
      ♦   Port name. The name of the port.
      ♦   Datatype, precision, and scale. If you plan to enter an expression or condition, make sure
          the datatype matches the return value of the expression.
      ♦   Port type. Transformations may contain a combination of input, output, input/output,
          and variable port types.




♦   Default value. The Designer assigns default values to handle null values and output
                transformation errors. You can override the default value in some ports.
            ♦   Description. A description of the port.
            ♦   Other properties. Some transformations have properties specific to that transformation,
                such as expressions or group by properties.
            For more information about configuration options, see the appropriate sections in this
            chapter or in the specific transformation chapters.
            Note: The Designer creates some transformations with configured ports. For example, the
            Designer creates a Lookup transformation with an output port for each column in the table or
            view used for the lookup. You need to create a port representing a value used to perform a
            lookup.


      Linking Ports
            Once you add and configure a transformation in a mapping, you link it to targets and other
            transformations. You link mapping objects through the ports. Data passes into and out of a
            mapping through the following ports:
            ♦   Input ports. Receive data.
            ♦   Output ports. Pass data.
            ♦   Input/output ports. Receive data and pass it unchanged.
            Figure 1-2 shows an example of a transformation with input, output, and input/output ports:

            Figure 1-2. Example of Input, Output, and Input/Output Ports







            To link ports, drag between ports in different mapping objects. The Designer validates the
            link and creates the link only when the link meets validation requirements.
            For more information about connecting mapping objects or about how to link ports, see
            “Mappings” in the Designer Guide.




Multi-Group Transformations
       Transformations have input and output groups. A group is a set of ports that defines a row of
       data entering or leaving a transformation. A group is analogous to a table in a relational
       source or target definition. Most transformations have one input and one output group.
       However, some have multiple input groups, multiple output groups, or both.
      Table 1-2 lists the transformations with multiple groups:

      Table 1-2. Multi-Group Transformations

          Transformation           Description

          Custom                   Contains any number of input and output groups.

          HTTP                     Contains an input, output, and a header group.

          Joiner                   Contains two input groups, the master source and detail source, and one output group.

          Router                   Contains one input group and multiple output groups.

          Union                    Contains multiple input groups and one output group.

          XML Source Qualifier     Contains multiple input and output groups.

          XML Target Definition    Contains multiple input groups.

          XML Parser               Contains one input group and multiple output groups.

          XML Generator            Contains multiple input groups and one output group.


      When you connect transformations in a mapping, you must consider input and output
      groups. For more information about connecting transformations in a mapping, see
      “Mappings” in the Designer Guide.
      Some multiple input group transformations require the Integration Service to block data at an
      input group while the Integration Service waits for a row from a different input group. A
      blocking transformation is a multiple input group transformation that blocks incoming data.
      The following transformations are blocking transformations:
      ♦     Custom transformation with the Inputs May Block property enabled
      ♦     Joiner transformation configured for unsorted input
      The Designer performs data flow validation when you save or validate a mapping. Some
      mappings that contain blocking transformations might not be valid. For more information
      about data flow validation, see “Mappings” in the Designer Guide.
      For more information about blocking source data, see “Integration Service Architecture” in
      the Administrator Guide.




Working with Expressions
             You can enter expressions using the Expression Editor in some transformations. Create
             expressions with the following functions:
             ♦   Transformation language functions. SQL-like functions designed to handle common
                 expressions.
             ♦   User-defined functions. Functions you create in PowerCenter based on transformation
                 language functions.
             ♦   Custom functions. Functions you create with the Custom Function API.
             For more information about the transformation language and custom functions, see the
             Transformation Language Reference. For more information about user-defined functions, see
             “Working with User-Defined Functions” in the Designer Guide.
             Enter an expression in an output port that uses the value of data from an input or input/
             output port. For example, you have a transformation with an input port IN_SALARY that
             contains the salaries of all the employees. You might want to use the individual values from
             the IN_SALARY column later in the mapping, and the total and average salaries you calculate
             through this transformation. For this reason, the Designer requires you to create a separate
             output port for each calculated value.
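              For example, the Aggregator transformation might pass IN_SALARY through unchanged while
              two separate output ports hold the calculated values. A minimal sketch (the output port
              names are illustrative):
                        TOTAL_SALARY:  SUM( IN_SALARY )

                        AVG_SALARY:    AVG( IN_SALARY )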
             Figure 1-3 shows an Aggregator transformation that uses input ports to calculate sums and
             averages:

             Figure 1-3. Sample Input and Output Ports




Table 1-3 lists the transformations in which you can enter expressions:

Table 1-3. Transformations Containing Expressions

 Transformation     Expression                                                Return Value

 Aggregator         Performs an aggregate calculation based on all data       Result of an aggregate calculation for a
                    passed through the transformation. Alternatively, you     port.
                    can specify a filter for records in the aggregate
                    calculation to exclude certain kinds of records. For
                    example, you can find the total number and average
                    salary of all employees in a branch office using this
                    transformation.

 Expression         Performs a calculation based on values within a           Result of a row-level calculation for a
                    single row. For example, based on the price and           port.
                    quantity of a particular item, you can calculate the
                    total purchase price for that line item in an order.

 Filter             Specifies a condition used to filter rows passed          TRUE or FALSE, depending on whether
                    through this transformation. For example, if you want     a row meets the specified condition. Only
                    to write customer data to the BAD_DEBT table for          rows that return TRUE are passed
                    customers with outstanding balances, you could use        through this transformation. The
                    the Filter transformation to filter customer data.        transformation applies this value to each
                                                                              row passed through it.

 Rank               Sets the conditions for rows included in a rank. For      Result of a condition or calculation for a
                    example, you can rank the top 10 salespeople who          port.
                    are employed with the company.

 Router             Routes data into multiple transformations based on a      TRUE or FALSE, depending on whether
                    group expression. For example, use this                   a row meets the specified group
                    transformation to compare the salaries of employees       expression. Only rows that return TRUE
                    at three different pay levels. You can do this by         pass through each user-defined group in
                    creating three groups in the Router transformation,       this transformation. Rows that return
                    with one group expression for each salary range.          FALSE pass through the default group.

 Update Strategy    Flags a row for update, insert, delete, or reject. You    Numeric code for update, insert, delete,
                    use this transformation when you want to control          or reject. The transformation applies this
                    updates to a target, based on some condition you          value to each row passed through it.
                    apply. For example, you might use the Update
                    Strategy transformation to flag all customer rows for
                    update when the mailing address has changed, or
                    flag all employee rows for reject for people who no
                    longer work for the company.

 Transaction        Specifies a condition used to determine the action the    One of the following built-in variables,
 Control            Integration Service performs, either commit, roll back,   depending on whether or not a row
                    or no transaction change. You use this transformation     meets the specified condition:
                    when you want to control commit and rollback              - TC_CONTINUE_TRANSACTION
                    transactions based on a row or set of rows that pass      - TC_COMMIT_BEFORE
                    through the transformation. For example, use this         - TC_COMMIT_AFTER
                    transformation to commit a set of rows based on an        - TC_ROLLBACK_BEFORE
                    order entry date.                                         - TC_ROLLBACK_AFTER
                                                                              The Integration Service performs actions
                                                                              based on the return value.




Using the Expression Editor
             Use the Expression Editor to build SQL-like statements. Although you can enter an
             expression manually, you should use the point-and-click method. Select functions, ports,
             variables, and operators from the point-and-click interface to minimize errors when you build
             expressions.
             Figure 1-4 shows an example of the Expression Editor:

             Figure 1-4. Expression Editor




             Entering Port Names into an Expression
             For connected transformations, if you use port names in an expression, the Designer updates
             that expression when you change port names in the transformation. For example, you write a
             valid expression that determines the difference between two dates, Date_Promised and
             Date_Delivered. Later, if you change the Date_Promised port name to Due_Date, the
             Designer changes the Date_Promised port name to Due_Date in the expression.
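              For instance, an expression that returns the difference in days might change as follows
              when you rename the port (a minimal sketch using the DATE_DIFF function):
                        DATE_DIFF( Date_Promised, Date_Delivered, 'DD' )

                        DATE_DIFF( Due_Date, Date_Delivered, 'DD' )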
             Note: You can propagate the name Due_Date to other non-reusable transformations that
             depend on this port in the mapping. For more information, see “Mappings” in the Designer
             Guide.

             Adding Comments
             You can add comments to an expression to give descriptive information about the expression
             or to specify a valid URL to access business documentation about the expression.
             You can add comments in one of the following ways:
             ♦   To add comments within the expression, use -- or // comment indicators.
             ♦   To add comments in the dialog box, click the Comments button.
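              For instance, a brief sketch that uses both comment indicators:
                        -- Flag orders above the approval threshold
                        IIF( TOTAL_PRICE > 1000, 'HIGH', 'LOW' ) // threshold set by the business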
             For examples on adding comments to expressions, see “The Transformation Language” in the
             Transformation Language Reference.


For more information about linking to business documentation, see “Using the Designer” in
the Designer Guide.

Validating Expressions
Use the Validate button to validate an expression. If you do not validate an expression, the
Designer validates it when you close the Expression Editor. If the expression is invalid, the
Designer displays a warning. You can save the invalid expression or modify it. You cannot run
a session against a mapping with invalid expressions.

Expression Editor Display
The Expression Editor can display expression syntax in different colors for better readability.
If you have the latest Rich Edit control, riched20.dll, installed on the system, the Expression
Editor displays expression functions in blue, comments in grey, and quoted strings in green.
You can resize the Expression Editor. Expand the dialog box by dragging from the borders.
The Designer saves the new size for the dialog box as a client setting.

Adding Expressions to an Output Port
Complete the following steps to add an expression to an output port.

To add expressions:

1.   In the transformation, select the port and open the Expression Editor.
2.   Enter the expression.
     Use the Functions and Ports tabs and the operator keys.
3.   Add comments to the expression.
     Use comment indicators -- or //.
4.   Validate the expression.
     Use the Validate button to validate the expression.




Using Local Variables
              Use local variables in Aggregator, Expression, and Rank transformations. You can reference
              variables in an expression or use them to temporarily store data. Variables can also improve
              performance, because you can calculate a value once and reuse it in several expressions.
             You might use variables to complete the following tasks:
             ♦    Temporarily store data.
             ♦    Simplify complex expressions.
             ♦    Store values from prior rows.
             ♦    Capture multiple return values from a stored procedure.
             ♦    Compare values.
             ♦    Store the results of an unconnected Lookup transformation.


       Temporarily Store Data and Simplify Complex Expressions
             Variables improve performance when you enter several related expressions in the same
             transformation. Rather than parsing and validating the same expression components each
             time, you can define these components as variables.
             For example, if an Aggregator transformation uses the same filter condition before calculating
             sums and averages, you can define this condition once as a variable, and then reuse the
             condition in both aggregate calculations.
             You can simplify complex expressions. If an Aggregator includes the same calculation in
             multiple expressions, you can improve session performance by creating a variable to store the
             results of the calculation.
             For example, you might create the following expressions to find both the average salary and
             the total salary using the same data:
                        AVG( SALARY, ( ( JOB_STATUS = 'Full-time' ) AND (OFFICE_ID = 1000 ) ) )

                        SUM( SALARY, ( ( JOB_STATUS = 'Full-time' ) AND (OFFICE_ID = 1000 ) ) )

             Rather than entering the same arguments for both calculations, you might create a variable
             port for each condition in this calculation, then modify the expression to use the variables.
             Table 1-4 shows how to use variables to simplify complex expressions and temporarily store
             data:

             Table 1-4. Variable Usage

                 Port                    Value

                  V_CONDITION1            JOB_STATUS = 'Full-time'

                 V_CONDITION2            OFFICE_ID = 1000




   AVG_SALARY                 AVG(SALARY, (V_CONDITION1 AND V_CONDITION2) )

   SUM_SALARY                 SUM(SALARY, (V_CONDITION1 AND V_CONDITION2) )



Store Values Across Rows
  Use variables to store data from prior rows. This can help you perform procedural
  calculations.
  Figure 1-5 shows how to use variables to find out how many customers are in each state:

  Figure 1-5. Variable Ports Store Values Across Rows




  Since the Integration Service groups the input data by state, the company uses variables to
  hold the value of the previous state read and a state counter. The following expression
  compares the previous state to the state just read:
          IIF(PREVIOUS_STATE = STATE, STATE_COUNTER +1, 1)

  The STATE_COUNTER is incremented if the row is a member of the previous state. For
  each new state, the Integration Service sets the counter back to 1. Then an output port passes
  the value of the state counter to the next transformation.
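   A minimal sketch of the port layout follows; the port names are illustrative. Because the
   Integration Service evaluates variable ports in display order, V_STATE_COUNTER reads
   V_PREVIOUS_STATE before the next line overwrites it with the current state:
          V_STATE_COUNTER (variable):   IIF( V_PREVIOUS_STATE = STATE, V_STATE_COUNTER + 1, 1 )
          V_PREVIOUS_STATE (variable):  STATE
          O_STATE_COUNTER (output):     V_STATE_COUNTER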


Capture Values from Stored Procedures
  Variables also provide a way to capture multiple columns of return values from stored
  procedures. For more information, see “Stored Procedure Transformation” on page 525.



Guidelines for Configuring Variable Ports
             Consider the following factors when you configure variable ports in a transformation:
             ♦    Port order. The Integration Service evaluates ports by dependency. The order of the ports
                  in a transformation must match the order of evaluation: input ports, variable ports, output
                  ports.
             ♦    Datatype. The datatype you choose reflects the return value of the expression you enter.
             ♦    Variable initialization. The Integration Service sets initial values in variable ports, where
                  you can create counters.

             Port Order
             The Integration Service evaluates ports in the following order:
             1.    Input ports. The Integration Service evaluates all input ports first since they do not
                   depend on any other ports. Therefore, you can create input ports in any order. Since they
                   do not reference other ports, the Integration Service does not order input ports.
             2.    Variable ports. Variable ports can reference input ports and variable ports, but not output
                   ports. Because variable ports can reference input ports, the Integration Service evaluates
                   variable ports after input ports. Likewise, since variables can reference other variables, the
                   display order for variable ports is the same as the order in which the Integration Service
                   evaluates each variable.
                   For example, if you calculate the original value of a building and then adjust for
                   depreciation, you might create the original value calculation as a variable port. This
                   variable port needs to appear before the port that adjusts for depreciation.
             3.    Output ports. Because output ports can reference input ports and variable ports, the
                   Integration Service evaluates output ports last. The display order for output ports does
                   not matter since output ports cannot reference other output ports. Be sure output ports
                   display at the bottom of the list of ports.

             Datatype
             When you configure a port as a variable, you can enter any expression or condition in it. The
             datatype you choose for this port reflects the return value of the expression you enter. If you
             specify a condition through the variable port, any numeric datatype returns the values for
             TRUE (non-zero) and FALSE (zero).
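              For example, a numeric variable port that holds a condition might use the following
              expression (a minimal sketch; the port name is illustrative):
                        V_IS_FULLTIME:  JOB_STATUS = 'Full-time'

              The port returns a non-zero value (TRUE) when the condition is met and zero (FALSE)
              otherwise.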

             Variable Initialization
             The Integration Service does not set the initial value for variables to NULL. Instead, the
             Integration Service uses the following guidelines to set initial values for variables:
             ♦    Zero for numeric ports
             ♦    Empty strings for string ports
             ♦    01/01/1753 for Date/Time ports with PMServer 4.0 date handling compatibility disabled
             ♦    01/01/0001 for Date/Time ports with PMServer 4.0 date handling compatibility enabled

Because the initial value is not NULL, you can use variables as counters, which need a
starting value. For example, you can create a numeric variable with the following expression:
      VAR1 + 1

This expression counts the number of rows that pass through the VAR1 port. If the initial
value of the variable were set to NULL, the expression would always evaluate to NULL, which
is why the initial value is set to zero.




Using Default Values for Ports
             All transformations use default values that determine how the Integration Service handles
             input null values and output transformation errors. Input, output, and input/output ports are
             created with a system default value that you can sometimes override with a user-defined
             default value. Default values have different functions in different types of ports:
             ♦   Input port. The system default value for null input ports is NULL. It displays as a blank in
                 the transformation. If an input value is NULL, the Integration Service leaves it as NULL.
             ♦   Output port. The system default value for output transformation errors is ERROR. The
                 default value appears in the transformation as ERROR(‘transformation error’). If a
                 transformation error occurs, the Integration Service skips the row. The Integration Service
                 notes all input rows skipped by the ERROR function in the session log file.
                 The following errors are considered transformation errors:
                 −   Data conversion errors, such as passing a number to a date function.
                 −   Expression evaluation errors, such as dividing by zero.
                 −   Calls to an ERROR function.
             ♦   Input/output port. The system default value for null input is the same as input ports,
                 NULL. The system default value appears as a blank in the transformation. The default
                 value for output transformation errors is the same as output ports. The default value for
                 output transformation errors does not display in the transformation.
                 Table 1-5 shows the system default values for ports in connected transformations:

                  Table 1-5. System Default Values and Integration Service Behavior

                      Port Type       Default Value   Integration Service Behavior                          User-Defined Default
                                                                                                            Value Supported

                      Input,          NULL            Integration Service passes all input null values as   Input
                      Input/Output                    NULL.                                                 Input/Output

                      Output,         ERROR           Integration Service calls the ERROR function for      Output
                      Input/Output                    output port transformation errors. The Integration
                                                      Service skips rows with errors and writes the input
                                                      data and error message in the session log file.


                     Note: Variable ports do not support default values. The Integration Service initializes
                     variable ports according to the datatype. For more information, see “Using Local
                     Variables” on page 14.




Figure 1-6 shows that the system default value for input and input/output ports appears as
a blank in the transformation:

Figure 1-6. Default Value for Input and Input/Output Ports







Figure 1-7 shows that the system default value for output ports appears as
ERROR(‘transformation error’):

Figure 1-7. Default Value for Output Ports








You can override some of the default values to change the Integration Service behavior when it
             encounters null input values and output transformation errors.


       Entering User-Defined Default Values
             You can override the system default values with user-defined default values for supported
             input, input/output, and output ports within a connected transformation:
             ♦     Input ports. You can enter user-defined default values for input ports if you do not want
                   the Integration Service to treat null values as NULL.
             ♦     Output ports. You can enter user-defined default values for output ports if you do not
                   want the Integration Service to skip the row or if you want the Integration Service to write
                   a specific message with the skipped row to the session log.
             ♦     Input/output ports. You can enter user-defined default values to handle null input values
                   for input/output ports in the same way you can enter user-defined default values for null
                   input values for input ports. You cannot enter user-defined default values for output
                   transformation errors in an input/output port.
             Note: The Integration Service ignores user-defined default values for unconnected
             transformations. For example, if you call a Lookup or Stored Procedure transformation
             through an expression, the Integration Service ignores any user-defined default value and uses
             the system default value only.
             Table 1-6 shows the ports for each transformation that support user-defined default values:

              Table 1-6. Transformations Supporting User-Defined Default Values

                  Transformation         Input Values for Input Port,   Output Values for   Output Values for
                                         Input/Output Port              Output Port         Input/Output Port

                  Aggregator             Supported                      Not Supported       Not Supported

                  Custom                 Supported                      Supported           Not Supported

                  Expression             Supported                      Supported           Not Supported

                  External Procedure     Supported                      Supported           Not Supported

                  Filter                 Supported                      Not Supported       Not Supported

                  HTTP                   Supported                      Not Supported       Not Supported

                  Java                   Supported                      Supported           Supported

                  Lookup                 Supported                      Supported           Not Supported

                  Normalizer             Supported                      Supported           Not Supported

                  Rank                   Not Supported                  Supported           Not Supported

                  Router                 Supported                      Not Supported       Not Supported

                  SQL                    Supported                      Not Supported       Supported

                  Stored Procedure       Supported                      Supported           Not Supported

                  Sequence Generator     n/a                            Not Supported       Not Supported

                  Sorter                 Supported                      Not Supported       Not Supported

                  Source Qualifier       Not Supported                  n/a                 Not Supported

                  Transaction Control    Not Supported                  n/a                 Not Supported

                  Union                  Supported                      Supported           n/a

                  Update Strategy        Supported                      n/a                 Not Supported

                  XML Generator          n/a                            Supported           Not Supported

                  XML Parser             Supported                      n/a                 Not Supported

                  XML Source Qualifier   Not Supported                  n/a                 Not Supported


Use the following options to enter user-defined default values:
♦     Constant value. Use any constant (numeric or text), including NULL.
♦     Constant expression. You can include a transformation function with constant
      parameters.
♦     ERROR. Generate a transformation error. The Integration Service writes the row and a
      message to the session log or row error log, depending on the session configuration.
♦     ABORT. Abort the session.

Entering Constant Values
You can enter any constant value as a default value. The constant value must match the port
datatype. For example, a default value for a numeric port must be a numeric constant. Some
constant values include:
             0

             9999

             NULL

             'Unknown Value'
             'Null input data'


Entering Constant Expressions
A constant expression is any expression that uses transformation functions (except aggregate
functions) and constant parameters. You cannot use values from input, input/output, or
variable ports in a constant expression.



Some valid constant expressions include:
                       500 * 1.75

                       TO_DATE('January 1, 1998, 12:05 AM')

                       ERROR ('Null not allowed')
                       ABORT('Null not allowed')

                       SYSDATE

             You cannot use values from ports within the expression because the Integration Service assigns
             default values for the entire mapping when it initializes the session. Some invalid default
             values include the following examples, which incorporate values read from ports:
                       AVG(IN_SALARY)
                       IN_PRICE * IN_QUANTITY

                       :LKP(LKP_DATES, DATE_SHIPPED)

             Note: You cannot call a stored procedure or lookup table from a default value expression.


             Entering ERROR and ABORT Functions
             Use the ERROR and ABORT functions for input and output port default values, and input
             values for input/output ports. The Integration Service skips the row when it encounters the
             ERROR function. It aborts the session when it encounters the ABORT function.


       Entering User-Defined Default Input Values
              You can enter a user-defined default input value if you do not want the Integration Service to
              treat null values as NULL. You can override null values in the following ways:
             ♦     Replace the null value with a constant value or constant expression.
             ♦     Skip the null value with an ERROR function.
             ♦     Abort the session with the ABORT function.
             Table 1-7 summarizes how the Integration Service handles null input for input and input/
             output ports:

              Table 1-7. Default Values for Input and Input/Output Ports

                  Default Value            Default Value   Description
                                           Type

                  NULL (displays blank)    System          Integration Service passes NULL.

                  Constant or              User-Defined    Integration Service replaces the null value with the value of the
                  Constant Expression                      constant or constant expression.

                  ERROR                    User-Defined    Integration Service treats this as a transformation error:
                                                           - Increases the transformation error count by 1.
                                                           - Skips the row, and writes the error message to the session log
                                                             file or row error log.
                                                           The Integration Service does not write rows to the reject file.

                  ABORT                    User-Defined    Session aborts when the Integration Service encounters a null
                                                           input value. The Integration Service does not increase the error
                                                           count or write rows to the reject file.


Replacing Null Values
Use a constant value or expression to substitute a specified value for a NULL. For example, if
an input string port is called DEPT_NAME and you want to replace null values with the
string ‘UNKNOWN DEPT’, you could set the default value to ‘UNKNOWN DEPT’.
Depending on the transformation, the Integration Service passes ‘UNKNOWN DEPT’ to an
expression or variable within the transformation or to the next transformation in the data
flow.
Figure 1-8 shows a string constant as a user-defined default value for input or input/output
ports:

Figure 1-8. Using a Constant as a Default Value








The Integration Service replaces all null values in the DEPT_NAME port with the string
              ‘UNKNOWN DEPT’:
                     DEPT_NAME            REPLACED VALUE

                     Housewares           Housewares

                     NULL                 UNKNOWN DEPT
                     Produce              Produce


             Skipping Null Records
             Use the ERROR function as the default value when you do not want null values to pass into a
             transformation. For example, you might want to skip a row when the input value of
             DEPT_NAME is NULL. You could use the following expression as the default value:
                     ERROR('Error. DEPT is NULL')

             Figure 1-9 shows a default value that instructs the Integration Service to skip null values:

             Figure 1-9. Using the ERROR Function to Skip Null Input Values




             When you use the ERROR function as a default value, the Integration Service skips the row
             with the null value. The Integration Service writes all rows skipped by the ERROR function
             into the session log file. It does not write these rows to the session reject file.
                     DEPT_NAME           RETURN VALUE

                     Housewares          Housewares

                     NULL                'Error. DEPT is NULL' (Row is skipped)

                     Produce             Produce




The following session log shows where the Integration Service skips the row with the null
  value:
         TE_11019 Port [DEPT_NAME]: Default value is: ERROR(<<Transformation
         Error>> [error]: Error. DEPT is NULL

         ... error('Error. DEPT is NULL')
         ).

         CMN_1053 EXPTRANS: : ERROR: NULL input column DEPT_NAME: Current Input
         data:

         CMN_1053     Input row from SRCTRANS: Rowdata: ( RowType=4 Src Rowid=2 Targ
         Rowid=2

              DEPT_ID (DEPT_ID:Int:): "2"
              DEPT_NAME (DEPT_NAME:Char.25:): "NULL"

              MANAGER_ID (MANAGER_ID:Int:): "1"

         )

  For more information about the ERROR function, see “Functions” in the Transformation
  Language Reference.

  Aborting the Session
  Use the ABORT function to abort a session when the Integration Service encounters any null
  input values. For more information about the ABORT function, see “Functions” in the
  Transformation Language Reference.


Entering User-Defined Default Output Values
  You can enter user-defined default values for output ports if you do not want the Integration
  Service to skip rows with errors or if you want the Integration Service to write a specific
  message with the skipped row to the session log. You can enter default values to complete the
  following functions when the Integration Service encounters output transformation errors:
  ♦   Replace the error with a constant value or constant expression. The Integration Service
      does not skip the row.
  ♦   Abort the session with the ABORT function.
  ♦   Write specific messages in the session log for transformation errors.
  You cannot enter user-defined default output values for input/output ports.




Table 1-8 summarizes how the Integration Service handles output port transformation errors
             and default values in transformations:

              Table 1-8. Supported Default Values for Output Ports

               Default Value          Default Value   Description
                                      Type

               Transformation Error   System          When a transformation error occurs and you did not override the
                                                      default value, the Integration Service performs the following tasks:
                                                      - Increases the transformation error count by 1.
                                                      - Skips the row, and writes the error and input row to the session log
                                                        file or row error log, depending on session configuration.
                                                      The Integration Service does not write the row to the reject file.

               Constant or            User-Defined    Integration Service replaces the error with the default value.
               Constant Expression                    The Integration Service does not increase the error count or write a
                                                      message to the session log.

               ABORT                  User-Defined    Session aborts and the Integration Service writes a message to the
                                                      session log.
                                                      The Integration Service does not increase the error count or write
                                                      rows to the reject file.


             Replacing Errors
             If you do not want the Integration Service to skip a row when a transformation error occurs,
             use a constant or constant expression as the default value for an output port. For example, if
             you have a numeric output port called NET_SALARY and you want to use the constant value
             ‘9999’ when a transformation error occurs, assign the default value 9999 to the
             NET_SALARY port. If there is any transformation error (such as dividing by zero) while
             computing the value of NET_SALARY, the Integration Service uses the default value 9999.
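              For instance, the port configuration might look like the following sketch (the expression
              is illustrative):
                        Port:           NET_SALARY (output)
                        Expression:     ANNUAL_SALARY / MONTHS_WORKED
                        Default value:  9999

              A row with MONTHS_WORKED equal to zero raises a division-by-zero transformation error,
              so the Integration Service writes the default value 9999 to NET_SALARY instead of
              skipping the row.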

             Aborting the Session
             Use the ABORT function as the default value in an output port if you do not want to allow
             any transformation errors.

             Writing Messages in the Session Log or Row Error Logs
             You can enter a user-defined default value in the output port if you want the Integration
             Service to write a specific message in the session log with the skipped row. The system default
             is ERROR (‘transformation error’), and the Integration Service writes the message
             ‘transformation error’ in the session log along with the skipped row. You can replace
             ‘transformation error’ if you want to write a different message.
             When you enable row error logging, the Integration Service writes error messages to the error
             log instead of the session log and the Integration Service does not log Transaction Control
             transformation rollback or commit errors. If you want to write rows to the session log in
             addition to the row error log, you can enable verbose data tracing.




Working with ERROR Functions in Output Port Expressions
If you enter an expression that uses the ERROR function, the user-defined default value for
the output port might override the ERROR function in the expression.
For example, you enter the following expression that instructs the Integration Service to use
the value ‘Negative Sale’ when it encounters an error:
         IIF( TOTAL_SALES>0, TOTAL_SALES, ERROR ('Negative Sale'))

The following examples show how user-defined default values may override the ERROR
function in the expression:
♦   Constant value or expression. The constant value or expression overrides the ERROR
    function in the output port expression.
    For example, if you enter ‘0’ as the default value, the Integration Service overrides the
    ERROR function in the output port expression. It passes the value 0 when it encounters
    an error. It does not skip the row or write ‘Negative Sale’ in the session log.
♦   ABORT. The ABORT function overrides the ERROR function in the output port
    expression.
    If you use the ABORT function as the default value, the Integration Service aborts the
    session when a transformation error occurs. The ABORT function overrides the ERROR
    function in the output port expression.
♦   ERROR. If you use the ERROR function as the default value, the Integration Service
    includes the following information in the session log:
    −   Error message from the default value
    −   Error message indicated in the ERROR function in the output port expression
    −   Skipped row
    For example, you can override the default value with the following ERROR function:
         ERROR('No default value')

    The Integration Service skips the row, and includes both error messages in the log.
             TE_7007 Transformation Evaluation Error; current row skipped...

             TE_7007 [<<Transformation Error>> [error]: Negative Sale

         ... error('Negative Sale')

         ]

         Sun Sep 20 13:57:28 1998

         TE_11019 Port [OUT_SALES]: Default value is: ERROR(<<Transformation
         Error>> [error]: No default value

         ... error('No default value')




General Rules for Default Values
             Use the following rules and guidelines when you create default values:
             ♦   The default value must be either a NULL, a constant value, a constant expression, an
                 ERROR function, or an ABORT function.
             ♦   For input/output ports, the Integration Service uses default values to handle null input
                 values. The output default value of input/output ports is always ERROR(‘Transformation
                 Error’).
             ♦   Variable ports do not use default values.
             ♦   You can assign default values to group by ports in the Aggregator and Rank
                 transformations.
             ♦   Not all port types in all transformations allow user-defined default values. If a port does
                 not allow user-defined default values, the default value field is disabled.
              ♦   Not all transformations allow user-defined default values. For more information, see
                  Table 1-6 on page 20.
             ♦   If a transformation is not connected to the mapping data flow (an unconnected
                 transformation), the Integration Service ignores user-defined default values.
             ♦   If any input port is unconnected, its value is assumed to be NULL and the Integration
                 Service uses the default value for that input port.
             ♦   If an input port default value contains the ABORT function and the input value is NULL,
                 the Integration Service immediately stops the session. Use the ABORT function as a
                 default value to restrict null input values. The first null value in an input port stops the
                 session.
             ♦   If an output port default value contains the ABORT function and any transformation
                 error occurs for that port, the session immediately stops. Use the ABORT function as a
                 default value to enforce strict rules for transformation errors. The first transformation
                 error for this port stops the session.
             ♦   The ABORT function, constant values, and constant expressions override ERROR
                 functions configured in output port expressions.


        Entering and Validating Default Values
             You can validate default values as you enter them. The Designer includes a Validate button so
             you can ensure valid default values. A message appears indicating if the default is valid.




Figure 1-10 shows the user-defined value for a port and the Validate button:

                Figure 1-10. Entering and Validating Default Values








                The Designer also validates default values when you save a mapping. If you enter an invalid
                default value, the Designer marks the mapping invalid.




Configuring Tracing Level in Transformations
             When you configure a transformation, you can set the amount of detail the Integration
             Service writes in the session log.
             Table 1-9 describes the session log tracing levels:

             Table 1-9. Session Log Tracing Levels

              Tracing Level       Description

              Normal              Integration Service logs initialization and status information, errors encountered, and skipped
                                  rows due to transformation row errors. Summarizes session results, but not at the level of
                                  individual rows.

              Terse               Integration Service logs initialization information and error messages and notification of rejected
                                  data.

              Verbose             In addition to normal tracing, Integration Service logs additional initialization details, names of
              Initialization      index and data files used, and detailed transformation statistics.

              Verbose Data        In addition to verbose initialization tracing, Integration Service logs each row that passes into
                                  the mapping. Also notes where the Integration Service truncates string data to fit the precision
                                  of a column and provides detailed transformation statistics.
                                  Allows the Integration Service to write errors to both the session log and error log when you
                                  enable row error logging.
                                  When you configure the tracing level to verbose data, the Integration Service writes row data for
                                  all rows in a block when it processes a transformation.


              By default, the tracing level for every transformation is Normal. Change the tracing level to a
              Verbose setting only when you need to debug a transformation that is not behaving as
              expected. To add a slight performance boost, you can also set the tracing level to Terse,
              which writes the minimum of detail to the session log when you run a workflow containing
              the transformation.
             When you configure a session, you can override the tracing levels for individual
             transformations with a single tracing level for all transformations in the session.




Reusable Transformations
      Mappings can contain reusable and non-reusable transformations. Non-reusable
      transformations exist within a single mapping. Reusable transformations can be used in
      multiple mappings.
      For example, you might create an Expression transformation that calculates value-added tax
      for sales in Canada, which is useful when you analyze the cost of doing business in that
      country. Rather than perform the same work every time, you can create a reusable
      transformation. When you need to incorporate this transformation into a mapping, you add
      an instance of it to the mapping. Later, if you change the definition of the transformation, all
      instances of it inherit the changes.
      The Designer stores each reusable transformation as metadata separate from any mapping that
      uses the transformation. If you review the contents of a folder in the Navigator, you see the
      list of all reusable transformations in that folder.
      Each reusable transformation falls within a category of transformations available in the
      Designer. For example, you can create a reusable Aggregator transformation to perform the
      same aggregate calculations in multiple mappings, or a reusable Stored Procedure
      transformation to call the same stored procedure in multiple mappings.
       You can create most transformations as either non-reusable or reusable. However, you can
       create the External Procedure transformation only as a reusable transformation.
      When you add instances of a reusable transformation to mappings, you must be careful that
      changes you make to the transformation do not invalidate the mapping or generate
      unexpected data.


    Instances and Inherited Changes
      When you add a reusable transformation to a mapping, you add an instance of the
      transformation. The definition of the transformation still exists outside the mapping, while a
      copy (or instance) appears within the mapping.
      Since the instance of a reusable transformation is a pointer to that transformation, when you
      change the transformation in the Transformation Developer, its instances reflect these
      changes. Instead of updating the same transformation in every mapping that uses it, you can
      update the reusable transformation once, and all instances of the transformation inherit the
      change. Note that instances do not inherit changes to property settings, only modifications to
      ports, expressions, and the name of the transformation.


    Mapping Variables in Expressions
      Use mapping parameters and variables in reusable transformation expressions. When the
      Designer validates the parameter or variable, it treats it as an Integer datatype. When you use
      the transformation in a mapplet or mapping, the Designer validates the expression again. If
       the mapping parameter or variable does not exist in the mapplet or mapping, the Designer
       logs an error. For more information, see “Mapping Parameters and Variables” in the Designer
       Guide.


       Creating Reusable Transformations
             You can create a reusable transformation using the following methods:
             ♦    Design it in the Transformation Developer. In the Transformation Developer, you can
                  build new reusable transformations.
             ♦    Promote a non-reusable transformation from the Mapping Designer. After you add a
                  transformation to a mapping, you can promote it to the status of reusable transformation.
                  The transformation designed in the mapping then becomes an instance of a reusable
                  transformation maintained elsewhere in the repository.
             If you promote a transformation to reusable status, you cannot demote it. However, you can
             create a non-reusable instance of it.
             Note: Sequence Generator transformations must be reusable in mapplets. You cannot demote
             reusable Sequence Generator transformations to non-reusable in a mapplet.

             To create a reusable transformation:

             1.    In the Designer, switch to the Transformation Developer.
             2.    Click the button on the Transformation toolbar corresponding to the type of
                   transformation you want to create.
             3.    Drag within the workspace to create the transformation.
             4.    Double-click the transformation title bar to open the dialog displaying its properties.
             5.    Click the Rename button and enter a descriptive name for the transformation, and click
                   OK.
             6.    Click the Ports tab, then add any input and output ports you need for this
                   transformation.
             7.    Set the other properties of the transformation, and click OK.
                   These properties vary according to the transformation you create. For example, if you
                   create an Expression transformation, you need to enter an expression for one or more of
                   the transformation output ports. If you create a Stored Procedure transformation, you
                   need to identify the stored procedure to call.
             8.    Click Repository > Save.


       Promoting Non-Reusable Transformations
             The other technique for creating a reusable transformation is to promote an existing
             transformation within a mapping. By checking the Make Reusable option in the Edit
             Transformations dialog box, you instruct the Designer to promote the transformation and
             create an instance of it in the mapping.


To promote a non-reusable transformation:

  1.   In the Designer, open a mapping and double-click the title bar of the transformation you
       want to promote.
  2.   Select the Make Reusable option.
  3.   When prompted whether you are sure you want to promote the transformation, click Yes.
  4.   Click OK to return to the mapping.
  5.   Click Repository > Save.
  Now, when you look at the list of reusable transformations in the folder you are working in,
  the newly promoted transformation appears in this list.


Creating Non-Reusable Instances of Reusable Transformations
  You can create a non-reusable instance of a reusable transformation within a mapping. You
  can create the non-reusable instance only in the folder that contains the reusable
  transformation. If you want a non-reusable instance of a reusable transformation in a
  different folder, first make a non-reusable instance of the transformation in the source folder,
  and then copy it into the target folder.

  To create a non-reusable instance of a reusable transformation:

  1.   In the Designer, open a mapping.
  2.   In the Navigator, select an existing transformation and drag the transformation into the
       mapping workspace. Hold down the Ctrl key before you release the transformation.
       The status bar displays the following message:
         Make a non-reusable copy of this transformation and add it to this
         mapping.

  3.   Release the transformation.
       The Designer creates a non-reusable instance of the existing reusable transformation.
  4.   Click Repository > Save.


Adding Reusable Transformations to Mappings
  After you create a reusable transformation, you can add it to mappings.

  To add a reusable transformation:

  1.   In the Designer, switch to the Mapping Designer.
  2.   Open or create a mapping.
  3.   In the list of repository objects, drill down until you find the reusable transformation you
       want in the Transformations section of a folder.
  4.   Drag the transformation from the Navigator into the mapping.


                   A copy (or instance) of the reusable transformation appears.
             5.    Link the new transformation to other transformations or target definitions.
             6.    Click Repository > Save.


       Modifying a Reusable Transformation
             Changes to a reusable transformation that you enter through the Transformation Developer
             are immediately reflected in all instances of that transformation. While this feature is a
             powerful way to save work and enforce standards (for example, by publishing the official
             version of a depreciation calculation through a reusable transformation), you risk invalidating
             mappings when you modify a reusable transformation.
             To see what mappings, mapplets, or shortcuts may be affected by changes you make to a
             transformation, select the transformation in the workspace or Navigator, right-click, and
             select View Dependencies.
             If you make any of the following changes to the reusable transformation, mappings that use
             instances of it may be invalidated:
             ♦    When you delete a port or multiple ports in a transformation, you disconnect the instance
                  from part or all of the data flow through the mapping.
             ♦    When you change a port datatype, you make it impossible to map data from that port to
                  another port using an incompatible datatype.
             ♦    When you change a port name, expressions that refer to the port are no longer valid.
             ♦    When you enter an invalid expression in the reusable transformation, mappings that use
                  the transformation are no longer valid. The Integration Service cannot run sessions based
                  on invalid mappings.

             Reverting to Original Reusable Transformation
             If you change the properties of a reusable transformation in a mapping, you can revert to the
             original reusable transformation properties by clicking the Revert button.




Figure 1-11 shows how you can revert to the original properties of the reusable
transformation:

Figure 1-11. Reverting to Original Reusable Transformation Properties

The Revert button restores the original properties defined in the Transformation Developer.




Chapter 2




Aggregator Transformation


   This chapter includes the following topics:
   ♦   Overview, 38
   ♦   Aggregate Expressions, 40
   ♦   Group By Ports, 42
   ♦   Using Sorted Input, 45
   ♦   Creating an Aggregator Transformation, 47
   ♦   Tips, 50
   ♦   Troubleshooting, 51




Overview
                    Transformation type:
                    Active
                    Connected


             The Aggregator transformation lets you perform aggregate calculations, such as averages and
             sums. Unlike the Expression transformation, which performs calculations on a row-by-row
             basis only, the Aggregator transformation performs calculations on groups.
             When you use the transformation language to create aggregate expressions, you can use
             conditional clauses to filter rows, which provides more flexibility than SQL offers.
             The Integration Service performs aggregate calculations as it reads, and stores the necessary
             group and row data in an aggregate cache.
             After you create a session that includes an Aggregator transformation, you can enable the
             session option, Incremental Aggregation. When the Integration Service performs incremental
             aggregation, it passes new source data through the mapping and uses historical cache data to
             perform new aggregation calculations incrementally. For information about incremental
             aggregation, see “Using Incremental Aggregation” in the Workflow Administration Guide.


       Ports in the Aggregator Transformation
             To configure ports in the Aggregator transformation, complete the following tasks:
             ♦   Enter an expression in any output port, using conditional clauses or non-aggregate
                 functions in the port.
             ♦   Create multiple aggregate output ports.
             ♦   Configure any input, input/output, output, or variable port as a group by port.
             ♦   Improve performance by connecting only the necessary input/output ports to subsequent
                 transformations, reducing the size of the data cache.
             ♦   Use variable ports for local variables.
             ♦   Create connections to other transformations as you enter an expression.


       Components of the Aggregator Transformation
             The Aggregator is an active transformation because it can change the number of rows in the
             pipeline. The Aggregator transformation has the following components and options:
             ♦   Aggregate expression. Entered in an output port. Can include non-aggregate expressions
                 and conditional clauses.
             ♦   Group by port. Indicates how to create groups. The port can be any input, input/output,
                 output, or variable port. When grouping data, the Aggregator transformation outputs the
                 last row of each group unless otherwise specified.


♦   Sorted input. Use to improve session performance. To use sorted input, you must pass
      data to the Aggregator transformation sorted by group by port, in ascending or descending
      order.
  ♦   Aggregate cache. The Integration Service stores data in the aggregate cache until it
      completes aggregate calculations. It stores group values in an index cache and row data in
      the data cache.


Aggregate Caches
  When you run a session that uses an Aggregator transformation, the Integration Service
  creates index and data caches in memory to process the transformation. If the Integration
  Service requires more space, it stores overflow values in cache files.
  You can configure the index and data caches in the Aggregator transformation or in the
  session properties. Or, you can configure the Integration Service to determine the cache size at
  runtime.
  For more information about configuring index and data caches, see “Creating an Aggregator
  Transformation” on page 47.
  For information about configuring the Integration Service to determine the cache size at
  runtime, see “Working with Sessions” in the Workflow Administration Guide.
  Note: The Integration Service uses memory to process an Aggregator transformation with
  sorted ports. It does not use cache memory. You do not need to configure cache memory for
  Aggregator transformations that use sorted ports.




Aggregate Expressions
             The Designer allows aggregate expressions only in the Aggregator transformation. An
             aggregate expression can include conditional clauses and non-aggregate functions. It can also
             include one aggregate function nested within another aggregate function, such as:
                    MAX( COUNT( ITEM ))

             The result of an aggregate expression varies depending on the group by ports used in the
             transformation. For example, when the Integration Service calculates the following aggregate
             expression with no group by ports defined, it finds the total quantity of items sold:
                    SUM( QUANTITY )

             However, if you use the same expression, and you group by the ITEM port, the Integration
             Service returns the total quantity of items sold, by item.
             You can create an aggregate expression in any output port and use multiple aggregate ports in
             a transformation.


       Aggregate Functions
             Use the following aggregate functions within an Aggregator transformation. You can nest one
             aggregate function within another aggregate function.
             The transformation language includes the following aggregate functions:
             ♦   AVG
             ♦   COUNT
             ♦   FIRST
             ♦   LAST
             ♦   MAX
             ♦   MEDIAN
             ♦   MIN
             ♦   PERCENTILE
             ♦   STDDEV
             ♦   SUM
             ♦   VARIANCE
             When you use any of these functions, you must use them in an expression within an
             Aggregator transformation. For a description of these functions, see “Functions” in the
             Transformation Language Reference.


       Nested Aggregate Functions
             You can include multiple single-level or multiple nested functions in different output ports in
              an Aggregator transformation. However, you cannot include both single-level and nested
              functions in an Aggregator transformation. Therefore, if an Aggregator transformation
  contains a single-level function in any output port, you cannot use a nested function in any
  other port in that transformation. When you include single-level and nested functions in the
  same Aggregator transformation, the Designer marks the mapping or mapplet invalid. If you
  need to create both single-level and nested functions, create separate Aggregator
  transformations.


Conditional Clauses
  Use conditional clauses in the aggregate expression to reduce the number of rows used in the
  aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE.
  For example, use the following expression to calculate the total commissions of employees
  who exceeded their quarterly quota:
        SUM( COMMISSION, COMMISSION > QUOTA )


Non-Aggregate Functions
  You can also use non-aggregate functions in the aggregate expression.
  The following expression returns the highest number of items sold for each item (grouped by
  item). If no items were sold, the expression returns 0.
         IIF( MAX( QUANTITY ) > 0, MAX( QUANTITY ), 0 )


Null Values in Aggregate Functions
  When you configure the Integration Service, you can choose how you want the Integration
  Service to handle null values in aggregate functions. You can choose to treat null values in
  aggregate functions as NULL or zero. By default, the Integration Service treats null values as
  NULL in aggregate functions.
  For information about changing this default behavior, see “Creating and Configuring the
  Integration Service” in the Administrator Guide.




Group By Ports
             The Aggregator transformation lets you define groups for aggregations, rather than
             performing the aggregation across all input data. For example, rather than finding the total
             company sales, you can find the total sales grouped by region.
             To define a group for the aggregate expression, select the appropriate input, input/output,
             output, and variable ports in the Aggregator transformation. You can select multiple group by
             ports, creating a new group for each unique combination of groups. The Integration Service
             then performs the defined aggregation for each group.
             When you group values, the Integration Service produces one row for each group. If you do
             not group values, the Integration Service returns one row for all input rows. The Integration
             Service typically returns the last row of each group (or the last row received) with the result of
             the aggregation. However, if you specify a particular row to be returned (for example, by using
             the FIRST function), the Integration Service then returns the specified row.
             When selecting multiple group by ports in the Aggregator transformation, the Integration
             Service uses port order to determine the order by which it groups. Since group order can
             affect the results, order group by ports to ensure the appropriate grouping. For example, the
             results of grouping by ITEM_ID then QUANTITY can vary from grouping by QUANTITY
             then ITEM_ID, because the numeric values for quantity are not necessarily unique.
              For example, an Aggregator transformation groups first by STORE_ID and then by ITEM.
              If you send the following data through this Aggregator transformation:
             STORE_ID       ITEM            QTY        PRICE
             101            'battery'       3          2.99
             101            'battery'       1          3.19
             101            'battery'       2          2.59
             101            'AAA'           2          2.45
             201            'battery'       1          1.99
             201            'battery'       4          1.59
             301            'battery'       1          2.45




The Integration Service performs the aggregate calculation on the following unique groups:
  STORE_ID       ITEM
  101            'battery'
  101            'AAA'
  201            'battery'
  301            'battery'


  The Integration Service then passes the last row received, along with the results of the
  aggregation, as follows:
  STORE_ID          ITEM                QTY         PRICE         SALES_PER_STORE
  101               'battery'           2           2.59          17.34
  101               'AAA'               2           2.45          4.90
  201               'battery'           4           1.59          8.35
  301               'battery'           1           2.45          2.45



Non-Aggregate Expressions
  Use non-aggregate expressions in group by ports to modify or replace groups. For example, if
  you want to replace ‘AAA battery’ before grouping, you can create a new group by output
  port, named CORRECTED_ITEM, using the following expression:
         IIF( ITEM = 'AAA battery', 'battery', ITEM )


Default Values
  Use default values in the group by port to replace null input values. This allows the
  Integration Service to include null item groups in the aggregation. For more information
  about default values, see “Using Default Values for Ports” on page 18.




For example, if you define a default value of ‘Misc’ in the ITEM column, the Integration
Service replaces null groups with ‘Misc’.




Using Sorted Input
      You can improve Aggregator transformation performance by using the sorted input option.
      When you use sorted input, the Integration Service assumes all data is sorted by group. As the
      Integration Service reads rows for a group, it performs aggregate calculations. When
      necessary, it stores group information in memory. To use the Sorted Input option, you must
      pass sorted data to the Aggregator transformation. You can gain performance with sorted
      ports when you configure the session with multiple partitions.
      When you do not use sorted input, the Integration Service performs aggregate calculations as
      it reads. However, since data is not sorted, the Integration Service stores data for each group
      until it reads the entire source to ensure all aggregate calculations are accurate.
      For example, one Aggregator transformation has the STORE_ID and ITEM group by ports,
      with the sorted input option selected. When you pass the following data through the
      Aggregator, the Integration Service performs an aggregation for the three rows in the
      101/battery group as soon as it finds the new group, 201/battery:
      STORE_ID        ITEM               QTY        PRICE
      101             'battery'          3          2.99
      101             'battery'          1          3.19
      101             'battery'          2          2.59
      201             'battery'          4          1.59
      201             'battery'          1          1.99


      If you use sorted input and do not presort data correctly, you receive unexpected results.


    Sorted Input Conditions
       Do not use sorted input if either of the following conditions is true:
      ♦   The aggregate expression uses nested aggregate functions.
      ♦   The session uses incremental aggregation.
      If you use sorted input and do not sort data correctly, the session fails.


    Pre-Sorting Data
      To use sorted input, you pass sorted data through the Aggregator.
      Data must be sorted as follows:
      ♦   By the Aggregator group by ports, in the order they appear in the Aggregator
          transformation.
      ♦   Using the same sort order configured for the session. If data is not in strict ascending or
          descending order based on the session sort order, the Integration Service fails the session.
          For example, if you configure a session to use a French sort order, data passing into the
          Aggregator transformation must be sorted using the French sort order.

For relational and file sources, use the Sorter transformation to sort data in the mapping
             before passing it to the Aggregator transformation. You can place the Sorter transformation
             anywhere in the mapping prior to the Aggregator if no transformation changes the order of
             the sorted data. Group by columns in the Aggregator transformation must be in the same
             order as they appear in the Sorter transformation. For information about sorting data using
             the Sorter transformation, see “Sorter Transformation” on page 435.
             If the session uses relational sources, you can also use the Number of Sorted Ports option in
             the Source Qualifier transformation to sort group by columns in the source database. Group
             by columns must be in the same order in both the Aggregator and Source Qualifier
             transformations. For information about sorting data in the Source Qualifier, see “Using
             Sorted Ports” on page 472.
             Figure 2-1 shows the mapping with a Sorter transformation configured to sort the source data
             in descending order by ITEM_NAME:

             Figure 2-1. Sample Mapping with Aggregator and Sorter Transformations




             The Sorter transformation sorts the data as follows:
             ITEM_NAME            QTY           PRICE
             Soup                 4             2.95
             Soup                 1             2.95
             Soup                 2             3.25
             Cereal               1             4.49
             Cereal               2             5.25


             With sorted input, the Aggregator transformation returns the following results:
             ITEM_NAME            QTY                PRICE                  INCOME_PER_ITEM
             Cereal               2                  5.25                   14.99
             Soup                 2                  3.25                   21.25




Creating an Aggregator Transformation
      To use an Aggregator transformation in a mapping, add the Aggregator transformation to the
      mapping. Then configure the transformation with an aggregate expression and group by
      ports.

      To create an Aggregator transformation:

      1.   In the Mapping Designer, click Transformation > Create. Select the Aggregator
           transformation.
      2.   Enter a name for the Aggregator, click Create. Then click Done.
           The Designer creates the Aggregator transformation.
      3.   Drag the ports to the Aggregator transformation.
           The Designer creates input/output ports for each port you include.
      4.   Double-click the title bar of the transformation to open the Edit Transformations dialog
           box.
      5.   Select the Ports tab.
      6.   Click the group by option for each column you want the Aggregator to use in creating
           groups.
           Optionally, enter a default value to replace null groups.
           If you want to use a non-aggregate expression to modify groups, click the Add button and
           enter a name and data type for the port. Make the port an output port by clearing Input
           (I). Click in the right corner of the Expression field, enter the non-aggregate expression
           using one of the input ports, and click OK. Select Group By.
      7.   Click Add and enter a name and data type for the aggregate expression port. Make the
           port an output port by clearing Input (I). Click in the right corner of the Expression field
           to open the Expression Editor. Enter the aggregate expression, click Validate, and click
           OK.
           Make sure the expression validates before closing the Expression Editor.
      8.   Add default values for specific ports.
           If certain ports are likely to contain null values, you might specify a default value if the
           target database does not handle null values.




9.   Select the Properties tab.




                  Select and modify these options:

                    Aggregator Setting      Description

                    Cache Directory         Local directory where the Integration Service creates the index and data cache files. By
                                            default, the Integration Service uses the directory entered in the Workflow Manager for
                                            the process variable $PMCacheDir. If you enter a new directory, make sure the
                                            directory exists and contains enough disk space for the aggregate caches.
                                            If you have enabled incremental aggregation, the Integration Service creates a backup
                                            of the files each time you run the session. The cache directory must contain enough
                                            disk space for two sets of the files. For information about incremental aggregation, see
                                            “Using Incremental Aggregation” in the Workflow Administration Guide.

                    Tracing Level           Amount of detail displayed in the session log for this transformation.

                    Sorted Input            Indicates input data is presorted by groups. Select this option only if the mapping
                                            passes sorted data to the Aggregator transformation.

                    Aggregator Data         Data cache size for the transformation. Default cache size is 2,000,000 bytes. If the
                    Cache Size              total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must
                                            run the session on a 64-bit Integration Service. You can configure the Integration
                                            Service to determine the cache size at runtime, or you can configure a numeric value. If
                                            you configure the Integration Service to determine the cache size, you can also
                                            configure a maximum amount of memory for the Integration Service to allocate to the
                                            cache.





       Aggregator Index       Index cache size for the transformation. Default cache size is 1,000,000 bytes. If the
       Cache Size             total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must
                              run the session on a 64-bit Integration Service. You can configure the Integration
                              Service to determine the cache size at runtime, or you can configure a numeric value. If
                              you configure the Integration Service to determine the cache size, you can also
                              configure a maximum amount of memory for the Integration Service to allocate to the
                              cache.

       Transformation Scope   Specifies how the Integration Service applies the transformation logic to incoming data:
                              - Transaction. Applies the transformation logic to all rows in a transaction. Choose
                                Transaction when a row of data depends on all rows in the same transaction, but does
                                not depend on rows in other transactions.
                               - All Input. Applies the transformation logic on all incoming data. When you choose All
                                 Input, the Integration Service drops incoming transaction boundaries. Choose All Input
                                 when a row of data depends on all rows in the source.
                              For more information about transformation scope, see “Understanding Commit Points”
                              in the Workflow Administration Guide.


10.   Click OK.
11.   Click Repository > Save to save changes to the mapping.




Tips
             Use the following guidelines to optimize the performance of an Aggregator transformation.

             Use sorted input to decrease the use of aggregate caches.
             Sorted input reduces the amount of data cached during the session and improves session
             performance. Use this option with the Sorter transformation to pass sorted data to the
             Aggregator transformation.

             Limit connected input/output or output ports.
             Limit the number of connected input/output or output ports to reduce the amount of data
             the Aggregator transformation stores in the data cache.

             Filter before aggregating.
             If you use a Filter transformation in the mapping, place the transformation before the
             Aggregator transformation to reduce unnecessary aggregation.




Troubleshooting
      I selected sorted input but the workflow takes the same amount of time as before.
      You cannot use sorted input if any of the following conditions are true:
      ♦   The aggregate expression contains nested aggregate functions.
      ♦   The session uses incremental aggregation.
       ♦   The session treats source data as data driven.
      When any of these conditions are true, the Integration Service processes the transformation as
      if you do not use sorted input.

      A session using an Aggregator transformation causes slow performance.
      The Integration Service may be paging to disk during the workflow. You can increase session
      performance by increasing the index and data cache sizes in the transformation properties. For
      more information about caching, see “Session Caches” in the Workflow Administration Guide.

      I entered an override cache directory in the Aggregator transformation, but the Integration
      Service saves the session incremental aggregation files somewhere else.
      You can override the transformation cache directory on a session level. The Integration
      Service notes the cache directory in the session log. You can also check the session properties
      for an override cache directory.




Chapter 3




Custom Transformation


   This chapter includes the following topics:
   ♦   Overview, 54
   ♦   Creating Custom Transformations, 57
   ♦   Working with Groups and Ports, 59
   ♦   Working with Port Attributes, 62
   ♦   Custom Transformation Properties, 64
   ♦   Working with Transaction Control, 68
   ♦   Blocking Input Data, 70
   ♦   Working with Procedure Properties, 72
   ♦   Creating Custom Transformation Procedures, 73




Overview
                   Transformation type:
                   Active/Passive
                   Connected


            Custom transformations operate in conjunction with procedures you create outside of the
            Designer interface to extend PowerCenter functionality. You can create a Custom
            transformation and bind it to a procedure that you develop using the functions described in
            “Custom Transformation Functions” on page 89.
            Use the Custom transformation to create transformation applications, such as sorting and
            aggregation, which require all input rows to be processed before outputting any output rows.
            To support this process, the input and output functions occur separately in Custom
            transformations compared to External Procedure transformations.
            The Integration Service passes the input data to the procedure using an input function. The
            output function is a separate function that you must enter in the procedure code to pass
            output data to the Integration Service. In contrast, in the External Procedure transformation,
            an external procedure function does both input and output, and its parameters consist of all
            the ports of the transformation.
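             For illustration, the following fragment sketches this separation in procedure code. It is a
             minimal sketch rather than generated code: the p_myproc_inputRowNotification() name
             follows the p_<procedure_name> naming pattern, and the handle types and signature shown
             are assumptions. For the actual functions and signatures, see “Custom Transformation
             Functions” on page 89.

                 /* Sketch: the Integration Service calls the input function once for
                    each input row; the procedure emits rows separately through the
                    output function. Names, handle types, and signatures here are
                    illustrative assumptions. */
                 INFA_ROWSTATUS p_myproc_inputRowNotification(
                     INFA_CT_PARTITION_HANDLE  partition,
                     INFA_CT_INPUTGROUP_HANDLE inputGroup)
                 {
                     /* Buffer or process the input row here, for example for sorting
                        or aggregation that must read all input rows before producing
                        any output rows. */

                     /* When a result row is ready, pass it to the Integration Service
                        with the separate output function:
                        INFA_CTOutputNotification(outputGroup);                      */

                     return INFA_ROWSUCCESS;
                 }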
            You can also use the Custom transformation to create a transformation that requires multiple
            input groups, multiple output groups, or both. A group is the representation of a row of data
            entering or leaving a transformation. For example, you might create a Custom transformation
            with one input group and multiple output groups that parses XML data. Or, you can create a
            Custom transformation with two input groups and one output group that merges two streams
            of input data into one stream of output data.


       Working with Transformations Built On the Custom Transformation
            You can build transformations using the Custom transformation. Some of the PowerCenter
            transformations are built using the Custom transformation. Rules that apply to Custom
            transformations, such as blocking rules, also apply to transformations built using Custom
            transformations. For example, when you connect a Custom transformation in a mapping, you
            must verify that the data can flow from all sources in a target load order group to the targets
            without the Integration Service blocking all sources. Similarly, you must also verify this for
            transformations built using a Custom transformation. For more information about data flow
            validation, see “Mappings” in the Designer Guide.
            The following transformations that ship with Informatica products are built using the
            Custom transformation:
            ♦   HTTP transformation with PowerCenter
            ♦   Java transformation with PowerCenter
            ♦   SQL transformation with PowerCenter
            ♦   Union transformation with PowerCenter


♦   XML Parser transformation with PowerCenter
  ♦   XML Generator transformation with PowerCenter
  ♦   SAP/ALE_IDoc_Interpreter transformation with PowerCenter Connect for SAP
      NetWeaver mySAP Option
  ♦   SAP/ALE_IDoc_Prepare transformation with PowerCenter Connect for SAP NetWeaver
      mySAP Option
  ♦   Web Service Consumer transformation with PowerCenter Connect for Web Services
  ♦   Address transformation with Data Cleansing Option
  ♦   Parse transformation with Data Cleansing Option


Code Page Compatibility
  The Custom transformation procedure code page is the code page of the data the Custom
  transformation procedure processes. The following factors determine the Custom
  transformation procedure code page:
  ♦   Integration Service data movement mode
  ♦   The INFA_CTChangeStringMode() function
  ♦   The INFA_CTSetDataCodePageID() function
  The Custom transformation procedure code page must be two-way compatible with the
  Integration Service code page. The Integration Service passes data to the procedure in the
  Custom transformation procedure code page. Also, the data the procedure passes to the
  Integration Service must be valid characters in the Custom transformation procedure code
  page.
  By default, when the Integration Service runs in ASCII mode, the Custom transformation
  procedure code page is ASCII. Also, when the Integration Service runs in Unicode mode, the
  Custom transformation procedure code page is UCS-2, but the Integration Service only
  passes characters that are valid in the Integration Service code page.
   However, you can use the INFA_CTChangeStringMode() function in the procedure code to request
  the data in a different format. In addition, when the Integration Service runs in Unicode
  mode, you can request the data in a different code page using the
  INFA_CTSetDataCodePageID() function.
  Changing the format or requesting the data in a different code page changes the Custom
  transformation procedure code page to the code page the procedure requests:
  ♦   ASCII mode. You can write the external procedure code to request the data in UCS-2
      format using the INFA_CTChangeStringMode() function. When you use this function,
      the procedure must pass only ASCII characters in UCS-2 format to the Integration
      Service. Do not use the INFA_CTSetDataCodePageID() function when the Integration
      Service runs in ASCII mode.
  ♦   Unicode mode. You can write the external procedure code to request the data in MBCS
      using the INFA_CTChangeStringMode() function. When the external procedure requests
       the data in MBCS, the Integration Service passes the data in the Integration Service code
       page. When you use the INFA_CTChangeStringMode() function, you can write the
                external procedure code to request the data in a different code page from the Integration
                Service code page using the INFA_CTSetDataCodePageID() function. The code page you
                specify in the INFA_CTSetDataCodePageID() function must be two-way compatible with
                the Integration Service code page.
            Note: You can also use the INFA_CTRebindInputDataType() function to change the format
            for a specific port in the Custom transformation.
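             The following fragment sketches how a procedure might request a different string mode
             during initialization. It is a sketch under assumptions: the initialization function signature,
             the handle type, and the eUNICODE_MODE constant are illustrative, while
             INFA_CTChangeStringMode() and INFA_CTSetDataCodePageID() are the functions
             described above. See “Custom Transformation Functions” on page 89 for the actual
             signatures.

                 /* Sketch: request a different string format during procedure
                    initialization. The init signature, handle type, and the
                    eUNICODE_MODE constant are illustrative assumptions. */
                 INFA_STATUS p_myproc_procInit(INFA_CT_PROCEDURE_HANDLE procedure)
                 {
                     /* Request string data in UCS-2 format. In ASCII data movement
                        mode, the procedure must then pass only ASCII characters in
                        UCS-2 format back to the Integration Service. */
                     INFA_CTChangeStringMode(procedure, eUNICODE_MODE);

                     /* Unicode mode only: request a specific code page that is
                        two-way compatible with the Integration Service code page:
                        INFA_CTSetDataCodePageID(procedure, dataCodePageID);       */

                     return INFA_SUCCESS;
                 }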


       Distributing Custom Transformation Procedures
            You can copy a Custom transformation from one repository to another. When you copy a
            Custom transformation between repositories, you must verify that the Integration Service
            machine the target repository uses contains the Custom transformation procedure.




Creating Custom Transformations
      You can create reusable Custom transformations in the Transformation Developer, and add
      instances of the transformation to mappings. You can create non-reusable Custom
      transformations in the Mapping Designer or Mapplet Designer.
      Each Custom transformation specifies a module and a procedure name. You can create a
      Custom transformation based on an existing shared library or DLL containing the procedure,
      or you can create a Custom transformation as the basis for creating the procedure. When you
      create a Custom transformation to use with an existing shared library or DLL, make sure you
      define the correct module and procedure name.
      When you create a Custom transformation as the basis for creating the procedure, select the
      transformation and generate the code. The Designer uses the transformation properties when
      it generates the procedure code. It generates code in a single directory for all transformations
      sharing a common module name.
       The Designer generates the following files (a sketch of the generated stubs follows this list):
      ♦   m_<module_name>.c. Defines the module. This file includes an initialization function,
          m_<module_name>_moduleInit() that lets you write code you want the Integration
          Service to run when it loads the module. Similarly, this file includes a deinitialization
          function, m_<module_name>_moduleDeinit(), that lets you write code you want the
          Integration Service to run before it unloads the module.
      ♦   p_<procedure_name>.c. Defines the procedure in the module. This file contains the code
          that implements the procedure logic, such as data cleansing or merging data.
      ♦   makefile.aix, makefile.aix64, makefile.hp, makefile.hp64, makefile.hpparisc64,
          makefile.linux, makefile.sol, and makefile.sol64. Make files for the UNIX platforms. Use
          makefile.aix64 for 64-bit AIX platforms, makefile.sol64 for 64-bit Solaris platforms, and
          makefile.hp64 for 64-bit HP-UX (Itanium) platforms.
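
       For illustration, the following sketch shows what the generated stubs for a module named
       demo might look like. The function names follow the m_<module_name> pattern described
       above; the header file name, parameter types, and return values are assumptions, since the
       actual code comes from the Designer's code generation.

           /* m_demo.c -- sketch of the module file the Designer generates for
              a module named "demo". The header file name, parameter types,
              and return values are illustrative assumptions. */
           #include "m_demo.h"

           /* Runs when the Integration Service loads the module. */
           INFA_STATUS m_demo_moduleInit(INFA_CT_MODULE_HANDLE module)
           {
               /* Acquire resources shared by all procedures in the module. */
               return INFA_SUCCESS;
           }

           /* Runs before the Integration Service unloads the module. */
           INFA_STATUS m_demo_moduleDeinit(INFA_CT_MODULE_HANDLE module)
           {
               /* Release resources acquired in moduleInit. */
               return INFA_SUCCESS;
           }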


    Rules and Guidelines
      Use the following rules and guidelines when you create a Custom transformation:
      ♦   Custom transformations are connected transformations. You cannot reference a Custom
          transformation in an expression.
      ♦   You can include multiple procedures in one module. For example, you can include an
          XML writer procedure and an XML parser procedure in the same module.
      ♦   You can bind one shared library or DLL to multiple Custom transformation instances if
          you write the procedure code to handle multiple Custom transformation instances.
      ♦   When you write the procedure code, you must make sure it does not violate basic mapping
          rules. For more information about mappings and mapping validation, see “Mappings” in
          the Designer Guide.
      ♦   The Custom transformation sends and receives high precision decimals as high precision
          decimals.



♦   Use multi-threaded code in Custom transformation procedures.


       Custom Transformation Components
            When you configure a Custom transformation, you define the following components:
            ♦   Transformation tab. You can rename the transformation and add a description on the
                Transformation tab.
            ♦   Ports tab. You can add and edit ports and groups to a Custom transformation. For more
                information about creating ports and groups, see “Working with Groups and Ports” on
                page 59. You can also define the input ports an output port depends on. For more
                information about defining port dependencies, see “Defining Port Relationships” on
                page 60.
            ♦   Port Attribute Definitions tab. You can create user-defined port attributes for Custom
                transformation ports. For more information about creating and editing port attributes, see
                “Working with Port Attributes” on page 62.
            ♦   Properties tab. You can define transformation properties such as module and function
                identifiers, transaction properties, and the runtime location. For more information about
                defining transformation properties, see “Custom Transformation Properties” on page 64.
            ♦   Initialization Properties tab. You can define properties that the external procedure uses at
                runtime, such as during initialization. For more information about creating initialization
                properties, see “Working with Procedure Properties” on page 72.
            ♦   Metadata Extensions tab. You can create metadata extensions to define properties that the
                procedure uses at runtime, such as during initialization. For more information about using
                metadata extensions for procedure properties, see “Working with Procedure Properties” on
                page 72.




Working with Groups and Ports
      A Custom transformation has both input and output groups. It also can have input ports,
      output ports, and input/output ports. You create and edit groups and ports on the Ports tab of
      the Custom transformation. You can also define the relationship between input and output
      ports on the Ports tab.
      Figure 3-1 shows the Custom transformation Ports tab:

       Figure 3-1. Custom Transformation Ports Tab

       On the Ports tab, you can add and delete groups and edit port attributes. Each group appears
       under its own header, such as a first input group header, an output group header, and a
       second input group header, and related group headers appear as coupled group headers.

    Creating Groups and Ports
      You can create multiple input groups and multiple output groups in a Custom
      transformation. You must create at least one input group and one output group. To create an
      input group, click the Create Input Group icon. To create an output group, click the Create
      Output Group icon. When you create a group, the Designer adds it as the last group. When
      you create a passive Custom transformation, you can only create one input group and one
      output group.
      To create a port, click the Add button. When you create a port, the Designer adds it below the
      currently selected row or group. Each port contains attributes defined on the Port Attribute
      Definitions tab. You can edit the attributes for each port. For more information about
      creating and editing user-defined port attributes, see “Working with Port Attributes” on
      page 62.




Editing Groups and Ports
            Use the following rules and guidelines when you edit ports and groups in a Custom
            transformation:
            ♦   You can change group names by typing in the group header.
            ♦   You can only enter ASCII characters for port and group names.
            ♦   Once you create a group, you cannot change the group type. If you need to change the
                group type, delete the group and add a new group.
            ♦   When you delete a group, the Designer deletes all ports of the same type in that group.
                However, all input/output ports remain in the transformation, belong to the group above
                them, and change to input ports or output ports, depending on the type of group you
                delete. For example, an output group contains output ports and input/output ports. You
                delete the output group. The Designer deletes the output ports. It changes the input/
                output ports to input ports. Those input ports belong to the input group with the header
                directly above them.
            ♦   To move a group up or down, select the group header and click the Move Port Up or Move
                Port Down button. The ports above and below the group header remain the same, but the
                groups to which they belong might change.


       Defining Port Relationships
            By default, an output port in a Custom transformation depends on all input ports. However,
            you can define the relationship between input and output ports in a Custom transformation.
            When you do this, you can view link paths in a mapping containing a Custom transformation
            and you can see which input ports an output port depends on. You can also view source
            column dependencies for target ports in a mapping containing a Custom transformation.
            To define the relationship between ports in a Custom transformation, create a port
            dependency. A port dependency is the relationship between an output or input/output port
            and one or more input or input/output ports. When you create a port dependency, base it on
            the procedure logic in the code.
            To create a port dependency, click Custom Transformation on the Ports tab and choose Port
            Dependencies.




60   Chapter 3: Custom Transformation
Figure 3-2 shows where you create and edit port dependencies:

Figure 3-2. Editing Port Dependencies

In the Output Port Dependencies dialog box, you choose an output or input/output port,
choose the input or input/output ports on which it depends, and add or remove port
dependencies.


For example, suppose you create an external procedure that parses XML data. You create a Custom
transformation with one input group containing one input port and multiple output groups
containing multiple output ports. According to the external procedure logic, all output ports
depend on the input port. You can define this relationship in the Custom transformation by
creating a port dependency for each output port. Define each port dependency so that the
output port depends on the one input port.

To create a port dependency:

1.   On the Ports tab, click Custom Transformation and choose Port Dependencies.
2.   In the Output Port Dependencies dialog box, select an output or input/output port in
     the Output Port field.
3.   In the Input Ports pane, select an input or input/output port on which the output port or
     input/output port depends.
4.   Click Add.
5.   Repeat steps 3 to 4 to include more input or input/output ports in the port dependency.
6.   To create another port dependency, repeat steps 2 to 5.
7.   Click OK.




Working with Port Attributes
            Ports have certain attributes, such as datatype and precision. When you create a Custom
            transformation, you can create user-defined port attributes. User-defined port attributes apply
            to all ports in a Custom transformation.
             For example, you create an external procedure to parse XML data. You can create a port
            attribute called “XML path” where you can define the position of an element in the XML
            hierarchy.
            Create port attributes and assign default values on the Port Attribute Definitions tab of the
            Custom transformation. You can define a specific port attribute value for each port on the
            Ports tab.
            Figure 3-3 shows the Port Attribute Definitions tab where you create port attributes:

             Figure 3-3. Port Attribute Definitions Tab

             The tab lists each port attribute and its default value.
            When you create a port attribute, define the following properties:
            ♦   Name. The name of the port attribute.
            ♦   Datatype. The datatype of the port attribute value. You can choose Boolean, Numeric, or
                String.
            ♦   Value. The default value of the port attribute. This property is optional. When you enter a
                value here, the value applies to all ports in the Custom transformation. You can override
                the port attribute value for each port on the Ports tab.
            You define port attributes for each Custom transformation. You cannot copy a port attribute
            from one Custom transformation to another.
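
            For example, the external procedure could read the “XML path” attribute for each input
            port at run time. The following sketch is an assumption-heavy illustration: it assumes the
            get external property functions described in “Property Functions” on page 108 also accept
            a port handle for port attributes, and that the parameters are the handle, the attribute
            name, and the address of the value, in that order. Verify the exact signature before
            using it.

            const char* xmlPath = NULL;
            size_t i = 0;

            for (i = 0; i < nNumInputPorts; i++)
            {
                /* Read the user-defined "XML path" attribute for this port
                 * (assumed signature; see "Property Functions" on page 108). */
                INFA_CT_getExternalPropertyM( inputGroupPorts[i], "XML path",
                                              &xmlPath );

                /* use xmlPath to locate the element in the XML hierarchy ... */
            }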




Editing Port Attribute Values
After you create port attributes, you can edit the port attribute values for each port in the
transformation. To edit the port attribute values, click Custom Transformation on the Ports
tab and choose Edit Port Attribute.
Figure 3-4 shows where you edit port attribute values:

Figure 3-4. Edit Port Attribute Values

In this dialog box, you can filter ports by group, edit a port attribute value, or revert to
the default port attribute value.

You can change the port attribute value for a particular port by clicking the Open button.
This opens the Edit Port Attribute Default Value dialog box. Or, you can enter a new value by
typing directly in the Value column.
You can filter the ports listed in the Edit Port Level Attributes dialog box by choosing a group
from the Select Group field.




Custom Transformation Properties
            Properties for the Custom transformation apply to both the procedure and the
            transformation. Configure the Custom transformation properties on the Properties tab of the
            Custom transformation.
            Figure 3-5 shows the Custom transformation Properties tab:

            Figure 3-5. Custom Transformation Properties




            Table 3-1 describes the Custom transformation properties:

            Table 3-1. Custom Transformation Properties

              Option                    Description

              Language                  Language used for the procedure code. You define the language when you create the
                                        Custom transformation. If you need to change the language, create a new Custom
                                        transformation.

              Module Identifier         Module name. Applies to Custom transformation procedures developed using C or C++.
                                        Enter only ASCII characters in this field. You cannot enter multibyte characters.
                                        This property is the base name of the DLL or the shared library that contains the procedure.
                                        The Designer uses this name to create the C file when you generate the external procedure
                                        code.

              Function Identifier       Name of the procedure in the module. Applies to Custom transformation procedures
                                        developed using C.
                                        Enter only ASCII characters in this field. You cannot enter multibyte characters.
                                        The Designer uses this name to create the C file where you enter the procedure code.




 Class Name              Class name of the Custom transformation procedure. Applies to Custom transformation
                         procedures developed using C++ or Java.
                         Enter only ASCII characters in this field. You cannot enter multibyte characters.

 Runtime Location        Location that contains the DLL or shared library. Default is $PMExtProcDir. Enter a path
                         relative to the Integration Service machine that runs the session using the Custom
                         transformation.
                         If you leave this property blank, the Integration Service uses an environment variable
                         defined on the Integration Service machine to locate the DLL or shared library.
                         You must copy all DLLs or shared libraries to the runtime location or to the directory that
                         the environment variable on the Integration Service machine specifies. The Integration
                         Service fails to load the procedure when it cannot locate the DLL, shared library, or a
                         referenced file.

 Tracing Level           Amount of detail displayed in the session log for this transformation. Default is Normal.

 Is Partitionable        Indicates if you can create multiple partitions in a pipeline that uses this transformation:
                         - No. The transformation cannot be partitioned. The transformation and other
                           transformations in the same pipeline are limited to one partition.
                         - Locally. The transformation can be partitioned, but the Integration Service must run all
                           partitions in the pipeline on the same node. Choose Local when different partitions of the
                           Custom transformation must share objects in memory.
                         - Across Grid. The transformation can be partitioned, and the Integration Service can
                           distribute each partition to different nodes.
                         Default is No.
                         For more information about using partitioning with Custom transformations, see “Working
                         with Partition Points” in the Workflow Administration Guide.

 Inputs Must Block       Indicates if the procedure associated with the transformation must be able to block incoming
                         data. Default is enabled.
                         For more information about blocking data, see “Blocking Input Data” on page 70.

 Is Active               Indicates if this transformation is an active or passive transformation.
                         You cannot change this property after you create the Custom transformation. If you need to
                         change this property, create a new Custom transformation and select the correct property
                         value.

 Update Strategy         Indicates if this transformation defines the update strategy for output rows. Default is
 Transformation          disabled. You can enable this for active Custom transformations.
                         For more information about this property, see “Setting the Update Strategy” on page 66.

 Transformation Scope    Indicates how the Integration Service applies the transformation logic to incoming data:
                         - Row
                         - Transaction
                         - All Input
                         When the transformation is passive, this property is always Row. When the transformation is
                         active, this property is All Input by default.
                         For more information about working with transaction control, see “Working with Transaction
                         Control” on page 68.

 Generate Transaction    Indicates if this transformation can generate transactions. When a Custom transformation
                         generates transactions, it generates transactions for all output groups.
                         Default is disabled. You can only enable this for active Custom transformations.
                         For more information about working with transaction control, see “Working with Transaction
                         Control” on page 68.




                Output is Ordered         Indicates if the order of the output data is consistent between session runs.
                                          - Never. The order of the output data is inconsistent between session runs. This is the default
                                            for active transformations.
                                          - Based On Input Order. The output order is consistent between session runs when the input
                                            data order is consistent between session runs. This is the default for passive
                                            transformations.
                                          - Always. The order of the output data is consistent between session runs even if the order of
                                            the input data is inconsistent between session runs.

 Requires Single Thread    Indicates if the Integration Service uses one thread to process the procedure for each
 Per Partition             partition. When you enable this option, the procedure code can use thread-specific
                           operations. Default is enabled.
                                          For more information about writing thread-specific operations, see “Working with Thread-
                                          Specific Procedure Code” on page 66.

                Output is Deterministic   Indicates whether the transformation generates consistent output data between session
                                          runs. You must enable this property to perform recovery on sessions that use this
                                          transformation.
                                          For more information about session recovery, see “Recovering Workflows” in the Workflow
                                          Administration Guide.



       Setting the Update Strategy
            Use an active Custom transformation to set the update strategy for a mapping at the following
            levels:
            ♦     Within the procedure. You can write the external procedure code to set the update strategy
                  for output rows. The external procedure can flag rows for insert, update, delete, or reject.
                  For more information about the functions used to set the update strategy, see “Row
                  Strategy Functions (Row-Based Mode)” on page 128.
            ♦     Within the mapping. Use the Custom transformation in a mapping to flag rows for insert,
                  update, delete, or reject. Select the Update Strategy Transformation property for the
                  Custom transformation.
            ♦     Within the session. Configure the session to treat the source rows as data driven.
            If you do not configure the Custom transformation to define the update strategy, or you do
            not configure the session as data driven, the Integration Service does not use the external
            procedure code to flag the output rows. Instead, when the Custom transformation is active,
            the Integration Service flags the output rows as insert. When the Custom transformation is
            passive, the Integration Service retains the row type. For example, when a row flagged for
            update enters a passive Custom transformation, the Integration Service maintains the row
            type and outputs the row as update.
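
            For example, the following sketch shows how row-based external procedure code might flag
            each row before the output notification. The function name and the eUS_* constants follow
            “Row Strategy Functions (Row-Based Mode)” on page 128, but treat them as assumptions and
            verify them against that reference. myCondition() is a hypothetical helper, and
            outputGroups and inputGroupPorts are handle arrays obtained with
            INFA_CTGetChildrenHandles(), as in the Union example later in this chapter.

            /* Inside the input row notification: choose the row type, set the
             * strategy on the output group, then output the row. */
            if (myCondition(inputGroupPorts))
                INFA_CTSetRowStrategy(outputGroups[0], eUS_UPDATE);
            else
                INFA_CTSetRowStrategy(outputGroups[0], eUS_INSERT);

            return INFA_CTOutputNotification(outputGroups[0]);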


       Working with Thread-Specific Procedure Code
            Custom transformation procedures can include thread-specific operations. A thread-specific
            operation is code that performs an action based on the thread that is processing the
            procedure.


You can configure the Custom transformation so the Integration Service uses one thread to
process the Custom transformation for each partition using the Requires Single Thread Per
Partition property.
When you configure a Custom transformation to process each partition with one thread, the
Integration Service calls the following functions with the same thread for each partition:
♦   p_<proc_name>_partitionInit()
♦   p_<proc_name>_partitionDeinit()
♦   p_<proc_name>_inputRowNotification()
♦   p_<proc_name>_dataBdryNotification()
♦   p_<proc_name>_eofNotification()
You can include thread-specific operations in these functions because the Integration Service
uses the same thread to process these functions for each partition. For example, you might
attach and detach threads to a Java Virtual Machine.
Note: When you configure a Custom transformation to process each partition with one thread,
the Workflow Manager adds partition points depending on the mapping configuration. For
more information, see “Working with Partition Points” in the Workflow Administration Guide.
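
For example, the following sketch attaches the partition thread to a Java Virtual Machine in
the partition initialization function and detaches it at deinitialization. This is a minimal
sketch: the procedure name p_myproc is hypothetical, and jvm is assumed to be a JavaVM
pointer that the module created elsewhere.

#include <jni.h>

extern JavaVM* jvm;   /* assumed to be created at module initialization */

INFA_STATUS p_myproc_partitionInit( INFA_CT_PARTITION_HANDLE partition )
{
    JNIEnv* env = NULL;

    /* Safe only when Requires Single Thread Per Partition is enabled,
     * because the same thread later calls partitionDeinit(). */
    if ((*jvm)->AttachCurrentThread(jvm, (void**)&env, NULL) != JNI_OK)
        return INFA_FAILURE;

    return INFA_SUCCESS;
}

INFA_STATUS p_myproc_partitionDeinit( INFA_CT_PARTITION_HANDLE partition )
{
    (*jvm)->DetachCurrentThread(jvm);
    return INFA_SUCCESS;
}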




Working with Transaction Control
            You can define transaction control for Custom transformations using the following
            transformation properties:
            ♦   Transformation Scope. Determines how the Integration Service applies the transformation
                logic to incoming data.
            ♦   Generate Transaction. Indicates that the procedure generates transaction rows and outputs
                them to the output groups.


       Transformation Scope
            You can configure how the Integration Service applies the transformation logic to incoming
            data. You can choose one of the following values:
            ♦   Row. Applies the transformation logic to one row of data at a time. Choose Row when the
                results of the procedure depend on a single row of data. For example, you might choose
                Row when a procedure parses a row containing an XML file.
            ♦   Transaction. Applies the transformation logic to all rows in a transaction. Choose
                Transaction when the results of the procedure depend on all rows in the same transaction,
                but not on rows in other transactions. When you choose Transaction, you must connect all
                input groups to the same transaction control point. For example, you might choose
                Transaction when the external procedure performs aggregate calculations on the data in a
                single transaction.
            ♦   All Input. Applies the transformation logic to all incoming data. When you choose All
                Input, the Integration Service drops transaction boundaries. Choose All Input when the
                results of the procedure depend on all rows of data in the source. For example, you might
                choose All Input when the external procedure performs aggregate calculations on all
                incoming data, or when it sorts all incoming data.
            For more information about transformation scope, see “Understanding Commit Points” in
            the Workflow Administration Guide.


       Generate Transaction
            You can write the external procedure code to output transactions, such as commit and
            rollback rows. When the external procedure outputs commit and rollback rows, configure the
            Custom transformation to generate transactions. Select the Generate Transaction
            transformation property. You can enable this property for active Custom transformations. For
            information about the functions you use to generate transactions, see “Data Boundary
            Output Notification Function” on page 121.
            When the external procedure outputs a commit or rollback row, it outputs or rolls back the
            row for all output groups.
            When you configure the transformation to generate transactions, the Integration Service
            treats the Custom transformation like a Transaction Control transformation. Most rules that
            apply to a Transaction Control transformation in a mapping also apply to the Custom

transformation. For example, when you configure a Custom transformation to generate
  transactions, you cannot concatenate pipelines or pipeline branches containing the
  transformation. For more information about working with Transaction Control
  transformations, see “Transaction Control Transformation” on page 555.
  When you edit or create a session using a Custom transformation configured to generate
  transactions, configure it for user-defined commit.
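
   For example, an external procedure might commit after it outputs the last row of each
   logical transaction. This sketch assumes the data boundary output notification function
   described on page 121 takes the partition handle and a boundary type constant (eBT_COMMIT
   or eBT_ROLLBACK); endOfTransaction is a hypothetical flag maintained by the procedure
   logic. Verify the names against the function reference.

   /* Output the current row, then commit the open transaction for all
    * output groups when the procedure logic says the transaction ended. */
   rowStatus = INFA_CTOutputNotification(outputGroups[0]);

   if (rowStatus == INFA_ROWSUCCESS && endOfTransaction)
   {
       if (INFA_CTDataBdryOutputNotification(partition, eBT_COMMIT) != INFA_SUCCESS)
           return INFA_FATALERROR;
   }

   return rowStatus;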


Working with Transaction Boundaries
  The Integration Service handles transaction boundaries entering and leaving Custom
  transformations based on the mapping configuration and the Custom transformation
  properties.
  Table 3-2 describes how the Integration Service handles transaction boundaries at Custom
  transformations:

Table 3-2. Transaction Boundary Handling with Custom Transformations

Transformation Scope: Row
- Generate Transactions enabled: The Integration Service drops incoming transaction
  boundaries and does not call the data boundary notification function. It outputs
  transaction rows according to the procedure logic across all output groups.
- Generate Transactions disabled: When the incoming data for all input groups comes from
  the same transaction control point, the Integration Service preserves incoming
  transaction boundaries and outputs them across all output groups. However, it does not
  call the data boundary notification function. When the incoming data for the input
  groups comes from different transaction control points, the Integration Service drops
  incoming transaction boundaries and does not call the data boundary notification
  function. It outputs all rows in one open transaction.

Transformation Scope: Transaction
- Generate Transactions enabled: The Integration Service preserves incoming transaction
  boundaries and calls the data boundary notification function. However, it outputs
  transaction rows according to the procedure logic across all output groups.
- Generate Transactions disabled: The Integration Service preserves incoming transaction
  boundaries and calls the data boundary notification function. It outputs the
  transaction rows across all output groups.

Transformation Scope: All Input
- Generate Transactions enabled: The Integration Service drops incoming transaction
  boundaries and does not call the data boundary notification function. It outputs
  transaction rows according to the procedure logic across all output groups.
- Generate Transactions disabled: The Integration Service drops incoming transaction
  boundaries and does not call the data boundary notification function. It outputs all
  rows in one open transaction.



Blocking Input Data
            By default, the Integration Service concurrently reads sources in a target load order group.
            However, you can write the external procedure code to block input data on some input
            groups. Blocking is the suspension of the data flow into an input group of a multiple input
            group transformation. For more information about blocking source data, see “Integration
            Service Architecture” in the Administrator Guide.
            To use a Custom transformation to block input data, you must write the procedure code to
            block and unblock data. You must also enable blocking on the Properties tab for the Custom
            transformation.


       Writing the Procedure Code to Block Data
            You can write the procedure to block and unblock incoming data. To block incoming data,
            use the INFA_CTBlockInputFlow() function. To unblock incoming data, use the
            INFA_CTUnblockInputFlow() function. For more information about the blocking
            functions, see “Blocking Functions” on page 125.
            You might want to block input data if the external procedure needs to alternate reading from
            input groups. Without the blocking functionality, you would need to write the procedure
             code to buffer incoming data. You can block input data instead of buffering it, which usually
             increases session performance.
            For example, you need to create an external procedure with two input groups. The external
            procedure reads a row from the first input group and then reads a row from the second input
            group. If you use blocking, you can write the external procedure code to block the flow of
            data from one input group while it processes the data from the other input group. When you
            write the external procedure code to block data, you increase performance because the
            procedure does not need to copy the source data to a buffer. However, you could write the
            external procedure to allocate a buffer and copy the data from one input group to the buffer
            until it is ready to process the data. Copying source data to a buffer decreases performance.
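
             The following sketch shows how such a procedure might alternate between the two groups
             inside the input row notification. It is a minimal sketch: group1 and group2 are
             hypothetical input group handles saved at partition initialization, and the blocking
             functions take an input group handle as described in “Blocking Functions” on page 125.

             INFA_ROWSTATUS p_myproc_inputRowNotification( INFA_CT_PARTITION_HANDLE partition,
                                                           INFA_CT_INPUTGROUP_HANDLE inputGroup )
             {
                 if (inputGroup == group1)
                 {
                     /* process the row from the first group ... */

                     /* then read one row from the second group instead */
                     INFA_CTBlockInputFlow(group1);
                     INFA_CTUnblockInputFlow(group2);
                 }
                 else
                 {
                     /* process the row from the second group ... */
                     INFA_CTBlockInputFlow(group2);
                     INFA_CTUnblockInputFlow(group1);
                 }

                 return INFA_ROWSUCCESS;
             }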


       Configuring Custom Transformations as Blocking Transformations
            When you create a Custom transformation, the Designer enables the Inputs Must Block
            transformation property by default. This property affects data flow validation when you save
            or validate a mapping. When you enable this property, the Custom transformation is a
            blocking transformation. When you clear this property, the Custom transformation is not a
            blocking transformation. For more information about blocking transformations, see “Multi-
            Group Transformations” on page 9.
            Configure the Custom transformation as a blocking transformation when the external
            procedure code must be able to block input data.
            You can configure the Custom transformation as a non-blocking transformation when one of
            the following conditions is true:
            ♦   The procedure code does not include the blocking functions.


♦   The procedure code includes two algorithms, one that uses blocking and the other that
      copies the source data to a buffer allocated by the procedure instead of blocking data. The
      code checks whether or not the Integration Service allows the Custom transformation to
      block data. The procedure uses the algorithm with the blocking functions when it can
      block, and uses the other algorithm when it cannot block. You might want to do this to
      create a Custom transformation that you use in multiple mapping configurations.
      For more information about verifying whether the Integration Service allows a Custom
      transformation to block data, see “Validating Mappings with Custom Transformations”
      on page 71.
  Note: When the procedure blocks data and you configure the Custom transformation as a
  non-blocking transformation, the Integration Service fails the session.


Validating Mappings with Custom Transformations
  When you include a Custom transformation in a mapping, both the Designer and Integration
  Service validate the mapping. The Designer validates the mapping you save or validate and
  the Integration Service validates the mapping when you run the session.

  Validating at Design Time
   When you save or validate a mapping, the Designer performs data flow validation. It verifies
   that the data can flow from all sources in a target load order group to the targets without
   blocking transformations blocking all sources. Some mappings with blocking transformations
   are therefore invalid. For more information about data flow validation, see “Mappings” in
   the Designer Guide.

  Validating at Runtime
  When you run a session, the Integration Service validates the mapping against the procedure
  code at runtime. When the Integration Service does this, it tracks whether or not it allows the
  Custom transformations to block data:
  ♦   Configure the Custom transformation as a blocking transformation. The Integration
      Service always allows the Custom transformation to block data.
  ♦   Configure the Custom transformation as a non-blocking transformation. The
      Integration Service allows the Custom transformation to block data depending on the
      mapping configuration. If the Integration Service can block data at the Custom
      transformation without blocking all sources in the target load order group simultaneously,
      it allows the Custom transformation to block data.
  You can write the procedure code to check whether or not the Integration Service allows a
  Custom transformation to block data. Use the INFA_CT_getInternalProperty() function to
  access the INFA_CT_TRANS_MAY_BLOCK_DATA property ID. The Integration Service
  returns TRUE when the Custom transformation can block data, and it returns FALSE when
  the Custom transformation cannot block data. For more information about the
  INFA_CT_getInternalProperty() function, see “Property Functions” on page 108.
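
   For example, the procedure initialization code might choose between the two algorithms
   based on this property. The sketch below assumes a Boolean variant of the get internal
   property functions and the INFA_BOOLEN type; verify both against “Property Functions” on
   page 108.

   INFA_BOOLEN canBlock = INFA_FALSE;

   /* transformation is an INFA_CT_TRANS_HANDLE obtained with
    * INFA_CTGetChildrenHandles() during procedure initialization. */
   INFA_CTGetInternalPropertyBool( transformation,
                                   INFA_CT_TRANS_MAY_BLOCK_DATA,
                                   &canBlock );

   if (canBlock == INFA_TRUE)
   {
       /* use the algorithm that calls the blocking functions */
   }
   else
   {
       /* use the algorithm that buffers rows instead of blocking */
   }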




Working with Procedure Properties
            You can define property name and value pairs in the Custom transformation that the
            procedure can use when the Integration Service runs the procedure, such as during
            initialization time. You can create user-defined properties on the following tabs of the Custom
            transformation:
            ♦   Metadata Extensions. You can specify the property name, datatype, precision, and value.
                Use metadata extensions for passing information to the procedure. For more information
                about creating metadata extensions, see “Metadata Extensions” in the Repository Guide.
            ♦   Initialization Properties. You can specify the property name and value.
            While you can define properties on both tabs in the Custom transformation, the Metadata
            Extensions tab lets you provide more detail for the property. Use metadata extensions to pass
            properties to the procedure.
            For example, you create a Custom transformation external procedure that sorts data after
            transforming it. You could create a boolean metadata extension named Sort_Ascending.
            When you use the Custom transformation in a mapping, you can choose True or False for the
            metadata extension, depending on how you want the procedure to sort the data.
            When you define a property in the Custom transformation, use the get all property names
            functions, such as INFA_CTGetAllPropertyNamesM(), to access the names of all properties
            defined on the Initialization Properties and Metadata Extensions tab. Use the get external
            property functions, such as INFA_CT_getExternalPropertyM(), to access the property name
            and value of a property ID you specify.
            Note: When you define a metadata extension and an initialization property with the same
            name, the property functions only return information for the metadata extension.
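
             For example, the procedure might read the Sort_Ascending metadata extension at
             initialization. The parameter order shown here is an assumption; check “Property
             Functions” on page 108 for the exact signatures.

             #include <string.h>

             const char* sortAscending = NULL;

             /* Look up the metadata extension by name on the transformation handle. */
             INFA_CT_getExternalPropertyM( transformation, "Sort_Ascending",
                                           &sortAscending );

             if (sortAscending != NULL && strcmp(sortAscending, "True") == 0)
             {
                 /* sort the output in ascending order */
             }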




Creating Custom Transformation Procedures
      You can create Custom transformation procedures that run on 32-bit or 64-bit Integration
      Service machines. Use the following steps as a guideline when you create a Custom
      transformation procedure:
      1.   In the Transformation Developer, create a reusable Custom transformation. Or, in the
           Mapplet Designer or Mapping Designer, create a non-reusable Custom transformation.
      2.   Generate the template code for the procedure.
           When you generate the procedure code, the Designer uses the information from the
           Custom transformation to create C source code files and makefiles.
      3.   Modify the C files to add the procedure logic.
      4.   Use a C/C++ compiler to compile and link the source code files into a DLL or shared
           library and copy it to the Integration Service machine.
      5.   Create a mapping with the Custom transformation.
      6.   Run the session in a workflow.
      This section includes an example to demonstrate this process. The steps in this section create
      a Custom transformation that contains two input groups and one output group. The Custom
      transformation procedure verifies that the Custom transformation uses two input groups and
       one output group. It also verifies that the number of ports in all groups is equal and that the
      port datatypes are the same for all groups. The procedure takes rows of data from each input
      group and outputs all rows to the output group.


    Step 1. Create the Custom Transformation
      The first step is to create a Custom transformation.

      To create a Custom transformation:

      1.   In the Transformation Developer, click Transformation > Create.
      2.   In the Create Transformation dialog box, choose Custom transformation, enter a
           transformation name, and click Create.
           In the Union example, enter CT_Inf_Union as the transformation name.
      3.   In the Active or Passive dialog box, create the transformation as a passive or active
           transformation, and click OK.
           In the Union example, choose Active.
      4.   Click Done to close the Create Transformation dialog box.
      5.   Open the transformation and click the Ports tab. Create groups and ports.
           You can edit the groups and ports later, if necessary. For more information about creating
           groups and ports, see “Working with Groups and Ports” on page 59.


In the Union example, create the groups and ports shown in Figure 3-6:

                   Figure 3-6. Custom Transformation Ports Tab - Union Example

                   The transformation contains a first input group, a second input group, and an output group.
            6.    Select the Properties tab and enter a module and function identifier and the runtime
                  location. Edit other transformation properties.
                  For more information about Custom transformation properties, see “Custom
                  Transformation Properties” on page 64.




In the Union example, enter the properties shown in Figure 3-7:

         Figure 3-7. Custom Transformation Properties Tab - Union Example




   7.    Click the Metadata Extensions tab to enter metadata extensions, such as properties the
         external procedure might need for initialization. For more information about using
         metadata extensions for procedure properties, see “Working with Procedure Properties”
         on page 72.
         In the Union example, do not create metadata extensions.
   8.    Click the Port Attribute Definitions tab to create port attributes, if necessary. For more
         information about creating port attributes, see “Working with Port Attributes” on
         page 62.
         In the Union example, do not create port attributes.
   9.    Click OK.
   10.   Click Repository > Save.
   After you create the Custom transformation that calls the procedure, the next step is to
   generate the C files.


Step 2. Generate the C Files
   After you create a Custom transformation, you generate the source code files. The Designer
   generates file names in lower case.




To generate the code for a Custom transformation procedure:

            1.    In the Transformation Developer, select the transformation and click Transformation >
                  Generate Code.
            2.    Select the procedure you just created. The Designer lists the procedures as
                  <module_name>.<procedure_name>.
                  In the Union example, select UnionDemo.Union.
            3.    Specify the directory where you want to generate the files, and click Generate.
                  In the Union example, select <client_installation_directory>/TX.
                  The Designer creates a subdirectory, <module_name>, in the directory you specified. In
                  the Union example, the Designer creates <client_installation_directory>/TX/
                  UnionDemo. It also creates the following files:
                  ♦   m_UnionDemo.c
                  ♦   m_UnionDemo.h
                  ♦   p_Union.c
                  ♦   p_Union.h
                  ♦   makefile.aix (32-bit), makefile.aix64 (64-bit), makefile.hp (32-bit), makefile.hp64
                      (64-bit), makefile.hpparisc64, makefile.linux (32-bit), and makefile.sol (32-bit).


       Step 3. Fill Out the Code with the Transformation Logic
            You must code the procedure C file. Optionally, you can also code the module C file. In the
            Union example, you fill out the procedure C file only. You do not need to fill out the module
            C file.

            To code the procedure C file:

            1.    Open p_<procedure_name>.c for the procedure.
                  In the Union example, open p_Union.c.
            2.    Enter the C code for the procedure.
            3.    Save the modified file.
                  In the Union example, use the following code:
            /**************************************************************************
              *
              * Copyright (c) 2005 Informatica Corporation. This file contains
              * material proprietary to Informatica Corporation and may not be copied
              * or distributed in any form without the written permission of Informatica
              * Corporation
              *
              **************************************************************************/




/**************************************************************************
* Custom Transformation p_union Procedure File
*
* This file contains the functions that will be called by the main
* server executable.
*
* for more information on these files,
* see $(INFA_HOME)/ExtProc/include/Readme.txt
**************************************************************************/


/*
* INFORMATICA 'UNION DEMO' developed using the API for custom
* transformations.


* File Name: p_Union.c
*
* An example of a custom transformation ('Union') using PowerCenter 8.0
*
* The purpose of the 'Union' transformation is to combine pipelines with the
* same row definition into one pipeline (i.e. union of multiple pipelines).
* [ Note that it does not correspond to the mathematical definition of union
* since it does not eliminate duplicate rows.]
*
* This example union transformation allows N input pipelines ( each
* corresponding to an input group) to be combined into one pipeline.
*
* To use this transformation in a mapping, the following attributes must be
* true:
* a. The transformation must have >= 2 input groups and only one output group.
* b. In the Properties tab set the following properties:
*         i.    Module Identifier: UnionDemo
*         ii.   Function Identifier: Union
*         iii.  Inputs Must Block: Unchecked
*         iv.   Is Active: Checked
*         v.    Update Strategy Transformation: Unchecked *
*         vi.   Transformation Scope: All Input
*         vii. Generate Transaction: Unchecked *
*
*         * This version of the union transformation does not provide code for
*         changing the update strategy or for generating transactions.
* c. The input groups and the output group must have the same number of ports
*     and the same datatypes. This is verified in the initialization of the
*     module and the session is failed if this is not true.
* d. The transformation can be used multiple times in a Target
*    Load Order Group and can also be contained within multiple partitions.
              *
              */


            /**************************************************************************
                                               Includes
              **************************************************************************/


             #include <stdlib.h>
            #include "p_union.h"


            /**************************************************************************
                                               Forward Declarations
              **************************************************************************/
            INFA_STATUS validateProperties(const INFA_CT_PARTITION_HANDLE* partition);


            /**************************************************************************
                                               Functions
              **************************************************************************/


            /**************************************************************************
                  Function: p_union_procInit


              Description: Initialization for the procedure. Returns INFA_SUCCESS if
              procedure initialization succeeds, else return INFA_FAILURE.


              Input: procedure - the handle for the procedure
              Output: None
              Remarks: This function will get called once for the session at
              initialization time. It will be called after the moduleInit function.
              **************************************************************************/


            INFA_STATUS p_union_procInit( INFA_CT_PROCEDURE_HANDLE procedure)
            {
                   const INFA_CT_TRANSFORMATION_HANDLE* transformation = NULL;
                   const INFA_CT_PARTITION_HANDLE* partition = NULL;
                   size_t nTransformations = 0, nPartitions = 0, i = 0;


                   /* Log a message indicating beginning of the procedure initialization */
                   INFA_CTLogMessageM( eESL_LOG,
                                        "union_demo: Procedure initialization started ..." );


                   INFA_CTChangeStringMode( procedure, eASM_MBCS );




/* Get the transformation handles */
     transformation = INFA_CTGetChildrenHandles( procedure,
                                                   &nTransformations,
                                                   TRANSFORMATIONTYPE);


     /* For each transformation verify that the 0th partition has the correct
      * properties. This does not need to be done for all partitions since rest
      * of the partitions have the same information */
     for (i = 0; i < nTransformations; i++)
     {
         /* Get the partition handle */
         partition = INFA_CTGetChildrenHandles(transformation[i],
                                                 &nPartitions, PARTITIONTYPE );


         if (validateProperties(partition) != INFA_SUCCESS)
         {
             INFA_CTLogMessageM( eESL_ERROR,
                                   "union_demo: Failed to validate attributes of "
                                   "the transformation");
             return INFA_FAILURE;
         }
     }


     INFA_CTLogMessageM( eESL_LOG,
                         "union_demo: Procedure initialization completed." );


     return INFA_SUCCESS;
}


/**************************************************************************
    Function: p_union_procDeinit


Description: Deinitialization for the procedure. Returns INFA_SUCCESS if
procedure deinitialization succeeds, else return INFA_FAILURE.


Input: procedure - the handle for the procedure
Output: None
Remarks: This function will get called once for the session at
deinitialization time. It will be called before the moduleDeinit
function.
**************************************************************************/


INFA_STATUS p_union_procDeinit( INFA_CT_PROCEDURE_HANDLE procedure,
                                   INFA_STATUS sessionStatus )


{
                 /* Do nothing ... */
                 return INFA_SUCCESS;
            }


            /**************************************************************************
                Function: p_union_partitionInit


              Description: Initialization for the partition. Returns INFA_SUCCESS if
              partition initialization succeeds, else return INFA_FAILURE.


              Input: partition - the handle for the partition
              Output: None
              Remarks: This function will get called once for each partition for each
              transformation in the session.
              **************************************************************************/


            INFA_STATUS p_union_partitionInit( INFA_CT_PARTITION_HANDLE partition )
            {
                 /* Do nothing ... */
                 return INFA_SUCCESS;
            }


            /**************************************************************************
                Function: p_union_partitionDeinit


              Description: Deinitialization for the partition. Returns INFA_SUCCESS if
              partition deinitialization succeeds, else return INFA_FAILURE.


              Input: partition - the handle for the partition
              Output: None
              Remarks: This function will get called once for each partition for each
              transformation in the session.
              **************************************************************************/


            INFA_STATUS p_union_partitionDeinit( INFA_CT_PARTITION_HANDLE partition )
            {
                 /* Do nothing ... */
                 return INFA_SUCCESS;
            }


            /**************************************************************************
                Function: p_union_inputRowNotification




Description: Notification that a row needs to be processed for an input
group in a transformation for the given partition. Returns INFA_ROWSUCCESS
if the input row was processed successfully, INFA_ROWFAILURE if the input
row was not processed successfully and INFA_FATALERROR if the input row
causes the session to fail.


Input: partition - the handle for the partition for the given row
        group - the handle for the input group for the given row
Output: None
Remarks: This function is probably where the meat of your code will go,
as it is called for every row that gets sent into your transformation.
**************************************************************************/


INFA_ROWSTATUS p_union_inputRowNotification( INFA_CT_PARTITION_HANDLE partition,
                                              INFA_CT_INPUTGROUP_HANDLE inputGroup )


{
    const INFA_CT_OUTPUTGROUP_HANDLE* outputGroups = NULL;
    const INFA_CT_INPUTPORT_HANDLE* inputGroupPorts = NULL;
    const INFA_CT_OUTPUTPORT_HANDLE* outputGroupPorts = NULL;
    size_t nNumInputPorts = 0, nNumOutputGroups = 0,
          nNumPortsInOutputGroup = 0, i = 0;


    /* Get the output group port handles */
    outputGroups = INFA_CTGetChildrenHandles(partition,
                                              &nNumOutputGroups,
                                              OUTPUTGROUPTYPE);


    outputGroupPorts = INFA_CTGetChildrenHandles(outputGroups[0],
                                                  &nNumPortsInOutputGroup,
                                                  OUTPUTPORTTYPE);


    /* Get the input groups port handles */
    inputGroupPorts = INFA_CTGetChildrenHandles(inputGroup,
                                                 &nNumInputPorts,
                                                 INPUTPORTTYPE);


    /* For the union transformation, on receiving a row of input, we need to
    * output that row on the output group. */
    for (i = 0; i < nNumInputPorts; i++)
    {
        INFA_CTSetData(outputGroupPorts[i],
                       INFA_CTGetDataVoid(inputGroupPorts[i]));




INFA_CTSetIndicator(outputGroupPorts[i],
                                            INFA_CTGetIndicator(inputGroupPorts[i]) );


                      INFA_CTSetLength(outputGroupPorts[i],
                                         INFA_CTGetLength(inputGroupPorts[i]) );
                 }


                 /* We know there is only one output group for each partition */
                 return INFA_CTOutputNotification(outputGroups[0]);
            }


            /**************************************************************************
                Function: p_union_eofNotification


              Description: Notification that the last row for an input group has already
              been seen. Return INFA_FAILURE if the session should fail as a result of
              seeing this notification, INFA_SUCCESS otherwise.


              Input: partition - the handle for the partition for the notification
                      group - the handle for the input group for the notification
              Output: None
              **************************************************************************/


            INFA_STATUS p_union_eofNotification( INFA_CT_PARTITION_HANDLE partition,
                                                    INFA_CT_INPUTGROUP_HANDLE group)
            {
                 INFA_CTLogMessageM( eESL_LOG,
                                        "union_demo: An input group received an EOF notification");


                 return INFA_SUCCESS;
            }


            /**************************************************************************
                Function: p_union_dataBdryNotification


              Description: Notification that a transaction has ended. The data
              boundary type can either be commit or rollback.
              Return INFA_FAILURE if the session should fail as a result of
              seeing this notification, INFA_SUCCESS otherwise.


              Input: partition - the handle for the partition for the notification
                      transactionType - commit or rollback
              Output: None
              **************************************************************************/


INFA_STATUS p_union_dataBdryNotification ( INFA_CT_PARTITION_HANDLE partition,
                                             INFA_CT_DATABDRY_TYPE transactionType)
{
     /* Do nothing */
     return INFA_SUCCESS;
}


/* Helper functions */


/**************************************************************************
    Function: validateProperties


Description: Validate that the transformation has all properties expected
by a union transformation, such as at least two input groups, and only
one output group. Return INFA_FAILURE if the session should fail since the
transformation was invalid, INFA_SUCCESS otherwise.


Input: partition - the handle for the partition
Output: None
**************************************************************************/


INFA_STATUS validateProperties(const INFA_CT_PARTITION_HANDLE* partition)
{
     const INFA_CT_INPUTGROUP_HANDLE* inputGroups = NULL;
     const INFA_CT_OUTPUTGROUP_HANDLE* outputGroups = NULL;
     size_t nNumInputGroups = 0, nNumOutputGroups = 0;
     const INFA_CT_INPUTPORT_HANDLE** allInputGroupsPorts = NULL;
     const INFA_CT_OUTPUTPORT_HANDLE* outputGroupPorts = NULL;
     size_t nNumPortsInOutputGroup = 0;
     size_t i = 0, nTempNumInputPorts = 0;


     /* Get the input and output group handles */
     inputGroups = INFA_CTGetChildrenHandles(partition[0],
                                              &nNumInputGroups,
                                              INPUTGROUPTYPE);


     outputGroups = INFA_CTGetChildrenHandles(partition[0],
                                               &nNumOutputGroups,
                                               OUTPUTGROUPTYPE);


     /* 1. Number of input groups must be >= 2 and number of output groups must
      *    be equal to one. */
     if (nNumInputGroups < 2 || nNumOutputGroups != 1)


{
                          INFA_CTLogMessageM( eESL_ERROR,
                                             "UnionDemo: There must be at least two input groups "
                                             "and only one output group");
                          return INFA_FAILURE;
                 }


                 /* 2. Verify that the same number of ports are in each group (including
                  * output group). */
                 outputGroupPorts = INFA_CTGetChildrenHandles(outputGroups[0],
                                                                    &nNumPortsInOutputGroup,
                                                                    OUTPUTPORTTYPE);


                 /* Allocate an array for all input groups ports */
                 allInputGroupsPorts = malloc(sizeof(INFA_CT_INPUTPORT_HANDLE*) *
                                                     nNumInputGroups);


                 for (i = 0; i < nNumInputGroups; i++)
                 {
                          allInputGroupsPorts[i] = INFA_CTGetChildrenHandles(inputGroups[i],
                                                                              &nTempNumInputPorts,
                                                                              INPUTPORTTYPE);


                       if ( nNumPortsInOutputGroup != nTempNumInputPorts)
                       {
                             INFA_CTLogMessageM( eESL_ERROR,
                                                    "UnionDemo: The number of ports in all input and "
                                                    "the output group must be the same.");
                             /* Free the handle array before failing to avoid leaking it. */
                             free(allInputGroupsPorts);
                             return INFA_FAILURE;
                       }
                 }


                 free(allInputGroupsPorts);


                  /* 3. TODO: Verify that the datatypes of the ports in input group 1
                   *    match the datatypes of all other groups. */


                 return INFA_SUCCESS;
            }
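
             The third check is left as a TODO in the sample. A sketch of it might look like the
             following, run inside the loop that fetches each input group's ports, before
             allInputGroupsPorts is freed, with a size_t j declared among the function's other
             locals. INFA_CTGetDataType() is an assumed name for a function that returns a port's
             datatype; verify it against the API reference before using it.

                      /* Hypothetical check 3: compare each input port's datatype
                       * with the matching output port's datatype. */
                      for (j = 0; j < nTempNumInputPorts; j++)
                      {
                          if (INFA_CTGetDataType(allInputGroupsPorts[i][j]) !=
                              INFA_CTGetDataType(outputGroupPorts[j]))
                          {
                              INFA_CTLogMessageM( eESL_ERROR,
                                                  "UnionDemo: Port datatypes must match "
                                                  "across all groups.");
                              free(allInputGroupsPorts);
                              return INFA_FAILURE;
                          }
                      }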




Step 4. Build the Module
   You can build the module on a Windows or UNIX platform.
   Table 3-3 lists the library file names for each platform when you build the module:

   Table 3-3. Module File Names

    Platform          Module File Name

    Windows           <module_identifier>.dll

    AIX               lib<module_identifier>.a

    HP-UX             lib<module_identifier>.sl

    Linux             lib<module_identifier>.so

    Solaris           lib<module_identifier>.so


   Building the Module on Windows
   On Windows, use Microsoft Visual C++ to build the module.

   To build the module on Windows:

   1.     Start Visual C++.
   2.     Click File > New.
   3.     In the New dialog box, click the Projects tab and select the Win32 Dynamic-Link Library
          option.
   4.     Enter its location.
          In the Union example, enter <client_installation_directory>/TX/UnionDemo.
   5.     Enter the name of the project.
          You must use the module name specified for the Custom transformation as the project
          name. In the Union example, enter UnionDemo.
   6.     Click OK.
          Visual C++ creates a wizard to help you define the project components.
   7.     In the wizard, select An empty DLL project and click Finish. Click OK in the New
          Project Information dialog box.
          Visual C++ creates the project files in the directory you specified.
   8.     Click Project > Add To Project > Files.




9.    Navigate up a directory level. This directory contains the procedure files you created.
                  Select all .c files and click OK.
                  In the Union example, add the following files:
                  ♦    m_UnionDemo.c
                  ♦    p_Union.c
            10.   Click Project > Settings.
            11.   Click the C/C++ tab, and select Preprocessor from the Category field.
            12.   In the Additional Include Directories field, enter the following path and click OK:
                      ..; <PowerCenter_install_dir>\extproc\include\ct

            13.   Click Build > Build <module_name>.dll or press F7 to build the project.
                  Visual C++ creates the DLL and places it in the debug or release directory under the
                  project directory.

            Building the Module on UNIX
            On UNIX, use any C compiler to build the module.

            To build the module on UNIX:

            1.    Copy all C files and makefiles generated by the Designer to the UNIX machine.
                  Note: If you build the shared library on a machine other than the Integration Service
                  machine, you must also copy the files in the following directory to the build machine:
                   <PowerCenter_install_dir>\ExtProc\include\ct
                  In the Union example, copy all files in <client_installation_directory>/TX/UnionDemo.
            2.    Set the environment variable INFA_HOME to the Integration Service installation
                  directory.
                  Note: If you specify an incorrect directory path for the INFA_HOME environment
                  variable, the Integration Service cannot start.
            3.    Enter a command from Table 3-4 to make the project.

                  Table 3-4. UNIX Commands to Build the Shared Library

                      UNIX Version      Command

                      AIX (32-bit)      make -f makefile.aix

                      AIX (64-bit)      make -f makefile.aix64

                      HP-UX (32-bit)    make -f makefile.hp

                      HP-UX (64-bit)    make -f makefile.hp64

                      HP-UX PA-RISC     make -f makefile.hpparisc64





        Linux               make -f makefile.linux

        Solaris             make -f makefile.sol



Step 5. Create a Mapping
  In the Mapping Designer, create a mapping that uses the Custom transformation.
  In the Union example, create a mapping similar to the one in Figure 3-8:

  Figure 3-8. Mapping with a Custom Transformation - Union Example




  In this mapping, two sources with the same ports and datatypes connect to the two input
  groups in the Custom transformation. The Custom transformation takes the rows from both
  sources and outputs them all through its one output group. The output group has the same
  ports and datatypes as the input groups.


Step 6. Run the Session in a Workflow
  When you run the session, the Integration Service looks for the shared library or DLL in the
  runtime location you specify in the Custom transformation.

  To run a session in a workflow:

  1.   In the Workflow Manager, create a workflow.
  2.   Create a session for this mapping in the workflow.
  3.   Copy the shared library or DLL to the runtime location directory.
  4.   Run the workflow containing the session.
       When the Integration Service loads a Custom transformation bound to a procedure, it
       loads the DLL or shared library and calls the procedure you define.



Chapter 4




Custom Transformation
Functions
   This chapter includes the following topics:
   ♦   Overview, 90
   ♦   Function Reference, 92
   ♦   Working with Rows, 96
   ♦   Generated Functions, 98
   ♦   API Functions, 104
   ♦   Array-Based API Functions, 130
   ♦   Java API Functions, 138
   ♦   C++ API Functions, 139




Overview
            Custom transformations operate in conjunction with procedures you create outside of the
            Designer to extend PowerCenter functionality. The Custom transformation functions allow
            you to develop the transformation logic in a procedure you associate with a Custom
            transformation. PowerCenter provides two sets of functions called generated and API
            functions. The Integration Service uses generated functions to interface with the procedure.
            When you create a Custom transformation and generate the source code files, the Designer
            includes the generated functions in the files. Use the API functions in the procedure code to
            develop the transformation logic.
            When you write the procedure code, you can configure it to receive a block of rows from the
            Integration Service or a single row at a time. You can increase the procedure performance
            when it receives and processes a block of rows. For more information about receiving rows
            from the Integration Service, see “Working with Rows” on page 96.


       Working with Handles
            Most functions are associated with a handle, such as INFA_CT_PARTITION_HANDLE.
            The first parameter for these functions is the handle the function affects. Custom
            transformation handles have a hierarchical relationship to each other. A parent handle has a
            1:n relationship to its child handle.




Figure 4-1 shows the Custom transformation handles:

Figure 4-1. Custom Transformation Handles

    INFA_CT_MODULE_HANDLE
        contains n INFA_CT_PROC_HANDLE
            contains n INFA_CT_TRANS_HANDLE
                contains n INFA_CT_PARTITION_HANDLE
                    contains n INFA_CT_INPUTGROUP_HANDLE
                        contains n INFA_CT_INPUTPORT_HANDLE
                    contains n INFA_CT_OUTPUTGROUP_HANDLE
                        contains n INFA_CT_OUTPUTPORT_HANDLE

Each parent handle contains n child handles, and each child handle belongs to
exactly one parent.

Table 4-1 describes the Custom transformation handles:

Table 4-1. Custom Transformation Handles

 Handle Name                           Description

 INFA_CT_MODULE_HANDLE                 Represents the shared library or DLL. The external procedure can only access
                                       the module handle in its own shared library or DLL. It cannot access the
                                       module handle in any other shared library or DLL.

 INFA_CT_PROC_HANDLE                   Represents a specific procedure within the shared library or DLL.
                                       You might use this handle when you need to write a function to affect a
                                       procedure referenced by multiple Custom transformations.

 INFA_CT_TRANS_HANDLE                  Represents a specific Custom transformation instance in the session.

 INFA_CT_PARTITION_HANDLE              Represents a specific partition in a specific Custom transformation instance.

 INFA_CT_INPUTGROUP_HANDLE             Represents an input group in a partition.

 INFA_CT_INPUTPORT_HANDLE              Represents an input port in an input group in a partition.

 INFA_CT_OUTPUTGROUP_HANDLE            Represents an output group in a partition.

 INFA_CT_OUTPUTPORT_HANDLE             Represents an output port in an output group in a partition.




Function Reference
            The Custom transformation functions include generated and API functions.
            Table 4-2 lists the Custom transformation generated functions:

            Table 4-2. Custom Transformation Generated Functions

              Function                                Description

              m_<module_name>_moduleInit()            Module initialization function. For more information, see “Module
                                                      Initialization Function” on page 98.

              p_<proc_name>_procInit()                Procedure initialization function. For more information, see “Procedure
                                                      Initialization Function” on page 99.

              p_<proc_name>_partitionInit()           Partition initialization function. For more information, see “Partition
                                                      Initialization Function” on page 99.

              p_<proc_name>_inputRowNotification()    Input row notification function. For more information, see “Input Row
                                                      Notification Function” on page 100.

              p_<proc_name>_dataBdryNotification()    Data boundary notification function. For more information, see “Data
                                                      Boundary Notification Function” on page 101.

              p_<proc_name>_eofNotification()         End of file notification function. For more information, see “End Of File
                                                      Notification Function” on page 101.

              p_<proc_name>_partitionDeinit()         Partition deinitialization function. For more information, see “Partition
                                                      Deinitialization Function” on page 102.

              p_<proc_name>_procedureDeinit()         Procedure deinitialization function. For more information, see “Procedure
                                                      Deinitialization Function” on page 102.

              m_<module_name>_moduleDeinit()          Module deinitialization function. For more information, see “Module
                                                      Deinitialization Function” on page 103.


            Table 4-3 lists the Custom transformation API functions:

            Table 4-3. Custom Transformation API Functions

              Function                                 Description

              INFA_CTSetDataAccessMode()               Set data access mode function. For more information, see “Set Data
                                                       Access Mode Function” on page 104.

              INFA_CTGetAncestorHandle()               Get ancestor handle function. For more information, see “Get Ancestor
                                                       Handle Function” on page 105.

              INFA_CTGetChildrenHandles()              Get children handles function. For more information, see “Get Children
                                                       Handles Function” on page 106.

              INFA_CTGetInputPortHandle()              Get input port handle function. For more information, see “Get Port
                                                       Handle Functions” on page 107.

              INFA_CTGetOutputPortHandle()             Get output port handle function. For more information, see “Get Port
                                                       Handle Functions” on page 107.





 INFA_CTGetInternalProperty<datatype>()    Get internal property function. For more information, see “Get Internal
                                           Property Function” on page 108.

 INFA_CTGetAllPropertyNamesM()             Get all property names in MBCS mode function. For more information,
                                           see “Get All External Property Names (MBCS or Unicode)” on
                                           page 114.

 INFA_CTGetAllPropertyNamesU()             Get all property names in Unicode mode function. For more
                                           information, see “Get All External Property Names (MBCS or Unicode)”
                                           on page 114.

 INFA_CTGetExternalProperty<datatype>M()   Get external property in MBCS function. For more information, see “Get
                                           External Properties (MBCS or Unicode)” on page 114.

 INFA_CTGetExternalProperty<datatype>U()   Get external property in Unicode function. For more information, see
                                           “Get External Properties (MBCS or Unicode)” on page 114.

 INFA_CTRebindInputDataType()              Rebind input port datatype function. For more information, see “Rebind
                                           Datatype Functions” on page 115.

 INFA_CTRebindOutputDataType()             Rebind output port datatype function. For more information, see
                                           “Rebind Datatype Functions” on page 115.

 INFA_CTGetData<datatype>()                Get data functions. For more information, see “Get Data Functions
                                           (Row-Based Mode)” on page 118.

 INFA_CTSetData()                          Set data functions. For more information, see “Set Data Function (Row-
                                           Based Mode)” on page 118.

 INFA_CTGetIndicator()                     Get indicator function. For more information, see “Indicator Functions
                                           (Row-Based Mode)” on page 119.

 INFA_CTSetIndicator()                     Set indicator function. For more information, see “Indicator Functions
                                           (Row-Based Mode)” on page 119.

 INFA_CTGetLength()                        Get length function. For more information, see “Length Functions” on
                                           page 120.

 INFA_CTSetLength()                        Set length function. For more information, see “Length Functions” on
                                           page 120.

 INFA_CTSetPassThruPort()                  Set pass-through port function. For more information, see “Set Pass-
                                           Through Port Function” on page 120.

 INFA_CTOutputNotification()               Output notification function. For more information, see “Output
                                           Notification Function” on page 121.

 INFA_CTDataBdryOutputNotification()       Data boundary output notification function. For more information, see
                                           “Data Boundary Output Notification Function” on page 121.

 INFA_CTGetErrorMsgU()                     Get error message in Unicode function. For more information, see
                                           “Error Functions” on page 122.

 INFA_CTGetErrorMsgM()                     Get error message in MBCS function. For more information, see “Error
                                           Functions” on page 122.

 INFA_CTLogMessageU()                      Log message in the session log in Unicode function. For more
                                           information, see “Session Log Message Functions” on page 123.



              INFA_CTLogMessageM()                     Log message in the session log in MBCS function. For more
                                                       information, see “Session Log Message Functions” on page 123.

              INFA_CTIncrementErrorCount()             Increment error count function. For more information, see “Increment
                                                       Error Count Function” on page 124.

              INFA_CTIsTerminateRequested()            Is terminate requested function. For more information, see “Is
                                                       Terminated Function” on page 124.

              INFA_CTBlockInputFlow()                  Block input groups function. For more information, see “Blocking
                                                       Functions” on page 125.

              INFA_CTUnblockInputFlow()                Unblock input groups function. For more information, see “Blocking
                                                       Functions” on page 125.

              INFA_CTSetUserDefinedPtr()               Set user-defined pointer function. For more information, see “Pointer
                                                       Functions” on page 126.

              INFA_CTGetUserDefinedPtr()               Get user-defined pointer function. For more information, see “Pointer
                                                       Functions” on page 126.

              INFA_CTChangeStringMode()                Change the string mode function. For more information, see “Change
                                                       String Mode Function” on page 126.

              INFA_CTSetDataCodePageID()               Set the data code page ID function. For more information, see “Set
                                                       Data Code Page Function” on page 127.

              INFA_CTGetRowStrategy()                  Get row strategy function. For more information, see “Row Strategy
                                                       Functions (Row-Based Mode)” on page 128.

              INFA_CTSetRowStrategy()                  Set the row strategy function. For more information, see “Row Strategy
                                                       Functions (Row-Based Mode)” on page 128.

              INFA_CTChangeDefaultRowStrategy()        Change the default row strategy of a transformation. For more
                                                       information, see “Change Default Row Strategy Function” on page 129.


            Table 4-4 lists the Custom transformation array-based functions:

            Table 4-4. Custom Transformation Array-Based API Functions

              Function                                 Description

              INFA_CTAGetInputRowMax()                 Get maximum number of input rows function. For more information, see
                                                       “Maximum Number of Rows Functions” on page 130.

              INFA_CTAGetOutputRowMax()                Get maximum number of output rows function. For more information,
                                                       see “Maximum Number of Rows Functions” on page 130.

              INFA_CTASetOutputRowMax()                Set maximum number of output rows function. For more information,
                                                       see “Maximum Number of Rows Functions” on page 130.

              INFA_CTAGetNumRows()                     Get number of rows function. For more information, see “Number of
                                                       Rows Functions” on page 131.

              INFA_CTASetNumRows()                     Set number of rows function. For more information, see “Number of
                                                       Rows Functions” on page 131.




 INFA_CTAIsRowValid()                      Is row valid function. For more information, see “Is Row Valid Function”
                                           on page 132.

 INFA_CTAGetData<datatype>()               Get data functions. For more information, see “Get Data Functions
                                           (Array-Based Mode)” on page 133.

 INFA_CTAGetIndicator()                    Get indicator function. For more information, see “Get Indicator
                                           Function (Array-Based Mode)” on page 134.

 INFA_CTASetData()                         Set data function. For more information, see “Set Data Function (Array-
                                           Based Mode)” on page 134.

 INFA_CTAGetRowStrategy()                  Get row strategy function. For more information, see “Row Strategy
                                           Functions (Array-Based Mode)” on page 135.

 INFA_CTASetRowStrategy()                  Set row strategy function. For more information, see “Row Strategy
                                           Functions (Array-Based Mode)” on page 135.

 INFA_CTASetInputErrorRowM()               Set input error row function for MBCS. For more information, see “Set
                                           Input Error Row Functions” on page 136.

 INFA_CTASetInputErrorRowU()               Set input error row function for Unicode. For more information, see “Set
                                           Input Error Row Functions” on page 136.




Working with Rows
            The Integration Service can pass a single row to a Custom transformation procedure or a
            block of rows in an array. You can write the procedure code to specify whether the procedure
            receives one row or a block of rows. You can increase performance when the procedure
            receives a block of rows:
            ♦    You can decrease the number of function calls the Integration Service and procedure make.
                 The Integration Service calls the input row notification function fewer times, and the
                 procedure calls the output notification function fewer times.
            ♦    You can increase the locality of memory access space for the data.
            ♦    You can write the procedure code to perform an algorithm on a block of data instead of
                 each row of data.
            By default, the procedure receives a row of data at a time. To receive a block of rows, you must
            include the INFA_CTSetDataAccessMode() function to change the data access mode to
            array-based. When the data access mode is array-based, you must use the array-based data
            handling and row strategy functions to access and output the data. When the data access
            mode is row-based, you must use the row-based data handling and row strategy functions to
            access and output the data.
            All array-based functions use the prefix INFA_CTA. All other functions use the prefix
            INFA_CT. For more information about the array-based functions, see “Array-Based API
            Functions” on page 130.
            Use the following steps to write the procedure code to access a block of rows; a
            minimal sketch follows the steps:
            1.    Call INFA_CTSetDataAccessMode() during the procedure initialization, to change the
                  data access mode to array-based.
            2.    When you create a passive Custom transformation, you can also call
                  INFA_CTSetPassThruPort() during procedure initialization to pass through the data for
                  input/output ports.
                  When a block of data reaches the Custom transformation procedure, the Integration
                  Service calls p_<proc_name>_inputRowNotification() for each block of data. Perform
                  the rest of the steps inside this function.
            3.    Call INFA_CTAGetNumRows() using the input group handle in the input row
                  notification function to find the number of rows in the current block.
            4.    Call one of the INFA_CTAGetData<datatype>() functions using the input port handle
                  to get the data for a particular row in the block.
            5.    Call INFA_CTASetData() to output rows in a block.
            6.    Before calling INFA_CTOutputNotification(), call INFA_CTASetNumRows() to notify
                  the Integration Service of the number of rows the procedure is outputting in the block.
            7.    Call INFA_CTOutputNotification().
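            The following sketch shows how these steps fit together in an array-based input
            row notification function for the Union example. The saved handle variables, the
            INT32 port datatype, and the exact array-based parameter lists are illustrative
            assumptions; see “Array-Based API Functions” on page 130 for the actual
            signatures.

                 /* Sketch only. The generated header name, the saved handles, and the
                    array-based parameter lists are assumptions for illustration. */
                 #include "p_Union.h"  /* header generated by the Designer (assumed name) */

                 /* Hypothetical handles saved during initialization. */
                 static INFA_CT_INPUTPORT_HANDLE   inPort;
                 static INFA_CT_OUTPUTPORT_HANDLE  outPort;
                 static INFA_CT_OUTPUTGROUP_HANDLE outGroup;

                 INFA_ROWSTATUS p_Union_inputRowNotification(
                     INFA_CT_PARTITION_HANDLE partition, INFA_CT_INPUTGROUP_HANDLE group)
                 {
                     INFA_INT32 iRow;
                     INFA_INT32 nRows = INFA_CTAGetNumRows(group);      /* step 3 */

                     for (iRow = 0; iRow < nRows; iRow++)
                     {
                         /* Skip dropped, filtered, or error rows in the block. */
                         if (!INFA_CTAIsRowValid(group, iRow))
                             continue;

                         /* Steps 4 and 5: read from the input port and write to the
                            output port. INT32 stands in for the port datatype. */
                         INFA_INT32 value = INFA_CTAGetDataINT32(inPort, iRow);
                         INFA_CTASetData(outPort, iRow, &value);
                     }

                     /* Steps 6 and 7: set the output row count, then flush the block. */
                     INFA_CTASetNumRows(outGroup, nRows);
                     return INFA_CTOutputNotification(outGroup);
                 }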




Rules and Guidelines
  Use the following rules and guidelines when you write the procedure code to use either row-
  based or array-based data access mode:
  ♦   In row-based mode, you can return INFA_ROWERROR in the input row notification
      function to indicate the function encountered an error for the row of data on input. The
      Integration Service increments the internal error count.
  ♦   In array-based mode, do not return INFA_ROWERROR in the input row notification
      function. The Integration Service treats that as a fatal error. If you need to indicate a row
      in a block has an error, call the INFA_CTASetInputErrorRowM() or
      INFA_CTASetInputErrorRowU() function.
  ♦   In row-based mode, the Integration Service only passes valid rows to the procedure.
  ♦   In array-based mode, an input block may contain invalid rows, such as dropped, filtered,
      or error rows. Call INFA_CTAIsRowValid() to determine if a row in a block is valid.
  ♦   In array-based mode, do not call INFA_CTASetNumRows() for a passive Custom
      transformation. You can call this function for active Custom transformations.
  ♦   In array-based mode, call INFA_CTOutputNotification() once.
  ♦   In array-based mode, you can call INFA_CTSetPassThruPort() only for passive Custom
      transformations.
  ♦   In array-based mode for passive Custom transformations, you must output all rows in an
      output block, including any error row.




Generated Functions
            When you use the Designer to generate the procedure code, the Designer includes a set of
            functions called generated functions in the m_<module_name>.c and p_<procedure_name>.c
            files. The Integration Service uses the generated functions to interface with the procedure.
            When you run a session, the Integration Service calls these generated functions in the
            following order for each target load order group in the mapping:
            1.     Initialization functions
            2.     Notification functions
            3.     Deinitialization functions


       Initialization Functions
            The Integration Service first calls the initialization functions. Use the initialization functions
            to write processes you want the Integration Service to run before it passes data to the Custom
            transformation. Writing code in the initialization functions reduces processing overhead
            because the Integration Service runs these processes only once for a module, procedure, or
            partition.
            The Designer generates the following initialization functions:
            ♦    m_<module_name>_moduleInit(). For more information, see “Module Initialization
                 Function” on page 98.
            ♦    p_<proc_name>_procInit(). For more information, see “Procedure Initialization
                 Function” on page 99.
            ♦    p_<proc_name>_partitionInit(). For more information, see “Partition Initialization
                 Function” on page 99.

            Module Initialization Function
             The Integration Service calls the m_<module_name>_moduleInit() function during session
             initialization, before it runs the pre-session tasks. It calls this function once for
             each module, before all other functions.
            If you want the Integration Service to run a specific process when it loads the module, you
            must include it in this function. For example, you might write code to create global structures
            that procedures within this module access.
            Use the following syntax:
                     INFA_STATUS m_<module_name>_moduleInit(INFA_CT_MODULE_HANDLE module);


                                                      Input/
                Argument     Datatype                            Description
                                                      Output

                module       INFA_CT_MODULE_HANDLE    Input      Module handle.




The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.
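
For example, the following sketch for the UnionDemo module allocates a hypothetical
global structure that the procedures in the module can access. The structure and the
generated header name are assumptions for illustration.

       #include <stdlib.h>
       #include "m_UnionDemo.h"  /* header generated by the Designer (assumed name) */

       /* Hypothetical state shared by all procedures in the module. */
       typedef struct ModuleState { long rowsProcessed; } ModuleState;
       static ModuleState *g_state = NULL;

       INFA_STATUS m_UnionDemo_moduleInit(INFA_CT_MODULE_HANDLE module)
       {
           /* Runs once for the module, before all other functions. */
           g_state = (ModuleState *)calloc(1, sizeof(ModuleState));

           /* Returning INFA_FAILURE makes the Integration Service fail the
              session. */
           return g_state != NULL ? INFA_SUCCESS : INFA_FAILURE;
       }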

Procedure Initialization Function
The Integration Service calls the p_<proc_name>_procInit() function during session
initialization, before it runs the pre-session tasks and after it runs the module
initialization function. The Integration Service calls this function once for each
procedure in the module.
Write code in this function when you want the Integration Service to run a process for a
particular procedure. You can also enter some API functions in the procedure initialization
function, such as navigation and property functions.
Use the following syntax:
       INFA_STATUS p_<proc_name>_procInit(INFA_CT_PROCEDURE_HANDLE procedure);


                                                Input/
 Argument         Datatype                                Description
                                                Output

 procedure        INFA_CT_PROCEDURE_HANDLE      Input     Procedure handle.


The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.
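
For example, a procedure initialization function might allocate procedure-scoped state
and attach it to the handle with the pointer functions (see “Pointer Functions” on
page 126). The state structure, the cast, and the exact parameter list of
INFA_CTSetUserDefinedPtr() are assumptions in this sketch.

       #include <stdlib.h>

       /* Hypothetical procedure-scoped state. */
       typedef struct ProcState { long errorRows; } ProcState;

       INFA_STATUS p_Union_procInit(INFA_CT_PROCEDURE_HANDLE procedure)
       {
           /* Runs once for each procedure, after module initialization. */
           ProcState *state = (ProcState *)calloc(1, sizeof(ProcState));
           if (state == NULL)
               return INFA_FAILURE;  /* fails the session */

           /* Attach the state so later callbacks can retrieve it with
              INFA_CTGetUserDefinedPtr(). The cast is assumed. */
           INFA_CTSetUserDefinedPtr((INFA_CT_HANDLE)procedure, state);
           return INFA_SUCCESS;
       }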

Partition Initialization Function
The Integration Service calls the p_<proc_name>_partitionInit() function before it passes
data to the Custom transformation. The Integration Service calls this function once for
each partition at each Custom transformation instance.
If you want the Integration Service to run a specific process before it passes data through a
partition of the Custom transformation, you must include it in this function.
Use the following syntax:
       INFA_STATUS p_<proc_name>_partitionInit(INFA_CT_PARTITION_HANDLE
       transformation);


                                                 Input/
 Argument            Datatype                              Description
                                                 Output

 transformation      INFA_CT_PARTITION_HANDLE    Input     Partition handle.


The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.
Note: When the Custom transformation requires one thread for each partition, you can
include thread-specific operations in the partition initialization function. For more
information about working with thread-specific procedure code, see “Working with
Thread-Specific Procedure Code” on page 66.


        Notification Functions
             The Integration Service calls the notification functions when it passes a row of data to the
             Custom transformation.
             The Designer generates the following notification functions:
             ♦     p_<proc_name>_inputRowNotification(). For more information, see “Input Row
                   Notification Function” on page 100.
              ♦     p_<proc_name>_dataBdryNotification(). For more information, see “Data
                    Boundary Notification Function” on page 101.
             ♦     p_<proc_name>_eofNotification(). For more information, see “End Of File Notification
                   Function” on page 101.
             Note: When the Custom transformation requires one thread for each partition, you can
             include thread-specific operations in the notification functions. For more information about
             working with thread-specific procedure code, see “Working with Thread-Specific Procedure
             Code” on page 66.

             Input Row Notification Function
              The Integration Service calls the p_<proc_name>_inputRowNotification() function when it
              passes a row or a block of rows to the Custom transformation. The input group handle and
              partition handle identify which input group and partition receive the data.
             Use the following syntax:
                          INFA_ROWSTATUS
                          p_<proc_name>_inputRowNotification(INFA_CT_PARTITION_HANDLE partition,
                          INFA_CT_INPUTGROUP_HANDLE group);


                                                                 Input/
                 Argument        Datatype                                   Description
                                                                 Output

                 partition       INFA_CT_PARTITION_HANDLE        Input      Partition handle.

                 group           INFA_CT_INPUTGROUP_HANDLE       Input      Input group handle.


             The datatype of the return value is INFA_ROWSTATUS. Use the following values for the
             return value:
             ♦     INFA_ROWSUCCESS. Indicates the function successfully processed the row of data.
             ♦     INFA_ROWERROR. Indicates the function encountered an error for the row of data. The
                   Integration Service increments the internal error count. Only return this value when the
                   data access mode is row.
                    If the input row notification function returns INFA_ROWERROR in array-based mode,
                    the Integration Service treats it as a fatal error. If you need to indicate a row
                    in a block has an error, call the INFA_CTASetInputErrorRowM() or
                    INFA_CTASetInputErrorRowU() function.
♦     INFA_FATALERROR. Indicates the function encountered a fatal error for the row of data
      or the block of data. The Integration Service fails the session.
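
For example, a row-based input row notification function might map these return values
as follows. The validation helper and saved output group handle are hypothetical, and
the check on INFA_CTOutputNotification() assumes it returns INFA_ROWSTATUS.

       /* Hypothetical validation helper and saved output group handle. */
       static INFA_BOOLEN checkRow(INFA_CT_INPUTGROUP_HANDLE group);
       static INFA_CT_OUTPUTGROUP_HANDLE outGroup;

       INFA_ROWSTATUS p_Union_inputRowNotification(
           INFA_CT_PARTITION_HANDLE partition, INFA_CT_INPUTGROUP_HANDLE group)
       {
           if (!checkRow(group))
               return INFA_ROWERROR;    /* row-based mode only: increments the
                                           internal error count */

           /* Forward the row; treat a failed output notification as fatal. */
           if (INFA_CTOutputNotification(outGroup) != INFA_ROWSUCCESS)
               return INFA_FATALERROR;  /* fails the session */

           return INFA_ROWSUCCESS;
       }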

Data Boundary Notification Function
The Integration Service calls the p_<proc_name>_dataBdryNotification() function when it
passes a commit or rollback row to a partition.
Use the following syntax:
            INFA_STATUS p_<proc_name>_dataBdryNotification(INFA_CT_PARTITION_HANDLE
            transformation, INFA_CTDataBdryType dataBoundaryType);


                                                   Input/
    Argument           Datatype                              Description
                                                   Output

    transformation     INFA_CT_PARTITION_HANDLE    Input     Partition handle.

    dataBoundaryType   INFA_CTDataBdryType         Input     Integration Service uses one of the
                                                             following values for the dataBoundaryType
                                                             parameter:
                                                             - eBT_COMMIT
                                                             - eBT_ROLLBACK


The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.
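
The following is a minimal sketch of a data boundary handler, assuming the procedure
buffers rows for each transaction:

       INFA_STATUS p_Union_dataBdryNotification(
           INFA_CT_PARTITION_HANDLE transformation,
           INFA_CTDataBdryType dataBoundaryType)
       {
           if (dataBoundaryType == eBT_COMMIT)
           {
               /* e.g., flush any rows the procedure buffered for the
                  current transaction */
           }
           else  /* eBT_ROLLBACK */
           {
               /* e.g., discard rows buffered for the transaction */
           }
           return INFA_SUCCESS;
       }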

End Of File Notification Function
The Integration Service calls the p_<proc_name>_eofNotification() function after it passes
the last row to a partition in an input group.
Use the following syntax:
            INFA_STATUS p_<proc_name>_eofNotification(INFA_CT_PARTITION_HANDLE
            transformation, INFA_CT_INPUTGROUP_HANDLE group);


                                                    Input/
    Argument           Datatype                               Description
                                                    Output

    transformation     INFA_CT_PARTITION_HANDLE     Input     Partition handle.

    group              INFA_CT_INPUTGROUP_HANDLE    Input     Input group handle.


The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.
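
A minimal sketch of an end-of-file handler; the buffering comment describes a
hypothetical design, not required behavior:

       INFA_STATUS p_Union_eofNotification(
           INFA_CT_PARTITION_HANDLE transformation,
           INFA_CT_INPUTGROUP_HANDLE group)
       {
           /* The last row for this input group has arrived. An active
              procedure might emit any remaining buffered rows here by
              calling INFA_CTOutputNotification() before returning. */
           return INFA_SUCCESS;
       }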




Deinitialization Functions
             The Integration Service calls the deinitialization functions after it processes data for the
             Custom transformation. Use the deinitialization functions to write processes you want the
             Integration Service to run after it passes all rows of data to the Custom transformation.
             The Designer generates the following deinitialization functions:
             ♦     p_<proc_name>_partitionDeinit(). For more information, see “Partition Deinitialization
                   Function” on page 102.
             ♦     p_<proc_name>_procDeinit(). For more information, see “Procedure Deinitialization
                   Function” on page 102.
             ♦     m_<module_name>_moduleDeinit(). For more information, see “Module
                   Deinitialization Function” on page 103.
             Note: When the Custom transformation requires one thread for each partition, you can
             include thread-specific operations in the initialization and deinitialization functions. For
             more information about working with thread-specific procedure code, see “Working with
             Thread-Specific Procedure Code” on page 66.

             Partition Deinitialization Function
             The Integration Service calls the p_<proc_name>_partitionDeinit() function after it calls the
             p_<proc_name>_eofNotification() or p_<proc_name>_abortNotification() function. The
             Integration Service calls this function once for each partition of the Custom transformation.
             Use the following syntax:
                        INFA_STATUS p_<proc_name>_partitionDeinit(INFA_CT_PARTITION_HANDLE
                        partition);


                                                                Input/
                 Argument            Datatype                            Description
                                                                Output

                 partition           INFA_CT_PARTITION_HANDLE   Input    Partition handle.


             The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
             the return value. When the function returns INFA_FAILURE, the Integration Service fails
             the session.
             Note: When the Custom transformation requires one thread for each partition, you can
             include thread-specific operations in the partition deinitialization function. For more
             information about working with thread-specific procedure code, see “Working with Thread-
             Specific Procedure Code” on page 66.

             Procedure Deinitialization Function
             The Integration Service calls the p_<proc_name>_procDeinit() function after it calls the
             p_<proc_name>_partitionDeinit() function for all partitions of each Custom transformation
             instance that uses this procedure in the mapping.



Use the following syntax:
       INFA_STATUS p_<proc_name>_procDeinit(INFA_CT_PROCEDURE_HANDLE procedure,
       INFA_STATUS sessionStatus);


                                                  Input/
 Argument           Datatype                                 Description
                                                  Output

 procedure          INFA_CT_PROCEDURE_HANDLE      Input      Procedure handle.

 sessionStatus      INFA_STATUS                   Input      Integration Service uses one of the
                                                             following values for the sessionStatus
                                                             parameter:
                                                             - INFA_SUCCESS. Indicates the session
                                                               succeeded.
                                                             - INFA_FAILURE. Indicates the session
                                                               failed.


The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.

Module Deinitialization Function
The Integration Service calls the m_<module_name>_moduleDeinit() function after it runs
the post-session tasks. It calls this function once for each module, after all other
functions.
Use the following syntax:
       INFA_STATUS m_<module_name>_moduleDeinit(INFA_CT_MODULE_HANDLE module,
       INFA_STATUS sessionStatus);


                                                Input/
 Argument           Datatype                               Description
                                                Output

 module             INFA_CT_MODULE_HANDLE       Input      Module handle.

 sessionStatus      INFA_STATUS                 Input      Integration Service uses one of the
                                                           following values for the sessionStatus
                                                           parameter:
                                                           - INFA_SUCCESS. Indicates the session
                                                             succeeded.
                                                           - INFA_FAILURE. Indicates the session
                                                             failed.


The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value. When the function returns INFA_FAILURE, the Integration Service fails
the session.
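
For example, the following sketch releases the hypothetical global structure allocated
in the module initialization sketch earlier in this chapter:

       INFA_STATUS m_UnionDemo_moduleDeinit(INFA_CT_MODULE_HANDLE module,
                                            INFA_STATUS sessionStatus)
       {
           /* Runs once for the module, after all other functions. g_state is
              the hypothetical structure from the moduleInit() sketch. */
           free(g_state);
           g_state = NULL;

           /* sessionStatus reports whether the session succeeded or failed;
              cleanup can branch on it if needed. */
           return INFA_SUCCESS;
       }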




API Functions
             PowerCenter provides a set of API functions that you use to develop the transformation logic.
             When the Designer generates the source code files, it includes the generated functions in the
             source code. Add API functions to the code to implement the transformation logic. The
             procedure uses the API functions to interface with the Integration Service. You must code API
              functions in the procedure C file. Optionally, you can also add API functions to the
              module C file.
             Informatica provides the following groups of API functions:
             ♦   Set data access mode. See “Set Data Access Mode Function” on page 104.
             ♦   Navigation. See “Navigation Functions” on page 105.
             ♦   Property. See “Property Functions” on page 108.
             ♦   Rebind datatype. See “Rebind Datatype Functions” on page 115.
             ♦   Data handling (row-based mode). See “Data Handling Functions (Row-Based Mode)” on
                 page 117.
             ♦   Set pass-through port. See “Set Pass-Through Port Function” on page 120.
             ♦   Output notification. See “Output Notification Function” on page 121.
             ♦   Data boundary output notification. See “Data Boundary Output Notification Function”
                 on page 121.
             ♦   Error. See “Error Functions” on page 122.
             ♦   Session log message. See “Session Log Message Functions” on page 123.
             ♦   Increment error count. See “Increment Error Count Function” on page 124.
             ♦   Is terminated. See “Is Terminated Function” on page 124.
             ♦   Blocking. See “Blocking Functions” on page 125.
             ♦   Pointer. See “Pointer Functions” on page 126.
             ♦   Change string mode. See “Change String Mode Function” on page 126.
             ♦   Set data code page. See “Set Data Code Page Function” on page 127.
             ♦   Row strategy (row-based mode). See “Row Strategy Functions (Row-Based Mode)” on
                 page 128.
             ♦   Change default row strategy. See “Change Default Row Strategy Function” on page 129.
             Informatica also provides array-based API Functions. For more information about array-based
             API functions, see “Array-Based API Functions” on page 130.


        Set Data Access Mode Function
              By default, the Integration Service passes data to the Custom transformation procedure one
              row at a time. You can use the INFA_CTSetDataAccessMode() function to change the data
              access mode to array-based. When you set the data access mode to array-based, the
              Integration Service passes multiple rows to the procedure as a block in an array.



When you set the data access mode to array-based, you must use the array-based versions of
   the data handling functions and row strategy functions. If you use a row-based data
   handling or row strategy function after you switch to array-based mode, you get unexpected
   results; for example, the DLL or shared library might crash.
   You can only use this function in the procedure initialization function.
   If you do not call this function in the procedure code, the data access mode defaults to
   row-based. To make row-based access explicit, include this function and set the access
   mode to row-based.
  For more information about the array-based functions, see “Array-Based API Functions” on
  page 130.
  Use the following syntax:
             INFA_STATUS INFA_CTSetDataAccessMode( INFA_CT_PROCEDURE_HANDLE procedure,
             INFA_CT_DATA_ACCESS_MODE mode );


                                                 Input/
      Argument      Datatype                                Description
                                                 Output

       procedure     INFA_CT_PROCEDURE_HANDLE     Input      Procedure handle.

      mode          INFA_CT_DATA_ACCESS_MODE     Input      Data access mode.
                                                            Use the following values for the mode
                                                            parameter:
                                                            - eDA_ROW
                                                            - eDA_ARRAY
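
   For example, a procedure initialization function can switch to array-based mode as
   follows; this sketch assumes the Union example procedure name.

       INFA_STATUS p_Union_procInit(INFA_CT_PROCEDURE_HANDLE procedure)
       {
           /* Allowed only during procedure initialization. */
           if (INFA_CTSetDataAccessMode(procedure, eDA_ARRAY) != INFA_SUCCESS)
               return INFA_FAILURE;

           /* ...other initialization work... */
           return INFA_SUCCESS;
       }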



Navigation Functions
  Use the navigation functions when you want the procedure to navigate through the handle
  hierarchy. For more information about handles, see “Working with Handles” on page 90.
  PowerCenter provides the following navigation functions:
  ♦    INFA_CTGetAncestorHandle(). For more information, see “Get Ancestor Handle
       Function” on page 105.
  ♦    INFA_CTGetChildrenHandles(). For more information, see “Get Children Handles
       Function” on page 106.
  ♦    INFA_CTGetInputPortHandle(). For more information, see “Get Port Handle
       Functions” on page 107.
  ♦    INFA_CTGetOutputPortHandle(). For more information, see “Get Port Handle
       Functions” on page 107.

  Get Ancestor Handle Function
  Use the INFA_CTGetAncestorHandle() function when you want the procedure to access a
  parent handle of a given handle.




Use the following syntax:
                        INFA_CT_HANDLE INFA_CTGetAncestorHandle(INFA_CT_HANDLE handle,
                        INFA_CTHandleType returnHandleType);


                                                       Input/
               Argument            Datatype                     Description
                                                       Output

               handle              INFA_CT_HANDLE      Input    Handle name.

               returnHandleType    INFA_CTHandleType   Input    Return handle type.
                                                                Use the following values for the returnHandleType
                                                                parameter:
                                                                - PROCEDURETYPE
                                                                - TRANSFORMATIONTYPE
                                                                - PARTITIONTYPE
                                                                - INPUTGROUPTYPE
                                                                - OUTPUTGROUPTYPE
                                                                - INPUTPORTTYPE
                                                                - OUTPUTPORTTYPE


             The handle parameter specifies the handle whose parent you want the procedure to access.
             The Integration Service returns INFA_CT_HANDLE if you specify a valid handle in the
             function. Otherwise, it returns a null value.
              To avoid compilation errors, you must assign the return value to a handle of the type
              you request.
              For example, you can enter the following code to access the transformation ancestor of
              a partition handle:
                         INFA_CT_TRANS_HANDLE transformation = (INFA_CT_TRANS_HANDLE)
                         INFA_CTGetAncestorHandle(partitionHandle, TRANSFORMATIONTYPE);


             Get Children Handles Function
             Use the INFA_CTGetChildrenHandles() function when you want the procedure to access the
             children handles of a given handle.
             Use the following syntax:
                        INFA_CT_HANDLE* INFA_CTGetChildrenHandles(INFA_CT_HANDLE handle, size_t*
                        pnChildrenHandles, INFA_CTHandleType returnHandleType);


                                                       Input/
               Argument              Datatype                       Description
                                                       Output

               handle                INFA_CT_HANDLE    Input        Handle name.




    pnChildrenHandles     size_t*             Output    Integration Service returns an array of children
                                                        handles. The pnChildrenHandles parameter
                                                        indicates the number of children handles in the
                                                        array.

    returnHandleType      INFA_CTHandleType   Input     Use the following values for the returnHandleType
                                                        parameter:
                                                        - PROCEDURETYPE
                                                        - TRANSFORMATIONTYPE
                                                        - PARTITIONTYPE
                                                        - INPUTGROUPTYPE
                                                        - OUTPUTGROUPTYPE
                                                        - INPUTPORTTYPE
                                                        - OUTPUTPORTTYPE


The handle parameter specifies the handle whose children you want the procedure to access.
The Integration Service returns INFA_CT_HANDLE* when you specify a valid handle in the
function. Otherwise, it returns a null value.
To avoid compilation errors, you must assign the returned value to a handle of the type
you request.
For example, you can enter the following code:
          size_t nPartitions = 0;
          INFA_CT_PARTITION_HANDLE* partitions = (INFA_CT_PARTITION_HANDLE*)
          INFA_CTGetChildrenHandles(transformationHandle, &nPartitions,
          PARTITIONTYPE);


Get Port Handle Functions
The Integration Service associates the INFA_CT_INPUTPORT_HANDLE with input and
input/output ports, and the INFA_CT_OUTPUTPORT_HANDLE with output and input/
output ports.
PowerCenter provides the following get port handle functions:
♦    INFA_CTGetInputPortHandle(). Use this function when the procedure knows the
     output port handle for an input/output port and needs the input port handle.
     Use the following syntax:
           INFA_CT_INPUTPORT_HANDLE
           INFA_CTGetInputPortHandle(INFA_CT_OUTPUTPORT_HANDLE outputPortHandle);


                                                        Input/
       Argument             Datatype                                  Description
                                                        Output

        outputPortHandle     INFA_CT_OUTPUTPORT_HANDLE   Input         Output port handle.


♦    INFA_CTGetOutputPortHandle(). Use this function when the procedure knows the
     input port handle for an input/output port and needs the output port handle.



Use the following syntax:
                     INFA_CT_OUTPUTPORT_HANDLE
                     INFA_CTGetOutputPortHandle(INFA_CT_INPUTPORT_HANDLE inputPortHandle);


                                                                   Input/
                   Argument            Datatype                               Description
                                                                   Output

                    inputPortHandle     INFA_CT_INPUTPORT_HANDLE    Input      Input port handle.


              The Integration Service returns NULL when you use the get port handle functions with
              input-only or output-only ports.
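
              For example, given the input port handle of an input/output port, the procedure can
              fetch the matching output port handle. In this fragment, inPortHandle is a
              hypothetical handle obtained during initialization.

                   INFA_CT_OUTPUTPORT_HANDLE outPortHandle =
                       INFA_CTGetOutputPortHandle(inPortHandle);

                   if (outPortHandle == NULL)
                   {
                       /* inPortHandle refers to an input-only port */
                   }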


        Property Functions
             Use the property functions when you want the procedure to access the Custom
             transformation properties. The property functions access properties on the following tabs of
             the Custom transformation:
             ♦   Ports
             ♦   Properties
             ♦   Initialization Properties
             ♦   Metadata Extensions
             ♦   Port Attribute Definitions
             Use the following property functions in initialization functions:
             ♦   INFA_CTGetInternalProperty<datatype>(). For more information, see “Get Internal
                 Property Function” on page 108.
             ♦   INFA_CTGetAllPropertyNamesM(). For more information, see “Get All External
                 Property Names (MBCS or Unicode)” on page 114.
             ♦   INFA_CTGetAllPropertyNamesU(). For more information, see “Get All External
                 Property Names (MBCS or Unicode)” on page 114.
             ♦   INFA_CTGetExternalProperty<datatype>M(). For more information, see “Get External
                 Properties (MBCS or Unicode)” on page 114.
             ♦   INFA_CTGetExternalProperty<datatype>U(). For more information, see “Get External
                 Properties (MBCS or Unicode)” on page 114.

             Get Internal Property Function
              PowerCenter provides functions to access the port attributes specified on the Ports tab,
              and the properties specified for attributes on the Properties tab of the Custom
              transformation.
             The Integration Service associates each port and property attribute with a property ID. You
             must specify the property ID in the procedure to access the values specified for the attributes.
             For more information about property IDs, see “Port and Property Attribute Property IDs” on
             page 109. For the handle parameter, specify a handle name from the handle hierarchy. The
             Integration Service fails the session if the handle name is invalid.


Use the following functions when you want the procedure to access the properties:
♦    INFA_CTGetInternalPropertyStringM(). Accesses a value of type string in MBCS for a
     given property ID.
     Use the following syntax:
         INFA_STATUS INFA_CTGetInternalPropertyStringM( INFA_CT_HANDLE handle,
         size_t propId, const char** psPropValue );

♦    INFA_CTGetInternalPropertyStringU(). Accesses a value of type string in Unicode for a
     given property ID.
     Use the following syntax:
         INFA_STATUS INFA_CTGetInternalPropertyStringU( INFA_CT_HANDLE handle,
         size_t propId, const INFA_UNICHAR** psPropValue );

♦    INFA_CTGetInternalPropertyInt32(). Accesses a value of type integer for a given
     property ID.
     Use the following syntax:
         INFA_STATUS INFA_CTGetInternalPropertyInt32( INFA_CT_HANDLE handle,
         size_t propId, INFA_INT32* pnPropValue );

♦    INFA_CTGetInternalPropertyBool(). Accesses a value of type Boolean for a given
     property ID.
     Use the following syntax:
         INFA_STATUS INFA_CTGetInternalPropertyBool( INFA_CT_HANDLE handle, size_t
         propId, INFA_BOOLEN* pbPropValue );

♦    INFA_CTGetInternalPropertyINFA_PTR(). Accesses a pointer to a value for a given
     property ID.
     Use the following syntax:
         INFA_STATUS INFA_CTGetInternalPropertyINFA_PTR( INFA_CT_HANDLE handle,
         size_t propId, INFA_PTR* pvPropValue );

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value.
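
For example, the following fragment reads the transformation instance name. The
property ID comes from Table 4-7, transHandle is a hypothetical INFA_CT_TRANS_HANDLE
obtained through the navigation functions, and the cast to INFA_CT_HANDLE is assumed.

       const char *sInstanceName = NULL;
       if (INFA_CTGetInternalPropertyStringM((INFA_CT_HANDLE)transHandle,
               INFA_CT_TRANS_INSTANCE_NAME, &sInstanceName) != INFA_SUCCESS)
       {
           /* invalid handle or property ID */
       }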

Port and Property Attribute Property IDs
The following tables list the property IDs for the port and property attributes in the Custom
transformation. Each table lists a Custom transformation handle and the property IDs you
can access with the handle in a property function.
Table 4-5 lists INFA_CT_MODULE_HANDLE property IDs:

Table 4-5. INFA_CT_MODULE Property IDs

    Handle Property ID                     Datatype   Description

    INFA_CT_MODULE_NAME                    String     Specifies the module name.

    INFA_CT_SESSION_INFA_VERSION           String     Specifies the Informatica version.





               INFA_CT_SESSION_CODE_PAGE                   Integer    Specifies the Integration Service code page.

               INFA_CT_SESSION_DATAMOVEMENT_MODE           Integer    Specifies the data movement mode. The
                                                                      Integration Service returns one of the following
                                                                      values:
                                                                      - eASM_MBCS
                                                                      - eASM_UNICODE

               INFA_CT_SESSION_VALIDATE_CODEPAGE           Boolean    Specifies whether the Integration Service enforces
                                                                      code page validation.

               INFA_CT_SESSION_PROD_INSTALL_DIR            String     Specifies the Integration Service installation
                                                                      directory.

               INFA_CT_SESSION_HIGH_PRECISION_MODE         Boolean    Specifies whether session is configured for high
                                                                      precision.

               INFA_CT_MODULE_RUNTIME_DIR                  String     Specifies the runtime directory for the DLL or
                                                                      shared library.

               INFA_CT_SESSION_IS_UPD_STR_ALLOWED          Boolean    Specifies whether the Update Strategy
                                                                      Transformation property is selected in the
                                                                      transformation.

                INFA_CT_TRANS_OUTPUT_IS_REPEATABLE          Integer    Specifies whether the Custom transformation
                                                                       produces data in the same order in every session
                                                                       run. The Integration Service returns one of the
                                                                       following values:
                                                                       - eOUTREPEAT_NEVER = 1
                                                                       - eOUTREPEAT_ALWAYS = 2
                                                                       - eOUTREPEAT_BASED_ON_INPUT_ORDER = 3

               INFA_CT_TRANS_FATAL_ERROR                   Boolean    Specifies if the Custom Transformation caused a
                                                                      fatal error. The Integration Service returns one of
                                                                      the following values:
                                                                      - INFA_TRUE
                                                                      - INFA_FALSE


             Table 4-6 lists INFA_CT_PROC_HANDLE property IDs:

             Table 4-6. INFA_CT_PROC_HANDLE Property IDs

               Handle Property ID                          Datatype   Description

               INFA_CT_PROCEDURE_NAME                      String     Specifies the Custom transformation procedure
                                                                      name.




Table 4-7 lists INFA_CT_TRANS_HANDLE property IDs:

Table 4-7. INFA_CT_TRANS_HANDLE Property IDs

 Handle Property ID                            Datatype   Description

 INFA_CT_TRANS_INSTANCE_NAME                   String     Specifies the Custom transformation instance
                                                          name.

 INFA_CT_TRANS_TRACE_LEVEL                     Integer    Specifies the tracing level. The Integration Service
                                                          returns one of the following values:
                                                          - eTRACE_TERSE
                                                          - eTRACE_NORMAL
                                                          - eTRACE_VERBOSE_INIT
                                                          - eTRACE_VERBOSE_DATA

 INFA_CT_TRANS_MAY_BLOCK_DATA                  Boolean    Specifies if the Integration Service allows the
                                                          procedure to block input data in the current session.

 INFA_CT_TRANS_MUST_BLOCK_DATA                 Boolean    Specifies if the Inputs Must Block Custom
                                                          transformation property is selected.

 INFA_CT_TRANS_ISACTIVE                        Boolean    Specifies whether the Custom transformation is an
                                                          active or passive transformation.

 INFA_CT_TRANS_ISPARTITIONABLE                 Boolean    Specifies if you can partition sessions that use this
                                                          Custom transformation.

 INFA_CT_TRANS_IS_UPDATE_STRATEGY              Boolean    Specifies if the Custom transformation behaves like
                                                          an Update Strategy transformation.

 INFA_CT_TRANS_DEFAULT_UPDATE_STRATEGY         Integer    Specifies the default update strategy.
                                                          - eDUS_INSERT
                                                          - eDUS_UPDATE
                                                          - eDUS_DELETE
                                                          - eDUS_REJECT
                                                          - eDUS_PASSTHROUGH

 INFA_CT_TRANS_NUM_PARTITIONS                  Integer    Specifies the number of partitions in the sessions
                                                          that use this Custom transformation.

 INFA_CT_TRANS_DATACODEPAGE                    Integer    Specifies the code page in which the Integration
                                                          Service passes data to the Custom transformation.
                                                          Use the set data code page function if you want the
                                                          Custom transformation to access data in a different
                                                          code page. For more information, see “Set Data
                                                          Code Page Function” on page 127.

 INFA_CT_TRANS_TRANSFORM_SCOPE                 Integer    Specifies the transformation scope in the Custom
                                                          transformation. The Integration Service returns one
                                                          of the following values:
                                                          - eTS_ROW
                                                          - eTS_TRANSACTION
                                                          - eTS_ALLINPUT




               INFA_CT_TRANS_GENERATE_TRANSACT             Boolean    Specifies if the Generate Transaction property is
                                                                      enabled. The Integration Service returns one of the
                                                                      following values:
                                                                      - INFA_TRUE
                                                                      - INFA_FALSE

               INFA_CT_TRANS_OUTPUT_IS_REPEATABLE          Integer    Specifies whether the Custom transformation
                                                                      produces data in the same order in every session
                                                                      run. The Integration Service returns one of the
                                                                      following values:
                                                                      - eOUTREPEAT_NEVER = 1
                                                                      - eOUTREPEAT_ALWAYS = 2
                                                                      - eOUTREPEAT_BASED_ON_INPUT_ORDER = 3

               INFA_CT_TRANS_FATAL_ERROR                    Boolean    Specifies if the Custom Transformation caused a
                                                                       fatal error. The Integration Service returns one of
                                                                       the following values:
                                                                       - INFA_TRUE
                                                                       - INFA_FALSE


             Table 4-8 lists INFA_CT_INPUT_GROUP_HANDLE and
             INFA_CT_OUTPUT_GROUP_HANDLE property IDs:

             Table 4-8. INFA_CT_INPUT_GROUP and INFA_CT_OUTPUT_GROUP Handle Property IDs

               Handle Property ID                           Datatype   Description

               INFA_CT_GROUP_NAME                           String     Specifies the group name.

               INFA_CT_GROUP_NUM_PORTS                      Integer    Specifies the number of ports in the group.

               INFA_CT_GROUP_ISCONNECTED                    Boolean    Specifies if all ports in a group are connected to
                                                                       another transformation.

               INFA_CT_PORT_NAME                            String     Specifies the port name.

               INFA_CT_PORT_CDATATYPE                       Integer    Specifies the port datatype. The Integration Service
                                                                       returns one of the following values:
                                                                       - eINFA_CTYPE_SHORT
                                                                       - eINFA_CTYPE_INT32
                                                                       - eINFA_CTYPE_CHAR
                                                                       - eINFA_CTYPE_RAW
                                                                       - eINFA_CTYPE_UNICHAR
                                                                       - eINFA_CTYPE_TIME
                                                                       - eINFA_CTYPE_FLOAT
                                                                       - eINFA_CTYPE_DOUBLE
                                                                       - eINFA_CTYPE_DECIMAL18_FIXED
                                                                       - eINFA_CTYPE_DECIMAL28_FIXED
                                                                       - eINFA_CTYPE_INFA_CTDATETIME

               INFA_CT_PORT_PRECISION                       Integer    Specifies the port precision.

               INFA_CT_PORT_SCALE                           Integer    Specifies the port scale (if applicable).




 INFA_CT_PORT_IS_MAPPED                    Boolean    Specifies whether the port is linked to other
                                                      transformations in the mapping.

 INFA_CT_PORT_STORAGESIZE                  Integer    Specifies the internal storage size of the data for a
                                                      port. The storage size depends on the datatype of
                                                      the port.

 INFA_CT_PORT_BOUNDDATATYPE                Integer    Specifies the port datatype. Use instead of
                                                      INFA_CT_PORT_CDATATYPE if you rebind the port
                                                      and specify a datatype other than the default. For
                                                      more information about rebinding a port, see
                                                      “Rebind Datatype Functions” on page 115.


Table 4-9 lists INFA_CT_INPUTPORT_HANDLE and INFA_CT_OUTPUTPORT_HANDLE
property IDs:

Table 4-9. INFA_CT_INPUTPORT and INFA_CT_OUTPUTPORT Handle Property IDs

 Handle Property ID                        Datatype   Description

 INFA_CT_PORT_NAME                         String     Specifies the port name.

 INFA_CT_PORT_CDATATYPE                    Integer    Specifies the port datatype. The Integration Service
                                                      returns one of the following values:
                                                      - eINFA_CTYPE_SHORT
                                                      - eINFA_CTYPE_INT32
                                                      - eINFA_CTYPE_CHAR
                                                      - eINFA_CTYPE_RAW
                                                      - eINFA_CTYPE_UNICHAR
                                                      - eINFA_CTYPE_TIME
                                                      - eINFA_CTYPE_FLOAT
                                                      - eINFA_CTYPE_DOUBLE
                                                      - eINFA_CTYPE_DECIMAL18_FIXED
                                                      - eINFA_CTYPE_DECIMAL28_FIXED
                                                      - eINFA_CTYPE_INFA_CTDATETIME

 INFA_CT_PORT_PRECISION                    Integer    Specifies the port precision.

 INFA_CT_PORT_SCALE                        Integer    Specifies the port scale (if applicable).

 INFA_CT_PORT_IS_MAPPED                    Boolean    Specifies whether the port is linked to other
                                                      transformations in the mapping.

 INFA_CT_PORT_STORAGESIZE                  Integer    Specifies the internal storage size of the data for a
                                                      port. The storage size depends on the datatype of
                                                      the port.

 INFA_CT_PORT_BOUNDDATATYPE                Integer    Specifies the port datatype. Use instead of
                                                      INFA_CT_PORT_CDATATYPE if you rebind the port
                                                      and specify a datatype other than the default. For
                                                      more information about rebinding a port, see
                                                      “Rebind Datatype Functions” on page 115.




Get All External Property Names (MBCS or Unicode)
             PowerCenter provides two functions to access the property names defined on the Metadata
             Extensions tab, Initialization Properties tab, and Port Attribute Definitions tab of the Custom
             transformation.
             Use the following functions when you want the procedure to access the property names:
             ♦   INFA_CTGetAllPropertyNamesM(). Accesses the property names in MBCS.
                 Use the following syntax:
                     INFA_STATUS INFA_CTGetAllPropertyNamesM(INFA_CT_HANDLE handle, const
                     char*const** paPropertyNames, size_t* pnProperties);


                                                                Input/
                   Argument              Datatype                        Description
                                                                Output

                   handle                INFA_CT_HANDLE         Input    Specify the handle name.

                   paPropertyNames       const char*const**     Output   Specifies the property name. The Integration
                                                                         Service returns an array of property names in
                                                                         MBCS.

                   pnProperties          size_t*                Output   Indicates the number of properties in the array.


             ♦   INFA_CTGetAllPropertyNamesU(). Accesses the property names in Unicode.
                 Use the following syntax:
                     INFA_STATUS INFA_CTGetAllPropertyNamesU(INFA_CT_HANDLE handle, const
                     INFA_UNICHAR*const** pasPropertyNames, size_t* pnProperties);


                                                                Input/
                   Argument              Datatype                        Description
                                                                Output

                   handle                INFA_CT_HANDLE         Input    Specify the handle name.

                    pasPropertyNames      const                  Output   Specifies the property name. The Integration
                                          INFA_UNICHAR*const**            Service returns an array of property names in
                                                                          Unicode.

                   pnProperties          size_t*                Output   Indicates the number of properties in the array.


             The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
             the return value.
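
              For example, the following sketch iterates over all property names in MBCS. The handle
              variable is an assumption; use any handle from the handle hierarchy:

                  const char*const* paPropertyNames = NULL;
                  size_t nProperties = 0, i;

                  if (INFA_CTGetAllPropertyNamesM(handle, &paPropertyNames,
                          &nProperties) == INFA_FAILURE)
                      return INFA_FAILURE;

                  for (i = 0; i < nProperties; i++)
                  {
                      /* paPropertyNames[i] is one property name in MBCS. */
                  }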

             Get External Properties (MBCS or Unicode)
             PowerCenter provides functions to access the values of the properties defined on the Metadata
             Extensions tab, Initialization Properties tab, or Port Attribute Definitions tab of the Custom
             transformation.
             You must specify the property names in the functions if you want the procedure to access the
values. Use the INFA_CTGetAllPropertyNamesM() or INFA_CTGetAllPropertyNamesU()
functions to access property names. For the handle parameter, specify a handle name from the
  handle hierarchy. The Integration Service fails the session if the handle name is invalid.
  Note: If you define an initialization property with the same name as a metadata extension, the
  Integration Service returns the metadata extension value.
  Use the following functions when you want the procedure to access the values of the
  properties:
  ♦   INFA_CTGetExternalProperty<datatype>M(). Accesses the value of the property in
      MBCS. Use the syntax as shown in Table 4-10:

      Table 4-10. Property Functions (MBCS)

                                                                                  Property
       Syntax
                                                                                  Datatype

       INFA_STATUS INFA_CTGetExternalPropertyStringM(INFA_CT_HANDLE               String
       handle, const char* sPropName, const char** psPropValue);

       INFA_STATUS INFA_CTGetExternalPropertyINT32M(INFA_CT_HANDLE                Integer
       handle, const char* sPropName, INFA_INT32* pnPropValue);

       INFA_STATUS INFA_CTGetExternalPropertyBoolM(INFA_CT_HANDLE                 Boolean
       handle, const char* sPropName, INFA_BOOLEN* pbPropValue);


  ♦   INFA_CTGetExternalProperty<datatype>U(). Accesses the value of the property in
      Unicode. Use the syntax as shown in Table 4-11:

      Table 4-11. Property Functions (Unicode)

                                                                                      Property
       Syntax
                                                                                      Datatype

       INFA_STATUS INFA_CTGetExternalPropertyStringU(INFA_CT_HANDLE                   String
       handle, INFA_UNICHAR* sPropName, INFA_UNICHAR** psPropValue);

        INFA_STATUS INFA_CTGetExternalPropertyINT32U(INFA_CT_HANDLE                    Integer
        handle, INFA_UNICHAR* sPropName, INFA_INT32* pnPropValue);

        INFA_STATUS INFA_CTGetExternalPropertyBoolU(INFA_CT_HANDLE                     Boolean
        handle, INFA_UNICHAR* sPropName, INFA_BOOLEN* pbPropValue);


  The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
  the return value.
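
   For example, the following sketch reads a metadata extension or initialization property
   named "Threshold" in MBCS. The property name and the handle variable are illustrative only:

       const char* sPropValue = NULL;

       if (INFA_CTGetExternalPropertyStringM(handle, "Threshold",
               &sPropValue) == INFA_FAILURE)
           return INFA_FAILURE;

       /* sPropValue now points to the property value in MBCS. */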


Rebind Datatype Functions
   PowerCenter lets you rebind a port with a datatype other than the default datatype. Use
   the rebind datatype functions if you want the procedure to access data in a datatype other
   than the default datatype. You must rebind the port with a compatible datatype.
  You can only use these functions in the initialization functions.



Consider the following rules when you rebind the datatype for an output or input/output
             port:
             ♦    You must use the data handling functions to set the data and the indicator for that port.
                  Use the INFA_CTSetData() and INFA_CTSetIndicator() functions in row-based mode,
                  and use the INFA_CTASetData() function in array-based mode.
             ♦    Do not call the INFA_CTSetPassThruPort() function for the output port.
             Table 4-12 lists compatible datatypes:

             Table 4-12. Compatible Datatypes

                 Default Datatype      Compatible With

                 Char                  Unichar

                 Unichar               Char

                 Date                  INFA_DATETIME
                                       Use the following syntax:
                                       struct INFA_DATETIME
                                       {
                                       int nYear;
                                       int nMonth;
                                       int nDay;
                                       int nHour;
                                       int nMinute;
                                       int nSecond;
                                       int nNanoSecond;
                                        };

                 Dec18                 Char, Unichar

                 Dec28                 Char, Unichar




PowerCenter provides the following rebind datatype functions:
  ♦     INFA_CTRebindInputDataType(). Rebinds the input port. Use the following syntax:
            INFA_STATUS INFA_CTRebindInputDataType(INFA_CT_INPUTPORT_HANDLE
            portHandle, INFA_CDATATYPE datatype);

  ♦     INFA_CTRebindOutputDataType(). Rebinds the output port. Use the following syntax:
            INFA_STATUS INFA_CTRebindOutputDataType(INFA_CT_OUTPUTPORT_HANDLE
            portHandle, INFA_CDATATYPE datatype);


                                                 Input/
      Argument      Datatype                                  Description
                                                 Output

      portHandle    INFA_CT_OUTPUTPORT_HANDLE    Input        Output port handle.

      datatype      INFA_CDATATYPE               Input        The datatype with which you rebind the
                                                              port. Use the following values for the
                                                              datatype parameter:
                                                              - eINFA_CTYPE_SHORT
                                                              - eINFA_CTYPE_INT32
                                                              - eINFA_CTYPE_CHAR
                                                              - eINFA_CTYPE_RAW
                                                              - eINFA_CTYPE_UNICHAR
                                                              - eINFA_CTYPE_TIME
                                                              - eINFA_CTYPE_FLOAT
                                                              - eINFA_CTYPE_DOUBLE
                                                              - eINFA_CTYPE_DECIMAL18_FIXED
                                                              - eINFA_CTYPE_DECIMAL28_FIXED
                                                              - eINFA_CTYPE_INFA_CTDATETIME


  The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
  the return value.
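
   For example, to have the procedure read a Dec18 input port as a string, an initialization
   function might rebind the port to the char datatype, which Table 4-12 lists as compatible.
   The port handle variable is an assumption:

       /* inputPort is an INFA_CT_INPUTPORT_HANDLE for a port whose
          default datatype is Dec18. */
       if (INFA_CTRebindInputDataType(inputPort,
               eINFA_CTYPE_CHAR) == INFA_FAILURE)
           return INFA_FAILURE;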


Data Handling Functions (Row-Based Mode)
  When the Integration Service calls the input row notification function, it notifies the
  procedure that the procedure can access a row or block of data. However, to get data from the
  input port, modify it, and set data in the output port, you must use the data handling
  functions in the input row notification function. When the data access mode is row-based,
  use the row-based data handling functions.
  Include the INFA_CTGetData<datatype>() function to get the data from the input port and
  INFA_CTSetData() function to set the data in the output port. Include the
   INFA_CTGetIndicator() or INFA_CTGetLength() function if you want the procedure to verify,
   before it gets the data, whether the port has a null value or an empty string.
  PowerCenter provides the following data handling functions:
  ♦     INFA_CTGetData<datatype>(). For more information, see “Get Data Functions (Row-
        Based Mode)” on page 118.
  ♦     INFA_CTSetData(). For more information, see “Set Data Function (Row-Based Mode)”
        on page 118.


♦    INFA_CTGetIndicator(). For more information, see “Indicator Functions (Row-Based
                  Mode)” on page 119.
             ♦    INFA_CTSetIndicator(). For more information, see “Indicator Functions (Row-Based
                  Mode)” on page 119.
             ♦    INFA_CTGetLength(). For more information, see “Length Functions” on page 120.
             ♦    INFA_CTSetLength(). For more information, see “Length Functions” on page 120.

             Get Data Functions (Row-Based Mode)
             Use the INFA_CTGetData<datatype>() functions to retrieve data for the port the function
             specifies.
             You must modify the function name depending on the datatype of the port you want the
             procedure to access.
             Table 4-13 lists the INFA_CTGetData<datatype>() function syntax and the datatype of the
             return value:

             Table 4-13. Get Data Functions

                                                                                         Return Value
                 Syntax
                                                                                         Datatype

                 void* INFA_CTGetDataVoid(INFA_CT_INPUTPORT_HANDLE dataHandle);          Data void
                                                                                         pointer to the
                                                                                         return value

                 char* INFA_CTGetDataStringM(INFA_CT_INPUTPORT_HANDLE                    String (MBCS)
                 dataHandle);

                 IUNICHAR* INFA_CTGetDataStringU(INFA_CT_INPUTPORT_HANDLE                String
                 dataHandle);                                                            (Unicode)

                 INFA_INT32 INFA_CTGetDataINT32(INFA_CT_INPUTPORT_HANDLE                 Integer
                 dataHandle);

                 double INFA_CTGetDataDouble(INFA_CT_INPUTPORT_HANDLE                    Double
                 dataHandle);

                 INFA_CT_RAWDATE INFA_CTGetDataDate(INFA_CT_INPUTPORT_HANDLE             Raw date
                 dataHandle);

                 INFA_CT_RAWDEC18 INFA_CTGetDataRawDec18(                                Decimal BLOB
                 INFA_CT_INPUTPORT_HANDLE dataHandle);                                   (precision 18)

                 INFA_CT_RAWDEC28 INFA_CTGetDataRawDec28(                                Decimal BLOB
                 INFA_CT_INPUTPORT_HANDLE dataHandle);                                   (precision 28)

                 INFA_CT_DATETIME                                                        Datetime
                 INFA_CTGetDataDateTime(INFA_CT_INPUTPORT_HANDLE dataHandle);


             Set Data Function (Row-Based Mode)
             Use the INFA_CTSetData() function when you want the procedure to pass a value to an
             output port.


Use the following syntax:
           INFA_STATUS INFA_CTSetData(INFA_CT_OUTPUTPORT_HANDLE dataHandle, void*
           data);

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
the return value.
Note: If you use the INFA_CTSetPassThruPort() function on an input/output port, do not
set the data or indicator for that port.
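
For example, an input row notification function might read an integer from an input port,
double it, and write the result to an output port. This is a sketch; the port handle
variables are assumptions, and the output value uses static storage so the pointer passed to
INFA_CTSetData() remains valid:

    static INFA_INT32 nOutValue;

    /* inPort and outPort are port handles obtained in an
       initialization function. */
    nOutValue = 2 * INFA_CTGetDataINT32(inPort);

    if (INFA_CTSetData(outPort, (void*)&nOutValue) == INFA_FAILURE)
        return INFA_ROWERROR;   /* assumes this sketch runs in the input
                                   row notification function */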

Indicator Functions (Row-Based Mode)
Use the indicator functions when you want the procedure to get the indicator for an input
port or to set the indicator for an output port. The indicator for a port indicates whether the
data is valid, null, or truncated.
PowerCenter provides the following indicator functions:
♦     INFA_CTGetIndicator(). Gets the indicator for an input port. Use the following syntax:
           INFA_INDICATOR INFA_CTGetIndicator(INFA_CT_INPUTPORT_HANDLE dataHandle);

      The return value datatype is INFA_INDICATOR. Use the following values for
      INFA_INDICATOR:
      −   INFA_DATA_VALID. Indicates the data is valid.
      −   INFA_NULL_DATA. Indicates a null value.
      −   INFA_DATA_TRUNCATED. Indicates the data has been truncated.
♦     INFA_CTSetIndicator(). Sets the indicator for an output port. Use the following syntax:
           INFA_STATUS INFA_CTSetIndicator(INFA_CT_OUTPUTPORT_HANDLE dataHandle,
           INFA_INDICATOR indicator);


                                                 Input/
    Argument      Datatype                                Description
                                                 Output

    dataHandle    INFA_CT_OUTPUTPORT_HANDLE      Input    Output port handle.

    indicator     INFA_INDICATOR                 Input    The indicator value for the output port. Use
                                                          one of the following values:
                                                          - INFA_DATA_VALID. Indicates the data is
                                                            valid.
                                                          - INFA_NULL_DATA. Indicates a null value.
                                                          - INFA_DATA_TRUNCATED. Indicates the
                                                            data has been truncated.


      The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE
      for the return value.
      Note: If you use the INFA_CTSetPassThruPort() function on an input/output port, do not
      set the data or indicator for that port.
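
      For example, the following sketch propagates a null value from an input port to an
      output port instead of reading data that is not valid. The port handle variables are
      assumptions:

          if (INFA_CTGetIndicator(inPort) == INFA_NULL_DATA)
          {
              if (INFA_CTSetIndicator(outPort, INFA_NULL_DATA) == INFA_FAILURE)
                  return INFA_ROWERROR;
          }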




Length Functions
             Use the length functions when you want the procedure to access the length of a string or
             binary input port, or to set the length of a binary or string output port.
             Use the following length functions:
             ♦   INFA_CTGetLength(). Use this function for string and binary ports only. The Integration
                 Service returns the length as the number of characters including trailing spaces. Use the
                 following syntax:
                     INFA_UINT32 INFA_CTGetLength(INFA_CT_INPUTPORT_HANDLE dataHandle);

                 The return value datatype is INFA_UINT32. Use a value between zero and 2GB for the
                 return value.
             ♦   INFA_CTSetLength(). When the Custom transformation contains a binary or string
                 output port, you must use this function to set the length of the data, including trailing
                  spaces. Verify that the length you set for string and binary ports is not greater than the
                  precision for that port. If you set the length greater than the port precision, you get
                  unexpected results. For example, the session may fail.
                 Use the following syntax:
                     INFA_STATUS INFA_CTSetLength(INFA_CT_OUTPUTPORT_HANDLE dataHandle,
                     IUINT32 length);

                 The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE
                 for the return value.
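
              For example, the following sketch copies a string port while preserving its length. The
              buffer size, the port handle variables, and the use of memcpy() from <string.h> are
              assumptions:

                  static char sBuffer[256];
                  INFA_UINT32 nLength = INFA_CTGetLength(inPort);

                  if (nLength >= sizeof sBuffer)   /* guard the illustrative buffer */
                      nLength = sizeof sBuffer - 1;
                  memcpy(sBuffer, INFA_CTGetDataStringM(inPort), nLength);

                  if (INFA_CTSetData(outPort, (void*)sBuffer) == INFA_FAILURE ||
                      INFA_CTSetLength(outPort, nLength) == INFA_FAILURE)
                      return INFA_ROWERROR;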


        Set Pass-Through Port Function
             Use the INFA_CTSetPassThruPort() function when you want the Integration Service to pass
             data from an input port to an output port without modifying the data. When you use the
             INFA_CTSetPassThruPort() function, the Integration Service passes the data to the output
             port when it calls the input row notification function.
             Consider the following rules and guidelines when you use the set pass-through port function:
             ♦   Only use this function in an initialization function.
             ♦   If the procedure includes this function, do not include the INFA_CTSetData(),
                 INFA_CTSetLength, INFA_CTSetIndicator(), or INFA_CTASetData() functions to pass
                 data to the output port.
             ♦   In row-based mode, you can only include this function when the transformation scope is
                 Row. When the transformation scope is Transaction or All Input, this function returns
                 INFA_FAILURE.
             ♦   In row-based mode, when you use this function to output multiple rows for a given input
                 row, every output row contains the data that is passed through from the input port.
             ♦   In array-based mode, you can only use this function for passive Custom transformations.
             You must verify that the datatype, precision, and scale are the same for the input and output
             ports. The Integration Service fails the session if the datatype, precision, or scale are not the
             same for the input and output ports you specify in the INFA_CTSetPassThruPort() function.


Use the following syntax:
               INFA_STATUS INFA_CTSetPassThruPort(INFA_CT_OUTPUTPORT_HANDLE outputport,
               INFA_CT_INPUTPORT_HANDLE inputport)

   The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
   the return value.
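
   For example, an initialization function might pass data straight through from an input
   port to an output port. The port handle variables are assumptions; both ports must have
   the same datatype, precision, and scale:

       if (INFA_CTSetPassThruPort(outPort, inPort) == INFA_FAILURE)
           return INFA_FAILURE;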


Output Notification Function
   When you want the procedure to output a row to the Integration Service, use the
   INFA_CTOutputNotification() function. Only include this function for active Custom
   transformations. For passive Custom transformations, the procedure outputs a row to the
   Integration Service when the input row notification function gives a return value. If the
   procedure calls this function for a passive Custom transformation, the Integration Service
   ignores the function.
   Note: When the transformation scope is Row, you can only include this function in the input
   row notification function. If you include it somewhere else, it returns a failure.
   Use the following syntax:
               INFA_ROWSTATUS INFA_CTOutputNotification(INFA_CT_OUTPUTGROUP_HANDLE
               group);


                                                        Input/
       Argument        Datatype                                      Description
                                                        Output

        group           INFA_CT_OUTPUTGROUP_HANDLE       Input        Output group handle.


   The return value datatype is INFA_ROWSTATUS. Use the following values for the return
   value:
   ♦    INFA_ROWSUCCESS. Indicates the function successfully processed the row of data.
   ♦    INFA_ROWERROR. Indicates the function encountered an error for the row of data. The
        Integration Service increments the internal error count.
   ♦    INFA_FATALERROR. Indicates the function encountered a fatal error for the row of
        data. The Integration Service fails the session.
   Note: When the procedure code calls the INFA_CTOutputNotification() function, you must
   verify that all pointers in an output port handle point to valid data. When a pointer does not
   point to valid data, the Integration Service might shut down unexpectedly.
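
   For example, after setting the data for all output ports in a group, an active procedure
   might output the row and check the row status. The output group handle variable is an
   assumption:

       INFA_ROWSTATUS rowStatus = INFA_CTOutputNotification(outputGroup);

       if (rowStatus == INFA_FATALERROR)
           return rowStatus;    /* propagate the fatal error; assumes the
                                   caller returns INFA_ROWSTATUS */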


Data Boundary Output Notification Function
   Include the INFA_CTDataBdryOutputNotification() function when you want the procedure
   to output a commit or rollback transaction.
   When you use this function, you must select the Generate Transaction property for this
   Custom transformation. If you do not select this property, the Integration Service fails the
   session.


Use the following syntax:
                          INFA_STATUS INFA_CTDataBdryOutputNotification(INFA_CT_PARTITION_HANDLE
                          handle, INFA_CTDataBdryType dataBoundaryType);


                                                               Input/
                 Argument           Datatype                            Description
                                                               Output

                 handle             INFA_CT_PARTITION_HANDLE   Input    Handle name.

                 dataBoundaryType   INFA_CTDataBdryType        Input    The transaction type.
                                                                        Use the following values for the
                                                                        dataBoundaryType parameter:
                                                                        - eBT_COMMIT
                                                                        - eBT_ROLLBACK


             The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
             the return value.
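
              For example, a procedure with the Generate Transaction property selected might commit a
              transaction after it outputs a group of related rows. The partition handle variable is
              an assumption:

                  if (INFA_CTDataBdryOutputNotification(partition,
                          eBT_COMMIT) == INFA_FAILURE)
                      return INFA_FAILURE;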


        Error Functions
             Use the error functions to access procedure errors. The Integration Service returns the most
             recent error.
             PowerCenter provides the following error functions:
             ♦    INFA_CTGetErrorMsgM(). Gets the error message in MBCS. Use the following syntax:
                          const char* INFA_CTGetErrorMsgM();

             ♦    INFA_CTGetErrorMsgU(). Gets the error message in Unicode. Use the following syntax:
                          const IUNICHAR* INFA_CTGetErrorMsgU();




Session Log Message Functions
  Use the session log message functions when you want the procedure to log a message in the
  session log in either Unicode or MBCS.
  PowerCenter provides the following session log message functions:
  ♦   INFA_CTLogMessageU(). Logs a message in Unicode.
      Use the following syntax:
          void INFA_CTLogMessageU(INFA_CT_ErrorSeverityLevel errorSeverityLevel,
          INFA_UNICHAR* msg)


                                                         Input/
       Argument             Datatype                              Description
                                                         Output

       errorSeverityLevel   INFA_CT_ErrorSeverityLevel   Input    Severity level of the error message that
                                                                  you want the Integration Service to write
                                                                  in the session log. Use the following
                                                                  values for the errorSeverityLevel
                                                                  parameter:
                                                                  - eESL_LOG
                                                                  - eESL_DEBUG
                                                                  - eESL_ERROR

       msg                  INFA_UNICHAR*                Input    Enter the text of the message in Unicode
                                                                  in quotes.


  ♦   INFA_CTLogMessageM(). Logs a message in MBCS.
      Use the following syntax:
         void INFA_CTLogMessageM(INFA_CT_ErrorSeverityLevel errorSeverityLevel,
         char* msg)


                                                         Input/
       Argument             Datatype                              Description
                                                         Output

       errorSeverityLevel   INFA_CT_ErrorSeverityLevel   Input    Severity level of the error message that
                                                                  you want the Integration Service to write
                                                                  in the session log. Use the following
                                                                  values for the errorSeverityLevel
                                                                  parameter:
                                                                  - eESL_LOG
                                                                  - eESL_DEBUG
                                                                  - eESL_ERROR

       msg                  char*                        Input    Enter the text of the message in MBCS in
                                                                  quotes.
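
   For example, when an API call fails, the procedure might write the most recent error
   message to the session log in MBCS. The failing call, the port handle, the output value,
   and the cast are illustrative only; INFA_CTGetErrorMsgM() returns a const pointer:

       if (INFA_CTSetData(outPort, (void*)&nOutValue) == INFA_FAILURE)
       {
           INFA_CTLogMessageM(eESL_ERROR, (char*)INFA_CTGetErrorMsgM());
           return INFA_ROWERROR;
       }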




Increment Error Count Function
             Use the INFA_CTIncrementErrorCount() function when you want to increase the error
             count for the session.
             Use the following syntax:
                          INFA_STATUS INFA_CTIncrementErrorCount(INFA_CT_PARTITION_HANDLE
                          transformation, size_t nErrors, INFA_STATUS* pStatus);


                                                               Input/
                 Argument          Datatype                              Description
                                                               Output

                 transformation    INFA_CT_PARTITION_HANDLE    Input     Partition handle.

                 nErrors           size_t                      Input     Integration Service increments the error
                                                                         count by nErrors for the given transformation
                                                                         instance.

                 pStatus           INFA_STATUS*                Input     Integration Service uses INFA_FAILURE for
                                                                         the pStatus parameter when the error count
                                                                         exceeds the error threshold and fails the
                                                                         session.


             The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
             the return value.
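
              For example, the following sketch counts one row-level error and stops processing when
              the error count exceeds the session error threshold. The partition handle variable is
              an assumption:

                  INFA_STATUS errStatus = INFA_SUCCESS;

                  if (INFA_CTIncrementErrorCount(partition, 1, &errStatus) == INFA_FAILURE ||
                      errStatus == INFA_FAILURE)
                      return INFA_FATALERROR;   /* assumes the input row notification
                                                   function, which returns INFA_ROWSTATUS */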


        Is Terminated Function
             Use the INFA_CTIsTerminated() function when you want the procedure to check if the
             PowerCenter Client has requested the Integration Service to stop the session. You might call
             this function if the procedure includes a time-consuming process.
             Use the following syntax:
                          INFA_CTTerminateType INFA_CTIsTerminated(INFA_CT_PARTITION_HANDLE
                          handle);


                                                                Input/
                 Argument           Datatype                              Description
                                                                Output

                  handle             INFA_CT_PARTITION_HANDLE    Input     Partition handle.


             The return value datatype is INFA_CTTerminateType. The Integration Service returns one of
             the following values:
             ♦     eTT_NOTTERMINATED. Indicates the PowerCenter Client has not requested to stop
                   the session.
             ♦     eTT_ABORTED. Indicates the Integration Service aborted the session.
              ♦     eTT_STOPPED. Indicates the Integration Service stopped the session.
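
              For example, a procedure with a time-consuming loop might poll for a termination
              request at the top of each iteration. The partition handle variable and the loop
              condition are illustrative only:

                  INFA_BOOLEN bMoreWork = INFA_TRUE;   /* illustrative loop condition */

                  while (bMoreWork == INFA_TRUE)
                  {
                      if (INFA_CTIsTerminated(partition) != eTT_NOTTERMINATED)
                          break;            /* the session is stopping or aborting */
                      /* process the next unit of work and update bMoreWork */
                      bMoreWork = INFA_FALSE;
                  }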




Blocking Functions
  When the Custom transformation contains multiple input groups, you can write code to
  block the incoming data on an input group. For more information about blocking data, see
  “Blocking Input Data” on page 70.
  Consider the following rules when you use the blocking functions:
   ♦   You can block at most n-1 input groups, where n is the number of input groups in
       the Custom transformation.
  ♦   You cannot block an input group that is already blocked.
  ♦   You cannot block an input group when it receives data from the same source as another
      input group.
  ♦   You cannot unblock an input group that is already unblocked.
  PowerCenter provides the following blocking functions:
  ♦   INFA_CTBlockInputFlow(). Allows the procedure to block an input group.
      Use the following syntax:
         INFA_STATUS INFA_CTBlockInputFlow(INFA_CT_INPUTGROUP_HANDLE group);

  ♦   INFA_CTUnblockInputFlow(). Allows the procedure to unblock an input group.
      Use the following syntax:
         INFA_STATUS INFA_CTUnblockInputFlow(INFA_CT_INPUTGROUP_HANDLE group);


                                                      Input/
       Argument       Datatype                                     Description
                                                      Output

       group          INFA_CT_INPUTGROUP_HANDLE       Input        Input group handle.


  The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
  the return value.

  Verify Blocking
  When you use the INFA_CTBlockInputFlow() and INFA_CTUnblockInputFlow() functions
   in the procedure code, verify that the procedure checks whether the Integration Service
  allows the Custom transformation to block incoming data. To do this, check the value of the
  INFA_CT_TRANS_MAY_BLOCK_DATA propID using the
  INFA_CTGetInternalPropertyBool() function.
  When the value of the INFA_CT_TRANS_MAY_BLOCK_DATA propID is FALSE, the
  procedure should either not use the blocking functions, or it should return a fatal error and
  stop the session.
  If the procedure code uses the blocking functions when the Integration Service does not allow
  the Custom transformation to block data, the Integration Service might fail the session.
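
   For example, the following sketch checks the property before it blocks an input group. The
   transformation and input group handle variables, and the cast, are assumptions:

       INFA_BOOLEN bMayBlock = INFA_FALSE;

       if (INFA_CTGetInternalPropertyBool((INFA_CT_HANDLE)transformation,
               INFA_CT_TRANS_MAY_BLOCK_DATA, &bMayBlock) == INFA_FAILURE)
           return INFA_FAILURE;

       if (bMayBlock == INFA_TRUE)
       {
           if (INFA_CTBlockInputFlow(inputGroup) == INFA_FAILURE)
               return INFA_FAILURE;
       }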




Pointer Functions
             Use the pointer functions when you want the Integration Service to create and access pointers
             to an object or a structure.
             PowerCenter provides the following pointer functions:
             ♦   INFA_CTGetUserDefinedPtr(). Allows the procedure to access an object or structure
                 during run time.
                 Use the following syntax:
                     void* INFA_CTGetUserDefinedPtr(INFA_CT_HANDLE handle)


                                                           Input/
                   Argument         Datatype                            Description
                                                           Output

                   handle           INFA_CT_HANDLE         Input        Handle name.


             ♦   INFA_CTSetUserDefinedPtr(). Allows the procedure to associate an object or a structure
                 with any handle the Integration Service provides. To reduce processing overhead, include
                 this function in the initialization functions.
                 Use the following syntax:
                     void INFA_CTSetUserDefinedPtr(INFA_CT_HANDLE handle, void* pPtr)


                                                           Input/
                   Argument         Datatype                            Description
                                                           Output

                   handle           INFA_CT_HANDLE         Input        Handle name.

                   pPtr             void*                  Input        User pointer.


             You must substitute a valid handle for INFA_CT_HANDLE.
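
              For example, a procedure might allocate a structure in an initialization function,
              associate it with a handle, and read it back in a later function. The structure, the
              malloc() call from <stdlib.h>, and the handle variable are illustrative only, and
              error handling is omitted for brevity:

                  struct MyState { int nRowsSeen; };

                  /* In an initialization function: */
                  struct MyState* pState = (struct MyState*)malloc(sizeof(struct MyState));
                  pState->nRowsSeen = 0;
                  INFA_CTSetUserDefinedPtr(handle, (void*)pState);

                  /* In a later function, such as the input row notification function: */
                  struct MyState* pState2 = (struct MyState*)INFA_CTGetUserDefinedPtr(handle);
                  pState2->nRowsSeen++;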


        Change String Mode Function
              When the Integration Service runs in Unicode mode, it passes data to the procedure in
              UCS-2 by default. When it runs in ASCII mode, it passes data in ASCII by default. Use the
             INFA_CTChangeStringMode() function if you want to change the default string mode for
             the procedure. When you change the default string mode to MBCS, the Integration Service
             passes data in the Integration Service code page. Use the INFA_CTSetDataCodePageID()
             function if you want to change the code page. For more information about changing the code
             page ID, see “Set Data Code Page Function” on page 127.
             When a procedure includes the INFA_CTChangeStringMode() function, the Integration
             Service changes the string mode for all ports in each Custom transformation that use this
             particular procedure.
             Use the change string mode function in the initialization functions.




Use the following syntax:
         INFA_STATUS INFA_CTChangeStringMode(INFA_CT_PROCEDURE_HANDLE procedure,
         INFA_CTStringMode stringMode);


                                                Input/
   Argument         Datatype                             Description
                                                Output

   procedure        INFA_CT_PROCEDURE_HANDLE    Input    Procedure handle name.

   stringMode       INFA_CTStringMode           Input    Specifies the string mode that you want the
                                                         Integration Service to use. Use the following values
                                                         for the stringMode parameter:
                                                         - eASM_UNICODE. Use this when the Integration
                                                           Service runs in ASCII mode and you want the
                                                           procedure to access data in Unicode.
                                                         - eASM_MBCS. Use this when the Integration
                                                           Service runs in Unicode mode and you want the
                                                           procedure to access data in MBCS.


  The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
  the return value.
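
   For example, when the Integration Service runs in Unicode mode and the procedure expects
   MBCS strings, an initialization function might change the string mode. The procedure
   handle variable is an assumption:

       if (INFA_CTChangeStringMode(procedure, eASM_MBCS) == INFA_FAILURE)
           return INFA_FAILURE;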


Set Data Code Page Function
   Use the INFA_CTSetDataCodePageID() function when you want the Integration Service to pass
   data to the Custom transformation in a code page other than the Integration Service code
   page.
  Use the set data code page function in the procedure initialization function.
  Use the following syntax:
         INFA_STATUS INFA_CTSetDataCodePageID(INFA_CT_TRANSFORMATION_HANDLE
         transformation, int dataCodePageID);


                                                          Input/
   Argument              Datatype                                      Description
                                                          Output

   transformation        INFA_CT_TRANSFORMATION_HANDLE    Input        Transformation handle name.

   dataCodePageID        int                              Input        Specifies the code page you want the
                                                                       Integration Service to pass data in.
                                                                       For valid values for the
                                                                       dataCodePageID parameter, see
                                                                       “Code Pages” in the Administrator
                                                                       Guide.


  The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
  the return value.




Row Strategy Functions (Row-Based Mode)
             The row strategy functions allow you to access and configure the update strategy for each row.
             PowerCenter provides the following row strategy functions:
             ♦   INFA_CTGetRowStrategy(). Allows the procedure to get the update strategy for a row.
                 Use the following syntax:
                     INFA_STATUS INFA_CTGetRowStrategy(INFA_CT_INPUTGROUP_HANDLE group,
                     INFA_CTUpdateStrategy updateStrategy);


                                                                  Input/
                   Argument         Datatype                                Description
                                                                  Output

                   group            INFA_CT_INPUTGROUP_HANDLE     Input     Input group handle.

                   updateStrategy   INFA_CT_UPDATESTRATEGY        Input     Update strategy for the input port.
                                                                            The Integration Service uses the
                                                                            following values:
                                                                            - eUS_INSERT = 0
                                                                            - eUS_UPDATE = 1
                                                                            - eUS_DELETE = 2
                                                                            - eUS_REJECT = 3


              ♦   INFA_CTSetRowStrategy(). Sets the update strategy for each row. This overrides the
                  INFA_CTChangeDefaultRowStrategy() function.
                 Use the following syntax:
                     INFA_STATUS INFA_CTSetRowStrategy(INFA_CT_OUTPUTGROUP_HANDLE group,
                     INFA_CT_UPDATESTRATEGY updateStrategy);


                                                                 Input/
                   Argument         Datatype                               Description
                                                                 Output

                   group            INFA_CT_OUTPUTGROUP_HANDLE   Input     Output group handle.

                   updateStrategy   INFA_CT_UPDATESTRATEGY       Input     Update strategy you want to set for the
                                                                           output port. Use one of the following
                                                                           values:
                                                                           - eUS_INSERT = 0
                                                                           - eUS_UPDATE = 1
                                                                           - eUS_DELETE = 2
                                                                           - eUS_REJECT = 3


             The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
             the return value.
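
              For example, the following sketch flags a row for insert before the procedure outputs
              it. The output group handle variable is an assumption:

                  if (INFA_CTSetRowStrategy(outputGroup, eUS_INSERT) == INFA_FAILURE)
                      return INFA_ROWERROR;   /* assumes the input row notification
                                                 function, which returns INFA_ROWSTATUS */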




Change Default Row Strategy Function
  By default, the row strategy for a Custom transformation is pass-through when the
  transformation scope is Row. When the transformation scope is Transaction or All Input, the
  row strategy is the same value as the Treat Source Rows As session property by default.
  For example, in a mapping you have an Update Strategy transformation followed by a Custom
  transformation with Row transformation scope. The Update Strategy transformation flags the
  rows for update, insert, or delete. When the Integration Service passes a row to the Custom
   transformation, the Custom transformation retains the flag since its row strategy is
   pass-through.
  However, you can change the row strategy of a Custom transformation with PowerCenter. Use
  the INFA_CTChangeDefaultRowStrategy() function to change the default row strategy at the
  transformation level. For example, when you change the default row strategy of a Custom
  transformation to insert, the Integration Service flags all the rows that pass through this
  transformation for insert.
  Note: The Integration Service returns INFA_FAILURE if the session is not in data-driven
  mode.
  Use the following syntax:
          INFA_STATUS INFA_CTChangeDefaultRowStrategy(INFA_CT_TRANSFORMATION_HANDLE
          transformation, INFA_CT_DefaultUpdateStrategy defaultUpdateStrategy);


                                                           Input/
   Argument                Datatype                                 Description
                                                           Output

   transformation          INFA_CT_TRANSFORMATION_HANDLE   Input    Transformation handle.

   defaultUpdateStrategy   INFA_CT_DefaultUpdateStrategy   Input    Specifies the row strategy you
                                                                    want the Integration Service to
                                                                    use for the Custom
                                                                    transformation.
                                                                    - eDUS_PASSTHROUGH. Flags
                                                                      the row for passthrough.
                                                                    - eDUS_INSERT. Flags rows for
                                                                      insert.
                                                                    - eDUS_UPDATE. Flags rows
                                                                      for update.
                                                                    - eDUS_DELETE. Flags rows
                                                                      for delete.


  The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for
  the return value.




Array-Based API Functions
             The array-based functions are API functions you use when you change the data access mode
             to array-based. For more information about changing the data access mode, see “Set Data
             Access Mode Function” on page 104.
             Informatica provides the following groups of array-based API functions:
             ♦   Maximum number of rows. See “Maximum Number of Rows Functions” on page 130
             ♦   Number of rows. See “Number of Rows Functions” on page 131
             ♦   Is row valid. See “Is Row Valid Function” on page 132
             ♦   Data handling (array-based mode). See “Data Handling Functions (Array-Based Mode)”
                 on page 132
             ♦   Row strategy. See “Row Strategy Functions (Array-Based Mode)” on page 135
             ♦   Set input error row. See “Set Input Error Row Functions” on page 136


        Maximum Number of Rows Functions
              By default, the Integration Service limits the number of rows in an input block
              and an output block. However, you can change the maximum number of rows allowed
              in an output block.
             Use the INFA_CTAGetInputNumRowsMax() and INFA_CTAGetOutputNumRowsMax()
             functions to determine the maximum number of rows in input and output blocks. Use the
             values these functions return to determine the buffer size if the procedure needs a buffer.
             You can set the maximum number of rows in the output block using the
             INFA_CTASetOutputRowMax() function. You might use this function if you want the
             procedure to use a larger or smaller buffer.
             You can only call these functions in an initialization function.
             PowerCenter provides the following functions to determine and set the maximum number of
             rows in blocks:
             ♦   INFA_CTAGetInputNumRowsMax(). Use this function to determine the maximum
                 number of rows allowed in an input block.
                 Use the following syntax:
                      INFA_INT32 INFA_CTAGetInputNumRowsMax( INFA_CT_INPUTGROUP_HANDLE inputgroup );


                    Argument         Datatype                    Input/Output   Description

                   inputgroup       INFA_CT_INPUTGROUP_HANDLE   Input           Input group handle.


             ♦   INFA_CTAGetOutputNumRowsMax(). Use this function to determine the maximum
                 number of rows allowed in an output block.



Use the following syntax:
         INFA_INT32 INFA_CTAGetOutputNumRowsMax( INFA_CT_OUTPUTGROUP_HANDLE outputgroup );


        Argument      Datatype                      Input/Output   Description

       outputgroup   INFA_CT_OUTPUTGROUP_HANDLE    Input         Output group handle.


  ♦   INFA_CTASetOutputRowMax(). Use this function to set the maximum number of rows
      allowed in an output block.
      Use the following syntax:
         INFA_STATUS INFA_CTASetOutputRowMax( INFA_CT_OUTPUTGROUP_HANDLE
         outputgroup, INFA_INT32 nRowMax );


        Argument       Datatype                      Input/Output   Description

       outputgroup    INFA_CT_OUTPUTGROUP_HANDLE    Input     Output group handle.

       nRowMax        INFA_INT32                    Input     Maximum number of rows you want to
                                                              allow in an output block.
                                                              You must enter a positive number. The
                                                              function returns a fatal error when you
                                                              use a non-positive number, including
                                                              zero.
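  For example, the following sketch, assuming the group handles were obtained in an
  initialization function (the only place these calls are valid), reads the input
  block limit and caps the output block at 100 rows:

          /* A sketch. inGroup and outGroup are assumed to come from the
             initialization function. */
          INFA_STATUS initBlockLimits(INFA_CT_INPUTGROUP_HANDLE inGroup,
                                      INFA_CT_OUTPUTGROUP_HANDLE outGroup)
          {
              /* Use the input block limit to size any buffer the procedure
                 allocates. */
              INFA_INT32 nInMax = INFA_CTAGetInputNumRowsMax(inGroup);
              (void)nInMax;

              if (INFA_CTAGetOutputNumRowsMax(outGroup) > 100)
              {
                  /* nRowMax must be positive; zero or a negative value
                     causes a fatal error. */
                  return INFA_CTASetOutputRowMax(outGroup, 100);
              }
              return INFA_SUCCESS;
          }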



Number of Rows Functions
  Use the number of rows functions to determine the number of rows in an input block, or to
  set the number of rows in an output block for the specified input or output group.
  PowerCenter provides the following number of rows functions:
  ♦   INFA_CTAGetNumRows(). You can determine the number of rows in an input block.
      Use the following syntax:
         INFA_INT32 INFA_CTAGetNumRows( INFA_CT_INPUTGROUP_HANDLE inputgroup );


        Argument      Datatype                     Input/Output   Description

       inputgroup    INFA_CT_INPUTGROUP_HANDLE     Input      Input group handle.


  ♦   INFA_CTASetNumRows(). You can set the number of rows in an output block. Call this
      function before you call the output notification function.




Use the following syntax:
                      void INFA_CTASetNumRows( INFA_CT_OUTPUTGROUP_HANDLE outputgroup,
                      INFA_INT32 nRows );


                   Argument          Datatype                      Input/Output   Description

                   outputgroup       INFA_CT_OUTPUTGROUP_HANDLE    Input          Output group handle.

                   nRows             INFA_INT32                       Input         Number of rows you want to define in
                                                                                    the output block. You must enter a
                                                                                    positive number. The Integration
                                                                                    Service fails the output notification
                                                                                  function when you specify a non-positive
                                                                                    number.
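              For example, a sketch of an input row notification routine in array-based mode
              might use these functions together as follows (the data handling calls are
              elided):

                      /* A sketch: read the block size, process each row, and declare
                         the output row count before output notification. */
                      void processBlock(INFA_CT_INPUTGROUP_HANDLE inGroup,
                                        INFA_CT_OUTPUTGROUP_HANDLE outGroup)
                      {
                          INFA_INT32 nRows = INFA_CTAGetNumRows(inGroup);

                          /* ... get and set data for rows 0 through nRows-1 ... */

                          INFA_CTASetNumRows(outGroup, nRows);   /* must be positive */
                      }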



        Is Row Valid Function
              Some rows in a block may be dropped, filtered, or error rows. Use the INFA_CTAIsRowValid()
             function to determine if a row in a block is valid. This function returns INFA_TRUE when a
             row is valid.
             Use the following syntax:
                      INFA_BOOLEN INFA_CTAIsRowValid( INFA_CT_INPUTGROUP_HANDLE inputgroup,
                      INFA_INT32 iRow);


               Argument          Datatype                     Input/Output    Description

               inputgroup        INFA_CT_INPUTGROUP_HANDLE    Input            Input group handle.

               iRow              INFA_INT32                   Input            Index number of the row in the block. The
                                                                               index is zero-based.
                                                                               You must verify the procedure only passes
                                                                               an index number that exists in the data
                                                                               block. If you pass an invalid value, the
                                                                               Integration Service shuts down
                                                                               unexpectedly.
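              For example, the following sketch counts only the valid rows in a block,
              skipping dropped, filtered, and error rows:

                       /* A sketch. iRow stays within the block bounds because an
                          out-of-range index shuts down the Integration Service. */
                       INFA_INT32 countValidRows(INFA_CT_INPUTGROUP_HANDLE inGroup)
                       {
                           INFA_INT32 nRows = INFA_CTAGetNumRows(inGroup);
                           INFA_INT32 iRow, nValid = 0;

                           for (iRow = 0; iRow < nRows; iRow++)
                           {
                               if (INFA_CTAIsRowValid(inGroup, iRow) == INFA_TRUE)
                                   nValid++;
                           }
                           return nValid;
                       }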



        Data Handling Functions (Array-Based Mode)
             When the Integration Service calls the p_<proc_name>_inputRowNotification() function, it
             notifies the procedure that the procedure can access a row or block of data. However, to get
             data from the input port, modify it, and set data in the output port in array-based mode, you
             must use the array-based data handling functions in the input row notification function.
              Include the INFA_CTAGetData<datatype>() function to get the data from the input port
              and the INFA_CTASetData() function to set the data in the output port. Include the
              INFA_CTAGetIndicator() function if you want the procedure to verify, before it gets
              the data, whether the port contains a null value or an empty string.



PowerCenter provides the following data handling functions for the array-based data access
mode:
♦    INFA_CTAGetData<datatype>(). For more information, see “Get Data Functions
     (Array-Based Mode)” on page 133.
♦    INFA_CTAGetIndicator(). For more information, see “Get Indicator Function (Array-
     Based Mode)” on page 134.
♦    INFA_CTASetData(). For more information, see “Set Data Function (Array-Based
     Mode)” on page 134.

Get Data Functions (Array-Based Mode)
Use the INFA_CTAGetData<datatype>() functions to retrieve data for the port the function
specifies. You must modify the function name depending on the datatype of the port you
want the procedure to access. The Integration Service passes the length of the data in the
array-based get data functions.
Table 4-14 lists the INFA_CTAGetData<datatype>() function syntax and the datatype of the
return value:

Table 4-14. Get Data Functions (Array-Based Mode)

    Syntax                                                                 Return Value Datatype

    void* INFA_CTAGetDataVoid( INFA_CT_INPUTPORT_HANDLE                    Data void pointer to
    inputport, INFA_INT32 iRow, INFA_UINT32* pLength);                     the return value

    char* INFA_CTAGetDataStringM( INFA_CT_INPUTPORT_HANDLE                 String (MBCS)
    inputport, INFA_INT32 iRow, INFA_UINT32* pLength);

    IUNICHAR* INFA_CTAGetDataStringU( INFA_CT_INPUTPORT_HANDLE             String (Unicode)
    inputport, INFA_INT32 iRow, INFA_UINT32* pLength);

    INFA_INT32 INFA_CTAGetDataINT32( INFA_CT_INPUTPORT_HANDLE              Integer
    inputport, INFA_INT32 iRow);

    double INFA_CTAGetDataDouble( INFA_CT_INPUTPORT_HANDLE                 Double
    inputport, INFA_INT32 iRow);

    INFA_CT_RAWDATETIME INFA_CTAGetDataRawDate(                            Raw date
    INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);

    INFA_CT_DATETIME INFA_CTAGetDataDateTime(                              Datetime
    INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);

    INFA_CT_RAWDEC18 INFA_CTAGetDataRawDec18(                              Decimal BLOB
    INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);                  (precision 18)

    INFA_CT_RAWDEC28 INFA_CTAGetDataRawDec28(                              Decimal BLOB
    INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);                  (precision 28)




Get Indicator Function (Array-Based Mode)
             Use the get indicator function when you want the procedure to verify if the input port has a
             null value.
             Use the following syntax:
                         INFA_INDICATOR INFA_CTAGetIndicator( INFA_CT_INPUTPORT_HANDLE inputport,
                         INFA_INT32 iRow );


                 Argument        Datatype                     Input/Output    Description

                 inputport       INFA_CT_INPUTPORT_HANDLE     Input          Input port handle.

                 iRow            INFA_INT32                   Input          Index number of the row in the block. The
                                                                             index is zero-based.
                                                                             You must verify the procedure only
                                                                             passes an index number that exists in the
                                                                             data block. If you pass an invalid value,
                                                                             the Integration Service shuts down
                                                                             unexpectedly.


             The return value datatype is INFA_INDICATOR. Use the following values for
             INFA_INDICATOR:
             ♦     INFA_DATA_VALID. Indicates the data is valid.
             ♦     INFA_NULL_DATA. Indicates a null value.
             ♦     INFA_DATA_TRUNCATED. Indicates the data has been truncated.

             Set Data Function (Array-Based Mode)
             Use the set data function when you want the procedure to pass a value to an output port. You
             can set the data, the length of the data, if applicable, and the indicator for the output port you
             specify. You do not use separate functions to set the length or indicator for the output port.
             Use the following syntax:
                         void INFA_CTASetData( INFA_CT_OUTPUTPORT_HANDLE outputport, INFA_INT32
                         iRow, void* pData, INFA_UINT32 nLength, INFA_INDICATOR indicator);


                 Argument      Datatype                     Input/Output   Description

                 outputport    INFA_CT_OUTPUTPORT_HANDLE    Input      Output port handle.

                 iRow          INFA_INT32                   Input      Index number of the row in the block. The index
                                                                       is zero-based.
                                                                       You must verify the procedure only passes an
                                                                       index number that exists in the data block. If you
                                                                       pass an invalid value, the Integration Service
                                                                       shuts down unexpectedly.

                 pData         void*                        Input      Pointer to the data.




       nLength          INFA_UINT32                    Input        Length of the data. Use for string and binary
                                                                   ports only.
                                                                   You must verify the function passes the correct
                                                                   length of the data. If the function passes a
                                                                   different length, the output notification function
                                                                   returns failure for this port.
                                                                   Verify the length you set for string and binary
                                                                   ports is not greater than the precision for the
                                                                   port. If you set the length greater than the port
                                                                   precision, you get unexpected results. For
                                                                   example, the session may fail.

      indicator        INFA_INDICATOR                 Input        Indicator value for the output port. Use one of
                                                                   the following values:
                                                                   - INFA_DATA_VALID. Indicates the data is valid.
                                                                   - INFA_NULL_DATA. Indicates a null value.
                                                                   - INFA_DATA_TRUNCATED. Indicates the data
                                                                     has been truncated.
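  For example, the following sketch reads an integer input port, doubles the value, and
  writes it to the corresponding output port, propagating nulls. The handles are assumed
  to have been looked up during initialization, and the sketch assumes the Integration
  Service copies the value during the INFA_CTASetData() call:

        /* A sketch: check the indicator, get the data, and set the data
           for each row in the block. */
        void doubleColumn(INFA_CT_INPUTGROUP_HANDLE inGroup,
                          INFA_CT_INPUTPORT_HANDLE inPort,
                          INFA_CT_OUTPUTPORT_HANDLE outPort)
        {
            INFA_INT32 nRows = INFA_CTAGetNumRows(inGroup);
            INFA_INT32 iRow;

            for (iRow = 0; iRow < nRows; iRow++)
            {
                INFA_INT32 value;

                if (INFA_CTAGetIndicator(inPort, iRow) == INFA_NULL_DATA)
                {
                    /* Propagate the null. nLength applies to string and
                       binary ports only, so pass 0 here. */
                    INFA_CTASetData(outPort, iRow, NULL, 0, INFA_NULL_DATA);
                    continue;
                }
                value = 2 * INFA_CTAGetDataINT32(inPort, iRow);
                INFA_CTASetData(outPort, iRow, &value, 0, INFA_DATA_VALID);
            }
        }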



Row Strategy Functions (Array-Based Mode)
  The array-based row strategy functions allow you to access and configure the update strategy
  for each row in a block.
  PowerCenter provides the following row strategy functions:
  ♦     INFA_CTAGetRowStrategy(). Allows the procedure to get the update strategy for a row
        in a block.
        Use the following syntax:
             INFA_CT_UPDATESTRATEGY INFA_CTAGetRowStrategy( INFA_CT_INPUTGROUP_HANDLE
             inputgroup, INFA_INT32 iRow);


           Argument        Datatype                        Input/Output    Description

          inputgroup      INFA_CT_INPUTGROUP_HANDLE       Input        Input group handle.

          iRow            INFA_INT32                      Input        Index number of the row in the block. The
                                                                       index is zero-based.
                                                                       You must verify the procedure only passes
                                                                       an index number that exists in the data
                                                                       block. If you pass an invalid value, the
                                                                       Integration Service shuts down
                                                                       unexpectedly.




♦   INFA_CTASetRowStrategy(). Sets the update strategy for a row in a block.
                 Use the following syntax:
                     void INFA_CTASetRowStrategy( INFA_CT_OUTPUTGROUP_HANDLE outputgroup,
                     INFA_INT32 iRow, INFA_CT_UPDATESTRATEGY updateStrategy );


                    Argument         Datatype                       Input/Output   Description

                   outputgroup      INFA_CT_OUTPUTGROUP_HANDLE     Input     Output group handle.

                   iRow             INFA_INT32                     Input     Index number of the row in the block.
                                                                             The index is zero-based.
                                                                             You must verify the procedure only
                                                                             passes an index number that exists in
                                                                             the data block. If you pass an invalid
                                                                             value, the Integration Service shuts
                                                                             down unexpectedly.

                    updateStrategy   INFA_CT_UPDATESTRATEGY         Input     Update strategy for the row. Use one
                                                                             of the following values:
                                                                             - eUS_INSERT = 0
                                                                             - eUS_UPDATE = 1
                                                                             - eUS_DELETE = 2
                                                                             - eUS_REJECT = 3
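      For example, the following sketch keeps the incoming strategy for most rows but
      flags rows with a negative balance for delete. The balance port is a hypothetical
      example, and a one-to-one mapping of input rows to output rows is assumed:

          /* A sketch: read each row's strategy, override it conditionally,
             and write it back for the corresponding output row. */
          void flagNegativeBalances(INFA_CT_INPUTGROUP_HANDLE inGroup,
                                    INFA_CT_OUTPUTGROUP_HANDLE outGroup,
                                    INFA_CT_INPUTPORT_HANDLE balancePort)
          {
              INFA_INT32 nRows = INFA_CTAGetNumRows(inGroup);
              INFA_INT32 iRow;

              for (iRow = 0; iRow < nRows; iRow++)
              {
                  INFA_CT_UPDATESTRATEGY strategy =
                      INFA_CTAGetRowStrategy(inGroup, iRow);

                  if (INFA_CTAGetDataDouble(balancePort, iRow) < 0)
                      strategy = eUS_DELETE;

                  INFA_CTASetRowStrategy(outGroup, iRow, strategy);
              }
          }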



        Set Input Error Row Functions
             When you use array-based access mode, you cannot return INFA_ROWERROR in the input
             row notification function. Instead, use the set input error row functions to notify the
             Integration Service that a particular input row has an error.
              PowerCenter provides the following set input error row functions in array-based mode:
              ♦   INFA_CTASetInputErrorRowM(). Notifies the Integration Service that a row in the
                  input block has an error and outputs an MBCS error message to the session log.
                 Use the following syntax:
                     INFA_STATUS INFA_CTASetInputErrorRowM( INFA_CT_INPUTGROUP_HANDLE
                     inputGroup, INFA_INT32 iRow, size_t nErrors, INFA_MBCSCHAR* sErrMsg );


                    Argument         Datatype                     Input/Output    Description

                   inputGroup       INFA_CT_INPUTGROUP_HANDLE    Input      Input group handle.

                   iRow             INFA_INT32                   Input      Index number of the row in the block.
                                                                            The index is zero-based.
                                                                            You must verify the procedure only
                                                                            passes an index number that exists in
                                                                            the data block. If you pass an invalid
                                                                            value, the Integration Service shuts
                                                                            down unexpectedly.




     nErrors       size_t                       Input     Use this parameter to specify the number
                                                          of errors this input row has caused.

     sErrMsg       INFA_MBCSCHAR*               Input     MBCS string containing the error
                                                          message you want the function to output.
                                                          You must enter a null-terminated string.
                                                          This parameter is optional. When you
                                                          include this argument, the Integration
                                                          Service prints the message in the
                                                          session log, even when you enable row
                                                          error logging.


♦   INFA_CTASetInputErrorRowU(). Notifies the Integration Service that a row in the
    input block has an error and outputs a Unicode error message to the session log.
    Use the following syntax:
       INFA_STATUS INFA_CTASetInputErrorRowU( INFA_CT_INPUTGROUP_HANDLE
       inputGroup, INFA_INT32 iRow, size_t nErrors, INFA_UNICHAR* sErrMsg );


      Argument      Datatype                     Input/Output    Description

     inputGroup    INFA_CT_INPUTGROUP_HANDLE    Input     Input group handle.

     iRow          INFA_INT32                   Input     Index number of the row in the block. The
                                                          index is zero-based.
                                                          You must verify the procedure only
                                                          passes an index number that exists in the
                                                          data block. If you pass an invalid value,
                                                          the Integration Service shuts down
                                                          unexpectedly.

     nErrors       size_t                       Input     Use this parameter to specify the number
                                                           of errors this input row has caused.

     sErrMsg       INFA_UNICHAR*                Input     Unicode string containing the error
                                                          message you want the function to output.
                                                          You must enter a null-terminated string.
                                                          This parameter is optional. When you
                                                          include this argument, the Integration
                                                          Service prints the message in the
                                                          session log, even when you enable row
                                                          error logging.
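      For example, the following sketch flags every row with a null key as an error row
      and writes an MBCS message to the session log. The key port is a hypothetical
      example:

          /* A sketch: mark rows whose key column is null as error rows. */
          void rejectNullKeys(INFA_CT_INPUTGROUP_HANDLE inGroup,
                              INFA_CT_INPUTPORT_HANDLE keyPort)
          {
              INFA_INT32 nRows = INFA_CTAGetNumRows(inGroup);
              INFA_INT32 iRow;

              for (iRow = 0; iRow < nRows; iRow++)
              {
                  if (INFA_CTAGetIndicator(keyPort, iRow) == INFA_NULL_DATA)
                  {
                      /* The message is optional and must be null-terminated. */
                      INFA_CTASetInputErrorRowM(inGroup, iRow, 1,
                          (INFA_MBCSCHAR*)"Key column is null");
                  }
              }
          }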




Java API Functions
             Information forthcoming.




C++ API Functions
      Information forthcoming.




Chapter 5




Expression
Transformation
   This chapter includes the following topics:
   ♦   Overview, 142
   ♦   Creating an Expression Transformation, 143




Overview
                     Transformation type:
                     Passive
                     Connected


              Use the Expression transformation to calculate values in a single row before you write to the
              target. For example, you might need to adjust employee salaries, concatenate first and last
              names, or convert strings to numbers. Use the Expression transformation to perform any non-
              aggregate calculations. You can also use the Expression transformation to test conditional
              statements before you output the results to target tables or other transformations.
              Note: To perform calculations involving multiple rows, such as sums or averages, use the
              Aggregator transformation. Unlike the Expression transformation, the Aggregator lets you
              group and sort data. For more information, see “Aggregator Transformation” on page 37.


        Calculating Values
              To use the Expression transformation to calculate values for a single row, you must include the
              following ports:
               ♦   Input or input/output ports for each value used in the calculation. For example, when
                   calculating the total price for an order, determined by multiplying the unit price by the
                   quantity ordered, you need two input or input/output ports: one provides the unit price
                   and the other provides the quantity ordered. (See the example after this list.)
              ♦   Output port for the expression. You enter the expression as a configuration option for the
                  output port. The return value for the output port needs to match the return value of the
                  expression. For more information about entering expressions, see “Working with
                  Expressions” on page 10. Expressions use the transformation language, which includes
                  SQL-like functions, to perform calculations.
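               For example, with hypothetical input/output ports named UNIT_PRICE and
               QUANTITY, the expression you enter for a TOTAL_PRICE output port might be:
                   UNIT_PRICE * QUANTITY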


        Adding Multiple Calculations
              You can enter multiple expressions in a single Expression transformation. As long as you enter
              only one expression for each output port, you can create any number of output ports in the
              transformation. In this way, use one Expression transformation rather than creating separate
              transformations for each calculation that requires the same set of data.
              For example, you might want to calculate several types of withholding taxes from each
              employee paycheck, such as local and federal income tax, Social Security and Medicare. Since
              all of these calculations require the employee salary, the withholding category, and/or the
              corresponding tax rate, you can create one Expression transformation with the salary and
              withholding category as input/output ports and a separate output port for each necessary
              calculation.
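               For example, using hypothetical port names and illustrative rates, the
               expressions for the output ports might look like the following, one
               expression per output port:
                   OUT_LOCAL_TAX:        SALARY * LOCAL_TAX_RATE
                   OUT_FEDERAL_TAX:      SALARY * FEDERAL_TAX_RATE
                   OUT_SOCIAL_SECURITY:  SALARY * 0.062
                   OUT_MEDICARE:         SALARY * 0.0145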




Creating an Expression Transformation
      Use the following procedure to create an Expression transformation.

      To create an Expression transformation:

      1.   In the Mapping Designer, click Transformation > Create. Select the Expression
           transformation. Enter a name for it (the convention is EXP_TransformationName) and
           click OK.
      2.   Create the input ports.
           If you have the input transformation available, you can select Link Columns from the
           Layout menu and then drag each port used in the calculation into the Expression
           transformation. With this method, the Designer copies the port into the new
           transformation and creates a connection between the two ports. Or, you can open the
           transformation and create each port manually.
           Note: If you want to make this transformation reusable, you must create each port
           manually within the transformation.
      3.   Repeat the previous step for each input port you want to add to the expression.
      4.   Create the output ports (O) you need, making sure to assign a port datatype that matches
           the expression return value. The naming convention for output ports is
           OUT_PORTNAME.
      5.   Click the small button that appears in the Expression section of the dialog box and enter
           the expression in the Expression Editor.
           To prevent typographic errors, where possible, use the listed port names and functions.
           If you select a port name that is not connected to the transformation, the Designer copies
           the port into the new transformation and creates a connection between the two ports.
           Port names used as part of an expression in an Expression transformation follow stricter
           rules than port names in other types of transformations:
           ♦   A port name must begin with a single- or double-byte letter or single- or double-byte
               underscore (_).
           ♦   It can contain any of the following single- or double-byte characters: a letter, number,
               underscore (_), $, #, or @.
      6.   Check the expression syntax by clicking Validate.
           If necessary, make corrections to the expression and check the syntax again. Then save the
           expression and exit the Expression Editor.
      7.   Connect the output ports to the next transformation or target.
      8.   Select a tracing level on the Properties tab to determine the amount of transaction detail
           reported in the session log file.
      9.   Click Repository > Save.


Chapter 6




External Procedure
Transformation
   This chapter includes the following topics:
   ♦   Overview, 146
   ♦   Developing COM Procedures, 149
   ♦   Developing Informatica External Procedures, 159
   ♦   Distributing External Procedures, 169
   ♦   Development Notes, 171
   ♦   Service Process Variables in Initialization Properties, 180
   ♦   External Procedure Interfaces, 181




Overview
                     Transformation type:
                     Passive
                     Connected/Unconnected


              External Procedure transformations operate in conjunction with procedures you create
              outside of the Designer interface to extend PowerCenter functionality.
              Although the standard transformations provide you with a wide range of options, there are
              occasions when you might want to extend the functionality provided with PowerCenter. For
              example, the range of standard transformations, such as Expression and Filter
              transformations, may not provide the functionality you need. If you are an experienced
              programmer, you may want to develop complex functions within a dynamic link library
              (DLL) or UNIX shared library, instead of creating the necessary Expression transformations
              in a mapping.
              To obtain this kind of extensibility, use the Transformation Exchange (TX) dynamic
              invocation interface built into PowerCenter. Using TX, you can create an Informatica
              External Procedure transformation and bind it to an external procedure that you have
              developed. You can bind External Procedure transformations to two kinds of external
              procedures:
              ♦   COM external procedures (available on Windows only)
              ♦   Informatica external procedures (available on Windows, AIX, HP-UX, Linux, and Solaris)
              To use TX, you must be an experienced C, C++, or Visual Basic programmer.
              Use multi-threaded code in external procedures.


        Code Page Compatibility
              When the Integration Service runs in ASCII mode, the external procedure can process data in
              7-bit ASCII.
              When the Integration Service runs in Unicode mode, the external procedure can process data
              that is two-way compatible with the Integration Service code page. For information about
              accessing the Integration Service code page, see “Code Page Access Functions” on page 185.
              Configure the Integration Service to run in Unicode mode if the external procedure DLL or
              shared library contains multibyte characters. External procedures must use the same code page
              as the Integration Service to interpret input strings from the Integration Service and to create
              output strings that contain multibyte characters.
              Configure the Integration Service to run in either ASCII or Unicode mode if the external
              procedure DLL or shared library contains ASCII characters only.




External Procedures and External Procedure Transformations
   There are two components to TX: external procedures and External Procedure transformations.
   As its name implies, an external procedure exists separately from the Integration Service. It
   consists of C, C++, or Visual Basic code written by a user to define a transformation. This
   code is compiled and linked into a DLL or shared library, which is loaded by the Integration
   Service at runtime. An external procedure is “bound” to an External Procedure
   transformation.
   An External Procedure transformation is created in the Designer. It is an object that resides in
   the Informatica repository and serves several purposes:
    1.   It contains the metadata describing the external procedure. It is through this
        metadata that the Integration Service knows the “signature” (number and types of
        parameters, type of return value, if any) of the external procedure.
   2.   It allows an external procedure to be referenced in a mapping. By adding an instance of
        an External Procedure transformation to a mapping, you call the external procedure
        bound to that transformation.
        Note: You can create a connected or unconnected External Procedure.

   3.   When you develop Informatica external procedures, the External Procedure
        transformation provides the information required to generate Informatica external
        procedure stubs.


External Procedure Transformation Properties
   Create reusable External Procedure transformations in the Transformation Developer, and
   add instances of the transformation to mappings. You cannot create External Procedure
   transformations in the Mapping Designer or Mapplet Designer.
   External Procedure transformations return one or no output rows per input row.
   On the Properties tab of the External Procedure transformation, only enter ASCII characters
   in the Module/Programmatic Identifier and Procedure Name fields. You cannot enter
   multibyte characters in these fields. On the Ports tab of the External Procedure
   transformation, only enter ASCII characters for the port names. You cannot enter multibyte
   characters for External Procedure transformation port names.


Pipeline Partitioning
   If you purchase the Partitioning option with PowerCenter, you can increase the number of
   partitions in a pipeline to improve session performance. Increasing the number of partitions
   allows the Integration Service to create multiple connections to sources and process partitions
   of source data concurrently.
   When you create a session, the Workflow Manager validates each pipeline in the mapping for
   partitioning. You can specify multiple partitions in a pipeline if the Integration Service can
   maintain data consistency when it processes the partitioned data.



Use the Is Partitionable property on the Properties tab to specify whether or not you can
              create multiple partitions in the pipeline. For more information about partitioning External
              Procedure transformations, see “Working with Partition Points” in the Workflow
              Administration Guide.


        COM Versus Informatica External Procedures
              Table 6-1 describes the differences between COM and Informatica external procedures:

              Table 6-1. Differences Between COM and Informatica External Procedures

                                       COM                            Informatica

               Technology              Uses COM technology            Uses Informatica proprietary technology

               Operating System        Runs on Windows only           Runs on all platforms supported for the Integration
                                                                       Service: Windows, AIX, HP-UX, Linux, Solaris

               Language                C, C++, VC++, VB, Perl, VJ++   Only C++



        The BankSoft Example
              The following sections use an example called BankSoft to illustrate how to develop COM and
              Informatica procedures. The BankSoft example uses a financial function, FV, to illustrate how
              to develop and call an external procedure. The FV procedure calculates the future value of an
              investment based on regular payments and a constant interest rate.




Developing COM Procedures
       You can develop COM external procedures using Microsoft Visual C++ or Visual Basic.
       The following sections describe how to create COM external procedures using Visual
       C++ and Visual Basic.


    Steps for Creating a COM Procedure
      To create a COM external procedure, complete the following steps:
      1.   Using Microsoft Visual C++ or Visual Basic, create a project.
      2.   Define a class with an IDispatch interface.
      3.   Add a method to the interface. This method is the external procedure that will be
           invoked from inside the Integration Service.
      4.   Compile and link the class into a dynamic link library.
      5.   Register the class in the local Windows registry.
      6.   Import the COM procedure in the Transformation Developer.
      7.   Create a mapping with the COM procedure.
      8.   Create a session using the mapping.


    COM External Procedure Server Type
      The Integration Service only supports in-process COM servers (that is, COM servers with
      Server Type: Dynamic Link Library). This is done to enhance performance. It is more efficient
      when processing large amounts of data to process the data in the same process, instead of
      forwarding it to a separate process on the same machine or a remote machine.


    Using Visual C++ to Develop COM Procedures
      C++ developers can use Visual C++ version 5.0 or later to develop COM procedures. The first
      task is to create a project.

      Step 1. Create an ATL COM AppWizard Project
      1.   Launch Visual C++ and click File > New.
      2.   In the dialog box that appears, select the Projects tab.
      3.   Enter the project name and location.
           In the BankSoft example, you enter COM_VC_Banksoft as the project name, and
            c:\COM_VC_Banksoft as the directory.
      4.   Select the ATL COM AppWizard option in the projects list box and click OK.



A wizard used to create COM projects in Visual C++ appears.
              5.   Set the Server Type to Dynamic Link Library, select the Support MFC option, and click
                   Finish.
                   The final page of the wizard appears.
              6.   Click OK to return to Visual C++.
              7.   Add a class to the new project.
              8.   On the next page of the wizard, click the OK button. The Developer Studio creates the
                   basic project files.

              Step 2. Add an ATL Object to a Project
              1.   In the Workspace window, select the Class View tab, right-click the tree item
                   COM_VC_BankSoft.BSoftFin classes, and choose New ATL Object from the local menu
                   that appears.
              2.   Highlight the Objects item in the left list box and select Simple Object from the list of
                   object types.
              3.   Click Next.
              4.   In the Short Name field, enter a short name for the class you want to create.
                   In the BankSoft example, use the name BSoftFin, since you are developing a financial
                   function for the fictional company BankSoft. As you type into the Short Name field, the
                   wizard fills in suggested names in the other fields.
              5.   Enter the programmatic identifier for the class.
                   In the BankSoft example, change the ProgID (programmatic identifier) field to
                   COM_VC_BankSoft.BSoftFin.
                   A programmatic identifier, or ProgID, is the human-readable name for a class. Internally,
                    classes are identified by numeric CLSIDs. For example:
                      {33B17632-1D9F-11D1-8790-0000C044ACF9}

                   The standard format of a ProgID is Project.Class[.Version]. In the Designer, you refer to
                   COM classes through ProgIDs.
              6.   Select the Attributes tab and set the threading model to Free, the interface to Dual, and
                   the aggregation setting to No.
              7.   Click OK.
              Now that you have a basic class definition, you can add a method to it.

              Step 3. Add the Required Methods to the Class
              1.   Return to the Classes View tab of the Workspace Window.
              2.   Expand the tree view.


For the BankSoft example, you expand COM_VC_BankSoft.
3.   Right-click the newly-added class.
     In the BankSoft example, you right-click the IBSoftFin tree item.
4.   Click the Add Method menu item and enter the name of the method.
     In the BankSoft example, you enter FV.
5.   In the Parameters field, enter the signature of the method.
     For FV, enter the following:
         [in] double Rate,
         [in] long nPeriods,
         [in] double Payment,
         [in] double PresentValue,
         [in] long PaymentType,
         [out, retval] double* FV

     This signature is expressed in terms of the Microsoft Interface Description Language
     (MIDL). For a complete description of MIDL, see the MIDL language reference. Note
     that:
     ♦   [in] indicates that the parameter is an input parameter.
     ♦   [out] indicates that the parameter is an output parameter.
     ♦   [out, retval] indicates that the parameter is the return value of the method.
     Also, note that all [out] parameters are passed by reference. In the BankSoft example, the
     parameter FV is a double.
6.   Click OK.
     The Developer Studio adds to the project a stub for the method you added.

Step 4. Fill Out the Method Stub with an Implementation
1.   In the BankSoft example, return to the Class View tab of the Workspace window and
     expand the COM_VC_BankSoft classes item.
2.   Expand the CBSoftFin item.
3.   Expand the IBSoftFin item under the above item.
4.   Right-click the FV item and choose Go to Definition.
5.   Position the cursor in the edit window on the line after the TODO comment and add the
     following code:
          double v = pow((1 + Rate), nPeriods);

          *FV = -(
              (PresentValue * v) +
              (Payment * (1 + (Rate * PaymentType))) * ((v - 1) / Rate)
          );



Since you refer to the pow function, you have to add the following preprocessor
                   statement after all other include statements at the beginning of the file:
                      #include <math.h>
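
                    Assembled, the completed method might look like the following sketch; the
                    exact stub the ATL wizard generates may differ slightly:

                       #include <math.h>

                       // A sketch of the completed method in BSoftFin.cpp.
                       STDMETHODIMP CBSoftFin::FV(double Rate, long nPeriods, double Payment,
                                                  double PresentValue, long PaymentType,
                                                  double* FV)
                       {
                           double v = pow((1 + Rate), nPeriods);

                           *FV = -(
                               (PresentValue * v) +
                               (Payment * (1 + (Rate * PaymentType))) * ((v - 1) / Rate)
                           );
                           return S_OK;
                       }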

                   The final step is to build the DLL. When you build it, you register the COM procedure
                   with the Windows registry.

              Step 5. Build the Project
              1.   Pull down the Build menu.
              2.   Select Rebuild All.
                   As Developer Studio builds the project, it generates the following output:
                       ------------Configuration: COM_VC_BankSoft - Win32 Debug--------------
                       Performing MIDL step
                       Microsoft (R) MIDL Compiler Version 3.01.75
                       Copyright (c) Microsoft Corp 1991-1997. All rights reserved.
                       Processing .\COM_VC_BankSoft.idl
                       COM_VC_BankSoft.idl
                       Processing C:\msdev\VC\INCLUDE\oaidl.idl
                       oaidl.idl
                       Processing C:\msdev\VC\INCLUDE\objidl.idl
                       objidl.idl
                       Processing C:\msdev\VC\INCLUDE\unknwn.idl
                       unknwn.idl
                       Processing C:\msdev\VC\INCLUDE\wtypes.idl
                       wtypes.idl
                       Processing C:\msdev\VC\INCLUDE\ocidl.idl
                       ocidl.idl
                       Processing C:\msdev\VC\INCLUDE\oleidl.idl
                       oleidl.idl
                       Compiling resources...
                       Compiling...
                       StdAfx.cpp
                       Compiling...
                       COM_VC_BankSoft.cpp
                       BSoftFin.cpp
                       Generating Code...
                       Linking...
                         Creating library Debug/COM_VC_BankSoft.lib and object Debug/COM_VC_BankSoft.exp
                       Registering ActiveX Control...
                       RegSvr32: DllRegisterServer in .\Debug\COM_VC_BankSoft.dll succeeded.

                       COM_VC_BankSoft.dll - 0 error(s), 0 warning(s)

              Notice that Visual C++ compiles the files in the project, links them into a dynamic link
              library (DLL) called COM_VC_BankSoft.DLL, and registers the COM (ActiveX) class
              COM_VC_BankSoft.BSoftFin in the local registry.
              Once the component is registered, it is accessible to the Integration Service running on that
              host.




For more information about how to package COM classes for distribution to other
Integration Services, see “Distributing External Procedures” on page 169.
For more information about how to use COM external procedures to call functions in a
preexisting library of C or C++ functions, see “Wrapper Classes for Pre-Existing C/C++
Libraries or VB Functions” on page 173.
For more information about how to use a class factory to initialize COM objects, see
“Initializing COM and Informatica Modules” on page 175.

Step 6. Register a COM Procedure with the Repository
1.   Open the Transformation Developer.
2.   Click Transformation > Import External Procedure.
     The Import External COM Method dialog box appears.
3.   Click the Browse button.







4.   Select the COM DLL you created and click OK.
     In the Banksoft example, select COM_VC_Banksoft.DLL.
5.   Under Select Method tree view, expand the class node (in this example, BSoftFin).
6.   Expand Methods.
7.   Select the method you want (in this example, FV) and press OK.
     The Designer creates an External Procedure transformation.
8.   Open the External Procedure transformation, and select the Properties tab.




The transformation properties display:




                    Enter ASCII characters in the Module/Programmatic Identifier and Procedure Name
                    fields.
              9.    Click the Ports tab.




                    Enter ASCII characters in the Port Name fields. For more information about mapping
                    Visual C++ and Visual Basic datatypes to COM datatypes, see “COM Datatypes” on
                    page 171.
              10.   Click OK, and then click Repository > Save.
                    The repository now contains the reusable transformation, so you can add instances of this
                    transformation to mappings.



Step 7. Create a Source and a Target for a Mapping
Use the following SQL statements to create a source table and to populate this table with
sample data:
       create table FVInputs(
         Rate float,
         nPeriods int,
         Payment float,
         PresentValue float,
         PaymentType int
       )
       insert into FVInputs values      (.005,10,-200.00,-500.00,1)
       insert into FVInputs values      (.01,12,-1000.00,0.00,0)
       insert into FVInputs values      (.11/12,35,-2000.00,0.00,1)
       insert into FVInputs values      (.005,12,-100.00,-1000.00,1)

Use the following SQL statement to create a target table:
        create table FVOutputs(
           FVin_ext_proc float
        )

Use the Source Analyzer and the Target Designer to import FVInputs and FVOutputs into
the same folder as the one in which you created the COM_BSFV transformation.

Step 8. Create a Mapping to Test the External Procedure
Transformation
Now create a mapping to test the External Procedure transformation:
1.   In the Mapping Designer, create a new mapping named Test_BSFV.
2.   Drag the source table FVInputs into the mapping.
3.   Drag the target table FVOutputs into the mapping.
4.   Drag the transformation COM_BSFV into the mapping.




5.   Connect the Source Qualifier transformation ports to the External Procedure
     transformation ports as appropriate.




6.    Connect the FV port in the External Procedure transformation to the FVIn_ext_proc
                    target column.
              7.    Validate and save the mapping.

              Step 9. Start the Integration Service
              Start the Integration Service. Note that the service must be started on the same host as the one
              on which the COM component was registered.

              Step 10. Run a Workflow to Test the Mapping
              When the Integration Service runs the session in a workflow, it performs the following
              functions:
              ♦    Uses the COM runtime facilities to load the DLL and create an instance of the class.
              ♦    Uses the COM IDispatch interface to call the external procedure you defined once for
                   every row that passes through the mapping.
              Note: Multiple classes, each with multiple methods, can be defined within a single project.
              Each of these methods can be invoked as an external procedure.

              To run a workflow to test the mapping:

              1.    In the Workflow Manager, create the session s_Test_BSFV from the Test_BSFV
                    mapping.
              2.    Create a workflow that contains the session s_Test_BSFV.
              3.    Run the workflow. The Integration Service searches the registry for the entry for the
                    COM_VC_BankSoft.BSoftFin class. This entry has information that allows the
                    Integration Service to determine the location of the DLL that contains that class. The
                    Integration Service loads the DLL, creates an instance of the class, and invokes the FV
                    function for every row in the source table.
                    When the workflow finishes, the FVOutputs table should contain the following results:
                    FVIn_ext_proc
                    2581.403374
                    12682.503013
                    82846.246372
                    2301.401830



        Developing COM Procedures with Visual Basic
              Microsoft Visual Basic offers a different development environment for creating COM
              procedures. While the Basic language has different syntax and conventions, the development
              procedure has the same broad outlines as developing COM procedures in Visual C++.




Step 1. Create a Visual Basic Project with a Single Class
1.   Launch Visual Basic and click File > New Project.
2.   In the dialog box that appears, select ActiveX DLL as the project type and click OK.
     Visual Basic creates a new project named Project1.
     If the Project window does not display, type Ctrl+R, or click View > Project Explorer.
     If the Properties window does not display, press F4, or click View > Properties.
3.   In the Project Explorer window for the new project, right-click the project and choose
     Project1 Properties from the menu that appears.
4.   Enter the name of the new project.
     In the Project window, select Project1 and change the name in the Properties window to
     COM_VB_BankSoft.

Step 2. Change the Names of the Project and Class
1.   Inside the Project Explorer, select the “Project – Project1” item, which should be the root
     item in the tree control. The project properties display in the Properties Window.
2.   Select the Alphabetic tab in the Properties Window and change the Name property to
     COM_VB_BankSoft. This renames the root item in the Project Explorer to
     COM_VB_BankSoft (COM_VB_BankSoft).
3.   Expand the COM_VB_BankSoft (COM_VB_BankSoft) item in the Project Explorer.
4.   Expand the Class Modules item.
5.   Select the Class1 (Class1) item. The properties of the class display in the Properties
     Window.
6.   Select the Alphabetic tab in the Properties Window and change the Name property to
     BSoftFin.
By changing the name of the project and class, you specify that the programmatic identifier
for the class you create is “COM_VB_BankSoft.BSoftFin.” Use this ProgID to refer to this
class inside the Designer.

Step 3. Add a Method to the Class
Place the pointer inside the Code window and enter the following text:
       Public Function FV( _
         Rate As Double, _
         nPeriods As Long, _
         Payment As Double, _
         PresentValue As Double, _
         PaymentType As Long _
       ) As Double

         Dim v As Double
         v = (1 + Rate) ^ nPeriods

         FV = -( _
           (PresentValue * v) + _
           (Payment * (1 + (Rate * PaymentType))) * ((v - 1) / Rate) _
         )

       End Function

               This Visual Basic FV function performs the same operation as the C++ FV function in
               “Using Visual C++ to Develop COM Procedures” on page 149.

              Step 4. Build the Project

              To build the project:

               1.   From the File menu, select Make COM_VB_BankSoft.DLL. A dialog box prompts
                    you for the file location.
              2.   Enter the file location and click OK.
              Visual Basic compiles the source code and creates the COM_VB_BankSoft.DLL in the
              location you specified. It also registers the class COM_VB_BankSoft.BSoftFin in the local
              registry.
              Once the component is registered, it is accessible to the Integration Service running on that
              host.
              For more information about how to package Visual Basic COM classes for distribution to
              other machines hosting the Integration Service, see “Distributing External Procedures” on
              page 169.
              For more information about how to use Visual Basic external procedures to call preexisting
              Visual Basic functions, see “Wrapper Classes for Pre-Existing C/C++ Libraries or VB
              Functions” on page 173.
              To create the procedure, follow steps 6 to 9 of “Using Visual C++ to Develop COM
              Procedures” on page 149.




Developing Informatica External Procedures
      You can create external procedures that run on 32-bit or 64-bit Integration Service machines.
      Complete the following steps to create an Informatica-style external procedure:
      1.   In the Transformation Developer, create an External Procedure transformation.
            The External Procedure transformation defines the signature of the procedure. The
            port names, datatypes, and port types (input or output) must match the signature
            of the external procedure.
      2.   Generate the template code for the external procedure.
           When you execute this command, the Designer uses the information from the External
           Procedure transformation to create several C++ source code files and a makefile. One of
           these source code files contains a “stub” for the function whose signature you defined in
           the transformation.
      3.   Modify the code to add the procedure logic. Fill out the stub with an implementation
           and use a C++ compiler to compile and link the source code files into a dynamic link
           library or shared library.
           When the Integration Service encounters an External Procedure transformation bound to
           an Informatica procedure, it loads the DLL or shared library and calls the external
           procedure you defined.
      4.   Build the library and copy it to the Integration Service machine.
      5.   Create a mapping with the External Procedure transformation.
      6.   Run the session in a workflow.
      We use the BankSoft example to illustrate how to implement this feature.


    Step 1. Create the External Procedure Transformation
      1.   Open the Transformation Developer and create an External Procedure transformation.
      2.   Open the transformation and enter a name for it.
           In the BankSoft example, enter EP_extINF_BSFV.
      3.   Create a port for each argument passed to the procedure you plan to define.
           Be sure that you use the correct datatypes.




To use the FV procedure as an example, you create the following ports. The last port, FV,
                    captures the return value from the procedure:

                     Port Name       Datatype    Port Type
                     Rate            Double      Input
                     nPeriods        Integer     Input
                     Payment         Double      Input
                     PresentValue    Double      Input
                     PaymentType     Integer     Input
                     FV              Double      Output (return value)
              4.   Select the Properties tab and configure the procedure as an Informatica procedure.
                    In the BankSoft example, set the Module/Programmatic Identifier to INF_BankSoft
                    and the Runtime Location to $PMExtProcDir.
                   Note on Module/Programmatic Identifier:
                    ♦   The module name is the base name of the dynamic link library (on Windows) or the
                        shared object (on UNIX) that contains the external procedures. The following table
                        describes how the module name determines the name of the DLL or shared object on
                        the various platforms:

          Operating System    Module Identifier    Library File Name
          Windows             INF_BankSoft         INF_BankSoft.DLL
          AIX                 INF_BankSoft         libINF_BankSoftshr.a
          HP-UX               INF_BankSoft         libINF_BankSoft.sl
          Linux               INF_BankSoft         libINF_BankSoft.so
          Solaris             INF_BankSoft         libINF_BankSoft.so.1

     Notes on Runtime Location:
     ♦    If you set the Runtime Location to $PMExtProcDir, then the Integration Service looks
          in the directory specified by the process variable $PMExtProcDir to locate the library.
     ♦    If you leave the Runtime Location property blank, the Integration Service uses the
          environment variable defined on the server platform to locate the dynamic link library
          or shared object. The following table describes the environment variables used to
          locate the DLL or shared object on the various platforms:

          Operating System    Environment Variable
          Windows             PATH
          AIX                 LIBPATH
          HP-UX               SHLIB_PATH
          Linux               LD_LIBRARY_PATH
          Solaris             LD_LIBRARY_PATH


      ♦    You can hard code a path as the Runtime Location. However, this is not recommended
           because the path is specific to a single machine.
     Note: You must copy all DLLs or shared libraries to the Runtime Location or to the
     environment variable defined on the Integration Service machine. The Integration
     Service fails to load the external procedure when it cannot locate the DLL, shared library,
     or a referenced file.
5.   Click OK.
6.   Click Repository > Save.
After you create the External Procedure transformation that calls the procedure, the next step
is to generate the C++ files.




Step 2. Generate the C++ Files
              After you create an External Procedure transformation, you generate the code. The Designer
              generates file names in lower case since files created on UNIX-mapped drives are always in
              lower case. The following rules apply to the generated files:
              ♦    File names. A prefix ‘tx’ is used for TX module files.
              ♦    Module class names. The generated code has class declarations for the module that
                   contains the TX procedures. A prefix Tx is used for TX module classes. For example, if an
                   External Procedure transformation has a module name Mymod, then the class name is
                   TxMymod.

              To generate the code for an external procedure:

              1.    Select the transformation and click Transformation > Generate Code.
              2.    Select the check box next to the name of the procedure you just created.
                    In the BankSoft example, select INF_BankSoft.FV.
              3.    Specify the directory where you want to generate the files, and click Generate.
                    The Designer creates a subdirectory, INF_BankSoft, in the directory you specified.
                    Each External Procedure transformation created in the Designer must specify a module
                    and a procedure name. The Designer generates code in a single directory for all
                    transformations sharing a common module name. Building the code in one directory
                    creates a single shared library.
                    The Designer generates the following files:
                    ♦   tx<moduleName>.h. Defines the external procedure module class. This class is derived
                        from a base class TINFExternalModule60. No data members are defined for this class
                        in the generated code. However, you can add new data members and methods here.
                    ♦   tx<moduleName>.cpp. Implements the external procedure module class. You can
                        expand the InitDerived() method to include initialization of any new data members
                        you add. The Integration Service calls the derived class InitDerived() method only
                        when it successfully completes the base class Init() method.
                    This file defines the signatures of all External Procedure transformations in the module.
                    Any modification of these signatures leads to inconsistency with the External Procedure
                    transformations defined in the Designer. Therefore, you should not change the
                    signatures.
                    This file also includes a C function CreateExternalModuleObject, which creates an
                    object of the external procedure module class using the constructor defined in this file.
                    The Integration Service calls CreateExternalModuleObject instead of directly calling the
                    constructor.
                     ♦   <procedureName>.cpp. The Designer generates one of these files for each external
                         procedure in this module. This file contains the code that implements the procedure
                         logic, such as data cleansing and filtering. For data cleansing, create code to read in
                         values from the input ports and generate values for output ports. For filtering, create
                         code to suppress generation of output rows by returning INF_NO_OUTPUT_ROW.
      ♦   stdafx.h. Stub file used for building on UNIX systems. The various *.cpp files include
          this file. On Windows systems, Visual Studio generates an stdafx.h file, which you
          should use instead of the Designer-generated file.
     ♦   version.cpp. This is a small file that carries the version number of this
         implementation. In earlier releases, external procedure implementation was handled
         differently. This file allows the Integration Service to determine the version of the
         external procedure module.
     ♦   makefile.aix, makefile.aix64, makefile.hp, makefile.hp64, makefile.hpparisc64,
         makefile.linux, makefile.sol. Make files for UNIX platforms. Use makefile.aix,
         makefile.hp, makefile.linux, and makefile.sol for 32-bit platforms. Use makefile.aix64
         for 64-bit AIX platforms and makefile.hp64 for 64-bit HP-UX (Itanium) platforms.

Example 1
In the BankSoft example, the Designer generates the following files:
♦   txinf_banksoft.h. Contains declarations for module class TxINF_BankSoft and external
    procedure FV.
♦   txinf_banksoft.cpp. Contains code for module class TxINF_BankSoft.
♦   fv.cpp. Contains code for procedure FV.
♦   version.cpp. Returns TX version.
♦   stdafx.h. Required for compilation on UNIX. On Windows, stdafx.h is generated by
    Visual Studio.
♦   readme.txt. Contains general help information.

Example 2
If you create two External Procedure transformations with procedure names ‘Myproc1’ and
‘Myproc2,’ both with the module name Mymod, the Designer generates the following files:
♦   txmymod.h. Contains declarations for module class TxMymod and external procedures
    Myproc1 and Myproc2.
♦   txmymod.cpp. Contains code for module class TxMymod.
♦   myproc1.cpp. Contains code for procedure Myproc1.
♦   myproc2.cpp. Contains code for procedure Myproc2.
♦   version.cpp.
♦   stdafx.h.
♦   readme.txt.
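
For illustration, the following is a minimal sketch of the kind of declaration the generated
module header txmymod.h contains for Example 2. It is not the literal generated code: the
exact layout differs, and the assumption that the header infem60.h declares the base class
TINFExternalModule60 is based only on the header list in “Other Files Distributed and
Used in TX” on page 179.
     // Sketch of a generated module header (illustrative only).
     #include "infem60.h"   // assumed TX header declaring TINFExternalModule60

     class TxMymod : public TINFExternalModule60
     {
     public:
         // One member function per external procedure in module Mymod:
         INF_RESULT Myproc1();
         INF_RESULT Myproc2();
     };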




Step 3. Fill Out the Method Stub with Implementation
              The final step is coding the procedure.
              1.   Open the <Procedure_Name>.cpp stub file generated for the procedure.
                   In the BankSoft example, you open fv.cpp to code the TxINF_BankSoft::FV procedure.
              2.   Enter the C++ code for the procedure.
                    The following code implements the FV procedure:
                       INF_RESULT TxINF_BankSoft::FV()
                       {
                           // Input port values are mapped to the m_pInParamVector array in
                           // the InitParams method. Use m_pInParamVector[i].IsValid() to check
                           // if they are valid. Use m_pInParamVector[i].GetLong or GetDouble,
                           // etc. to get their value. Generate output data into m_pOutParamVector.

                           // TODO: Fill in implementation of the FV method here.

                           ostrstream ss;
                           char* s;
                           INF_Boolean bVal;
                           double v;

                           TINFParam* Rate = &m_pInParamVector[0];
                           TINFParam* nPeriods = &m_pInParamVector[1];
                           TINFParam* Payment = &m_pInParamVector[2];
                           TINFParam* PresentValue = &m_pInParamVector[3];
                           TINFParam* PaymentType = &m_pInParamVector[4];
                           TINFParam* FV = &m_pOutParamVector[0];

                           bVal = INF_Boolean(
                               Rate->IsValid() &&
                               nPeriods->IsValid() &&
                               Payment->IsValid() &&
                               PresentValue->IsValid() &&
                               PaymentType->IsValid()
                           );

                           if (bVal == INF_FALSE)
                           {
                               FV->SetIndicator(INF_SQL_DATA_NULL);
                               return INF_SUCCESS;
                           }

                           v = pow((1 + Rate->GetDouble()), (double)nPeriods->GetLong());

                           FV->SetDouble(
                               -(
                                   (PresentValue->GetDouble() * v) +
                                   (Payment->GetDouble() *
                                     (1 + (Rate->GetDouble() * PaymentType->GetLong()))) *
                                   ((v - 1) / Rate->GetDouble())
                               )
                           );

                           ss << "The calculated future value is: " << FV->GetDouble() << ends;
                           s = ss.str();
                           (*m_pfnMessageCallback)(E_MSG_TYPE_LOG, 0, s);
                           (*m_pfnMessageCallback)(E_MSG_TYPE_ERR, 0, s);
                           delete [] s;

                           return INF_SUCCESS;
                       }

         The Designer generates the function profile, including the arguments and return value.
         You need to enter the actual code within the function, as indicated in the comments.
         Since you referenced the pow function and defined an ostrstream variable, you must
         also include the preprocessor statements:
         On Windows:
           #include <math.h>
           #include <strstrea.h>

         On UNIX, the include statements would be the following:
           #include <math.h>
           #include <strstream.h>

   3.   Save the modified file.


Step 4. Building the Module
   On Windows, use Visual C++ to compile the DLL.

   To build a DLL on Windows:

   1.   Start Visual C++.
   2.   Click File > New.
   3.   In the New dialog box, click the Projects tab and select the MFC AppWizard (DLL)
        option.


4.    Enter the project location.
                     In the BankSoft example, enter c:\pmclient\tx\INF_BankSoft, assuming you
                     generated files in c:\pmclient\tx.
              5.    Enter the name of the project.
                    It must be the same as the module name entered for the External Procedure
                    transformation. In the BankSoft example, it is INF_BankSoft.
              6.    Click OK.
                    Visual C++ now steps you through a wizard that defines all the components of the
                    project.
              7.    In the wizard, click MFC Extension DLL (using shared MFC DLL).
              8.    Click Finish.
                    The wizard generates several files.
              9.    Click Project > Add To Project > Files.
              10.   Navigate up a directory level. This directory contains the external procedure files you
                    created. Select all .cpp files.
                    In the BankSoft example, add the following files:
                    ♦   fv.cpp
                    ♦   txinf_banksoft.cpp
                    ♦   version.cpp
              11.   Click Project > Settings.
              12.   Click the C/C++ tab, and select Preprocessor from the Category field.
               13.   In the Additional Include Directories field, enter ..; <pmserver install
                     dir>\extproc\include.
              14.   Click the Link tab, and select General from the Category field.
               15.   Enter <pmserver install dir>\bin\pmtx.lib in the Object/Library Modules field.
              16.   Click OK.
              17.   Click Build > Build INF_BankSoft.dll or press F7 to build the project.
                    The compiler now creates the DLL and places it in the debug or release directory under
                    the project directory. For information about running a workflow with the debug version,
                    see “Running a Session with the Debug Version of the Module on Windows” on
                    page 168.




To build shared libraries on UNIX:

  1.   If you cannot access the PowerCenter Client tools directly, copy all the files you need for
       the shared library to the UNIX machine where you plan to perform the build.
       For example, in the BankSoft procedure, use ftp or another mechanism to copy
       everything from the INF_BankSoft directory to the UNIX machine.
  2.   Set the environment variable INFA_HOME to the PowerCenter installation directory.
       Warning: If you specify an incorrect directory path for the INFA_HOME environment
       variable, the Integration Service cannot start.
  3.   Enter the command to make the project.
       The command depends on the version of UNIX, as summarized below:

         UNIX Version       Command
         AIX (32-bit)       make -f makefile.aix
         AIX (64-bit)       make -f makefile.aix64
         HP-UX (32-bit)     make -f makefile.hp
         HP-UX (64-bit)     make -f makefile.hp64
         Linux              make -f makefile.linux
         Solaris            make -f makefile.sol



Step 5. Create a Mapping
  In the Mapping Designer, create a mapping that uses this External Procedure transformation.


Step 6. Run the Session in a Workflow
  When you run the session in a workflow, the Integration Service looks in the directory you
  specify as the Runtime Location to find the library (DLL) you built in Step 4. The default
  value of the Runtime Location property in the session properties is $PMExtProcDir.

  To run a session in a workflow:

  1.   In the Workflow Manager, create a workflow.
  2.   Create a session for this mapping in the workflow.
       Tip: Alternatively, you can create a re-usable session in the Task Developer and use it in
       the workflow.
  3.   Copy the library (DLL) to the Runtime Location directory.
  4.   Run the workflow containing the session.




Running a Session with the Debug Version of the Module on Windows
              Informatica ships PowerCenter on Windows with the release build (pmtx.dll) and the debug
              build (pmtxdbg.dll) of the External Procedure transformation library. These libraries are
              installed in the server bin directory.
              If you build a release version of the module in Step 4, run the session in a workflow to use the
              release build (pmtx.dll) of the External Procedure transformation library. You do not need to
              complete the following task.
              If you build a debug version of the module in Step 4, follow the procedure below to use the
              debug build (pmtxdbg.dll) of the External Procedure transformation library.

              To run a session using a debug version of the module:

              1.   In the Workflow Manager, create a workflow.
              2.   Create a session for this mapping in the workflow.
                   Or, you can create a re-usable session in the Task Developer and use it in the workflow.
              3.   Copy the library (DLL) to the Runtime Location directory.
              4.   To use the debug build of the External Procedure transformation library:
                   ♦   Preserve pmtx.dll by renaming it or moving it from the server bin directory.
                   ♦   Rename pmtxdbg.dll to pmtx.dll.
              5.   Run the workflow containing the session.
               6.   To revert to the release build of the External Procedure transformation library:
                    ♦   Rename pmtx.dll back to pmtxdbg.dll.
                    ♦   Return the original pmtx.dll file to the server bin directory.
              Note: If you run a workflow containing this session with the debug version of the module on
              Windows, you must return the original pmtx.dll file to its original name and location before
              you can run a non-debug session.




Distributing External Procedures
      Suppose you develop a set of external procedures and you want to make them available on
      multiple servers, each of which is running the Integration Service. The methods for doing this
      depend on the type of the external procedure and the operating system on which you built it.
      You can also use these procedures to distribute external procedures to external customers.


    Distributing COM Procedures
      Visual Basic and Visual C++ register COM classes in the local registry when you build the
      project. Once registered, these classes are accessible to the Integration Service running on the
      machine where you compiled the DLL. For example, if you build a project on HOST1, all the
      classes in the project will be registered in the HOST1 registry and will be accessible to the
      Integration Service running on HOST1. Suppose, however, that you also want the classes to
      be accessible to the Integration Service running on HOST2. For this to happen, the classes
      must be registered in the HOST2 registry.
      Visual Basic provides a utility for creating a setup program that can install COM classes on a
      Windows machine and register these classes in the registry on that machine. While no utility
      is available in Visual C++, you can easily register the class yourself.
      Figure 6-1 shows the process for distributing external procedures:

       Figure 6-1. Process for Distributing External Procedures

         Development machine (where the external procedure was developed in C++ or VB)
            → PowerCenter Client (bring the DLL here and run regsvr32 <xyz>.dll)
            → Integration Service machine (bring the DLL here and run regsvr32 <xyz>.dll)

      To distribute a COM Visual Basic procedure:

      1.   After you build the DLL, exit Visual Basic and launch the Visual Basic Application Setup
           wizard.
      2.   Skip the first panel of the wizard.
      3.   On the second panel, specify the location of the project and select the Create a Setup
           Program option.
      4.   In the third panel, select the method of distribution you plan to use.
      5.   In the next panel, specify the directory to which you want to write the setup files.
           For simple ActiveX components, you can continue to the final panel of the wizard.
           Otherwise, you may need to add more information, depending on the type of file and the
           method of distribution.



6.   Click Finish in the final panel.
                   Visual Basic then creates the setup program for the DLL. Run this setup program on any
                   Windows machine where the Integration Service is running.

              To distribute a COM Visual C++/Visual Basic procedure manually:

               1.   Copy the DLL to any directory on the new Windows machine.
              2.   Log in to this Windows machine and open a DOS prompt.
              3.   Navigate to the directory containing the DLL and execute the following command:
                      REGSVR32 project_name.DLL

                    project_name is the name of the DLL you created. In the BankSoft example, the project
                    name is COM_VC_BankSoft.DLL or COM_VB_BankSoft.DLL.
                   This command line program then registers the DLL and any COM classes contained in
                   it.


        Distributing Informatica Modules
              You can distribute external procedures between repositories.

              To distribute external procedures between repositories:

              1.   Move the DLL or shared object that contains the external procedure to a directory on a
                   machine that the Integration Service can access.
              2.   Copy the External Procedure transformation from the original repository to the target
                   repository using the Designer client tool.
                   -or-
                   Export the External Procedure transformation to an XML file and import it in the target
                   repository.
                   For more information, see “Exporting and Importing Objects” in the Repository Guide.




Development Notes
      This section includes some additional guidelines and information about developing COM
      and Informatica external procedures.


    COM Datatypes
      When using either Visual C++ or Visual Basic to develop COM procedures, you need to use
      COM datatypes that correspond to the internal datatypes that the Integration Service uses
      when reading and transforming data. These datatype matches are important when the
      Integration Service attempts to map datatypes between ports in an External Procedure
      transformation and arguments (or return values) from the procedure the transformation calls.
      Table 6-2 compares Visual C++ and transformation datatypes:

      Table 6-2. Visual C++ and Transformation Datatypes

        Visual C++ COM Datatype    Transformation Datatype
        VT_I4                      Integer
        VT_UI4                     Integer
        VT_R8                      Double
        VT_BSTR                    String
        VT_DECIMAL                 Decimal
        VT_DATE                    Date/Time


      Table 6-3 compares Visual Basic and transformation datatypes:

      Table 6-3. Visual Basic and Transformation Datatypes

        Visual Basic COM Datatype    Transformation Datatype
        Long                         Integer
        Double                       Double
        String                       String
        Decimal                      Decimal
        Date                         Date/Time


      If you do not correctly match datatypes, the Integration Service may attempt a conversion.
      For example, if you assign the Integer datatype to a port, but the datatype for the
      corresponding argument is BSTR, the Integration Service attempts to convert the Integer
      value to a BSTR.




Row-Level Procedures
              All External Procedure transformations call procedures using values from a single row passed
              through the transformation. You cannot use values from multiple rows in a single procedure
              call. For example, you could not code the equivalent of the aggregate functions SUM or AVG
              into a procedure call. In this sense, all external procedures must be stateless.


        Return Values from Procedures
              When you call a procedure, the Integration Service captures an additional return value
              beyond whatever return value you code into the procedure. This additional value indicates
              whether the Integration Service successfully called the procedure.
              For COM procedures, this return value uses the type HRESULT.
               Informatica procedures use the type INF_RESULT. If the value returned is S_OK/
               INF_SUCCESS, the Integration Service successfully called the procedure. You must return
               the appropriate value to indicate the success or failure of the external procedure. Informatica
               procedures return one of the following four values:
              ♦   INF_SUCCESS. The external procedure processed the row successfully. The Integration
                  Service passes the row to the next transformation in the mapping.
              ♦   INF_NO_OUTPUT_ROW. The Integration Service does not write the current row due
                  to external procedure logic. This is not an error. When you use
                  INF_NO_OUTPUT_ROW to filter rows, the External Procedure transformation behaves
                  similarly to the Filter transformation.
                   Note: When you use INF_NO_OUTPUT_ROW in the external procedure, make sure you
                   connect the External Procedure transformation to a downstream transformation that
                   receives rows only from the External Procedure transformation.
              ♦   INF_ROW_ERROR. Equivalent to a transformation error. The Integration Service
                  discards the current row, but may process the next row unless you configure the session to
                  stop on n errors.
              ♦   INF_FATAL_ERROR. Equivalent to an ABORT() function call. The Integration Service
                  aborts the session and does not process any more rows. For more information, see
                  “Functions” in the Transformation Language Reference.
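
               To make the return values concrete, the following is a minimal, filter-style sketch of a
               procedure body. It is illustrative only: the module name Mymod, the procedure name
               Myproc1, and the port layout (one Double input, one Double output) are assumptions.
                    INF_RESULT TxMymod::Myproc1()
                    {
                        TINFParam* amount = &m_pInParamVector[0];
                        TINFParam* result = &m_pOutParamVector[0];

                        // A NULL input is treated here as a row-level error. The
                        // Integration Service discards the row but continues the session.
                        if (amount->IsNULL())
                            return INF_ROW_ERROR;

                        // Filter: suppress output rows with negative amounts. This is
                        // not an error; the Integration Service simply skips the row.
                        if (amount->GetDouble() < 0.0)
                            return INF_NO_OUTPUT_ROW;

                        result->SetDouble(amount->GetDouble());
                        return INF_SUCCESS;
                    }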


        Exceptions in Procedure Calls
              The Integration Service captures most exceptions that occur when it calls a COM or
              Informatica procedure through an External Procedure transformation. For example, if the
              procedure call creates a divide by zero error, the Integration Service catches the exception.
              In a few cases, the Integration Service cannot capture errors generated by procedure calls.
              Since the Integration Service supports only in-process COM servers, and since all Informatica
              procedures are stored in shared libraries and DLLs, the code running external procedures
              exists in the same address space in memory as the Integration Service. Therefore, it is possible
               for the external procedure code to overwrite the Integration Service memory, causing the
               Integration Service to stop. If COM or Informatica procedures cause such stops, review the
               source code for memory access problems.


Memory Management for Procedures
  Since all the datatypes used in Informatica procedures are fixed length, there are no memory
  management issues for Informatica external procedures. For COM procedures, you need to
  allocate memory only if an [out] parameter from a procedure uses the BSTR datatype. In this
  case, you need to allocate memory on every call to this procedure. During a session, the
  Integration Service releases the memory after calling the function.
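
   As an illustration of BSTR allocation, the following hedged C++ sketch shows a COM
   method body that allocates a BSTR for an [out] parameter. The function and parameter
   names are hypothetical; SysAllocString is the standard Win32 API for allocating a BSTR.
        #include <windows.h>   // BSTR, SysAllocString, HRESULT

        // Illustrative [out] BSTR parameter: allocate on every call. The
        // caller (here, the Integration Service) releases the string.
        HRESULT GetLabel(long code, BSTR* pLabel)
        {
            *pLabel = SysAllocString(code >= 0 ? L"CREDIT" : L"DEBIT");
            return (*pLabel != NULL) ? S_OK : E_OUTOFMEMORY;
        }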


Wrapper Classes for Pre-Existing C/C++ Libraries or VB Functions
   Suppose that BankSoft has a library of C or C++ functions and wants to plug these functions
   into the Integration Service. In particular, the library contains BankSoft’s own
   implementation of the FV function, called PreExistingFV. The general method for doing this
  is the same for both COM and Informatica external procedures. A similar solution is available
  in Visual Basic. You need only make calls to preexisting Visual Basic functions or to methods
  on objects that are accessible to Visual Basic.


Generating Error and Tracing Messages
   The implementation of the Informatica external procedure TxINF_BankSoft::FV in “Step 3.
   Fill Out the Method Stub with Implementation” on page 164 contains the following lines of code.
        ostrstream ss;
        char* s;
        ...
        ss << "The calculated future value is: " << FV->GetDouble() << ends;
        s = ss.str();
        (*m_pfnMessageCallback)(E_MSG_TYPE_LOG, 0, s);
        (*m_pfnMessageCallback)(E_MSG_TYPE_ERR, 0, s);
        delete [] s;

  When the Integration Service creates an object of type Tx<MODNAME>, it passes to its
  constructor a pointer to a callback function that can be used to write error or debugging
  messages to the session log. (The code for the Tx<MODNAME> constructor is in the file
  Tx<MODNAME>.cpp.) This pointer is stored in the Tx<MODNAME> member variable
  m_pfnMessageCallback. The type of this pointer is defined in a typedef in the file
  $PMExtProcDir/include/infemmsg.h:
        typedef void (*PFN_MESSAGE_CALLBACK)(
           enum E_MSG_TYPE eMsgType,
           unsigned long Code,
           char* Message
        );

  Also defined in that file is the enumeration E_MSG_TYPE:
         enum E_MSG_TYPE {
           E_MSG_TYPE_LOG = 0,
           E_MSG_TYPE_WARNING,
           E_MSG_TYPE_ERR
         };

               If you specify the eMsgType of the callback function as E_MSG_TYPE_LOG, the callback
               function writes a log message to the session log. If you specify E_MSG_TYPE_ERR, the
               callback function writes an error message to the session log. If you specify
               E_MSG_TYPE_WARNING, the callback function writes a warning message to the session
               log. Use these messages to provide a simple debugging capability in Informatica external
               procedures.
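
               For example, a minimal sketch of writing a warning from inside an Informatica external
               procedure (the message text is illustrative):
                    // m_pfnMessageCallback is the callback pointer stored by the
                    // generated Tx<MODNAME> constructor.
                    (*m_pfnMessageCallback)(E_MSG_TYPE_WARNING, 0,
                        (char*)"Rate is zero; skipping future value calculation.");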
              To debug COM external procedures, you may use the output facilities available from inside a
              Visual Basic or C++ class. For example, in Visual Basic use a MsgBox to print out the result of
              a calculation for each row. Of course, you want to do this only on small samples of data while
              debugging and make sure to remove the MsgBox before making a production run.
              Note: Before attempting to use any output facilities from inside a Visual Basic or C++ class,
              you must add the following value to the registry:
               1.   Add the following entry to the Windows registry:
                       HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\PowerMart\
                       Parameters\MiscInfo\RunInDebugMode=Yes

                   This option starts the Integration Service as a regular application, not a service. You can
                   debug the Integration Service without changing the debug privileges for the Integration
                   Service service while it is running.
              2.   Start the Integration Service from the command line, using the command
                   PMSERVER.EXE.
                   The Integration Service is now running in debug mode.
              When you are finished debugging, make sure you remove this entry from the registry or set
              RunInDebugMode to No. Otherwise, when you attempt to start PowerCenter as a service, it
              will not start.
               1.   Stop the Integration Service and change the registry entry you added earlier to the
                    following setting:
                       HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\PowerMart\
                       Parameters\MiscInfo\RunInDebugMode=No

              2.   Restart the Integration Service as a Windows service.

              The TINFParam Class and Indicators
               The <PROCNAME> method accesses input and output parameters using two parameter
               arrays; each array element is of the TINFParam datatype. The TINFParam datatype
              is a C++ class that serves as a “variant” data structure that can hold any of the Informatica
              internal datatypes. The actual data in a parameter of type TINFParam* is accessed through
              member functions of the form Get<Type> and Set<Type>, where <Type> is one of the
              Informatica internal datatypes. TINFParam also has methods for getting and setting the
              indicator for each parameter.




You are responsible for checking these indicators on entry to the external procedure and for
  setting them on exit. On entry, the indicators of all output parameters are explicitly set to
  INF_SQL_DATA_NULL, so if you do not reset these indicators before returning from the
  external procedure, you will just get NULLs for all the output parameters. The TINFParam
  class also supports functions for obtaining the metadata for a particular parameter. For a
  complete description of all the member functions of the TINFParam class, see the infemdef.h
  include file in the tx/include directory.
  Note that one of the main advantages of Informatica external procedures over COM external
  procedures is that Informatica external procedures directly support indicator manipulation.
  That is, you can check an input parameter to see if it is NULL, and you can set an output
  parameter to NULL. COM provides no indicator support. Consequently, if a row entering a
  COM-style external procedure has any NULLs in it, the row cannot be processed. Use the
  default value facility in the Designer to overcome this shortcoming. However, it is not
  possible to pass NULLs out of a COM function.
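
   For illustration, a minimal sketch of indicator handling inside an Informatica procedure
   (assuming one Double input parameter and one Double output parameter):
        TINFParam* in  = &m_pInParamVector[0];
        TINFParam* out = &m_pOutParamVector[0];

        if (in->IsNULL())
        {
            // Propagate the NULL to the output port.
            out->SetIndicator(INF_SQL_DATA_NULL);
            return INF_SUCCESS;
        }

        // Setting a value makes the output non-NULL, as the FV example
        // suggests; otherwise the indicator remains INF_SQL_DATA_NULL.
        out->SetDouble(in->GetDouble() * 2.0);
        return INF_SUCCESS;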


Unconnected External Procedure Transformations
  When you add an instance of an External Procedure transformation to a mapping, you can
  choose to connect it as part of the pipeline or leave it unconnected. Connected External
  Procedure transformations call the COM or Informatica procedure every time a row passes
  through the transformation.
  To get return values from an unconnected External Procedure transformation, call it in an
  expression using the following syntax:
           :EXT.transformation_name(arguments)

  When a row passes through the transformation containing the expression, the Integration
  Service calls the procedure associated with the External Procedure transformation. The
  expression captures the return value of the procedure through the External Procedure
  transformation return port, which should have the Result (R) option checked. For more
  information about expressions, see “Working with Expressions” on page 10.
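
   For example, in the BankSoft mapping you might call the unconnected transformation from
   an Expression transformation as follows. The transformation and port names come from the
   FV example; the calling expression itself is illustrative:
            :EXT.EP_extINF_BSFV(Rate, nPeriods, Payment, PresentValue, PaymentType)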


Initializing COM and Informatica Modules
  Some external procedures must be configured at initialization time. This initialization takes
  one of two forms, depending on the type of the external procedure:
  1.   Initialization of Informatica-style external procedures. The Tx<MODNAME> class,
       which contains the external procedure, also contains the initialization function,
       Tx<MODNAME>::InitDerived. The signature of this initialization function is well-
       known to the Integration Service and consists of three parameters:
       ♦   nInitProps. This parameter tells the initialization function how many initialization
           properties are being passed to it.
        ♦   Properties. This parameter is an array of nInitProps strings representing the names of
            the initialization properties.




        ♦   Values. This parameter is an array of nInitProps strings representing the values of the
            initialization properties.




                   The Integration Service first calls the Init() function in the base class. When the Init()
                   function successfully completes, the base class calls the Tx<MODNAME>::InitDerived()
                   function.
                   The Integration Service creates the Tx<MODNAME> object and then calls the
                   initialization function. It is the responsibility of the external procedure developer to
                   supply that part of the Tx<MODNAME>::InitDerived() function that interprets the
                    initialization properties and uses them to initialize the external procedure (see the
                    sketch after this list). Once the object is created and initialized, the Integration Service
                    can call the external procedure on the object for each row.
              2.   Initialization of COM-style external procedures. The object that contains the external
                   procedure (or EP object) does not contain an initialization function. Instead, another
                   object (the CF object) serves as a class factory for the EP object. The CF object has a
                   method that can create an EP object.
                   The signature of the CF object method is determined from its type library. The
                   Integration Service creates the CF object, and then calls the method on it to create the EP
                   object, passing this method whatever parameters are required. This requires that the
                   signature of the method consist of a set of input parameters, whose types can be
                   determined from the type library, followed by a single output parameter that is an
                   IUnknown** or an IDispatch** or a VARIANT* pointing to an IUnknown* or
                   IDispatch*.
                   The input parameters hold the values required to initialize the EP object and the output
                   parameter receives the initialized object. The output parameter can have either the [out]
                   or the [out, retval] attributes. That is, the initialized object can be returned either as an




output parameter or as the return value of the method. The datatypes supported for the
     input parameters are:
     ♦   VT_UI1
     ♦   VT_BOOL
     ♦   VT_I2
     ♦   VT_UI2
     ♦   VT_I4
     ♦   VT_UI4
     ♦   VT_R4
     ♦   VT_R8
     ♦   VT_BSTR
     ♦   VT_CY
     ♦   VT_DATE
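
Returning to Informatica-style initialization, the following is a minimal InitDerived sketch
that consumes the initialization properties. It is an assumption-laden illustration, not
generated code: the module name Mymod, the parameter types, and the data member
m_addFactor are hypothetical.
     #include <cstring>   // strcmp
     #include <cstdlib>   // atoi

     INF_RESULT TxMymod::InitDerived(
         unsigned long nInitProps,   // number of initialization properties
         char**        Properties,   // property names
         char**        Values)       // property values
     {
         // Cache the properties this module understands.
         for (unsigned long i = 0; i < nInitProps; i++)
         {
             if (strcmp(Properties[i], "addFactor") == 0)
                 m_addFactor = atoi(Values[i]);   // hypothetical data member
         }
         return INF_SUCCESS;
     }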

Setting Initialization Properties in the Designer
Enter external procedure initialization properties on the Initialization Properties tab of the
Edit Transformations dialog box. The tab displays different fields, depending on whether the
external procedure is COM-style or Informatica-style.
COM-style External Procedure transformations contain the following fields on the
Initialization Properties tab:
♦   Programmatic Identifier for Class Factory. Enter the programmatic identifier of the class
    factory.
♦   Constructor. Specify the method of the class factory that creates the EP object.




Figure 6-2 shows the Initialization Properties tab of a COM-style External Procedure
               transformation:

               Figure 6-2. External Procedure Transformation Initialization Properties
               (The tab lists property names and values; click Add to add a new property row.)
              You can enter an unlimited number of initialization properties to pass to the Constructor
              method for both COM-style and Informatica-style External Procedure transformations.
              To add a new initialization property, click the Add button. Enter the name of the parameter
              in the Property column and enter the value of the parameter in the Value column. For
              example, you can enter the following parameters:

                Parameter    Value
                Param1       abc
                Param2       100
                Param3       3.17


              Note: You must create a one-to-one relation between the initialization properties you define in
              the Designer and the input parameters of the class factory constructor method. For example,
              if the constructor has n parameters with the last parameter being the output parameter that
              receives the initialized object, you must define n – 1 initialization properties in the Designer,
              one for each input parameter in the constructor method.
              You can also use process variables in initialization properties. For information about process
              variables support in Initialization properties, see “Service Process Variables in Initialization
              Properties” on page 180.




Other Files Distributed and Used in TX
   Following are the header files located under the path $PMExtProcDir/include that are needed
   for compiling external procedures:
   ♦   infconfg.h
   ♦   infem60.h
   ♦   infemdef.h
   ♦   infemmsg.h
   ♦   infparam.h
   ♦   infsigtr.h
   Following are the library files located under the path <PMInstallDir> that are needed for
   linking external procedures and running the session:
   ♦   libpmtx.a (AIX)
   ♦   libpmtx.sl (HP-UX)
   ♦   libpmtx.so (Linux)
   ♦   libpmtx.so (Solaris)
   ♦   pmtx.dll and pmtx.lib (Windows)




Service Process Variables in Initialization Properties
              PowerCenter supports built-in process variables in the External Procedure transformation
              initialization properties list. If the property values contain built-in process variables, the
              Integration Service expands them before passing them to the external procedure library. This
              can be very useful for writing portable External Procedure transformations.
              Figure 6-3 shows an External Procedure transformation with five user-defined properties:

              Figure 6-3. External Procedure Transformation Initialization Properties Tab




              Table 6-4 contains the initialization properties and values for the External Procedure
              transformation in Figure 6-3:

               Table 6-4. External Procedure Initialization Properties

                Property      Value                        Expanded Value Passed to the External Procedure Library
                mytempdir     $PMTempDir                   /tmp
                memorysize    5000000                      5000000
                input_file    $PMSourceFileDir/file.in     /data/input/file.in
                output_file   $PMTargetFileDir/file.out    /data/output/file.out
                extra_var     $some_other_variable         $some_other_variable


               When you run the workflow, the Integration Service expands the property list and passes it to
               the external procedure initialization function. Assuming that the built-in process variables
               $PMTempDir, $PMSourceFileDir, and $PMTargetFileDir are set to /tmp, /data/input, and
               /data/output, respectively, the last column in Table 6-4 contains the property and expanded
               value information. Note that the Integration Service does not expand the last property,
               “$some_other_variable”, because it is not a built-in process variable.


External Procedure Interfaces
      The Integration Service uses the following major functions with External Procedures:
      ♦   Dispatch
      ♦   External procedure
      ♦   Property access
      ♦   Parameter access
      ♦   Code page access
      ♦   Transformation name access
      ♦   Procedure access
      ♦   Partition related
      ♦   Tracing level


    Dispatch Function
      The Integration Service calls the dispatch function to pass each input row to the external
      procedure module. The dispatch function, in turn, calls the external procedure function you
      specify.
      External procedures access the ports in the transformation directly using the member variable
      m_pInParamVector for input ports and m_pOutParamVector for output ports.

      Signature
      The dispatch function has a fixed signature which includes one index parameter.
              virtual INF_RESULT Dispatch(unsigned long ProcedureIndex) = 0;
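
       For illustration, a sketch of how a Dispatch implementation might route calls, reusing the
       Mymod example. The generated code may differ, and treating an unknown index as a row
       error is an assumption:
              INF_RESULT TxMymod::Dispatch(unsigned long ProcedureIndex)
              {
                  // Route the call to the external procedure selected by index.
                  switch (ProcedureIndex)
                  {
                      case 0:  return Myproc1();
                      case 1:  return Myproc2();
                      default: return INF_ROW_ERROR;   // assumed handling
                  }
              }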


    External Procedure Function
      The external procedure function is the main entry point into the external procedure module,
      and is an attribute of the External Procedure transformation. The dispatch function calls the
      external procedure function for every input row. For External Procedure transformations, use
      the external procedure function for input and output from the external procedure module.
      The function can access the IN and IN-OUT port values for every input row, and can set the
      OUT and IN-OUT port values. The external procedure function contains all the input and
      output processing logic.

      Signature
      The external procedure function has no parameters. The input parameter array is already
      passed through the InitParams() method and stored in the member variable
      m_pInParamVector. Each entry in the array matches the corresponding IN and IN-OUT
      ports of the External Procedure transformation, in the same order. The Integration Service
      fills this vector before calling the dispatch function.

Use the member variable m_pOutParamVector to pass the output row before returning
               from the Dispatch() function.
               For an external procedure function named MyFunc, the signature is the following, where the
               input parameters are in the member variable m_pInParamVector and the output values are in
               the member variable m_pOutParamVector:
                       INF_RESULT Tx<ModuleName>::MyFunc()


        Property Access Functions
              The property access functions provide information about the initialization properties
              associated with the External Procedure transformation. The initialization property names and
              values appear on the Initialization Properties tab when you edit the External Procedure
              transformation.
              Informatica provides property access functions in both the base class and the
              TINFConfigEntriesList class. Use the GetConfigEntryName() and GetConfigEntryValue()
              functions in the TINFConfigEntriesList class to access the initialization property name and
              value, respectively.

              Signature
              Informatica provides the following functions in the base class:
                      TINFConfigEntriesList*
                      TINFBaseExternalModule60::accessConfigEntriesList();

                      const char* GetConfigEntry(const char* LHS);

              Informatica provides the following functions in the TINFConfigEntriesList class:
                      const char* TINFConfigEntriesList::GetConfigEntryValue(const char* LHS);

                      const char* TINFConfigEntriesList::GetConfigEntryValue(int i);

                      const char* TINFConfigEntriesList::GetConfigEntryName(int i);
                       const char* TINFConfigEntriesList::GetConfigEntry(const char* LHS);

              Note: In the TINFConfigEntriesList class, use the GetConfigEntryName() and
              GetConfigEntryValue() property access functions to access the initialization property names
              and values.
               You can call these functions from a TX program. The TX program then converts this string
               value into a number, for example by using atoi or sscanf. In the following example,
               “addFactor” is an Initialization Property. accessConfigEntriesList() is a member function of
               the TX base class and does not need to be defined.
                       const char* addFactorStr = accessConfigEntriesList()->
                          GetConfigEntryValue("addFactor");
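
               Continuing the example, a short sketch of the numeric conversion described above
               (defaulting to 0 when the property is missing is an assumption):
                       #include <cstdlib>   // atoi

                       const char* addFactorStr =
                           accessConfigEntriesList()->GetConfigEntryValue("addFactor");
                       int addFactor = (addFactorStr != NULL) ? atoi(addFactorStr) : 0;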




Parameter Access Functions
  Parameter access functions are datatype specific. Use the parameter access function
  GetDataType to return the datatype of a parameter. Then use a parameter access function
  corresponding to this datatype to return information about the parameter.
  A parameter passed to an external procedure belongs to the datatype TINFParam*. The
  header file infparam.h defines the related access functions. The Designer generates stub code
  that includes comments indicating the parameter datatype. You can also determine the
  datatype of a parameter in the corresponding External Procedure transformation in the
  Designer.
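
   For illustration, a sketch of branching on the parameter datatype before reading a value
   (the helper functions are hypothetical):
        TINFParam* p = &m_pInParamVector[0];
        switch (p->GetDataType())
        {
            case INF_DATATYPE_LONG:
                HandleLong(p->GetLong());       // hypothetical helper
                break;
            case INF_DATATYPE_DOUBLE:
                HandleDouble(p->GetDouble());   // hypothetical helper
                break;
            default:
                break;
        }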

  Signature
   A parameter passed to an external procedure is a pointer to an object of the TINFParam class.
   GetDataType is a fixed-signature method of that class that returns the parameter datatype as
   an enum value.
   The valid datatypes are:
   ♦   INF_DATATYPE_LONG
   ♦   INF_DATATYPE_STRING
   ♦   INF_DATATYPE_DOUBLE
   ♦   INF_DATATYPE_RAW
   ♦   INF_DATATYPE_TIME
  Table 6-5 lists a brief description of some parameter access functions:

  Table 6-5. Descriptions of Parameter Access Functions

   Parameter Access Function                 Description

   INF_DATATYPE GetDataType(void);           Gets the datatype of a parameter. Use the parameter datatype to
                                             determine which datatype-specific function to use when accessing
                                             parameter values.

   INF_Boolean IsValid(void);                Verifies that input data is valid. Returns FALSE if the parameter contains
                                             truncated data and is a string.

   INF_Boolean IsNULL(void);                 Verifies that input data is NULL.

   INF_Boolean IsInputMapped (void);         Verifies that input port passing data to this parameter is connected to a
                                             transformation.

   INF_Boolean IsOutput Mapped (void);       Verifies that output port receiving data from this parameter is connected
                                             to a transformation.

   INF_Boolean IsInput(void);                Verifies that parameter corresponds to an input port.

   INF_Boolean IsOutput(void);               Verifies that parameter corresponds to an output port.

   const char* GetName(void);                Gets the name of the parameter.




                  SQLIndicator GetIndicator(void);             Gets the value of a parameter indicator. The IsValid and IsNULL
                                                               functions are special cases of this function. This function can also return
                                                               INF_SQL_DATA_TRUNCATED.

                  void SetIndicator(SQLIndicator Indicator);   Sets an output parameter indicator, such as invalid or truncated.

                  long GetLong(void);                          Gets the value of a parameter having a Long or Integer datatype. Call
                                                               this function only if you know the parameter datatype is Integer or Long.
                                                               This function does not convert data to Long from another datatype.

                  double GetDouble(void);                      Gets the value of a parameter having a Float or Double datatype. Call
                                                               this function only if you know the parameter datatype is Float or Double.
                                                               This function does not convert data to Double from another datatype.

                  char* GetString(void);                       Gets the value of a parameter as a null-terminated string. Call this
                                                               function only if you know the parameter datatype is String. This function
                                                               does not convert data to String from another datatype.
                                                               The value in the pointer changes when the next row of data is read. If
                                                               you want to store the value from a row for later use, explicitly copy this
                                                               string into its own allocated buffer.

                  char* GetRaw(void);                          Gets the value of a parameter as a non-null terminated byte array. Call
                                                               this function only if you know the parameter datatype is Raw. This
                                                               function does not convert data to Raw from another datatype.

                  unsigned long GetActualDataLen(void);        Gets the current length of the array returned by GetRaw.

                  TINFTime GetTime(void);                      Gets the value of a parameter having a Date/Time datatype. Call this
                                                               function only if you know the parameter datatype is Date/Time. This
                                                               function does not convert data to Date/Time from another datatype.

                  void SetLong(long lVal);                     Sets the value of an output parameter having a Long datatype.

                  void SetDouble(double dblVal);               Sets the value of an output parameter having a Double datatype.

                  void SetString(char* sVal);                  Sets the value of an output parameter having a String datatype.

                  void SetRaw(char* rVal, size_t               Sets a non-null terminated byte array.
                  ActualDataLen);

                  void SetTime(TINFTime timeVal);              Sets the value of an output parameter having a Date/Time datatype.
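              The following sketch shows how the functions in Table 6-5 might be combined in an
              external procedure. The Tx class, procedure, and parameter names are hypothetical, and
              INF_SUCCESS and INF_SQL_DATA_NULL are assumed to be the usual result and
              indicator constants:
                      INF_RESULT TxMyModule::CopyDouble(TINFParam* inVal, TINFParam* outVal)
                      {
                          // Propagate NULL or truncated input to the output indicator.
                          if (inVal->IsNULL() || !inVal->IsValid())
                          {
                              outVal->SetIndicator(INF_SQL_DATA_NULL);
                              return INF_SUCCESS;
                          }
                          // Call GetDouble only after confirming the datatype is Double.
                          if (inVal->GetDataType() == INF_DATATYPE_DOUBLE)
                              outVal->SetDouble(inVal->GetDouble());
                          return INF_SUCCESS;
                      }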


              When you run the external procedure on a 64-bit Integration Service, use the GetInt32 and
              SetInt32 functions instead of the following functions:
              ♦     GetLong
              ♦     SetLong
              ♦     GetpLong
              ♦     GetpDouble
              ♦     GetpTime
              The procedure passes parameters using two parameter lists: one for input parameters and
              one for output parameters.



Table 6-6 lists the member variables of the external procedure base class.

  Table 6-6. Member Variables of the External Procedure Base Class

   Variable                                Description

   m_nInParamCount                         Number of input parameters.

   m_pInParamVector                        Actual input parameter array.

   m_nOutParamCount                        Number of output parameters.

   m_pOutParamVector                       Actual output parameter array.
   Note: Ports defined as input/output show up in both parameter lists.
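  For illustration, a method of the generated Tx class might scan the input parameter list with
  these member variables; a minimal sketch:
        // Count the NULL input parameters for this row.
        int nullInputs = 0;
        for (int i = 0; i < m_nInParamCount; i++)
        {
            if (m_pInParamVector[i].IsNULL())
                nullInputs++;
        }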



Code Page Access Functions
  Informatica provides two code page access functions that return the code page of the
  Integration Service and two that return the code page of the data the external procedure
  processes. When the Integration Service runs in Unicode mode, the string data passing to the
  external procedure program can contain multibyte characters. The code page determines how
  the external procedure interprets a multibyte character string. When the Integration Service
  runs in Unicode mode, data processed by the external procedure program must be two-way
  compatible with the Integration Service code page.

  Signature
  Use the following functions to obtain the Integration Service code page through the external
  procedure program. Both functions return equivalent information.
          int GetServerCodePageID() const;
          const char* GetServerCodePageName() const;

  Use the following functions to obtain the code page of the data the external procedure
  processes through the external procedure program. Both functions return equivalent
  information.
          int GetDataCodePageID(); // returns 0 in case of error

          const char* GetDataCodePageName() const; // returns NULL in case of error
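  For example, an external procedure can verify that the data code page is available before
  interpreting string data; a sketch, assuming INF_FATAL_ERROR is the appropriate failure
  result:
          int dataCodePage = GetDataCodePageID();
          if (dataCodePage == 0)
          {
              // 0 indicates that the data code page could not be retrieved.
              return INF_FATAL_ERROR;
          }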


Transformation Name Access Functions
  Informatica provides two transformation name access functions that return the name of the
  External Procedure transformation. The GetWidgetName() function returns the name of the
  transformation, and the GetWidgetInstanceName() function returns the name of the
  transformation instance in the mapplet or mapping.

  Signature
  The char* returned by the transformation name access functions is an MBCS string in the
  code page of the Integration Service. It is not in the data code page.


          const char* GetWidgetInstanceName() const;

          const char* GetWidgetName() const;


        Procedure Access Functions
              Informatica provides two procedure access functions that provide information about the
              external procedure associated with the External Procedure transformation. The
              GetProcedureName() function returns the name of the external procedure specified in the
              Procedure Name field of the External Procedure transformation. The GetProcedureIndex()
              function returns the index of the external procedure.

              Signature
              Use the following function to get the name of the external procedure associated with the
              External Procedure transformation:
                      const char* GetProcedureName() const;

              Use the following function to get the index of the external procedure associated with the
              External Procedure transformation:
                      inline unsigned long GetProcedureIndex() const;


        Partition Related Functions
              Use partition related functions for external procedures in sessions with multiple partitions.
              When you partition a session that contains External Procedure transformations, the
              Integration Service creates instances of these transformations for each partition. For example,
              if you define five partitions for a session, the Integration Service creates five instances of each
              external procedure at session runtime.

              Signature
              Use the following function to obtain the number of partitions in a session:
                      unsigned long GetNumberOfPartitions();

              Use the following function to obtain the index of the partition that called this external
              procedure:
                      unsigned long GetPartitionIndex();
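              For example, an external procedure might label its log output by partition; a sketch, with
              printf (from <cstdio>) standing in for whatever logging the procedure uses:
                      unsigned long nParts = GetNumberOfPartitions();
                      unsigned long myPart = GetPartitionIndex();
                      printf("External procedure running in partition %lu of %lu\n", myPart, nParts);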




Tracing Level Function
  The tracing level function returns the session trace level. The trace levels are defined as
  follows:
        typedef enum
        {
            TRACE_UNSET = 0,
            TRACE_TERSE = 1,
            TRACE_NORMAL = 2,
            TRACE_VERBOSE_INIT = 3,
            TRACE_VERBOSE_DATA = 4
        } TracingLevelType;


  Signature
  Use the following function to return the session trace level:
        TracingLevelType GetSessionTraceLevel();
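  For example, an external procedure can restrict expensive diagnostics to verbose tracing
  levels; a minimal sketch (printf again stands in for the procedure's logging):
        if (GetSessionTraceLevel() >= TRACE_VERBOSE_DATA)
        {
            // Emit row-level detail only when the session traces verbose data.
            printf("Processing row in partition %lu\n", GetPartitionIndex());
        }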




Chapter 7




Filter Transformation


    This chapter includes the following topics:
    ♦   Overview, 190
    ♦   Filter Condition, 192
    ♦   Creating a Filter Transformation, 193
    ♦   Tips, 195
    ♦   Troubleshooting, 196




Overview
                     Transformation type:
                     Active
                     Connected


              You can filter rows in a mapping with the Filter transformation. You pass all the rows from a
              source transformation through the Filter transformation, and then enter a filter condition for
              the transformation. All ports in a Filter transformation are input/output, and only rows that
              meet the condition pass through the Filter transformation.
              In some cases, you need to filter data based on one or more conditions before writing it to
              targets. For example, if you have a human resources target containing information about
              current employees, you might want to filter out employees who are part-time and hourly.
              The mapping in Figure 7-1 passes the rows from a human resources table that contains
              employee data through a Filter transformation. The filter allows rows through only for
              employees that make salaries greater than $30,000.

              Figure 7-1. Sample Mapping with a Filter Transformation




Figure 7-2 shows the filter condition used in the mapping in Figure 7-1 on page 190:

Figure 7-2. Specifying a Filter Condition in a Filter Transformation




With the filter of SALARY > 30000, only rows for employees who make salaries greater
than $30,000 pass through to the target.
As an active transformation, the Filter transformation may change the number of rows passed
through it. A filter condition returns TRUE or FALSE for each row that passes through the
transformation, depending on whether a row meets the specified condition. Only rows that
return TRUE pass through this transformation. Discarded rows do not appear in the session
log or reject files.
To maximize session performance, include the Filter transformation as close to the sources in
the mapping as possible. Rather than passing rows you plan to discard through the mapping,
you then filter out unwanted data early in the flow of data from sources to targets.
You cannot concatenate ports from more than one transformation into the Filter
transformation. The input ports for the filter must come from a single transformation. The
Filter transformation does not allow setting output default values.




Filter Condition
              You use the transformation language to enter the filter condition. The condition is an
              expression that returns TRUE or FALSE. For example, if you want to filter out rows for
              employees whose salary is $30,000 or less, you enter the following condition:
                      SALARY > 30000

              You can specify multiple components of the condition, using the AND and OR logical
              operators. If you want to filter out employees who make less than $30,000 or more than
              $100,000, you enter the following condition:
                      SALARY > 30000 AND SALARY < 100000

              You do not need to specify TRUE or FALSE as values in the expression. TRUE and FALSE
              are implicit return values from any condition you set. If the filter condition evaluates to
              NULL, the row is treated as FALSE and discarded.
              Enter conditions using the Expression Editor, available from the Properties tab of the Filter
              transformation. The filter condition is case sensitive. Any expression that returns a single
              value can be used as a filter. You can also enter a constant for the filter condition. The
              numeric equivalent of FALSE is zero (0). Any non-zero value is the equivalent of TRUE. For
              example, if you have a port called NUMBER_OF_UNITS with a numeric datatype, a filter
              condition of NUMBER_OF_UNITS returns FALSE if the value of NUMBER_OF_UNITS
              equals zero. Otherwise, the condition returns TRUE.
              After entering the expression, you can validate it by clicking the Validate button in the
              Expression Editor. When you enter an expression, validate it before continuing to avoid
              saving an invalid mapping to the repository. If a mapping contains syntax errors in an
              expression, you cannot run any session that uses the mapping until you correct the error.




Creating a Filter Transformation
       Creating a Filter transformation requires inserting the new transformation into the mapping,
       adding the appropriate input/output ports, and writing the condition.

       To create a Filter transformation:

       1.   In the Designer, switch to the Mapping Designer and open a mapping.
       2.   Click Transformation > Create.
            Select Filter transformation, and enter the name of the new transformation. The naming
            convention for the Filter transformation is FIL_TransformationName. Click Create, and
            then click Done.
       3.   Select and drag all the ports from a source qualifier or other transformation to add them
            to the Filter transformation.
            After you select and drag ports, copies of these ports appear in the Filter transformation.
            Each column has both an input and an output port.
       4.   Double-click the title bar of the new transformation.
       5.   Click the Properties tab.
            A default condition appears in the list of conditions. The default condition is TRUE (a
            constant with a numeric value of 1).








       6.   Click the Value section of the condition, and then click the Open button.
            The Expression Editor appears.




7.    Enter the filter condition you want to apply.
                    Use values from one of the input ports in the transformation as part of this condition.
                    However, you can also use values from output ports in other transformations.
              8.    Click Validate to check the syntax of the conditions you entered.
                    You may have to fix syntax errors before continuing.
              9.    Click OK.
              10.   Select the Tracing Level, and click OK to return to the Mapping Designer.
              11.   Click Repository > Save to save the mapping.




Tips
       Use the Filter transformation early in the mapping.
       To maximize session performance, keep the Filter transformation as close as possible to the
       sources in the mapping. Rather than passing rows that you plan to discard through the
       mapping, you can filter out unwanted data early in the flow of data from sources to targets.

       Use the Source Qualifier transformation to filter.
       The Source Qualifier transformation provides an alternate way to filter rows. Rather than
       filtering rows from within a mapping, the Source Qualifier transformation filters rows when
       read from a source. The main difference is that the source qualifier limits the row set extracted
       from a source, while the Filter transformation limits the row set sent to a target. Since a source
       qualifier reduces the number of rows used throughout the mapping, it provides better
       performance.
       However, the Source Qualifier transformation only lets you filter rows from relational sources,
       while the Filter transformation filters rows from any type of source. Also, note that since it
       runs in the database, you must make sure that the filter condition in the Source Qualifier
       transformation only uses standard SQL. The Filter transformation can define a condition
       using any statement or transformation function that returns either a TRUE or FALSE value.
       For more information about setting a filter for a Source Qualifier transformation, see “Source
       Qualifier Transformation” on page 445.




Troubleshooting
              I imported a flat file into another database (Microsoft Access) and used SQL filter queries to
              determine the number of rows to import into the Designer. But when I import the flat file into
              the Designer and pass data through a Filter transformation using equivalent SQL
              statements, I do not import as many rows. Why is there a difference?
              You might want to check two possible solutions:
              ♦   Case sensitivity. The filter condition is case sensitive, and queries in some databases do not
                  take this into account.
              ♦   Appended spaces. If a field contains additional spaces, the filter condition needs to check
                  for additional spaces for the length of the field. Use the RTRIM function to remove
                  additional spaces, as shown in the example below.
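              For example, the following condition (using a hypothetical FIRST_NAME port) ignores
              trailing spaces when comparing values:
                      RTRIM(FIRST_NAME) = 'John'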

              How do I filter out rows with null values?
              To filter out rows containing null values or spaces, use the ISNULL and IS_SPACES
              functions to test the value of the port. For example, if you want to filter out rows that contain
              NULLs in the FIRST_NAME port, use the following condition:
                      IIF(ISNULL(FIRST_NAME),FALSE,TRUE)

              This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and
              the row should be discarded. Otherwise, the row passes through to the next transformation.
              For more information about the ISNULL and IS_SPACES functions, see “Functions” in the
              Transformation Language Reference.




Chapter 8




HTTP Transformation


   This chapter includes the following topics:
   ♦   Overview, 198
   ♦   Creating an HTTP Transformation, 200
   ♦   Configuring the Properties Tab, 202
   ♦   Configuring the HTTP Tab, 204
   ♦   Examples, 209




Overview
                    Transformation type:
                    Passive
                    Connected


             The HTTP transformation enables you to connect to an HTTP server to use its services and
             applications. When you run a session with an HTTP transformation, the Integration Service
             connects to the HTTP server and issues a request to retrieve data from or update data on the
             HTTP server, depending on how you configure the transformation:
             ♦   Read data from an HTTP server. When the Integration Service reads data from an HTTP
                 server, it retrieves the data from the HTTP server and passes the data to the target or a
                 downstream transformation in the mapping. For example, you can connect to an HTTP
                 server to read current inventory data, perform calculations on the data during the
                 PowerCenter session, and pass the data to the target.
             ♦   Update data on the HTTP server. When the Integration Service writes to an HTTP
                 server, it posts data to the HTTP server and passes HTTP server responses to the target or
                 a downstream transformation in the mapping. For example, you can post data providing
                 scheduling information from upstream transformations to the HTTP server during a
                 session.
             Figure 8-1 shows how the Integration Service processes an HTTP transformation:

             Figure 8-1. HTTP Transformation Processing
              [The diagram shows data flowing from the source through the HTTP transformation to the
              target, with the Integration Service sending an HTTP request to the HTTP server and
              receiving an HTTP response.]
             The Integration Service passes data from upstream transformations or the source to the
             HTTP transformation, reads a URL configured in the HTTP transformation or application
             connection, and sends an HTTP request to the HTTP server to either read or update data.
             Requests contain header information and may contain body information. The header
             contains information such as authentication parameters, commands to activate programs or
             web services residing on the HTTP server, and other information that applies to the entire
             HTTP request. The body contains the data the Integration Service sends to the HTTP server.



When the Integration Service sends a request to read data, the HTTP server sends back an
   HTTP response with the requested data. The Integration Service sends the requested data to
   downstream transformations or the target.
   When the Integration Service sends a request to update data, the HTTP server writes the data
   it receives and sends back an HTTP response that the update succeeded. The HTTP
   transformation considers response codes 200 and 202 as a success. It considers all other
   response codes as failures. The session log displays an error when an HTTP server passes a
   response code that is considered a failure to the HTTP transformation. The Integration
   Service then sends the HTTP response to downstream transformations or the target.
    You can configure ports in the HTTP transformation to read the headers of HTTP
    responses. HTTP response body data passes through the HTTPOUT output port.


Authentication
   The HTTP transformation uses the following forms of authentication:
   ♦   Basic. Based on a non-encrypted user name and password.
   ♦   Digest. Based on an encrypted user name and password.
   ♦   NTLM. Based on encrypted user name, password, and domain.


Connecting to the HTTP Server
   When you configure an HTTP transformation, you can configure the URL for the
   connection. You can also create an HTTP connection object in the Workflow Manager.
   Configure an HTTP application connection in the following circumstances:
   ♦   The HTTP server requires authentication.
   ♦   You want to configure the connection timeout.
   ♦   You want to override the base URL in the HTTP transformation.
   For information about configuring the HTTP connection object, see the Workflow
   Administration Guide.




Creating an HTTP Transformation
             You create HTTP transformations in the Transformation Developer or in the Mapping
             Designer. An HTTP transformation has the following tabs:
             ♦   Transformation. Configure the name and description for the transformation.
             ♦   Ports. View input and output ports for the transformation. You cannot add or edit ports
                 on the Ports tab. The Designer creates ports on the Ports tab when you add ports to the
                 header group on the HTTP tab. For more information, see “Configuring Groups and
                 Ports” on page 205.
             ♦   Properties. Configure properties for the HTTP transformation on the Properties tab. For
                 more information, see “Configuring the Properties Tab” on page 202.
             ♦   Initialization Properties. You can define properties that the external procedure uses at run
                 time, such as during initialization. For more information about creating initialization
                 properties, see “Working with Procedure Properties” on page 72.
             ♦   Metadata Extensions. You can specify the property name, datatype, precision, and value.
                 Use metadata extensions for passing information to the procedure. For more information
                 about creating metadata extensions, see “Metadata Extensions” in the Repository Guide.
             ♦   Port Attribute Definitions. You can view port attributes for HTTP transformation ports.
                 You cannot edit port attribute definitions.
             ♦   HTTP. Configure the method, ports, and URL on the HTTP tab. For more information,
                 see “Configuring the HTTP Tab” on page 204.




Figure 8-2 shows an HTTP transformation:

Figure 8-2. HTTP Transformation




To create an HTTP transformation:

1.   In the Transformation Developer or Mapping Designer, click Transformation > Create.
2.   Select HTTP transformation.
3.   Enter a name for the transformation.
4.   Click Create.
      The HTTP transformation appears in the workspace.
5.   Click Done.
6.   Configure the tabs in the transformation.




Configuring the Properties Tab
             The HTTP transformation is built using the Custom transformation. Some Custom
             transformation properties do not apply to the HTTP transformation or are not configurable.
             Figure 8-3 shows the Properties tab of an HTTP transformation:

             Figure 8-3. HTTP Transformation Properties Tab




             Table 8-1 describes the HTTP transformation properties that you can configure:

             Table 8-1. HTTP Transformation Properties

               Option                  Description

               Runtime Location        Location that contains the DLL or shared library. Default is $PMExtProcDir. Enter a path
                                       relative to the Integration Service machine that runs the session using the HTTP
                                       transformation.
                                        If this property is blank, the Integration Service uses the environment variable
                                        defined on the Integration Service machine to locate the DLL or shared library.
                                       You must copy all DLLs or shared libraries to the runtime location or to the environment
                                       variable defined on the Integration Service machine. The Integration Service fails to load the
                                       procedure when it cannot locate the DLL, shared library, or a referenced file.

               Tracing Level           Amount of detail displayed in the session log for this transformation. Default is Normal.





 Is Partitionable         Indicates if you can create multiple partitions in a pipeline that uses this transformation:
                          - No. The transformation cannot be partitioned. The transformation and other
                            transformations in the same pipeline are limited to one partition.
                          - Locally. The transformation can be partitioned, but the Integration Service must run all
                            partitions in the pipeline on the same node. Choose Local when different partitions of the
                            Custom transformation must share objects in memory.
                          - Across Grid. The transformation can be partitioned, and the Integration Service can
                            distribute each partition to different nodes.
                          Default is No.
                          For more information about using partitioning, see the Workflow Administration Guide.

 Requires Single Thread   Indicates if the Integration Service processes each partition of the procedure with one
 Per Partition            thread. When you enable this option, the procedure code can use thread-specific operations.
                          Default is enabled.
                          For more information about writing thread-specific operations, see “Working with Thread-
                          Specific Procedure Code” on page 66.




Configuring the HTTP Tab
             On the HTTP tab, you can configure the transformation to read data from the HTTP server
             or write data to the HTTP server. Configure the following information on the HTTP tab:
             ♦   Select the method. Select GET, POST, or SIMPLE POST method based on whether you
                 want to read data from or write data to an HTTP server. For more information, see
                 “Selecting a Method” on page 204.
             ♦   Configure groups and ports. Manage HTTP request/response body and header details by
                 configuring input and output ports. You can also configure port names with special
                 characters. For more information, see “Configuring Groups and Ports” on page 205.
             ♦   Configure a base URL. Configure the base URL for the HTTP server you want to connect
                 to. For more information, see “Configuring a URL” on page 207.
             Figure 8-4 shows the HTTP tab of an HTTP transformation:

             Figure 8-4. HTTP Transformation HTTP Tab




              [The HTTP tab shows the method selection, the groups, the HTTP names, and the base
              URL.]




        Selecting a Method
             The groups and ports you define in a transformation depend on the method you select. To
             read data from an HTTP server, select the GET method. To write data to an HTTP server,
             select the POST or SIMPLE POST method.



Table 8-2 explains the different methods:

  Table 8-2. HTTP Transformation Methods

      Method          Description

      GET             Reads data from an HTTP server.

      POST            Writes data from multiple input ports to the HTTP server.

      SIMPLE POST     A simplified version of the POST method. Writes data from one input port as a single block of
                      data to the HTTP server.


  To define the metadata for the HTTP request, you must configure input and output ports
  based on the method you select:
  ♦    GET method. Use the input group to add input ports that the Designer uses to construct
       the final URL for the HTTP server.
  ♦    POST or SIMPLE POST method. Use the input group for the data that defines the body
       of the HTTP request.
  For all methods, use the header group for the HTTP request header information.


Configuring Groups and Ports
  The ports you add to an HTTP transformation depend on the method you choose and the
  group. An HTTP transformation uses the following groups:
  ♦    Output. Contains body data for the HTTP response. Passes responses from the HTTP
       server to downstream transformations or the target. By default, contains one output port,
       HTTPOUT. You cannot add ports to the output group. You can modify the precision for
       the HTTPOUT output port.
  ♦    Input. Contains body data for the HTTP request. Also contains metadata the Designer
       uses to construct the final URL to connect to the HTTP server. To write data to an HTTP
       server, the input group passes body information to the HTTP server. By default, contains
       one input port.
  ♦    Header. Contains header data for the request and response. Passes header information to
       the HTTP server when the Integration Service sends an HTTP request. Ports you add to
        the header group pass data for HTTP headers. When you add ports to the header group,
        the Designer adds ports to the input and output groups on the Ports tab. By default,
       contains no ports.
  Note: The data that passes through an HTTP transformation must be of the String datatype.
  String data includes any markup language common in HTTP communication, such as
  HTML and XML.




Table 8-3 describes the groups and ports for the GET method:

             Table 8-3. GET Method Groups and Ports

               Request/
                               Group    Description
               Response

               REQUEST         Input    The Designer uses the names and values of the input ports to construct the final URL.

                               Header   You can configure input and input/output ports for HTTP requests. The Designer adds
                                        ports to the input and output groups based on the ports you add to the header group:
                                        - Input group. Creates input ports based on input and input/output ports from the header
                                          group.
                                        - Output group. Creates output ports based on input/output ports from the header group.

               RESPONSE        Header   You can configure output and input/output ports for HTTP responses. The Designer adds
                                        ports to the input and output groups based on the ports you add to the header group:
                                        - Input group. Creates input ports based on input/output ports from the header group.
                                        - Output group. Creates output ports based on output and input/output ports from the
                                          header group.

                               Output   All body data for an HTTP response passes through the HTTPOUT output port.


             Table 8-4 describes the ports for the POST method:

             Table 8-4. POST Method Groups and Ports

               Request/
                               Group    Description
               Response

               REQUEST         Input    You can add multiple ports to the input group. Body data for an HTTP request can pass
                                        through one or more input ports based on what you add to the header group.

                               Header   You can configure input and input/output ports for HTTP requests. The Designer adds
                                        ports to the input and output groups based on the ports you add to the header group:
                                        - Input group. Creates input ports based on input and input/output ports from the header
                                          group.
                                        - Output group. Creates output ports based on input/output ports from the header group.

               RESPONSE        Header   You can configure output and input/output ports for HTTP responses. The Designer adds
                                        ports to the input and output groups based on the ports you add to the header group:
                                        - Input group. Creates input ports based on input/output ports from the header group.
                                        - Output group. Creates output ports based on output and input/output ports from the
                                          header group.

                               Output   All body data for an HTTP response passes through the HTTPOUT output port.




Table 8-5 describes the ports for the SIMPLE POST method:

  Table 8-5. SIMPLE POST Method Groups and Ports

   Request/
                  Group    Description
   Response

   REQUEST        Input    You can add one input port. Body data for an HTTP request can pass through one input
                           port.

                  Header   You can configure input and input/output ports for HTTP requests. The Designer adds
                           ports to the input and output groups based on the ports you add to the header group:
                           - Input group. Creates input ports based on input and input/output ports from the header
                             group.
                           - Output group. Creates output ports based on input/output ports from the header group.

   RESPONSE       Header   You can configure output and input/output ports for HTTP responses. The Designer adds
                           ports to the input and output groups based on the ports you add to the header group:
                           - Input group. Creates input ports based on input/output ports from the header group.
                           - Output group. Creates output ports based on output and input/output ports from the
                             header group.

                  Output   All body data for an HTTP response passes through the HTTPOUT output port.


  Adding an HTTP Name
  The Designer does not allow special characters, such as a dash (-), in port names. If you need
  to use special characters in a port name, you can configure an HTTP name to override the
  name of a port. For example, if you want an input port named Content-type, you can name
  the port ContentType and enter Content-Type as the HTTP name.


Configuring a URL
  After you select a method and configure input and output ports, you must configure a URL.
  Enter a base URL, and the Designer constructs the final URL. If you select the GET method,
  the final URL contains the base URL and parameters based on the port names in the input
  group. If you select the POST or SIMPLE POST methods, the final URL is the same as the
  base URL.
  You can also specify a URL when you configure an HTTP application connection. The base
  URL specified in the HTTP application connection overrides the base URL specified in the
  HTTP transformation.
  Note: An HTTP server can redirect an HTTP request to another HTTP server. When this
  occurs, the HTTP server sends a URL back to the Integration Service, which then establishes
  a connection to the other HTTP server. The Integration Service can establish a maximum of
  five additional connections.

  Final URL Construction for GET Method
  The Designer constructs the final URL for the GET method based on the base URL and port
  names in the input group. It appends HTTP arguments to the base URL to construct the
  final URL in the form of an HTTP query string. A query string consists of a question mark


(?), followed by name/value pairs. The Designer appends the question mark and the name/
             value pairs that correspond to the names and values of the input ports you add to the input
             group.
             When you select the GET method and add input ports to the input group, the Designer
             appends the following group and port information to the base URL to construct the final
             URL:
                      ?<input group input port 1 name>=$<input group input port 1 value>

             For each input port following the first input group input port, the Designer appends the
             following group and port information:
                      &<input group input port n name>=$<input group input port n value>

             where n represents the input port.
             For example, if you enter www.company.com for the base URL and add the input ports ID,
             EmpName, and Department to the input group, the Designer constructs the following final
             URL:
                     www.company.com?ID=$ID&EmpName=$EmpName&Department=$Department

              You can edit the final URL to modify or add operators, variables, or other arguments, as in
              the example below. For more information about HTTP requests and query strings, see
              http://www.w3c.org.
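              For example, you might edit the constructed URL to append a constant argument; the
              Region parameter here is hypothetical:
                      www.company.com?ID=$ID&EmpName=$EmpName&Department=$Department&Region=US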




Examples
     This section contains examples for each type of method:
     ♦   GET
     ♦   POST
     ♦   SIMPLE POST


   GET Example
     The source file used with this example contains the following data:
            78576
            78577
            78578

     Figure 8-5 shows the HTTP tab of the HTTP transformation for the GET example:

     Figure 8-5. HTTP Tab for a GET Example




      The Designer appends a question mark (?), the input group input port name, an equals sign
      (=), a dollar sign ($), and the input group input port name again to the base URL to
      construct the final URL:
            http://www.informatica.com?CR=$CR




The Integration Service sends the source file values to the CR input port of the HTTP
             transformation and sends the following HTTP requests to the HTTP server:
                     http://www.informatica.com?CR=78576
                     http://www.informatica.com?CR=78577
                     http://www.informatica.com?CR=78578

             The HTTP server sends an HTTP response back to the Integration Service, which sends the
             data through the output port of the HTTP transformation to the target.


        POST Example
             The source file used with this example contains the following data:
                     33,44,1
                     44,55,2
                     100,66,0

             Figure 8-6 shows that each field in the source file has a corresponding input port:

             Figure 8-6. HTTP Tab for a POST Example




             The Integration Service sends the values of the three fields for each row through the input
             ports of the HTTP transformation and sends the HTTP request to the HTTP server specified
             in the final URL.




SIMPLE POST Example
  The following text shows the XML file used with this example:
          <?xml version="1.0" encoding="UTF-8"?>
          <n4:Envelope xmlns:cli="http://localhost:8080/axis/Clienttest1.jws"
          xmlns:n4="http://schemas.xmlsoap.org/soap/envelope/"
          xmlns:tns="http://schemas.xmlsoap.org/soap/encoding/"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance/"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema">
          <n4:Header>
          </n4:Header>
          <n4:Body n4:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
          <cli:smplsource>
          <Metadatainfo xsi:type="xsd:string">smplsourceRequest.Metadatainfo106</Metadatainfo>
          </cli:smplsource>
          </n4:Body>
          </n4:Envelope>,capeconnect:Clienttest1services:Clienttest1#smplsource

  Figure 8-7 shows the HTTP tab of the HTTP transformation for the SIMPLE POST
  example:

  Figure 8-7. HTTP Tab for a SIMPLE POST Example




  The Integration Service sends the body of the source file through the input port and sends the
  HTTP request to the HTTP server specified in the final URL.




Chapter 9




Java Transformation


   This chapter includes the following topics:
   ♦   Overview, 214
   ♦   Using the Java Code Tab, 217
   ♦   Configuring Ports, 219
   ♦   Configuring Java Transformation Properties, 221
   ♦   Developing Java Code, 225
   ♦   Configuring Java Transformation Settings, 229
   ♦   Compiling a Java Transformation, 231
   ♦   Fixing Compilation Errors, 232




Overview
                     Transformation type:
                     Active/Passive
                     Connected


              You can extend PowerCenter functionality with the Java transformation. The Java
              transformation provides a simple native programming interface to define transformation
              functionality with the Java programming language. You can use the Java transformation to
              quickly define simple or moderately complex transformation functionality without advanced
              knowledge of the Java programming language or an external Java development environment.
              For example, you can define transformation logic to loop through input rows and generate
              multiple output rows based on a specific condition. You can also use expressions, user-defined
              functions, unconnected transformations, and mapping variables in the Java code.
              You create Java transformations by writing Java code snippets that define transformation
              logic. You can use Java transformation API methods and standard Java language constructs.
              For example, you can use static code and variables, instance variables, and Java methods. You
              can use third-party Java APIs, built-in Java packages, or custom Java packages. You can also
              define and use Java expressions to call expressions from within a Java transformation.
              For more information about the Java transformation API methods, see “Java Transformation
              API Reference” on page 237. For more information about using Java expressions, see “Java
              Expressions” on page 263.
               The PowerCenter Client uses the Java Development Kit (JDK) to compile the Java code and
               generate byte code for the transformation. When you run a session with a Java
               transformation, the Integration Service uses the Java Runtime Environment (JRE) to execute
               the byte code, process input rows, and generate output rows.
              You can define transformation behavior for a Java transformation based on the following
              events:
              ♦    The transformation receives an input row
              ♦    The transformation has processed all input rows
              ♦    The transformation receives a transaction notification such as commit or rollback


        Steps to Define a Java Transformation
              Complete the following steps to write and compile Java code and fix compilation errors in a
              Java transformation:
              1.    Create the transformation in the Transformation Developer or Mapping Designer.
              2.    Configure input and output ports and groups for the transformation. Use port names as
                    variables in Java code snippets. For more information, see “Configuring Ports” on
                    page 219.


3.   Configure the transformation properties. For more information, see “Configuring Java
       Transformation Properties” on page 221.
  4.   Use the code entry tabs in the transformation to write and compile the Java code for the
       transformation. For more information, see “Developing Java Code” on page 225 and
       “Compiling a Java Transformation” on page 231.
  5.   Locate and fix compilation errors in the Java code for the transformation. For more
       information, see “Fixing Compilation Errors” on page 232.


Active and Passive Java Transformations
  You can create active and passive Java transformations. You select the type of Java
  transformation when you create the transformation. After you set the transformation type,
  you cannot change it. Active and passive Java transformations run the Java code in the On
  Input Row tab for the transformation one time for each row of input data.
  Use an active transformation when you want to generate more than one output row for each
   input row in the transformation. You must use the Java transformation generateRow API
   method to generate an output row. For example, a Java transformation contains two input
   ports that represent a start date and an end date. You can generate an output row for each
   date between the start date and end date, as shown in the sketch below.
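   The following On Input Row snippet sketches this pattern. START_DATE, END_DATE,
   and OUT_DATE are hypothetical Date/Time ports, which the Java transformation exposes
   as long values (milliseconds since January 1, 1970 GMT):
        long MS_PER_DAY = 24L * 60L * 60L * 1000L;
        for (long d = START_DATE; d <= END_DATE; d += MS_PER_DAY)
        {
            OUT_DATE = d;      // assign the output port for this row
            generateRow();     // emit one output row per date
        }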
  Use a passive transformation when you need one output row for each input row in the
  transformation. Passive transformations generate an output row after processing each input
  row.


Datatype Mapping
  The Java transformation maps PowerCenter datatypes to Java primitives, based on the Java
  transformation port type. The Java transformation maps input port datatypes to Java
  primitives when it reads input rows, and it maps Java primitives to output port datatypes
  when it writes output rows.
   For example, if an input port in a Java transformation has an Integer datatype, the Java
   transformation maps it to the Java int primitive. The transformation treats the port value as
   an int in the Java code, and maps the int primitive back to an Integer datatype when the
   transformation generates the output row.
  Table 9-1 shows the mapping between PowerCenter datatypes and Java primitives by a Java
  transformation:

  Table 9-1. Mapping from PowerCenter Datatypes to Java Datatypes

   PowerCenter Datatype      Java Datatype

   CHAR                      String

   BINARY                    byte[]

   LONG (INT32)              int





               DOUBLE                          double

               DECIMAL                         double *
                                               BigDecimal

               Date/Time                       long (number of milliseconds since
                                               January 1, 1970 00:00:00.000 GMT)
               * For more information about configuring the Java datatype for PowerCenter
               Decimal datatypes, see “Enabling High Precision” on page 230.


               String and byte[] are object datatypes in Java; int, double, and long are primitive datatypes.
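               For example, because Date/Time ports arrive as long millisecond values, a code snippet can
               wrap them in a java.util.Date for formatting; HIRE_DATE is a hypothetical input port:
                       // HIRE_DATE holds milliseconds since January 1, 1970 00:00:00.000 GMT.
                       java.util.Date hired = new java.util.Date(HIRE_DATE);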




Using the Java Code Tab
       Use the Java Code tab to define, compile, and fix compilation errors in Java code. You can
       write code snippets that import Java packages; define static code or a static block, instance
       variables, and user-defined methods; define and call Java expressions; and define
       transformation logic. Create code snippets in the code entry tabs.
      After you develop code snippets, you can compile the Java code and view the results of the
      compilation in the Output window or view the full Java code.
      Figure 9-1 shows the components of the Java Code tab:

      Figure 9-1. Java Code Tab Components




       [The Java Code tab contains the Navigator, the code window, the code entry tabs, and the
       Output window.]




      The Java Code tab contains the following components:
      ♦   Navigator. Add input or output ports or APIs to a code snippet. The Navigator lists the
          input and output ports for the transformation, the available Java transformation APIs, and
          a description of the port or API function. For input and output ports, the description
          includes the port name, type, datatype, precision, and scale. For API functions, the
          description includes the syntax and use of the API function.
          The Navigator disables any port or API function that is unavailable for the code entry tab.
          For example, you cannot add ports or call API functions from the Import Packages code
          entry tab.




For more information about using the Navigator when you develop Java code, see
                  “Developing Java Code” on page 225.
              ♦   Code window. Develop Java code for the transformation. The code window uses basic Java
                  syntax highlighting. For more information, see “Developing Java Code” on page 225.
              ♦   Code entry tabs. Define transformation behavior. Each code entry tab has an associated
                  Code window. To enter Java code for a code entry tab, click the tab and write Java code in
                  the Code window. For more information about the code entry tabs, see “Developing Java
                  Code” on page 225.
              ♦   Define Expression link. Launches the Define Expression dialog box that you use to create
                  Java expressions. For more information about creating and using Java expressions, see
                  “Java Expressions” on page 263.
              ♦   Settings link. Launches the Settings dialog box that you use to set the classpath for third-
                  party and custom Java packages and to enable high precision for Decimal datatypes. For
                  more information, see “Configuring Java Transformation Settings” on page 229.
              ♦   Compile link. Compiles the Java code for the transformation. Output from the Java
                  compiler, including error and informational messages, appears in the Output window. For
                  more information about compiling Java transformations, see “Compiling a Java
                  Transformation” on page 231.
              ♦   Full Code link. Opens the Full Code window to display the complete class code for the
                  Java transformation. The complete code for the transformation includes the Java code
                  from the code entry tabs added to the Java transformation class template. For more
                  information about using the Full Code window, see “Fixing Compilation Errors” on
                  page 232.
              ♦   Output window. Displays the compilation results for the Java transformation class. You
                  can right-click an error message in the Output window to locate the error in the snippet
                  code or the full code for the Java transformation class in the Full Code window. You can
                  also double-click an error in the Output window to locate the source of the error. For
                  more information about using the Output window to troubleshoot compilation errors, see
                  “Fixing Compilation Errors” on page 232.




Configuring Ports
      A Java transformation can have input ports, output ports, and input/output ports. You create
      and edit groups and ports on the Ports tab. You can specify default values for ports. After you
      add ports to a transformation, use the port names as variables in Java code snippets.
      Figure 9-2 shows the Ports tab for a Java transformation with one input group and one output
      group:

      Figure 9-2. Java Transformation Ports Tab

      [Figure not shown. Callouts identify controls to add and delete groups and edit port
      relationships, the input group, the output group, and where to set a default value.]
    Creating Groups and Ports
      When you create a Java transformation, it includes one input group and one output group,
      and it must always have exactly one of each. You can change an existing group name by
      typing in the group header. If you delete a group, you can add a new group by clicking the
      Create Input Group or Create Output Group icon. The transformation is not valid if it has
      multiple input or output groups.
      To create a port, click the Add button. When you create a port, the Designer adds it below the
      currently selected row or group.
      For guidelines about creating and editing groups and ports, see “Working with Groups and
      Ports” on page 59.




Setting Default Port Values
              You can define default values for ports in a Java transformation. The Java transformation
              initializes port variables with the default port value, depending on the datatype of the port.
              For more information about port datatypes, see “Datatype Mapping” on page 215.

              Input and Output Ports
              The Java transformation initializes the value of unconnected input ports or output ports that
              are not assigned a value in the Java code snippets. The Java transformation initializes the ports
              depending on the port datatype:
              ♦   Simple datatypes. If you define a default value for the port, the transformation initializes
                  the value of the port variable to the default value. Otherwise, it initializes the value of the
                  port variable to 0.
              ♦   Complex datatypes. If you provide a default value for the port, the transformation creates
                  a new String or byte[] object, and initializes the object to the default value. Otherwise, the
                  transformation initializes the port variable to NULL.
                  Input ports with a NULL value generate a NullPointerException if you access the value of
                  the port variable in the Java code.
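                   For example, the following On Input Row snippet is a minimal sketch of guarding a
                   complex-type port against a NULL value. It assumes a String input port IN_NAME and
                   a String output port OUT_NAME; the port names are illustrative:
                      if (!isNull("IN_NAME")) {
                           // IN_NAME is non-null here, so String methods are safe to call.
                           OUT_NAME = IN_NAME.trim();
                      } else {
                           // Propagate the null explicitly instead of risking a
                           // NullPointerException.
                           setNull("OUT_NAME");
                      }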

              Input/Output Ports
              The Java transformation treats input/output ports as pass-through ports. If you do not set a
              value for the port in the Java code for the transformation, the output value is the same as the
              input value. The Java transformation initializes the value of an input/output port in the same
              way as an input port.
              If you set the value of a port variable for an input/output port in the Java code, the Java
              transformation uses this value when it generates an output row. If you do not set the value of
              an input/output port, the Java transformation sets the value of the port variable to 0 for
              simple datatypes and NULL for complex datatypes when it generates an output row.




Configuring Java Transformation Properties
      The Java transformation includes properties for both the transformation code and the
      transformation. If you create a Java transformation in the Transformation Developer, you can
      override the transformation properties when you use it in a mapping.
      Figure 9-3 shows the Java transformation Properties tab:

      Figure 9-3. Java Transformation Properties




      Table 9-2 describes the Java transformation properties:

      Table 9-2. Java Transformation Properties

        Property                 Required/Optional   Description

       Language                 Required           Language used for the transformation code. You cannot change this
                                                   value.

       Class Name               Required           Name of the Java class for the transformation. You cannot change this
                                                   value.




               Tracing Level            Required          Amount of detail displayed in the session log for this transformation. Use
                                                          the following tracing levels:
                                                          - Terse
                                                          - Normal
                                                          - Verbose Initialization
                                                          - Verbose Data
                                                          Default is Normal. For more information about tracing levels, see “Session
                                                          and Workflow Logs” in the Workflow Administration Guide.

               Is Partitionable         Required          Multiple partitions in a pipeline can use this transformation. Use the
                                                          following options:
                                                          - No. The transformation cannot be partitioned. The transformation and
                                                            other transformations in the same pipeline are limited to one partition.
                                                            You might choose No if the transformation processes all the input data
                                                            together, such as data cleansing.
                                                          - Locally. The transformation can be partitioned, but the Integration
                                                            Service must run all partitions in the pipeline on the same node. Choose
                                                            Locally when different partitions of the transformation must share objects
                                                            in memory.
                                                          - Across Grid. The transformation can be partitioned, and the Integration
                                                            Service can distribute each partition to different nodes.
                                                          Default is No.
                                                          For more information about using partitioning with Java and Custom
                                                          transformations, see “Working with Partition Points” in the Workflow
                                                          Administration Guide.

               Inputs Must Block        Optional          The procedure associated with the transformation must be able to block
                                                          incoming data. Default is enabled.

               Is Active                Required          The transformation can generate more than one output row for each input
                                                          row.
                                                          You cannot change this property after you create the Java transformation.
                                                          If you need to change this property, create a new Java transformation.

               Update Strategy          Optional          The transformation defines the update strategy for output rows. You can
               Transformation                             enable this property for active Java transformations.
                                                          Default is disabled.
                                                          For more information about setting the update strategy in Java
                                                          transformations, see “Setting the Update Strategy” on page 224.

               Transformation Scope     Required          The method in which the Integration Service applies the transformation
                                                          logic to incoming data. Use the following options:
                                                          - Row
                                                          - Transaction
                                                          - All Input
                                                          This property is always Row for passive transformations. Default is All
                                                          Input for active transformations.
                                                          For more information about working with transaction control, see “Working
                                                          with Transaction Control” on page 223.




      Generate Transaction      Optional      The transformation generates transaction rows. You can enable this
                                              property for active Java transformations.
                                              Default is disabled.
                                              For more information about working with transaction control, see “Working
                                              with Transaction Control” on page 223.

      Output Is Ordered         Required      The order of the output data is consistent between session runs.
                                              - Never. The order of the output data is inconsistent between session runs.
                                              - Based On Input Order. The output order is consistent between session
                                                runs when the input data order is consistent between session runs.
                                              - Always. The order of the output data is consistent between session runs
                                                even if the order of the input data is inconsistent between session runs.
                                              Default is Never for active transformations. Default is Based On Input
                                              Order for passive transformations.

      Requires Single           Optional      A single thread processes the data for each partition.
      Thread Per Partition                    You cannot change this value.

      Output Is Deterministic   Optional      The transformation generates consistent output data between session
                                              runs. You must enable this property to perform recovery on sessions that
                                              use this transformation.
                                              For more information about session recovery, see “Recovering Workflows”
                                              in the Workflow Administration Guide.



Working with Transaction Control
  You can define transaction control for a Java transformation using the following properties:
  ♦     Transformation Scope. Determines how the Integration Service applies the transformation
        logic to incoming data.
  ♦     Generate Transaction. Indicates that the Java code for the transformation generates
        transaction rows and outputs them to the output group.

  Transformation Scope
  You can configure how the Integration Service applies the transformation logic to incoming
  data. You can choose one of the following values:
  ♦     Row. Applies the transformation logic to one row of data at a time. Choose Row when the
        results of the transformation depend on a single row of data. You must choose Row for
        passive transformations.
  ♦     Transaction. Applies the transformation logic to all rows in a transaction. Choose
        Transaction when the results of the transformation depend on all rows in the same
        transaction, but not on rows in other transactions. For example, you might choose
        Transaction when the Java code performs aggregate calculations on the data in a single
        transaction.
   ♦     All Input. Applies the transformation logic to all incoming data. When you choose All
         Input, the Integration Service drops transaction boundaries. Choose All Input when the
         results of the transformation depend on all rows of data in the source. For example, you
         might choose All Input when the Java code for the transformation sorts all incoming data.
              For more information about transformation scope, see “Understanding Commit Points” in
              the Workflow Administration Guide.

              Generate Transaction
              You can define Java code in an active Java transformation to generate transaction rows, such as
              commit and rollback rows. If the transformation generates commit and rollback rows,
              configure the Java transformation to generate transactions with the Generate Transaction
              transformation property. For more information about Java transformation API methods to
              generate transaction rows, see “commit” on page 239 and “rollBack” on page 247.
              When you configure the transformation to generate transaction rows, the Integration Service
              treats the Java transformation like a Transaction Control transformation. Most rules that
              apply to a Transaction Control transformation in a mapping also apply to the Java
              transformation. For example, when you configure a Java transformation to generate
              transaction rows, you cannot concatenate pipelines or pipeline branches containing the
              transformation. For more information about working with Transaction Control
              transformations, see “Transaction Control Transformation” on page 555.
              When you edit or create a session using a Java transformation configured to generate
              transaction rows, configure it for user-defined commit.


        Setting the Update Strategy
              Use an active Java transformation to set the update strategy for a mapping. You can set the
              update strategy at the following levels:
              ♦   Within the Java code. You can write the Java code to set the update strategy for output
                   rows. The Java code can flag rows for insert, update, delete, or reject. For more
                   information about setting the update strategy, see “setOutRowType” on page 249.
              ♦   Within the mapping. Use the Java transformation in a mapping to flag rows for insert,
                  update, delete, or reject. Select the Update Strategy Transformation property for the Java
                  transformation.
              ♦   Within the session. Configure the session to treat the source rows as data driven.
              If you do not configure the Java transformation to define the update strategy, or you do not
              configure the session as data driven, the Integration Service does not use the Java code to flag
              the output rows. Instead, the Integration Service flags the output rows as insert.
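               For example, the following On Input Row snippet is a minimal sketch that flags rows based
               on a hypothetical String input port named STATUS. It assumes the transformation is active,
               the Update Strategy Transformation property is enabled, and the session treats source rows
               as data driven:
                  // Flag the row for delete or insert based on the STATUS port.
                  if (!isNull("STATUS") && STATUS.equals("DELETED"))
                       setOutRowType(DELETE);
                  else
                       setOutRowType(INSERT);
                  generateRow();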




Developing Java Code
      Use the code entry tabs to enter Java code snippets that define Java transformation
      functionality. You can write Java code using the code entry tabs to import Java packages, write
      helper code, define Java expressions, and write Java code that defines transformation behavior
      for specific transformation events. You can develop snippets in the code entry tabs in any
      order.
      You can enter Java code in the following code entry tabs:
      ♦    Import Packages. Import third-party Java packages, built-in Java packages, or custom Java
           packages. For more information, see “Importing Java Packages” on page 226.
      ♦    Helper Code. Define variables and methods available to all tabs except Import Packages.
           For more information, see “Defining Helper Code” on page 226.
      ♦    On Input Row. Define transformation behavior when it receives an input row. For more
           information, see “On Input Row Tab” on page 227.
      ♦    On End of Data. Define transformation behavior when it has processed all input data. For
           more information, see “On End of Data Tab” on page 228.
      ♦    On Receiving Transaction. Define transformation behavior when it receives a transaction
           notification. Use with active Java transformations. For more information, see “On
           Receiving Transaction Tab” on page 228.
      ♦    Java Expressions. Define Java expressions to call PowerCenter expressions. You can use
            Java expressions in the Helper Code, On Input Row, On End of Data, and On Receiving
            Transaction code entry tabs. For more information about Java expressions, see “Java
            Expressions” on
           page 263.
      You can access input data and set output data on the On Input Row tab. For active
      transformations, you can also set output data on the On End of Data and On Receiving
      Transaction tabs.


    Creating Java Code Snippets
      Use the Code window in the Java Code tab to create Java code snippets to define
      transformation behavior.

      To create a Java code snippet:

      1.    Click the appropriate code entry tab.
      2.    To access input or output column variables in the snippet, double-click the name of the
            port in the Navigator.
      3.    To call a Java transformation API in the snippet, double-click the name of the API in the
             Navigator. If necessary, configure the appropriate API input values.
      4.    Write appropriate Java code, depending on the code snippet.
             The Full Code window displays the full class code for the Java transformation.



Importing Java Packages
               Use the Import Packages tab to import third-party Java packages, built-in Java packages, or
              custom Java packages for active or passive Java transformations. After you import Java
              packages, use the imported packages in any code entry tab. You cannot declare or use static
              variables, instance variables, or user methods in this tab.
              For example, to import the Java I/O package, enter the following code in the Import Packages
              tab:
                     import java.io.*;

              When you import non-standard Java packages, you must add the package or class to the
              classpath. For more information about setting the classpath, see “Configuring Java
              Transformation Settings” on page 229.
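               For example, the following import is a sketch that uses a hypothetical custom package
               name; the JAR file that defines the package must be on the classpath:
                      import com.mycompany.converter.*;   // hypothetical custom package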
              When you export or import metadata that contains a Java transformation in the PowerCenter
              Client, the JAR files or classes that contain the third-party or custom packages required by the
              Java transformation are not included. If you import metadata that contains a Java
              transformation, you must also copy the JAR files or classes that contain the required third-
              party or custom packages to the PowerCenter Client machine.


        Defining Helper Code
              Use the Helper Code tab in active or passive Java transformations to declare user-defined
              variables and methods for the Java transformation class. Use variables and methods declared
              in the Helper Code tab in any code entry tab except the Import Packages tab.
              You can declare the following user-defined variables and user-defined methods:
              ♦   Static code and static variables. You can declare static variables and static code within a
                  static block. All instances of a reusable Java transformation in a mapping and all partitions
                  in a session share static code and variables. Static code executes before any other code in a
                  Java transformation.
                  For example, the following code declares a static variable to store the error threshold for all
                  instances of a Java transformation in a mapping:
                     static int errorThreshold;

                  You can then use this variable to store the error threshold for the transformation and access
                  it from all instances of the Java transformation in a mapping and from any partition in a
                  session.
                   Note: You must synchronize static variables in a multiple partition session or in a reusable
                   Java transformation, as shown in the sketch after this list.
              ♦   Instance variables. You can declare partition-level instance variables. Multiple instances of
                  a reusable Java transformation in a mapping or multiple partitions in a session do not share
                  instance variables. Declare instance variables with a prefix to avoid conflicts and initialize
                  non-primitive instance variables.




       For example, the following code uses a boolean variable to decide whether to generate an
      output row:
         // boolean to decide whether to generate an output row
         // based on validity of input
         private boolean generateRow;

  ♦   User-defined methods. Create user-defined static or instance methods to extend the
      functionality of the Java transformation. Java methods declared in the Helper Code tab
      can use or modify output variables or locally declared instance variables. You cannot access
      input variables from Java methods in the Helper Code tab.
      For example, use the following code in the Helper Code tab to declare a function that adds
      two integers:
         private int myTXAdd (int num1,int num2)
         {
              return num1+num2;
         }
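               For example, the following Helper Code snippet is a minimal sketch that pairs a static
               variable with a synchronized user-defined method so that all partitions can update a shared
               counter safely. The names are illustrative:
                      // Shared across all partitions and all instances of a reusable
                      // Java transformation; access is synchronized.
                      private static int rejectedRows = 0;

                      private static synchronized void countReject()
                      {
                           rejectedRows++;
                      }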


On Input Row Tab
  Use the On Input Row tab to define the behavior of the Java transformation when it receives
  an input row. The Java code in this tab executes one time for each input row. You can access
  input row data in the On Input Row tab only.
  You can access and use the following input and output port data, variables, and methods from
  the On Input Row tab:
   ♦   Input port and output port variables. You can access input and output port data as a
       variable by using the name of the port as the name of the variable. For example, if “in_int”
       is an Integer input port, you can access its data through the variable in_int, which has the
       Java primitive datatype int. You do not need to declare input and output ports as
       variables.
      Do not assign a value to an input port variable. If you assign a value to an input variable in
      the On Input Row tab, you cannot get the input data for the corresponding port in the
      current row.
  ♦   Instance variables and user-defined methods. Use any instance or static variable or user-
      defined method you declared in the Helper Code tab.
      For example, an active Java transformation has two input ports, BASE_SALARY and
      BONUSES, with an integer datatype, and a single output port, TOTAL_COMP, with an
      integer datatype. You create a user-defined method in the Helper Code tab, myTXAdd,
      that adds two integers and returns the result. Use the following Java code in the On Input
      Row tab to assign the total values for the input ports to the output port and generate an
      output row:
         TOTAL_COMP = myTXAdd (BASE_SALARY,BONUSES);
         generateRow();

      When the Java transformation receives an input row, it adds the values of the
      BASE_SALARY and BONUSES input ports, assigns the value to the TOTAL_COMP
      output port, and generates an output row.

♦   Java transformation API methods. You can call API methods provided by the Java
                  transformation. For more information about Java transformation API methods, see “Java
                  Transformation API Reference” on page 237.


        On End of Data Tab
              Use the On End of Data tab in active or passive Java transformations to define the behavior of
              the Java transformation when it has processed all input data. If you want to generate output
              rows in the On End of Data tab, you must set the transformation scope for the transformation
              to Transaction or All Input. You cannot access or set the value of input port variables in this
              tab.
              You can access and use the following variables and methods from the On End of Data tab:
              ♦   Output port variables. Use the names of output ports as variables to access or set output
                  data for active Java transformations.
              ♦   Instance variables and user-defined methods. Use any instance variables or user-defined
                  methods you declared in the Helper Code tab.
              ♦   Java transformation API methods. You can call API methods provided by the Java
                  transformation. Use the commit and rollBack API methods to generate a transaction. For
                  more information about API methods, see “Java Transformation API Reference” on
                  page 237.
                  For example, use the following Java code to write information to the session log when the
                  end of data is reached:
                     logInfo("Number of null rows for partition is: " + partCountNullRows);


        On Receiving Transaction Tab
              Use the On Receiving Transaction tab in active Java transformations to define the behavior of
               an active Java transformation when it receives a transaction notification. The code snippet
               for the On Receiving Transaction tab executes only if the transformation scope is set to
               Transaction. You cannot access or set the value of input port variables
              in this tab.
              You can access and use the following output data, variables, and methods from the On
              Receiving Transaction tab:
              ♦   Output port variables. Use the names of output ports as variables to access or set output
                  data.
              ♦   Instance variables and user-defined methods. Use any instance variables or user-defined
                  methods you declared in the Helper Code tab.
              ♦   Java transformation API methods. You can call API methods provided by the Java
                  transformation. Use the commit and rollBack API methods to generate a transaction. For
                  more information about API methods, see “Java Transformation API Reference” on
                  page 237. For example, use the following Java code to generate a transaction after the
                  transformation receives a transaction:
                     commit();


Configuring Java Transformation Settings
      You can configure Java transformation settings to set the classpath for third-party and custom
      Java packages and to enable high precision for Decimal datatypes.
       Figure 9-4 shows the Settings dialog box for a Java transformation where you can set the
       classpath and enable high precision:

       Figure 9-4. Java Transformation Settings Dialog Box

       [Figure not shown. Callouts identify where to set the classpath and where to enable high
       precision.]
    Configuring the Classpath
       When you import non-standard Java packages in the Import Packages tab, you must set the
      classpath to the location of the JAR files or class files for the Java package. You can set the
      CLASSPATH environment variable on the PowerCenter Client machine or configure the Java
      transformation settings to set the classpath. The PowerCenter Client adds the Java packages
      or class files you add in the Settings dialog box to the system classpath when you compile the
      Java code for the transformation.
      For example, you import the Java package converter in the Import Packages tab and define
      the package in converter.jar. You must add converter.jar to the classpath before you compile
      the Java code for the Java transformation.
      You do not need to set the classpath for built-in Java packages. For example, java.io is a built-
      in Java package. If you import java.io, you do not need to set the classpath for java.io.
      Note: You can also add Java packages to the system classpath for a session, using the Java
      Classpath session property. For more information, see “Session Properties Reference” in the
      Workflow Administration Guide.

      To set the classpath for a Java transformation:

      1.   On the Java Code tab, click the Settings link.
           The Settings dialog box appears.
      2.   Click Browse under Add Classpath to select the JAR file or class file for the imported
           package. Click OK.


3.   Click Add.
                   The JAR or class file appears in the list of JAR and class files for the transformation.
              4.   To remove a JAR file or class file, select the JAR or class file and click Remove.


        Enabling High Precision
              By default, the Java transformation maps ports of type Decimal to double datatypes (with a
              precision of 15). If you want to process a Decimal datatype with a precision greater than 15,
              enable high precision to process decimal ports with the Java class BigDecimal.
              When you enable high precision, you can process Decimal ports with precision less than 28 as
              BigDecimal. The Java transformation maps Decimal ports with a precision greater than 28 to
              double datatypes.
              For example, a Java transformation has an input port of type Decimal that receives a value of
              40012030304957666903. If you enable high precision, the value of the port is treated as it
              appears. If you do not enable high precision, the value of the port is 4.00120303049577 x
              10^19.
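               For example, with high precision enabled, the following On Input Row snippet is a sketch
               that multiplies a Decimal port by a rate. It assumes Decimal input and output ports named
               DEC_IN and DEC_OUT and an import of java.math.BigDecimal in the Import Packages
               tab:
                      // DEC_IN and DEC_OUT map to java.math.BigDecimal when high
                      // precision is enabled.
                      if (!isNull("DEC_IN")) {
                           DEC_OUT = DEC_IN.multiply(new BigDecimal("1.05"));
                      }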




Compiling a Java Transformation
      The PowerCenter Client uses the Java compiler to compile the Java code and generate the
      byte code for the transformation. The Java compiler compiles the Java code and displays the
      results of the compilation in the Output window. The Java compiler installs with the
      PowerCenter Client in the java/bin directory.
      To compile the full code for the Java transformation, click Compile in the Java Code tab.
      When you create a Java transformation, it contains a Java class that defines the base
      functionality for a Java transformation. The full code for the Java class contains the template
      class code for the transformation, plus the Java code you define in the code entry tabs.
      When you compile a Java transformation, the PowerCenter Client adds the code from the
      code entry tabs to the template class for the transformation to generate the full class code for
      the transformation. The PowerCenter Client then calls the Java compiler to compile the full
      class code. The Java compiler compiles the transformation and generates the byte code for the
      transformation.
      The results of the compilation display in the Output window. Use the results of the
      compilation to identify and locate Java code errors.
      Note: The Java transformation is also compiled when you click OK in the transformation.




Fixing Compilation Errors
              You can identify Java code errors and locate the source of Java code errors for a Java
              transformation in the Output window. Java transformation errors may occur as a result of an
              error in a code entry tab or may occur as a result of an error in the full code for the Java
              transformation class.
              To troubleshoot a Java transformation:
              ♦   Locate the source of the error. You can locate the source of the error in the Java snippet
                  code or in the full class code for the transformation.
              ♦   Identify the type of error. Use the results of the compilation in the output window and the
                  location of the error to identify the type of error.
              After you identify the source and type of error, fix the Java code in the code entry tab and
              compile the transformation again.


        Locating the Source of Compilation Errors
              When you compile a Java transformation, the Output window displays the results of the
              compilation. Use the results of the compilation to identify compilation errors. When you use
              the Output window to locate the source of an error, the PowerCenter Client highlights the
              source of the error in a code entry tab or in the Full Code window.
              You can locate errors in the Full Code window, but you cannot edit Java code in the Full Code
              window. To fix errors that you locate in the Full Code window, you need to modify the code
              in the appropriate code entry tab. You might need to use the Full Code window to view errors
              caused by adding user code to the full class code for the transformation.
              Use the results of the compilation in the Output window to identify errors in the following
              locations:
              ♦   Code entry tabs
              ♦   Full Code window

              Locating Errors in the Code Entry Tabs
              To locate the source of an error in the code entry tabs, right-click on the error in the Output
              window and choose View error in snippet or double-click on the error in the Output window.
              The PowerCenter Client highlights the source of the error in the appropriate code entry tab.




Figure 9-5 shows a highlighted error in a code entry tab:

Figure 9-5. Highlighted Error in Code Entry Tab




Locating Errors in the Full Code Window
To locate the source of errors in the Full Code window, right-click on the error in the Output
window and choose View error in full code or double-click the error in the Output window.
The PowerCenter Client highlights the source of the error in the Full Code window.




Figure 9-6 shows a highlighted error in the Full Code window:

              Figure 9-6. Highlighted Error in Full Code Window




        Identifying Compilation Errors
              Compilation errors may appear as a result of errors in the user code. Errors in the user code
              may also generate an error in the non-user code for the class. Compilation errors occur in the
              following code for the Java transformation:
              ♦   User code
              ♦   Non-user code

              User Code Errors
              Errors may occur in the user code in the code entry tabs. User code errors may include
              standard Java syntax and language errors. User code errors may also occur when the
              PowerCenter Client adds the user code from the code entry tabs to the full class code.
              For example, a Java transformation has an input port with a name of int1 and an integer
              datatype. The full code for the class declares the input port variable with the following code:
                     int int1;

              However, if you use the same variable name in the On Input Row tab, the Java compiler issues
              an error for a redeclaration of a variable. You must rename the variable in the On Input Row
              code entry tab to fix the error.
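               For example, the following On Input Row snippet uses a distinct local variable name,
               int1Doubled (an illustrative name), so that it does not redeclare the generated port variable:
                      // Use a distinct local name instead of redeclaring int1.
                      int int1Doubled = int1 * 2;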




Non-user Code Errors
User code in the code entry tabs may cause errors in non-user code.
For example, a Java transformation has an input port and an output port, int1 and out1, with
integer datatypes. You write the following code in the On Input Row code entry tab to
calculate interest for input port int1 and assign it to the output port out1:
      int interest;

      interest = CallInterest(int1); // calculate interest
      out1 = int1 + interest;

      }

When you compile the transformation, the PowerCenter Client adds the code from the On
Input Row code entry tab to the full class code for the transformation. When the Java
compiler compiles the Java code, the unmatched brace causes a method in the full class code
to terminate prematurely, and the Java compiler issues an error.
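Removing the unmatched brace from the On Input Row code entry tab fixes the error:
       int interest;

       interest = CallInterest(int1); // calculate interest
       out1 = int1 + interest;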




Chapter 10




Java Transformation API
Reference
   This chapter includes the following topic:
   ♦   Java Transformation API Methods, 238




Java Transformation API Methods
             You can call Java transformation API methods in the Java Code tab of a Java transformation to
             define transformation behavior.
             The Java transformation provides the following API methods:
             ♦   commit. Generates a transaction. For more information, see “commit” on page 239.
             ♦   failSession. Throws an exception with an error message and fails the session. For more
                 information, see “failSession” on page 240.
             ♦   generateRow. Generates an output row for active Java transformations. For more
                 information, see “generateRow” on page 241.
             ♦   getInRowType. Returns the input type of the current row in the transformation. For more
                 information, see “getInRowType” on page 242.
             ♦   incrementErrorCount. Increases the error count for the session. For more information, see
                 “incrementErrorCount” on page 243.
             ♦   isNull. Checks the value of an input column for a null value. For more information, see
                 “isNull” on page 244.
             ♦   logError. Writes an error message to the session log. For more information, see “logError”
                 on page 246.
             ♦   logInfo. Writes an informational message to the session log. For more information, see
                 “logInfo” on page 245.
              ♦   rollBack. Generates a rollback transaction. For more information, see “rollBack” on
                  page 247.
             ♦   setNull. Sets the value of an output column in an active or passive Java transformation to
                 NULL. For more information, see “setNull” on page 248.
             ♦   setOutRowType. Sets the update strategy for output rows. For more information, see
                 “setOutRowType” on page 249.
             You can add any API method to a code entry tab by double-clicking the name of the API
             method in the Navigator, dragging the method from the Navigator into the Java code snippet,
             or manually typing the API method in the Java code snippet.
             You can also use the defineJExpression and invokeJExpression API methods to create and
             invoke Java expressions. For more information about using the API methods with Java
             expressions, see “Java Expressions” on page 263.




commit
     Generates a transaction.
     Use commit in any tab except the Import Packages or Java Expressions code entry tabs. You
     can only use commit in active transformations configured to generate transactions. If you use
     commit in an active transformation not configured to generate transactions, the Integration
     Service throws an error and fails the session.


   Syntax
     Use the following syntax:
            commit();


   Example
     Use the following Java code to generate a transaction for every 100 rows processed by a Java
     transformation and then set the rowsProcessed counter to 0:
            if (rowsProcessed==100) {
                 commit();
                 rowsProcessed=0;
            }




failSession
             Throws an exception with an error message and fails the session. Use failSession to terminate
             the session. Do not use failSession in a try/catch block in a code entry tab.
             Use failSession in any tab except the Import Packages or Java Expressions code entry tabs.


        Syntax
             Use the following syntax:
                     failSession(String errorMessage);


                Argument           Datatype        Input/Output   Description

               errorMessage       String          Input    Error message string.



        Example
             Use the following Java code to test the input port input1 for a null value and fail the session if
             input1 is NULL:
                      if (isNull("input1")) {
                           failSession("Cannot process a null value for port input1.");
                      }




generateRow
     Generates an output row for active Java transformations. When you call generateRow, the Java
     transformation generates an output row using the current value of the output port variables. If
     you want to generate multiple rows corresponding to an input row, you can call generateRow
     more than once for each input row. If you do not use generateRow in an active Java
     transformation, the transformation does not generate output rows.
     Use generateRow in any code entry tab except the Import Packages or Java Expressions code
     entry tabs. You can use generateRow with active transformations only. If you use generateRow
     in a passive transformation, the session generates an error.


   Syntax
     Use the following syntax:
            generateRow();


   Example
     Use the following Java code to generate one output row, modify the values of the output
     ports, and generate another output row:
            // Generate multiple rows.

            if(!isNull("input1") && !isNull("input2"))
            {
                 output1 = input1 + input2;
                 output2 = input1 - input2;
            }
            generateRow();

            // Generate another row with modified values.
            output1 = output1 * 2;
            output2 = output2 * 2;
            generateRow();




getInRowType
             Returns the input type of the current row in the transformation. The method returns a value
             of insert, update, delete, or reject.
             You can only use getInRowType in the On Input Row code entry tab. You can only use the
             getInRowType method in active transformations configured to set the update strategy. If you
             use this method in an active transformation not configured to set the update strategy, the
             session generates an error.


        Syntax
             Use the following syntax:
                     rowType getInRowType();


                Argument           Datatype        Input/Output   Description

               rowType            String         Output   Update strategy type. Value can be INSERT, UPDATE, DELETE, or
                                                          REJECT.



        Example
              Use the following Java code to propagate the input type of the current row when the row
              type is UPDATE or INSERT and the value of the input port input1 is 100 or less, or to set
              the output type to DELETE when the value of input1 is greater than 100:
                     // Set the value of the output port.
                     output1 = input1;
                     // Get and set the row type.
                     String rowType = getInRowType();
                     setOutRowType(rowType);
                     // Set row type to DELETE if the output port value is > 100.
                     if(input1 > 100)
                          setOutRowType(DELETE);




incrementErrorCount
      Increases the error count for the session. If the error count reaches the error threshold for the
      session, the session fails. Use incrementErrorCount in any tab except the Import Packages or
      Java Expressions code entry tabs.


    Syntax
      Use the following syntax:
             incrementErrorCount(int nErrors);


        Argument        Datatype       Input/Output   Description

       nErrors         Integer        Input      Number of errors to increment the error count for the session.



    Example
      Use the following Java code to increment the error count if an input port for a transformation
      has a null value:
              // Check if the input employee id or name is null.
              if (isNull ("EMP_ID_INP") || isNull ("EMP_NAME_INP"))
              {
                   incrementErrorCount(1);
                   // If the input employee id or name is null, do not generate an
                   // output row for this input row.
                   generateRow = false;
              }




isNull
             Checks the value of an input column for a null value. Use isNull to check if data of an input
             column is NULL before using the column as a value. You can use the isNull method in the
             On Input Row code entry tab only.


        Syntax
             Use the following syntax:
                      Boolean isNull(String strColName);


                Argument           Datatype        Input/Output   Description

               strColName         String         Input    Name of an input column.



        Example
             Use the following Java code to check the value of the SALARY input column before adding it
             to the instance variable totalSalaries:
                     // if value of SALARY is not null

                     if (!isNull("SALARY")) {

                            // add to totalSalaries

                            TOTAL_SALARIES += SALARY;

                     }

             or
                     // if value of SALARY is not null

                     String strColName = "SALARY";

                     if (!isNull(strColName)) {

                            // add to totalSalaries

                            TOTAL_SALARIES += SALARY;

                     }




logInfo
          Writes an informational message to the session log.
          Use logInfo in any tab except the Import Packages or Java Expressions tabs.


    Syntax
          Use the following syntax:
                logInfo(String logMessage);


            Argument          Datatype       Input/Output   Description

           logMessage        String        Input        Information message string.



    Example
           Use the following Java code to write a message to the session log after the Java
           transformation processes a message threshold of 1,000 rows:
                if (numRowsProcessed == messageThreshold) {

                        logInfo("Processed " + messageThreshold + " rows.");

                }

          The following message appears in the session log:
                [JTX_1012] [INFO] Processed 1000 rows.




logError
              Writes an error message to the session log.
             Use logError in any tab except the Import Packages or Java Expressions code entry tabs.


        Syntax
             Use the following syntax:
                     logError(String errorMessage);


                Argument             Datatype        Input/Output   Description

               errorMessage         String            Input    Error message string.



        Example
              Use the following Java code to log an error if the input port is null:
                     // check BASE_SALARY

                     if (isNull("BASE_SALARY")) {

                              logError("Cannot process a null salary field.");

                     }

              The following message appears in the session log:
                     [JTX_1013] [ERROR] Cannot process a null salary field.




rollBack
       Generates a rollback transaction.
        Use rollBack in any tab except the Import Packages or Java Expressions code entry tabs. You
        can only use rollBack in active transformations configured to generate transactions. If you use
        rollBack in an active transformation not configured to generate transactions, the Integration
        Service generates an error and fails the session.


    Syntax
       Use the following syntax:
             rollBack();


    Example
       Use the following code to generate a rollback transaction and fail the session if an input row
       has an illegal condition or generate a transaction if the number of rows processed is 100:
              // If the row is not legal, roll back and fail the session.
              if (!isRowLegal()) {
                   rollBack();
                   failSession("Cannot process illegal row.");

             } else if (rowsProcessed==100) {
                  commit();
                  rowsProcessed=0;
             }




setNull
             Sets the value of an output column in an active or passive Java transformation to NULL.
             Once you set an output column to NULL, you cannot modify the value until you generate an
             output row.
             Use setNull in any tab except the Import Packages or Java Expressions code entry tabs.


        Syntax
             Use the following syntax:
                     setNull(String strColName);


                Argument             Datatype        Input/Output   Description

               strColName           String            Input    Name of an output column.



        Example
             Use the following Java code to check the value of an input column and set the corresponding
             value of an output column to null:
                     // check value of Q3RESULTS input column

                     if(isNull("Q3RESULTS")) {

                            // set the value of output column to null

                            setNull("RESULTS");

                     }

             or
                     // check value of Q3RESULTS input column

                     String strColName = "Q3RESULTS";

                     if(isNull(strColName)) {

                            // set the value of output column to null

                            setNull(strColName);

                     }




setOutRowType
     Sets the update strategy for output rows. The setOutRowType method can flag rows for
     insert, update, or delete.
     You can only use setOutRowType in the On Input Row code entry tab. You can only use
     setOutRowType in active transformations configured to set the update strategy. If you use
     setOutRowType in an active transformation not configured to set the update strategy, the
     session generates an error and the session fails.


   Syntax
     Use the following syntax:
            setOutRowType(String rowType);


       Argument          Datatype       Input/Output   Description

      rowType           String       Input      Update strategy type. Value can be INSERT, UPDATE, or
                                                DELETE.



   Example
      Use the following Java code to propagate the input type of the current row when the row
      type is UPDATE or INSERT and the value of the input port input1 is 100 or less, or to set
      the output type to DELETE when the value of input1 is greater than 100:
            // Set the value of the output port.
            output1 = input1;
            // Get and set the row type.
            String rowType = getInRowType();
            setOutRowType(rowType);
            // Set row type to DELETE if the output port value is > 100.
            if(input1 > 100)
                 setOutRowType(DELETE);




Chapter 11




Java Transformation
Example
   This chapter includes the following topics:
   ♦   Overview, 252
   ♦   Step 1. Import the Mapping, 253
   ♦   Step 2. Create Transformation and Configure Ports, 254
   ♦   Step 3. Enter Java Code, 256
   ♦   Step 4. Compile the Java Code, 261
   ♦   Step 5. Create a Session and Workflow, 262




Overview
             You can use the Java code in this example to create and compile an active Java transformation.
             You import a sample mapping and create and compile the Java transformation. You can then
             create and run a session and workflow that contains the mapping.
             The Java transformation processes employee data for a fictional company. It reads input rows
             from a flat file source and writes output rows to a flat file target. The source file contains
             employee data, including the employee identification number, name, job title, and the
             manager identification number.
             The transformation finds the manager name for a given employee based on the manager
             identification number and generates output rows that contain employee data. The output
             data includes the employee identification number, name, job title, and the name of the
             employee’s manager. If the employee has no manager in the source data, the transformation
             assumes the employee is at the top of the hierarchy in the company organizational chart.
             Note: The transformation logic assumes the employee job titles are arranged in descending
             order in the source file.
             Complete the following steps to import the sample mapping, create and compile a Java
             transformation, and create a session and workflow that contains the mapping:
             1.    Import the sample mapping. For more information, see “Step 1. Import the Mapping”
                   on page 253.
             2.    Create the Java transformation and configure the Java transformation ports. For more
                   information, see “Step 2. Create Transformation and Configure Ports” on page 254.
             3.    Enter the Java code for the transformation in the appropriate code entry tabs. For more
                   information, see “Step 3. Enter Java Code” on page 256.
             4.    Compile the Java code. For more information, see “Step 4. Compile the Java Code” on
                   page 261.
             5.    Create and run a session and workflow. For more information, see “Step 5. Create a
                   Session and Workflow” on page 262.
             For a sample source and target file for the session, see “Sample Data” on page 262.
             The PowerCenter Client installation contains a mapping, m_jtx_hier_useCase.xml, and flat
             file source, hier_input, that you can use with this example.
             For more information about creating transformations, mappings, sessions, and workflows, see
             Getting Started.




Step 1. Import the Mapping
      Import the metadata for the sample mapping in the Designer. The sample mapping contains
      the following components:
      ♦   Source definition and Source Qualifier transformation. Flat file source definition,
          hier_input, that defines the source data for the transformation.
      ♦   Target definition. Flat file target definition, hier_data, that receives the output data from
          the transformation.
      You can import the metadata for the mapping from the following location:
       <PowerCenter Client installation directory>\client\bin\m_jtx_hier_useCase.xml

      Figure 11-1 shows the sample mapping:

      Figure 11-1. Java Transformation Example - Sample Mapping




Step 2. Create Transformation and Configure Ports
             You create the Java transformation and configure the ports in the Mapping Designer. You can
             use the input and output port names as variables in the Java code. In a Java transformation,
             you create input and output ports in an input or output group. A Java transformation may
             contain only one input group and one output group. For more information about configuring
             ports in a Java transformation, see “Configuring Ports” on page 219.
             In the Mapping Designer, create an active Java transformation and configure the ports. In this
             example, the transformation is named jtx_hier_useCase.
             Note: To use the Java code in this example, you must use the exact names for the input and
             output ports.
             Table 11-1 shows the input and output ports for the transformation:

             Table 11-1. Input and Output Ports

               Port Name                        Port Type   Datatype   Precision   Scale

               EMP_ID_INP                       Input       Integer    10          0

               EMP_NAME_INP                     Input       String     100         0

               EMP_AGE                          Input       Integer    10          0

               EMP_DESC_INP                     Input       String     100         0

               EMP_PARENT_EMPID                 Input       Integer    10          0

               EMP_ID_OUT                       Output      Integer    10          0

               EMP_NAME_OUT                     Output      String     100         0

               EMP_DESC_OUT                     Output      String     100         0

               EMP_PARENT_EMPNAME               Output      String     100         0




Figure 11-2 shows the Ports tab in the Transformation Developer after you create the ports:

Figure 11-2. Java Transformation Example - Ports Tab




Step 3. Enter Java Code
             Enter Java code for the transformation in the following code entry tabs:
              ♦   Import Packages. Imports the java.util.Map and java.util.HashMap classes. For
                  more information, see “Import Packages Tab” on page 256.
             ♦   Helper Code. Contains a Map object, lock object, and boolean variables used to track the
                 state of data in the Java transformation. For more information, see “Helper Code Tab” on
                 page 257.
             ♦   On Input Row. Contains the Java code that processes each input row in the
                 transformation. For more information, see “On Input Row Tab” on page 258.
             For more information about using the code entry tabs to develop Java code, see “Developing
             Java Code” on page 225.


        Import Packages Tab
              Import third-party Java packages, built-in Java packages, or custom Java packages in the
              Import Packages tab. The example transformation uses the java.util.Map and
              java.util.HashMap classes.
             Enter the following code in the Import Packages tab:
                     import java.util.Map;
                     import java.util.HashMap;

             The Designer adds the import statements to the Java code for the transformation.




Figure 11-3 shows the Import Packages code entry tab:

  Figure 11-3. Java Transformation Example - Import Packages Tab




Helper Code Tab
  Declare user-defined variables and methods for the Java transformation on the Helper Code
  tab. The Helper Code tab defines the following variables that are used by the Java code in the
  On Input Row tab:
  ♦   empMap. Map object that stores the identification number and employee name from the
      source.
  ♦   lock. Lock object used to synchronize the access to empMap across partitions.
  ♦   generateRow. Boolean variable used to determine if an output row should be generated for
      the current input row.
  ♦   isRoot. Boolean variable used to determine if an employee is at the top of the company
      organizational chart (root).
  Enter the following code in the Helper Code tab:
                      // Static Map object to store the ID and name relationship of an
                      // employee. If a session uses multiple partitions, empMap is shared
                      // across all partitions.
                      private static Map empMap = new HashMap();
                      // Static lock object to synchronize the access to empMap across
                      // partitions.
                      private static Object lock = new Object();
                      // Boolean to track whether to generate an output row based on
                      // validity of the input data.
                      private boolean generateRow;
                      // Boolean to track whether the employee is root.
                      private boolean isRoot;

             Figure 11-4 shows the Helper Code tab:

             Figure 11-4. Java Transformation Example - Helper Code Tab




        On Input Row Tab
             The Java transformation executes the Java code in the On Input Row tab when the
             transformation receives an input row. In this example, the transformation may or may not
             generate an output row, based on the values of the input row.
             Enter the following code in the On Input Row tab:
                      // Initially set generateRow to true for each input row.
                      generateRow = true;
                      // Initially set isRoot to false for each input row.
                      isRoot = false;

                      // Check if input employee id and name is null.
                      if (isNull ("EMP_ID_INP") || isNull ("EMP_NAME_INP"))
                      {
                           incrementErrorCount(1);
                           // If input employee id and/or name is null, don't generate an
                           // output row for this input row.
                           generateRow = false;
                      } else {
                           // Set the output port values.
                           EMP_ID_OUT = EMP_ID_INP;
                           EMP_NAME_OUT = EMP_NAME_INP;
                      }

                      if (isNull ("EMP_DESC_INP")) {
                           setNull("EMP_DESC_OUT");
                      } else {
                           EMP_DESC_OUT = EMP_DESC_INP;
                      }

                      boolean isParentEmpIdNull = isNull("EMP_PARENT_EMPID");

                      if(isParentEmpIdNull)
                      {
                           // This employee is the root for the hierarchy.
                           isRoot = true;
                           logInfo("This is the root for this hierarchy.");
                           setNull("EMP_PARENT_EMPNAME");
                      }

                      synchronized(lock)
                      {
                           // If the employee is not the root for this hierarchy, get the
                           // corresponding parent id.
                           if(!isParentEmpIdNull)
                                EMP_PARENT_EMPNAME = (String) (empMap.get(new Integer(EMP_PARENT_EMPID)));
                           // Add employee to the map for future reference.
                           empMap.put (new Integer(EMP_ID_INP), EMP_NAME_INP);
                      }

                      // Generate row if generateRow is true.
                      if(generateRow)
                           generateRow();
Figure 11-5 shows the On Input Row tab:

             Figure 11-5. Java Transformation Example - On Input Row Tab




Step 4. Compile the Java Code
      Click Compile in the Transformation Developer to compile the Java code for the
      transformation. The Output window displays the status of the compilation. If the Java code
      does not compile successfully, correct the errors in the code entry tabs and recompile the Java
      code. After you successfully compile the transformation, save the transformation to the
      repository.
      For more information about compiling Java code, see “Compiling a Java Transformation” on
      page 231. For more information about troubleshooting compilation errors, see “Fixing
      Compilation Errors” on page 232.
      Figure 11-6 shows the results of a successful compilation:

      Figure 11-6. Java Transformation Example - Successful Compilation




Step 5. Create a Session and Workflow
             Create a session and workflow for the mapping in the Workflow Manager, using the
             m_jtx_hier_useCase mapping.
             When you configure the session, you can use the sample source file from the following
             location:
                      <PowerCenter Client installation directory>\client\bin\hier_data



        Sample Data
             The following data is an excerpt from the sample source file:
                     1,James Davis,50,CEO,
                     4,Elaine Masters,40,Vice President - Sales,1
                     5,Naresh Thiagarajan,40,Vice President - HR,1
                     6,Jeanne Williams,40,Vice President - Software,1
                     9,Geetha Manjunath,34,Senior HR Manager,5
                     10,Dan Thomas,32,Senior Software Manager,6
                     14,Shankar Rahul,34,Senior Software Manager,6
                     20,Juan Cardenas,32,Technical Lead,10
                     21,Pramodh Rahman,36,Lead Engineer,14
                     22,Sandra Patterson,24,Software Engineer,10
                     23,Tom Kelly,32,Lead Engineer,10
                     35,Betty Johnson,27,Lead Engineer,14
                     50,Dave Chu,26,Software Engineer,23
                     70,Srihari Giran,23,Software Engineer,35
                     71,Frank Smalls,24,Software Engineer,35

             The following data is an excerpt from a sample target file:
                     1,James Davis,CEO,
                     4,Elaine Masters,Vice President - Sales,James Davis
                     5,Naresh Thiagarajan,Vice President - HR,James Davis
                     6,Jeanne Williams,Vice President - Software,James Davis
                     9,Geetha Manjunath,Senior HR Manager,Naresh Thiagarajan
                     10,Dan Thomas,Senior Software Manager,Jeanne Williams
                     14,Shankar Rahul,Senior Software Manager,Jeanne Williams
                     20,Juan Cardenas,Technical Lead,Dan Thomas
                     21,Pramodh Rahman,Lead Engineer,Shankar Rahul
                     22,Sandra Patterson,Software Engineer,Dan Thomas
                     23,Tom Kelly,Lead Engineer,Dan Thomas
                     35,Betty Johnson,Lead Engineer,Shankar Rahul
                     50,Dave Chu,Software Engineer,Tom Kelly
                     70,Srihari Giran,Software Engineer,Betty Johnson
                     71,Frank Smalls,Software Engineer,Betty Johnson




Chapter 12




Java Expressions


   This chapter includes the following topics:
   ♦   Overview, 264
   ♦   Using the Define Expression Dialog Box, 266
   ♦   Working with the Simple Interface, 271
   ♦   Working with the Advanced Interface, 273
   ♦   JExpression API Reference, 279




Overview
             You can invoke PowerCenter expressions in a Java transformation with the Java programming
             language. Use expressions to extend the functionality of a Java transformation. For example,
             you can invoke an expression in a Java transformation to look up the values of input or output
             ports or look up the values of Java transformation variables.
              To invoke an expression in a Java transformation, you can generate the Java code that
              invokes the expression or write the code yourself with the Java transformation API methods.
              You invoke the expression and use its result in the appropriate code entry tab.
             Use the following methods to create and invoke expressions in a Java transformation:
             ♦   Use the Define Expression dialog box. Create an expression and generate the code for an
                 expression. For more information, see “Using the Define Expression Dialog Box” on
                 page 266.
             ♦   Use the simple interface. Use a single method to invoke an expression and get the result of
                 the expression. For more information, see “Working with the Simple Interface” on
                 page 271.
             ♦   Use the advanced interface. Use the advanced interface to define the expression, invoke
                 the expression, and use the result of the expression. For more information, see “Working
                 with the Advanced Interface” on page 273.
              You do not need advanced knowledge of the Java programming language to invoke
              expressions in a Java transformation. The simple interface requires only a single method call
              to invoke an expression. If you are familiar with object-oriented programming and want more
              control over invoking the expression, use the advanced interface.


        Expression Function Types
             You can create expressions for a Java transformation using the Expression Editor, by writing
             the expression in the Define Expression dialog box, or by using the simple or advanced
             interface. You can enter expressions that use input or output port variables or variables in the
             Java code as input parameters. If you use the Define Expression dialog box, you can use the
             Expression Editor to validate the expression before you use it in a Java transformation.
             You can invoke the following types of expression functions in a Java transformation:
             ♦   Transformation language functions. SQL-like functions designed to handle common
                 expressions.
             ♦   User-defined functions. Functions you create in PowerCenter based on transformation
                 language functions.
             ♦   Custom functions. Functions you create with the Custom Function API.
             ♦   Unconnected transformations. You can use unconnected transformations in expressions.
                  For example, you can use an unconnected Lookup transformation in an expression.


You can also use system variables, user-defined mapping and workflow variables, and
predefined workflow variables such as $Session.status in expressions.
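For example, you can call a transformation language function from Java with the simple
interface described in “Working with the Simple Interface” on page 271. The following sketch
is illustrative only; it assumes a String input port named RAW_NAME, which is not part of
any shipped example:
       // Hedged sketch: invoke the LTRIM transformation language function
       // on the value of an assumed String input port named RAW_NAME.
       // The cast follows the simple interface rules described in
       // "Working with the Simple Interface" on page 271.
       String trimmed = (String)invokeJExpression("LTRIM(x1)",
                                                  new Object [] { RAW_NAME });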
For more information about the transformation language and custom functions, see the
Transformation Language Reference. For more information about user-defined functions, see
“Working with User-Defined Functions” in the Designer Guide.




Using the Define Expression Dialog Box
             When you define a Java expression, you configure the function, create the expression, and
             generate the code that invokes the expression. You can define the function and create the
             expression in the Define Expression dialog box.
             To create an expression function and use the expression in a Java transformation, complete the
             following tasks:
             1.    Configure the function. Configure the function that invokes the expression, including
                   the function name, description, and parameters. You use the function parameters when
                   you create the expression. For more information, see “Step 1. Configure the Function”
                   on page 266.
             2.    Create the expression. Create the expression syntax and validate the expression. For more
                   information, see “Step 2. Create and Validate the Expression” on page 267.
             3.    Generate Java code. Use the Define Expression dialog box to generate the Java code that
                   invokes the expression. The Designer places the code in the Java Expressions code entry
                   tab in the Transformation Developer. For more information, see “Step 3. Generate Java
                   Code for the Expression” on page 267.
             After you generate the Java code, call the generated function in the appropriate code entry tab
             to invoke an expression or get a JExpression object, depending on whether you use the simple
             or advanced interface.
             Note: To validate an expression when you create the expression, you must use the Define
             Expression dialog box.


        Step 1. Configure the Function
             You configure the function name, description, and input parameters for the Java function that
             invokes the expression.
             Use the following rules and guidelines when you configure the function:
             ♦    Use a unique function name that does not conflict with an existing Java function in the
                  transformation or reserved Java keywords.
             ♦    You must configure the parameter name, Java datatype, precision, and scale. The input
                  parameters are the values you pass when you call the function in the Java code for the
                  transformation.
             ♦    To pass a Date datatype to an expression, use a String datatype for the input parameter. If
                  an expression returns a Date datatype, you can use the return value as a String datatype in
                  the simple interface and a String or long datatype in the advanced interface.
             For more information about the mapping between PowerCenter datatypes and Java datatypes,
             see “Datatype Mapping” on page 215.




   Figure 12-1 shows the Define Expression dialog box where you configure the function and
   the expression for a Java transformation:

   Figure 12-1. Define Expression Dialog Box (callouts: Java function name, Java function
   parameters, Define expression, Validate expression)

Step 2. Create and Validate the Expression
  When you create the expression, use the parameters you configured for the function. You can
  also use transformation language functions, custom functions, or other user-defined functions
  in the expression. You can create and validate the expression in the Define Expression dialog
  box or in the Expression Editor.
  When you enter expression syntax, follow the transformation language rules and guidelines.
  For more information about expression syntax, see “The Transformation Language” in the
  Transformation Language Reference.
  For more information about creating expressions, see “Working with Workflows” in the
  Workflow Administration Guide.
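   For example, a function configured with two String parameters, x1 and x2, might use the
   following expression to combine the parameter values. The expression is illustrative only:
          CONCAT(x1, CONCAT(' ', x2))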


Step 3. Generate Java Code for the Expression
  After you configure the function and function parameters and define and validate the
  expression, you can generate the Java code that invokes the expression. The Designer places
  the generated Java code in the Java Expressions code entry tab. Use the generated Java code to
  call the functions that invoke the expression in the code entry tabs in the Transformation
  Developer. You can generate the simple or advanced Java code.
  After you generate the Java code that invokes an expression, you cannot edit the expression
  and revalidate it. To modify an expression after you generate the code, you must recreate the
  expression.



              Figure 12-2 shows the Java Expressions code entry tab and generated Java code for an
              expression in the advanced interface:

              Figure 12-2. Java Expressions Code Entry Tab (callouts: Generated Java Code, Define the
              expression)
        Steps to Create an Expression and Generate Java Code
             Complete the following procedure to create an expression and generate the Java code to
             invoke the expression.

             To generate Java code that calls an expression:

             1.    In the Transformation Developer, open a Java transformation or create a new Java
                   transformation.
             2.    Click the Java Code tab.
             3.    Click the Define Expression link.
                   The Define Expression dialog box appears.
             4.    Enter a function name.
             5.    Optionally, enter a description for the expression.
                   You can enter up to 2,000 characters.
             6.    Create the parameters for the function.



When you create the parameters, configure the parameter name, datatype, precision, and
        scale.
  7.    Click Launch Editor to create an expression with the parameters you created in step 6.
  8.    Click Validate to validate the expression.
  9.    Optionally, you can enter the expression in the Expression field and click Validate to
        validate the expression.
  10.   If you want to generate Java code using the advanced interface, select Generate advanced
        code.
  11.   Click Generate.
        The Designer generates the function to invoke the expression in the Java Expressions
        code entry tab.


Java Expression Templates
   You can generate simple or advanced Java code for an expression. The Designer generates the
   Java code for the expression according to a template.

  Simple Java Code
  The following example shows the template for a Java expression generated for simple Java
  code:
   Object function_name (Java datatype x1[, Java datatype x2 ...] )
                                                throws SDKException
   {
   return (Object)invokeJExpression( String expression,
                                            new Object [] { x1[, x2, ... ]} );
   }

   Advanced Java Code
   The following example shows the template for a Java expression generated using the advanced
   interface:
   JExpression function_name () throws SDKException
   {
         JExprParamMetadata params[] = new JExprParamMetadata[number of parameters];
         params[0] = new JExprParamMetadata (
                           EDataType.STRING,    // data type
                           20,   // precision
                           0     // scale
                           );
   ...
         params[number of parameters - 1] = new JExprParamMetadata (
                           EDataType.STRING,    // data type
                           20,   // precision
                           0     // scale
                           );
   ...
         return defineJExpression(String expression, params);
   }




Working with the Simple Interface
      Use the invokeJExpression Java API method to invoke an expression in the simple interface.


    invokeJExpression
      Invokes an expression and returns the value for the expression. Input parameters for
      invokeJExpression are a string value that represents the expression and an array of objects that
      contain the expression input parameters.
      Use the following rules and guidelines when you use invokeJExpression:
      ♦     Return datatype. The return type of invokeJExpression is an object. You must cast the
            return value of the function with the appropriate datatype. You can return values with
            Integer, Double, String, and byte[] datatypes.
      ♦     Row type. The row type for return values from invokeJExpression is INSERT. If you want
            to use a different row type for the return value, use the advanced interface. For more
            information, see “invoke” on page 279.
      ♦     Null values. If you pass a null value as a parameter or the return value for
            invokeJExpression is NULL, the value is treated as a null indicator. For example, if the
            return value of an expression is NULL and the return datatype is String, a string is
            returned with a value of null.
      ♦     Date datatype. You must convert input parameters with a Date datatype to String. To use
            the string in an expression as a Date datatype, use the to_date() function to convert the
            string to a Date datatype. Also, you must cast the return type of any expression that
            returns a Date datatype as a String.
      Use the following syntax:
                (datatype)invokeJExpression(
                                          String expression,

                                          Object[] paramMetadataArray);


           Argument             Datatype   Input/Output   Description

           expression           String     Input          String that represents the expression.

           paramMetadataArray   Object[]   Input          Array of objects that contain the input parameters for the
                                                          expression.


      The following example concatenates the two strings “John” and “Smith” and returns “John
      Smith”:
                (String)invokeJExpression("concat(x1,x2)", new Object [] { "John ",
                "Smith" });




Note: The parameters passed to the expression must be numbered consecutively and start with
             the letter x. For example, to pass three parameters to an expression, name the parameters x1,
             x2, and x3.


        Simple Interface Example
              You can define and call expressions that use the invokeJExpression API in the Helper Code or
              On Input Row code entry tabs. The following example shows how to perform a lookup on the
              NAME and ADDRESS input ports in a Java transformation and assign the return value to the
              COMPANY_NAME output port. The example assumes the mapping contains an
              unconnected Lookup transformation named my_lookup.
             Use the following code in the On Input Row code entry tab:
                      COMPANY_NAME = (String)invokeJExpression(":lkp.my_lookup(X1,X2)",
                                                               new Object [] {NAME, ADDRESS} );
                      generateRow();




Working with the Advanced Interface
       You can use the object-oriented APIs in the advanced interface to define, invoke, and get the
      result of an expression.
      The advanced interface contains the following classes and Java transformation APIs:
      ♦    EDataType class. Enumerates the datatypes for an expression. For more information, see
           “EDataType Class” on page 274.
      ♦    JExprParamMetadata class. Contains the metadata for each parameter in an expression.
           Parameter metadata includes datatype, precision, and scale. For more information, see
           “JExprParamMetadata Class” on page 274.
      ♦    defineJExpression API. Defines the expression. Includes PowerCenter expression string
           and parameters. For more information, see “defineJExpression” on page 275.
       ♦    JExpression class. Contains the methods to invoke an expression, get the metadata and the
            result of an expression, and check the return datatype. For more information, see “JExpression
           API Reference” on page 279.


    Steps to Invoke an Expression with the Advanced Interface
      Complete the following process to define, invoke, and get the result of an expression:
      1.    In the Helper Code or On Input Row code entry tab, create an instance of
            JExprParamMetadata for each parameter for the expression and set the value of the
            metadata. Optionally, you can instantiate the JExprParamMetadata object in
            defineJExpression.
      2.    Use defineJExpression to get the JExpression object for the expression.
      3.    In the appropriate code entry tab, invoke the expression with invoke.
       4.    Check whether the return value of the expression is null with isResultNull.
       5.    You can get the datatype of the return value or the metadata of the return value with
             getResultDataType and getResultMetadata.
       6.    Get the result of the expression using the appropriate API. You can use getInt, getDouble,
             getLong, getStringBuffer, and getBytes.
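       The following hedged sketch ties these steps together for a one-parameter expression with an
       Integer result. The port name NAME and the LENGTH expression are assumptions for
       illustration; a complete example appears in “Advanced Interface Example” on page 277:
             // Step 1. Metadata for one String parameter (illustrative values).
             JExprParamMetadata params[] = new JExprParamMetadata[1];
             params[0] = new JExprParamMetadata(EDataType.STRING, 20, 0);
             // Step 2. Define the expression and get the JExpression object.
             JExpression expr = defineJExpression("LENGTH(x1)", params);
             // Step 3. Invoke the expression with the value of an assumed
             // String input port named NAME.
             expr.invoke(new Object [] { NAME }, ERowType.INSERT);
             // Steps 4-6. Check for a null result, then get the Integer value.
             if(!expr.isResultNull())
             {
                   int nameLength = expr.getInt();
             }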


    Rules and Guidelines for Working with the Advanced Interface
      Use the following rules and guidelines when you work with expressions in the advanced
      interface:
      ♦    Null values. If you pass a null value as a parameter or if the result of an expression is null,
           the value is treated as a null indicator. For example, if the result of an expression is null and
           the return datatype is String, a string is returned with a value of null. You can check the
           result of an expression using isResultNull. For more information, see “isResultNull” on
           page 280.


♦    Date datatype. You must convert input parameters with a Date datatype to a String before
                  you can use them in an expression. To use the string in an expression as a Date datatype,
                  use the to_date() function to convert the string to a Date datatype. You can get the result
                   of an expression that returns a Date datatype as a String or long datatype. For more
                  information, see “getStringBuffer” on page 282 and “getLong” on page 281.


        EDataType Class
             Enumerates the Java datatypes used in expressions. You can use the EDataType class to get the
             return datatype of an expression or assign the datatype for a parameter in a
             JExprParamMetadata object. You do not need to instantiate the EDataType class.
             Table 12-1 lists the enumerated values for Java datatypes in expressions:

             Table 12-1. Enumerated Java Datatypes

                 Datatype                     Enumerated Value

                 INT                          1

                 DOUBLE                       2

                 STRING                       3

                 BYTE_ARRAY                   4

                 DATE_AS_LONG                 5


              The following example shows how to use the EDataType class to assign a datatype of String to
              a JExprParamMetadata object:
                        JExprParamMetadata params[] = new JExprParamMetadata[2];
                        params[0] = new JExprParamMetadata (
                                           EDataType.STRING,        // data type
                                           20,       // precision
                                           0       // scale
                                      );
                        ...


        JExprParamMetadata Class
              Instantiates an object that represents the parameters for an expression and sets the metadata
              for the parameters. You use an array of JExprParamMetadata objects as input to
              defineJExpression to set the metadata for the input parameters. You can create an instance of
              the JExprParamMetadata object in the Java Expressions code entry tab or in
              defineJExpression.




   Use the following syntax:
            JExprParamMetadata paramMetadataArray[] =
                                   new JExprParamMetadata[numberOfParameters];
            paramMetadataArray[0] = new JExprParamMetadata(datatype, precision, scale);
            ...
            paramMetadataArray[numberOfParameters - 1] =
                                   new JExprParamMetadata(datatype, precision, scale);


    Argument            Datatype        Input/Output   Description

    datatype            EDataType       Input          Datatype of the parameter.

    precision           Integer         Input          Precision of the parameter.

    scale               Integer         Input          Scale of the parameter.


   For example, use the following Java code to instantiate an array of two JExprParamMetadata
   objects with String datatypes, precision of 20, and scale of 0:
            JExprParamMetadata params[] = new JExprParamMetadata[2];
            params[0] = new JExprParamMetadata(EDataType.STRING, 20, 0);
            params[1] = new JExprParamMetadata(EDataType.STRING, 20, 0);
            return defineJExpression(":LKP.LKP_addresslookup(X1,X2)", params);


defineJExpression
  Defines the expression, including the expression string and input parameters. Arguments for
  defineJExpression include a JExprParamMetadata object that contains the input parameters
  and a string value that defines the expression syntax.
  To use defineJExpression, you must instantiate an array of JExprParamMetadata objects that
  represent the input parameters for the expression. You set the metadata values for the
  parameters and pass the array as an argument to defineJExpression.




Use the following syntax:
                        defineJExpression(

                              String expression,

                              Object[] paramMetadataArray
                              );


                Argument                Datatype       Input/Output   Description

                expression              String         Input          String that represents the expression.

                paramMetadataArray      Object[]       Input          Array of JExprParamMetadata objects that contain the
                                                                      input parameters for the expression.


             For example, use the following Java code to create an expression to perform a lookup on two
             strings:
                         JExprParamMetadata params[] = new JExprParamMetadata[2];
                         params[0] = new JExprParamMetadata(EDataType.STRING, 20, 0);
                         params[1] = new JExprParamMetadata(EDataType.STRING, 20, 0);
                         defineJExpression(":lkp.mylookup(x1,x2)", params);

             Note: The parameters passed to the expression must be numbered consecutively and start with
             the letter x. For example, to pass three parameters to an expression, name the parameters x1,
             x2, and x3.


        JExpression Class
             The JExpression class contains the methods to create and invoke an expression, return the
             value of an expression, and check the return datatype.
             Table 12-2 lists the JExpression API methods:

             Table 12-2. JExpression API Methods

               Method Name                        Description

               invoke                             Invokes an expression.

               getResultDataType                  Returns the datatype of the expression result.

               getResultMetadata                  Returns the metadata of the expression result.

                isResultNull                       Checks whether the result of an expression is null.

               getInt                             Returns the value of an expression result as an Integer datatype.

                getDouble                          Returns the value of an expression result as a Double datatype.

                getLong                            Returns the value of an expression result as a Long datatype.

               getStringBuffer                    Returns the value of an expression result as a String datatype.

               getBytes                           Returns the value of an expression result as a byte[] datatype.



For more information about the JExpression class, including syntax, usage, and examples, see
  “JExpression API Reference” on page 279.


Advanced Interface Example
  The following example shows how to use the advanced interface to create and invoke a lookup
  expression in a Java transformation. The Java code shows how to create a function that calls
  an expression and how to invoke the expression to get the return value. This example passes
  the values for two input ports with a String datatype, NAME and COMPANY, to the
   function addressLookup. The addressLookup function uses a lookup expression to look up the
   value for the ADDRESS output port.
  Note: This example assumes you have an unconnected lookup transformation in the mapping
  called LKP_addresslookup.
  Use the following Java code in the Helper Code tab of the Transformation Developer:
         JExpression addressLookup() throws SDKException
         {
               JExprParamMetadata params[] = new JExprParamMetadata[2];
               params[0] = new JExprParamMetadata (
                                  EDataType.STRING,         // data type
                                  50,                       // precision
                                  0                         // scale
                                  );
               params[1] = new JExprParamMetadata (
                                  EDataType.STRING,         // data type
                                  50,                       // precision
                                  0                         // scale
                                  );
               return defineJExpression(":LKP.LKP_addresslookup(X1,X2)", params);
         }

         JExpression lookup = null;
         boolean isJExprObjCreated = false;




Use the following Java code in the On Input Row tab to invoke the expression and return the
             value of the ADDRESS port:
                      ...
                      if(!isJExprObjCreated)
                      {
                            lookup = addressLookup();
                            isJExprObjCreated = true;
                      }

                      lookup.invoke(new Object [] {NAME,COMPANY}, ERowType.INSERT);
                      EDataType addressDataType = lookup.getResultDataType();

                      if(addressDataType == EDataType.STRING)
                      {
                            ADDRESS = (lookup.getStringBuffer()).toString();
                      } else {
                            logError("Expression result datatype is incorrect.");
                      }
                      ...




JExpression API Reference
      The JExpression class contains the following API methods:
      ♦    invoke
      ♦    getResultDataType
      ♦    getResultMetadata
      ♦    isResultNull
      ♦    getInt
       ♦    getDouble
       ♦    getLong
      ♦    getStringBuffer
      ♦    getBytes


    invoke
       Invokes an expression. Arguments for invoke include an object array that contains the input
       parameters and the row type. You must instantiate a JExpression object before you use
       invoke.
      You can use ERowType.INSERT, ERowType.DELETE, and ERowType.UPDATE for the row
      type.
      Use the following syntax:
                objectName.invoke(
                       new Object[] { param1[, ... paramN ]},

                       rowType

                       );


           Argument          Datatype      Input/Output   Description

           objectName        JExpression   Input          JExpression object name.

           parameters        n/a           Input          Object array that contains the input values for the
                                                          expression.

           rowType           ERowType      Input          Update strategy type. Value can be ERowType.INSERT,
                                                          ERowType.UPDATE, or ERowType.DELETE.


       For example, you create a function in the Java Expressions code entry tab named
       address_lookup() that returns a JExpression object that represents the expression. Use the
       following code to invoke the expression that uses input ports NAME and COMPANY:
                 JExpression myObject = address_lookup();
                 myObject.invoke(new Object[] { NAME,COMPANY }, ERowType.INSERT);




getResultDataType
             Returns the datatype of an expression result. getResultDataType returns a value of
             EDataType. For more information about the EDataType enumerated class, see “EDataType
             Class” on page 274.
             Use the following syntax:
                     objectName.getResultDataType();

             For example, use the following code to invoke an expression and assign the datatype of the
             result to the variable dataType:
                      myObject.invoke(new Object[] { NAME,COMPANY }, ERowType.INSERT);

                     EDataType dataType = myObject.getResultDataType();


        getResultMetadata
             Returns the metadata for an expression result. For example, you can use getResultMetadata to
             get the precision, scale, and datatype of an expression result.
              You can assign the metadata of the return value from an expression to a
              JExprParamMetadata object. Use the getScale, getPrecision, and getDataType object methods
              to retrieve the result metadata.
             Use the following syntax:
                     objectName.getResultMetadata();

             For example, use the following Java code to assign the scale, precision, and datatype of the
             return value of myObject to variables:
                     JExprParamMetadata myMetadata = myObject.getResultMetadata();

                     int scale = myMetadata.getScale();

                     int prec = myMetadata.getPrecision();
                     int datatype = myMetadata.getDataType();

             Note: The getDataType object method returns the integer value of the datatype, as enumerated
             in EDataType. For more information about the EDataType class, see “EDataType Class” on
             page 274.


        isResultNull
              Checks whether the result of an expression is null.
             Use the following syntax:
                     objectName.isResultNull();




For example, use the following Java code to invoke an expression and assign the return value
   of the expression to the variable address if the return value is not null:
          JExpression myObject = address_lookup();

          myObject.invoke(new Object[] { NAME,COMPANY }, ERowType.INSERT);

          if(!myObject.isResultNull()) {
                 String address = myObject.getStringBuffer().toString();
          }


getInt
   Returns the value of an expression result as an Integer datatype.
   Use the following syntax:
         objectName.getInt();

   For example, use the following Java code to get the result of an expression that returns an
   employee ID number as an integer, where findEmpID is a JExpression object:
         int empID = findEmpID.getInt();


getDouble
   Returns the value of an expression result as a Double datatype.
   Use the following syntax:
         objectName.getDouble();

   For example, use the following Java code to get the result of an expression that returns a salary
    value as a double, where JExprSalary is a JExpression object:
         double salary = JExprSalary.getDouble();


getLong
   Returns the value of an expression result as a Long datatype.
   You can use getLong to get the result of an expression that uses a Date datatype.
   Use the following syntax:
         objectName.getLong();

   For example, use the following Java code to get the result of an expression that returns a Date
    value as a Long datatype, where JExprCurrentDate is a JExpression object:
         long currDate = JExprCurrentDate.getLong();




getStringBuffer
             Returns the value of an expression result as a String datatype.
             Use the following syntax:
                     objectName.getStringBuffer();

              For example, use the following Java code to get the result of an expression that returns two
              concatenated strings, where JExprConcat is a JExpression object:
                      String result = JExprConcat.getStringBuffer().toString();


        getBytes
              Returns the value of an expression result as a byte[] datatype. For example, you can use
              getBytes to get the result of an expression that encrypts data with the AES_ENCRYPT
              function.
             Use the following syntax:
                     objectName.getBytes();

             For example, use the following Java code to get the result of an expression that encrypts the
             binary data using the AES_ENCRYPT function, where JExprEncryptData is an JExpression
             object:
                     byte[] newBytes = JExprEncryptData.getBytes();




Chapter 13




Joiner Transformation


    This chapter includes the following topics:
    ♦   Overview, 284
    ♦   Joiner Transformation Properties, 286
    ♦   Defining a Join Condition, 288
    ♦   Defining the Join Type, 289
    ♦   Using Sorted Input, 292
    ♦   Joining Data from a Single Source, 296
    ♦   Blocking the Source Pipelines, 299
    ♦   Working with Transactions, 300
    ♦   Creating a Joiner Transformation, 303
    ♦   Tips, 306




Overview
                     Transformation type:
                     Active
                     Connected


              Use the Joiner transformation to join source data from two related heterogeneous sources
              residing in different locations or file systems. You can also join data from the same source.
              The Joiner transformation joins sources with at least one matching column. The Joiner
              transformation uses a condition that matches one or more pairs of columns between the two
              sources.
              The two input pipelines include a master pipeline and a detail pipeline or a master and a
              detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline
              continues to the target.
               Figure 13-1 shows the master and detail pipelines in a mapping with a Joiner transformation:

               Figure 13-1. Mapping with Master and Detail Pipelines (callouts: Master Pipeline, Detail
               Pipeline)
              To join more than two sources in a mapping, join the output from the Joiner transformation
              with another source pipeline. Add Joiner transformations to the mapping until you have
              joined all the source pipelines.
              The Joiner transformation accepts input from most transformations. However, consider the
              following limitations on the pipelines you connect to the Joiner transformation:
              ♦   You cannot use a Joiner transformation when either input pipeline contains an Update
                  Strategy transformation.
              ♦   You cannot use a Joiner transformation if you connect a Sequence Generator
                  transformation directly before the Joiner transformation.


        Working with the Joiner Transformation
              When you work with the Joiner transformation, you must configure the transformation
              properties, join type, and join condition. You can configure the Joiner transformation for
              sorted input to improve Integration Service performance. You can also configure the



transformation scope to control how the Integration Service applies transformation logic. To
work with the Joiner transformation, complete the following tasks:
♦   Configure the Joiner transformation properties. Properties for the Joiner transformation
    identify the location of the cache directory, how the Integration Service processes the
    transformation, and how the Integration Service handles caching. For more information,
    see “Joiner Transformation Properties” on page 286.
♦   Configure the join condition. The join condition contains ports from both input sources
    that must match for the Integration Service to join two rows. Depending on the type of
    join selected, the Integration Service either adds the row to the result set or discards the
    row. For more information, see “Defining a Join Condition” on page 288.
♦   Configure the join type. A join is a relational operator that combines data from multiple
    tables in different databases or flat files into a single result set. You can configure the Joiner
    transformation to use a Normal, Master Outer, Detail Outer, or Full Outer join type. For
    more information, see “Defining the Join Type” on page 289.
♦   Configure the session for sorted or unsorted input. You can improve session performance
    by configuring the Joiner transformation to use sorted input. To configure a mapping to
    use sorted data, you establish and maintain a sort order in the mapping so that the
    Integration Service can use the sorted data when it processes the Joiner transformation. For
    more information about configuring the Joiner transformation for sorted input, see “Using
    Sorted Input” on page 292.
♦   Configure the transformation scope. When the Integration Service processes a Joiner
    transformation, it can apply transformation logic to all data in a transaction, all incoming
    data, or one row of data at a time. For more information about configuring how the
    Integration Service applies transformation logic, see “Working with Transactions” on
    page 300.
If you have the partitioning option in PowerCenter, you can increase the number of partitions
in a pipeline to improve session performance. For information about partitioning with the
Joiner transformation, see “Working with Partition Points” in the Workflow Administration
Guide.




Joiner Transformation Properties
              Properties for the Joiner transformation identify the location of the cache directory, how the
              Integration Service processes the transformation, and how the Integration Service handles
              caching. The properties also determine how the Integration Service joins tables and files.
              Figure 13-2 shows the Joiner transformation properties:

              Figure 13-2. Joiner Transformation Properties Tab




              When you create a mapping, you specify the properties for each Joiner transformation. When
              you create a session, you can override some properties, such as the index and data cache size
              for each transformation.
              Table 13-1 describes the Joiner transformation properties:

              Table 13-1. Joiner Transformation Properties

               Option                             Description

               Case-Sensitive String Comparison   If selected, the Integration Service uses case-sensitive string comparisons when
                                                  performing joins on string columns.

               Cache Directory                    Specifies the directory used to cache master or detail rows and the index to these
                                                  rows. By default, the cache files are created in a directory specified by the
                                                  process variable $PMCacheDir. If you override the directory, make sure the
                                                  directory exists and contains enough disk space for the cache files. The directory
                                                  can be a mapped or mounted drive.

               Join Type                          Specifies the type of join: Normal, Master Outer, Detail Outer, or Full Outer.

               Null Ordering in Master            Not applicable for this transformation type.




 Null Ordering in Detail            Not applicable for this transformation type.

 Tracing Level                      Amount of detail displayed in the session log for this transformation. The options
                                    are Terse, Normal, Verbose Data, and Verbose Initialization.

 Joiner Data Cache Size             Data cache size for the transformation. Default cache size is 2,000,000 bytes. If
                                    the total configured cache size is 2 GB or more, you must run the session on a 64-
                                    bit Integration Service. You can configure a numeric value, or you can configure
                                    the Integration Service to determine the cache size at runtime. If you configure the
                                    Integration Service to determine the cache size, you can also configure a
                                    maximum amount of memory for the Integration Service to allocate to the cache.

 Joiner Index Cache Size            Index cache size for the transformation. Default cache size is 1,000,000 bytes. If
                                    the total configured cache size is 2 GB or more, you must run the session on a 64-
                                    bit Integration Service. You can configure a numeric value, or you can configure
                                    the Integration Service to determine the cache size at runtime. If you configure the
                                    Integration Service to determine the cache size, you can also configure a
                                    maximum amount of memory for the Integration Service to allocate to the cache.

 Sorted Input                       Specifies that data is sorted. Choose Sorted Input to join sorted data. Using
                                    sorted input can improve performance. For more information about working with
                                    sorted input, see “Using Sorted Input” on page 292.

 Transformation Scope               Specifies how the Integration Service applies the transformation logic to incoming
                                    data. You can choose Transaction, All Input, or Row.
                                    For more information, see “Working with Transactions” on page 300.




Defining a Join Condition
              The join condition contains ports from both input sources that must match for the
              Integration Service to join two rows. Depending on the type of join selected, the Integration
              Service either adds the row to the result set or discards the row. The Joiner transformation
              produces result sets based on the join type, condition, and input data sources.
              Before you define a join condition, verify that the master and detail sources are configured for
              optimal performance. During a session, the Integration Service compares each row of the
              master source against the detail source. To improve performance for an unsorted Joiner
              transformation, use the source with fewer rows as the master source. To improve performance
              for a sorted Joiner transformation, use the source with fewer duplicate key values as the
              master.
              By default, when you add ports to a Joiner transformation, the ports from the first source
              pipeline display as detail sources. Adding the ports from the second source pipeline sets them
              as master sources. To change these settings, click the M column on the Ports tab for the ports
              you want to set as the master source. This sets ports from this source as master ports and ports
              from the other source as detail ports.
              You define one or more conditions based on equality between the specified master and detail
              sources. For example, if two sources with tables called EMPLOYEE_AGE and
              EMPLOYEE_POSITION both contain employee ID numbers, the following condition
              matches rows with employees listed in both sources:
                      EMP_ID1 = EMP_ID2

              Use one or more ports from the input sources of a Joiner transformation in the join
              condition. Additional ports increase the time necessary to join two sources. The order of the
              ports in the condition can impact the performance of the Joiner transformation. If you use
              multiple ports in the join condition, the Integration Service compares the ports in the order
              you specify.
              The Designer validates datatypes in a condition. Both ports in a condition must have the
              same datatype. If you need to use two ports in the condition with non-matching datatypes,
              convert the datatypes so they match.
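               For example, if the master port EMP_ID1 is an Integer and the matching detail column
               arrives as a string, you might convert the string in an upstream Expression transformation
               and use the converted port in the join condition. This is a sketch with hypothetical port
               names; TO_INTEGER is the PowerCenter conversion function:
                       EMP_ID2_INT = TO_INTEGER(EMP_ID2_STR)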
              If you join Char and Varchar datatypes, the Integration Service counts any spaces that pad
              Char values as part of the string:
                      Char(40) = "abcd"

                      Varchar(40) = "abcd"

              The Char value is “abcd” padded with 36 blank spaces, and the Integration Service does not
              join the two fields because the Char field contains trailing spaces.
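               One way to work around the padding is to trim the Char value in an upstream Expression
               transformation and join on the trimmed port. This is a sketch with hypothetical port
               names; RTRIM removes trailing spaces by default:
                       CHAR_COL_TRIMMED = RTRIM(CHAR_COL)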
              Note: The Joiner transformation does not match null values. For example, if both EMP_ID1
              and EMP_ID2 contain a row with a null value, the Integration Service does not consider
              them a match and does not join the two rows. To join rows with null values, replace null
              input with default values, and then join on the default values. For more information about
              default values, see “Using Default Values for Ports” on page 18.
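               For example, assuming -1 never occurs as a real employee ID, you might replace nulls in
               an upstream Expression transformation and join on the replaced ports. ISNULL and IIF
               are PowerCenter functions; the port names are hypothetical:
                       EMP_ID1_KEY = IIF(ISNULL(EMP_ID1), -1, EMP_ID1)
                       EMP_ID2_KEY = IIF(ISNULL(EMP_ID2), -1, EMP_ID2)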


Defining the Join Type
      In SQL, a join is a relational operator that combines data from multiple tables into a single
      result set. The Joiner transformation is similar to an SQL join except that data can originate
      from different types of sources.
      You define the join type on the Properties tab in the transformation. The Joiner
      transformation supports the following types of joins:
      ♦   Normal
      ♦   Master Outer
      ♦   Detail Outer
      ♦   Full Outer
      Note: A normal or master outer join performs faster than a full outer or detail outer join.

       If a result set includes fields that do not contain data from one of the sources, the Joiner
       transformation populates the empty fields with null values. If you know that a field will
       return a NULL and you do not want to insert NULLs in the target, you can set a default value
       on the Ports tab for the corresponding port.


    Normal Join
      With a normal join, the Integration Service discards all rows of data from the master and
      detail source that do not match, based on the condition.
      For example, you might have two sources of data for auto parts called PARTS_SIZE and
      PARTS_COLOR with the following data:
      PARTS_SIZE (master source)
      PART_ID1         DESCRIPTION              SIZE
      1                Seat Cover               Large
      2                Ash Tray                 Small
      3                Floor Mat                Medium


      PARTS_COLOR (detail source)
      PART_ID2         DESCRIPTION              COLOR
      1                Seat Cover               Blue
      3                Floor Mat                Black
      4                Fuzzy Dice               Yellow


      To join the two tables by matching the PART_IDs in both sources, you set the condition as
      follows:
             PART_ID1 = PART_ID2




When you join these tables with a normal join, the result set includes the following data:
              PART_ID       DESCRIPTION        SIZE            COLOR
              1             Seat Cover         Large           Blue
              3             Floor Mat          Medium          Black


              The following example shows the equivalent SQL statement:
                      SELECT * FROM PARTS_SIZE, PARTS_COLOR WHERE PARTS_SIZE.PART_ID1 =
                      PARTS_COLOR.PART_ID2


        Master Outer Join
              A master outer join keeps all rows of data from the detail source and the matching rows from
              the master source. It discards the unmatched rows from the master source.
              When you join the sample tables with a master outer join and the same condition, the result
              set includes the following data:
              PART_ID       DESCRIPTION        SIZE               COLOR
              1             Seat Cover         Large              Blue
              3             Floor Mat          Medium             Black
              4             Fuzzy Dice         NULL               Yellow


              Because no size is specified for the Fuzzy Dice, the Integration Service populates the field with
              a NULL.
              The following example shows the equivalent SQL statement:
                      SELECT * FROM PARTS_SIZE RIGHT OUTER JOIN PARTS_COLOR ON
                      (PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2)


        Detail Outer Join
              A detail outer join keeps all rows of data from the master source and the matching rows from
              the detail source. It discards the unmatched rows from the detail source.
              When you join the sample tables with a detail outer join and the same condition, the result
              set includes the following data:
              PART_ID       DESCRIPTION          SIZE            COLOR
              1             Seat Cover           Large           Blue
              2             Ash Tray             Small           NULL
              3             Floor Mat            Medium          Black


              Because no color is specified for the Ash Tray, the Integration Service populates the field with
              a NULL.




290   Chapter 13: Joiner Transformation
The following example shows the equivalent SQL statement:
         SELECT * FROM PARTS_SIZE LEFT OUTER JOIN PARTS_COLOR ON
         (PARTS_COLOR.PART_ID2 = PARTS_SIZE.PART_ID1)


Full Outer Join
   A full outer join keeps all rows of data from both the master and detail sources.
   When you join the sample tables with a full outer join and the same condition, the result set
   includes:
    PART_ID    DESCRIPTION          SIZE              COLOR
   1          Seat Cover           Large             Blue
   2          Ash Tray             Small             NULL
   3          Floor Mat            Medium            Black
   4          Fuzzy Dice           NULL              Yellow


   Because no color is specified for the Ash Tray and no size is specified for the Fuzzy Dice, the
   Integration Service populates the fields with NULL.
   The following example shows the equivalent SQL statement:
         SELECT * FROM PARTS_SIZE FULL OUTER JOIN PARTS_COLOR ON
         (PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2)




Using Sorted Input
              You can improve session performance by configuring the Joiner transformation to use sorted
              input. When you configure the Joiner transformation to use sorted data, the Integration
              Service improves performance by minimizing disk input and output. You see the greatest
              performance improvement when you work with large data sets.
              To configure a mapping to use sorted data, you establish and maintain a sort order in the
              mapping so the Integration Service can use the sorted data when it processes the Joiner
              transformation. Complete the following tasks to configure the mapping:
              ♦   Configure the sort order. Configure the sort order of the data you want to join. You can
                  join sorted flat files, or you can sort relational data using a Source Qualifier
                  transformation. You can also use a Sorter transformation.
              ♦   Add transformations. Use transformations that maintain the order of the sorted data.
              ♦   Configure the Joiner transformation. Configure the Joiner transformation to use sorted
                  data and configure the join condition to use the sort origin ports. The sort origin
                  represents the source of the sorted data.
              When you configure the sort order in a session, you can select a sort order associated with the
              Integration Service code page. When you run the Integration Service in Unicode mode, it
              uses the selected session sort order to sort character data. When you run the Integration
              Service in ASCII mode, it sorts all character data using a binary sort order. To ensure that data
              is sorted as the Integration Service requires, the database sort order must be the same as the
              user-defined session sort order.
              When you join sorted data from partitioned pipelines, you must configure the partitions to
              maintain the order of sorted data. For more information about joining data from partitioned
              pipelines, see “Working with Partition Points” in the Workflow Administration Guide.


        Configuring the Sort Order
              You must configure the sort order to ensure that the Integration Service passes sorted data to
              the Joiner transformation.
              Configure the sort order using one of the following methods:
               ♦   Use sorted flat files. When the flat files contain sorted data, verify that the order of the
                   sort columns matches in each source file.
               ♦   Use sorted relational data. Use sorted ports in the Source Qualifier transformation to sort
                   columns from the source database. Configure the order of the sorted ports the same in
                   each Source Qualifier transformation. A sketch of the query this generates appears after
                   this list.
                   For more information about using sorted ports, see “Using Sorted Ports” on page 472.
              ♦   Use Sorter transformations. Use a Sorter transformation to sort relational or flat file data.
                  Place a Sorter transformation in the master and detail pipelines. Configure each Sorter
                  transformation to use the same order of the sort key ports and the sort order direction.




                   For more information about using the Sorter transformation, see “Creating a Sorter
                   Transformation” on page 443.
   If you pass unsorted or incorrectly sorted data to a Joiner transformation configured to use
   sorted data, the session fails and the Integration Service logs the error in the session log file.
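    To illustrate the sorted relational data option, when you use sorted ports in a Source
    Qualifier transformation, the Integration Service adds the selected ports to an ORDER BY
    clause in the query it generates. A rough sketch of the resulting SQL, assuming a
    hypothetical ITEMS table sorted on the ports used later in the join condition:
           SELECT ITEM_NO, ITEM_NAME, PRICE
           FROM ITEMS
           ORDER BY ITEM_NO, ITEM_NAME, PRICE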


Adding Transformations to the Mapping
   When you add transformations between the sort origin and the Joiner transformation, use the
   following guidelines to maintain sorted data:
   ♦   Do not place any of the following transformations between the sort origin and the Joiner
       transformation:
       −   Custom
       −   Unsorted Aggregator
       −   Normalizer
       −   Rank
        −   Union
        −   XML Parser
        −   XML Generator
       −   Mapplet, if it contains one of the above transformations
   ♦   You can place a sorted Aggregator transformation between the sort origin and the Joiner
       transformation if you use the following guidelines:
       −   Configure the Aggregator transformation for sorted input using the guidelines in “Using
           Sorted Input” on page 45.
       −   Use the same ports for the group by columns in the Aggregator transformation as the
           ports at the sort origin.
       −   The group by ports must be in the same order as the ports at the sort origin.
   ♦   When you join the result set of a Joiner transformation with another pipeline, verify that
       the data output from the first Joiner transformation is sorted.
   Tip: You can place the Joiner transformation directly after the sort origin to maintain sorted
   data.


Configuring the Joiner Transformation
   To configure the Joiner transformation, complete the following tasks:
   ♦   Enable Sorted Input on the Properties tab.
   ♦   Define the join condition to receive sorted data in the same order as the sort origin.




Defining the Join Condition
              Configure the join condition to maintain the sort order established at the sort origin: the
              sorted flat file, the Source Qualifier transformation, or the Sorter transformation. If you use a
              sorted Aggregator transformation between the sort origin and the Joiner transformation, treat
              the sorted Aggregator transformation as the sort origin when you define the join condition.
              Use the following guidelines when you define join conditions:
              ♦    The ports you use in the join condition must match the ports at the sort origin.
              ♦    When you configure multiple join conditions, the ports in the first join condition must
                   match the first ports at the sort origin.
              ♦    When you configure multiple conditions, the order of the conditions must match the
                   order of the ports at the sort origin, and you must not skip any ports.
               ♦    The number of sorted ports in the sort origin can be greater than or equal to the number
                    of ports in the join condition.

              Example of a Join Condition
              For example, you configure Sorter transformations in the master and detail pipelines with the
              following sorted ports:
              1.    ITEM_NO
              2.    ITEM_NAME
              3.    PRICE
              When you configure the join condition, use the following guidelines to maintain sort order:
              ♦    You must use ITEM_NO in the first join condition.
              ♦    If you add a second join condition, you must use ITEM_NAME.
              ♦    If you want to use PRICE in a join condition, you must also use ITEM_NAME in the
                   second join condition.
              If you skip ITEM_NAME and join on ITEM_NO and PRICE, you lose the sort order and
              the Integration Service fails the session.




Figure 13-3 shows a mapping configured to sort and join on the ports ITEM_NO,
ITEM_NAME, and PRICE:

Figure 13-3. Mapping Configured to Join Data from Two Pipelines




(The master and detail Sorter transformations sort on the same ports in the same order.)




When you use the Joiner transformation to join the master and detail pipelines, you can
configure any one of the following join conditions:
       ITEM_NO = ITEM_NO1

or
       ITEM_NO = ITEM_NO1

       ITEM_NAME = ITEM_NAME1

or
       ITEM_NO = ITEM_NO1
       ITEM_NAME = ITEM_NAME1

       PRICE = PRICE1




Joining Data from a Single Source
               Join data from the same source when you want to perform a calculation on part of the data
               and join the transformed data with the original data. When you join the data using this
               method, you can maintain the original data and transform parts of that data within one
               mapping. You can join data from the same source in the following ways:
              ♦   Join two branches of the same pipeline.
              ♦   Join two instances of the same source.


        Joining Two Branches of the Same Pipeline
              When you join data from the same source, you can create two branches of the pipeline. When
              you branch a pipeline, you must add a transformation between the source qualifier and the
              Joiner transformation in at least one branch of the pipeline. You must join sorted data and
              configure the Joiner transformation for sorted input.
              For example, you have a source with the following ports:
              ♦   Employee
              ♦   Department
              ♦   Total Sales
              In the target, you want to view the employees who generated sales that were greater than the
              average sales for their departments. To do this, you create a mapping with the following
              transformations:
              ♦   Sorter transformation. Sorts the data.
               ♦   Sorted Aggregator transformation. Averages the sales data and groups it by department.
                  When you perform this aggregation, you lose the data for individual employees. To
                  maintain employee data, you must pass a branch of the pipeline to the Aggregator
                  transformation and pass a branch with the same data to the Joiner transformation to
                  maintain the original data. When you join both branches of the pipeline, you join the
                  aggregated data with the original data.
              ♦   Sorted Joiner transformation. Uses a sorted Joiner transformation to join the sorted
                  aggregated data with the original data.
               ♦   Filter transformation. Compares the average sales data against the sales data for each
                   employee and filters out employees with below-average sales. A SQL analogy of this
                   logic follows the list.
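               The following SQL is a rough analogy of what this mapping computes, assuming a
               hypothetical SALES source table; in the mapping, the Aggregator, Joiner, and Filter
               transformations perform the aggregation, join, and comparison instead of the database:
                       SELECT s.EMPLOYEE, s.DEPARTMENT, s.TOTAL_SALES
                       FROM SALES s,
                            (SELECT DEPARTMENT, AVG(TOTAL_SALES) AS AVG_SALES
                             FROM SALES
                             GROUP BY DEPARTMENT) d
                       WHERE s.DEPARTMENT = d.DEPARTMENT
                       AND s.TOTAL_SALES > d.AVG_SALES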




Figure 13-4 shows joining two branches of the same pipeline:

  Figure 13-4. Mapping that Joins Two Branches of a Pipeline

  (The source pipeline splits into Pipeline Branch 1 and Pipeline Branch 2. The sorted Joiner
  transformation joins the branches, and a Filter transformation filters out employees with
  below-average sales.)


  Note: You can also join data from output groups of the same transformation, such as the
  Custom transformation or XML Source Qualifier transformation. Place a Sorter
  transformation between each output group and the Joiner transformation and configure the
  Joiner transformation to receive sorted input.
  Joining two branches might impact performance if the Joiner transformation receives data
  from one branch much later than the other branch. The Joiner transformation caches all the
  data from the first branch, and writes the cache to disk if the cache fills. The Joiner
  transformation must then read the data from disk when it receives the data from the second
  branch. This can slow processing.


Joining Two Instances of the Same Source
  You can also join same source data by creating a second instance of the source. After you
  create the second source instance, you can join the pipelines from the two source instances. If
  you want to join unsorted data, you must create two instances of the same source and join the
  pipelines.
  Figure 13-5 shows two instances of the same source joined using a Joiner transformation:

  Figure 13-5. Mapping that Joins Two Instances of the Same Source



   (Source Instance 1 and Source Instance 2 connect to a single Joiner transformation.)




  Note: When you join data using this method, the Integration Service reads the source data for
  each source instance, so performance can be slower than joining two branches of a pipeline.

Guidelines
              Use the following guidelines when deciding whether to join branches of a pipeline or join two
              instances of a source:
              ♦   Join two branches of a pipeline when you have a large source or if you can read the source
                  data only once. For example, you can only read source data from a message queue once.
              ♦   Join two branches of a pipeline when you use sorted data. If the source data is unsorted
                  and you use a Sorter transformation to sort the data, branch the pipeline after you sort the
                  data.
              ♦   Join two instances of a source when you need to add a blocking transformation to the
                  pipeline between the source and the Joiner transformation.
              ♦   Join two instances of a source if one pipeline may process slower than the other pipeline.
              ♦   Join two instances of a source if you need to join unsorted data.




Blocking the Source Pipelines
      When you run a session with a Joiner transformation, the Integration Service blocks and
      unblocks the source data, based on the mapping configuration and whether you configure the
      Joiner transformation for sorted input.
      For more information about blocking source data, see “Integration Service Architecture” in
      the Administrator Guide.


    Unsorted Joiner Transformation
      When the Integration Service processes an unsorted Joiner transformation, it reads all master
      rows before it reads the detail rows. To ensure it reads all master rows before the detail rows,
      the Integration Service blocks the detail source while it caches rows from the master source.
      Once the Integration Service reads and caches all master rows, it unblocks the detail source
      and reads the detail rows.
      Some mappings with unsorted Joiner transformations violate data flow validation. For more
      information about mappings containing blocking transformations that violate data flow
      validation, see “Mappings” in the Designer Guide.


    Sorted Joiner Transformation
      When the Integration Service processes a sorted Joiner transformation, it blocks data based on
      the mapping configuration. Blocking logic is possible if master and detail input to the Joiner
      transformation originate from different sources.
      The Integration Service uses blocking logic to process the Joiner transformation if it can do so
      without blocking all sources in a target load order group simultaneously. Otherwise, it does
      not use blocking logic. Instead, it stores more rows in the cache.
      When the Integration Service can use blocking logic to process the Joiner transformation, it
      stores fewer rows in the cache, increasing performance.

      Caching Master Rows
      When the Integration Service processes a Joiner transformation, it reads rows from both
      sources concurrently and builds the index and data cache based on the master rows. The
      Integration Service then performs the join based on the detail source data and the cache data.
      The number of rows the Integration Service stores in the cache depends on the partitioning
      scheme, the source data, and whether you configure the Joiner transformation for sorted
      input. To improve performance for an unsorted Joiner transformation, use the source with
      fewer rows as the master source. To improve performance for a sorted Joiner transformation,
      use the source with fewer duplicate key values as the master. For more information about
      Joiner transformation caches, see “Session Caches” in the Workflow Administration Guide.




Working with Transactions
              When the Integration Service processes a Joiner transformation, it can apply transformation
              logic to all data in a transaction, all incoming data, or one row of data at a time. The
              Integration Service can drop or preserve transaction boundaries depending on the mapping
              configuration and the transformation scope. You configure how the Integration Service
              applies transformation logic and handles transaction boundaries using the transformation
              scope property.
              You configure transformation scope values based on the mapping configuration and whether
              you want to preserve or drop transaction boundaries.
              You can preserve transaction boundaries when you join the following sources:
              ♦     You join two branches of the same source pipeline. Use the Transaction transformation
                    scope to preserve transaction boundaries. For information about preserving transaction
                    boundaries for a single source, see “Preserving Transaction Boundaries for a Single
                    Pipeline” on page 301.
              ♦     You join two sources, and you want to preserve transaction boundaries for the detail
                    source. Use the Row transformation scope to preserve transaction boundaries in the detail
                    pipeline. For more information about preserving transaction boundaries for the detail
                    source, see “Preserving Transaction Boundaries in the Detail Pipeline” on page 301.
              You can drop transaction boundaries when you join the following sources:
              ♦     You join two sources or two branches and you want to drop transaction boundaries. Use
                    the All Input transformation scope to apply the transformation logic to all incoming data
                    and drop transaction boundaries for both pipelines. For more information about dropping
                    transaction boundaries for two pipelines, see “Dropping Transaction Boundaries for Two
                    Pipelines” on page 302.
              Table 13-2 summarizes how to preserve transaction boundaries using transformation scopes
              with the Joiner transformation:

              Table 13-2. Integration Service Behavior with Transformation Scopes for the
              Joiner Transformation

                  Transformation Scope            Input Type            Integration Service Behavior

                  Row                             Unsorted              Preserves transaction boundaries in the detail pipeline.

                                                  Sorted                Session fails.

                  *Transaction                    Sorted                Preserves transaction boundaries when master and detail
                                                                        originate from the same transaction generator. Session fails
                                                                        when master and detail do not originate from the same
                                                                        transaction generator.

                                                  Unsorted              Session fails.

                  *All Input                      Sorted, Unsorted      Drops transaction boundaries.
                  *Sessions fail if you use real-time data with All Input or Transaction transformation scopes.




For more information about transformation scope and transaction boundaries, see
   “Understanding Commit Points” in the Workflow Administration Guide.


Preserving Transaction Boundaries for a Single Pipeline
   When you join data from the same source, use the Transaction transformation scope to
   preserve incoming transaction boundaries for a single pipeline. Use the Transaction
   transformation scope when the Joiner transformation joins data from the same source, either
   two branches of the same pipeline or two output groups of one transaction generator. Use this
   transformation scope with sorted data and any join type.
   When you use the Transaction transformation scope, verify that master and detail pipelines
   originate from the same transaction control point and that you use sorted input. For example,
   in Figure 13-6 the Sorter transformation is the transaction control point. You cannot place
   another transaction control point between the Sorter transformation and the Joiner
   transformation.
   Figure 13-6 shows a mapping configured to join two branches of a pipeline and preserve
   transaction boundaries:

   Figure 13-6. Preserving Transaction Boundaries when You Join Two Pipeline Branches




   (Master and detail pipeline branches originate from the same transaction. The Integration
   Service joins the pipeline branches and preserves transaction boundaries.)



Preserving Transaction Boundaries in the Detail Pipeline
   When you want to preserve the transaction boundaries in the detail pipeline, choose the Row
   transformation scope. The Row transformation scope allows the Integration Service to process
   data one row at a time. The Integration Service caches the master data and matches the detail
   data with the cached master data.
   When the source data originates from a real-time source, such as IBM MQ Series, the
   Integration Service matches the cached master data with each message as it is read from the
   detail source.
   Use the Row transformation scope with Normal and Master Outer join types that use
   unsorted data.




Dropping Transaction Boundaries for Two Pipelines
              When you want to join data from two sources or two branches and you do not need to
              preserve transaction boundaries, use the All Input transformation scope. When you use All
              Input, the Integration Service drops incoming transaction boundaries for both pipelines and
              outputs all rows from the transformation as an open transaction. At the Joiner
              transformation, the data from the master pipeline can be cached or joined concurrently,
              depending on how you configure the sort order. Use this transformation scope with sorted
              and unsorted data and any join type.
              For more information about configuring the sort order, see “Joiner Transformation
              Properties” on page 286.




Creating a Joiner Transformation
       To use a Joiner transformation, add a Joiner transformation to the mapping, set up the input
       sources, and configure the transformation with a join condition, a join type, and the sort type.

      To create a Joiner Transformation:

      1.   In the Mapping Designer, click Transformation > Create. Select the Joiner
           transformation. Enter a name, and click OK.
           The naming convention for Joiner transformations is JNR_TransformationName. Enter a
           description for the transformation.
           The Designer creates the Joiner transformation.
      2.   Drag all the input/output ports from the first source into the Joiner transformation.
           The Designer creates input/output ports for the source fields in the Joiner transformation
           as detail fields by default. You can edit this property later.
      3.   Select and drag all the input/output ports from the second source into the Joiner
           transformation.
            The Designer configures the second set of source fields as master fields by default.
      4.   Double-click the title bar of the Joiner transformation to open the transformation.
      5.   Click the Ports tab.




      6.   Click any box in the M column to switch the master/detail relationship for the sources.




                   Tip: To improve performance for an unsorted Joiner transformation, use the source with
                   fewer rows as the master source. To improve performance for a sorted Joiner
                   transformation, use the source with fewer duplicate key values as the master.
              7.   Add default values for specific ports.
                   Some ports are likely to contain null values, since the fields in one of the sources may be
                   empty. You can specify a default value if the target database does not handle NULLs.
              8.   Click the Condition tab and set the join condition.




       9.   Click the Add button to add a condition. You can add multiple conditions. The master
            and detail ports must have matching datatypes. The Joiner transformation supports only
            equality (=) join conditions.
                   For more information about defining the join condition, see “Defining a Join
                   Condition” on page 288.




10.   Click the Properties tab and configure properties for the transformation.




       Note: You can edit the join condition from the Condition tab. The keyword AND
       separates multiple conditions (see the example after this procedure).
      For more information about defining the properties, see “Joiner Transformation
      Properties” on page 286.
11.   Click OK.
12.   Click the Metadata Extensions tab to configure metadata extensions.
      For information about working with metadata extensions, see “Metadata Extensions” in
      the Repository Guide.
13.   Click Repository > Save to save changes to the mapping.
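      For example, a join condition with two conditions might appear as follows, assuming the
      detail ports were renamed with a 1 suffix when you dragged them into the transformation:
             ITEM_NO = ITEM_NO1 AND ITEM_NAME = ITEM_NAME1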




Tips
              The following tips can help improve session performance.

              Perform joins in a database when possible.
              Performing a join in a database is faster than performing a join in the session. In some cases,
              this is not possible, such as joining tables from two different databases or flat file systems. If
              you want to perform a join in a database, use the following options:
              ♦   Create a pre-session stored procedure to join the tables in a database.
               ♦   Use the Source Qualifier transformation to perform the join, as in the sketch below. For
                   more information, see “Joining Source Data” on page 454.
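               For example, a user-defined join in the Source Qualifier transformation might look like
               the following sketch, which reuses the PARTS_SIZE and PARTS_COLOR tables from
               this chapter and assumes both reside in the same database:
                       SELECT PARTS_SIZE.PART_ID1, PARTS_SIZE.DESCRIPTION, PARTS_SIZE.SIZE,
                              PARTS_COLOR.COLOR
                       FROM PARTS_SIZE, PARTS_COLOR
                       WHERE PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2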

              Join sorted data when possible.
              You can improve session performance by configuring the Joiner transformation to use sorted
              input. When you configure the Joiner transformation to use sorted data, the Integration
              Service improves performance by minimizing disk input and output. You see the greatest
              performance improvement when you work with large data sets. For more information, see
              “Using Sorted Input” on page 292.

              For an unsorted Joiner transformation, designate the source with fewer rows as the master
              source.
               For optimal performance and disk storage, designate the source with fewer rows as the
              master source. During a session, the Joiner transformation compares each row of the master
              source against the detail source. The fewer unique rows in the master, the fewer iterations of
              the join comparison occur, which speeds the join process.

              For a sorted Joiner transformation, designate the source with fewer duplicate key values as
              the master source.
              For optimal performance and disk storage, designate the source with fewer duplicate key
              values as the master source. When the Integration Service processes a sorted Joiner
              transformation, it caches rows for one hundred keys at a time. If the master source contains
              many rows with the same key value, the Integration Service must cache more rows, and
              performance can be slowed.




Chapter 14




Lookup Transformation


   This chapter includes the following topics:
   ♦   Overview, 308
   ♦   Connected and Unconnected Lookups, 309
   ♦   Relational and Flat File Lookups, 311
   ♦   Lookup Components, 313
   ♦   Lookup Properties, 316
   ♦   Lookup Query, 324
   ♦   Lookup Condition, 328
   ♦   Lookup Caches, 330
   ♦   Configuring Unconnected Lookup Transformations, 331
   ♦   Creating a Lookup Transformation, 335
   ♦   Tips, 336




Overview
                    Transformation type:
                    Passive
                    Connected/Unconnected


             Use a Lookup transformation in a mapping to look up data in a flat file or a relational table,
             view, or synonym. You can import a lookup definition from any flat file or relational database
             to which both the PowerCenter Client and Integration Service can connect. Use multiple
             Lookup transformations in a mapping.
             The Integration Service queries the lookup source based on the lookup ports in the
             transformation. It compares Lookup transformation port values to lookup source column
             values based on the lookup condition. Pass the result of the lookup to other transformations
             and a target.
             Use the Lookup transformation to perform many tasks, including:
             ♦   Get a related value. For example, the source includes employee ID, but you want to
                 include the employee name in the target table to make the summary data easier to read.
             ♦   Perform a calculation. Many normalized tables include values used in a calculation, such
                 as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
             ♦   Update slowly changing dimension tables. Use a Lookup transformation to determine
                 whether rows already exist in the target.
             You can configure the Lookup transformation to complete the following types of lookups:
             ♦   Connected or unconnected. Connected and unconnected transformations receive input
                 and send output in different ways.
             ♦   Relational or flat file lookup. When you create a Lookup transformation, you can choose
                 to perform a lookup on a flat file or a relational table.
                 When you create a Lookup transformation using a relational table as the lookup source,
                 you can connect to the lookup source using ODBC and import the table definition as the
                 structure for the Lookup transformation.
                 When you create a Lookup transformation using a flat file as a lookup source, the Designer
                 invokes the Flat File Wizard. For more information about using the Flat File Wizard, see
                 “Working with Flat Files” in the Designer Guide.
             ♦   Cached or uncached. Sometimes you can improve session performance by caching the
                 lookup table. If you cache the lookup, you can choose to use a dynamic or static cache. By
                 default, the lookup cache remains static and does not change during the session. With a
                 dynamic cache, the Integration Service inserts or updates rows in the cache during the
                 session. When you cache the target table as the lookup, you can look up values in the
                 target and insert them if they do not exist, or update them if they do.




Connected and Unconnected Lookups
      You can configure a connected Lookup transformation to receive input directly from the
      mapping pipeline, or you can configure an unconnected Lookup transformation to receive
      input from the result of an expression in another transformation.
      Table 14-1 lists the differences between connected and unconnected lookups:

      Table 14-1. Differences Between Connected and Unconnected Lookups

       Connected Lookup                                            Unconnected Lookup

       Receives input values directly from the pipeline.           Receives input values from the result of a :LKP expression
                                                                   in another transformation.

       Use a dynamic or static cache.                              Use a static cache.

       Cache includes all lookup columns used in the mapping       Cache includes all lookup/output ports in the lookup
       (that is, lookup source columns included in the lookup      condition and the lookup/return port.
       condition and lookup source columns linked as output
       ports to other transformations).

       Can return multiple columns from the same row or insert     Designate one return port (R). Returns one column from
       into the dynamic lookup cache.                              each row.

       If there is no match for the lookup condition, the          If there is no match for the lookup condition, the Integration
       Integration Service returns the default value for all       Service returns NULL.
       output ports. If you configure dynamic caching, the
       Integration Service inserts rows into the cache or leaves
       it unchanged.

       If there is a match for the lookup condition, the           If there is a match for the lookup condition, the Integration
       Integration Service returns the result of the lookup        Service returns the result of the lookup condition into the
       condition for all lookup/output ports. If you configure     return port.
       dynamic caching, the Integration Service either updates
       the row in the cache or leaves the row unchanged.

       Pass multiple output values to another transformation.      Pass one output value to another transformation. The
       Link lookup/output ports to another transformation.         lookup/output/return port passes the value to the
                                                                    transformation calling the :LKP expression.

       Supports user-defined default values.                       Does not support user-defined default values.



    Connected Lookup Transformation
      The following steps describe how the Integration Service processes a connected Lookup
      transformation:
      1.   A connected Lookup transformation receives input values directly from another
           transformation in the pipeline.
      2.   For each input row, the Integration Service queries the lookup source or cache based on
           the lookup ports and the condition in the transformation.
      3.   If the transformation is uncached or uses a static cache, the Integration Service returns
           values from the lookup query.

                    If the transformation uses a dynamic cache, the Integration Service inserts the row into
                    the cache when it does not find the row in the cache. When the Integration Service finds
                    the row in the cache, it updates the row in the cache or leaves it unchanged. It flags the
                    row as insert, update, or no change.
             4.    The Integration Service passes return values from the query to the next transformation.
                   If the transformation uses a dynamic cache, you can pass rows to a Filter or Router
                   transformation to filter new rows to the target.
             Note: This chapter discusses connected Lookup transformations unless otherwise specified.


        Unconnected Lookup Transformation
             An unconnected Lookup transformation receives input values from the result of a :LKP
             expression in another transformation. You can call the Lookup transformation more than
             once in a mapping.
             A common use for unconnected Lookup transformations is to update slowly changing
             dimension tables. For more information about slowly changing dimension tables, visit the
             Informatica Knowledge Base at http://my.informatica.com.
             The following steps describe the way the Integration Service processes an unconnected
             Lookup transformation:
             1.    An unconnected Lookup transformation receives input values from the result of a :LKP
                   expression in another transformation, such as an Update Strategy transformation.
             2.    The Integration Service queries the lookup source or cache based on the lookup ports and
                   condition in the transformation.
             3.    The Integration Service returns one value into the return port of the Lookup
                   transformation.
             4.    The Lookup transformation passes the return value into the :LKP expression.
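              For example, an expression in an Update Strategy or Expression transformation might
              call a hypothetical unconnected Lookup transformation named LKP_GET_EMP_NAME,
              passing one value for each of its input ports and receiving the value of the return port:
                      :LKP.LKP_GET_EMP_NAME(EMP_ID)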
             For more information about unconnected Lookup transformations, see “Configuring
             Unconnected Lookup Transformations” on page 331.




Relational and Flat File Lookups
       When you create a Lookup transformation, you can choose to use a relational table or a flat
       file for the lookup source.


    Relational Lookups
       When you create a Lookup transformation using a relational table as a lookup source, you can
       connect to the lookup source using ODBC and import the table definition as the structure for
       the Lookup transformation.
       You can override the default SQL statement to add a WHERE clause or to query multiple
       tables.
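        For example, an override that filters the lookup rows might look like the following sketch.
        The table and column names are hypothetical, and the column aliases match the lookup
        port names, as in the default query the Integration Service generates:
               SELECT EMPLOYEES.EMP_ID AS EMP_ID, EMPLOYEES.EMP_NAME AS EMP_NAME
               FROM EMPLOYEES
               WHERE EMPLOYEES.STATUS = 'ACTIVE'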


    Flat File Lookups
        When you use a flat file as a lookup source, you can use any flat file definition in the
        repository, or you can import one. When you import a flat file lookup source, the Designer
        invokes the Flat File Wizard.
       Use the following options with flat file lookups only:
       ♦   Use indirect files as lookup sources by specifying a file list as the lookup file name.
       ♦   Use sorted input for the lookup.
        ♦   Sort null data high or low. With relational lookups, null ordering is based on the
            database support.
       ♦   Use case-sensitive string comparison with flat file lookups. With relational lookups, the
           case-sensitive comparison is based on the database support.

       Using Sorted Input
       When you configure a flat file Lookup transformation for sorted input, the condition
       columns must be grouped. If the condition columns are not grouped, the Integration Service
       cannot cache the lookup and fails the session. For best caching performance, sort the
       condition columns.
       For example, a Lookup transformation has the following condition:
              OrderID = OrderID1

              CustID = CustID1

       In the following flat file lookup source, the keys are grouped, but not sorted. The Integration
       Service can cache the data, but performance may not be optimal.
       OrderID    CustID    ItemNo.    ItemDesc               Comments
       1001       CA502     F895S      Flashlight             Key data is grouped, but not sorted.
                                                              CustID is out of order within OrderID.

       1001       CA501     C530S      Compass



             1001         CA501      T552T     Tent
             1005         OK503      S104E     Safety Knife        Key data is grouped, but not sorted.
                                                                   OrderID is out of order.

             1003         CA500      F304T     First Aid Kit
             1003         TN601      R938M     Regulator System


             The keys are not grouped in the following flat file lookup source. The Integration Service
             cannot cache the data and fails the session.
             OrderID      CustID     ItemNo.   ItemDesc           Comments
             1001         CA501      T552T     Tent
             1001         CA501      C530S     Compass
             1005         OK503      S104E     Safety Knife
             1003         TN601      R938M     Regulator System
             1003         CA500      F304T     First Aid Kit
              1001         CA502      F895S     Flashlight         Key data for OrderID is not grouped.


             If you choose sorted input for indirect files, the range of data must not overlap in the files.




Lookup Components
     Define the following components when you configure a Lookup transformation in a
     mapping:
     ♦   Lookup source
     ♦   Ports
     ♦   Properties
     ♦   Condition
     ♦   Metadata extensions


   Lookup Source
     Use a flat file or a relational table for a lookup source. When you create a Lookup
     transformation, you can import the lookup source from the following locations:
     ♦   Any relational source or target definition in the repository
     ♦   Any flat file source or target definition in the repository
     ♦   Any table or file that both the Integration Service and PowerCenter Client machine can
         connect to
     The lookup table can be a single table, or you can join multiple tables in the same database
     using a lookup SQL override. The Integration Service queries the lookup table or an in-
     memory cache of the table for all incoming rows into the Lookup transformation.
     The Integration Service can connect to a lookup table using a native database driver or an
     ODBC driver. However, the native database drivers improve session performance.

     Indexes and a Lookup Table
     If you have privileges to modify the database containing a lookup table, you can improve
     lookup initialization time by adding an index to the lookup table. This is important for very
     large lookup tables. Since the Integration Service needs to query, sort, and compare values in
     these columns, the index needs to include every column used in a lookup condition.
     You can improve performance by indexing the following types of lookup:
     ♦   Cached lookups. You can improve performance by indexing the columns in the lookup
         ORDER BY. The session log contains the ORDER BY clause.
      ♦   Uncached lookups. Because the Integration Service issues a SELECT statement for each
          row passing into the Lookup transformation, you can improve performance by indexing
          the columns in the lookup condition, as in the example below.
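      For example, if the lookup condition compares the EMP_ID and DEPT_ID columns, a
      composite index covering those columns might be created as follows; the table and index
      names are hypothetical:
             CREATE INDEX IDX_EMP_LOOKUP ON EMPLOYEES (EMP_ID, DEPT_ID)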


   Lookup Ports
     The Ports tab contains options similar to other transformations, such as port name, datatype,
     and scale. In addition to input and output ports, the Lookup transformation includes a


lookup port type that represents columns of data in the lookup source. An unconnected
             Lookup transformation also includes a return port type that represents the return value.
             Table 14-2 describes the port types in a Lookup transformation:

     Table 14-2. Lookup Transformation Port Types

     ♦   I (input port). Connected and unconnected lookups; minimum of 1. Create an input
         port for each lookup port you want to use in the lookup condition. You must have at
         least one input or input/output port in each Lookup transformation.
     ♦   O (output port). Connected and unconnected lookups; minimum of 1. Create an output
         port for each lookup port you want to link to another transformation. You can designate
         both input and lookup ports as output ports. For connected lookups, you must have at
         least one output port. For unconnected lookups, use a lookup/output port as a return
         port (R) to designate a return value.
     ♦   L (lookup port). Connected and unconnected lookups; minimum of 1. The Designer
         designates each column in the lookup source as a lookup (L) and output port (O).
     ♦   R (return port). Unconnected lookups only; exactly 1. Designates the column of data
         you want to return based on the lookup condition. You can designate one lookup/output
         port as the return port.

             The Lookup transformation also enables an associated ports property that you configure when
             you use a dynamic cache.
             Use the following guidelines to configure lookup ports:
     ♦       If you delete lookup ports from a flat file lookup, the session fails.
             ♦       You can delete lookup ports from a relational lookup if you are certain the mapping does
                     not use the lookup port. This reduces the amount of memory the Integration Service uses
                     to run the session.
             ♦       To ensure datatypes match when you add an input port, copy the existing lookup ports.


        Lookup Properties
             On the Properties tab, you can configure properties, such as an SQL override for relational
             lookups, the lookup source name, and tracing level for the transformation. You can also
             configure caching properties on the Properties tab.
             For more information about lookup properties, see “Lookup Properties” on page 316.




Lookup Condition
  On the Condition tab, you can enter the condition or conditions you want the Integration
   Service to use to determine whether input data matches values in the lookup source or cache.
  For more information about the lookup condition, see “Lookup Condition” on page 328.


Metadata Extensions
  You can extend the metadata stored in the repository by associating information with
  repository objects, such as Lookup transformations. For example, when you create a Lookup
  transformation, you may want to store your name and the creation date with the Lookup
  transformation. You associate information with repository metadata using metadata
  extensions. For more information, see “Metadata Extensions” in the Repository Guide.




Lookup Properties
             Properties for the Lookup transformation identify the database source, how the Integration
             Service processes the transformation, and how it handles caching and multiple matches.
             When you create a mapping, you specify the properties for each Lookup transformation.
             When you create a session, you can override some properties, such as the index and data cache
             size, for each transformation in the session properties.
             Table 14-3 describes the Lookup transformation properties:

              Table 14-3. Lookup Transformation Properties

              ♦   Lookup SQL Override (Relational). Overrides the default SQL statement to query the
                  lookup table. Specifies the SQL statement you want the Integration Service to use for
                  querying lookup values. Use only with the lookup cache enabled. For more information,
                  see “Lookup Query” on page 324.
              ♦   Lookup Table Name (Relational). Specifies the name of the table from which the
                  transformation looks up and caches values. You can import a table, view, or synonym
                  from another database by selecting the Import button on the dialog box that appears
                  when you first create a Lookup transformation. If you enter a lookup SQL override, you
                  do not need to add an entry for this option.
              ♦   Lookup Caching Enabled (Flat File, Relational). Indicates whether the Integration
                  Service caches lookup values during the session. When you enable lookup caching, the
                  Integration Service queries the lookup source once, caches the values, and looks up
                  values in the cache during the session. This can improve session performance. When you
                  disable caching, each time a row passes into the transformation, the Integration Service
                  issues a SELECT statement to the lookup source for lookup values.
                  Note: The Integration Service always caches flat file lookups.
              ♦   Lookup Policy on Multiple Match (Flat File, Relational). Determines what happens
                  when the Lookup transformation finds multiple rows that match the lookup condition.
                  You can select the first or last row returned from the cache or lookup source, report an
                  error, or allow the Lookup transformation to use any value. When you configure the
                  Lookup transformation to return any matching value, the transformation returns the
                  first value that matches the lookup condition and creates an index based on the key ports
                  rather than all Lookup transformation ports. If you do not enable the Output Old Value
                  On Update option, the Lookup Policy On Multiple Match option is set to Report Error
                  for dynamic lookups. For more information about lookup caches, see “Lookup Caches”
                  on page 337.
              ♦   Lookup Condition (Flat File, Relational). Displays the lookup condition you set on the
                  Condition tab.
              ♦   Connection Information (Relational). Specifies the database containing the lookup
                  table. You can select the database connection or use the $Source or $Target variable. If
                  you use one of these variables, the lookup table must reside in the source or target
                  database you specify when you configure the session. If you select the database
                  connection, you can also specify the type of database connection: type Application:
                  before the connection name if it is an Application connection, or Relational: before the
                  connection name if it is a relational connection. If you do not specify the type of
                  database connection, the Integration Service fails the session if it cannot determine the
                  type of database connection. For more information about using $Source and $Target, see
                  “Configuring Relational Lookups in a Session” on page 322.
              ♦   Source Type (Flat File, Relational). Indicates whether the Lookup transformation reads
                  values from a relational database or a flat file.
              ♦   Tracing Level (Flat File, Relational). Sets the amount of detail included in the session
                  log when you run a session containing this transformation.
              ♦   Lookup Cache Directory Name (Flat File, Relational). Specifies the directory used to
                  build the lookup cache files when you configure the Lookup transformation to cache the
                  lookup source. Also used to save the persistent lookup cache files when you select the
                  Lookup Cache Persistent option. By default, the Integration Service uses the
                  $PMCacheDir directory configured for the Integration Service.
              ♦   Lookup Cache Persistent (Flat File, Relational). Indicates whether the Integration
                  Service uses a persistent lookup cache, which consists of at least two cache files. If a
                  Lookup transformation is configured for a persistent lookup cache and persistent lookup
                  cache files do not exist, the Integration Service creates the files during the session. Use
                  only with the lookup cache enabled.
              ♦   Lookup Data Cache Size (Flat File, Relational). Indicates the maximum size the
                  Integration Service allocates to the data cache in memory. You can configure a numeric
                  value, or you can configure the Integration Service to determine the cache size at
                  runtime. If you configure the Integration Service to determine the cache size, you can
                  also configure a maximum amount of memory for the Integration Service to allocate to
                  the cache. If the Integration Service cannot allocate the configured amount of memory
                  when initializing the session, it fails the session. When the Integration Service cannot
                  store all the data cache data in memory, it pages to disk. The default is 2,000,000 bytes;
                  the minimum is 1,024 bytes. If the total configured session cache size is 2 GB
                  (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration
                  Service. Use only with the lookup cache enabled.
              ♦   Lookup Index Cache Size (Flat File, Relational). Indicates the maximum size the
                  Integration Service allocates to the index cache in memory. You can configure a numeric
                  value, or you can configure the Integration Service to determine the cache size at
                  runtime. If you configure the Integration Service to determine the cache size, you can
                  also configure a maximum amount of memory for the Integration Service to allocate to
                  the cache. If the Integration Service cannot allocate the configured amount of memory
                  when initializing the session, it fails the session. When the Integration Service cannot
                  store all the index cache data in memory, it pages to disk. The default is 1,000,000
                  bytes; the minimum is 1,024 bytes. If the total configured session cache size is 2 GB
                  (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration
                  Service. Use only with the lookup cache enabled.
              ♦   Dynamic Lookup Cache (Flat File, Relational). Indicates that the transformation uses a
                  dynamic lookup cache. The Integration Service inserts or updates rows in the lookup
                  cache as it passes rows to the target table. Use only with the lookup cache enabled.
              ♦   Output Old Value On Update (Flat File, Relational). Use only with dynamic caching
                  enabled. When you enable this property, the Integration Service outputs old values from
                  the lookup/output ports. When the Integration Service updates a row in the cache, it
                  outputs the value that existed in the lookup cache before it updated the row based on
                  the input data. When the Integration Service inserts a new row in the cache, it outputs
                  null values. When you disable this property, the Integration Service outputs the same
                  values from the lookup/output and input/output ports. This property is enabled by
                  default.
              ♦   Cache File Name Prefix (Flat File, Relational). Use only with a persistent lookup cache.
                  Specifies the file name prefix to use with persistent lookup cache files. The Integration
                  Service uses the file name prefix as the file name for the persistent cache files it saves to
                  disk. Enter only the prefix; do not enter .idx or .dat. You can enter a parameter or
                  variable for the file name prefix. Use any parameter or variable type that you can define
                  in the parameter file. For information about using parameter files, see “Parameter Files”
                  in the Workflow Administration Guide. If the named persistent cache files exist, the
                  Integration Service builds the memory cache from the files. If the named persistent
                  cache files do not exist, the Integration Service rebuilds the persistent cache files.
              ♦   Recache From Lookup Source (Flat File, Relational). Use only with the lookup cache
                  enabled. When selected, the Integration Service rebuilds the lookup cache from the
                  lookup source when it first calls the Lookup transformation instance. If you use a
                  persistent lookup cache, it rebuilds the persistent cache files before using the cache. If
                  you do not use a persistent lookup cache, it rebuilds the lookup cache in memory before
                  using the cache.
              ♦   Insert Else Update (Flat File, Relational). Use only with dynamic caching enabled.
                  Applies to rows entering the Lookup transformation with the row type of insert. When
                  you select this property and the row type entering the Lookup transformation is insert,
                  the Integration Service inserts the row into the cache if it is new, and updates the row if
                  it exists. If you do not select this property, the Integration Service only inserts new rows
                  into the cache when the row type entering the Lookup transformation is insert. For more
                  information about defining the row type, see “Using Update Strategy Transformations
                  with a Dynamic Cache” on page 354.
              ♦   Update Else Insert (Flat File, Relational). Use only with dynamic caching enabled.
                  Applies to rows entering the Lookup transformation with the row type of update. When
                  you select this property and the row type entering the Lookup transformation is update,
                  the Integration Service updates the row in the cache if it exists, and inserts the row if it
                  is new. If you do not select this property, the Integration Service only updates existing
                  rows in the cache when the row type entering the Lookup transformation is update. For
                  more information about defining the row type, see “Using Update Strategy
                  Transformations with a Dynamic Cache” on page 354.
              ♦   Datetime Format (Flat File). If you do not define a datetime format for a particular
                  field in the lookup definition or on the Ports tab, the Integration Service uses the
                  properties defined here. You can enter any datetime format. Default is MM/DD/YYYY
                  HH24:MI:SS.
              ♦   Thousand Separator (Flat File). If you do not define a thousand separator for a
                  particular field in the lookup definition or on the Ports tab, the Integration Service uses
                  the properties defined here. You can choose no separator, a comma, or a period. Default
                  is no separator.
              ♦   Decimal Separator (Flat File). If you do not define a decimal separator for a particular
                  field in the lookup definition or on the Ports tab, the Integration Service uses the
                  properties defined here. You can choose a comma or a period as the decimal separator.
                  Default is period.
              ♦   Case-Sensitive String Comparison (Flat File). If selected, the Integration Service uses
                  case-sensitive string comparisons when performing lookups on string columns.
                  Note: For relational lookups, case sensitivity in comparisons depends on the database.
              ♦   Null Ordering (Flat File). Determines how the Integration Service orders null values.
                  You can choose to sort null values high or low. By default, the Integration Service sorts
                  null values high. This overrides the Integration Service configuration to treat nulls in
                  comparison operators as high, low, or null.
                  Note: For relational lookups, null ordering depends on the database.
              ♦   Sorted Input (Flat File). Indicates whether the lookup file data is sorted. Enabling this
                  option increases lookup performance for file lookups. If you enable sorted input and the
                  condition columns are not grouped, the Integration Service fails the session. If the
                  condition columns are grouped but not sorted, the Integration Service processes the
                  lookup as if you did not configure sorted input. For more information about sorted
                  input, see “Flat File Lookups” on page 311.



        Configuring Lookup Properties in a Session
             When you configure a session, you can configure lookup properties that are unique to
             sessions:
             ♦     Flat file lookups. Configure location information, such as the file directory, file name, and
                   the file type.
             ♦     Relational lookups. You can define $Source and $Target variables in the session
                   properties. You can also override connection information to use the session parameter
                   $DBConnection.




Configuring Flat File Lookups in a Session
Figure 14-1 shows the session properties for a flat file lookup:

Figure 14-1. Session Properties for Flat File Lookups
Table 14-4 describes the session properties you configure for flat file lookups:

              Table 14-4. Session Properties for Flat File Lookups

              ♦   Lookup Source File Directory. Enter the directory name. By default, the Integration
                  Service looks in the process variable directory, $PMLookupFileDir, for lookup files.
                  You can enter the full path and file name. If you specify both the directory and file
                  name in the Lookup Source Filename field, clear this field. The Integration Service
                  concatenates this field with the Lookup Source Filename field when it runs the session.
                  You can also use the $InputFileName session parameter to specify the file name.
                  For more information about session parameters, see “Working with Sessions” in the
                  Workflow Administration Guide.
              ♦   Lookup Source Filename. Name of the lookup file. If you use an indirect file, specify
                  the name of the indirect file you want the Integration Service to read. You can also use
                  the lookup file parameter, $LookupFileName, to change the name of the lookup file a
                  session uses. If you specify both the directory and file name in the Lookup Source File
                  Directory field, clear this field. The Integration Service concatenates this field with the
                  Lookup Source File Directory field when it runs the session. For example, if you have
                  “C:\lookup_data” in the Lookup Source File Directory field, enter “filename.txt” in the
                  Lookup Source Filename field. When the Integration Service begins the session, it looks
                  for “C:\lookup_data\filename.txt”. For more information, see “Working with Sessions”
                  in the Workflow Administration Guide.
              ♦   Lookup Source Filetype. Indicates whether the lookup source file contains the source
                  data or a list of files with the same file properties. Choose Direct if the lookup source
                  file contains the source data. Choose Indirect if the lookup source file contains a list of
                  files. When you select Indirect, the Integration Service creates one cache for all files. If
                  you use sorted input with indirect files, verify that the ranges of data in the files do not
                  overlap. If the ranges overlap, the Integration Service processes the lookup as if you did
                  not configure sorted input.



             Configuring Relational Lookups in a Session
             When you configure a session, you specify the connection for the lookup database in the
             Connection node on the Mapping tab (Transformation view). You have the following options
             to specify a connection:
             ♦     Choose any relational connection.
             ♦     Use the connection variable, $DBConnection.
             ♦     Specify a database connection for $Source or $Target information.
             If you use $Source or $Target for the lookup connection, configure the $Source Connection
             Value and $Target Connection Value in the session properties. This ensures that the
             Integration Service uses the correct database connection for the variable when it runs the
             session.


If you use $Source or $Target and you do not specify a Connection Value in the session
properties, the Integration Service determines the database connection to use when it runs the
session. It uses a source or target database connection for the source or target in the pipeline
that contains the Lookup transformation. If it cannot determine which database connection
to use, it fails the session.
The following list describes how the Integration Service determines the value of $Source or
$Target when you do not specify $Source Connection Value or $Target Connection Value in
the session properties:
♦   When you use $Source and the pipeline contains one source, the Integration Service uses
    the database connection you specify for the source.
♦   When you use $Source and the pipeline contains multiple sources joined by a Joiner
    transformation, the Integration Service uses different database connections, depending on
    the location of the Lookup transformation in the pipeline:
    −   When the Lookup transformation is after the Joiner transformation, the Integration
        Service uses the database connection for the detail table.
    −   When the Lookup transformation is before the Joiner transformation, the Integration
        Service uses the database connection for the source connected to the Lookup
        transformation.
♦   When you use $Target and the pipeline contains one target, the Integration Service uses
    the database connection you specify for the target.
♦   When you use $Target and the pipeline contains multiple relational targets, the session
    fails.
♦   When you use $Source or $Target in an unconnected Lookup transformation, the session
    fails.




Lookup Query
             The Integration Service queries the lookup based on the ports and properties you configure in
             the Lookup transformation. The Integration Service runs a default SQL statement when the
             first row enters the Lookup transformation. If you use a relational lookup, you can customize
             the default query with the Lookup SQL Override property.


        Default Lookup Query
             The default lookup query contains the following statements:
             ♦   SELECT. The SELECT statement includes all the lookup ports in the mapping. You can
                 view the SELECT statement by generating SQL using the Lookup SQL Override property.
                  Do not add or delete any columns from the default SQL statement. A sketch of a
                  generated query follows this list.
             ♦   ORDER BY. The ORDER BY clause orders the columns in the same order they appear in
                 the Lookup transformation. The Integration Service generates the ORDER BY clause. You
                 cannot view this when you generate the default SQL using the Lookup SQL Override
                 property.
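             As a sketch, for a Lookup transformation with lookup ports ITEM_ID, ITEM_NAME, and
             PRICE on a hypothetical ITEMS_DIM table, the default statement the Integration Service
             issues resembles the following:
                    SELECT ITEMS_DIM.ITEM_ID, ITEMS_DIM.ITEM_NAME, ITEMS_DIM.PRICE FROM
                    ITEMS_DIM ORDER BY ITEMS_DIM.ITEM_ID, ITEMS_DIM.ITEM_NAME,
                    ITEMS_DIM.PRICE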


        Overriding the Lookup Query
             The lookup SQL override is similar to entering a custom query in a Source Qualifier
             transformation. You can override the lookup query for a relational lookup. You can enter the
             entire override, or you can generate and edit the default SQL statement. When the Designer
             generates the default SQL statement for the lookup SQL override, it includes the lookup/
             output ports in the lookup condition and the lookup/return port.
             Override the lookup query in the following circumstances:
             ♦   Override the ORDER BY clause. Create the ORDER BY clause with fewer columns to
                 increase performance. When you override the ORDER BY clause, you must suppress the
                 generated ORDER BY clause with a comment notation. For more information, see
                 “Overriding the ORDER BY Clause” on page 325.
                 Note: If you use pushdown optimization, you cannot override the ORDER BY clause or
                 suppress the generated ORDER BY clause with a comment notation.
             ♦   A lookup table name or column names contains a reserved word. If the table name or any
                 column name in the lookup query contains a reserved word, you must ensure that all
                 reserved words are enclosed in quotes. For more information, see “Reserved Words” on
                 page 326.
             ♦   Use parameters and variables. Use parameters and variables when you enter a lookup SQL
                 override. Use any parameter or variable type that you can define in the parameter file. You
                 can enter a parameter or variable within the SQL statement, or you can use a parameter or
                 variable as the SQL query. For example, you can use a session parameter,
                 $ParamMyLkpOverride, as the lookup SQL query, and set $ParamMyLkpOverride to the
                 SQL statement in a parameter file.



The Designer cannot expand parameters and variables in the query override and does not
    validate it when you use a parameter or variable. The Integration Service expands the
    parameters and variables when you run the session. For more information about using
    mapping parameters and variables in expressions, see “Mapping Parameters and Variables”
    in the Designer Guide. For more information about parameter files, see “Parameter Files” in
    the Workflow Administration Guide.
♦   A lookup column name contains a slash (/) character. When generating the default
    lookup query, the Designer and Integration Service replace any slash character (/) in the
    lookup column name with an underscore character. To query lookup column names
    containing the slash character, override the default lookup query, replace the underscore
    characters with the slash character, and enclose the column name in double quotes.
♦   Add a WHERE clause. Use a lookup SQL override to add a WHERE clause to the
    default SQL statement. You might want to use this to reduce the number of rows included
    in the cache (see the example after this list). When you add a WHERE clause to a Lookup transformation using a
    dynamic cache, use a Filter transformation before the Lookup transformation. This
    ensures the Integration Service only inserts rows into the dynamic cache and target table
    that match the WHERE clause. For more information, see “Using the WHERE Clause
    with a Dynamic Cache” on page 358.
    Note: The session fails if you include large object ports in a WHERE clause.
♦   Other. Use a lookup SQL override if you want to query lookup data from multiple
    lookups or if you want to modify the data queried from the lookup table before the
    Integration Service caches the lookup rows. For example, use TO_CHAR to convert dates
    to strings.
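For example, the following sketch adds a WHERE clause to cache only active rows and uses
TO_CHAR to convert a date column to a string. The ORDERS table and its column names
are assumptions for illustration only:
       SELECT ORDERS.ORDER_ID, ORDERS.STATUS,
       TO_CHAR(ORDERS.ORDER_DATE, 'MM/DD/YYYY')
       FROM ORDERS
       WHERE ORDERS.STATUS = 'ACTIVE'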

Overriding the ORDER BY Clause
By default, the Integration Service generates an ORDER BY clause for a cached lookup. The
ORDER BY clause contains all lookup ports. To increase performance, you can suppress the
default ORDER BY clause and enter an override ORDER BY with fewer columns.
Note: If you use pushdown optimization, you cannot override the ORDER BY clause or
suppress the generated ORDER BY clause with a comment notation.
The Integration Service always generates an ORDER BY clause, even if you enter one in the
override. Place two dashes ‘--’ after the ORDER BY override to suppress the generated
ORDER BY clause. For example, a Lookup transformation uses the following lookup
condition:
       ITEM_ID = IN_ITEM_ID
       PRICE <= IN_PRICE

The Lookup transformation includes three lookup ports used in the mapping, ITEM_ID,
ITEM_NAME, and PRICE. When you enter the ORDER BY clause, enter the columns in
the same order as the ports in the lookup condition. You must also enclose all database
reserved words in quotes. Enter the following lookup query in the lookup SQL override:
       SELECT ITEMS_DIM.ITEM_NAME, ITEMS_DIM.PRICE, ITEMS_DIM.ITEM_ID FROM
       ITEMS_DIM ORDER BY ITEMS_DIM.ITEM_ID, ITEMS_DIM.PRICE --




To override the default ORDER BY clause for a relational lookup, complete the following
             steps:
             1.    Generate the lookup query in the Lookup transformation.
             2.    Enter an ORDER BY clause that contains the condition ports in the same order they
                   appear in the Lookup condition.
             3.    Place two dashes ‘--’ as a comment notation after the ORDER BY clause to suppress the
                   ORDER BY clause that the Integration Service generates.
                   If you override the lookup query with an ORDER BY clause without adding comment
                   notation, the lookup fails.
              Note: Sybase has a 16-column ORDER BY limitation. If the Lookup transformation has more
             than 16 lookup/output ports (including the ports in the lookup condition), you might want
             to override the ORDER BY clause or use multiple Lookup transformations to query the
             lookup table.

             Reserved Words
             If any lookup name or column name contains a database reserved word, such as MONTH or
             YEAR, the session fails with database errors when the Integration Service executes SQL
             against the database. You can create and maintain a reserved words file, reswords.txt, in the
             Integration Service installation directory. When the Integration Service initializes a session, it
             searches for reswords.txt. If the file exists, the Integration Service places quotes around
             matching reserved words when it executes SQL against the database.
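              For example, if a hypothetical lookup table ORDER_SUMMARY contains columns named
              MONTH and YEAR, and reswords.txt lists those words, the SQL the Integration Service
              executes resembles the following sketch:
                      SELECT ORDER_SUMMARY."MONTH", ORDER_SUMMARY."YEAR",
                      ORDER_SUMMARY.TOTAL_SALES FROM ORDER_SUMMARY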
             You may need to enable some databases, such as Microsoft SQL Server and Sybase, to use
             SQL-92 standards regarding quoted identifiers. Use connection environment SQL to issue
             the command. For example, with Microsoft SQL Server, use the following command:
                     SET QUOTED_IDENTIFIER ON

              Note: The Integration Service uses reswords.txt when it executes SQL against source and
              target databases as well as lookup databases. For more information about reswords.txt, see
              “Working with Targets” in the Workflow Administration Guide.

             Guidelines to Overriding the Lookup Query
             Use the following guidelines when you override the lookup SQL query:
             ♦    You can only override the lookup SQL query for relational lookups.
             ♦    Configure the Lookup transformation for caching. If you do not enable caching, the
                  Integration Service does not recognize the override.
             ♦    Generate the default query, and then configure the override. This helps ensure that all the
                  lookup/output ports are included in the query. If you add or subtract ports from the
                  SELECT statement, the session fails.




♦    Use a Filter transformation before a Lookup transformation using a dynamic cache when
     you add a WHERE clause to the lookup SQL override. This ensures the Integration
     Service only inserts rows in the dynamic cache and target table that match the WHERE
     clause. For more information, see “Using the WHERE Clause with a Dynamic Cache” on
     page 358.
♦    If you want to share the cache, use the same lookup SQL override for each Lookup
     transformation.
♦    If you override the ORDER BY clause, the session fails if the ORDER BY clause does not
     contain the condition ports in the same order they appear in the Lookup condition or if
     you do not suppress the generated ORDER BY clause with the comment notation.
♦    If you use pushdown optimization, you cannot override the ORDER BY clause or suppress
     the generated ORDER BY clause with comment notation.
♦    If the table name or any column name in the lookup query contains a reserved word, you
     must enclose all reserved words in quotes.

Steps to Overriding the Lookup Query
Use the following steps to override the default lookup SQL query.

To override the default lookup query:

1.    On the Properties tab, open the SQL Editor from within the Lookup SQL Override field.
2.    Click Generate SQL to generate the default SELECT statement. Enter the lookup SQL
      override.
3.    Connect to a database, and then click Validate to test the lookup SQL override.
4.    Click OK to return to the Properties tab.




Lookup Condition
             The Integration Service uses the lookup condition to test incoming values. It is similar to the
             WHERE clause in an SQL query. When you configure a lookup condition for the
             transformation, you compare transformation input values with values in the lookup source or
             cache, represented by lookup ports. When you run a workflow, the Integration Service queries
             the lookup source or cache for all incoming values based on the condition.
             You must enter a lookup condition in all Lookup transformations. Some guidelines for the
             lookup condition apply for all Lookup transformations, and some guidelines vary depending
             on how you configure the transformation.
             Use the following guidelines when you enter a condition for a Lookup transformation:
             ♦   The datatypes in a condition must match.
              ♦   Use one input port for each lookup port used in the condition. You can use the same
                  input port in more than one condition in a transformation.
             ♦   When you enter multiple conditions, the Integration Service evaluates each condition as
                 an AND, not an OR. The Integration Service returns only rows that match all the
                 conditions you specify.
             ♦   The Integration Service matches null values. For example, if an input lookup condition
                 column is NULL, the Integration Service evaluates the NULL equal to a NULL in the
                 lookup.
             ♦   If you configure a flat file lookup for sorted input, the Integration Service fails the session
                 if the condition columns are not grouped. If the columns are grouped, but not sorted, the
                 Integration Service processes the lookup as if you did not configure sorted input. For more
                 information about sorted input, see “Flat File Lookups” on page 311.
             The lookup condition guidelines and the way the Integration Service processes matches can
             vary, depending on whether you configure the transformation for a dynamic cache or an
             uncached or static cache. For more information about lookup caches, see “Lookup Caches”
             on page 337.


        Uncached or Static Cache
             Use the following guidelines when you configure a Lookup transformation without a cache or
             to use a static cache:
             ♦   Use the following operators when you create the lookup condition:
                     =, >, <, >=, <=, !=

                 Tip: If you include more than one lookup condition, place the conditions with an equal
                 sign first to optimize lookup performance. For example, create the following lookup
                 condition:
                     ITEM_ID = IN_ITEM_ID
                     PRICE <= IN_PRICE

             ♦   The input value must meet all conditions for the lookup to return a value.


The condition can match equivalent values or supply a threshold condition. For example, you
  might look for customers who do not live in California, or employees whose salary is greater
  than $30,000. Depending on the nature of the source and condition, the lookup might return
  multiple values.
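   For example, a Lookup transformation on a hypothetical customer table might use the
   condition STATE != IN_STATE to find customers outside a given state, and a Lookup
   transformation on a hypothetical employee table might use the following threshold
   condition:
          SALARY > IN_SALARY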


Dynamic Cache
  If you configure a Lookup transformation to use a dynamic cache, you can only use the
  equality operator (=) in the lookup condition.


Handling Multiple Matches
  Lookups find a value based on the conditions you set in the Lookup transformation. If the
  lookup condition is not based on a unique key, or if the lookup source is denormalized, the
  Integration Service might find multiple matches in the lookup source or cache.
  You can configure a Lookup transformation to handle multiple matches in the following
  ways:
  ♦   Return the first matching value, or return the last matching value. You can configure the
      transformation to return the first matching value or the last matching value. The first and
      last values are the first value and last value found in the lookup cache that match the
      lookup condition. When you cache the lookup source, the Integration Service generates an
      ORDER BY clause for each column in the lookup cache to determine the first and last row
      in the cache. The Integration Service then sorts each lookup source column in ascending
      order.
      The Integration Service sorts numeric columns in ascending numeric order (such as 0 to
      10), date/time columns from January to December and from the first of the month to the
      end of the month, and string columns based on the sort order configured for the session.
  ♦   Return any matching value. You can configure the Lookup transformation to return any
      value that matches the lookup condition. When you configure the Lookup transformation
      to return any matching value, the transformation returns the first value that matches the
      lookup condition. It creates an index based on the key ports rather than all Lookup
      transformation ports. When you use any matching value, performance can improve
      because the process of indexing rows is simplified.
  ♦   Return an error. When the Lookup transformation uses a static cache or no cache, the
      Integration Service marks the row as an error, writes the row to the session log by default,
      and increases the error count by one. When the Lookup transformation uses a dynamic
      cache, the Integration Service fails the session when it encounters multiple matches either
      while caching the lookup table or looking up values in the cache that contain duplicate
      keys. Also, if you configure the Lookup transformation to output old values on updates,
      the Lookup transformation returns an error when it encounters multiple matches.




Lookup Caches
             You can configure a Lookup transformation to cache the lookup file or table. The Integration
             Service builds a cache in memory when it processes the first row of data in a cached Lookup
             transformation. It allocates memory for the cache based on the amount you configure in the
             transformation or session properties. The Integration Service stores condition values in the
             index cache and output values in the data cache. The Integration Service queries the cache for
             each row that enters the transformation.
             The Integration Service also creates cache files by default in the $PMCacheDir. If the data
             does not fit in the memory cache, the Integration Service stores the overflow values in the
             cache files. When the session completes, the Integration Service releases cache memory and
             deletes the cache files unless you configure the Lookup transformation to use a persistent
             cache.
             When configuring a lookup cache, you can specify any of the following options:
             ♦   Persistent cache
             ♦   Recache from lookup source
             ♦   Static cache
             ♦   Dynamic cache
             ♦   Shared cache
             Note: You can use a dynamic cache for relational or flat file lookups.

             For more information about working with lookup caches, see “Lookup Caches” on page 337.




Configuring Unconnected Lookup Transformations
       An unconnected Lookup transformation is separate from the pipeline in the mapping. You
       write an expression using the :LKP reference qualifier to call the lookup within another
       transformation. Some common uses for unconnected lookups include:
       ♦    Testing the results of a lookup in an expression
        ♦    Filtering rows based on the lookup results (see the example after this list)
       ♦    Marking rows for update based on the result of a lookup, such as updating slowly changing
            dimension tables
       ♦    Calling the same lookup multiple times in one mapping
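        For example, a Filter transformation might keep only rows whose key exists in the lookup
        source, using a hypothetical unconnected lookup named lkpCUSTOMERS (and a
        hypothetical CUST_ID port) as its filter condition:
               NOT ISNULL(:LKP.lkpCUSTOMERS(CUST_ID))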
       Complete the following steps when you configure an unconnected Lookup transformation:
       1.       Add input ports.
       2.       Add the lookup condition.
       3.       Designate a return value.
       4.       Call the lookup from another transformation.


    Step 1. Add Input Ports
       Create an input port for each argument in the :LKP expression. For each lookup condition
       you plan to create, you need to add an input port to the Lookup transformation. You can
       create a different port for each condition, or use the same input port in more than one
       condition.
       For example, a retail store increased prices across all departments during the last month. The
       accounting department only wants to load rows into the target for items with increased prices.
       To accomplish this, complete the following tasks:
       ♦    Create a lookup condition that compares the ITEM_ID in the source with the ITEM_ID
            in the target.
       ♦    Compare the PRICE for each item in the source with the price in the target table.
            −   If the item exists in the target table and the item price in the source is less than or equal
                to the price in the target table, you want to delete the row.
            −   If the price in the source is greater than the item price in the target table, you want to
                update the row.




♦   Create an input port (IN_ITEM_ID) with datatype Decimal (37,0) to match the
                 ITEM_ID and an IN_PRICE input port with Decimal (10,2) to match the PRICE lookup
                 port.




        Step 2. Add the Lookup Condition
             After you correctly configure the ports, define a lookup condition to compare transformation
             input values with values in the lookup source or cache. To increase performance, add
             conditions with an equal sign first.
             In this case, add the following lookup condition:
                     ITEM_ID = IN_ITEM_ID
                     PRICE <= IN_PRICE

             If the item exists in the mapping source and lookup source and the mapping source price is
             less than or equal to the lookup price, the condition is true and the lookup returns the values
             designated by the Return port. If the lookup condition is false, the lookup returns NULL.
             Therefore, when you write the update strategy expression, use ISNULL nested in an IIF to
             test for null values.


        Step 3. Designate a Return Value
             With unconnected Lookups, you can pass multiple input values into the transformation, but
             only one column of data out of the transformation. Designate one lookup/output port as a
             return port. The Integration Service can return one value from the lookup query. Use the
             return port to specify the return value. If you call the unconnected lookup from an update
             strategy or filter expression, you are generally checking for null values. In this case, the return
             port can be anything. If, however, you call the lookup from an expression performing a
             calculation, the return value needs to be the value you want to include in the calculation.



To continue the update strategy example, you can define the ITEM_ID port as the return port.
  The update strategy expression checks for null values returned. If the lookup condition is
  true, the Integration Service returns the ITEM_ID. If the condition is false, the Integration
  Service returns NULL.
  Figure 14-2 shows a return port in a Lookup transformation:

   Figure 14-2. Return Port in a Lookup Transformation
Step 4. Call the Lookup Through an Expression
  You supply input values for an unconnected Lookup transformation from a :LKP expression
  in another transformation. The arguments are local input ports that match the Lookup
  transformation input ports used in the lookup condition. Use the following syntax for a :LKP
  expression:
         :LKP.lookup_transformation_name(argument, argument, ...)

  To continue the example about the retail store, when you write the update strategy expression,
  the order of ports in the expression must match the order in the lookup condition. In this
  case, the ITEM_ID condition is the first lookup condition, and therefore, it is the first
  argument in the update strategy expression.
         IIF(ISNULL(:LKP.lkpITEMS_DIM(ITEM_ID, PRICE)), DD_UPDATE, DD_REJECT)

  Use the following guidelines to write an expression that calls an unconnected Lookup
  transformation:
  ♦   The order in which you list each argument must match the order of the lookup conditions
      in the Lookup transformation.
  ♦   The datatypes for the ports in the expression must match the datatypes for the input ports
      in the Lookup transformation. The Designer does not validate the expression if the
      datatypes do not match.


♦   If one port in the lookup condition is not a lookup/output port, the Designer does not
                 validate the expression.
             ♦   The arguments (ports) in the expression must be in the same order as the input ports in
                 the lookup condition.
             ♦   If you use incorrect :LKP syntax, the Designer marks the mapping invalid.
             ♦   If you call a connected Lookup transformation in a :LKP expression, the Designer marks
                 the mapping invalid.
             Tip: Avoid syntax errors when you enter expressions by using the point-and-click method to
             select functions and ports.




Creating a Lookup Transformation
      The following steps summarize the process of creating a Lookup transformation.

      To create a Lookup transformation:

      1.   In the Mapping Designer, click Transformation > Create. Select the Lookup
           transformation. Enter a name for the transformation. The naming convention for
           Lookup transformations is LKP_TransformationName. Click OK.
      2.   In the Select Lookup Table dialog box, you can choose the following options:
           ♦   Choose an existing table or file definition.
           ♦   Choose to import a definition from a relational table or file.
           ♦   Click Skip to create a definition manually.





      3.   Define input ports for each lookup condition you want to define.
      4.   For an unconnected Lookup transformation, create a return port for the value you want
           to return from the lookup.
      5.   Define output ports for the values you want to pass to another transformation.
      6.   For Lookup transformations that use a dynamic lookup cache, associate an input port or
           sequence ID with each lookup port.
      7.   Add the lookup conditions. If you include more than one condition, place the conditions
           that use equal signs first to optimize lookup performance.
           For information about lookup conditions, see “Lookup Condition” on page 328.
      8.   On the Properties tab, set the properties for the Lookup transformation, and click OK.
           For a list of properties, see “Lookup Properties” on page 316.
      9.   For unconnected Lookup transformations, write an expression in another transformation
           using :LKP to call the unconnected Lookup transformation.


Tips
             Use the following tips when you configure the Lookup transformation:

             Add an index to the columns used in a lookup condition.
             If you have privileges to modify the database containing a lookup table, you can improve
             performance for both cached and uncached lookups. This is important for very large lookup
             tables. Since the Integration Service needs to query, sort, and compare values in these
             columns, the index needs to include every column used in a lookup condition.
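              For example, if a lookup condition uses hypothetical ITEM_ID and PRICE columns in an
              ITEMS_DIM table, an index that covers both columns might look like the following; the
              exact syntax depends on the database:
                     CREATE INDEX IDX_ITEMS_DIM_LKP ON ITEMS_DIM (ITEM_ID, PRICE);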

             Place conditions with an equality operator (=) first.
             If a Lookup transformation specifies several conditions, you can improve lookup performance
             by placing all the conditions that use the equality operator first in the list of conditions that
             appear under the Condition tab.
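              For example, if a Lookup transformation uses hypothetical conditions on ITEM_ID,
              PRICE, and QUANTITY ports, order them so the equality comparisons come first:
                     ITEM_ID = IN_ITEM_ID
                     PRICE = IN_PRICE
                     QUANTITY >= IN_QUANTITY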

             Cache small lookup tables.
             Improve session performance by caching small lookup tables. The result of the lookup query
             and processing is the same, whether or not you cache the lookup table.

             Join tables in the database.
             If the lookup table is on the same database as the source table in the mapping and caching is
             not feasible, join the tables in the source database rather than using a Lookup transformation.
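              For example, instead of using a Lookup transformation to return item names, you might
              join the source and lookup tables in a Source Qualifier SQL override; the table and
              column names here are illustrative:
                     SELECT ORDERS.ORDER_ID, ORDERS.ITEM_ID, ITEMS_DIM.ITEM_NAME
                     FROM ORDERS, ITEMS_DIM
                     WHERE ORDERS.ITEM_ID = ITEMS_DIM.ITEM_ID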

             Use a persistent lookup cache for static lookups.
             If the lookup source does not change between sessions, configure the Lookup transformation
             to use a persistent lookup cache. The Integration Service then saves and reuses cache files
             from session to session, eliminating the time required to read the lookup source.

             Call unconnected Lookup transformations with the :LKP reference qualifier.
             When you write an expression using the :LKP reference qualifier, you call unconnected
             Lookup transformations only. If you try to call a connected Lookup transformation, the
             Designer displays an error and marks the mapping invalid.




Chapter 15




Lookup Caches


   This chapter includes the following topics:
   ♦   Overview, 338
   ♦   Building Connected Lookup Caches, 340
   ♦   Using a Persistent Lookup Cache, 342
   ♦   Working with an Uncached Lookup or Static Cache, 344
   ♦   Working with a Dynamic Lookup Cache, 345
   ♦   Sharing the Lookup Cache, 363
   ♦   Lookup Cache Tips, 369




Overview
             You can configure a Lookup transformation to cache the lookup table. The Integration
             Service builds a cache in memory when it processes the first row of data in a cached Lookup
             transformation. It allocates memory for the cache based on the amount you configure in the
             transformation or session properties. The Integration Service stores condition values in the
             index cache and output values in the data cache. The Integration Service queries the cache for
             each row that enters the transformation.
             The Integration Service also creates cache files by default in the $PMCacheDir. If the data
             does not fit in the memory cache, the Integration Service stores the overflow values in the
             cache files. When the session completes, the Integration Service releases cache memory and
             deletes the cache files unless you configure the Lookup transformation to use a persistent
             cache.
             If you use a flat file lookup, the Integration Service always caches the lookup source. If you
             configure a flat file lookup for sorted input, the Integration Service cannot cache the lookup if
             the condition columns are not grouped. If the columns are grouped, but not sorted, the
             Integration Service processes the lookup as if you did not configure sorted input. For more
             information, see “Flat File Lookups” on page 311.
             When you configure a lookup cache, you can configure the following cache settings:
             ♦   Building caches. You can configure the session to build caches sequentially or
                 concurrently. When you build sequential caches, the Integration Service creates caches as
                 the source rows enter the Lookup transformation. When you configure the session to build
                 concurrent caches, the Integration Service does not wait for the first row to enter the
                 Lookup transformation before it creates caches. Instead, it builds multiple caches
                 concurrently. For more information, see “Building Connected Lookup Caches” on
                 page 340.
             ♦   Persistent cache. You can save the lookup cache files and reuse them the next time the
                 Integration Service processes a Lookup transformation configured to use the cache. For
                 more information, see “Using a Persistent Lookup Cache” on page 342.
             ♦   Recache from source. If the persistent cache is not synchronized with the lookup table,
                 you can configure the Lookup transformation to rebuild the lookup cache. For more
                 information, see “Building Connected Lookup Caches” on page 340.
             ♦   Static cache. You can configure a static, or read-only, cache for any lookup source. By
                 default, the Integration Service creates a static cache. It caches the lookup file or table and
                 looks up values in the cache for each row that comes into the transformation. When the
                 lookup condition is true, the Integration Service returns a value from the lookup cache.
                 The Integration Service does not update the cache while it processes the Lookup
                 transformation. For more information, see “Working with an Uncached Lookup or Static
                 Cache” on page 344.
              ♦   Dynamic cache. To cache a target table or flat file source and insert new rows or update
                  existing rows in the cache, use a Lookup transformation with a dynamic cache. The
                  Integration Service dynamically inserts or updates data in the lookup cache and passes data
                  to the target. For more information, see “Working with a Dynamic Lookup Cache” on
                  page 345.
  ♦     Shared cache. You can share the lookup cache between multiple transformations. You can
        share an unnamed cache between transformations in the same mapping. You can share a
        named cache between transformations in the same or different mappings. For more
        information, see “Sharing the Lookup Cache” on page 363.
  When you do not configure the Lookup transformation for caching, the Integration Service
  queries the lookup table for each input row. The result of the Lookup query and processing is
  the same, whether or not you cache the lookup table. However, using a lookup cache can
  increase session performance. Optimize performance by caching the lookup table when the
  source table is large.
  For more information about caching properties, see “Lookup Properties” on page 316.
  For information about configuring the cache size, see “Session Caches” in the Workflow
  Administration Guide.
  Note: The Integration Service uses the same transformation logic to process a Lookup
  transformation whether you configure it to use a static cache or no cache. However, when you
  configure the transformation to use no cache, the Integration Service queries the lookup table
  instead of the lookup cache.


Cache Comparison
   Table 15-1 compares an uncached lookup, a static cache, and a dynamic cache:

  Table 15-1. Lookup Caching Comparison

   Uncached lookup:
   ♦   You cannot insert or update the cache.
   ♦   You cannot use a flat file lookup.
   ♦   When the condition is true, the Integration Service returns a value from the lookup table.
       When the condition is not true, the Integration Service returns the default value for
       connected transformations and NULL for unconnected transformations. For more
       information, see “Working with an Uncached Lookup or Static Cache” on page 344.

   Static cache:
   ♦   You cannot insert or update the cache.
   ♦   Use a relational or a flat file lookup.
   ♦   When the condition is true, the Integration Service returns a value from the lookup cache.
       When the condition is not true, the Integration Service returns the default value for
       connected transformations and NULL for unconnected transformations. For more
       information, see “Working with an Uncached Lookup or Static Cache” on page 344.

   Dynamic cache:
   ♦   You can insert or update rows in the cache as you pass rows to the target.
   ♦   Use a relational or a flat file lookup.
   ♦   When the condition is true, the Integration Service either updates rows in the cache or
       leaves the cache unchanged, depending on the row type. This indicates that the row is in
       the cache and target table. You can pass updated rows to a target.
   ♦   When the condition is not true, the Integration Service either inserts rows into the cache
       or leaves the cache unchanged, depending on the row type. This indicates that the row is
       not in the cache or target. You can pass inserted rows to a target table. For more
       information, see “Updating the Dynamic Lookup Cache” on page 356.




Building Connected Lookup Caches
             The Integration Service can build lookup caches for connected Lookup transformations in the
             following ways:
             ♦   Sequential caches. The Integration Service builds lookup caches sequentially. The
                 Integration Service builds the cache in memory when it processes the first row of the data
                 in a cached lookup transformation. For more information, see “Sequential Caches” on
                 page 340.
             ♦   Concurrent caches. The Integration Service builds lookup caches concurrently. It does not
                 need to wait for data to reach the Lookup transformation. For more information, see
                 “Concurrent Caches” on page 341.
             Note: The Integration Service builds caches for unconnected Lookup transformations
             sequentially regardless of how you configure cache building. If you configure the session to
             build concurrent caches for an unconnected Lookup transformation, the Integration Service
             ignores this setting and builds unconnected Lookup transformation caches sequentially.


        Sequential Caches
             By default, the Integration Service builds a cache in memory when it processes the first row of
             data in a cached Lookup transformation. The Integration Service creates each lookup cache in
             the pipeline sequentially. The Integration Service waits for any upstream active
             transformation to complete processing before it starts processing the rows in the Lookup
             transformation. The Integration Service does not build caches for a downstream Lookup
             transformation until an upstream Lookup transformation completes building a cache.
             For example, the following mapping contains an unsorted Aggregator transformation
             followed by two Lookup transformations.
             Figure 15-1 shows a mapping that contains multiple Lookup transformations:

              Figure 15-1. Building Lookup Caches Sequentially

              1) The Aggregator transformation processes rows. 2) The first Lookup transformation builds
              its cache after it reads the first input row. 3) The second Lookup transformation builds its
              cache after it reads the first input row.

             The Integration Service processes all the rows for the unsorted Aggregator transformation and
             begins processing the first Lookup transformation after the unsorted Aggregator


340   Chapter 15: Lookup Caches
transformation completes. When it processes the first input row, the Integration Service
  begins building the first lookup cache. After the Integration Service finishes building the first
  lookup cache, it can begin processing the lookup data. The Integration Service begins
  building the next lookup cache when the first row of data reaches the Lookup transformation.
   You might want to build lookup caches sequentially if the Lookup transformation might not
   process row data at all. This can happen when upstream transformation logic routes data to
   different pipelines based on a condition. Configuring sequential caching lets you avoid
   building lookup caches unnecessarily. For example, a Router transformation might route data
   to one pipeline if a condition resolves to true, and it might route data to another pipeline if
   the condition resolves to false. In this case, a Lookup transformation in one of the pipelines
   might not receive data at all.


Concurrent Caches
   You can configure the Integration Service to create lookup caches concurrently. Concurrent
   caches may improve session performance, especially when the pipeline contains an active
   transformation upstream of the Lookup transformation. You may want to configure the
   session to create concurrent caches if you are certain that you will need to build caches for
   each of the Lookup transformations in the session.
  When you configure the Lookup transformation to create concurrent caches, it does not wait
  for upstream transformations to complete before it creates lookup caches, and it does not
  need to finish building a lookup cache before it can begin building other lookup caches.
  For example, you configure the session shown in Figure 15-1 for concurrent cache creation.
  Figure 15-2 shows lookup transformation caches built concurrently:

   Figure 15-2. Building Lookup Caches Concurrently

   The caches for both Lookup transformations are built concurrently.


  When you run the session, the Integration Service builds the Lookup caches concurrently. It
  does not wait for upstream transformations to complete, and it does not wait for other
  Lookup transformations to complete cache building.
  Note: You cannot process caches for unconnected Lookup transformations concurrently.

  To configure the session to create concurrent caches, configure a value for the session
  configuration attribute, Additional Concurrent Pipelines for Lookup Cache Creation.



Using a Persistent Lookup Cache
             You can configure a Lookup transformation to use a non-persistent or persistent cache. The
             Integration Service saves or deletes lookup cache files after a successful session based on the
             Lookup Cache Persistent property.
             If the lookup table does not change between sessions, you can configure the Lookup
             transformation to use a persistent lookup cache. The Integration Service saves and reuses
             cache files from session to session, eliminating the time required to read the lookup table.


        Using a Non-Persistent Cache
             By default, the Integration Service uses a non-persistent cache when you enable caching in a
             Lookup transformation. The Integration Service deletes the cache files at the end of a session.
             The next time you run the session, the Integration Service builds the memory cache from the
             database.


        Using a Persistent Cache
             If you want to save and reuse the cache files, you can configure the transformation to use a
             persistent cache. Use a persistent cache when you know the lookup table does not change
             between session runs.
             The first time the Integration Service runs a session using a persistent lookup cache, it saves
             the cache files to disk instead of deleting them. The next time the Integration Service runs the
             session, it builds the memory cache from the cache files. If the lookup table changes
             occasionally, you can override session properties to recache the lookup from the database.
             When you use a persistent lookup cache, you can specify a name for the cache files. When you
             specify a named cache, you can share the lookup cache across sessions. For more information
             about the Cache File Name Prefix property, see “Lookup Properties” on page 316. For more
             information about sharing lookup caches, see “Sharing the Lookup Cache” on page 363.


        Rebuilding the Lookup Cache
             You can instruct the Integration Service to rebuild the lookup cache if you think that the
             lookup source changed since the last time the Integration Service built the persistent cache.
             When you rebuild a cache, the Integration Service creates new cache files, overwriting existing
             persistent cache files. The Integration Service writes a message to the session log when it
             rebuilds the cache.
             You can rebuild the cache when the mapping contains one Lookup transformation or when
             the mapping contains Lookup transformations in multiple target load order groups that share
             a cache. You do not need to rebuild the cache when a dynamic lookup shares the cache with a
             static lookup in the same mapping.
             If the Integration Service cannot reuse the cache, it either recaches the lookup from the
             database, or it fails the session, depending on the mapping and session properties.


342   Chapter 15: Lookup Caches
Table 15-2 summarizes how the Integration Service handles persistent caching for named and
unnamed caches:

Table 15-2. Integration Service Handling of Persistent Caches

 Mapping or Session Changes Between Sessions                                   Named Cache       Unnamed Cache

 Integration Service cannot locate cache files.                                Rebuilds cache.   Rebuilds cache.

 Enable or disable the Enable High Precision option in session properties.    Fails session.    Rebuilds cache.

 Edit the transformation in the Mapping Designer, Mapplet Designer,           Fails session.    Rebuilds cache.
 or Reusable Transformation Developer.*

 Edit the mapping (excluding the Lookup transformation).                      Reuses cache.     Rebuilds cache.

 Change the database connection or the file location used to access the      Fails session.    Rebuilds cache.
 lookup table.

 Change the Integration Service data movement mode.                          Fails session.    Rebuilds cache.

 Change the sort order in Unicode mode.                                      Fails session.    Rebuilds cache.

 Change the Integration Service code page to a compatible code page.        Reuses cache.     Reuses cache.

 Change the Integration Service code page to an incompatible code page.     Fails session.    Rebuilds cache.

 *Editing properties such as the transformation description or port description does not affect persistent cache handling.




Working with an Uncached Lookup or Static Cache
             By default, the Integration Service creates a static lookup cache when you configure a Lookup
             transformation for caching. The Integration Service builds the cache when it processes the
             first lookup request. It queries the cache based on the lookup condition for each row that
             passes into the transformation. The Integration Service does not update the cache while it
             processes the transformation. The Integration Service processes an uncached lookup the same
             way it processes a cached lookup except that it queries the lookup source instead of building
             and querying the cache.
             When the lookup condition is true, the Integration Service returns the values from the lookup
             source or cache. For connected Lookup transformations, the Integration Service returns the
             values represented by the lookup/output ports. For unconnected Lookup transformations, the
             Integration Service returns the value represented by the return port.
             When the condition is not true, the Integration Service returns either NULL or default
             values. For connected Lookup transformations, the Integration Service returns the default
             value of the output port when the condition is not met. For unconnected Lookup
             transformations, the Integration Service returns NULL when the condition is not met.
             When you create multiple partitions in a pipeline that use a static cache, the Integration
             Service creates one memory cache for each partition and one disk cache for each
             transformation.
             For more information, see “Session Caches” in the Workflow Administration Guide.




Working with a Dynamic Lookup Cache
      You can use a dynamic cache with a relational lookup or a flat file lookup. For relational
      lookups, you might configure the transformation to use a dynamic cache when the target
      table is also the lookup table. For flat file lookups, the dynamic cache represents the data to
      update in the target table.
      The Integration Service builds the cache when it processes the first lookup request. It queries
      the cache based on the lookup condition for each row that passes into the transformation.
      When you use a dynamic cache, the Integration Service updates the lookup cache as it passes
      rows to the target.
      When the Integration Service reads a row from the source, it updates the lookup cache by
      performing one of the following actions:
      ♦   Inserts the row into the cache. The row is not in the cache and you specified to insert rows
          into the cache. You can configure the transformation to insert rows into the cache based on
          input ports or generated sequence IDs. The Integration Service flags the row as insert.
      ♦   Updates the row in the cache. The row exists in the cache and you specified to update
          rows in the cache. The Integration Service flags the row as update. The Integration Service
          updates the row in the cache based on the input ports.
      ♦   Makes no change to the cache. The row exists in the cache and you specified to insert new
          rows only. Or, the row is not in the cache and you specified to update existing rows only.
          Or, the row is in the cache, but based on the lookup condition, nothing changes. The
          Integration Service flags the row as unchanged.
      The Integration Service either inserts or updates the cache or makes no change to the cache,
      based on the results of the lookup query, the row type, and the Lookup transformation
      properties you define. For more information, see “Updating the Dynamic Lookup Cache” on
      page 356.
      The following list describes some situations when you use a dynamic lookup cache:
      ♦   Updating a master customer table with new and updated customer information. You
          want to load new and updated customer information into a master customer table. Use a
          Lookup transformation that performs a lookup on the target table to determine if a
          customer exists or not. Use a dynamic lookup cache that inserts and updates rows in the
          cache as it passes rows to the target.
      ♦   Loading data into a slowly changing dimension table and a fact table. You want to load
          data into a slowly changing dimension table and a fact table. Create two pipelines and use
          a Lookup transformation that performs a lookup on the dimension table. Use a dynamic
          lookup cache to load data to the dimension table. Use a static lookup cache to load data to
          the fact table, making sure you specify the name of the dynamic cache from the first
          pipeline. For more information, see “Example Using a Dynamic Lookup Cache” on
          page 360.
      ♦   Reading a flat file that is an export from a relational table. You want to read data from a
          Teradata table, but the ODBC connection is slow. You can export the Teradata table
           contents to a flat file and use the file as a lookup source. You can pass the lookup cache
           changes back to the Teradata table if you configure the Teradata table as a relational target
           in the mapping.
             Use a Router or Filter transformation with the dynamic Lookup transformation to route
             inserted or updated rows to the cached target table. You can route unchanged rows to another
             target table or flat file, or you can drop them.
             When you create multiple partitions in a pipeline that use a dynamic lookup cache, the
             Integration Service creates one memory cache and one disk cache for each transformation.
             However, if you add a partition point at the Lookup transformation, the Integration Service
             creates one memory cache for each partition. For more information, see “Session Caches” in
             the Workflow Administration Guide.
             Figure 15-3 shows a mapping with a Lookup transformation that uses a dynamic lookup
             cache:

             Figure 15-3. Mapping with a Dynamic Lookup Cache




             A Lookup transformation using a dynamic cache has the following properties:
             ♦   NewLookupRow. The Designer adds this port to a Lookup transformation configured to
                 use a dynamic cache. Indicates with a numeric value whether the Integration Service
                 inserts or updates the row in the cache, or makes no change to the cache. To keep the
                 lookup cache and the target table synchronized, you pass rows to the target when the
                 NewLookupRow value is equal to 1 or 2. For more information, see “Using the
                 NewLookupRow Port” on page 347.



♦   Associated Port. Associate lookup ports with either an input/output port or a sequence
      ID. The Integration Service uses the data in the associated ports to insert or update rows in
      the lookup cache. If you associate a sequence ID, the Integration Service generates a
      primary key for inserted rows in the lookup cache. For more information, see “Using the
      Associated Input Port” on page 348.
  ♦   Ignore Null Inputs for Updates. The Designer activates this port property for lookup/
      output ports when you configure the Lookup transformation to use a dynamic cache.
      Select this property when you do not want the Integration Service to update the column in
      the cache when the data in this column contains a null value. For more information, see
      “Using the Ignore Null Property” on page 353.
  ♦   Ignore in Comparison. The Designer activates this port property for lookup/output ports
      not used in the lookup condition when you configure the Lookup transformation to use a
      dynamic cache. The Integration Service compares the values in all lookup ports with the
      values in their associated input ports by default. Select this property if you want the
      Integration Service to ignore the port when it compares values before updating a row. For
      more information, see “Using the Ignore in Comparison Property” on page 354.
  Figure 15-4 shows the output port properties unique to a dynamic Lookup transformation:

   Figure 15-4. Dynamic Lookup Transformation Ports Tab

   The Ports tab displays the NewLookupRow port, the Associated Sequence-ID and Associated
   Port settings, and the Ignore Null and Ignore in Comparison port properties.


Using the NewLookupRow Port
  When you define a Lookup transformation to use a dynamic cache, the Designer adds the
  NewLookupRow port to the transformation. The Integration Service assigns a value to the
  port, depending on the action it performs to the lookup cache.


Table 15-3 lists the possible NewLookupRow values:

             Table 15-3. NewLookupRow Values

               NewLookupRow Value     Description

               0                      Integration Service does not update or insert the row in the cache.

               1                      Integration Service inserts the row into the cache.

               2                      Integration Service updates the row in the cache.


             When the Integration Service reads a row, it changes the lookup cache depending on the
             results of the lookup query and the Lookup transformation properties you define. It assigns
             the value 0, 1, or 2 to the NewLookupRow port to indicate if it inserts or updates the row in
             the cache, or makes no change.
             For information about how the Integration Service determines to update the cache, see
             “Updating the Dynamic Lookup Cache” on page 356.
             The NewLookupRow value indicates how the Integration Service changes the lookup cache. It
             does not change the row type. Therefore, use a Filter or Router transformation and an Update
             Strategy transformation to help keep the target table and lookup cache synchronized.
             Configure the Filter transformation to pass new and updated rows to the Update Strategy
             transformation before passing them to the cached target. Use the Update Strategy
             transformation to change the row type of each row to insert or update, depending on the
             NewLookupRow value.
             You can drop the rows that do not change the cache, or you can pass them to another target.
             For more information, see “Using Update Strategy Transformations with a Dynamic Cache”
             on page 354.
             Define the filter condition in the Filter transformation based on the value of
             NewLookupRow. For example, use the following condition to pass both inserted and updated
             rows to the cached target:
                     NewLookupRow != 0

             For more information about the Filter transformation, see “Filter Transformation” on
             page 189.
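              If you use a Router transformation instead of a Filter transformation, you might define one
              user-defined group for each row type; the following group filter conditions are a sketch:
                      Insert group:   NewLookupRow = 1
                      Update group:   NewLookupRow = 2
              Route each group to an Update Strategy transformation that flags the rows for insert or
              update before the target.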


        Using the Associated Input Port
             When you use a dynamic lookup cache, you must associate each lookup/output port with an
             input/output port or a sequence ID. The Integration Service uses the data in the associated
             port to insert or update rows in the lookup cache. The Designer associates the input/output
             ports with the lookup/output ports used in the lookup condition.
             For more information about the values of a Lookup transformation when you use a dynamic
             lookup cache, see “Working with Lookup Transformation Values” on page 349.




Sometimes you need to create a generated key for a column in a target table. For lookup ports
  with an Integer or Small Integer datatype, you can associate a generated key instead of an
  input port. To do this, select Sequence-ID in the Associated Port column.
  When you select Sequence-ID in the Associated Port column, the Integration Service
  generates a key when it inserts a row into the lookup cache.
  The Integration Service uses the following process to generate sequence IDs:
  1.    When the Integration Service creates the dynamic lookup cache, it tracks the range of
        values in the cache associated with any port using a sequence ID.
   2.    When the Integration Service inserts a new row of data into the cache, it generates a key
         for a port by incrementing the greatest existing sequence ID value by one.
  3.    When the Integration Service reaches the maximum number for a generated sequence ID,
        it starts over at one. It then increments each sequence ID by one until it reaches the
        smallest existing value minus one. If the Integration Service runs out of unique sequence
        ID numbers, the session fails.
        Note: The maximum value for a sequence ID is 2147483647.

  The Integration Service only generates a sequence ID for rows it inserts into the cache.


Working with Lookup Transformation Values
  When you associate an input/output port or a sequence ID with a lookup/output port, the
  following values match by default:
  ♦    Input value. Value the Integration Service passes into the transformation.
  ♦    Lookup value. Value that the Integration Service inserts into the cache.
  ♦    Input/output port output value. Value that the Integration Service passes out of the
       input/output port.
  The lookup/output port output value depends on whether you choose to output old or new
  values when the Integration Service updates a row:
  ♦    Output old values on update. The Integration Service outputs the value that existed in the
       cache before it updated the row.
  ♦    Output new values on update. The Integration Service outputs the updated value that it
       writes in the cache. The lookup/output port value matches the input/output port value.
   Note: You configure whether to output old or new values using the Output Old Value On
   Update transformation property. For more information about this property, see “Lookup
   Properties” on page 316.




For example, you have the following Lookup transformation that uses a dynamic lookup
             cache:




             You define the following lookup condition:
                     IN_CUST_ID = CUST_ID

             By default, the row type of all rows entering the Lookup transformation is insert. To perform
             both inserts and updates in the cache and target table, you select the Insert Else Update
             property in the Lookup transformation.
             The following sections describe the values of the rows in the cache, the input rows, lookup
             rows, and output rows as you run the session.

             Initial Cache Values
             When you run the session, the Integration Service builds the lookup cache from the target
             table with the following data:
             PK_PRIMARYKEY CUST_ID         CUST_NAME      ADDRESS
             100001               80001    Marion James   100 Main St.
             100002               80002    Laura Jones    510 Broadway Ave.
             100003               80003    Shelley Lau    220 Burnside Ave.


             Input Values
             The source contains rows that exist and rows that do not exist in the target table. The
             following rows pass into the Lookup transformation from the Source Qualifier
             transformation:
             SQ_CUST_ID           SQ_CUST_NAME    SQ_ADDRESS
             80001                Marion Atkins   100 Main St.
             80002                Laura Gomez     510 Broadway Ave.
             99001                Jon Freeman     555 6th Ave.


              Note: The input values always match the values the Integration Service outputs from the
              input/output ports.




Lookup Values
The Integration Service looks up values in the cache based on the lookup condition. It
updates rows in the cache for existing customer IDs 80001 and 80002. It inserts a row into
the cache for customer ID 99001. The Integration Service generates a new key
(PK_PRIMARYKEY) for the new row.
PK_PRIMARYKEY   CUST_ID   CUST_NAME       ADDRESS
100001          80001     Marion Atkins   100 Main St.
100002          80002     Laura Gomez     510 Broadway Ave.
100004          99001     Jon Freeman     555 6th Ave.


Output Values
The Integration Service flags the rows in the Lookup transformation based on the inserts and
updates it performs on the dynamic cache. These rows pass through an Expression
transformation to a Router transformation that filters and passes on the inserted and updated
rows to an Update Strategy transformation. The Update Strategy transformation flags the
rows based on the value of the NewLookupRow port.
The output values of the lookup/output and input/output ports depend on whether you
choose to output old or new values when the Integration Service updates a row. However, the
output values of the NewLookupRow port and any lookup/output port that uses the
Sequence-ID are the same for new and updated rows.
When you choose to output new values, the lookup/output ports output the following values:
NewLookupRow   PK_PRIMARYKEY   CUST_ID   CUST_NAME       ADDRESS
2              100001          80001     Marion Atkins   100 Main St.
2              100002          80002     Laura Gomez     510 Broadway Ave.
1              100004          99001     Jon Freeman     555 6th Ave.


When you choose to output old values, the lookup/output ports output the following values:
NewLookupRow     PK_PRIMARYKEY     CUST_ID    CUST_NAME        ADDRESS
2                100001            80001      Marion James     100 Main St.
2                100002            80002      Laura Jones      510 Broadway Ave.
1                100004            99001      Jon Freeman      555 6th Ave.


Note that when the Integration Service updates existing rows in the lookup cache and when it
passes rows to the lookup/output ports, it always uses the existing primary key
(PK_PRIMARYKEY) values for rows that exist in the cache and target table.
The Integration Service uses the sequence ID to generate a new primary key for the customer
that it does not find in the cache. The Integration Service inserts the new primary key value
into the lookup cache and outputs it to the lookup/output port.




The Integration Service outputs values from the input/output ports that match the input
values. For those values, see “Input Values” on page 350.
             Note: If the input value is NULL and you select the Ignore Null property for the associated
             input port, the input value does not equal the lookup value or the value out of the input/
             output port. When you select the Ignore Null property, the lookup cache and the target table
             might become unsynchronized if you pass null values to the target. You must verify that you
             do not pass null values to the target. For more information, see “Using the Ignore Null
             Property” on page 353.




Using the Ignore Null Property
   When you update a dynamic lookup cache and target table, the source data might contain
   some null values. The Integration Service can handle the null values in the following ways:
   ♦   Insert null values. The Integration Service uses null values from the source and updates
       the lookup cache and target table using all values from the source.
    ♦   Ignore null values. The Integration Service ignores the null values in the source and
        updates the lookup cache and target table using only the non-null values from the source.
   If you know the source data contains null values, and you do not want the Integration Service
   to update the lookup cache or target with null values, select the Ignore Null property for the
   corresponding lookup/output port.
   For example, you want to update the master customer table. The source contains new
   customers and current customers whose last names have changed. The source contains the
   customer IDs and names of customers whose names have changed, but it contains null values
   for the address columns. You want to insert new customers and update the current customer
   names while retaining the current address information in a master customer table.
   For example, the master customer table contains the following data:
   PRIMARYKEY     CUST_ID    CUST_NAME       ADDRESS                 CITY          STATE     ZIP
   100001         80001      Marion James    100 Main St.            Mt. View      CA        94040
   100002         80002      Laura Jones     510 Broadway Ave.       Raleigh       NC        27601
   100003         80003      Shelley Lau     220 Burnside Ave.       Portland      OR        97210


   The source contains the following data:
    CUST_ID   CUST_NAME       ADDRESS        CITY       STATE   ZIP
    80001     Marion Atkins   NULL           NULL       NULL    NULL
    80002     Laura Gomez     NULL           NULL       NULL    NULL
    99001     Jon Freeman     555 6th Ave.   San Jose   CA      95051


   Select Insert Else Update in the Lookup transformation in the mapping. Select the Ignore
   Null option for all lookup/output ports in the Lookup transformation. When you run a
    session, the Integration Service ignores null values in the source data and updates the lookup
    cache and the target table with non-null values:
    PRIMARYKEY   CUST_ID   CUST_NAME       ADDRESS             CITY       STATE   ZIP
    100001       80001     Marion Atkins   100 Main St.        Mt. View   CA      94040
    100002       80002     Laura Gomez     510 Broadway Ave.   Raleigh    NC      27601
    100003       80003     Shelley Lau     220 Burnside Ave.   Portland   OR      97210
    100004       99001     Jon Freeman     555 6th Ave.        San Jose   CA      95051


    Note: When you choose to ignore NULLs, you must verify that you output the same values to
    the target that the Integration Service writes to the lookup cache. Otherwise, the lookup
    cache and the target table might become unsynchronized if you pass null input values to the
    target. Configure the mapping based on the value you want the Integration Service to output
    from the lookup/output ports when it updates a row in the cache:
             ♦   New values. Connect only lookup/output ports from the Lookup transformation to the
                 target.
             ♦   Old values. Add an Expression transformation after the Lookup transformation and before
                 the Filter or Router transformation. Add output ports in the Expression transformation for
                 each port in the target table and create expressions to ensure you do not output null input
                 values to the target.


        Using the Ignore in Comparison Property
             When you run a session that uses a dynamic lookup cache, the Integration Service compares
             the values in all lookup ports with the values in their associated input ports by default. It
             compares the values to determine whether or not to update the row in the lookup cache.
             When a value in an input port differs from the value in the lookup port, the Integration
             Service updates the row in the cache.
             If you do not want to compare all ports, you can choose the ports you want the Integration
             Service to ignore when it compares ports. The Designer only enables this property for lookup/
             output ports when the port is not used in the lookup condition. You can improve
             performance by ignoring some ports during comparison.
             You might want to do this when the source data includes a column that indicates whether or
             not the row contains data you need to update. Select the Ignore in Comparison property for
             all lookup ports except the port that indicates whether or not to update the row in the cache
             and target table.
             Note: You must configure the Lookup transformation to compare at least one port. The
             Integration Service fails the session when you ignore all ports.


        Using Update Strategy Transformations with a Dynamic Cache
             When you use a dynamic lookup cache, use Update Strategy transformations to define the
             row type for the following rows:
             ♦   Rows entering the Lookup transformation. By default, the row type of all rows entering a
                 Lookup transformation is insert. However, use an Update Strategy transformation before a
                 Lookup transformation to define all rows as update, or some as update and some as insert.
             ♦   Rows leaving the Lookup transformation. The NewLookupRow value indicates how the
                 Integration Service changed the lookup cache, but it does not change the row type. Use a
                 Filter or Router transformation after the Lookup transformation to direct rows leaving the
                 Lookup transformation based on the NewLookupRow value. Use Update Strategy
                 transformations after the Filter or Router transformation to flag rows for insert or update
                 before the target definition in the mapping.




Note: If you want to drop the unchanged rows, do not connect rows from the Filter or Router
transformation with the NewLookupRow equal to 0 to the target definition.
When you define the row type as insert for rows entering a Lookup transformation, use the
Insert Else Update property in the Lookup transformation. When you define the row type as
update for rows entering a Lookup transformation, use the Update Else Insert property in the
Lookup transformation. If you define some rows entering a Lookup transformation as update
and some as insert, use either the Update Else Insert or Insert Else Update property, or use
both properties. For more information, see “Updating the Dynamic Lookup Cache” on
page 356.
Figure 15-5 shows a mapping with multiple Update Strategy transformations and a Lookup
transformation using a dynamic cache:
Figure 15-5. Using Update Strategy Transformations with a Lookup Transformation

The figure shows an Update Strategy transformation that marks rows as update before the
Lookup transformation, an Update Strategy transformation that inserts new rows into the
target, an Update Strategy transformation that updates existing rows in the target, and
output rows not connected to a target that get dropped.

In this case, the Update Strategy transformation before the Lookup transformation flags all
rows as update. Select the Update Else Insert property in the Lookup transformation. The
Router transformation sends the inserted rows to the Insert_New Update Strategy
transformation and sends the updated rows to the Update_Existing Update Strategy
transformation. The two Update Strategy transformations to the right of the Lookup
transformation flag the rows for insert or update for the target.
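As a sketch, the update strategy expressions in this design can be simple constants. The
Update Strategy transformation before the Lookup transformation uses an expression that
flags every row as update, and the two downstream transformations flag rows for the target:
        Update Strategy before the Lookup:   DD_UPDATE
        Insert_New:                          DD_INSERT
        Update_Existing:                     DD_UPDATE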

Configuring Sessions with a Dynamic Lookup Cache
When you configure a session using Update Strategy transformations and a dynamic lookup
cache, you must define certain session properties.
On the General Options settings on the Properties tab in the session properties, define the
Treat Source Rows As option as Data Driven.




You must also define the following update strategy target table options:
             ♦   Select Insert
             ♦   Select Update as Update
             ♦   Do not select Delete
             These update strategy target table options ensure that the Integration Service updates rows
             marked for update and inserts rows marked for insert.
             If you do not choose Data Driven, the Integration Service flags all rows for the row type you
             specify in the Treat Source Rows As option and does not use the Update Strategy
             transformations in the mapping to flag the rows. The Integration Service does not insert and
             update the correct rows. If you do not choose Update as Update, the Integration Service does
             not correctly update the rows flagged for update in the target table. As a result, the lookup
             cache and target table might become unsynchronized. For more information, see “Setting the
             Update Strategy for a Session” on page 580.
             For more information about configuring target session properties, see “Working with Targets”
             in the Workflow Administration Guide.


        Updating the Dynamic Lookup Cache
             When you use a dynamic lookup cache, define the row type of the rows entering the Lookup
             transformation as either insert or update. You can define some rows as insert and some as
             update, or all insert, or all update. By default, the row type of all rows entering a Lookup
             transformation is insert. You can add an Update Strategy transformation before the Lookup
             transformation to define the row type as update. For more information, see “Using Update
             Strategy Transformations with a Dynamic Cache” on page 354.
             The Integration Service either inserts or updates rows in the cache, or does not change the
             cache. The row type of the rows entering the Lookup transformation and the lookup query
             result affect how the Integration Service updates the cache. However, you must also configure
             the following Lookup properties to determine how the Integration Service updates the lookup
             cache:
             ♦   Insert Else Update. Applies to rows entering the Lookup transformation with the row type
                 of insert.
             ♦   Update Else Insert. Applies to rows entering the Lookup transformation with the row type
                 of update.
             Note: You can select either the Insert Else Update or Update Else Insert property, or you can
             select both properties or neither property. The Insert Else Update property only affects rows
             entering the Lookup transformation with the row type of insert. The Update Else Insert
             property only affects rows entering the Lookup transformation with the row type of update.

             Insert Else Update
             You can select the Insert Else Update property in the Lookup transformation. This property
only applies to rows entering the Lookup transformation with the row type of insert. When a
row of any other row type, such as update, enters the Lookup transformation, the Insert Else
Update property has no effect on how the Integration Service handles the row.
When you select Insert Else Update and the row type entering the Lookup transformation is
insert, the Integration Service inserts the row into the cache if it is new. If the row exists in the
index cache but the data cache is different than the current row, the Integration Service
updates the row in the data cache.
If you do not select Insert Else Update and the row type entering the Lookup transformation
is insert, the Integration Service inserts the row into the cache if it is new, and makes no
change to the cache if the row exists.
Table 15-4 describes how the Integration Service changes the lookup cache when the row type
of the rows entering the Lookup transformation is insert:

Table 15-4. Dynamic Lookup Cache Behavior for Insert Row Type

 Insert Else Update              Row Found in                 Data Cache is                  Lookup Cache                 NewLookupRow
 Option                          Cache                        Different                      Result                       Value

 Cleared (insert only)           Yes                          n/a                            No change                    0

                                 No                           n/a                            Insert                       1

 Selected                        Yes                          Yes                            Update                       2*

                                 Yes                          No                             No change                    0

                                 No                           n/a                            Insert                       1
 *If you select Ignore Null for all lookup ports not in the lookup condition and if all those ports contain null values, the Integration Service
 does not change the cache and the NewLookupRow value equals 0. For more information, see “Using the Ignore Null Property” on
 page 353.
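 For example, suppose the lookup condition matches on EMP_ID and the cache already holds
 EMP_ID 5 with EMP_STATUS 2 (the column names and values are hypothetical). With
 Insert Else Update selected, insert rows produce the following results:
        EMP_ID 6, EMP_STATUS 4     Not in cache. Insert. NewLookupRow = 1.
        EMP_ID 5, EMP_STATUS 2     In cache, data unchanged. No change. NewLookupRow = 0.
        EMP_ID 5, EMP_STATUS 4     In cache, data differs. Update. NewLookupRow = 2.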



Update Else Insert
You can select the Update Else Insert property in the Lookup transformation. This property
only applies to rows entering the Lookup transformation with the row type of update. When a
row of any other row type, such as insert, enters the Lookup transformation, this property has
no effect on how the Integration Service handles the row.
When you select this property and the row type entering the Lookup transformation is
update, the Integration Service updates the row in the cache if the row exists in the index
cache and the cache data is different than the existing row. The Integration Service inserts the
row in the cache if it is new.
If you do not select this property and the row type entering the Lookup transformation is
update, the Integration Service updates the row in the cache if it exists, and makes no change
to the cache if the row is new.




                                                                                      Working with a Dynamic Lookup Cache                      357
Table 15-5 describes how the Integration Service changes the lookup cache when the row type
             of the rows entering the Lookup transformation is update:

             Table 15-5. Dynamic Lookup Cache Behavior for Update Row Type

               Update Else Insert              Row Found in                 Data Cache is                 Lookup Cache                  NewLookupRow
               Option                          Cache                        Different                     Result                        Value

               Cleared (update only)           Yes                          Yes                           Update                        2*

                                               Yes                          No                            No change                     0

                                               No                           n/a                           No change                     0

               Selected                        Yes                          Yes                           Update                        2*

                                               Yes                          No                            No change                     0

                                               No                           n/a                           Insert                        1
               *If you select Ignore Null for all lookup ports not in the lookup condition and if all those ports contain null values, the Integration Service
               does not change the cache and the NewLookupRow value equals 0. For more information, see “Using the Ignore Null Property” on
               page 353.
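               Continuing the hypothetical example above, update rows produce the following results when
               Update Else Insert is selected:
                      EMP_ID 5, EMP_STATUS 4     In cache, data differs. Update. NewLookupRow = 2.
                      EMP_ID 5, EMP_STATUS 2     In cache, data unchanged. No change. NewLookupRow = 0.
                      EMP_ID 7, EMP_STATUS 4     Not in cache. Insert. NewLookupRow = 1.
               If Update Else Insert is cleared, the last row results in no change to the cache and the
               NewLookupRow value equals 0.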



        Using the WHERE Clause with a Dynamic Cache
             When you add a WHERE clause in a lookup SQL override, the Integration Service uses the
             WHERE clause to build the cache from the database and to perform a lookup on the database
             table for an uncached lookup. However, it does not use the WHERE clause to insert rows into
             a dynamic cache when it runs a session.
             When you add a WHERE clause in a Lookup transformation using a dynamic cache, connect
             a Filter transformation before the Lookup transformation to filter rows you do not want to
             insert into the cache or target table. If you do not use a Filter transformation, you might get
             inconsistent data.
             For example, you configure a Lookup transformation to perform a dynamic lookup on the
             employee table, EMP, matching rows by EMP_ID. You define the following lookup SQL
             override:
                        SELECT EMP_ID, EMP_STATUS FROM EMP WHERE EMP_STATUS = 4
                        ORDER BY EMP_ID, EMP_STATUS

             When you first run the session, the Integration Service builds the lookup cache from the
             target table based on the lookup SQL override. Therefore, all rows in the cache match the
             condition in the WHERE clause, EMP_STATUS = 4.
             Suppose the Integration Service reads a source row that meets the lookup condition you
             specify (the value for EMP_ID is found in the cache), but the value of EMP_STATUS is 2.
             The Integration Service does not find the row in the cache, so it inserts the row into the cache
             and passes the row to the target table. When this happens, not all rows in the cache match the
             condition in the WHERE clause. When the Integration Service tries to insert this row in the
             target table, you might get inconsistent data if the row already exists there.




To verify that you only insert rows into the cache that match the WHERE clause, add a Filter
  transformation before the Lookup transformation and define the filter condition as the
  condition in the WHERE clause in the lookup SQL override.
  For the example above, enter the following filter condition:
         EMP_STATUS = 4
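   In the mapping, the pipeline might look like the following sketch, where the transformation
   names are hypothetical:
          SQ_EMP -> FIL_EmpStatus4 (filter condition: EMP_STATUS = 4) -> LKP_Emp (dynamic cache) -> ...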

  For more information about the lookup SQL override, see “Overriding the Lookup Query”
  on page 324.


Synchronizing the Dynamic Lookup Cache
  When you use a dynamic lookup cache, the Integration Service writes to the lookup cache
  before it writes to the target table. The lookup cache and target table can become
  unsynchronized if the Integration Service does not write the data to the target. For example,
  the target database or Informatica writer might reject the data.
  Use the following guidelines to keep the lookup cache synchronized with the lookup table:
   ♦   Use a Router transformation to pass rows to the cached target when the NewLookupRow
       value equals one or two. Use the Router transformation to drop rows when the
       NewLookupRow value equals zero, or you can output those rows to a different target. The
       sketch after this list shows typical Router group conditions.
  ♦   Use Update Strategy transformations after the Lookup transformation to flag rows for
      insert or update into the target.
  ♦   Set the error threshold to one when you run a session. When you set the error threshold to
      one, the session fails when it encounters the first error. The Integration Service does not
      write the new cache files to disk. Instead, it restores the original cache files, if they exist.
      You must also restore the pre-session target table to the target database. For more
      information about setting the error threshold, see “Working with Sessions” in the
      Workflow Administration Guide.
  ♦   Verify that you output the same values to the target that the Integration Service writes to
      the lookup cache. When you choose to output new values on update, only connect lookup/
      output ports to the target table instead of input/output ports. When you choose to output
      old values on update, add an Expression transformation after the Lookup transformation
      and before the Router transformation. Add output ports in the Expression transformation
      for each port in the target table and create expressions to ensure you do not output null
      input values to the target.
  ♦   Set the Treat Source Rows As property to Data Driven in the session properties.
  ♦   Select Insert and Update as Update when you define the update strategy target table
      options in the session properties. This ensures that the Integration Service updates rows
      marked for update and inserts rows marked for insert. Select these options in the
      Transformations View on the Mapping tab in the session properties. For more
      information, see “Working with Targets” in the Workflow Administration Guide.
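   A minimal sketch of the Router and Update Strategy configuration described above, using
   hypothetical transformation and group names:
          Router group InsertedRows, group filter condition:   NewLookupRow = 1
          Router group UpdatedRows, group filter condition:    NewLookupRow = 2
          (Rows with NewLookupRow = 0 fall through to the default group, which you can
          drop or route to a different target.)
          UPD_Insert Update Strategy expression:   DD_INSERT
          UPD_Update Update Strategy expression:   DD_UPDATE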




Null Values in Lookup Condition Columns
             When you run a session, the source data might contain null values in columns used in the
             lookup condition. The Integration Service handles rows with null values in lookup
             condition columns differently, depending on whether the row exists in the cache:
             ♦   If the row does not exist in the lookup cache, the Integration Service inserts the row in the
                 cache and passes it to the target table.
             ♦   If the row does exist in the lookup cache, the Integration Service does not update the row
                 in the cache or target table.
             Note: If the source data contains null values in the lookup condition columns, set the error
             threshold to one. This ensures that the lookup cache and table remain synchronized if the
             Integration Service inserts a row in the cache, but the database rejects the row due to a Not
             Null constraint.


        Example Using a Dynamic Lookup Cache
             Use a dynamic lookup cache when you need to insert and update rows in the target. When
             you use a dynamic lookup cache, the Integration Service inserts and updates the cache with
             the same data that you pass to the target to insert and update.
             For example, use a dynamic lookup cache to update a table that contains customer data. The
             source data contains rows that you need to insert into the target and rows you need to update
             in the target.
             Figure 15-6 shows a mapping that uses a dynamic cache:

             Figure 15-6. Slowly Changing Dimension Mapping with Dynamic Lookup Cache




             The Lookup transformation uses a dynamic lookup cache. When the session starts, the
             Integration Service builds the lookup cache from the target table. When the Integration
             Service reads a row that is not in the lookup cache, it inserts the row in the cache and then
             passes the row out of the Lookup transformation. The Router transformation directs the row
             to the UPD_Insert_New Update Strategy transformation. The Update Strategy
             transformation marks the row as insert before passing it to the target.
             The target table changes as the session runs, and the Integration Service inserts new rows and
             updates existing rows in the lookup cache. The Integration Service keeps the lookup cache
             and target table synchronized.



To generate keys for the target, use Sequence-ID in the associated port. The sequence ID
  generates primary keys for new rows the Integration Service inserts into the target table.
  Without the dynamic lookup cache, you need to use two Lookup transformations in the
  mapping. Use the first Lookup transformation to insert rows in the target. Use the second
  Lookup transformation to recache the target table and update rows in the target table.
  You increase session performance when you use a dynamic lookup cache because you only
  need to build the cache from the database once. You can continue to use the lookup cache
  even though the data in the target table changes.


Rules and Guidelines for Dynamic Caches
  Use the following guidelines when you use a dynamic lookup cache:
  ♦   You can create a dynamic lookup cache from a relational or flat file source.
  ♦   The Lookup transformation must be a connected transformation.
  ♦   Use a persistent or a non-persistent cache.
  ♦   If the dynamic cache is not persistent, the Integration Service always rebuilds the cache
      from the database, even if you do not enable Recache from Lookup Source.
  ♦   You cannot share the cache between a dynamic Lookup transformation and static Lookup
      transformation in the same target load order group.
  ♦   You can only create an equality lookup condition. You cannot look up a range of data.
  ♦   Associate each lookup port (that is not in the lookup condition) with an input port or a
      sequence ID.
  ♦   Use a Router transformation to pass rows to the cached target when the NewLookupRow
      value equals one or two. Use the Router transformation to drop rows when the
      NewLookupRow value equals zero, or you can output those rows to a different target.
  ♦   Verify that you output the same values to the target that the Integration Service writes to
      the lookup cache. When you choose to output new values on update, only connect lookup/
      output ports to the target table instead of input/output ports. When you choose to output
      old values on update, add an Expression transformation after the Lookup transformation
      and before the Router transformation. Add output ports in the Expression transformation
      for each port in the target table and create expressions to ensure you do not output null
      input values to the target.
  ♦   When you use a lookup SQL override, make sure you map the correct columns to the
      appropriate targets for lookup.
  ♦   When you add a WHERE clause to the lookup SQL override, use a Filter transformation
      before the Lookup transformation. This ensures the Integration Service only inserts rows
      in the dynamic cache and target table that match the WHERE clause. For more
      information, see “Using the WHERE Clause with a Dynamic Cache” on page 358.
  ♦   When you configure a reusable Lookup transformation to use a dynamic cache, you
      cannot edit the condition or disable the Dynamic Lookup Cache property in a mapping.




♦   Use Update Strategy transformations after the Lookup transformation to flag the rows for
                 insert or update for the target.
             ♦   Use an Update Strategy transformation before the Lookup transformation to define some
                 or all rows as update if you want to use the Update Else Insert property in the Lookup
                 transformation.
             ♦   Set the row type to Data Driven in the session properties.
             ♦   Select Insert and Update as Update for the target table options in the session properties.




Sharing the Lookup Cache
      You can configure multiple Lookup transformations in a mapping to share a single lookup
      cache. The Integration Service builds the cache when it processes the first Lookup
      transformation. It uses the same cache to perform lookups for subsequent Lookup
      transformations that share the cache.
      You can share caches that are unnamed and named:
      ♦   Unnamed cache. When Lookup transformations in a mapping have compatible caching
          structures, the Integration Service shares the cache by default. You can only share static
          unnamed caches.
      ♦   Named cache. Use a persistent named cache when you want to share a cache file across
          mappings or share a dynamic and a static cache. The caching structures must match or be
          compatible with a named cache. You can share static and dynamic named caches.
      When the Integration Service shares a lookup cache, it writes a message in the session log.


    Sharing an Unnamed Lookup Cache
      By default, the Integration Service shares the cache for Lookup transformations in a mapping
      that have compatible caching structures. For example, if you have two instances of the same
      reusable Lookup transformation in one mapping and you use the same output ports for both
      instances, the Lookup transformations share the lookup cache by default.
      When two Lookup transformations share an unnamed cache, the Integration Service saves the
      cache for a Lookup transformation and uses it for subsequent Lookup transformations that
      have the same lookup cache structure.
      If the transformation properties or the cache structure do not allow sharing, the Integration
      Service creates a new cache.

      Guidelines for Sharing an Unnamed Lookup Cache
      Use the following guidelines when you configure Lookup transformations to share an
      unnamed cache:
      ♦   You can share static unnamed caches.
       ♦   Shared transformations must use the same ports in the lookup condition. The conditions
           can use different operators, but the ports must be the same, as in the example after this
           list.
      ♦   You must configure some of the transformation properties to enable unnamed cache
          sharing. For more information, see Table 15-6 on page 364.
      ♦   The structure of the cache for the shared transformations must be compatible.
          −   If you use hash auto-keys partitioning, the lookup/output ports for each transformation
              must match.
          −   If you do not use hash auto-keys partitioning, the lookup/output ports for the first
              shared transformation must match or be a superset of the lookup/output ports for
              subsequent transformations.

♦     If the Lookup transformations with hash auto-keys partitioning are in different target load
                   order groups, you must configure the same number of partitions for each group. If you do
                   not use hash auto-keys partitioning, you can configure a different number of partitions for
                   each target load order group.
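              For example, the following lookup conditions (the port names are hypothetical) can share
              an unnamed static cache because they use the same ports, even though the operators differ:
                     Lookup 1 condition:   EMP_ID = IN_EMP_ID
                     Lookup 2 condition:   EMP_ID >= IN_EMP_ID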
             Table 15-6 shows when you can share an unnamed static and dynamic cache:

             Table 15-6. Location for Sharing Unnamed Cache

                 Shared Cache                  Location of Transformations

                 Static with Static            Anywhere in the mapping.

                 Dynamic with Dynamic          Cannot share.

                 Dynamic with Static           Cannot share.


             Table 15-7 describes the guidelines to follow when you configure Lookup transformations to
             share an unnamed cache:

             Table 15-7. Properties for Sharing Unnamed Cache

                 Properties                    Configuration for Unnamed Shared Cache

                 Lookup SQL Override           If you use the Lookup SQL Override property, you must use the same override in all
                                               shared transformations.

                 Lookup Table Name             Must match.

                 Lookup Caching Enabled        Must be enabled.

                 Lookup Policy on Multiple     n/a
                 Match

                 Lookup Condition              Shared transformations must use the same ports in the lookup condition. The
                                               conditions can use different operators, but the ports must be the same.

                 Connection Information        The connection must be the same. When you configure the sessions, the database
                                               connection must match.

                 Source Type                   Must match.

                 Tracing Level                 n/a

                 Lookup Cache Directory Name   Does not need to match.

                 Lookup Cache Persistent       Optional. You can share persistent and non-persistent.

                 Lookup Data Cache Size        Integration Service allocates memory for the first shared transformation in each
                                               pipeline stage. It does not allocate additional memory for subsequent shared
                                               transformations in the same pipeline stage.
                                               For information about pipeline stages, see “Pipeline Partitioning” in the Workflow
                                               Administration Guide.

                 Lookup Index Cache Size       Integration Service allocates memory for the first shared transformation in each
                                               pipeline stage. It does not allocate additional memory for subsequent shared
                                               transformations in the same pipeline stage.
                                               For information about pipeline stages, see “Pipeline Partitioning” in the Workflow
                                               Administration Guide.



   Dynamic Lookup Cache           You cannot share an unnamed dynamic cache.

   Output Old Value On Update     Does not need to match.

   Cache File Name Prefix         Do not use. You cannot share a named cache with an unnamed cache.

   Recache From Lookup Source     If you configure a Lookup transformation to recache from source, subsequent
                                  Lookup transformations in the target load order group can share the existing cache
                                  whether or not you configure them to recache from source. If you configure
                                  subsequent Lookup transformations to recache from source, the Integration Service
                                  shares the cache instead of rebuilding the cache when it processes the subsequent
                                  Lookup transformation.
                                  If you do not configure the first Lookup transformation in a target load order group to
                                  recache from source, and you do configure the subsequent Lookup transformation to
                                  recache from source, the transformations cannot share the cache. The Integration
                                  Service builds the cache when it processes each Lookup transformation.

   Lookup/Output Ports            The lookup/output ports for the second Lookup transformation must match or be a
                                  subset of the ports in the transformation that the Integration Service uses to build the
                                  cache. The order of the ports does not need to match.

   Insert Else Update             n/a

   Update Else Insert             n/a

   Datetime Format                n/a

   Thousand Separator             n/a

   Decimal Separator              n/a

   Case-Sensitive String          Must match.
   Comparison

   Null Ordering                  Must match.

   Sorted Input                   n/a



Sharing a Named Lookup Cache
  You can also share the cache between multiple Lookup transformations by using a persistent
  lookup cache and naming the cache files. You can share one cache between Lookup
  transformations in the same mapping or across mappings.
  The Integration Service uses the following process to share a named lookup cache:
  1.   When the Integration Service processes the first Lookup transformation, it searches the
       cache directory for cache files with the same file name prefix. For more information
       about the Cache File Name Prefix property, see “Lookup Properties” on page 316.
  2.   If the Integration Service finds the cache files and you do not specify to recache from
       source, the Integration Service uses the saved cache files.
  3.   If the Integration Service does not find the cache files or if you specify to recache from
       source, the Integration Service builds the lookup cache using the database table.


4.    The Integration Service saves the cache files to disk after it processes each target load
                   order group.
             5.    The Integration Service uses the following rules to process the second Lookup
                   transformation with the same cache file name prefix:
                   ♦   The Integration Service uses the memory cache if the transformations are in the same
                       target load order group.
                   ♦   The Integration Service rebuilds the memory cache from the persisted files if the
                       transformations are in different target load order groups.
                   ♦   The Integration Service rebuilds the cache from the database if you configure the
                       transformation to recache from source and the first transformation is in a different
                       target load order group.
                   ♦   The Integration Service fails the session if you configure subsequent Lookup
                       transformations to recache from source, but not the first one in the same target load
                       order group.
                   ♦   If the cache structures do not match, the Integration Service fails the session.
             If you run two sessions simultaneously that share a lookup cache, the Integration Service uses
             the following rules to share the cache files:
             ♦    The Integration Service processes multiple sessions simultaneously when the Lookup
                  transformations only need to read the cache files.
             ♦    The Integration Service fails the session if one session updates a cache file while another
                  session attempts to read or update the cache file. For example, Lookup transformations
                  update the cache file if they are configured to use a dynamic cache or recache from source.
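              For example, if you enter the cache file name prefix sales_lkp (a hypothetical prefix), the
              Integration Service searches the cache directory for cache files with that prefix, such as
              sales_lkp.idx and sales_lkp.dat. Enter the prefix only; do not enter the .idx or .dat
              extension.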

             Guidelines for Sharing a Named Lookup Cache
             Use the following guidelines when you configure Lookup transformations to share a named
             cache:
             ♦    You can share any combination of dynamic and static caches, but you must follow the
                  guidelines for location. For more information, see Table 15-8 on page 367.
             ♦    You must configure some of the transformation properties to enable named cache sharing.
                  For more information, see Table 15-9 on page 367.
             ♦    A dynamic lookup cannot share the cache if the named cache has duplicate rows.
             ♦    A named cache created by a dynamic Lookup transformation with a lookup policy of error
                  on multiple match can be shared by a static or dynamic Lookup transformation with any
                  lookup policy.
             ♦    A named cache created by a dynamic Lookup transformation with a lookup policy of use
                  first or use last can be shared by a Lookup transformation with the same lookup policy.
             ♦    Shared transformations must use the same output ports in the mapping. The criteria and
                  result columns for the cache must match the cache files.
             The Integration Service might use the memory cache, or it might build the memory cache
             from the file, depending on the type and location of the Lookup transformations.



Table 15-8 shows when you can share a static and dynamic named cache:

Table 15-8. Location for Sharing Named Cache

 Shared Cache                Location of Transformations            Cache Shared

 Static with Static          - Same target load order group.        - Integration Service uses memory cache.
                             - Separate target load order groups.   - Integration Service uses memory cache.
                             - Separate mappings.                   - Integration Service builds memory cache from file.

 Dynamic with Dynamic        - Separate target load order groups.   - Integration Service uses memory cache.
                             - Separate mappings.                   - Integration Service builds memory cache from file.

 Dynamic with Static         - Separate target load order groups.   - Integration Service builds memory cache from file.
                             - Separate mappings.                   - Integration Service builds memory cache from file.


For more information about target load order groups, see “Mappings” in the Designer Guide.
Table 15-9 describes the guidelines to follow when you configure Lookup transformations to
share a named cache:

Table 15-9. Properties for Sharing Named Cache

 Properties                       Configuration for Named Shared Cache

 Lookup SQL Override              If you use the Lookup SQL Override property, you must use the same override in all
                                  shared transformations.

 Lookup Table Name                Must match.

 Lookup Caching Enabled           Must be enabled.

 Lookup Policy on Multiple        - A named cache created by a dynamic Lookup transformation with a lookup policy of
 Match                              error on multiple match can be shared by a static or dynamic Lookup transformation
                                    with any lookup policy.
                                  - A named cache created by a dynamic Lookup transformation with a lookup policy of
                                    use first or use last can be shared by a Lookup transformation with the same lookup
                                    policy.

 Lookup Condition                 Shared transformations must use the same ports in the lookup condition. The conditions
                                  can use different operators, but the ports must be the same.

 Connection Information           The connection must be the same. When you configure the sessions, the database
                                  connection must match.

 Source Type                      Must match.

 Tracing Level                    n/a

 Lookup Cache Directory           Must match.
 Name

 Lookup Cache Persistent          Must be enabled.

 Lookup Data Cache Size           When transformations within the same mapping share a cache, the Integration Service
                                  allocates memory for the first shared transformation in each pipeline stage. It does not
                                  allocate additional memory for subsequent shared transformations in the same pipeline
                                  stage. For information about pipeline stages, see “Pipeline Partitioning” in the Workflow
                                  Administration Guide.



               Lookup Index Cache Size      When transformations within the same mapping share a cache, the Integration Service
                                            allocates memory for the first shared transformation in each pipeline stage. It does not
                                            allocate additional memory for subsequent shared transformations in the same pipeline
                                            stage. For information about pipeline stages, see “Pipeline Partitioning” in the Workflow
                                            Administration Guide.

               Dynamic Lookup Cache         For more information about sharing static and dynamic cache, see Table 15-8 on
                                            page 367.

               Output Old Value on Update   Does not need to match.

               Cache File Name Prefix       Must match. Enter the prefix only. Do not enter .idx or .dat. You cannot share a named
                                            cache with an unnamed cache.

               Recache from Source          If you configure a Lookup transformation to recache from source, subsequent Lookup
                                            transformations in the target load order group can share the existing cache whether or
                                            not you configure them to recache from source. If you configure subsequent Lookup
                                            transformations to recache from source, the Integration Service shares the cache
                                            instead of rebuilding the cache when it processes the subsequent Lookup
                                            transformation.
                                            If you do not configure the first Lookup transformation in a target load order group to
                                            recache from source, and you do configure the subsequent Lookup transformation to
                                            recache from source, the session fails.

               Lookup/Output Ports          Lookup/output ports must be identical, but they do not need to be in the same order.

               Insert Else Update           n/a

               Update Else Insert           n/a

               Thousand Separator           n/a

               Decimal Separator            n/a

               Case-Sensitive String        n/a
               Comparison

               Null Ordering                n/a

               Sorted Input                 Must match.


              Note: You cannot share a lookup cache created on a different operating system. For example,
              only an Integration Service on UNIX can read a lookup cache created by an Integration
              Service on UNIX, and only an Integration Service on Windows can read a lookup cache
              created by an Integration Service on Windows.




Lookup Cache Tips
      Use the following tips when you configure the Lookup transformation to cache the lookup
      table:

      Cache small lookup tables.
      Improve session performance by caching small lookup tables. The result of the lookup query
      and processing is the same, whether or not you cache the lookup table.

      Use a persistent lookup cache for static lookup tables.
      If the lookup table does not change between sessions, configure the Lookup transformation to
      use a persistent lookup cache. The Integration Service then saves and reuses cache files from
      session to session, eliminating the time required to read the lookup table.




Chapter 16




Normalizer Transformation
   This chapter includes the following topics:
   ♦   Overview, 372
   ♦   Normalizer Transformation Components, 374
   ♦   Normalizer Transformation Generated Keys, 379
   ♦   VSAM Normalizer Transformation, 380
   ♦   Pipeline Normalizer Transformation, 387
   ♦   Using a Normalizer Transformation in a Mapping, 394
   ♦   Troubleshooting, 399




Overview
                     Transformation type:
                     Active
                     Connected


              The Normalizer transformation receives a row that contains multiple-occurring columns and
              returns a row for each instance of the multiple-occurring data. The transformation processes
              multiple-occurring columns or multiple-occurring groups of columns in each source row.
              The Normalizer transformation parses multiple-occurring columns from COBOL sources,
              relational tables, or other sources. It can process multiple record types from a COBOL source
              that contains a REDEFINES clause.
              For example, a relational table contains quarterly sales totals by store. You need to create a row
              for each sales occurrence. You can configure a Normalizer transformation to return a separate
              row for each quarter.
              The following source rows contain four quarters of sales by store:
                     Store1 100 300 500 700

                     Store2 250 450 650 850

              The Normalizer returns a row for each store and sales combination. It also returns an index
              that identifies the quarter number:
                     Store1 100 1

                     Store1 300 2

                     Store1 500 3

                     Store1 700 4

                     Store2 250 1

                     Store2 450 2

                     Store2 650 3

                     Store2 850 4

              The Normalizer transformation generates a key for each source row. The Integration Service
              increments the generated key sequence number each time it processes a source row. When the
              source row contains a multiple-occurring column or a multiple-occurring group of columns,
              the Normalizer transformation returns a row for each occurrence. Each row contains the same
              generated key value.
              When the Normalizer returns multiple rows from a source row, it returns duplicate data for
              single-occurring source columns. For example, Store1 and Store2 repeat for each instance of
              sales.
              You can create a VSAM Normalizer transformation or a pipeline Normalizer transformation:
              ♦   VSAM Normalizer transformation. A non-reusable transformation that is a Source
                   Qualifier transformation for a COBOL source. The Mapping Designer creates VSAM
Normalizer columns from a COBOL source in a mapping. The column attributes are read-
    only. The VSAM Normalizer receives a multiple-occurring source column through one
    input port. For more information, see “VSAM Normalizer Transformation” on page 380.
♦   Pipeline Normalizer transformation. A transformation that processes multiple-occurring
    data from relational tables or flat files. You create the columns manually and edit them in
    the Transformation Developer or Mapping Designer. The pipeline Normalizer
    transformation represents multiple-occurring columns with one input port for each source
    column occurrence. For more information about the Pipeline Normalizer transformation,
    see “Pipeline Normalizer Transformation” on page 387.




Normalizer Transformation Components
              A Normalizer transformation contains the following tabs:
               ♦   Transformation. Enter the name and description of the transformation. The naming
                   convention for a Normalizer transformation is NRM_TransformationName. You can also
                   make the pipeline Normalizer transformation reusable.
              ♦   Ports. View the transformation ports and attributes. For more information, see “Ports
                  Tab” on page 374.
              ♦   Properties. Configure the tracing level to determine the amount of transaction detail
                  reported in the session log file. Choose to reset or restart the generated key sequence value
                  in the next session. For more information, see “Properties Tab” on page 376.
              ♦   Normalizer. Define the structure of the source data. The Normalizer tab defines source
                  data as columns and groups of columns. For more information, see “Normalizer Tab” on
                  page 377.
              ♦   Metadata Extensions. Configure the extension name, datatype, precision, and value. You
                  can also create reusable metadata extensions. For more information about creating
                  metadata extensions, see “Metadata Extensions” in the Repository Guide.
              Figure 16-1 shows the ports on the Normalizer transformation:

              Figure 16-1. Normalizer Transformation Ports




        Ports Tab
              When you define a Normalizer transformation, you configure the columns in the Normalizer
              tab. The Designer creates the ports. You can view the Normalizer ports and attributes on the
              Ports tab.




Pipeline and VSAM Normalizer transformations represent multiple-occurring source columns
differently. A VSAM Normalizer transformation has one input port for a multiple-occurring
column. A pipeline Normalizer transformation has multiple input ports for a multiple-
occurring column.
The Normalizer transformation has one output port for each single-occurring input port.
When a source column is multiple-occurring, the pipeline and VSAM Normalizer
transformations have one output port for the column. The transformation returns a row for
each source column occurrence.
The Normalizer transformation has a generated column ID (GCID) port for each multiple-
occurring column. The generated column ID is an index for the instance of the multiple-
occurring data. For example, if a column occurs four times in a source record, the Normalizer
returns a value of 1, 2, 3, or 4 in the generated column ID based on which instance of the
multiple-occurring data occurs in the row.
The naming convention for the Normalizer generated column ID is
GCID_<occurring_field_name>.
The Normalizer transformation has at least one generated key port. The Integration Service
increments the generated key sequence number each time it processes a source row.
Figure 16-2 shows the Normalizer transformation Ports tab:

Figure 16-2. Normalizer Ports Tab
(Figure callouts: Sales_By_Quarter is multiple-occurring in the source. The Normalizer
transformation has one output port for Sales_By_Quarter and returns four rows for each
source row. The Ports tab also shows the generated key start value.)




You can change the ports on a pipeline Normalizer transformation by editing the columns on
the Normalizer tab. To change a VSAM Normalizer transformation, you need to change the
COBOL source and recreate the transformation.
You can change the generated key start values on the Ports tab. You can enter different values
for each generated key. When you change a start value, the generated key value resets to the
start value the next time you run a session. For more information about generated keys, see
“Normalizer Transformation Generated Keys” on page 379.


For more information about the VSAM Normalizer Ports tab, see “VSAM Normalizer Ports
              Tab” on page 382.
              For more information about the pipeline Normalizer Ports tab, see “Pipeline Normalizer
              Ports Tab” on page 388.


        Properties Tab
              Configure the Normalizer transformation general properties on the Properties tab.
              Figure 16-3 shows the Normalizer transformation Properties tab:

              Figure 16-3. Normalizer Transformation Properties Tab




              Table 16-1 describes the Normalizer transformation properties:

              Table 16-1. Normalizer Transformation Properties

                                    Required/
               Property                                Description
                                    Optional

               Reset                Required           At the end of a session, resets the value sequence for each generated key
                                                       value to the value it was before the session. For more information about
                                                       generated keys, see “Normalizer Transformation Generated Keys” on
                                                       page 379.

               Restart              Required           Starts the generated key sequence at 1. Each time you run a session, the
                                                       key sequence value starts at 1 and overrides the sequence value on the
                                                       Ports tab. For more information about generated keys, see “Normalizer
                                                       Transformation Generated Keys” on page 379.

               Tracing Level        Required           Sets the amount of detail included in the session log when you run a
                                                       session containing this transformation. For more information, see
                                                       “Configuring Tracing Level in Transformations” on page 30.




Normalizer Tab
   The Normalizer tab defines the structure of the source data as columns and groups of
   columns. A group of columns might define a record in a COBOL source, or it might define a
   group of multiple-occurring fields in the source.
  The column level number identifies groups of columns in the data. Level numbers define a
  data hierarchy. Columns in a group have the same level number and display sequentially
  below a group-level column. A group-level column has a lower level number, and it contains
  no data.
  In Figure 16-4 on page 377, Quarterly_Data is a group-level column. It is Level 1. The
  Quarterly_Data group occurs four times in each row. Sales_by_Quarter and
  Returns_by_Quarter belong to the group. They are Level 2 columns.
  Figure 16-4 shows the Normalizer tab of a pipeline Normalizer transformation:

   Figure 16-4. Normalizer Tab
   (Figure callout: The Quarterly_Data columns occur 4 times.)




  Each column has an Occurs attribute. The Occurs attribute identifies columns or groups of
  columns that occur more than once in a source row.
  When you create a pipeline Normalizer transformation, you can edit the columns. When you
  create a VSAM Normalizer transformation, the Normalizer tab is read-only.
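   For the Quarterly_Data example, the column hierarchy on the Normalizer tab might look
   like the following sketch (a minimal illustration, not an exact screen layout):
          Quarterly_Data          Level 1   Occurs 4
            Sales_by_Quarter      Level 2
            Returns_by_Quarter    Level 2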




Table 16-2 describes the Normalizer tab attributes that are common to the VSAM and
              pipeline Normalizer transformations:

              Table 16-2. Normalizer Tab Columns

               Attribute                Description

               Column Name              Name of the source column.

               Level                    Group columns. Columns in the same group occur beneath a column with a lower level
                                        number. When each column is the same level, the transformation contains no column
                                        groups.

               Occurs                   The number of instances of a column or group of columns in the source row.

               Datatype                 The transformation column datatype can be String, Nstring, or Number.

               Prec                     Precision. Length of the column.

               Scale                    Number of decimal positions for a numeric column.


              The Normalizer tab for a VSAM Normalizer transformation contains the same attributes as
              the pipeline Normalizer transformation, but it includes attributes unique to a COBOL source
              definition. For more information about the Normalizer tab for a VSAM Normalizer
              transformation, see “VSAM Normalizer Tab” on page 383.
              For more information about the Normalizer tab for the pipeline Normalizer transformation,
              see “Pipeline Normalizer Tab” on page 390.




Normalizer Transformation Generated Keys
      The Normalizer transformation has at least one generated key column in the output row. The
      Integration Service increments the generated key sequence number each time it processes a
      source row. The Integration Service determines the initial key value from the generated key
      value in the Ports tab of the Normalizer transformation. When you create a Normalizer
      transformation, the generated key value is 1 by default.
      The naming convention for the Normalizer generated key is GK_<redefined_field_name>.
      For information about mapping the Normalizer transformation generated keys to targets, see
      “Generating Key Values” on page 396.


    Storing Generated Key Values
      You can view the current generated key values on the Normalizer transformation Ports tab. At
      the end of each session, the Integration Service updates the generated key value in the
      Normalizer transformation to the last value generated for the session plus one. If you have
      multiple instances of the Normalizer transformation in the repository, the Integration Service
      updates the generated key value in all versions when it runs a session.
      Note: Change the generated key sequence number only when you need to change the
      sequence. The Integration Service might pass duplicate keys to the target when you reset a
      generated key that exists in the target.


    Changing the Generated Key Values
      You can change the generated key value in the following ways:
      ♦   Modify the generated key sequence value. You can modify the generated key sequence
          value on the Ports tab of the Normalizer transformation. The Integration Service assigns
          the sequence value to the first generated key it creates for that column.
      ♦   Reset the generated key sequence. Reset the generated key sequence on the Normalizer
          transformation Properties tab. When you reset the generated key sequence, the Integration
          Service resets the generated key start value back to the value it was before the session. Reset
           the generated key sequence when you want to create the same generated key values each time
          you run the session.
      ♦   Restart the generated key sequence. Restart the generated key sequence on the Normalizer
          transformation Properties tab. When you restart the generated key sequence, the
          Integration Service starts the generated key sequence at 1 the next time it runs a session.
          When you restart the generated key sequence, the generated key start value does not
          change in the Normalizer transformation until you run a session. When you run the
          session, the Integration Service overrides the sequence number value on the Ports tab.
      When you reset or restart the generated key sequence, the reset or restart affects the generated
      key sequence values the next time you run a session. You do not change the current generated
      key sequence values in the Normalizer transformation. When you reset or restart the
      generated key sequence, the option is enabled for every session until you disable the option.
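       For example, suppose the generated key start value on the Ports tab is 101 and a session
       processes 50 source rows (hypothetical values). The Integration Service assigns generated
       keys 101 through 150 and stores 151 as the next start value. With Reset enabled, the next
       session starts the sequence at 101 again. With Restart enabled, the next session starts the
       sequence at 1.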


VSAM Normalizer Transformation
              The VSAM Normalizer transformation is the source qualifier for a COBOL source definition.
              A COBOL source is a flat file that can contain multiple-occurring data and multiple types of
              records in the same file.
              VSAM (Virtual Storage Access Method) is a file access method for an IBM mainframe
              operating system. VSAM files organize records in indexed or sequential flat files. However,
              you can use the VSAM Normalizer transformation for any flat file source that you define with
              a COBOL source definition.
              A COBOL source definition can have an OCCURS statement that defines a multiple-
              occurring column. The COBOL source definition can also contain a REDEFINES statement
              to define more than one type of record in the file.
              The following COBOL copybook defines a sales record:
              01 SALES_RECORD.
                    03   HDR_DATA.
                                05   HDR_REC_TYPE             PIC X.
                                05   HDR_STORE                PIC X(02).
                    03   STORE_DATA.
                                05   STORE_NAME           PIC X(30).
                                05   STORE_ADDR1          PIC X(30).
                                05   STORE_CITY           PIC X(30).
                    03   DETAIL_DATA REDEFINES STORE_DATA.
                                05   DETAIL_ITEM              PIC 9(9).
                                05   DETAIL_DESC              PIC X(30).
                                05   DETAIL_PRICE             PIC 9(4)V99.
                                05   DETAIL_QTY               PIC 9(5).
                                 05   SUPPLIER_INFO OCCURS 4 TIMES.
                                     10   SUPPLIER_CODE       PIC XX.
                                     10   SUPPLIER_NAME       PIC X(8).

              The sales file can contain two types of sales records. Store_Data defines a store and
              Detail_Data defines merchandise sold in the store. The REDEFINES clause indicates that
              Detail_Data fields might occur in a record instead of Store_Data fields.
              The first three characters of each sales record are the header. The header includes a record type
              and a store ID. The value of Hdr_Rec_Type defines whether the rest of the record contains
              store information or merchandise information. For example, when Hdr_Rec_Type is “S,” the
              record contains store data. When Hdr_Rec_Type is “D,” the record contains detail data.
              When the record contains detail data, it includes the Supplier_Info fields. The OCCURS
              clause defines four suppliers in each Detail_Data record.
              For more information about COBOL source definitions, see the Designer Guide.




Figure 16-5 shows the Sales_File COBOL source definition that you might create from the
COBOL copybook:

Figure 16-5. COBOL Source Definition Example
(Figure callout: Group-level columns identify groups of columns in a COBOL source
definition. Group-level columns do not contain data.)




The Sales_Rec, Hdr_Data, Store_Data, Detail_Data, and Supplier_Info columns are group-
level columns that identify groups of lower level data. Group-level columns have a length of
zero because they contain no data. None of these columns are output ports in the source
definition.
The Supplier_Info group contains Supplier_Code and Supplier_Name columns. The
Supplier_Info group occurs four times in each Detail_Data record.
When you create a VSAM Normalizer transformation from the COBOL source definition,
the Mapping Designer creates the input/output ports in the Normalizer transformation based
on the COBOL source definition. The Normalizer transformation contains at least one
generated key output port. When the COBOL source has multiple-occurring columns, the
Normalizer transformation has a generated column ID output port. For more information
about the generated column ID, see “Ports Tab” on page 374.
Figure 16-6 shows the Normalizer transformation ports the Mapping Designer creates from
the source definition:

Figure 16-6. Sales File VSAM Normalizer Transformation
(Figure callout: The Normalizer transformation has a generated key port and a generated
column ID port.)




In Figure 16-5 on page 381, the Supplier_Info group of columns occurs four times in each
              COBOL source row.
              The COBOL source row might contain the following data:
                     Item1 ItemDesc 100 25 A Supplier1 B Supplier2 C Supplier3 D Supplier4

              The Normalizer transformation returns a row for each occurrence of the Supplier_Code and
              Supplier_Name columns. Each output row contains the same item, description, price, and
              quantity values.
              The Normalizer returns the following detail data rows from the COBOL source row:
                     Item1 ItemDesc 100 25 A Supplier1 1 1

                     Item1 ItemDesc 100 25 B Supplier2 1 2
                     Item1 ItemDesc 100 25 C Supplier3 1 3

                     Item1 ItemDesc 100 25 D Supplier4 1 4

              Each output row contains a generated key and a column ID. The Integration Service updates
              the generated key value when it processes a new source row. In the detail data rows, the
              generated key value is 1.
              The column ID defines the Supplier_Info column occurrence number. The Integration
              Service updates the column ID for each occurrence of the Supplier_Info. The column ID
              values are 1, 2, 3, 4 in the detail data rows.
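
              The following minimal Python sketch (an illustration, not PowerCenter code)
              reproduces this flattening: one output row per Supplier_Info occurrence, a
              generated key (GK) that changes with each source row, and a generated column ID
              (GCID) of 1 through 4.

                     def normalize_detail(source_rows):
                         gk = 0
                         for row in source_rows:
                             gk += 1          # new generated key value per source row
                             for gcid, (code, name) in enumerate(row["suppliers"], start=1):
                                 yield (row["item"], row["desc"], row["price"], row["qty"],
                                        code, name, gk, gcid)

                     detail = [{"item": "Item1", "desc": "ItemDesc", "price": 100, "qty": 25,
                                "suppliers": [("A", "Supplier1"), ("B", "Supplier2"),
                                              ("C", "Supplier3"), ("D", "Supplier4")]}]
                     for out in normalize_detail(detail):
                         print(out)   # ('Item1', 'ItemDesc', 100, 25, 'A', 'Supplier1', 1, 1) ...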


        VSAM Normalizer Ports Tab
              The VSAM Normalizer Ports tab shows the transformation input and output ports. It has one
              input/output port for each COBOL source column. It has a single input/output port for
              each multiple-occurring column, not one port for each occurrence. The transformation
              does not have input or output ports for group level columns.




Figure 16-7 shows the VSAM Normalizer Ports tab:

  Figure 16-7. VSAM Normalizer Ports Tab




                                                                Supplier_Code and
                                                                Supplier_Name occur four times in
                                                                the COBOL source. The Ports tab
                                                                shows one Supplier_Code port
                                                                and one Supplier_Name port.




                                                                Generated Key Start Values




VSAM Normalizer Tab
  When you create a VSAM Normalizer transformation, the Mapping Designer creates the
  columns from a COBOL source. The Normalizer tab displays the same information as the
  COBOL source definition. You cannot edit the columns on a VSAM Normalizer tab.




Figure 16-8 shows a Normalizer tab for a VSAM Normalizer transformation:

              Figure 16-8. Normalizer Tab for a VSAM Normalizer Transformation




              Table 16-3 describes the VSAM Normalizer tab:

              Table 16-3. Normalizer Tab for a VSAM Normalizer Transformation

               Attribute                Description

               POffs                    Physical offset. Location of the field in the file. The first byte in the file is zero.

               Plen                     Physical length. Number of bytes in the field.

               Column Name              Name of the source field.

               Level                    Provides column group hierarchy. The higher the level number, the lower the data is in the
                                        hierarchy. Columns in the same group occur beneath a column with a lower level number.
                                        When all columns are at the same level, the transformation contains no column groups.

               Occurs                   The number of instances of a column or group of columns in the source row.

               Datatype                 The transformation datatype can be String, Nstring, or Number.

               Prec                     Precision. Length of the column.

               Scale                    Number of decimal positions for a numeric column.

               Picture                  How the data is stored or displayed in the source. Picture 99V99 defines a numeric field with
                                        two implied decimals. Picture X(10) indicates ten characters.

               Usage                    COBOL data storage format such as COMP, BINARY, and COMP-3. When the Usage is
                                        DISPLAY, the Picture clause defines how the source data is formatted when you view it.



   Key Type                 Type of key constraint to apply to this field. When you configure a field as a primary key, the
                            Integration Service generates unique numeric IDs for this field when running a session with a
                            COBOL source.

   Signed (S)               Indicates whether numeric values are signed.

   Trailing Sign (T)        Indicates that the sign (+ or -) exists in the last digit of the field. If not enabled, the sign
                            appears as the first character in the field.

   Included Sign (I)        Indicates whether the sign is included in any value appearing in the field.

    Real Decimal Point (R)   Indicates whether the decimal point in the data is an actual period (.) or is represented by
                             the V character in a numeric field.

   Redefines                Indicates that the column REDEFINES another column.

   Business Name            Descriptive name that you give to a column.



Steps to Create a VSAM Normalizer Transformation
  When you create a VSAM Normalizer transformation, you drag a COBOL source into a
  mapping and the Mapping Designer creates the transformation columns from the source. The
  Normalizer transformation is the source qualifier for the COBOL source in the mapping.
  When you add a COBOL source to a mapping, the Mapping Designer creates and configures
  a Normalizer transformation. The Mapping Designer identifies nested records and multiple-
  occurring fields in the COBOL source. It creates the columns and ports in the Normalizer
  transformation from the source columns.

  To create a VSAM Normalizer transformation:

  1.   In the Mapping Designer, create a new mapping or open an existing mapping.
  2.   Drag a COBOL source definition into the mapping.
       The Designer adds a Normalizer transformation and connects it to the COBOL source
       definition. If you have not enabled the option to create a source qualifier by default, the
       Create Normalizer Transformation dialog box appears:




For more information about the option to create a source qualifier by default, see “Using
                   the Designer” in the Designer Guide.
              3.   If the Create Normalizer Transformation dialog box appears, you can choose from the
                   following options:
                   ♦   VSAM Source. Create a transformation from the COBOL source definition in the
                       mapping.
                   ♦   Pipeline. Create a transformation, but do not define columns from a COBOL source.
                       Define the columns manually on the Normalizer tab. You might choose this option
                       when you want to process multiple-occurring data from another transformation in the
                       mapping.
                   To create the VSAM Normalizer transformation, select the VSAM Normalizer
                   transformation option. The dialog box displays the name of the COBOL source
                   definition in the mapping. Select the COBOL source definition and click OK.
              4.   Open the Normalizer transformation.
              5.   Select the Ports tab to view the ports in the Normalizer transformation. The Designer
                   creates the ports from the COBOL source definition by default.
              6.   Click the Normalizer tab to review the source column organization.
                   The Normalizer tab contains the same information as the Columns tab of the COBOL
                   source. However, you cannot modify column attributes in the Normalizer
                   transformation. To change column attributes, change the COBOL copybook, import the
                   COBOL source, and recreate the Normalizer transformation.
              7.   Select the Properties tab to set the tracing level. You can also configure the
                   transformation to reset the generated key sequence numbers at the start of the next
                   session. For more information about changing generated key values, see “Changing the
                   Generated Key Values” on page 379.




Pipeline Normalizer Transformation
      When you create a Normalizer transformation in the Transformation Developer, you create a
      pipeline Normalizer transformation by default. When you create a pipeline Normalizer
       transformation, you define the columns based on the data the transformation receives from
       another type of transformation, such as a Source Qualifier transformation. The Designer
      creates the input and output Normalizer transformation ports from the columns you define.
      Figure 16-9 shows the Normalizer transformation columns for a transformation that receives
      four sales columns in each relational source row:

      Figure 16-9. Pipeline Normalizer Columns




                                                                             Each source row has a StoreName
                                                                             column and four instances of
                                                                             Sales_By_Quarter.




      The source rows might contain the following data:
             Dellmark 100 450 650 780
             Tonys        666 333 444 555

      Figure 16-10 shows the ports that the Designer creates from the columns in the Normalizer
      transformation:

      Figure 16-10. Pipeline Normalizer Ports




                                                 A pipeline Normalizer transformation has an input
                                                 port for each instance of a multiple-occurring
                                                 column.
                                                 The transformation returns one instance of the
                                                 multiple-occurring column in each output row.




The Normalizer transformation returns one row for each instance of the multiple-occurring
              column:
                     Dellmark 100 1 1

                     Dellmark 450 1 2

                     Dellmark 650 1 3
                     Dellmark 780 1 4

                     Tonys        666 2 1

                     Tonys        333 2 2
                     Tonys        444 2 3

                     Tonys        555 2 4

              The Integration Service increments the generated key sequence number each time it processes
              a source row. The generated key links each quarter sales to the same store. In this example, the
              generated key for the Dellmark row is 1. The generated key for the Tonys store is 2.
              The transformation returns a generated column ID (GCID) for each instance of a multiple-
              occurring field. The GCID_Sales_by_Quarter value is always 1, 2, 3, or 4 in this example.
              For more information about the generated key, see “Normalizer Transformation Generated
              Keys” on page 379.
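
              As a quick illustration (again Python, not PowerCenter code), the following loop
              reproduces the output above: the generated key increments once per source row,
              and the GCID cycles from 1 to 4 within each row.

                     rows = [("Dellmark", [100, 450, 650, 780]),
                             ("Tonys",    [666, 333, 444, 555])]
                     for gk, (store, quarters) in enumerate(rows, start=1):
                         for gcid, sales in enumerate(quarters, start=1):
                             print(store, sales, gk, gcid)   # Dellmark 100 1 1 ... Tonys 555 2 4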


        Pipeline Normalizer Ports Tab
              The pipeline Normalizer Ports tab displays the input and output ports for the transformation.
              It has one input/output port for each single-occurring column you define in the
              transformation. It has one port for each occurrence of a multiple-occurring column. The
              transformation does not have input or output ports for group level columns.




Figure 16-11 shows the pipeline Normalizer transformation Ports tab:

Figure 16-11. Pipeline Normalizer Ports Tab




                                                                          The Designer creates an
                                                                          input port for each
                                                                          occurrence of a multiple-
                                                                          occurring column.
                                                                          You can change the
                                                                          generated key sequence
                                                                          number.




To change the ports in a pipeline Normalizer transformation, modify the columns in the
Normalizer tab. When you add a column occurrence, the Designer adds an input port. The
Designer creates ports for the lowest level columns. It does not create ports for group level
columns.




Pipeline Normalizer Tab
              When you create a pipeline Normalizer transformation, you define the columns on the
              Normalizer tab. The Designer creates input and output ports based on the columns you enter
              on the Normalizer tab.
              Figure 16-12 shows the Normalizer tab for a pipeline Normalizer transformation:

              Figure 16-12. Normalizer Tab




                                                                                                     Click Level to
                                                                                                     organize columns
                                                                                                     into groups.




              Table 16-4 describes the pipeline Normalizer tab attributes:

              Table 16-4. Pipeline Normalizer Tab

               Attribute                Description

               Column Name              Name of the column.

               Level                    Identifies groups of columns. Columns in the same group have the same level number.
                                         Default is zero. When all columns are at the same level, the transformation contains no
                                         column groups.

               Occurs                   The number of instances of a column or group of columns in the source row.

               Datatype                 The column datatype can be String, Nstring, or Number.

               Prec                     Precision. Length of the column.

               Scale                    Number of decimal digits in a numeric value.


              Normalizer Tab Column Groups
              When a source row contains groups of repeating columns, you can define column groups on
              the Normalizer tab. The Normalizer transformation returns a row for each column group
              occurrence instead of a row for each column occurrence.



The level number on the Normalizer tab identifies a hierarchy of columns. Group level
  columns identify groups of columns. The group level column has a lower level number than
  columns in the group. Columns in the same group have the same level number and display
  sequentially below the group level column on the Normalizer tab.
  Figure 16-13 shows a group of multiple-occurring columns in the Normalizer tab:

  Figure 16-13. Grouping Repeated Columns on the Normalizer Tab




                                                                          The NEWRECORD
                                                                          column contains no
                                                                          data. It is a Level 1
                                                                          group column. The
                                                                          group occurs four
                                                                          times in each source
                                                                          row.

                                                                          Store_Number and
                                                                          Store_Name are Level
                                                                          2 columns. They
                                                                          belong to the
                                                                          NEWRECORD group.




  For more information about creating columns and groups, see “Steps to Create a Pipeline
  Normalizer Transformation” on page 391.


Steps to Create a Pipeline Normalizer Transformation
  When you create a pipeline Normalizer transformation, you define the columns on the
  Normalizer tab.
  You can create a Normalizer transformation in the Transformation Developer or in the
  Mapping Designer.

  To create a Normalizer transformation:

  1.   In the Transformation Developer or the Mapping Designer, click Transformation >
       Create. Select Normalizer transformation. Enter a name for the Normalizer
       transformation.
       The naming convention for Normalizer transformations is NRM_TransformationName.
  2.   Click Create and click Done.
  3.   Open the Normalizer transformation and click the Normalizer tab.
  4.   Click Add to add a new column.




The Designer creates a new column with default attributes. You can change the name,
                   datatype, precision, and scale.
              5.   To create a multiple-occurring column, enter the number of occurrences in the Occurs
                   column.
              6.   To create a group of multiple-occurring columns, enter at least one of the columns on the
                   Normalizer tab. Select the column. Click Level.




                                                                                     Click Level to change column
                                                                                     levels.




                                                                                     All columns are the same level
                                                                                     by default. The Level defines
                                                                                     columns that are grouped
                                                                                     together.




                   The Designer adds a NEWRECORD group level column above the selected column.
                   NEWRECORD becomes Level 1. The selected column becomes Level 2. You can rename
                   the NEWRECORD column.
              7.   You can change the column level for other columns to add them to the same group. Select
                   a column and click Level to change it to the same level as the column above it. Columns
                   in the same group must appear sequentially in the Normalizer tab.




Figure 16-14 shows the NEWRECORD column that groups the Store_Number and
      Store_Name columns:

      Figure 16-14. Group-Level Column on the Normalizer Tab




                                                                             The NEWRECORD
                                                                             column is a level one
                                                                             group column.


                                                                             Store_Number and
                                                                             Store_Name are level
                                                                             two columns.




8.    Change the occurrence at the group level to make the group of columns multiple-
      occurring.
9.    Click Apply to save the columns and create input and output ports.
      The Designer creates the Normalizer transformation input and output ports. In addition,
      the Designer creates the generated key columns and a column ID for each multiple-
      occurring column or group of columns.
10.   Select the Properties tab to change the tracing level or reset the generated key sequence
      numbers after the next session. For more information about changing generated key
      values, see “Changing the Generated Key Values” on page 379.




Using a Normalizer Transformation in a Mapping
              When a Normalizer transformation receives more than one type of data from a COBOL
              source, you need to connect the Normalizer output ports to different targets based on the type
              of data in each row. The following example describes how to map the Sales_File COBOL
              source definition through a Normalizer transformation to multiple targets.
              The Sales_File source record contains either store information or information about items
              that a store sells. The sales file contains both types of records.
              The following example includes two sales file records:
              Store Record        H01Software Suppliers Incorporated 1111 Battery Street
                                  San Francisco
              Item Record         D01123456789USB Line - 10 Feet                      001495000020 01Supp1
                                  02Supp2   03Supp3    04Supp4


              The COBOL source definition and the Normalizer transformation have columns that
              represent fields in both types of records. You need to filter the store rows from the item rows
              and pass them to different targets.
              Figure 16-15 shows the Sales_File COBOL source:

              Figure 16-15. Sales File COBOL Source



                                                            Hdr_Rec_Type defines the type of data in the source
                                                            record.




                                                            The source record might contain the Store_Data
                                                            information or Detail_Data information with four
                                                            occurrences of Supplier_Info.




              The Hdr_Rec_Type defines whether the record contains store or merchandise data. When the
              Hdr_Rec_Type value is “S,” the record contains Store_Data. When the Hdr_Rec_Type is
              “D,” the record contains Detail_Data. Detail_Data always includes four occurrences of
              Supplier_Info fields.
              To filter data, connect the Normalizer output rows to a Router transformation to route the
              store, item, and supplier data to different targets. You can filter rows in the Router
              transformation based on the value of Hdr_Rec_Type.
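
              The following Python sketch illustrates the routing logic only. In PowerCenter you
              express these tests as Router group filter conditions, for example
              Hdr_Rec_Type = 'S'; the row dictionaries below are hypothetical stand-ins for
              Normalizer output.

                     normalized_rows = [
                         {"rec_type": "S", "store": "01", "name": "Software Suppliers Incorporated"},
                         {"rec_type": "D", "store": "01", "item": "123456789", "supplier": "Supp1"},
                         {"rec_type": "D", "store": "01", "item": "123456789", "supplier": "Supp2"},
                     ]

                     store_rows, detail_rows, supplier_rows = [], [], []
                     for row in normalized_rows:
                         if row["rec_type"] == "S":
                             store_rows.append(row)      # Store_Data target
                         elif row["rec_type"] == "D":
                             detail_rows.append(row)     # Detail_Data rows, deduplicated later
                             supplier_rows.append(row)   # Suppliers target, one row per occurrence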




Figure 16-16 shows the mapping that routes Sales_File records to different targets:

Figure 16-16. Multiple Record Types Routed to Different Targets


                                    The Router transformation filters store, detail, and supplier columns.
                                    The numbered callouts (1 through 6) correspond to the steps listed below.


The mapping filters multiple record types from the COBOL source to relational targets. The
multiple-occurring source columns are mapped to a separate relational table, and each row is
indexed by its occurrence in the source row.
The mapping contains the following transformations:
♦    Normalizer transformation. The Normalizer transformation returns multiple rows when
     the source contains multiple-occurring Detail_Data. It also processes different record types
     from the same source.
♦    Router transformation. The Router transformation routes data to targets based on the
     value of Hdr_Rec_Type.
♦    Aggregator transformation. The Aggregator transformation removes duplicate
     Detail_Data rows that occur with each Supplier_Info occurrence.
The mapping has the following functionality:
1.    The Normalizer transformation passes the header record type and header store number
      columns to the Sales_Header target. Each Sales_Header record has a generated key that
      links the Sales_Header row to a Store_Data or Detail_Data target row. The Normalizer
      returns Hdr_Data and Store_Data once per row.
2.    The Normalizer transformation passes all columns to the Router transformation. It passes
      Detail_Data data four times per row, once for each occurrence of the Supplier_Info
      columns. The Detail_Data columns contain duplicate data, except for the Supplier_Info
      columns.
3.    The Router transformation passes the store name, address, city, and generated key to
      Store_Data when the Hdr_Rec_Type is “S.” The generated key links Store_Data rows to
      Sales_Header rows.
      The Router transformation contains one user-defined group for the store data and one
      user-defined group for the merchandise items.


4.    The Router transformation passes the item, item description, price, quantity, and
                    Detail_Data generated keys to an Aggregator transformation when the Hdr_Rec_Type is
                    “D.”
              5.    The Router transformation passes the supplier code, name, and column ID to the
                    Suppliers target when the Hdr_Rec_Type is “D”. It passes the generated key that links
                    the Suppliers row to the Detail_Data row.
              6.    The Aggregator transformation removes the duplicate Detail_Data columns. The
                    Aggregator passes one instance of the item, description, price, quantity, and generated
                    key to Detail_Data. The Detail_Data generated key links the Detail_Data rows to the
                    Suppliers rows. Detail_Data also has a key that links the Detail_Data rows to
                    Sales_Header rows.
                   Figure 16-17 shows the user-defined groups and the filter conditions in the Router
                   transformation:

                   Figure 16-17. Router Transformation User-Defined Groups




                                                                                        The Router transformation
                                                                                        passes store data or item
                                                                                        data based on the record
                                                                                        type.




        Generating Key Values
              The Normalizer transformation creates a generated key when the COBOL source contains a
              group of multiple-occurring columns. You can pass a group of multiple-occurring columns to
              a different target than the other columns in the row. You can create a primary-foreign key
              relationship between the targets with the generated key. For more information about
              generated keys, see “Normalizer Transformation Generated Keys” on page 379.




For example, Figure 16-18 shows a COBOL source definition that contains a multiple-
occurring group of columns:

Figure 16-18. COBOL Source with a Multiple-Occurring Group of Columns




                                                   The Detail_Suppliers group of columns
                                                   occurs four times in the Detail_Record.




The Normalizer transformation generates a GK_Detail_Sales key for each source row. The
GK_Detail_Sales key represents one Detail_Record source row.
Figure 16-19 shows the primary-foreign key relationships between the targets:

Figure 16-19. Generated Keys in Target Tables

                                                        Multiple-occurring Detail_Supplier
                                                        rows have a foreign key linking them
                                                        to the same Detail_Sales row.


                                                        The Detail_Sales target has a one-to-
                                                        many relationship to the
                                                        Detail_Suppliers target.




Figure 16-20 shows the GK_Detail_Sales generated key connected to primary and foreign
              keys in the target:

              Figure 16-20. Generated Keys Mapped to Target Keys




                    Pass GK_Detail_Sales to the primary key
                    of Detail_Sales and the foreign key of
                    Detail_Suppliers.


              Map the Normalizer output columns to the following objects:
              ♦   Detail_Sales_Target. Pass the Detail_Item, Detail_Desc, Detail_Price, and Detail_Qty
                  columns to a Detail_Sales target. Pass the GK_Detail_Sales key to the Detail_Sales
                  primary key.
              ♦   Aggregator Transformation. Pass each Detail_Sales row through an Aggregator
                  transformation to remove duplicate rows. The Normalizer returns duplicate Detail_Sales
                  columns for each occurrence of Detail_Suppliers.
              ♦   Detail_Suppliers. Pass each instance of the Detail_Suppliers columns to the
                  Detail_Suppliers target. Pass the GK_Detail_Sales key to the Detail_Suppliers foreign key.
                  Each instance of the Detail_Suppliers columns has a foreign key that relates the
                  Detail_Suppliers row to the Detail_Sales row.
              For more information about connecting Normalizer transformation ports to relational targets,
              see “Using a Normalizer Transformation in a Mapping” on page 394.
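
              For illustration, assuming a GK_Detail_Sales value of 1 for the first source row,
              the target rows might look like the following sketch; the column names are
              hypothetical.

                     detail_sales = [
                         {"gk_detail_sales": 1, "item": "Item1", "price": 100, "qty": 25},
                     ]
                     detail_suppliers = [
                         {"fk_detail_sales": 1, "code": "A", "name": "Supplier1"},
                         {"fk_detail_sales": 1, "code": "B", "name": "Supplier2"},
                         {"fk_detail_sales": 1, "code": "C", "name": "Supplier3"},
                         {"fk_detail_sales": 1, "code": "D", "name": "Supplier4"},
                     ]
                     # Joining fk_detail_sales to gk_detail_sales reassembles the original record.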




Troubleshooting
      I cannot edit the ports in my Normalizer transformation when using a relational source.
      When you create ports manually, add them on the Normalizer tab in the transformation, not
      the Ports tab.

      Importing a COBOL file failed with numerous errors. What should I do?
      Verify that the COBOL program follows the COBOL standard, including spaces, tabs, and
      end of line characters. The COBOL file headings should be similar to the following text:
            identification division.

                           program-id. mead.

            environment division.
                      select file-one assign to "fname".

            data division.

            file section.
            fd FILE-ONE.

      The Designer does not read hidden characters in the COBOL program. Use a text-only editor
      to make changes to the COBOL file. Do not use Word or Wordpad. Remove extra spaces.

      A session that reads binary data completed, but the information in the target table is
      incorrect.
      Edit the session in the Workflow Manager and verify that the source file format is set
      correctly. The file format might be EBCDIC or ASCII. The number of bytes to skip between
      records must be set to 0.

      I have a COBOL field description that uses a non-IBM COMP type. How should I import the
      source?
      In the source definition, clear the IBM COMP option.

      In my mapping, I use one Expression transformation and one Lookup transformation to
      modify two output ports from the Normalizer transformation. The mapping concatenates
      them into a single transformation. All the ports are under the same level. When I check the
      data loaded in the target, it is incorrect. Why is that?
      You can only concatenate ports from level one. Remove the concatenation.




Chapter 17




Rank Transformation


   This chapter includes the following topics:
   ♦   Overview, 402
   ♦   Ports in a Rank Transformation, 404
   ♦   Defining Groups, 405
   ♦   Creating a Rank Transformation, 406




Overview
                    Transformation type:
                    Active
                    Connected


              You can select only the top or bottom rank of data with a Rank transformation. Use a Rank
             transformation to return the largest or smallest numeric value in a port or group. You can also
             use a Rank transformation to return the strings at the top or the bottom of a session sort
             order. During the session, the Integration Service caches input data until it can perform the
             rank calculations.
              The Rank transformation differs from the transformation functions MAX and MIN in that it
             lets you select a group of top or bottom values, not just one value. For example, use Rank to
             select the top 10 salespersons in a given territory. Or, to generate a financial report, you might
             also use a Rank transformation to identify the three departments with the lowest expenses in
             salaries and overhead. While the SQL language provides many functions designed to handle
             groups of data, identifying top or bottom strata within a set of rows is not possible using
             standard SQL functions.
             You connect all ports representing the same row set to the transformation. Only the rows that
             fall within that rank, based on some measure you set when you configure the transformation,
             pass through the Rank transformation. You can also write expressions to transform data or
             perform calculations.
             Figure 17-1 shows a mapping that passes employee data from a human resources table
              through a Rank transformation. The Rank transformation passes only the rows for the 10
              highest-paid employees to the next transformation.

             Figure 17-1. Sample Mapping with a Rank Transformation




             As an active transformation, the Rank transformation might change the number of rows
             passed through it. You might pass 100 rows to the Rank transformation, but select to rank
             only the top 10 rows, which pass from the Rank transformation to another transformation.
             You can connect ports from only one transformation to the Rank transformation. You can
             also create local variables and write non-aggregate expressions.



Ranking String Values
  When the Integration Service runs in the ASCII data movement mode, it sorts session data
  using a binary sort order.
  When the Integration Service runs in Unicode data movement mode, the Integration Service
  uses the sort order configured for the session. You select the session sort order in the session
   properties. The session properties list all available sort orders based on the code page used by
  the Integration Service.
  For example, you have a Rank transformation configured to return the top three values of a
  string port. When you configure the workflow, you select the Integration Service on which
  you want the workflow to run. The session properties display all sort orders associated with
  the code page of the selected Integration Service, such as French, German, and Binary. If you
  configure the session to use a binary sort order, the Integration Service calculates the binary
  value of each string, and returns the three rows with the highest binary values for the string.
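
   The following Python fragment sketches the idea of a binary sort order; the Latin-1
   code page is an assumption for illustration only.

          names = ["Émile", "Zoe", "alice", "Bob"]
          top3 = sorted(names, key=lambda s: s.encode("latin-1"), reverse=True)[:3]
          print(top3)   # ['Émile', 'alice', 'Zoe']: in a binary comparison, 0xC9 and
                        # the lowercase letters outrank uppercase 'Z' and 'B'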


Rank Caches
  During a session, the Integration Service compares an input row with rows in the data cache.
  If the input row out-ranks a cached row, the Integration Service replaces the cached row with
  the input row. If you configure the Rank transformation to rank across multiple groups, the
  Integration Service ranks incrementally for each group it finds.
  The Integration Service stores group information in an index cache and row data in a data
  cache. If you create multiple partitions in a pipeline, the Integration Service creates separate
  caches for each partition. For more information about caching, see “Session Caches” in the
  Workflow Administration Guide.
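
   The following Python sketch is a simplification of this caching behavior, not the
   actual implementation: it keeps at most N rows and replaces the lowest-ranked cached
   row whenever an input row out-ranks it. For grouped ranks, the Integration Service
   in effect maintains one such cache per group.

          import heapq

          def top_n(rows, n, key):
              cache = []                          # min-heap of (rank value, row)
              for row in rows:
                  item = (key(row), row)
                  if len(cache) < n:
                      heapq.heappush(cache, item)
                  elif item > cache[0]:           # input row out-ranks a cached row
                      heapq.heapreplace(cache, item)
              return sorted(cache, reverse=True)

          sales = [("Sam", 10000), ("Mary", 9000), ("Lee", 2000), ("Ron", 7000)]
          print(top_n(sales, 3, key=lambda r: r[1]))   # top three rows by sales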


Rank Transformation Properties
  When you create a Rank transformation, you can configure the following properties:
  ♦   Enter a cache directory.
  ♦   Select the top or bottom rank.
  ♦   Select the input/output port that contains values used to determine the rank. You can
      select only one port to define a rank.
  ♦   Select the number of rows falling within a rank.
  ♦   Define groups for ranks, such as the 10 least expensive products for each manufacturer.




Ports in a Rank Transformation
             The Rank transformation includes input or input/output ports connected to another
             transformation in the mapping. It also includes variable ports and a rank port. Use the rank
             port to specify the column you want to rank.
             Table 17-1 lists the ports in a Rank transformation:

             Table 17-1. Rank Transformation Ports

                 Ports      Number Required      Description

                 I          Minimum of one       Input port. Create an input port to receive data from another transformation.

                 O          Minimum of one       Output port. Create an output port for each port you want to link to another
                                                 transformation. You can designate input ports as output ports.

                  V          Not Required         Variable port. Use to store values or calculations to use in an
                                                 expression. Variable ports cannot be input or output ports. They pass data
                                                 within the transformation only.

                 R          One only             Rank port. Use to designate the column for which you want to rank values.
                                                 You can designate only one Rank port in a Rank transformation. The Rank
                                                 port is an input/output port. You must link the Rank port to another
                                                 transformation.



        Rank Index
             The Designer creates a RANKINDEX port for each Rank transformation. The Integration
             Service uses the Rank Index port to store the ranking position for each row in a group. For
             example, if you create a Rank transformation that ranks the top five salespersons for each
             quarter, the rank index numbers the salespeople from 1 to 5:
             RANKINDEX            SALES_PERSON              SALES
             1                    Sam                       10,000
             2                    Mary                      9,000
             3                    Alice                     8,000
             4                    Ron                       7,000
             5                    Alex                      6,000


             The RANKINDEX is an output port only. You can pass the rank index to another
             transformation in the mapping or directly to a target.




Defining Groups
      Like the Aggregator transformation, the Rank transformation lets you group information. For
      example, if you want to select the 10 most expensive items by manufacturer, you would first
      define a group for each manufacturer. When you configure the Rank transformation, you can
      set one of its input/output ports as a group by port. For each unique value in the group port
      (for example, MANUFACTURER_ID or MANUFACTURER_NAME), the transformation
      creates a group of rows falling within the rank definition (top or bottom, and a particular
      number in each rank).
      Therefore, the Rank transformation changes the number of rows in two different ways. By
      filtering all but the rows falling within a top or bottom rank, you reduce the number of rows
      that pass through the transformation. By defining groups, you create one set of ranked rows
      for each group.
      For example, you might create a Rank transformation to identify the 50 highest paid
      employees in the company. In this case, you would identify the SALARY column as the input/
      output port used to measure the ranks, and configure the transformation to filter out all rows
      except the top 50.
      After the Rank transformation identifies all rows that belong to a top or bottom rank, it then
      assigns rank index values. In the case of the top 50 employees, measured by salary, the highest
      paid employee receives a rank index of 1. The next highest-paid employee receives a rank
      index of 2, and so on. When measuring a bottom rank, such as the 10 lowest priced products
      in the inventory, the Rank transformation assigns a rank index from lowest to highest.
      Therefore, the least expensive item would receive a rank index of 1.
      If two rank values match, they receive the same value in the rank index and the
      transformation skips the next value. For example, if you want to see the top five retail stores in
      the country and two stores have the same sales, the return data might look similar to the
      following:
      RANKINDEX         SALES            STORE
       1                 100000           Orange
       1                 100000           Brea
       3                 90000            Los Angeles
       4                 80000            Ventura
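
       The following Python sketch (illustrative only) implements the same tie behavior:
       tied values share a rank index, and the next distinct value skips ahead.

              def rank_index(rows, key):
                  ranked = sorted(rows, key=key, reverse=True)
                  out, current = [], None
                  for pos, row in enumerate(ranked, start=1):
                      if current is None or key(row) != current[0]:
                          current = (key(row), pos)   # new rank index at this position
                      out.append((current[1],) + row)
                  return out

              stores = [("Orange", 100000), ("Brea", 100000),
                        ("Los Angeles", 90000), ("Ventura", 80000)]
              for r in rank_index(stores, key=lambda s: s[1]):
                  print(r)   # (1, 'Orange', ...), (1, 'Brea', ...), (3, ...), (4, ...)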




Creating a Rank Transformation
             You can add a Rank transformation anywhere in the mapping after the source qualifier.

             To create a Rank transformation:

              1.    In the Mapping Designer, click Transformation > Create. Select the Rank transformation.
                    Enter a name for the Rank transformation. The naming convention for Rank
                    transformations is RNK_TransformationName.
                   Enter a description for the transformation. This description appears in the Repository
                   Manager.
             2.    Click Create, and then click Done.
                   The Designer creates the Rank transformation.
             3.    Link columns from an input transformation to the Rank transformation.
             4.    Click the Ports tab, and then select the Rank (R) option for the port used to measure
                   ranks.




                   If you want to create groups for ranked rows, select Group By for the port that defines
                   the group.




5.   Click the Properties tab and select whether you want the top or bottom rank.




6.   For the Number of Ranks option, enter the number of rows you want to select for the
     rank.
7.   Change the other Rank transformation properties, if necessary.
     Table 17-2 describes the Rank transformation properties:

     Table 17-2. Rank Transformation Properties

      Setting                              Description

      Cache Directory                      Local directory where the Integration Service creates the index and data
                                           cache files. By default, the Integration Service uses the directory entered
                                           in the Workflow Manager for the process variable $PMCacheDir. If you
                                           enter a new directory, make sure the directory exists and contains
                                           enough disk space for the cache files.

      Top/Bottom                           Specifies whether you want the top or bottom ranking for a column.

      Number of Ranks                      Number of rows you want to rank.

      Case-Sensitive String Comparison     When running in Unicode mode, the Integration Service ranks strings
                                           based on the sort order selected for the session. If the session sort order
                                           is case-sensitive, select this option to enable case-sensitive string
                                           comparisons, and clear this option to have the Integration Service ignore
                                           case for strings. If the sort order is not case-sensitive, the Integration
                                           Service ignores this setting. By default, this option is selected.

      Tracing Level                        Determines the amount of information the Integration Service writes to
                                           the session log about data passing through this transformation in a
                                           session.





                    Rank Data Cache Size                 Data cache size for the transformation. Default is 2,000,000 bytes. If the
                                                         total configured session cache size is 2 GB (2,147,483,648 bytes) or
                                                         more, you must run the session on a 64-bit Integration Service. You can
                                                         configure a numeric value, or you can configure the Integration Service
                                                         to determine the cache size at runtime. If you configure the Integration
                                                         Service to determine the cache size, you can also configure a maximum
                                                         amount of memory for the Integration Service to allocate to the cache.

                    Rank Index Cache Size                Index cache size for the transformation. Default is 1,000,000 bytes. If the
                                                         total configured session cache size is 2 GB (2,147,483,648 bytes) or
                                                         more, you must run the session on a 64-bit Integration Service. You can
                                                         configure a numeric value, or you can configure the Integration Service
                                                         to determine the cache size at runtime. If you configure the Integration
                                                         Service to determine the cache size, you can also configure a maximum
                                                         amount of memory for the Integration Service to allocate to the cache.

                    Transformation Scope                 Specifies how the Integration Service applies the transformation logic to
                                                         incoming data:
                                                         - Transaction. Applies the transformation logic to all rows in a
                                                           transaction. Choose Transaction when a row of data depends on all
                                                           rows in the same transaction, but does not depend on rows in other
                                                           transactions.
                                                         - All Input. Applies the transformation logic on all incoming data. When
                                                            you choose All Input, the Integration Service drops incoming transaction
                                                           boundaries. Choose All Input when a row of data depends on all rows in
                                                           the source.
                                                         For more information about transformation scope, see “Understanding
                                                         Commit Points” in the Workflow Administration Guide.


             8.    Click OK.
             9.    Click Repository > Save.




Chapter 18




Router Transformation


   This chapter includes the following topics:
   ♦   Overview, 410
   ♦   Working with Groups, 412
   ♦   Working with Ports, 416
   ♦   Connecting Router Transformations in a Mapping, 418
   ♦   Creating a Router Transformation, 420




Overview
                     Transformation type:
                     Active
                     Connected


              A Router transformation is similar to a Filter transformation because both transformations
              allow you to use a condition to test data. A Filter transformation tests data for one condition
              and drops the rows of data that do not meet the condition. However, a Router transformation
              tests data for one or more conditions and gives you the option to route rows of data that do
              not meet any of the conditions to a default output group.
              If you need to test the same input data based on multiple conditions, use a Router
              transformation in a mapping instead of creating multiple Filter transformations to perform
              the same task. The Router transformation is more efficient. For example, to test data based on
               three conditions, you need only one Router transformation instead of three Filter
              transformations to perform this task. Likewise, when you use a Router transformation in a
              mapping, the Integration Service processes the incoming data only once. When you use
              multiple Filter transformations in a mapping, the Integration Service processes the incoming
              data for each transformation.
              Figure 18-1 shows two mappings that perform the same task. Mapping A uses three Filter
              transformations while Mapping B produces the same result with one Router transformation:

              Figure 18-1. Comparing Router and Filter Transformations
                             Mapping A                                          Mapping B




              A Router transformation consists of input and output groups, input and output ports, group
              filter conditions, and properties that you configure in the Designer.




Figure 18-2 shows a sample Router transformation and its components:

Figure 18-2. Sample Router Transformation




Input Ports                                         Input Group




User-Defined
Output Groups                                       Output Ports




Default Output Group




Working with Groups
              A Router transformation has the following types of groups:
              ♦   Input
              ♦   Output


        Input Group
              The Designer copies property information from the input ports of the input group to create a
              set of output ports for each output group.


        Output Groups
              There are two types of output groups:
              ♦   User-defined groups
              ♦   Default group
              You cannot modify or delete output ports or their properties.

              User-Defined Groups
              You create a user-defined group to test a condition based on incoming data. A user-defined
              group consists of output ports and a group filter condition. You can create and edit user-
              defined groups on the Groups tab with the Designer. Create one user-defined group for each
              condition that you want to specify.
              The Integration Service uses the condition to evaluate each row of incoming data. It tests the
              conditions of each user-defined group before processing the default group. The Integration
              Service determines the order of evaluation for each condition based on the order of the
              connected output groups. The Integration Service processes user-defined groups that are
               connected to a transformation or a target in a mapping. The Integration Service processes
               user-defined groups that are not connected in a mapping only if the default group is
               connected to a transformation or a target.
              If a row meets more than one group filter condition, the Integration Service passes this row
              multiple times.

              The Default Group
              The Designer creates the default group after you create one new user-defined group. The
              Designer does not allow you to edit or delete the default group. This group does not have a
              group filter condition associated with it. If all of the conditions evaluate to FALSE, the
              Integration Service passes the row to the default group. If you want the Integration Service to




drop all rows in the default group, do not connect it to a transformation or a target in a
   mapping.
   The Designer deletes the default group when you delete the last user-defined group from the
   list.


Using Group Filter Conditions
   You can test data based on one or more group filter conditions. You create group filter
   conditions on the Groups tab using the Expression Editor. You can enter any expression that
   returns a single value. You can also specify a constant for the condition. A group filter
   condition returns TRUE or FALSE for each row that passes through the transformation,
   depending on whether a row satisfies the specified condition. Zero (0) is the equivalent of
   FALSE, and any non-zero value is the equivalent of TRUE. The Integration Service passes the
   rows of data that evaluate to TRUE to each transformation or target that is associated with
   each user-defined group.
   For example, you have customers from nine countries, and you want to perform different
   calculations on the data from only three countries. You might want to use a Router
   transformation in a mapping to filter this data to three different Expression transformations.
   There is no group filter condition associated with the default group. However, you can create
   an Expression transformation to perform a calculation based on the data from the other six
   countries.
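
    The following Python sketch illustrates the evaluation semantics only; it is not
    PowerCenter code, and the conditions are hypothetical. Each row is tested against
    every user-defined condition, goes to every group whose condition returns TRUE, and
    falls to the default group when no condition matches.

           groups = {                       # hypothetical group filter conditions
               "Japan":  lambda row: row["country"] == "Japan",
               "France": lambda row: row["country"] == "France",
               "USA":    lambda row: row["country"] == "USA",
           }

           def route(rows):
               output = {name: [] for name in groups}
               output["DEFAULT"] = []
               for row in rows:
                   matched = False
                   for name, condition in groups.items():
                       if condition(row):
                           output[name].append(row)    # a row can pass to several groups
                           matched = True
                   if not matched:
                       output["DEFAULT"].append(row)   # for example, the other six countries
               return output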
   Figure 18-3 shows a mapping with a Router transformation that filters data based on multiple
   conditions:

   Figure 18-3. Using a Router Transformation in a Mapping




Since you want to perform multiple calculations based on the data from three different
              countries, create three user-defined groups and specify three group filter conditions on the
              Groups tab.
              Figure 18-4 shows specifying group filter conditions in a Router transformation to filter
              customer data:

              Figure 18-4. Specifying Group Filter Conditions




              In the session, the Integration Service passes the rows of data that evaluate to TRUE to each
              transformation or target that is associated with each user-defined group, such as Japan,
              France, and USA. The Integration Service passes the row to the default group if all of the
              conditions evaluate to FALSE. If this happens, the Integration Service passes the data of the
              other six countries to the transformation or target that is associated with the default group. If
              you want the Integration Service to drop all rows in the default group, do not connect it to a
              transformation or a target in a mapping.


        Adding Groups
              Adding a group is similar to adding a port in other transformations. The Designer copies
              property information from the input ports to the output ports. For more information, see
              “Working with Groups” on page 412.

              To add a group to a Router transformation:

              1.   Click the Groups tab.
              2.   Click the Add button.
              3.   Enter a name for the new group in the Group Name section.
              4.   Click the Group Filter Condition field and open the Expression Editor.


5.   Enter the group filter condition.
6.   Click Validate to check the syntax of the condition.
7.   Click OK.




Working with Ports
              A Router transformation has input ports and output ports. Input ports are in the input group,
              and output ports are in the output groups. You can create input ports by copying them from
              another transformation or by manually creating them on the Ports tab.
              Figure 18-5 shows the Ports tab of a Router transformation:

              Figure 18-5. Router Transformation Ports Tab




              The Designer creates output ports by copying the following properties from the input ports:
              ♦   Port name
              ♦   Datatype
              ♦   Precision
              ♦   Scale
              ♦   Default value
              When you make changes to the input ports, the Designer updates the output ports to reflect
              these changes. You cannot edit or delete output ports. The output ports display in the Normal
              view of the Router transformation.
              The Designer creates output port names based on the input port names. For each input port,
              the Designer creates a corresponding output port in each output group.




Figure 18-6 shows the output port names of a Router transformation in Normal view that
correspond to the input port names:

Figure 18-6. Input Port Name and Corresponding Output Port Names




Connecting Router Transformations in a Mapping
              When you connect transformations to a Router transformation in a mapping, consider the
              following rules:
              ♦   You can connect one group to one transformation or target.

              ♦   You can connect one output port in a group to multiple transformations or targets.

              ♦   You can connect multiple output ports in one group to multiple transformations or targets.

              ♦   You cannot connect more than one group to one target or a single input group
                  transformation.




♦   You can connect more than one group to a multiple input group transformation (except a
    Joiner transformation) if you connect each output group to a different input group.




Creating a Router Transformation
              To add a Router transformation to a mapping, complete the following steps.

              To create a Router transformation:

              1.    In the Mapping Designer, open a mapping.
              2.    Click Transformation > Create.
                    Select Router transformation, and enter the name of the new transformation. The
                    naming convention for the Router transformation is RTR_TransformationName. Click
                    Create, and then click Done.
              3.    Select and drag all the ports from a transformation to add them to the Router
transformation, or manually create input ports on the Ports tab.
              4.    Double-click the title bar of the Router transformation to edit transformation properties.
              5.    Click the Transformation tab and configure transformation properties.
              6.    Click the Properties tab and configure tracing levels.
                    For more information about configuring tracing levels, see “Configuring Tracing Level in
                    Transformations” on page 30.
              7.    Click the Groups tab, and then click the Add button to create a user-defined group.
                    The Designer creates the default group when you create the first user-defined group.
              8.    Click the Group Filter Condition field to open the Expression Editor.
              9.    Enter a group filter condition.
              10.   Click Validate to check the syntax of the conditions you entered.
              11.   Click OK.
              12.   Connect group output ports to transformations or targets.
              13.   Click Repository > Save.




Chapter 19




Sequence Generator
Transformation
   This chapter includes the following topics:
   ♦   Overview, 422
   ♦   Common Uses, 423
   ♦   Sequence Generator Ports, 424
   ♦   Transformation Properties, 427
   ♦   Creating a Sequence Generator Transformation, 432




Overview
                    Transformation type:
                    Passive
                    Connected


             The Sequence Generator transformation generates numeric values. Use the Sequence
             Generator to create unique primary key values, replace missing primary keys, or cycle through
             a sequential range of numbers.
             The Sequence Generator transformation is a connected transformation. It contains two
             output ports that you can connect to one or more transformations. The Integration Service
             generates a block of sequence numbers each time a block of rows enters a connected
             transformation. If you connect CURRVAL, the Integration Service processes one row in each
             block. When NEXTVAL is connected to the input port of another transformation, the
             Integration Service generates a sequence of numbers. When CURRVAL is connected to the
             input port of another transformation, the Integration Service generates the NEXTVAL value
             plus the Increment By value.
             You can make a Sequence Generator reusable, and use it in multiple mappings. You might
             reuse a Sequence Generator when you perform multiple loads to a single target.
             For example, if you have a large input file that you separate into three sessions running in
             parallel, use a Sequence Generator to generate primary key values. If you use different
             Sequence Generators, the Integration Service might generate duplicate key values. Instead,
             use the reusable Sequence Generator for all three sessions to provide a unique value for each
             target row.




Common Uses
     You can complete the following tasks with a Sequence Generator transformation:
     ♦   Create keys.
     ♦   Replace missing values.
     ♦   Cycle through a sequential range of numbers.


   Creating Keys
     You can create approximately two billion primary or foreign key values with the Sequence
     Generator transformation by connecting the NEXTVAL port to the transformation or target
     and using the widest range of values (1 to 2147483647) with the smallest interval (1).
When you create primary or foreign keys, use the Cycle option only when you can ensure
that the Integration Service does not create duplicate primary keys. You might do this by
selecting the Truncate Target Table option in the session properties (if appropriate) or by
creating composite keys.
     To create a composite key, you can configure the Integration Service to cycle through a
     smaller set of values. For example, if you have three stores generating order numbers, you
     might have a Sequence Generator cycling through values from 1 to 3, incrementing by 1.
     When you pass the following set of foreign keys, the generated values then create unique
     composite keys:
     COMPOSITE_KEY ORDER_NO
     1                  12345
     2                  12345
     3                  12345
     1                  12346
     2                  12346
     3                  12346



   Replacing Missing Values
     Use the Sequence Generator transformation to replace missing keys by using NEXTVAL with
     the IIF and ISNULL functions.
For example, to replace null values in the ORDER_NO column, you create a Sequence
Generator transformation with the appropriate properties and drag the NEXTVAL port to an Expression
     transformation. In the Expression transformation, drag the ORDER_NO port into the
     transformation (along with any other necessary ports). Then create a new output port,
     ALL_ORDERS.
     In ALL_ORDERS, you can then enter the following expression to replace null orders:
            IIF( ISNULL( ORDER_NO ), NEXTVAL, ORDER_NO )




Sequence Generator Ports
             The Sequence Generator transformation provides two output ports: NEXTVAL and
             CURRVAL. You cannot edit or delete these ports. Likewise, you cannot add ports to the
             transformation.


        NEXTVAL
             Connect NEXTVAL to multiple transformations to generate unique values for each row in
             each transformation. Use the NEXTVAL port to generate sequence numbers by connecting it
             to a transformation or target. You connect the NEXTVAL port to a downstream
             transformation to generate the sequence based on the Current Value and Increment By
             properties. For more information about Sequence Generator properties, see Table 19-1 on
             page 427.
             For example, you might connect NEXTVAL to two target tables in a mapping to generate
             unique primary key values. The Integration Service creates a column of unique primary key
             values for each target table. The column of unique primary key values is sent to one target
table as a block of sequence numbers. The second target receives a block of sequence
             numbers from the Sequence Generator transformation only after the first target table receives
             the block of sequence numbers.
             Figure 19-1 shows connecting NEXTVAL to two target tables in a mapping:

             Figure 19-1. Connecting NEXTVAL to Two Target Tables in a Mapping




             For example, you configure the Sequence Generator transformation as follows: Current Value
             = 1, Increment By = 1. When you run the workflow, the Integration Service generates the
             following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN
             target tables:
              T_ORDERS_PRIMARY TABLE:          T_ORDERS_FOREIGN TABLE:
              PRIMARY KEY                      PRIMARY KEY
              1                                6
              2                                7
              3                                8
              4                                9
              5                                10


If you want the same values to go to more than one target that receives data from a single
transformation, you can connect a Sequence Generator transformation to that preceding
transformation. The Integration Service processes the values into a block of sequence
numbers. This allows the Integration Service to pass unique values to the transformation, and
then route rows from the transformation to targets.
Figure 19-2 shows a mapping with a Sequence Generator transformation that passes unique values to the
Expression transformation. The Expression transformation then populates both targets with
identical primary key values.

Figure 19-2. Mapping with a Sequence Generator and an Expression Transformation




For example, you configure the Sequence Generator transformation as follows: Current Value
= 1, Increment By = 1. When you run the workflow, the Integration Service generates the
following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN
target tables:
T_ORDERS_PRIMARY TABLE:          T_ORDERS_FOREIGN TABLE:
PRIMARY KEY                      PRIMARY KEY
1                                1
2                                2
3                                3
4                                4
5                                5


Note: When you run a partitioned session on a grid, the Sequence Generator transformation
may skip values depending on the number of rows in each partition.




CURRVAL
             CURRVAL is NEXTVAL plus the Increment By value. You typically only connect the
             CURRVAL port when the NEXTVAL port is already connected to a downstream
             transformation. When a row enters the transformation connected to the CURRVAL port, the
Integration Service passes the last-created NEXTVAL value plus the Increment By value.
             For information about the Increment By value, see “Increment By” on page 428.
             Figure 19-3 shows connecting CURRVAL and NEXTVAL ports to a target:

             Figure 19-3. Connecting CURRVAL and NEXTVAL Ports to a Target




             For example, you configure the Sequence Generator transformation as follows: Current Value
             = 1, Increment By = 1. When you run the workflow, the Integration Service generates the
             following values for NEXTVAL and CURRVAL:
             NEXTVAL      CURRVAL
             1            2
             2            3
             3            4
             4            5
             5            6


             If you connect the CURRVAL port without connecting the NEXTVAL port, the Integration
             Service passes a constant value for each row.
             When you connect the CURRVAL port in a Sequence Generator transformation, the
             Integration Service processes one row in each block. You can optimize performance by
             connecting only the NEXTVAL port in a mapping.
             Note: When you run a partitioned session on a grid, the Sequence Generator transformation
             may skip values depending on the number of rows in each partition.




Transformation Properties
      The Sequence Generator transformation is unique among all transformations because you
      cannot add, edit, or delete its default ports (NEXTVAL and CURRVAL).
      Table 19-1 lists the Sequence Generator transformation properties you can configure:

      Table 19-1. Sequence Generator Transformation Properties

        Sequence Generator Setting    Required/Optional    Description

       Start Value            Required             Start value of the generated sequence that you want the Integration
                                                   Service to use if you use the Cycle option. If you select Cycle, the
                                                   Integration Service cycles back to this value when it reaches the end
                                                   value.
                                                   Default is 0.

       Increment By           Required             Difference between two consecutive values from the NEXTVAL port.
                                                   Default is 1.

       End Value              Optional             Maximum value the Integration Service generates. If the Integration
                                                   Service reaches this value during the session and the sequence is
                                                   not configured to cycle, the session fails.

       Current Value          Optional             Current value of the sequence. Enter the value you want the
                                                   Integration Service to use as the first value in the sequence. If you
                                                   want to cycle through a series of values, the value must be greater
                                                   than or equal to the start value and less than the end value.
                                                   If the Number of Cached Values is set to 0, the Integration Service
                                                   updates the current value to reflect the last-generated value for the
                                                   session plus one, and then uses the updated current value as the
                                                   basis for the next time you run this session. However, if you use the
                                                   Reset option, the Integration Service resets this value to its original
                                                   value after each session.
                                                   Note: If you edit this setting, you reset the sequence to the new
                                                   setting. If you reset Current Value to 10, and the increment is 1, the
next time you run the session, the Integration Service generates a
                                                   first value of 10.

       Cycle                  Optional             If selected, the Integration Service cycles through the sequence
                                                   range. Otherwise, the Integration Service stops the sequence at the
                                                   configured end value.
                                                   If disabled, the Integration Service fails the session with overflow
                                                   errors if it reaches the end value and still has rows to process.

       Number of Cached       Optional             Number of sequential values the Integration Service caches at a
       Values                                      time. Use this option when multiple sessions use the same reusable
                                                   Sequence Generator at the same time to ensure each session
                                                   receives unique values. The Integration Service updates the
                                                   repository as it caches each value. When set to 0, the Integration
                                                   Service does not cache values.
                                                   Default value for a standard Sequence Generator is 0.
                                                   Default value for a reusable Sequence Generator is 1,000.





               Reset                  Optional            If selected, the Integration Service generates values based on the
                                                          original current value for each session. Otherwise, the Integration
                                                          Service updates the current value to reflect the last-generated value
                                                          for the session plus one, and then uses the updated current value as
                                                          the basis for the next session run.
                                                          This option is disabled for reusable Sequence Generator
                                                          transformations.

               Tracing Level          Optional            Level of detail about the transformation that the Integration Service
                                                          writes into the session log.



        Start Value and Cycle
             Use Cycle to generate a repeating sequence, such as numbers 1 through 12 to correspond to
             the months in a year.

             To cycle the Integration Service through a sequence:

             1.   Enter the lowest value in the sequence that you want the Integration Service to use for
                  the Start Value.
             2.   Enter the highest value to be used for End Value.
             3.   Select Cycle.
When the Integration Service reaches the configured end value for the sequence, it wraps
around and starts the cycle again, beginning with the configured Start Value.
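For example, to cycle through month numbers, you might configure the transformation as
follows (an illustrative configuration that applies the properties described in this chapter):

      Start Value   = 1
      End Value     = 12
      Current Value = 1
      Increment By  = 1
      Cycle         = selected

With these settings, the Integration Service generates 1, 2, 3, ... 12, and then wraps around
to 1 for the next row.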


        Increment By
             The Integration Service generates a sequence (NEXTVAL) based on the Current Value and
             Increment By properties in the Sequence Generator transformation.
             The Current Value property is the value at which the Integration Service starts creating the
             sequence for each session. Increment By is the integer the Integration Service adds to the
             existing value to create the new value in the sequence. By default, the Current Value is set to
             1, and Increment By is set to 1.
             For example, you might create a Sequence Generator transformation with a current value of
             1,000 and an increment of 10. If you pass three rows through the mapping, the Integration
             Service generates the following set of values:
                       1000

                       1010

                       1020




End Value
  End Value is the maximum value you want the Integration Service to generate. If the
  Integration Service reaches the end value and the Sequence Generator is not configured to
  cycle through the sequence, the session fails with the following error message:
        TT_11009 Sequence Generator Transformation: Overflow error.

  You can set the end value to any integer between 1 and 2,147,483,647.


Current Value
  The Integration Service uses the current value as the basis for generated values for each
  session. To indicate which value you want the Integration Service to use the first time it uses
  the Sequence Generator transformation, you must enter that value as the current value. If you
  want to use the Sequence Generator transformation to cycle through a series of values, the
  current value must be greater than or equal to Start Value and less than the end value.
  At the end of each session, the Integration Service updates the current value to the last value
  generated for the session plus one if the Sequence Generator Number of Cached Values is 0.
  For example, if the Integration Service ends a session with a generated value of 101, it updates
  the Sequence Generator current value to 102 in the repository. The next time the Sequence
  Generator is used, the Integration Service uses 102 as the basis for the next generated value. If
  the Sequence Generator Increment By is 1, when the Integration Service starts another session
  using the Sequence Generator, the first generated value is 102.
  If you have multiple versions of a Sequence Generator transformation, the Integration Service
  updates the current value across all versions when it runs a session. The Integration Service
  updates the current value across versions regardless of whether you have checked out the
  Sequence Generator transformation or the parent mapping. The updated current value
  overrides an edited current value for a Sequence Generator transformation if the two values
  are different.
For example, User 1 creates a Sequence Generator transformation and checks it in, saving a
  current value of 10 to Sequence Generator version 1. Then User 1 checks out the Sequence
  Generator transformation and enters a new current value of 100 to Sequence Generator
  version 2. User 1 keeps the Sequence Generator transformation checked out. Meanwhile, User
  2 runs a session that uses the Sequence Generator transformation version 1. The Integration
  Service uses the checked-in value of 10 as the current value when User 2 runs the session.
  When the session completes, the current value is 150. The Integration Service updates the
  current value to 150 for version 1 and version 2 of the Sequence Generator transformation
  even though User 1 has the Sequence Generator transformation checked out.
  If you open the mapping after you run the session, the current value displays the last value
  generated for the session plus one. Since the Integration Service uses the current value to
  determine the first value for each session, you should edit the current value only when you
  want to reset the sequence.
  If you have multiple versions of the Sequence Generator transformation, and you want to
  reset the sequence, you must check in the mapping or Sequence Generator (reusable)
  transformation after you modify the current value.


Note: If you configure the Sequence Generator to Reset, the Integration Service uses the
             current value as the basis for the first generated value for each session.


        Number of Cached Values
             Number of Cached Values determines the number of values the Integration Service caches at
             one time. When Number of Cached Values is greater than zero, the Integration Service caches
             the configured number of values and updates the current value each time it caches values.
             When multiple sessions use the same reusable Sequence Generator transformation at the same
             time, there might be multiple instances of the Sequence Generator transformation. To avoid
             generating the same values for each session, reserve a range of sequence values for each session
             by configuring Number of Cached Values.
             Tip: To increase performance when running a session on a grid, increase the number of cached
             values for the Sequence Generator transformation. This reduces the communication required
             between the master and worker DTM processes and the repository.

             Non-Reusable Sequence Generators
             For non-reusable Sequence Generator transformations, Number of Cached Values is set to
             zero by default, and the Integration Service does not cache values during the session. When
             the Integration Service does not cache values, it accesses the repository for the current value at
             the start of a session. The Integration Service then generates values for the sequence. At the
             end of the session, the Integration Service updates the current value in the repository.
             When you set Number of Cached Values greater than zero, the Integration Service caches
             values during the session. At the start of the session, the Integration Service accesses the
             repository for the current value, caches the configured number of values, and updates the
             current value accordingly. If the Integration Service exhausts the cache, it accesses the
             repository for the next set of values and updates the current value. At the end of the session,
             the Integration Service discards any remaining values in the cache.
             For non-reusable Sequence Generator transformations, setting Number of Cached Values
             greater than zero can increase the number of times the Integration Service accesses the
repository during the session. It also creates gaps in the sequence, because the Integration
Service discards unused cached values at the end of each session.
             For example, you configure a Sequence Generator transformation as follows: Number of
             Cached Values = 50, Current Value = 1, Increment By = 1. When the Integration Service
             starts the session, it caches 50 values for the session and updates the current value to 50 in the
             repository. The Integration Service uses values 1 to 39 for the session and discards the unused
             values, 40 to 49. When the Integration Service runs the session again, it checks the repository
             for the current value, which is 50. It then caches the next 50 values and updates the current
             value to 100. During the session, it uses values 50 to 98. The values generated for the two
             sessions are 1 to 39 and 50 to 98.




Reusable Sequence Generators
  When you have a reusable Sequence Generator transformation in several sessions and the
  sessions run at the same time, use Number of Cached Values to ensure each session receives
  unique values in the sequence. By default, Number of Cached Values is set to 1000 for
  reusable Sequence Generators.
  When multiple sessions use the same Sequence Generator transformation at the same time,
  you risk generating the same values for each session. To avoid this, have the Integration
  Service cache a set number of values for each session by configuring Number of Cached
  Values.
  For example, you configure a reusable Sequence Generator transformation as follows:
  Number of Cached Values = 50, Current Value = 1, Increment By = 1. Two sessions use the
  Sequence Generator, and they are scheduled to run at approximately the same time. When the
  Integration Service starts the first session, it caches 50 values for the session and updates the
  current value to 50 in the repository. The Integration Service begins using values 1 to 50 in
  the session. When the Integration Service starts the second session, it checks the repository for
  the current value, which is 50. It then caches the next 50 values and updates the current value
  to 100. It then uses values 51 to 100 in the second session. When either session uses all its
  cached values, the Integration Service caches a new set of values and updates the current value
  to ensure these values remain unique to the Sequence Generator.
For reusable Sequence Generator transformations, you can reduce Number of Cached Values
to minimize discarded values; however, the value must be greater than one. When you reduce the
  Number of Cached Values, you might increase the number of times the Integration Service
  accesses the repository to cache values during the session.


Reset
  If you select Reset for a non-reusable Sequence Generator transformation, the Integration
  Service generates values based on the original current value each time it starts the session.
  Otherwise, the Integration Service updates the current value to reflect the last-generated value
  plus one, and then uses the updated value the next time it uses the Sequence Generator
  transformation.
  For example, you might configure a Sequence Generator transformation to create values from
  1 to 1,000 with an increment of 1, and a current value of 1 and choose Reset. During the first
  session run, the Integration Service generates numbers 1 through 234. The next time (and
  each subsequent time) the session runs, the Integration Service again generates numbers
  beginning with the current value of 1.
  If you do not select Reset, the Integration Service updates the current value to 235 at the end
  of the first session run. The next time it uses the Sequence Generator transformation, the first
  value generated is 235.
  Note: Reset is disabled for reusable Sequence Generator transformations.




Creating a Sequence Generator Transformation
             To use a Sequence Generator transformation in a mapping, add it to the mapping, configure
             the transformation properties, and then connect NEXTVAL or CURRVAL to one or more
             transformations.

             To create a Sequence Generator transformation:

             1.   In the Mapping Designer, click Transformation > Create. Select the Sequence Generator
                  transformation.
                  The naming convention for Sequence Generator transformations is
                  SEQ_TransformationName.
             2.   Enter a name for the Sequence Generator, and click Create. Click Done.
                  The Designer creates the Sequence Generator transformation.




             3.   Double-click the title bar of the transformation to open the Edit Transformations dialog
                  box.
             4.   Enter a description for the transformation. This description appears in the Repository
                  Manager, making it easier for you or others to understand what the transformation does.
             5.   Select the Properties tab. Enter settings.
                  For a list of transformation properties, see Table 19-1 on page 427.




Note: You cannot override the Sequence Generator transformation properties at the
     session level. This protects the integrity of the sequence values generated.




6.   Click OK.
7.   To generate new sequences during a session, connect the NEXTVAL port to at least one
     transformation in the mapping.
     Use the NEXTVAL or CURRVAL ports in an expression in other transformations.
8.   Click Repository > Save.




Chapter 20




Sorter Transformation


    This chapter includes the following topics:
    ♦   Overview, 436
    ♦   Sorting Data, 437
    ♦   Sorter Transformation Properties, 439
    ♦   Creating a Sorter Transformation, 443




Overview
                     Transformation type:
                     Active
                     Connected


              You can sort data with the Sorter transformation. You can sort data in ascending or
              descending order according to a specified sort key. You can also configure the Sorter
              transformation for case-sensitive sorting, and specify whether the output rows should be
              distinct. The Sorter transformation is an active transformation. It must be connected to the
              data flow.
              You can sort data from relational or flat file sources. You can also use the Sorter
              transformation to sort data passing through an Aggregator transformation configured to use
              sorted input.
              When you create a Sorter transformation in a mapping, you specify one or more ports as a
              sort key and configure each sort key port to sort in ascending or descending order. You also
              configure sort criteria the Integration Service applies to all sort key ports and the system
              resources it allocates to perform the sort operation.
              Figure 20-1 shows a simple mapping that uses a Sorter transformation. The mapping passes
              rows from a sales table containing order information through a Sorter transformation before
              loading to the target.

              Figure 20-1. Sample Mapping with a Sorter Transformation




Sorting Data
      The Sorter transformation contains only input/output ports. All data passing through the
      Sorter transformation is sorted according to a sort key. The sort key is one or more ports that
      you want to use as the sort criteria.
      You can specify more than one port as part of the sort key. When you specify multiple ports
      for the sort key, the Integration Service sorts each port sequentially. The order the ports
      appear in the Ports tab determines the succession of sort operations. The Sorter
      transformation treats the data passing through each successive sort key port as a secondary
      sort of the previous port.
      At session run time, the Integration Service sorts data according to the sort order specified in
      the session properties. The sort order determines the sorting criteria for special characters and
      symbols.
      Figure 20-2 shows the Ports tab configuration for the Sorter transformation sorting the data
      in ascending order by order ID and item ID:

      Figure 20-2. Sample Sorter Transformation Ports Configuration




      At session run time, the Integration Service passes the following rows into the Sorter
      transformation:
      ORDER_ID              ITEM_ID               QUANTITY            DISCOUNT
      45                    123456                3                   3.04
      45                    456789                2                   12.02
      43                    000246                6                   34.55
      41                    000468                5                   .56




After sorting the data, the Integration Service passes the following rows out of the Sorter
              transformation:
              ORDER_ID               ITEM_ID        QUANTITY            DISCOUNT
              41                     000468         5                   .56
              43                     000246         6                   34.55
              45                     123456         3                   3.04
              45                     456789         2                   12.02




Sorter Transformation Properties
      The Sorter transformation has several properties that specify additional sort criteria. The
      Integration Service applies these criteria to all sort key ports. The Sorter transformation
      properties also determine the system resources the Integration Service allocates when it sorts
      data.
      Figure 20-3 shows the Sorter transformation Properties tab:

      Figure 20-3. Sorter Transformation Properties




    Sorter Cache Size
      The Integration Service uses the Sorter Cache Size property to determine the maximum
      amount of memory it can allocate to perform the sort operation. The Integration Service
      passes all incoming data into the Sorter transformation before it performs the sort operation.
      You can configure a numeric value for the Sorter cache, or you can configure the Integration
      Service to determine the cache size at runtime. If you configure the Integration Service to
      determine the cache size, you can also configure a maximum amount of memory for the
      Integration Service to allocate to the cache.
      If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must
      run the session on a 64-bit Integration Service.
      Before starting the sort operation, the Integration Service allocates the amount of memory
      configured for the Sorter cache size. If the Integration Service runs a partitioned session, it
      allocates the specified amount of Sorter cache memory for each partition.
      If it cannot allocate enough memory, the Integration Service fails the session. For best
performance, configure Sorter cache size with a value less than or equal to the amount of
available physical RAM on the Integration Service machine. Allocate at least 8 MB
              (8,388,608 bytes) of physical memory to sort data using the Sorter transformation. Sorter
              cache size is set to 8,388,608 bytes by default.
If the amount of incoming data is greater than the Sorter cache size, the
              Integration Service temporarily stores data in the Sorter transformation work directory. The
              Integration Service requires disk space of at least twice the amount of incoming data when
              storing data in the work directory. If the amount of incoming data is significantly greater than
              the Sorter cache size, the Integration Service may require much more than twice the amount
              of disk space available to the work directory.
              Use the following formula to determine the size of incoming data:
          number_of_input_rows * [(Σ column_size) + 16]

              Table 20-1 gives the individual column size values by datatype for Sorter data calculations:

              Table 20-1. Column Sizes for Sorter Data Calculations

               Datatype                                           Column Size

               Binary                                             precision + 8
                                                                  Round to nearest multiple of 8

               Date/Time                                          24

               Decimal, high precision off (all precision)        16

               Decimal, high precision on (precision <=18)        24

               Decimal, high precision on (precision >18, <=28)   32

               Decimal, high precision on (precision >28)         16

               Decimal, high precision on (negative scale)        16

               Double                                             16

               Real                                               16

               Integer                                            16

               Small integer                                      16

               NString, NText, String, Text                       Unicode mode: 2*(precision + 5)
                                                                  ASCII mode: precision + 9


              The column sizes include the bytes required for a null indicator.
              To increase performance for the sort operation, the Integration Service aligns all data for the
              Sorter transformation memory on an 8-byte boundary. Each Sorter column includes rounding
              to the nearest multiple of eight.
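For example, the following sketch applies the formula to a hypothetical row that contains
two integer columns and one string column of precision 15, in a session running in ASCII
mode (the column layout and row count are illustrative, not from a specific session):

      Integer column:            16 bytes
      Integer column:            16 bytes
      String column (15 + 9):    24 bytes
      Row size:                  (16 + 16 + 24) + 16 = 72 bytes

For 1,000,000 input rows, the incoming data is about 72,000,000 bytes, so the default Sorter
cache size of 8,388,608 bytes would overflow, and the Integration Service would stage data in
the work directory unless you increase the cache size.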
              The Integration Service also writes the row size and amount of memory the Sorter
              transformation uses to the session log when you configure the Sorter transformation tracing
              level to Normal. For more information about Sorter transformation tracing levels, see
              “Tracing Level” on page 441.



Case Sensitive
  The Case Sensitive property determines whether the Integration Service considers case when
  sorting data. When you enable the Case Sensitive property, the Integration Service sorts
  uppercase characters higher than lowercase characters.


Work Directory
  You must specify a work directory the Integration Service uses to create temporary files while
  it sorts data. After the Integration Service sorts the data, it deletes the temporary files. You can
  specify any directory on the Integration Service machine to use as a work directory. By
  default, the Integration Service uses the value specified for the $PMTempDir process variable.
  When you partition a session with a Sorter transformation, you can specify a different work
  directory for each partition in the pipeline. To increase session performance, specify work
  directories on physically separate disks on the Integration Service system.


Distinct Output Rows
  You can configure the Sorter transformation to treat output rows as distinct. If you configure
  the Sorter transformation for distinct output rows, the Mapping Designer configures all ports
  as part of the sort key. When the Integration Service runs the session, it discards duplicate
  rows compared during the sort operation.
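For example, suppose the Sorter transformation is configured for distinct output rows and
receives the following rows (illustrative data in the format of the "Sorting Data" example):

      ORDER_ID   ITEM_ID    QUANTITY   DISCOUNT
      45         123456     3          3.04
      45         123456     3          3.04
      43         000246     6          34.55

Because every port is part of the sort key, the Integration Service discards one of the
identical rows and passes the following rows out of the transformation:

      ORDER_ID   ITEM_ID    QUANTITY   DISCOUNT
      43         000246     6          34.55
      45         123456     3          3.04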


Tracing Level
  Configure the Sorter transformation tracing level to control the number and type of Sorter
  error and status messages the Integration Service writes to the session log. At Normal tracing
  level, the Integration Service writes the size of the row passed to the Sorter transformation and
  the amount of memory the Sorter transformation allocates for the sort operation. The
  Integration Service also writes the time and date when it passes the first and last input rows to
  the Sorter transformation.
  If you configure the Sorter transformation tracing level to Verbose Data, the Integration
  Service writes the time the Sorter transformation finishes passing all data to the next
  transformation in the pipeline. The Integration Service also writes the time to the session log
  when the Sorter transformation releases memory resources and removes temporary files from
  the work directory.
  For more information about configuring tracing levels for transformations, see “Configuring
  Tracing Level in Transformations” on page 30.




Null Treated Low
              You can configure the way the Sorter transformation treats null values. Enable this property if
              you want the Integration Service to treat null values as lower than any other value when it
              performs the sort operation. Disable this option if you want the Integration Service to treat
              null values as higher than any other value.


        Transformation Scope
              The transformation scope specifies how the Integration Service applies the transformation
              logic to incoming data:
              ♦   Transaction. Applies the transformation logic to all rows in a transaction. Choose
                  Transaction when a row of data depends on all rows in the same transaction, but does not
                  depend on rows in other transactions.
♦   All Input. Applies the transformation logic to all incoming data. When you choose All
    Input, the Integration Service drops incoming transaction boundaries. Choose All Input when a
                  row of data depends on all rows in the source.
              For more information about transformation scope, see “Understanding Commit Points” in
              the Workflow Administration Guide.




Creating a Sorter Transformation
      To add a Sorter transformation to a mapping, complete the following steps.

      To create a Sorter transformation:

      1.    In the Mapping Designer, click Transformation > Create. Select the Sorter
            transformation.
            The naming convention for Sorter transformations is SRT_TransformationName. Enter a
            description for the transformation. This description appears in the Repository Manager,
            making it easier to understand what the transformation does.
      2.    Enter a name for the Sorter and click Create.
            The Designer creates the Sorter transformation.
      3.    Click Done.
      4.    Drag the ports you want to sort into the Sorter transformation.
            The Designer creates the input/output ports for each port you include.
      5.    Double-click the title bar of the transformation to open the Edit Transformations dialog
            box.
      6.    Select the Ports tab.
      7.    Select the ports you want to use as the sort key.
      8.    For each port selected as part of the sort key, specify whether you want the Integration
            Service to sort data in ascending or descending order.
      9.    Select the Properties tab. Modify the Sorter transformation properties. For information
            about Sorter transformation properties, see “Sorter Transformation Properties” on
            page 439.
      10.   Select the Metadata Extensions tab. Create or edit metadata extensions for the Sorter
            transformation. For more information about metadata extensions, see “Metadata
            Extensions” in the Repository Guide.
      11.   Click OK.
      12.   Click Repository > Save to save changes to the mapping.




Chapter 21




Source Qualifier
Transformation
    This chapter includes the following topics:
    ♦   Overview, 446
    ♦   Source Qualifier Transformation Properties, 449
    ♦   Default Query, 451
    ♦   Joining Source Data, 454
    ♦   Adding an SQL Query, 458
    ♦   Entering a User-Defined Join, 460
    ♦   Outer Join Support, 462
    ♦   Entering a Source Filter, 470
    ♦   Using Sorted Ports, 472
    ♦   Select Distinct, 474
    ♦   Adding Pre- and Post-Session SQL Commands, 475
    ♦   Creating a Source Qualifier Transformation, 476
    ♦   Troubleshooting, 478




Overview
                     Transformation type:
                     Active
                     Connected


              When you add a relational or a flat file source definition to a mapping, you need to connect it
              to a Source Qualifier transformation. The Source Qualifier transformation represents the
              rows that the Integration Service reads when it runs a session.
              Use the Source Qualifier transformation to complete the following tasks:
              ♦   Join data originating from the same source database. You can join two or more tables
                  with primary key-foreign key relationships by linking the sources to one Source Qualifier
                  transformation.
              ♦   Filter rows when the Integration Service reads source data. If you include a filter
                  condition, the Integration Service adds a WHERE clause to the default query.
              ♦   Specify an outer join rather than the default inner join. If you include a user-defined
                  join, the Integration Service replaces the join information specified by the metadata in the
                  SQL query.
              ♦   Specify sorted ports. If you specify a number for sorted ports, the Integration Service adds
                  an ORDER BY clause to the default SQL query.
              ♦   Select only distinct values from the source. If you choose Select Distinct, the Integration
                  Service adds a SELECT DISTINCT statement to the default SQL query.
              ♦   Create a custom query to issue a special SELECT statement for the Integration Service
                  to read source data. For example, you might use a custom query to perform aggregate
                  calculations.
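As a sketch of how these options shape the generated SQL (the CUSTOMERS table and its
columns are hypothetical, not from this guide), a default query against a three-column source
might become the following when you enter a source filter, set Number of Sorted Ports to 1,
and select Select Distinct:

      SELECT DISTINCT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.COUNTRY
      FROM CUSTOMERS
      WHERE CUSTOMERS.COUNTRY = 'USA'
      ORDER BY CUSTOMERS.CUSTOMER_ID

The source filter supplies the WHERE clause, the sorted ports setting adds the ORDER BY
clause on the first port, and Select Distinct adds the DISTINCT keyword.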


        Transformation Datatypes
              The Source Qualifier transformation displays the transformation datatypes. The
              transformation datatypes determine how the source database binds data when the Integration
              Service reads it. Do not alter the datatypes in the Source Qualifier transformation. If the
              datatypes in the source definition and Source Qualifier transformation do not match, the
              Designer marks the mapping invalid when you save it.


        Target Load Order
              You specify a target load order based on the Source Qualifier transformations in a mapping. If
              you have multiple Source Qualifier transformations connected to multiple targets, you can
              designate the order in which the Integration Service loads data into the targets.




If one Source Qualifier transformation provides data for multiple targets, you can enable
  constraint-based loading in a session to have the Integration Service load data based on target
  table primary and foreign key relationships.
  For more information, see “Mappings” in the Designer Guide.


Parameters and Variables
  You can use parameters and variables in the SQL query, user-defined join, source filter, and
  pre- and post-session SQL commands of a Source Qualifier transformation. Use any
  parameter or variable type that you can define in the parameter file. You can enter a parameter
  or variable within the SQL statement, or you can use a parameter or variable as the SQL
  query. For example, you can use a session parameter, $ParamMyQuery, as the SQL query, and
  set $ParamMyQuery to the SQL statement in a parameter file.
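For example, a parameter file entry for such a session might look like the following sketch
(the folder, workflow, and session names are placeholders):

      [MyFolder.WF:wf_LoadOrders.ST:s_m_LoadOrders]
      $ParamMyQuery=SELECT ORDERS.ORDER_ID, ORDERS.ORDER_DATE FROM ORDERS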
  The Integration Service first generates an SQL query and replaces each mapping parameter or
  variable with its start value. Then it runs the query on the source database.
  When you use a string mapping parameter or variable in the Source Qualifier transformation,
  use a string identifier appropriate to the source system. Most databases use a single quotation
  mark as a string identifier. For example, to use the string parameter $$IPAddress in a source
  filter for a Microsoft SQL Server database table, enclose the parameter in single quotes as
follows: '$$IPAddress'. For more information, see the database documentation.
  When you use a datetime mapping parameter or variable, or when you use the system variable
  $$$SessStartTime, you might need to change the date format to the format used in the
  source. The Integration Service passes datetime parameters and variables to source systems as
  strings in the SQL query. The Integration Service converts a datetime parameter or variable to
  a string, based on the source database.
  Table 21-1 describes the datetime formats the Integration Service uses for each source system:

  Table 21-1. Conversion for Datetime Mapping Parameters and Variables

   Source                    Date Format

   DB2                       YYYY-MM-DD-HH24:MI:SS

   Informix                  YYYY-MM-DD HH24:MI:SS

   Microsoft SQL Server      MM/DD/YYYY HH24:MI:SS

   ODBC                      YYYY-MM-DD HH24:MI:SS

   Oracle                    MM/DD/YYYY HH24:MI:SS

   Sybase                    MM/DD/YYYY HH24:MI:SS

   Teradata                  YYYY-MM-DD HH24:MI:SS


  Some databases require you to identify datetime values with additional punctuation, such as
  single quotation marks or database-specific functions. For example, to convert the
  $$$SessStartTime value for an Oracle source, use the following Oracle function in the SQL
  override:


       to_date ('$$$SessStartTime', 'mm/dd/yyyy hh24:mi:ss')

              For Informix, use the following Informix function in the SQL override to convert the
              $$$SessStartTime value:
                      DATETIME ($$$SessStartTime) YEAR TO SECOND

              For more information about SQL override, see “Overriding the Default Query” on page 452.
               For information about database-specific functions, see the database documentation.
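               For example, a source filter for an Oracle source might use the
               conversion in a comparison such as the following sketch; the
               DATE_ENTERED column is taken from the ORDERS sample table used later in
               this chapter:
                      ORDERS.DATE_ENTERED <= to_date('$$$SessStartTime', 'mm/dd/yyyy hh24:mi:ss')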
              Tip: To ensure the format of a datetime parameter or variable matches that used by the source,
              validate the SQL query.
              For information about mapping parameters and variables, see “Mapping Parameters and
              Variables” in the Designer Guide.




Source Qualifier Transformation Properties
      Configure the Source Qualifier transformation properties on the Properties tab of the Edit
      Transformations dialog box.




      Table 21-2 describes the Source Qualifier transformation properties:

      Table 21-2. Source Qualifier Transformation Properties

       Option                        Description

       SQL Query                     Defines a custom query that replaces the default query the Integration Service
                                     uses to read data from sources represented in this Source Qualifier
                                     transformation. For more information, see “Adding an SQL Query” on page 458. A
                                     custom query overrides entries for a custom join or a source filter.

       User-Defined Join             Specifies the condition used to join data from multiple sources represented in the
                                     same Source Qualifier transformation. For more information, see “Entering a
                                     User-Defined Join” on page 460.

       Source Filter                 Specifies the filter condition the Integration Service applies when querying rows.
                                     For more information, see “Entering a Source Filter” on page 470.

       Number of Sorted Ports        Indicates the number of columns used when sorting rows queried from relational
                                     sources. If you select this option, the Integration Service adds an ORDER BY to
                                     the default query when it reads source rows. The ORDER BY includes the number
                                     of ports specified, starting from the top of the transformation.
                                     When selected, the database sort order must match the session sort order.

       Tracing Level                 Sets the amount of detail included in the session log when you run a session
                                     containing this transformation. For more information, see “Configuring Tracing
                                     Level in Transformations” on page 30.

       Select Distinct               Specifies if you want to select only unique rows. The Integration Service includes
                                     a SELECT DISTINCT statement if you choose this option.




                Pre-SQL                       Pre-session SQL commands to run against the source database before the
                                              Integration Service reads the source. For more information, see “Adding Pre- and
                                              Post-Session SQL Commands” on page 475.

                Post-SQL                      Post-session SQL commands to run against the source database after the
                                              Integration Service writes to the target. For more information, see “Adding Pre-
                                              and Post-Session SQL Commands” on page 475.

                Output is Deterministic       Relational source or transformation output that does not change between session
                                              runs when the input data is consistent between runs. When you configure this
                                              property, the Integration Service does not stage source data for recovery if
                                              transformations in the pipeline always produce repeatable data.

                Output is Repeatable          Relational source or transformation output that is in the same order between
                                              session runs when the order of the input data is consistent. When output is
                                              deterministic and output is repeatable, the Integration Service does not stage
                                              source data for recovery.




Default Query
      For relational sources, the Integration Service generates a query for each Source Qualifier
      transformation when it runs a session. The default query is a SELECT statement for each
      source column used in the mapping. In other words, the Integration Service reads only the
      columns that are connected to another transformation.
      Figure 21-1 shows a single source definition connected to a Source Qualifier transformation:

      Figure 21-1. Source Definition Connected to a Source Qualifier Transformation




      Although there are many columns in the source definition, only three columns are connected
      to another transformation. In this case, the Integration Service generates a default query that
      selects only those three columns:
             SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME
             FROM CUSTOMERS

      If any table name or column name contains a database reserved word, you can create and
      maintain a file, reswords.txt, containing reserved words. When the Integration Service
      initializes a session, it searches for reswords.txt in the Integration Service installation
      directory. If the file exists, the Integration Service places quotes around matching reserved
      words when it executes SQL against the database. If you override the SQL, you must enclose
      any reserved word in quotes. For more information about the reserved words file, see
      “Working with Targets” in the Workflow Administration Guide.
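       For illustration, a minimal reswords.txt might list a few reserved words
       under a heading for each database type, one word per line. The layout and
       entries shown here are assumptions, not a complete list:
              [Oracle]
              OPTION
              START
              [Microsoft SQL Server]
              OPTION
              USER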
      When generating the default query, the Designer delimits table and field names containing
      the following characters with double quotes:
              / + - = ~ ` ! % ^ & * ( ) [ ] { } ' ; ? , < > \ | <space>




Viewing the Default Query
              You can view the default query in the Source Qualifier transformation.

              To view the default query:

              1.   From the Properties tab, select SQL Query.
                   The SQL Editor appears.




                   The SQL Editor displays the default query the Integration Service uses to select source
                   data.
              2.   Click Generate SQL.
              3.   Click Cancel to exit.
               Note: If you do not cancel the generated SQL, the Integration Service overrides the default
               query with it as a custom SQL query.
               You do not need to connect to the source database to view the default query. You connect to
               the source database only when you enter an SQL query that overrides the default query.
              Tip: You must connect the columns in the Source Qualifier transformation to another
              transformation or target before you can generate the default query.


        Overriding the Default Query
              You can alter or override the default query in the Source Qualifier transformation by changing
              the default settings of the transformation properties. Do not change the list of selected ports
              or the order in which they appear in the query. This list must match the connected
              transformation output ports.
              When you edit transformation properties, the Source Qualifier transformation includes these
              settings in the default query. However, if you enter an SQL query, the Integration Service uses


only the defined SQL statement. The SQL Query overrides the User-Defined Join, Source
Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier
transformation.
Note: When you override the default SQL query, you must enclose all database reserved words
in quotes.
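For example, if a source column were named OPTION, a reserved word in many
databases, a sketch of an override that quotes it might read:
       SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS."OPTION"
       FROM CUSTOMERS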




Joining Source Data
              Use one Source Qualifier transformation to join data from multiple relational tables. These
              tables must be accessible from the same instance or database server.
              When a mapping uses related relational sources, you can join both sources in one Source
              Qualifier transformation. During the session, the source database performs the join before
              passing data to the Integration Service. This can increase performance when source tables are
              indexed.
              Tip: Use the Joiner transformation for heterogeneous sources and to join flat files.


        Default Join
              When you join related tables in one Source Qualifier transformation, the Integration Service
              joins the tables based on the related keys in each table.
              This default join is an inner equijoin, using the following syntax in the WHERE clause:
                      Source1.column_name = Source2.column_name

              The columns in the default join must have:
              ♦   A primary key-foreign key relationship
              ♦   Matching datatypes
               For example, you might want to see all the orders for the month, including order number, order
              amount, and customer name. The ORDERS table includes the order number and amount of
              each order, but not the customer name. To include the customer name, you need to join the
              ORDERS and CUSTOMERS tables. Both tables include a customer ID, so you can join the
              tables in one Source Qualifier transformation.




Figure 21-2 shows joining two tables with one Source Qualifier transformation:

  Figure 21-2. Joining Two Tables with One Source Qualifier Transformation




  When you include multiple tables, the Integration Service generates a SELECT statement for
  all columns used in the mapping. In this case, the SELECT statement looks similar to the
  following statement:
         SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME,
         CUSTOMERS.LAST_NAME, CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2,
         CUSTOMERS.CITY, CUSTOMERS.STATE, CUSTOMERS.POSTAL_CODE, CUSTOMERS.PHONE,
         CUSTOMERS.EMAIL, ORDERS.ORDER_ID, ORDERS.DATE_ENTERED,
         ORDERS.DATE_PROMISED, ORDERS.DATE_SHIPPED, ORDERS.EMPLOYEE_ID,
         ORDERS.CUSTOMER_ID, ORDERS.SALES_TAX_RATE, ORDERS.STORE_ID

         FROM CUSTOMERS, ORDERS

         WHERE CUSTOMERS.CUSTOMER_ID=ORDERS.CUSTOMER_ID

   The WHERE clause is an equijoin that includes the CUSTOMER_ID from the ORDERS
   and CUSTOMERS tables.


Custom Joins
   If you need to override the default join, you can enter the contents of the WHERE clause that
  specifies the join in the custom query. If the query performs an outer join, the Integration
  Service may insert the join syntax in the WHERE clause or the FROM clause, depending on
  the database syntax.
  You might need to override the default join under the following circumstances:
  ♦   Columns do not have a primary key-foreign key relationship.


♦   The datatypes of columns used for the join do not match.
              ♦   You want to specify a different type of join, such as an outer join.
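               For example, to return all customer rows even when a customer has no
               orders, you could replace the default equijoin with a left outer join.
               The following sketch shows the join in ANSI SQL; for the exact syntax to
               enter in the transformation, see “Entering a User-Defined Join” on
               page 460:
                      CUSTOMERS LEFT OUTER JOIN ORDERS
                      ON CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID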
              For more information about custom joins and queries, see “Entering a User-Defined Join” on
              page 460.


        Heterogeneous Joins
               To perform a heterogeneous join, use the Joiner transformation. Use the Joiner
               transformation when you need to join the following types of sources:
               ♦   Data from different source databases
               ♦   Data from different flat file systems
               ♦   Relational sources and flat files
              For more information, see “Joiner Transformation” on page 283.


        Creating Key Relationships
              You can join tables in the Source Qualifier transformation if the tables have primary key-
              foreign key relationships. However, you can create primary key-foreign key relationships in
              the Source Analyzer by linking matching columns in different tables. These columns do not
              have to be keys, but they should be included in the index for each table.
              Tip: If the source table has more than 1,000 rows, you can increase performance by indexing
              the primary key-foreign keys. If the source table has fewer than 1,000 rows, you might
              decrease performance if you index the primary key-foreign keys.
              For example, the corporate office for a retail chain wants to extract payments received based
              on orders. The ORDERS and PAYMENTS tables do not share primary and foreign keys.
              Both tables, however, include a DATE_SHIPPED column. You can create a primary key-
              foreign key relationship in the metadata in the Source Analyzer.
               Note that the two tables are not linked, so the Designer does not recognize a relationship
               between the DATE_SHIPPED columns.
              You create a relationship between the ORDERS and PAYMENTS tables by linking the
              DATE_SHIPPED columns. The Designer adds primary and foreign keys to the
              DATE_SHIPPED columns in the ORDERS and PAYMENTS table definitions.




Figure 21-3 shows a relationship between two tables:

Figure 21-3. Creating a Relationship Between Two Tables




If you do not connect the columns, the Designer does not recognize the relationships.
The primary key-foreign key relationships exist in the metadata only. You do not need to
generate SQL or alter the source tables.
Once the key relationships exist, use a Source Qualifier transformation to join the two tables.
The default join is based on DATE_SHIPPED.
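With the relationship in place, the WHERE clause of the generated default query
contains an equijoin similar to the following sketch:
       WHERE ORDERS.DATE_SHIPPED = PAYMENTS.DATE_SHIPPED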




Adding an SQL Query
              The Source Qualifier transformation provides the SQL Query option to override the default
              query. You can enter an SQL statement supported by the source database. Before entering the
              query, connect all the input and output ports you want to use in the mapping.
               When you edit the SQL Query, you can generate and edit the default query. When the
               Designer generates the default query, it incorporates all other configured options, such as a
               filter or number of sorted ports.
. . . . . . . . . . . . . . . . . . 266 x Table of Contents
  • 11. Step 2. Create and Validate the Expression . . . . . . . . . . . . . . . . . . . . . 267 Step 3. Generate Java Code for the Expression . . . . . . . . . . . . . . . . . . 267 Steps to Create an Expression and Generate Java Code . . . . . . . . . . . . 268 Java Expression Templates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 Working with the Simple Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 invokeJExpression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271 Simple Interface Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272 Working with the Advanced Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273 Steps to Invoke an Expression with the Advanced Interface . . . . . . . . . 273 Rules and Guidelines for Working with the Advanced Interface . . . . . . 273 EDataType Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274 JExprParamMetadata Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274 defineJExpression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275 JExpression Class . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 Advanced Interface Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277 JExpression API Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 invoke . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 getResultDataType . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280 getResultMetadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280 isResultNull . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280 getInt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 getDouble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 getLong . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281 getStringBuffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282 getBytes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282 Chapter 13: Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 283 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284 Working with the Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . 284 Joiner Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286 Defining a Join Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288 Defining the Join Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 Normal Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289 Master Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290 Detail Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290 Full Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291 Using Sorted Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 Table of Contents xi
  • 12. Configuring the Sort Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 Adding Transformations to the Mapping . . . . . . . . . . . . . . . . . . . . . . . 293 Configuring the Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . 293 Defining the Join Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294 Joining Data from a Single Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 Joining Two Branches of the Same Pipeline . . . . . . . . . . . . . . . . . . . . . 296 Joining Two Instances of the Same Source . . . . . . . . . . . . . . . . . . . . . . 297 Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 Blocking the Source Pipelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 Unsorted Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 Sorted Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 Working with Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 Preserving Transaction Boundaries for a Single Pipeline . . . . . . . . . . . . 301 Preserving Transaction Boundaries in the Detail Pipeline . . . . . . . . . . . 301 Dropping Transaction Boundaries for Two Pipelines . . . . . . . . . . . . . . 302 Creating a Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303 Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306 Chapter 14: Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . 307 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308 Connected and Unconnected Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . 309 Connected Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 309 Unconnected Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 310 Relational and Flat File Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311 Relational Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311 Flat File Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311 Lookup Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Lookup Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Lookup Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 313 Lookup Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314 Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315 Metadata Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315 Lookup Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316 Configuring Lookup Properties in a Session . . . . . . . . . . . . . . . . . . . . 320 Lookup Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324 Default Lookup Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324 Overriding the Lookup Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 324 xii Table of Contents
  • 13. Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328 Uncached or Static Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328 Dynamic Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329 Handling Multiple Matches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329 Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330 Configuring Unconnected Lookup Transformations . . . . . . . . . . . . . . . . . 331 Step 1. Add Input Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331 Step 2. Add the Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . 332 Step 3. Designate a Return Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332 Step 4. Call the Lookup Through an Expression . . . . . . . . . . . . . . . . . 333 Creating a Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336 Chapter 15: Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338 Cache Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339 Building Connected Lookup Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340 Sequential Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340 Concurrent Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341 Using a Persistent Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342 Using a Non-Persistent Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342 Using a Persistent Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342 Rebuilding the Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342 Working with an Uncached Lookup or Static Cache . . . . . . . . . . . . . . . . . 344 Working with a Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . 345 Using the NewLookupRow Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347 Using the Associated Input Port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348 Working with Lookup Transformation Values . . . . . . . . . . . . . . . . . . . 349 Using the Ignore Null Property . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353 Using the Ignore in Comparison Property . . . . . . . . . . . . . . . . . . . . . . 354 Using Update Strategy Transformations with a Dynamic Cache . . . . . . 354 Updating the Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . 356 Using the WHERE Clause with a Dynamic Cache . . . . . . . . . . . . . . . 358 Synchronizing the Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . 359 Example Using a Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . 360 Rules and Guidelines for Dynamic Caches . . . . . . . . . . . . . . . . . . . . . 361 Sharing the Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363 Table of Contents xiii
  • 14. Sharing an Unnamed Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . 363 Sharing a Named Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365 Lookup Cache Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369 Chapter 16: Normalizer Transformation . . . . . . . . . . . . . . . . . . . . . . 371 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372 Normalizer Transformation Components . . . . . . . . . . . . . . . . . . . . . . . . . . 374 Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376 Normalizer Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377 Normalizer Transformation Generated Keys . . . . . . . . . . . . . . . . . . . . . . . 379 Storing Generated Key Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379 Changing the Generated Key Values . . . . . . . . . . . . . . . . . . . . . . . . . . 379 VSAM Normalizer Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380 VSAM Normalizer Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382 VSAM Normalizer Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383 Steps to Create a VSAM Normalizer Transformation . . . . . . . . . . . . . . 385 Pipeline Normalizer Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387 Pipeline Normalizer Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388 Pipeline Normalizer Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390 Steps to Create a Pipeline Normalizer Transformation . . . . . . . . . . . . . 391 Using a Normalizer Transformation in a Mapping . . . . . . . . . . . . . . . . . . . 394 Generating Key Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396 Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399 Chapter 17: Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 401 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402 Ranking String Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 Rank Caches . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 Rank Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 403 Ports in a Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404 Rank Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404 Defining Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405 Creating a Rank Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406 Chapter 18: Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 409 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410 xiv Table of Contents
  • 15. Working with Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412 Input Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412 Output Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412 Using Group Filter Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413 Adding Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 414 Working with Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 416 Connecting Router Transformations in a Mapping . . . . . . . . . . . . . . . . . . 418 Creating a Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420 Chapter 19: Sequence Generator Transformation . . . . . . . . . . . . . . 421 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422 Common Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 Creating Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 Replacing Missing Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423 Sequence Generator Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424 NEXTVAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424 CURRVAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426 Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 427 Start Value and Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428 Increment By . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 428 End Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429 Current Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429 Number of Cached Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 430 Reset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431 Creating a Sequence Generator Transformation . . . . . . . . . . . . . . . . . . . . . 432 Chapter 20: Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 435 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436 Sorting Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437 Sorter Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439 Sorter Cache Size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439 Case Sensitive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441 Work Directory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441 Distinct Output Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441 Tracing Level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 441 Null Treated Low . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442 Transformation Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442 Table of Contents xv
  • 16. Creating a Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 443 Chapter 21: Source Qualifier Transformation . . . . . . . . . . . . . . . . . 445 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446 Transformation Datatypes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446 Target Load Order . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 446 Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447 Source Qualifier Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . 449 Default Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 451 Viewing the Default Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452 Overriding the Default Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452 Joining Source Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454 Default Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454 Custom Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 455 Heterogeneous Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456 Creating Key Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 456 Adding an SQL Query . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458 Entering a User-Defined Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460 Outer Join Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462 Informatica Join Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462 Creating an Outer Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467 Common Database Syntax Restrictions . . . . . . . . . . . . . . . . . . . . . . . . 469 Entering a Source Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 470 Using Sorted Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472 Select Distinct . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474 Overriding Select Distinct in the Session . . . . . . . . . . . . . . . . . . . . . . 474 Adding Pre- and Post-Session SQL Commands . . . . . . . . . . . . . . . . . . . . . 475 Creating a Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . 476 Creating a Source Qualifier Transformation By Default . . . . . . . . . . . . 476 Creating a Source Qualifier Transformation Manually . . . . . . . . . . . . . 476 Configuring Source Qualifier Transformation Options . . . . . . . . . . . . . 476 Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 478 Chapter 22: SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 479 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480 Script Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 481 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482 xvi Table of Contents
  • 17. Script Mode Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . 482 Query Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 484 Using Static SQL Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 485 Using Dynamic SQL Queries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 486 Query Mode Rules and Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . 489 Connecting to Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 490 Using a Static Database Connection . . . . . . . . . . . . . . . . . . . . . . . . . . 490 Passing a Logical Database Connection . . . . . . . . . . . . . . . . . . . . . . . . 490 Passing Full Connection Information . . . . . . . . . . . . . . . . . . . . . . . . . 490 Database Connections Rules and Guidelines . . . . . . . . . . . . . . . . . . . . 493 Session Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494 Input Row to Output Row Cardinality . . . . . . . . . . . . . . . . . . . . . . . . 494 Transaction Control . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498 High Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 498 Creating an SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500 SQL Transformation Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502 Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502 SQL Settings Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505 SQL Ports Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 506 SQL Statements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 509 Chapter 23: Using the SQL Transformation in a Mapping . . . . . . . . 511 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512 Dynamic Update Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513 Defining the Source File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 514 Creating a Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515 Creating the Database Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 515 Configuring the Expression Transformation . . . . . . . . . . . . . . . . . . . . 516 Defining the SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 516 Configuring Session Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518 Target Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 518 Dynamic Connection Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 519 Defining the Source File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520 Creating a Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520 Creating the Database Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 520 Creating the Database Connections . . . . . . . . . . . . . . . . . . . . . . . . . . 521 Configuring the Expression Transformation . . . . . . . . . . . . . . . . . . . . 521 Table of Contents xvii
  • 18. Defining the SQL Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 522 Configuring Session Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524 Target Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 524 Chapter 24: Stored Procedure Transformation . . . . . . . . . . . . . . . . 525 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526 Connected and Unconnected Transformations . . . . . . . . . . . . . . . . . . . . . . 527 Input and Output Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528 Input/Output Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528 Return Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528 Status Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528 Running a Stored Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529 Stored Procedure Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529 Executing Stored Procedures with a Database Connection . . . . . . . . . . 529 Using a Stored Procedure in a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . 531 Writing a Stored Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532 Sample Stored Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532 Creating a Stored Procedure Transformation . . . . . . . . . . . . . . . . . . . . . . . 535 Importing Stored Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535 Manually Creating Stored Procedure Transformations . . . . . . . . . . . . . 537 Setting Options for the Stored Procedure . . . . . . . . . . . . . . . . . . . . . . 538 Using $Source and $Target Variables . . . . . . . . . . . . . . . . . . . . . . . . . 539 Changing the Stored Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540 Configuring a Connected Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 541 Configuring an Unconnected Transformation . . . . . . . . . . . . . . . . . . . . . . 542 Calling a Stored Procedure From an Expression . . . . . . . . . . . . . . . . . . 542 Calling a Pre- or Post-Session Stored Procedure . . . . . . . . . . . . . . . . . . 545 Error Handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548 Pre-Session Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 548 Post-Session Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 Session Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 549 Supported Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550 SQL Declaration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550 Parameter Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550 Input/Output Port in Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550 Type of Return Value Supported . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551 Expression Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552 xviii Table of Contents
  • 19. Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553 Troubleshooting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 554 Chapter 25: Transaction Control Transformation . . . . . . . . . . . . . . 555 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556 Transaction Control Transformation Properties . . . . . . . . . . . . . . . . . . . . . 557 Properties Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 557 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558 Using Transaction Control Transformations in Mappings . . . . . . . . . . . . . . 560 Sample Transaction Control Mappings with Multiple Targets . . . . . . . 561 Mapping Guidelines and Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 564 Creating a Transaction Control Transformation . . . . . . . . . . . . . . . . . . . . . 565 Chapter 26: Union Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . 567 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 568 Union Transformation Rules and Guidelines . . . . . . . . . . . . . . . . . . . . 568 Union Transformation Components . . . . . . . . . . . . . . . . . . . . . . . . . . 568 Working with Groups and Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 570 Creating a Union Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 572 Using a Union Transformation in Mappings . . . . . . . . . . . . . . . . . . . . . . . 574 Chapter 27: Update Strategy Transformation . . . . . . . . . . . . . . . . . 575 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576 Setting the Update Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 576 Flagging Rows Within a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577 Forwarding Rejected Rows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577 Update Strategy Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 577 Aggregator and Update Strategy Transformations . . . . . . . . . . . . . . . . 578 Lookup and Update Strategy Transformations . . . . . . . . . . . . . . . . . . . 579 Setting the Update Strategy for a Session . . . . . . . . . . . . . . . . . . . . . . . . . 580 Specifying an Operation for All Rows . . . . . . . . . . . . . . . . . . . . . . . . . 580 Specifying Operations for Individual Target Tables . . . . . . . . . . . . . . . 581 Update Strategy Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 583 Chapter 28: XML Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . 585 XML Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 586 XML Parser Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 587 Table of Contents xix
  • 20. XML Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589 xx Table of Contents
List of Figures

Figure 1-1. Sample Ports Tab
Figure 1-2. Example of Input, Output, and Input/Output Ports
Figure 1-3. Sample Input and Output Ports
Figure 1-4. Expression Editor
Figure 1-5. Variable Ports Store Values Across Rows
Figure 1-6. Default Value for Input and Input/Output Ports
Figure 1-7. Default Value for Output Ports
Figure 1-8. Using a Constant as a Default Value
Figure 1-9. Using the ERROR Function to Skip Null Input Values
Figure 1-10. Entering and Validating Default Values
Figure 1-11. Reverting to Original Reusable Transformation Properties
Figure 2-1. Sample Mapping with Aggregator and Sorter Transformations
Figure 3-1. Custom Transformation Ports Tab
Figure 3-2. Editing Port Dependencies
Figure 3-3. Port Attribute Definitions Tab
Figure 3-4. Edit Port Attribute Values
Figure 3-5. Custom Transformation Properties
Figure 3-6. Custom Transformation Ports Tab - Union Example
Figure 3-7. Custom Transformation Properties Tab - Union Example
Figure 3-8. Mapping with a Custom Transformation - Union Example
Figure 4-1. Custom Transformation Handles
Figure 6-1. Process for Distributing External Procedures
Figure 6-2. External Procedure Transformation Initialization Properties
Figure 6-3. External Procedure Transformation Initialization Properties Tab
Figure 7-1. Sample Mapping with a Filter Transformation
Figure 7-2. Specifying a Filter Condition in a Filter Transformation
Figure 8-1. HTTP Transformation Processing
Figure 8-2. HTTP Transformation
Figure 8-3. HTTP Transformation Properties Tab
Figure 8-4. HTTP Transformation HTTP Tab
Figure 8-5. HTTP Tab for a GET Example
Figure 8-6. HTTP Tab for a POST Example
Figure 8-7. HTTP Tab for a SIMPLE POST Example
Figure 9-1. Java Code Tab Components
Figure 9-2. Java Transformation Ports Tab
Figure 9-3. Java Transformation Properties
Figure 9-4. Java Transformation Settings Dialog Box
Figure 9-5. Highlighted Error in Code Entry Tab
Figure 9-6. Highlighted Error in Full Code Window
Figure 11-1. Java Transformation Example - Sample Mapping
Figure 11-2. Java Transformation Example - Ports Tab
Figure 11-3. Java Transformation Example - Import Packages Tab
Figure 11-4. Java Transformation Example - Helper Code Tab
Figure 11-5. Java Transformation Example - On Input Row Tab
Figure 11-6. Java Transformation Example - Successful Compilation
Figure 12-1. Define Expression Dialog Box
Figure 12-2. Java Expressions Code Entry Tab
Figure 13-1. Mapping with Master and Detail Pipelines
Figure 13-2. Joiner Transformation Properties Tab
Figure 13-3. Mapping Configured to Join Data from Two Pipelines
Figure 13-4. Mapping that Joins Two Branches of a Pipeline
Figure 13-5. Mapping that Joins Two Instances of the Same Source
Figure 13-6. Preserving Transaction Boundaries when You Join Two Pipeline Branches
Figure 14-1. Session Properties for Flat File Lookups
Figure 14-2. Return Port in a Lookup Transformation
Figure 15-1. Building Lookup Caches Sequentially
Figure 15-2. Building Lookup Caches Concurrently
Figure 15-3. Mapping with a Dynamic Lookup Cache
Figure 15-4. Dynamic Lookup Transformation Ports Tab
Figure 15-5. Using Update Strategy Transformations with a Lookup Transformation
Figure 15-6. Slowly Changing Dimension Mapping with Dynamic Lookup Cache
Figure 16-1. Normalizer Transformation Ports
Figure 16-2. Normalizer Ports Tab
Figure 16-3. Normalizer Transformation Properties Tab
Figure 16-4. Normalizer Tab
Figure 16-5. COBOL Source Definition Example
Figure 16-6. Sales File VSAM Normalizer Transformation
Figure 16-7. VSAM Normalizer Ports Tab
Figure 16-8. Normalizer Tab for a VSAM Normalizer Transformation
Figure 16-9. Pipeline Normalizer Columns
Figure 16-10. Pipeline Normalizer Ports
Figure 16-11. Pipeline Normalizer Ports Tab
Figure 16-12. Normalizer Tab
Figure 16-13. Grouping Repeated Columns on the Normalizer Tab
Figure 16-14. Group-Level Column on the Normalizer Tab
Figure 16-15. Sales File COBOL Source
Figure 16-16. Multiple Record Types Routed to Different Targets
Figure 16-17. Router Transformation User-Defined Groups
Figure 16-18. COBOL Source with a Multiple-Occurring Group of Columns
Figure 16-19. Generated Keys in Target Tables
Figure 16-20. Generated Keys Mapped to Target Keys
Figure 17-1. Sample Mapping with a Rank Transformation
Figure 18-1. Comparing Router and Filter Transformations
Figure 18-2. Sample Router Transformation
Figure 18-3. Using a Router Transformation in a Mapping
Figure 18-4. Specifying Group Filter Conditions
Figure 18-5. Router Transformation Ports Tab
Figure 18-6. Input Port Name and Corresponding Output Port Names
Figure 19-1. Connecting NEXTVAL to Two Target Tables in a Mapping
Figure 19-2. Mapping with a Sequence Generator and an Expression Transformation
Figure 19-3. Connecting CURRVAL and NEXTVAL Ports to a Target
Figure 20-1. Sample Mapping with a Sorter Transformation
Figure 20-2. Sample Sorter Transformation Ports Configuration
Figure 20-3. Sorter Transformation Properties
Figure 21-1. Source Definition Connected to a Source Qualifier Transformation
Figure 21-2. Joining Two Tables with One Source Qualifier Transformation
Figure 21-3. Creating a Relationship Between Two Tables
Figure 22-1. SQL Transformation Script Mode Ports
Figure 22-2. SQL Editor for an SQL Transformation Query
Figure 22-3. SQL Transformation Static Query Mode Ports
Figure 22-4. SQL Transformation Ports to Pass a Full Dynamic Query
Figure 22-5. SQL Transformation Properties Tab
Figure 22-6. SQL Settings Tab
Figure 22-7. SQL Transformation SQL Ports Tab
Figure 23-1. Dynamic Query Mapping
Figure 23-2. Dynamic Query Expression Transformation Ports
Figure 23-3. Dynamic Query SQL Transformation Ports Tab
Figure 23-4. Dynamic Connection Mapping
Figure 23-5. Dynamic Query Example Expression Transformation Ports
Figure 23-6. Dynamic Connection Example SQL Transformation Ports
Figure 24-1. Sample Mapping with a Stored Procedure Transformation
Figure 24-2. Expression Transformation Referencing a Stored Procedure Transformation
Figure 24-3. Stored Procedure Error Handling
Figure 25-1. Transaction Control Transformation Properties
Figure 25-2. Sample Transaction Control Mapping
Figure 25-3. Effective and Ineffective Transaction Control Transformations
Figure 25-4. Transaction Control Transformation Effective for a Transformation
Figure 25-5. Valid Mapping with Transaction Control Transformations
Figure 25-6. Invalid Mapping with Transaction Control Transformations
Figure 26-1. Union Transformation Groups Tab
Figure 26-2. Union Transformation Group Ports Tab
Figure 26-3. Union Transformation Ports Tab
Figure 26-4. Mapping with a Union Transformation
Figure 27-1. Specifying Operations for Individual Target Tables
List of Tables

Table 1-1. Transformation Descriptions
Table 1-2. Multi-Group Transformations
Table 1-3. Transformations Containing Expressions
Table 1-4. Variable Usage
Table 1-5. System Default Values and Integration Service Behavior
Table 1-6. Transformations Supporting User-Defined Default Values
Table 1-7. Default Values for Input and Input/Output Ports
Table 1-8. Supported Default Values for Output Ports
Table 1-9. Session Log Tracing Levels
Table 3-1. Custom Transformation Properties
Table 3-2. Transaction Boundary Handling with Custom Transformations
Table 3-3. Module File Names
Table 3-4. UNIX Commands to Build the Shared Library
Table 4-1. Custom Transformation Handles
Table 4-2. Custom Transformation Generated Functions
Table 4-3. Custom Transformation API Functions
Table 4-4. Custom Transformation Array-Based API Functions
Table 4-5. INFA_CT_MODULE Property IDs
Table 4-6. INFA_CT_PROC_HANDLE Property IDs
Table 4-7. INFA_CT_TRANS_HANDLE Property IDs
Table 4-8. INFA_CT_INPUT_GROUP and INFA_CT_OUTPUT_GROUP Handle Property IDs
Table 4-9. INFA_CT_INPUTPORT and INFA_CT_OUTPUTPORT_HANDLE Handle Property IDs
Table 4-10. Property Functions (MBCS)
Table 4-11. Property Functions (Unicode)
Table 4-12. Compatible Datatypes
Table 4-13. Get Data Functions
Table 4-14. Get Data Functions (Array-Based Mode)
Table 6-1. Differences Between COM and Informatica External Procedures
Table 6-2. Visual C++ and Transformation Datatypes
Table 6-3. Visual Basic and Transformation Datatypes
Table 6-4. External Procedure Initialization Properties
Table 6-5. Descriptions of Parameter Access Functions
Table 6-6. Member Variable of the External Procedure Base Class
Table 8-1. HTTP Transformation Properties
Table 8-2. HTTP Transformation Methods
Table 8-3. GET Method Groups and Ports
Table 8-4. POST Method Groups and Ports
Table 8-5. SIMPLE POST Method Groups and Ports
Table 9-1. Mapping from PowerCenter Datatypes to Java Datatypes
Table 9-2. Java Transformation Properties
Table 11-1. Input and Output Ports
Table 12-1. Enumerated Java Datatypes
Table 12-2. JExpression API Methods
Table 13-1. Joiner Transformation Properties
Table 13-2. Integration Service Behavior with Transformation Scopes for the Joiner Transformation
Table 14-1. Differences Between Connected and Unconnected Lookups
Table 14-2. Lookup Transformation Port Types
Table 14-3. Lookup Transformation Properties
Table 14-4. Session Properties for Flat File Lookups
Table 15-1. Lookup Caching Comparison
Table 15-2. Integration Service Handling of Persistent Caches
Table 15-3. NewLookupRow Values
Table 15-4. Dynamic Lookup Cache Behavior for Insert Row Type
Table 15-5. Dynamic Lookup Cache Behavior for Update Row Type
Table 15-6. Location for Sharing Unnamed Cache
Table 15-7. Properties for Sharing Unnamed Cache
Table 15-8. Location for Sharing Named Cache
Table 15-9. Properties for Sharing Named Cache
Table 16-1. Normalizer Transformation Properties
Table 16-2. Normalizer Tab Columns
Table 16-3. Normalizer Tab for a VSAM Normalizer Transformation
Table 16-4. Pipeline Normalizer Tab
Table 17-1. Rank Transformation Ports
Table 17-2. Rank Transformation Properties
Table 19-1. Sequence Generator Transformation Properties
Table 20-1. Column Sizes for Sorter Data Calculations
Table 21-1. Conversion for Datetime Mapping Parameters and Variables
Table 21-2. Source Qualifier Transformation Properties
Table 21-3. Locations for Entering Outer Join Syntax
Table 21-4. Syntax for Normal Joins in a Join Override
Table 21-5. Syntax for Left Outer Joins in a Join Override
Table 21-6. Syntax for Right Outer Joins in a Join Override
Table 22-1. Full Database Connection Ports
Table 22-2. Native Connect String Syntax
Table 22-3. Output Rows by Query Statement - Query Mode
Table 22-4. NumRowsAffected Rows by Query Statement - Query Mode
Table 22-5. Output Rows by Query Statement - Query Mode
Table 22-6. SQL Transformation Connection Options
Table 22-7. SQL Transformation Properties
Table 22-8. SQL Settings Tab Attributes
Table 22-9. SQL Transformation Ports
Table 22-10. Standard SQL Statements
Table 24-1. Connected and Unconnected Stored Procedure Transformation Tasks
Table 24-2. Setting Options for the Stored Procedure Transformation
Table 27-1. Constants for Each Database Operation
Table 27-2. Specifying an Operation for All Rows
Table 27-3. Update Strategy Settings
Preface

Welcome to PowerCenter, the Informatica software product that delivers an open, scalable data integration solution addressing the complete life cycle for all data integration projects including data warehouses, data migration, data synchronization, and information hubs. PowerCenter combines the latest technology enhancements for reliably managing data repositories and delivering information resources in a timely, usable, and efficient manner.

The PowerCenter repository coordinates and drives a variety of core functions, including extracting, transforming, loading, and managing data. The Integration Service can extract large volumes of data from multiple platforms, handle complex transformations on the data, and support high-speed loads. PowerCenter can simplify and accelerate the process of building a comprehensive data warehouse from disparate data sources.
About This Book

The Transformation Guide is written for the developers and software engineers responsible for implementing your data warehouse. The Transformation Guide assumes that you have a solid understanding of your operating systems, relational database concepts, and the database engines, flat files, or mainframe systems in your environment. This guide also assumes that you are familiar with the interface requirements for your supporting applications.

Document Conventions

This guide uses the following formatting conventions:

If you see…                  It means…
italicized text              The word or set of words is especially emphasized.
boldfaced text               Emphasized subjects.
italicized monospaced text   This is the variable name for a value you enter as part of an
                             operating system command. This is generic text that should be
                             replaced with user-supplied values.
Note:                        The following paragraph provides additional facts.
Tip:                         The following paragraph provides suggested uses.
Warning:                     The following paragraph notes situations where you can overwrite
                             or corrupt data, unless you follow the specified procedure.
monospaced text              This is a code example.
bold monospaced text         This is an operating system command you enter from a prompt to
                             run a task.
Other Informatica Resources

In addition to the product manuals, Informatica provides these other resources:
♦ Informatica Customer Portal
♦ Informatica web site
♦ Informatica Knowledge Base
♦ Informatica Technical Support

Visiting Informatica Customer Portal

As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base, the Informatica Documentation Center, and access to the Informatica user community.

Visiting the Informatica Web Site

You can access the Informatica corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and sales offices. You will also find product and partner information. The services area of the site includes important information about technical support, training and education, and implementation services.

Visiting the Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://my.informatica.com. Use the Knowledge Base to search for documented solutions to known technical issues about Informatica products. You can also find answers to frequently asked questions, technical white papers, and technical tips.

Obtaining Technical Support

There are many ways to access Informatica Technical Support. You can contact a Technical Support Center by using the telephone numbers listed in the following table, you can send email, or you can use the WebSupport Service.

Use the following email addresses to contact Informatica Technical Support:
♦ [email protected] for technical inquiries
♦ [email protected] for general customer service requests
WebSupport requires a user name and password. You can request a user name and password at http://my.informatica.com.

North America / South America
Informatica Corporation Headquarters
100 Cardinal Way
Redwood City, California 94063
United States
Toll Free: 877 463 2435
Standard Rate: United States: 650 385 5800

Europe / Middle East / Africa
Informatica Software Ltd.
6 Waltham Park
Waltham Road, White Waltham
Maidenhead, Berkshire SL6 3TN
United Kingdom
Toll Free: 00 800 4632 4357
Standard Rate: Belgium: +32 15 281 702; France: +33 1 41 38 92 26; Germany: +49 1805 702 702; Netherlands: +31 306 022 797; United Kingdom: +44 1628 511 445

Asia / Australia
Informatica Business Solutions Pvt. Ltd.
Diamond District, Tower B, 3rd Floor
150 Airport Road
Bangalore 560 008
India
Toll Free: Australia: 1 800 151 830; Singapore: 001 800 4632 4357
Standard Rate: India: +91 80 4112 5738
Chapter 1: Working with Transformations

This chapter includes the following topics:
♦ Overview, 2
♦ Creating a Transformation, 5
♦ Configuring Transformations, 6
♦ Working with Ports, 7
♦ Multi-Group Transformations, 9
♦ Working with Expressions, 10
♦ Using Local Variables, 14
♦ Using Default Values for Ports, 18
♦ Configuring Tracing Level in Transformations, 30
♦ Reusable Transformations, 31
Overview

A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data.

Transformations in a mapping represent the operations the Integration Service performs on the data. Data passes through transformation ports that you link in a mapping or mapplet.

Transformations can be active or passive. An active transformation can change the number of rows that pass through it, such as a Filter transformation that removes rows that do not meet the filter condition. A passive transformation does not change the number of rows that pass through it, such as an Expression transformation that performs a calculation on data and passes all rows through the transformation.

Transformations can be connected to the data flow, or they can be unconnected. An unconnected transformation is not connected to other transformations in the mapping. An unconnected transformation is called within another transformation, and returns a value to that transformation.

Table 1-1 provides a brief description of each transformation:

Table 1-1. Transformation Descriptions

♦ Aggregator (Active/Connected). Performs aggregate calculations.
♦ Application Source Qualifier (Active/Connected). Represents the rows that the Integration Service reads from an application, such as an ERP source, when it runs a session.
♦ Custom (Active or Passive/Connected). Calls a procedure in a shared library or DLL.
♦ Expression (Passive/Connected). Calculates a value.
♦ External Procedure (Passive/Connected or Unconnected). Calls a procedure in a shared library or in the COM layer of Windows.
♦ Filter (Active/Connected). Filters data.
♦ HTTP (Passive/Connected). Connects to an HTTP server to read or update data.
♦ Input (Passive/Connected). Defines mapplet input rows. Available in the Mapplet Designer.
♦ Java (Active or Passive/Connected). Executes user logic coded in Java. The byte code for the user logic is stored in the repository.
♦ Joiner (Active/Connected). Joins data from different databases or flat file systems.
♦ Lookup (Passive/Connected or Unconnected). Looks up values.
♦ Normalizer (Active/Connected). Source qualifier for COBOL sources. Can also use in the pipeline to normalize data from relational or flat file sources.
♦ Output (Passive/Connected). Defines mapplet output rows. Available in the Mapplet Designer.
♦ Rank (Active/Connected). Limits records to a top or bottom range.
♦ Router (Active/Connected). Routes data into multiple transformations based on group conditions.
♦ Sequence Generator (Passive/Connected). Generates primary keys.
♦ Sorter (Active/Connected). Sorts data based on a sort key.
♦ Source Qualifier (Active/Connected). Represents the rows that the Integration Service reads from a relational or flat file source when it runs a session.
♦ SQL (Active or Passive/Connected). Executes SQL queries against a database.
♦ Stored Procedure (Passive/Connected or Unconnected). Calls a stored procedure.
♦ Transaction Control (Active/Connected). Defines commit and rollback transactions.
♦ Union (Active/Connected). Merges data from different databases or flat file systems.
♦ Update Strategy (Active/Connected). Determines whether to insert, delete, update, or reject rows.
♦ XML Generator (Active/Connected). Reads data from one or more input ports and outputs XML through a single output port.
♦ XML Parser (Active/Connected). Reads XML from one input port and outputs data to one or more output ports.
♦ XML Source Qualifier (Active/Connected). Represents the rows that the Integration Service reads from an XML source when it runs a session.
When you build a mapping, you add transformations and configure them to handle data according to a business purpose. Complete the following tasks to incorporate a transformation into a mapping:

1. Create the transformation. Create it in the Mapping Designer as part of a mapping, in the Mapplet Designer as part of a mapplet, or in the Transformation Developer as a reusable transformation.
2. Configure the transformation. Each type of transformation has a unique set of options that you can configure.
3. Link the transformation to other transformations and target definitions. Drag one port to another to link them in the mapping or mapplet.
Creating a Transformation

You can create transformations using the following Designer tools:
♦ Mapping Designer. Create transformations that connect sources to targets. Transformations in a mapping cannot be used in other mappings unless you configure them to be reusable.
♦ Transformation Developer. Create individual transformations, called reusable transformations, that you can use in multiple mappings. For more information, see "Reusable Transformations" on page 31.
♦ Mapplet Designer. Create and configure a set of transformations, called mapplets, that you use in multiple mappings. For more information, see "Mapplets" in the Designer Guide.

Use the same process to create a transformation in the Mapping Designer, Transformation Developer, and Mapplet Designer.

To create a transformation:

1. Open the appropriate Designer tool.
2. In the Mapping Designer, open or create a mapping. In the Mapplet Designer, open or create a mapplet.
3. On the Transformations toolbar, click the button corresponding to the transformation you want to create, or click Transformation > Create and select the type of transformation you want to create.
4. Drag across the portion of the mapping where you want to place the transformation.

The new transformation appears in the workspace. Next, you need to configure the transformation by adding any new ports to it and setting other properties.
Configuring Transformations

After you create a transformation, you can configure it. Every transformation contains the following common tabs:
♦ Transformation. Name the transformation or add a description.
♦ Ports. Add and configure ports.
♦ Properties. Configure properties that are unique to the transformation.
♦ Metadata Extensions. Extend the metadata in the repository by associating information with individual objects in the repository.

Some transformations might include other tabs, such as the Condition tab, where you enter conditions in a Joiner or Normalizer transformation.

When you configure transformations, you might complete the following tasks:
♦ Add ports. Define the columns of data that move into and out of the transformation.
♦ Add groups. In some transformations, define input or output groups that define a row of data entering or leaving the transformation.
♦ Enter expressions. Enter SQL-like expressions in some transformations that transform the data.
♦ Define local variables. Define local variables in some transformations that temporarily store data.
♦ Override default values. Configure default values for ports to handle input nulls and output transformation errors.
♦ Enter tracing levels. Choose the amount of detail the Integration Service writes in the session log about a transformation.
Working with Ports

After you create a transformation, you need to add and configure ports using the Ports tab. Figure 1-1 shows a sample Ports tab.

Figure 1-1. Sample Ports Tab

Creating Ports

You can create a new port in the following ways:
♦ Drag a port from another transformation. When you drag a port from another transformation, the Designer creates a port with the same properties, and it links the two ports. Click Layout > Copy Columns to enable copying ports.
♦ Click the Add button on the Ports tab. The Designer creates an empty port you can configure.

Configuring Ports

On the Ports tab, you can configure the following properties:
♦ Port name. The name of the port.
♦ Datatype, precision, and scale. If you plan to enter an expression or condition, make sure the datatype matches the return value of the expression.
♦ Port type. Transformations may contain a combination of input, output, input/output, and variable port types.
♦ Default value. The Designer assigns default values to handle null values and output transformation errors. You can override the default value in some ports.
♦ Description. A description of the port.
♦ Other properties. Some transformations have properties specific to that transformation, such as expressions or group by properties.

For more information about configuration options, see the appropriate sections in this chapter or in the specific transformation chapters.

Note: The Designer creates some transformations with configured ports. For example, the Designer creates a Lookup transformation with an output port for each column in the table or view used for the lookup. You need to create a port representing a value used to perform a lookup.

Linking Ports

Once you add and configure a transformation in a mapping, you link it to targets and other transformations. You link mapping objects through the ports. Data passes into and out of a mapping through the following ports:
♦ Input ports. Receive data.
♦ Output ports. Pass data.
♦ Input/output ports. Receive data and pass it unchanged.

Figure 1-2 shows an example of a transformation with input, output, and input/output ports.

Figure 1-2. Example of Input, Output, and Input/Output Ports

To link ports, drag between ports in different mapping objects. The Designer validates the link and creates the link only when the link meets validation requirements.

For more information about connecting mapping objects or about how to link ports, see "Mappings" in the Designer Guide.
Multi-Group Transformations

Transformations have input and output groups. A group is a set of ports that define a row of incoming or outgoing data; it is analogous to a table in a relational source or target definition. Most transformations have one input and one output group. However, some have multiple input groups, multiple output groups, or both.

Table 1-2 lists the transformations with multiple groups:

Table 1-2. Multi-Group Transformations

♦ Custom. Contains any number of input and output groups.
♦ HTTP. Contains an input group, an output group, and a header group.
♦ Joiner. Contains two input groups, the master source and detail source, and one output group.
♦ Router. Contains one input group and multiple output groups.
♦ Union. Contains multiple input groups and one output group.
♦ XML Source Qualifier. Contains multiple input and output groups.
♦ XML Target Definition. Contains multiple input groups.
♦ XML Parser. Contains one input group and multiple output groups.
♦ XML Generator. Contains multiple input groups and one output group.

When you connect transformations in a mapping, you must consider input and output groups. For more information about connecting transformations in a mapping, see "Mappings" in the Designer Guide.

Some multiple input group transformations require the Integration Service to block data at an input group while the Integration Service waits for a row from a different input group. A blocking transformation is a multiple input group transformation that blocks incoming data. The following transformations are blocking transformations:
♦ Custom transformation with the Inputs May Block property enabled
♦ Joiner transformation configured for unsorted input

The Designer performs data flow validation when you save or validate a mapping. Some mappings that contain blocking transformations might not be valid. For more information about data flow validation, see "Mappings" in the Designer Guide. For more information about blocking source data, see "Integration Service Architecture" in the Administrator Guide.
Working with Expressions

You can enter expressions using the Expression Editor in some transformations. Create expressions with the following functions:
♦ Transformation language functions. SQL-like functions designed to handle common expressions.
♦ User-defined functions. Functions you create in PowerCenter based on transformation language functions.
♦ Custom functions. Functions you create with the Custom Function API.

For more information about the transformation language and custom functions, see the Transformation Language Reference. For more information about user-defined functions, see "Working with User-Defined Functions" in the Designer Guide.

Enter an expression in an output port that uses the value of data from an input or input/output port. For example, you have a transformation with an input port IN_SALARY that contains the salaries of all the employees. You might want to use the individual values from the IN_SALARY column later in the mapping, as well as the total and average salaries you calculate through this transformation. For this reason, the Designer requires you to create a separate output port for each calculated value.

Figure 1-3 shows an Aggregator transformation that uses input ports to calculate sums and averages.

Figure 1-3. Sample Input and Output Ports
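In an Aggregator like the one in Figure 1-3, each calculated value is an expression entered on its own output port. The following minimal sketch uses the transformation language; the output port names are illustrative, not taken from the figure:

  -- TOTAL_SALARY (output port): total of all salaries that pass through
  SUM( IN_SALARY )

  -- AVG_SALARY (output port): average of the same input values
  AVG( IN_SALARY )

SUM and AVG are aggregate functions from the transformation language. Each calculation belongs on its own output port because a port holds exactly one expression.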
Table 1-3 lists the transformations in which you can enter expressions:

Table 1-3. Transformations Containing Expressions

Aggregator
Expression: Performs an aggregate calculation based on all data passed through the transformation. Alternatively, you can specify a filter for records in the aggregate calculation to exclude certain kinds of records. For example, you can find the total number and average salary of all employees in a branch office using this transformation.
Return Value: Result of an aggregate calculation for a port.

Expression
Expression: Performs a calculation based on values within a single row. For example, based on the price and quantity of a particular item, you can calculate the total purchase price for that line item in an order.
Return Value: Result of a row-level calculation for a port.

Filter
Expression: Specifies a condition used to filter rows passed through this transformation. For example, if you want to write customer data to the BAD_DEBT table for customers with outstanding balances, you could use the Filter transformation to filter customer data.
Return Value: TRUE or FALSE, depending on whether a row meets the specified condition. Only rows that return TRUE are passed through this transformation. The transformation applies this value to each row passed through it.

Rank
Expression: Sets the conditions for rows included in a rank. For example, you can rank the top 10 salespeople who are employed with the company.
Return Value: Result of a condition or calculation for a port.

Router
Expression: Routes data into multiple transformations based on a group expression. For example, use this transformation to compare the salaries of employees at three different pay levels. You can do this by creating three groups in the Router transformation. For example, create one group expression for each salary range.
Return Value: TRUE or FALSE, depending on whether a row meets the specified group expression. Only rows that return TRUE pass through each user-defined group in this transformation. Rows that return FALSE pass through the default group.

Update Strategy
Expression: Flags a row for update, insert, delete, or reject. You use this transformation when you want to control updates to a target, based on some condition you apply. For example, you might use the Update Strategy transformation to flag all customer rows for update when the mailing address has changed, or flag all employee rows for reject for people who no longer work for the company.
Return Value: Numeric code for update, insert, delete, or reject. The transformation applies this value to each row passed through it.

Transaction Control
Expression: Specifies a condition used to determine the action the Integration Service performs, either commit, roll back, or no transaction change. You use this transformation when you want to control commit and rollback transactions based on a row or set of rows that pass through the transformation. For example, use this transformation to commit a set of rows based on an order entry date.
Return Value: One of the following built-in variables, depending on whether or not a row meets the specified condition: TC_CONTINUE_TRANSACTION, TC_COMMIT_BEFORE, TC_COMMIT_AFTER, TC_ROLLBACK_BEFORE, TC_ROLLBACK_AFTER. The Integration Service performs actions based on the return value.
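To make the Filter row in Table 1-3 concrete, a filter condition is simply a boolean expression evaluated once per row. A hypothetical condition for the BAD_DEBT scenario above, with an invented column name:

  -- pass only customers who still owe money
  OUTSTANDING_BALANCE > 0

Rows for which the condition returns FALSE are dropped by the Filter transformation.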
Using the Expression Editor

Use the Expression Editor to build SQL-like statements. Although you can enter an expression manually, you should use the point-and-click method. Select functions, ports, variables, and operators from the point-and-click interface to minimize errors when you build expressions.

Figure 1-4 shows an example of the Expression Editor.

Figure 1-4. Expression Editor

Entering Port Names into an Expression

For connected transformations, if you use port names in an expression, the Designer updates that expression when you change port names in the transformation. For example, you write a valid expression that determines the difference between two dates, Date_Promised and Date_Delivered. Later, if you change the Date_Promised port name to Due_Date, the Designer changes the Date_Promised port name to Due_Date in the expression.

Note: You can propagate the name Due_Date to other non-reusable transformations that depend on this port in the mapping. For more information, see "Mappings" in the Designer Guide.

Adding Comments

You can add comments to an expression to give descriptive information about the expression or to specify a valid URL to access business documentation about the expression.

You can add comments in one of the following ways:
♦ To add comments within the expression, use -- or // comment indicators.
♦ To add comments in the dialog box, click the Comments button.

For examples on adding comments to expressions, see "The Transformation Language" in the Transformation Language Reference.
For more information about linking to business documentation, see "Using the Designer" in the Designer Guide.

Validating Expressions

Use the Validate button to validate an expression. If you do not validate an expression, the Designer validates it when you close the Expression Editor. If the expression is invalid, the Designer displays a warning. You can save the invalid expression or modify it. You cannot run a session against a mapping with invalid expressions.

Expression Editor Display

The Expression Editor can display syntax expressions in different colors for better readability. If you have the latest Rich Edit control, riched20.dll, installed on the system, the Expression Editor displays expression functions in blue, comments in grey, and quoted strings in green.

You can resize the Expression Editor. Expand the dialog box by dragging from the borders. The Designer saves the new size for the dialog box as a client setting.

Adding Expressions to an Output Port

Complete the following steps to add an expression to an output port.

To add expressions:

1. In the transformation, select the port and open the Expression Editor.
2. Enter the expression. Use the Functions and Ports tabs and the operator keys.
3. Add comments to the expression. Use comment indicators -- or //.
4. Validate the expression. Use the Validate button to validate the expression.
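Putting these steps together, a commented output-port expression for the date example earlier in this section might look like the following sketch. The port names come from that example; the 'DD' format string asks DATE_DIFF for the difference in days:

  -- Days between the promised and actual delivery dates.
  -- A negative result means the order shipped early.
  DATE_DIFF( Date_Delivered, Date_Promised, 'DD' )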
Using Local Variables

Use local variables in Aggregator, Expression, and Rank transformations. You can reference variables in an expression or use them to temporarily store data. Variables are an easy way to improve performance.

You might use variables to complete the following tasks:
♦ Temporarily store data.
♦ Simplify complex expressions.
♦ Store values from prior rows.
♦ Capture multiple return values from a stored procedure.
♦ Compare values.
♦ Store the results of an unconnected Lookup transformation.

Temporarily Store Data and Simplify Complex Expressions

Variables improve performance when you enter several related expressions in the same transformation. Rather than parsing and validating the same expression components each time, you can define these components as variables. For example, if an Aggregator transformation uses the same filter condition before calculating sums and averages, you can define this condition once as a variable, and then reuse the condition in both aggregate calculations.

You can also use variables to simplify complex expressions. If an Aggregator includes the same calculation in multiple expressions, you can improve session performance by creating a variable to store the results of the calculation. For example, you might create the following expressions to find both the average salary and the total salary using the same data:

  AVG( SALARY, ( ( JOB_STATUS = 'Full-time' ) AND ( OFFICE_ID = 1000 ) ) )
  SUM( SALARY, ( ( JOB_STATUS = 'Full-time' ) AND ( OFFICE_ID = 1000 ) ) )

Rather than entering the same arguments for both calculations, you might create a variable port for each condition in this calculation, then modify the expression to use the variables.

Table 1-4 shows how to use variables to simplify complex expressions and temporarily store data:

Table 1-4. Variable Usage

Port           Value
V_CONDITION1   JOB_STATUS = 'Full-time'
V_CONDITION2   OFFICE_ID = 1000
AVG_SALARY     AVG( SALARY, ( V_CONDITION1 AND V_CONDITION2 ) )
SUM_SALARY     SUM( SALARY, ( V_CONDITION1 AND V_CONDITION2 ) )

Store Values Across Rows

Use variables to store data from prior rows. This can help you perform procedural calculations. Figure 1-5 shows how to use variables to find out how many customers are in each state.

Figure 1-5. Variable Ports Store Values Across Rows

Since the Integration Service groups the input data by state, the mapping uses variables to hold the value of the previous state read and a state counter. The following expression compares the previous state to the state just read:

  IIF( PREVIOUS_STATE = STATE, STATE_COUNTER + 1, 1 )

The STATE_COUNTER is incremented if the row is a member of the previous state. For each new state, the Integration Service sets the counter back to 1. An output port then passes the value of the state counter to the next transformation. A fuller sketch of these ports appears at the end of this section.

Capture Values from Stored Procedures

Variables also provide a way to capture multiple columns of return values from stored procedures. For more information, see "Stored Procedure Transformation" on page 525.
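Putting the state-counter example together, the variable and output ports might be defined as follows. This is a sketch: the port layout is inferred from the description above (O_STATE_COUNTER is an invented name), and evaluation order matters because the counter must be computed before PREVIOUS_STATE is overwritten with the current row's state:

  -- STATE_COUNTER (variable port): compare against the state from the prior row
  IIF( PREVIOUS_STATE = STATE, STATE_COUNTER + 1, 1 )

  -- PREVIOUS_STATE (variable port): evaluated after STATE_COUNTER,
  -- so it stores the current state for use on the next row
  STATE

  -- O_STATE_COUNTER (output port): passes the counter downstream
  STATE_COUNTER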
Guidelines for Configuring Variable Ports

Consider the following factors when you configure variable ports in a transformation:
♦ Port order. The Integration Service evaluates ports by dependency. The order of the ports in a transformation must match the order of evaluation: input ports, variable ports, output ports.
♦ Datatype. The datatype you choose reflects the return value of the expression you enter.
♦ Variable initialization. The Integration Service sets initial values in variable ports, where you can create counters.

Port Order

The Integration Service evaluates ports in the following order:

1. Input ports. The Integration Service evaluates all input ports first since they do not depend on any other ports. Therefore, you can create input ports in any order. Since they do not reference other ports, the Integration Service does not order input ports.
2. Variable ports. Variable ports can reference input ports and variable ports, but not output ports. Because variable ports can reference input ports, the Integration Service evaluates variable ports after input ports. Likewise, since variables can reference other variables, the display order for variable ports is the same as the order in which the Integration Service evaluates each variable. For example, if you calculate the original value of a building and then adjust for depreciation, you might create the original value calculation as a variable port. This variable port needs to appear before the port that adjusts for depreciation.
3. Output ports. Because output ports can reference input ports and variable ports, the Integration Service evaluates output ports last. The display order for output ports does not matter since output ports cannot reference other output ports. Be sure output ports display at the bottom of the list of ports.

Datatype

When you configure a port as a variable, you can enter any expression or condition in it. The datatype you choose for this port reflects the return value of the expression you enter. If you specify a condition through the variable port, any numeric datatype returns the values for TRUE (non-zero) and FALSE (zero).

Variable Initialization

The Integration Service does not set the initial value for variables to NULL. Instead, the Integration Service uses the following guidelines to set initial values for variables:
♦ Zero for numeric ports
♦ Empty strings for string ports
♦ 01/01/1753 for Date/Time ports with PMServer 4.0 date handling compatibility disabled
♦ 01/01/0001 for Date/Time ports with PMServer 4.0 date handling compatibility enabled
Because variables have these non-NULL initial values, you can use them as counters. For example, you can create a numeric variable with the following expression:

  VAR1 + 1

This expression counts the number of rows in the VAR1 port. If the initial value of the variable were set to NULL, the expression would always evaluate to NULL, which is why the initial value is set to zero instead.
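Zero initialization also makes running totals straightforward, with no special handling for the first row. A minimal sketch with invented port names:

  -- V_RUNNING_TOTAL (variable port): zero on the first row,
  -- then accumulates SALARY across all rows
  V_RUNNING_TOTAL + SALARY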
Using Default Values for Ports

All transformations use default values that determine how the Integration Service handles input null values and output transformation errors. Input, output, and input/output ports are created with a system default value that you can sometimes override with a user-defined default value. Default values have different functions in different types of ports:

♦ Input port. The system default value for null input ports is NULL. It displays as a blank in the transformation. If an input value is NULL, the Integration Service leaves it as NULL.
♦ Output port. The system default value for output transformation errors is ERROR. The default value appears in the transformation as ERROR('transformation error'). If a transformation error occurs, the Integration Service skips the row. The Integration Service notes all input rows skipped by the ERROR function in the session log file. The following errors are considered transformation errors:
  - Data conversion errors, such as passing a number to a date function.
  - Expression evaluation errors, such as dividing by zero.
  - Calls to an ERROR function.
♦ Input/output port. The system default value for null input is the same as for input ports, NULL, and appears as a blank in the transformation. The default value for output transformation errors is the same as for output ports, but it does not display in the transformation.

Table 1-5 shows the system default values for ports in connected transformations:

Table 1-5. System Default Values and Integration Service Behavior

Port Type: Input, Input/Output
Default Value: NULL
Integration Service Behavior: Passes all input null values as NULL.
User-Defined Default Value Supported: Input, Input/Output

Port Type: Output, Input/Output
Default Value: ERROR
Integration Service Behavior: Calls the ERROR function for output port transformation errors. The Integration Service skips rows with errors and writes the input data and error message in the session log file.
User-Defined Default Value Supported: Output

Note: Variable ports do not support default values. The Integration Service initializes variable ports according to the datatype. For more information, see "Using Local Variables" on page 14.
Figure 1-6 shows that the system default value for input and input/output ports appears as a blank in the transformation:

Figure 1-6. Default Value for Input and Input/Output Ports (a blank default value for an input port means NULL)

Figure 1-7 shows that the system default value for output ports appears as ERROR('transformation error'):

Figure 1-7. Default Value for Output Ports (the selected port shows the ERROR default value)
You can override some of the default values to change the Integration Service behavior when it encounters null input values and output transformation errors.

Entering User-Defined Default Values

You can override the system default values with user-defined default values for supported input, input/output, and output ports within a connected transformation:

♦ Input ports. You can enter user-defined default values for input ports if you do not want the Integration Service to treat null values as NULL.
♦ Output ports. You can enter user-defined default values for output ports if you do not want the Integration Service to skip the row, or if you want the Integration Service to write a specific message with the skipped row to the session log.
♦ Input/output ports. You can enter user-defined default values to handle null input values for input/output ports in the same way you can enter user-defined default values for null input values for input ports. You cannot enter user-defined default values for output transformation errors in an input/output port.

Note: The Integration Service ignores user-defined default values for unconnected transformations. For example, if you call a Lookup or Stored Procedure transformation through an expression, the Integration Service ignores any user-defined default value and uses the system default value only.

Table 1-6 shows the ports for each transformation that support user-defined default values. The first value column covers default input values (input and input/output ports); the second and third cover default output values for output ports and input/output ports, respectively:

Table 1-6. Transformations Supporting User-Defined Default Values

Transformation        Input Values    Output Values    Output Values
                      (Input Port)    (Output Port)    (Input/Output Port)
Aggregator            Supported       Not Supported    Not Supported
Custom                Supported       Supported        Not Supported
Expression            Supported       Supported        Not Supported
External Procedure    Supported       Supported        Not Supported
Filter                Supported       Not Supported    Not Supported
HTTP                  Supported       Not Supported    Not Supported
Java                  Supported       Supported        Supported
Lookup                Supported       Supported        Not Supported
Normalizer            Supported       Supported        Not Supported
Rank                  Not Supported   Supported        Not Supported
Router                Supported       Not Supported    Not Supported
SQL                   Supported       Not Supported    Supported
Stored Procedure      Supported       Supported        Not Supported
Sequence Generator    n/a             Not Supported    Not Supported
Sorter                Supported       Not Supported    Not Supported
Source Qualifier      Not Supported   n/a              Not Supported
Transaction Control   Not Supported   n/a              Not Supported
Union                 Supported       Supported        n/a
Update Strategy       Supported       n/a              Not Supported
XML Generator         n/a             Supported        Not Supported
XML Parser            Supported       n/a              Not Supported
XML Source Qualifier  Not Supported   n/a              Not Supported

Use the following options to enter user-defined default values:
♦ Constant value. Use any constant (numeric or text), including NULL.
♦ Constant expression. You can include a transformation function with constant parameters.
♦ ERROR. Generate a transformation error. The Integration Service writes the row and a message to the session log or row error log, based on session configuration.
♦ ABORT. Abort the session.

Entering Constant Values

You can enter any constant value as a default value. The constant value must match the port datatype. For example, a default value for a numeric port must be a numeric constant. Some constant values include:

  0
  9999
  NULL
  'Unknown Value'
  'Null input data'

Entering Constant Expressions

A constant expression is any expression built from constants and transformation functions (except aggregate functions). You cannot use values from input, input/output, or variable ports in a constant expression.
Some valid constant expressions include:

  500 * 1.75
  TO_DATE('January 1, 1998, 12:05 AM')
  ERROR('Null not allowed')
  ABORT('Null not allowed')
  SYSDATE

You cannot use values from ports within the expression because the Integration Service assigns default values for the entire mapping when it initializes the session. Some invalid default values include the following examples, which incorporate values read from ports:

  AVG(IN_SALARY)
  IN_PRICE * IN_QUANTITY
  :LKP(LKP_DATES, DATE_SHIPPED)

Note: You cannot call a stored procedure or lookup table from a default value expression.

Entering ERROR and ABORT Functions

Use the ERROR and ABORT functions for input and output port default values, and input values for input/output ports. The Integration Service skips the row when it encounters the ERROR function. It aborts the session when it encounters the ABORT function.

Entering User-Defined Default Input Values

You can enter a user-defined default input value if you do not want the Integration Service to treat null values as NULL. You can complete the following functions to override null values:
♦ Replace the null value with a constant value or constant expression.
♦ Skip the null value with an ERROR function.
♦ Abort the session with the ABORT function.

Table 1-7 summarizes how the Integration Service handles null input for input and input/output ports:

Table 1-7. Default Values for Input and Input/Output Ports

Default Value: NULL (displays blank)
Type: System
Description: Integration Service passes NULL.

Default Value: Constant or constant expression
Type: User-Defined
Description: Integration Service replaces the null value with the value of the constant or constant expression.
Default Value: ERROR
Type: User-Defined
Description: Integration Service treats this as a transformation error. It increases the transformation error count by 1, skips the row, and writes the error message to the session log file or row error log. The Integration Service does not write rows to the reject file.

Default Value: ABORT
Type: User-Defined
Description: Session aborts when the Integration Service encounters a null input value. The Integration Service does not increase the error count or write rows to the reject file.

Replacing Null Values

Use a constant value or expression to substitute a specified value for a NULL. For example, if an input string port is called DEPT_NAME and you want to replace null values with the string 'UNKNOWN DEPT', you could set the default value to 'UNKNOWN DEPT'. Depending on the transformation, the Integration Service passes 'UNKNOWN DEPT' to an expression or variable within the transformation or to the next transformation in the data flow.

Figure 1-8 shows a string constant as a user-defined default value for input or input/output ports.

Figure 1-8. Using a Constant as a Default Value
The Integration Service replaces all null values in the DEPT_NAME port with the string 'UNKNOWN DEPT':

  DEPT_NAME     REPLACED VALUE
  Housewares    Housewares
  NULL          UNKNOWN DEPT
  Produce       Produce

Skipping Null Records

Use the ERROR function as the default value when you do not want null values to pass into a transformation. For example, you might want to skip a row when the input value of DEPT_NAME is NULL. You could use the following expression as the default value:

  ERROR('Error. DEPT is NULL')

Figure 1-9 shows a default value that instructs the Integration Service to skip null values.

Figure 1-9. Using the ERROR Function to Skip Null Input Values

When you use the ERROR function as a default value, the Integration Service skips the row with the null value. The Integration Service writes all rows skipped by the ERROR function into the session log file. It does not write these rows to the session reject file.

  DEPT_NAME     RETURN VALUE
  Housewares    Housewares
  NULL          'Error. DEPT is NULL' (row is skipped)
  Produce       Produce
The following session log excerpt shows where the Integration Service skips the row with the null value:

  TE_11019 Port [DEPT_NAME]: Default value is: ERROR(<<Transformation Error>> [error]: Error. DEPT is NULL
  ... error('Error. DEPT is NULL')
  ).
  CMN_1053 EXPTRANS: : ERROR: NULL input column DEPT_NAME: Current Input data:
  CMN_1053 Input row from SRCTRANS: Rowdata: ( RowType=4 Src Rowid=2 Targ Rowid=2
  DEPT_ID (DEPT_ID:Int:): "2"
  DEPT_NAME (DEPT_NAME:Char.25:): "NULL"
  MANAGER_ID (MANAGER_ID:Int:): "1"
  )

For more information about the ERROR function, see "Functions" in the Transformation Language Reference.

Aborting the Session

Use the ABORT function to abort a session when the Integration Service encounters any null input values. For more information about the ABORT function, see "Functions" in the Transformation Language Reference.

Entering User-Defined Default Output Values

You can enter user-defined default values for output ports if you do not want the Integration Service to skip rows with errors, or if you want the Integration Service to write a specific message with the skipped row to the session log. You can enter default values to complete the following functions when the Integration Service encounters output transformation errors:
♦ Replace the error with a constant value or constant expression. The Integration Service does not skip the row.
♦ Abort the session with the ABORT function.
♦ Write specific messages in the session log for transformation errors.

You cannot enter user-defined default output values for input/output ports.
Table 1-8 summarizes how the Integration Service handles output port transformation errors and default values in transformations:

Table 1-8. Supported Default Values for Output Ports

Default Value: Transformation Error
Type: System
Description: When a transformation error occurs and you did not override the default value, the Integration Service increases the transformation error count by 1, skips the row, and writes the error and input row to the session log file or row error log, depending on session configuration. The Integration Service does not write the row to the reject file.

Default Value: Constant or constant expression
Type: User-Defined
Description: Integration Service replaces the error with the default value. The Integration Service does not increase the error count or write a message to the session log.

Default Value: ABORT
Type: User-Defined
Description: Session aborts and the Integration Service writes a message to the session log. The Integration Service does not increase the error count or write rows to the reject file.

Replacing Errors

If you do not want the Integration Service to skip a row when a transformation error occurs, use a constant or constant expression as the default value for an output port. For example, if you have a numeric output port called NET_SALARY and you want to use the constant value 9999 when a transformation error occurs, assign the default value 9999 to the NET_SALARY port. If there is any transformation error (such as dividing by zero) while computing the value of NET_SALARY, the Integration Service uses the default value 9999. A sketch of this setup appears at the end of this section.

Aborting the Session

Use the ABORT function as the default value in an output port if you do not want to allow any transformation errors.

Writing Messages in the Session Log or Row Error Logs

You can enter a user-defined default value in the output port if you want the Integration Service to write a specific message in the session log with the skipped row. The system default is ERROR('transformation error'), and the Integration Service writes the message 'transformation error' in the session log along with the skipped row. You can replace 'transformation error' if you want to write a different message.

When you enable row error logging, the Integration Service writes error messages to the error log instead of the session log, and the Integration Service does not log Transaction Control transformation rollback or commit errors. If you want to write rows to the session log in addition to the row error log, you can enable verbose data tracing.
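To make the NET_SALARY example concrete, the following hypothetical sketch shows an expression that might sit on the NET_SALARY output port, with 9999 entered as that port's default value. The column names are invented for illustration:

  -- NET_SALARY (output port); port default value: 9999
  -- if PAY_PERIODS is ever zero, the division raises a transformation
  -- error and the Integration Service substitutes 9999 for this row
  ANNUAL_SALARY / PAY_PERIODS - DEDUCTIONS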
Working with ERROR Functions in Output Port Expressions

If you enter an expression that uses the ERROR function, the user-defined default value for the output port might override the ERROR function in the expression. For example, you enter the following expression that instructs the Integration Service to use the value 'Negative Sale' when it encounters an error:

  IIF( TOTAL_SALES > 0, TOTAL_SALES, ERROR('Negative Sale') )

The following examples show how user-defined default values may override the ERROR function in the expression:

♦ Constant value or expression. The constant value or expression overrides the ERROR function in the output port expression. For example, if you enter '0' as the default value, the Integration Service passes the value 0 when it encounters an error. It does not skip the row or write 'Negative Sale' in the session log.
♦ ABORT. The ABORT function overrides the ERROR function in the output port expression. If you use the ABORT function as the default value, the Integration Service aborts the session when a transformation error occurs.
♦ ERROR. If you use the ERROR function as the default value, the Integration Service includes the following information in the session log:
  - The error message from the default value
  - The error message indicated in the ERROR function in the output port expression
  - The skipped row

For example, you can override the default value with the following ERROR function:

  ERROR('No default value')

The Integration Service skips the row and includes both error messages in the log:

  TE_7007 Transformation Evaluation Error; current row skipped...
  TE_7007 [<<Transformation Error>> [error]: Negative Sale
  ... error('Negative Sale')
  ]
  Sun Sep 20 13:57:28 1998
  TE_11019 Port [OUT_SALES]: Default value is: ERROR(<<Transformation Error>> [error]: No default value
  ... error('No default value')
General Rules for Default Values

Use the following rules and guidelines when you create default values:
♦ The default value must be either a NULL, a constant value, a constant expression, an ERROR function, or an ABORT function.
♦ For input/output ports, the Integration Service uses default values to handle null input values. The output default value of input/output ports is always ERROR('Transformation Error').
♦ Variable ports do not use default values.
♦ You can assign default values to group by ports in the Aggregator and Rank transformations.
♦ Not all port types in all transformations allow user-defined default values. If a port does not allow user-defined default values, the default value field is disabled.
♦ Not all transformations allow user-defined default values. For more information, see Table 1-6 on page 20.
♦ If a transformation is not connected to the mapping data flow (an unconnected transformation), the Integration Service ignores user-defined default values.
♦ If any input port is unconnected, its value is assumed to be NULL and the Integration Service uses the default value for that input port.
♦ If an input port default value contains the ABORT function and the input value is NULL, the Integration Service immediately stops the session. Use the ABORT function as a default value to restrict null input values. The first null value in an input port stops the session.
♦ If an output port default value contains the ABORT function and any transformation error occurs for that port, the session immediately stops. Use the ABORT function as a default value to enforce strict rules for transformation errors. The first transformation error for this port stops the session.
♦ The ABORT function, constant values, and constant expressions override ERROR functions configured in output port expressions.

Entering and Validating Default Values

You can validate default values as you enter them. The Designer includes a Validate button so you can ensure valid default values. A message appears indicating whether the default is valid.
Figure 1-10 shows the user-defined default value for a port and the Validate button.

Figure 1-10. Entering and Validating Default Values

The Designer also validates default values when you save a mapping. If you enter an invalid default value, the Designer marks the mapping invalid.
Configuring Tracing Level in Transformations

When you configure a transformation, you can set the amount of detail the Integration Service writes in the session log.

Table 1-9 describes the session log tracing levels:

Table 1-9. Session Log Tracing Levels

♦ Normal. Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. Summarizes session results, but not at the level of individual rows.
♦ Terse. Integration Service logs initialization information and error messages and notification of rejected data.
♦ Verbose Initialization. In addition to normal tracing, Integration Service logs additional initialization details, names of index and data files used, and detailed transformation statistics.
♦ Verbose Data. In addition to verbose initialization tracing, Integration Service logs each row that passes into the mapping. Also notes where the Integration Service truncates string data to fit the precision of a column and provides detailed transformation statistics. Allows the Integration Service to write errors to both the session log and error log when you enable row error logging.

When you configure the tracing level to Verbose Data, the Integration Service writes row data for all rows in a block when it processes a transformation.

By default, the tracing level for every transformation is Normal. Change the tracing level to a Verbose setting only when you need to debug a transformation that is not behaving as expected. For a slight performance boost, you can also set the tracing level to Terse, which writes the minimum of detail to the session log when running a workflow containing the transformation.

When you configure a session, you can override the tracing levels for individual transformations with a single tracing level for all transformations in the session.
Reusable Transformations

Mappings can contain reusable and non-reusable transformations. Non-reusable transformations exist within a single mapping. Reusable transformations can be used in multiple mappings.

For example, you might create an Expression transformation that calculates value-added tax for sales in Canada, which is useful when you analyze the cost of doing business in that country. Rather than perform the same work every time, you can create a reusable transformation. When you need to incorporate this transformation into a mapping, you add an instance of it to the mapping. Later, if you change the definition of the transformation, all instances of it inherit the changes.

The Designer stores each reusable transformation as metadata separate from any mapping that uses the transformation. If you review the contents of a folder in the Navigator, you see the list of all reusable transformations in that folder.

Each reusable transformation falls within a category of transformations available in the Designer. For example, you can create a reusable Aggregator transformation to perform the same aggregate calculations in multiple mappings, or a reusable Stored Procedure transformation to call the same stored procedure in multiple mappings.

You can create most transformations as non-reusable or reusable. However, you can only create the External Procedure transformation as a reusable transformation.

When you add instances of a reusable transformation to mappings, you must be careful that changes you make to the transformation do not invalidate the mapping or generate unexpected data.

Instances and Inherited Changes

When you add a reusable transformation to a mapping, you add an instance of the transformation. The definition of the transformation still exists outside the mapping, while a copy (or instance) appears within the mapping.

Since the instance of a reusable transformation is a pointer to that transformation, when you change the transformation in the Transformation Developer, its instances reflect these changes. Instead of updating the same transformation in every mapping that uses it, you can update the reusable transformation once, and all instances of the transformation inherit the change. Note that instances do not inherit changes to property settings, only modifications to ports, expressions, and the name of the transformation.

Mapping Variables in Expressions

Use mapping parameters and variables in reusable transformation expressions. When the Designer validates the parameter or variable, it treats it as an Integer datatype. When you use the transformation in a mapplet or mapping, the Designer validates the expression again. If the mapping parameter or variable does not exist in the mapplet or mapping, the Designer
logs an error. For more information, see "Mapping Parameters and Variables" in the Designer Guide.

Creating Reusable Transformations

You can create a reusable transformation using the following methods:
♦ Design it in the Transformation Developer. In the Transformation Developer, you can build new reusable transformations.
♦ Promote a non-reusable transformation from the Mapping Designer. After you add a transformation to a mapping, you can promote it to the status of reusable transformation. The transformation designed in the mapping then becomes an instance of a reusable transformation maintained elsewhere in the repository.

If you promote a transformation to reusable status, you cannot demote it. However, you can create a non-reusable instance of it.

Note: Sequence Generator transformations must be reusable in mapplets. You cannot demote reusable Sequence Generator transformations to non-reusable in a mapplet.

To create a reusable transformation:

1. In the Designer, switch to the Transformation Developer.
2. Click the button on the Transformation toolbar corresponding to the type of transformation you want to create.
3. Drag within the workbook to create the transformation.
4. Double-click the transformation title bar to open the dialog displaying its properties.
5. Click the Rename button, enter a descriptive name for the transformation, and click OK.
6. Click the Ports tab, then add any input and output ports you need for this transformation.
7. Set the other properties of the transformation, and click OK. These properties vary according to the transformation you create. For example, if you create an Expression transformation, you need to enter an expression for one or more of the transformation output ports. If you create a Stored Procedure transformation, you need to identify the stored procedure to call.
8. Click Repository > Save.

Promoting Non-Reusable Transformations

The other technique for creating a reusable transformation is to promote an existing transformation within a mapping. By checking the Make Reusable option in the Edit Transformations dialog box, you instruct the Designer to promote the transformation and create an instance of it in the mapping.
• 65. To promote a non-reusable transformation: 1. In the Designer, open a mapping and double-click the title bar of the transformation you want to promote. 2. Select the Make Reusable option. 3. When prompted whether you are sure you want to promote the transformation, click Yes. 4. Click OK to return to the mapping. 5. Click Repository > Save. Now, when you look at the list of reusable transformations in the folder you are working in, the newly promoted transformation appears in this list. Creating Non-Reusable Instances of Reusable Transformations You can create a non-reusable instance of a reusable transformation within a mapping. You can create the non-reusable instance only in the folder that contains the reusable transformation. If you want a non-reusable instance of a reusable transformation in a different folder, first create a non-reusable instance of the transformation in the source folder, and then copy it into the target folder. To create a non-reusable instance of a reusable transformation: 1. In the Designer, open a mapping. 2. In the Navigator, select an existing transformation and drag the transformation into the mapping workspace. Hold down the Ctrl key before you release the transformation. The status bar displays the following message: Make a non-reusable copy of this transformation and add it to this mapping. 3. Release the transformation. The Designer creates a non-reusable instance of the existing reusable transformation. 4. Click Repository > Save. Adding Reusable Transformations to Mappings After you create a reusable transformation, you can add it to mappings. To add a reusable transformation: 1. In the Designer, switch to the Mapping Designer. 2. Open or create a mapping. 3. In the list of repository objects, drill down until you find the reusable transformation you want in the Transformations section of a folder. 4. Drag the transformation from the Navigator into the mapping. Reusable Transformations 33
  • 66. A copy (or instance) of the reusable transformation appears. 5. Link the new transformation to other transformations or target definitions. 6. Click Repository > Save. Modifying a Reusable Transformation Changes to a reusable transformation that you enter through the Transformation Developer are immediately reflected in all instances of that transformation. While this feature is a powerful way to save work and enforce standards (for example, by publishing the official version of a depreciation calculation through a reusable transformation), you risk invalidating mappings when you modify a reusable transformation. To see what mappings, mapplets, or shortcuts may be affected by changes you make to a transformation, select the transformation in the workspace or Navigator, right-click, and select View Dependencies. If you make any of the following changes to the reusable transformation, mappings that use instances of it may be invalidated: ♦ When you delete a port or multiple ports in a transformation, you disconnect the instance from part or all of the data flow through the mapping. ♦ When you change a port datatype, you make it impossible to map data from that port to another port using an incompatible datatype. ♦ When you change a port name, expressions that refer to the port are no longer valid. ♦ When you enter an invalid expression in the reusable transformation, mappings that use the transformation are no longer valid. The Integration Service cannot run sessions based on invalid mappings. Reverting to Original Reusable Transformation If you change the properties of a reusable transformation in a mapping, you can revert to the original reusable transformation properties by clicking the Revert button. 34 Chapter 1: Working with Transformations
• 67. Figure 1-11 shows how you can revert to the original properties of the reusable transformation: Figure 1-11. Reverting to Original Reusable Transformation Properties. The Revert button restores the original properties defined in the Transformation Developer. Reusable Transformations 35
  • 68. 36 Chapter 1: Working with Transformations
  • 69. Chapter 2 Aggregator Transformation This chapter includes the following topics: ♦ Overview, 38 ♦ Aggregate Expressions, 40 ♦ Group By Ports, 42 ♦ Using Sorted Input, 45 ♦ Creating an Aggregator Transformation, 47 ♦ Tips, 50 ♦ Troubleshooting, 51 37
• 70. Overview Transformation type: Active Connected The Aggregator transformation lets you perform aggregate calculations, such as averages and sums. The Aggregator transformation is unlike the Expression transformation in that you use the Aggregator transformation to perform calculations on groups. The Expression transformation permits you to perform calculations on a row-by-row basis only. When using the transformation language to create aggregate expressions, use conditional clauses to filter rows, providing more flexibility than the SQL language. The Integration Service performs aggregate calculations as it reads and stores the necessary group and row data in an aggregate cache. After you create a session that includes an Aggregator transformation, you can enable the session option, Incremental Aggregation. When the Integration Service performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform new aggregation calculations incrementally. For information about incremental aggregation, see “Using Incremental Aggregation” in the Workflow Administration Guide. Ports in the Aggregator Transformation To configure ports in the Aggregator transformation, complete the following tasks: ♦ Enter an expression in any output port, using conditional clauses or non-aggregate functions in the port. ♦ Create multiple aggregate output ports. ♦ Configure any input, input/output, output, or variable port as a group by port. ♦ Improve performance by connecting only the necessary input/output ports to subsequent transformations, reducing the size of the data cache. ♦ Use variable ports for local variables. ♦ Create connections to other transformations as you enter an expression. Components of the Aggregator Transformation The Aggregator is an active transformation, changing the number of rows in the pipeline. The Aggregator transformation has the following components and options: ♦ Aggregate expression. Entered in an output port. Can include non-aggregate expressions and conditional clauses. ♦ Group by port. Indicates how to create groups. The port can be any input, input/output, output, or variable port. When grouping data, the Aggregator transformation outputs the last row of each group unless otherwise specified. 38 Chapter 2: Aggregator Transformation
  • 71. Sorted input. Use to improve session performance. To use sorted input, you must pass data to the Aggregator transformation sorted by group by port, in ascending or descending order. ♦ Aggregate cache. The Integration Service stores data in the aggregate cache until it completes aggregate calculations. It stores group values in an index cache and row data in the data cache. Aggregate Caches When you run a session that uses an Aggregator transformation, the Integration Service creates index and data caches in memory to process the transformation. If the Integration Service requires more space, it stores overflow values in cache files. You can configure the index and data caches in the Aggregator transformation or in the session properties. Or, you can configure the Integration Service to determine the cache size at runtime. For more information about configuring index and data caches, see “Creating an Aggregator Transformation” on page 47. For information about configuring the Integration Service to determine the cache size at runtime, see “Working with Sessions” in the Workflow Administration Guide. Note: The Integration Service uses memory to process an Aggregator transformation with sorted ports. It does not use cache memory. You do not need to configure cache memory for Aggregator transformations that use sorted ports. Overview 39
  • 72. Aggregate Expressions The Designer allows aggregate expressions only in the Aggregator transformation. An aggregate expression can include conditional clauses and non-aggregate functions. It can also include one aggregate function nested within another aggregate function, such as: MAX( COUNT( ITEM )) The result of an aggregate expression varies depending on the group by ports used in the transformation. For example, when the Integration Service calculates the following aggregate expression with no group by ports defined, it finds the total quantity of items sold: SUM( QUANTITY ) However, if you use the same expression, and you group by the ITEM port, the Integration Service returns the total quantity of items sold, by item. You can create an aggregate expression in any output port and use multiple aggregate ports in a transformation. Aggregate Functions Use the following aggregate functions within an Aggregator transformation. You can nest one aggregate function within another aggregate function. The transformation language includes the following aggregate functions: ♦ AVG ♦ COUNT ♦ FIRST ♦ LAST ♦ MAX ♦ MEDIAN ♦ MIN ♦ PERCENTILE ♦ STDDEV ♦ SUM ♦ VARIANCE When you use any of these functions, you must use them in an expression within an Aggregator transformation. For a description of these functions, see “Functions” in the Transformation Language Reference. Nested Aggregate Functions You can include multiple single-level or multiple nested functions in different output ports in an Aggregator transformation. However, you cannot include both single-level and nested 40 Chapter 2: Aggregator Transformation
• 73. functions in an Aggregator transformation. Therefore, if an Aggregator transformation contains a single-level function in any output port, you cannot use a nested function in any other port in that transformation. When you include single-level and nested functions in the same Aggregator transformation, the Designer marks the mapping or mapplet invalid. If you need to create both single-level and nested functions, create separate Aggregator transformations. Conditional Clauses Use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation. The conditional clause can be any clause that evaluates to TRUE or FALSE. For example, use the following expression to calculate the total commissions of employees who exceeded their quarterly quota: SUM( COMMISSION, COMMISSION > QUOTA ) Non-Aggregate Functions You can also use non-aggregate functions in the aggregate expression. The following expression returns the highest number of items sold for each item (grouped by item). If no items were sold, the expression returns 0. IIF( MAX( QUANTITY ) > 0, MAX( QUANTITY ), 0 ) Null Values in Aggregate Functions When you configure the Integration Service, you can choose how you want the Integration Service to handle null values in aggregate functions. You can choose to treat null values in aggregate functions as NULL or zero. By default, the Integration Service treats null values as NULL in aggregate functions. For information about changing this default behavior, see “Creating and Configuring the Integration Service” in the Administrator Guide. Aggregate Expressions 41
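For example, the following expression, which uses hypothetical PRICE and QTY ports rather than ports from a sample repository, combines a conditional clause with a non-aggregate function to total only orders of more than ten units and return 0 when a group contains none:

IIF( ISNULL( SUM( PRICE * QTY, QTY > 10 ) ), 0, SUM( PRICE * QTY, QTY > 10 ) )

With ITEM selected as a group by port, the expression returns one such total for each item.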
• 74. Group By Ports The Aggregator transformation lets you define groups for aggregations, rather than performing the aggregation across all input data. For example, rather than finding the total company sales, you can find the total sales grouped by region. To define a group for the aggregate expression, select the appropriate input, input/output, output, and variable ports in the Aggregator transformation. You can select multiple group by ports, creating a new group for each unique combination of groups. The Integration Service then performs the defined aggregation for each group. When you group values, the Integration Service produces one row for each group. If you do not group values, the Integration Service returns one row for all input rows. The Integration Service typically returns the last row of each group (or the last row received) with the result of the aggregation. However, if you specify a particular row to be returned (for example, by using the FIRST function), the Integration Service then returns the specified row. When selecting multiple group by ports in the Aggregator transformation, the Integration Service uses port order to determine the order by which it groups. Since group order can affect the results, order group by ports to ensure the appropriate grouping. For example, the results of grouping by ITEM_ID then QUANTITY can vary from grouping by QUANTITY then ITEM_ID, because the numeric values for quantity are not necessarily unique. The following Aggregator transformation groups first by STORE_ID and then by ITEM. If you send the following data through this Aggregator transformation:

STORE_ID   ITEM        QTY   PRICE
101        'battery'   3     2.99
101        'battery'   1     3.19
101        'battery'   2     2.59
101        'AAA'       2     2.45
201        'battery'   1     1.99
201        'battery'   4     1.59
301        'battery'   1     2.45

42 Chapter 2: Aggregator Transformation
• 75. The Integration Service performs the aggregate calculation on the following unique groups:

STORE_ID   ITEM
101        'battery'
101        'AAA'
201        'battery'
301        'battery'

The Integration Service then passes the last row received, along with the results of the aggregation, as follows:

STORE_ID   ITEM        QTY   PRICE   SALES_PER_STORE
101        'battery'   2     2.59    17.34
101        'AAA'       2     2.45    4.90
201        'battery'   4     1.59    8.35
301        'battery'   1     2.45    2.45

Non-Aggregate Expressions Use non-aggregate expressions in group by ports to modify or replace groups. For example, if you want to replace 'AAA battery' before grouping, you can create a new group by output port, named CORRECTED_ITEM, using the following expression: IIF( ITEM = 'AAA battery', 'battery', ITEM ) Default Values Use default values in the group by port to replace null input values. This allows the Integration Service to include null item groups in the aggregation. For more information about default values, see “Using Default Values for Ports” on page 18. Group By Ports 43
• 76. For example, if you define a default value of ‘Misc’ in the ITEM column, the Integration Service replaces null groups with ‘Misc’. 44 Chapter 2: Aggregator Transformation
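For reference, the SALES_PER_STORE values in the earlier STORE_ID/ITEM example are consistent with an aggregate expression such as:

SUM( QTY * PRICE )

entered in the SALES_PER_STORE output port, with STORE_ID and ITEM selected as group by ports. The guide does not show the expression for that port, so treat this reconstruction as an inference from the sample values; for the 101/'battery' group it returns (3 * 2.99) + (1 * 3.19) + (2 * 2.59) = 17.34.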
• 77. Using Sorted Input You can improve Aggregator transformation performance by using the sorted input option. When you use sorted input, the Integration Service assumes all data is sorted by group. As the Integration Service reads rows for a group, it performs aggregate calculations. When necessary, it stores group information in memory. To use the Sorted Input option, you must pass sorted data to the Aggregator transformation. You can gain performance with sorted ports when you configure the session with multiple partitions. When you do not use sorted input, the Integration Service performs aggregate calculations as it reads. However, since data is not sorted, the Integration Service stores data for each group until it reads the entire source to ensure all aggregate calculations are accurate. For example, one Aggregator transformation has the STORE_ID and ITEM group by ports, with the sorted input option selected. When you pass the following data through the Aggregator, the Integration Service performs an aggregation for the three rows in the 101/battery group as soon as it finds the new group, 201/battery:

STORE_ID   ITEM        QTY   PRICE
101        'battery'   3     2.99
101        'battery'   1     3.19
101        'battery'   2     2.59
201        'battery'   4     1.59
201        'battery'   1     1.99

If you use sorted input and do not presort data correctly, you receive unexpected results. Sorted Input Conditions Do not use sorted input if either of the following conditions is true: ♦ The aggregate expression uses nested aggregate functions. ♦ The session uses incremental aggregation. If you use sorted input and do not sort data correctly, the session fails. Pre-Sorting Data To use sorted input, you pass sorted data through the Aggregator. Data must be sorted as follows: ♦ By the Aggregator group by ports, in the order they appear in the Aggregator transformation. ♦ Using the same sort order configured for the session. If data is not in strict ascending or descending order based on the session sort order, the Integration Service fails the session. For example, if you configure a session to use a French sort order, data passing into the Aggregator transformation must be sorted using the French sort order. Using Sorted Input 45
• 78. For relational and file sources, use the Sorter transformation to sort data in the mapping before passing it to the Aggregator transformation. You can place the Sorter transformation anywhere in the mapping prior to the Aggregator if no transformation changes the order of the sorted data. Group by columns in the Aggregator transformation must be in the same order as they appear in the Sorter transformation. For information about sorting data using the Sorter transformation, see “Sorter Transformation” on page 435. If the session uses relational sources, you can also use the Number of Sorted Ports option in the Source Qualifier transformation to sort group by columns in the source database. Group by columns must be in the same order in both the Aggregator and Source Qualifier transformations. For information about sorting data in the Source Qualifier, see “Using Sorted Ports” on page 472. Figure 2-1 shows the mapping with a Sorter transformation configured to sort the source data in descending order by ITEM_NAME: Figure 2-1. Sample Mapping with Aggregator and Sorter Transformations The Sorter transformation sorts the data as follows:

ITEM_NAME   QTY   PRICE
Soup        4     2.95
Soup        1     2.95
Soup        2     3.25
Cereal      1     4.49
Cereal      2     5.25

With sorted input, the Aggregator transformation returns the following results:

ITEM_NAME   QTY   PRICE   INCOME_PER_ITEM
Cereal      2     5.25    14.99
Soup        2     3.25    21.25

46 Chapter 2: Aggregator Transformation
  • 79. Creating an Aggregator Transformation To use an Aggregator transformation in a mapping, add the Aggregator transformation to the mapping. Then configure the transformation with an aggregate expression and group by ports. To create an Aggregator transformation: 1. In the Mapping Designer, click Transformation > Create. Select the Aggregator transformation. 2. Enter a name for the Aggregator, click Create. Then click Done. The Designer creates the Aggregator transformation. 3. Drag the ports to the Aggregator transformation. The Designer creates input/output ports for each port you include. 4. Double-click the title bar of the transformation to open the Edit Transformations dialog box. 5. Select the Ports tab. 6. Click the group by option for each column you want the Aggregator to use in creating groups. Optionally, enter a default value to replace null groups. If you want to use a non-aggregate expression to modify groups, click the Add button and enter a name and data type for the port. Make the port an output port by clearing Input (I). Click in the right corner of the Expression field, enter the non-aggregate expression using one of the input ports, and click OK. Select Group By. 7. Click Add and enter a name and data type for the aggregate expression port. Make the port an output port by clearing Input (I). Click in the right corner of the Expression field to open the Expression Editor. Enter the aggregate expression, click Validate, and click OK. Make sure the expression validates before closing the Expression Editor. 8. Add default values for specific ports. If certain ports are likely to contain null values, you might specify a default value if the target database does not handle null values. Creating an Aggregator Transformation 47
• 80. 9. Select the Properties tab. Select and modify these options:

♦ Cache Directory. Local directory where the Integration Service creates the index and data cache files. By default, the Integration Service uses the directory entered in the Workflow Manager for the process variable $PMCacheDir. If you enter a new directory, make sure the directory exists and contains enough disk space for the aggregate caches. If you have enabled incremental aggregation, the Integration Service creates a backup of the files each time you run the session. The cache directory must contain enough disk space for two sets of the files. For information about incremental aggregation, see “Using Incremental Aggregation” in the Workflow Administration Guide.
♦ Tracing Level. Amount of detail displayed in the session log for this transformation.
♦ Sorted Input. Indicates input data is presorted by groups. Select this option only if the mapping passes sorted data to the Aggregator transformation.
♦ Aggregator Data Cache Size. Data cache size for the transformation. Default cache size is 2,000,000 bytes. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service. You can configure the Integration Service to determine the cache size at runtime, or you can configure a numeric value. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.

48 Chapter 2: Aggregator Transformation
• 81. ♦ Aggregator Index Cache Size. Index cache size for the transformation. Default cache size is 1,000,000 bytes. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service. You can configure the Integration Service to determine the cache size at runtime, or you can configure a numeric value. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.
♦ Transformation Scope. Specifies how the Integration Service applies the transformation logic to incoming data: - Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions. - All Input. Applies the transformation logic on all incoming data. When you choose All Input, the Integration Service drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source. For more information about transformation scope, see “Understanding Commit Points” in the Workflow Administration Guide.

10. Click OK. 11. Click Repository > Save to save changes to the mapping. Creating an Aggregator Transformation 49
  • 82. Tips Use the following guidelines to optimize the performance of an Aggregator transformation. Use sorted input to decrease the use of aggregate caches. Sorted input reduces the amount of data cached during the session and improves session performance. Use this option with the Sorter transformation to pass sorted data to the Aggregator transformation. Limit connected input/output or output ports. Limit the number of connected input/output or output ports to reduce the amount of data the Aggregator transformation stores in the data cache. Filter before aggregating. If you use a Filter transformation in the mapping, place the transformation before the Aggregator transformation to reduce unnecessary aggregation. 50 Chapter 2: Aggregator Transformation
  • 83. Troubleshooting I selected sorted input but the workflow takes the same amount of time as before. You cannot use sorted input if any of the following conditions are true: ♦ The aggregate expression contains nested aggregate functions. ♦ The session uses incremental aggregation. ♦ Source data is data driven. When any of these conditions are true, the Integration Service processes the transformation as if you do not use sorted input. A session using an Aggregator transformation causes slow performance. The Integration Service may be paging to disk during the workflow. You can increase session performance by increasing the index and data cache sizes in the transformation properties. For more information about caching, see “Session Caches” in the Workflow Administration Guide. I entered an override cache directory in the Aggregator transformation, but the Integration Service saves the session incremental aggregation files somewhere else. You can override the transformation cache directory on a session level. The Integration Service notes the cache directory in the session log. You can also check the session properties for an override cache directory. Troubleshooting 51
  • 84. 52 Chapter 2: Aggregator Transformation
  • 85. Chapter 3 Custom Transformation This chapter includes the following topics: ♦ Overview, 54 ♦ Creating Custom Transformations, 57 ♦ Working with Groups and Ports, 59 ♦ Working with Port Attributes, 62 ♦ Custom Transformation Properties, 64 ♦ Working with Transaction Control, 68 ♦ Blocking Input Data, 70 ♦ Working with Procedure Properties, 72 ♦ Creating Custom Transformation Procedures, 73 53
  • 86. Overview Transformation type: Active/Passive Connected Custom transformations operate in conjunction with procedures you create outside of the Designer interface to extend PowerCenter functionality. You can create a Custom transformation and bind it to a procedure that you develop using the functions described in “Custom Transformation Functions” on page 89. Use the Custom transformation to create transformation applications, such as sorting and aggregation, which require all input rows to be processed before outputting any output rows. To support this process, the input and output functions occur separately in Custom transformations compared to External Procedure transformations. The Integration Service passes the input data to the procedure using an input function. The output function is a separate function that you must enter in the procedure code to pass output data to the Integration Service. In contrast, in the External Procedure transformation, an external procedure function does both input and output, and its parameters consist of all the ports of the transformation. You can also use the Custom transformation to create a transformation that requires multiple input groups, multiple output groups, or both. A group is the representation of a row of data entering or leaving a transformation. For example, you might create a Custom transformation with one input group and multiple output groups that parses XML data. Or, you can create a Custom transformation with two input groups and one output group that merges two streams of input data into one stream of output data. Working with Transformations Built On the Custom Transformation You can build transformations using the Custom transformation. Some of the PowerCenter transformations are built using the Custom transformation. Rules that apply to Custom transformations, such as blocking rules, also apply to transformations built using Custom transformations. For example, when you connect a Custom transformation in a mapping, you must verify that the data can flow from all sources in a target load order group to the targets without the Integration Service blocking all sources. Similarly, you must also verify this for transformations built using a Custom transformation. For more information about data flow validation, see “Mappings” in the Designer Guide. The following transformations that ship with Informatica products are built using the Custom transformation: ♦ HTTP transformation with PowerCenter ♦ Java transformation with PowerCenter ♦ SQL transformation with PowerCenter ♦ Union transformation with PowerCenter 54 Chapter 3: Custom Transformation
  • 87. XML Parser transformation with PowerCenter ♦ XML Generator transformation with PowerCenter ♦ SAP/ALE_IDoc_Interpreter transformation with PowerCenter Connect for SAP NetWeaver mySAP Option ♦ SAP/ALE_IDoc_Prepare transformation with PowerCenter Connect for SAP NetWeaver mySAP Option ♦ Web Service Consumer transformation with PowerCenter Connect for Web Services ♦ Address transformation with Data Cleansing Option ♦ Parse transformation with Data Cleansing Option Code Page Compatibility The Custom transformation procedure code page is the code page of the data the Custom transformation procedure processes. The following factors determine the Custom transformation procedure code page: ♦ Integration Service data movement mode ♦ The INFA_CTChangeStringMode() function ♦ The INFA_CTSetDataCodePageID() function The Custom transformation procedure code page must be two-way compatible with the Integration Service code page. The Integration Service passes data to the procedure in the Custom transformation procedure code page. Also, the data the procedure passes to the Integration Service must be valid characters in the Custom transformation procedure code page. By default, when the Integration Service runs in ASCII mode, the Custom transformation procedure code page is ASCII. Also, when the Integration Service runs in Unicode mode, the Custom transformation procedure code page is UCS-2, but the Integration Service only passes characters that are valid in the Integration Service code page. However, use the INFA_CTChangeStringMode() functions in the procedure code to request the data in a different format. In addition, when the Integration Service runs in Unicode mode, you can request the data in a different code page using the INFA_CTSetDataCodePageID() function. Changing the format or requesting the data in a different code page changes the Custom transformation procedure code page to the code page the procedure requests: ♦ ASCII mode. You can write the external procedure code to request the data in UCS-2 format using the INFA_CTChangeStringMode() function. When you use this function, the procedure must pass only ASCII characters in UCS-2 format to the Integration Service. Do not use the INFA_CTSetDataCodePageID() function when the Integration Service runs in ASCII mode. ♦ Unicode mode. You can write the external procedure code to request the data in MBCS using the INFA_CTChangeStringMode() function. When the external procedure requests the data in MBCS, the Integration Service passes the data in the Integration Service code Overview 55
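As a concrete illustration of the code page discussion above, the following sketch requests UCS-2 data from a procedure initialization callback. Only the INFA_CTChangeStringMode() function name comes from this guide; the callback name, handle type, enum constant, header name, and return codes are assumptions modeled on the Custom transformation API, so verify them against “Custom Transformation Functions” on page 89 before compiling.

/* Minimal sketch, assuming the SDK header name and the signatures shown.
   Check the generated p_<procedure_name>.c file for the exact names. */
#include "infaemdef.h"                     /* assumed SDK header name */

INFA_STATUS p_myproc_procInit(INFA_CT_PROCEDURE_HANDLE procedure)
{
    /* Request strings in UCS-2 format. When the Integration Service runs
       in ASCII mode, the procedure must then pass only ASCII characters
       in UCS-2 format back to the Integration Service (see above). */
    if (INFA_CTChangeStringMode(procedure, eASM_UNICODE) != INFA_SUCCESS)
    {
        return INFA_FAILURE;
    }
    return INFA_SUCCESS;
}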
  • 88. page. When you use the INFA_CTChangeStringMode() function, you can write the external procedure code to request the data in a different code page from the Integration Service code page using the INFA_CTSetDataCodePageID() function. The code page you specify in the INFA_CTSetDataCodePageID() function must be two-way compatible with the Integration Service code page. Note: You can also use the INFA_CTRebindInputDataType() function to change the format for a specific port in the Custom transformation. Distributing Custom Transformation Procedures You can copy a Custom transformation from one repository to another. When you copy a Custom transformation between repositories, you must verify that the Integration Service machine the target repository uses contains the Custom transformation procedure. 56 Chapter 3: Custom Transformation
  • 89. Creating Custom Transformations You can create reusable Custom transformations in the Transformation Developer, and add instances of the transformation to mappings. You can create non-reusable Custom transformations in the Mapping Designer or Mapplet Designer. Each Custom transformation specifies a module and a procedure name. You can create a Custom transformation based on an existing shared library or DLL containing the procedure, or you can create a Custom transformation as the basis for creating the procedure. When you create a Custom transformation to use with an existing shared library or DLL, make sure you define the correct module and procedure name. When you create a Custom transformation as the basis for creating the procedure, select the transformation and generate the code. The Designer uses the transformation properties when it generates the procedure code. It generates code in a single directory for all transformations sharing a common module name. The Designer generates the following files: ♦ m_<module_name>.c. Defines the module. This file includes an initialization function, m_<module_name>_moduleInit() that lets you write code you want the Integration Service to run when it loads the module. Similarly, this file includes a deinitialization function, m_<module_name>_moduleDeinit(), that lets you write code you want the Integration Service to run before it unloads the module. ♦ p_<procedure_name>.c. Defines the procedure in the module. This file contains the code that implements the procedure logic, such as data cleansing or merging data. ♦ makefile.aix, makefile.aix64, makefile.hp, makefile.hp64, makefile.hpparisc64, makefile.linux, makefile.sol, and makefile.sol64. Make files for the UNIX platforms. Use makefile.aix64 for 64-bit AIX platforms, makefile.sol64 for 64-bit Solaris platforms, and makefile.hp64 for 64-bit HP-UX (Itanium) platforms. Rules and Guidelines Use the following rules and guidelines when you create a Custom transformation: ♦ Custom transformations are connected transformations. You cannot reference a Custom transformation in an expression. ♦ You can include multiple procedures in one module. For example, you can include an XML writer procedure and an XML parser procedure in the same module. ♦ You can bind one shared library or DLL to multiple Custom transformation instances if you write the procedure code to handle multiple Custom transformation instances. ♦ When you write the procedure code, you must make sure it does not violate basic mapping rules. For more information about mappings and mapping validation, see “Mappings” in the Designer Guide. ♦ The Custom transformation sends and receives high precision decimals as high precision decimals. Creating Custom Transformations 57
  • 90. Use multi-threaded code in Custom transformation procedures. Custom Transformation Components When you configure a Custom transformation, you define the following components: ♦ Transformation tab. You can rename the transformation and add a description on the Transformation tab. ♦ Ports tab. You can add and edit ports and groups to a Custom transformation. For more information about creating ports and groups, see “Working with Groups and Ports” on page 59. You can also define the input ports an output port depends on. For more information about defining port dependencies, see “Defining Port Relationships” on page 60. ♦ Port Attribute Definitions tab. You can create user-defined port attributes for Custom transformation ports. For more information about creating and editing port attributes, see “Working with Port Attributes” on page 62. ♦ Properties tab. You can define transformation properties such as module and function identifiers, transaction properties, and the runtime location. For more information about defining transformation properties, see “Custom Transformation Properties” on page 64. ♦ Initialization Properties tab. You can define properties that the external procedure uses at runtime, such as during initialization. For more information about creating initialization properties, see “Working with Procedure Properties” on page 72. ♦ Metadata Extensions tab. You can create metadata extensions to define properties that the procedure uses at runtime, such as during initialization. For more information about using metadata extensions for procedure properties, see “Working with Procedure Properties” on page 72. 58 Chapter 3: Custom Transformation
• 91. Working with Groups and Ports A Custom transformation has both input and output groups. It also can have input ports, output ports, and input/output ports. You create and edit groups and ports on the Ports tab of the Custom transformation. You can also define the relationship between input and output ports on the Ports tab. Figure 3-1 shows the Custom transformation Ports tab: Figure 3-1. Custom Transformation Ports Tab. The tab includes controls to add and delete groups and edit port attributes, and displays the group headers (for example, a first input group header, an output group header, a second input group header, and coupled group headers). Creating Groups and Ports You can create multiple input groups and multiple output groups in a Custom transformation. You must create at least one input group and one output group. To create an input group, click the Create Input Group icon. To create an output group, click the Create Output Group icon. When you create a group, the Designer adds it as the last group. When you create a passive Custom transformation, you can only create one input group and one output group. To create a port, click the Add button. When you create a port, the Designer adds it below the currently selected row or group. Each port contains attributes defined on the Port Attribute Definitions tab. You can edit the attributes for each port. For more information about creating and editing user-defined port attributes, see “Working with Port Attributes” on page 62. Working with Groups and Ports 59
• 92. Editing Groups and Ports Use the following rules and guidelines when you edit ports and groups in a Custom transformation: ♦ You can change group names by typing in the group header. ♦ You can only enter ASCII characters for port and group names. ♦ Once you create a group, you cannot change the group type. If you need to change the group type, delete the group and add a new group. ♦ When you delete a group, the Designer deletes all ports of the same type in that group. However, all input/output ports remain in the transformation, belong to the group above them, and change to input ports or output ports, depending on the type of group you delete. For example, an output group contains output ports and input/output ports. You delete the output group. The Designer deletes the output ports. It changes the input/output ports to input ports. Those input ports belong to the input group with the header directly above them. ♦ To move a group up or down, select the group header and click the Move Port Up or Move Port Down button. The ports above and below the group header remain the same, but the groups to which they belong might change. Defining Port Relationships By default, an output port in a Custom transformation depends on all input ports. However, you can define the relationship between input and output ports in a Custom transformation. When you do this, you can view link paths in a mapping containing a Custom transformation and you can see which input ports an output port depends on. You can also view source column dependencies for target ports in a mapping containing a Custom transformation. To define the relationship between ports in a Custom transformation, create a port dependency. A port dependency is the relationship between an output or input/output port and one or more input or input/output ports. When you create a port dependency, base it on the procedure logic in the code. To create a port dependency, click Custom Transformation on the Ports tab and choose Port Dependencies. 60 Chapter 3: Custom Transformation
• 93. Figure 3-2 shows where you create and edit port dependencies: Figure 3-2. Editing Port Dependencies. In the dialog box, you choose an output or input/output port, choose the input or input/output ports on which it depends, and add or remove port dependencies. For example, create an external procedure that parses XML data. You create a Custom transformation with one input group containing one input port and multiple output groups containing multiple output ports. According to the external procedure logic, all output ports depend on the input port. You can define this relationship in the Custom transformation by creating a port dependency for each output port. Define each port dependency so that the output port depends on the one input port. To create a port dependency: 1. On the Ports tab, click Custom Transformation and choose Port Dependencies. 2. In the Output Port Dependencies dialog box, select an output or input/output port in the Output Port field. 3. In the Input Ports pane, select an input or input/output port on which the output port or input/output port depends. 4. Click Add. 5. Repeat steps 3 to 4 to include more input or input/output ports in the port dependency. 6. To create another port dependency, repeat steps 2 to 5. 7. Click OK. Working with Groups and Ports 61
• 94. Working with Port Attributes Ports have certain attributes, such as datatype and precision. When you create a Custom transformation, you can create user-defined port attributes. User-defined port attributes apply to all ports in a Custom transformation. For example, you create an external procedure to parse XML data. You can create a port attribute called “XML path” where you can define the position of an element in the XML hierarchy. Create port attributes and assign default values on the Port Attribute Definitions tab of the Custom transformation. You can define a specific port attribute value for each port on the Ports tab. Figure 3-3 shows the Port Attribute Definitions tab where you create port attributes: Figure 3-3. Port Attribute Definitions Tab. The tab lists each port attribute and its default value. When you create a port attribute, define the following properties: ♦ Name. The name of the port attribute. ♦ Datatype. The datatype of the port attribute value. You can choose Boolean, Numeric, or String. ♦ Value. The default value of the port attribute. This property is optional. When you enter a value here, the value applies to all ports in the Custom transformation. You can override the port attribute value for each port on the Ports tab. You define port attributes for each Custom transformation. You cannot copy a port attribute from one Custom transformation to another. 62 Chapter 3: Custom Transformation
• 95. Editing Port Attribute Values After you create port attributes, you can edit the port attribute values for each port in the transformation. To edit the port attribute values, click Custom Transformation on the Ports tab and choose Edit Port Attribute. Figure 3-4 shows where you edit port attribute values: Figure 3-4. Edit Port Attribute Values. The dialog box lets you filter ports by group, edit a port attribute value, and revert to the default port attribute value. You can change the port attribute value for a particular port by clicking the Open button. This opens the Edit Port Attribute Default Value dialog box. Or, you can enter a new value by typing directly in the Value column. You can filter the ports listed in the Edit Port Level Attributes dialog box by choosing a group from the Select Group field. Working with Port Attributes 63
• 96. Custom Transformation Properties Properties for the Custom transformation apply to both the procedure and the transformation. Configure the Custom transformation properties on the Properties tab of the Custom transformation. Figure 3-5 shows the Custom transformation Properties tab: Figure 3-5. Custom Transformation Properties Table 3-1 describes the Custom transformation properties: Table 3-1. Custom Transformation Properties

♦ Language. Language used for the procedure code. You define the language when you create the Custom transformation. If you need to change the language, create a new Custom transformation.
♦ Module Identifier. Module name. Applies to Custom transformation procedures developed using C or C++. Enter only ASCII characters in this field. You cannot enter multibyte characters. This property is the base name of the DLL or the shared library that contains the procedure. The Designer uses this name to create the C file when you generate the external procedure code.
♦ Function Identifier. Name of the procedure in the module. Applies to Custom transformation procedures developed using C. Enter only ASCII characters in this field. You cannot enter multibyte characters. The Designer uses this name to create the C file where you enter the procedure code.

64 Chapter 3: Custom Transformation
• 97. Table 3-1. Custom Transformation Properties (continued)

♦ Class Name. Class name of the Custom transformation procedure. Applies to Custom transformation procedures developed using C++ or Java. Enter only ASCII characters in this field. You cannot enter multibyte characters.
♦ Runtime Location. Location that contains the DLL or shared library. Default is $PMExtProcDir. Enter a path relative to the Integration Service machine that runs the session using the Custom transformation. If you make this property blank, the Integration Service uses the environment variable defined on the Integration Service machine to locate the DLL or shared library. You must copy all DLLs or shared libraries to the runtime location or to the environment variable defined on the Integration Service machine. The Integration Service fails to load the procedure when it cannot locate the DLL, shared library, or a referenced file.
♦ Tracing Level. Amount of detail displayed in the session log for this transformation. Default is Normal.
♦ Is Partitionable. Indicates if you can create multiple partitions in a pipeline that uses this transformation: - No. The transformation cannot be partitioned. The transformation and other transformations in the same pipeline are limited to one partition. - Locally. The transformation can be partitioned, but the Integration Service must run all partitions in the pipeline on the same node. Choose Local when different partitions of the Custom transformation must share objects in memory. - Across Grid. The transformation can be partitioned, and the Integration Service can distribute each partition to different nodes. Default is No. For more information about using partitioning with Custom transformations, see “Working with Partition Points” in the Workflow Administration Guide.
♦ Inputs Must Block. Indicates if the procedure associated with the transformation must be able to block incoming data. Default is enabled. For more information about blocking data, see “Blocking Input Data” on page 70.
♦ Is Active. Indicates if this transformation is an active or passive transformation. You cannot change this property after you create the Custom transformation. If you need to change this property, create a new Custom transformation and select the correct property value.
♦ Update Strategy Transformation. Indicates if this transformation defines the update strategy for output rows. Default is disabled. You can enable this for active Custom transformations. For more information about this property, see “Setting the Update Strategy” on page 66.
♦ Transformation Scope. Indicates how the Integration Service applies the transformation logic to incoming data: - Row - Transaction - All Input When the transformation is passive, this property is always Row. When the transformation is active, this property is All Input by default. For more information about working with transaction control, see “Working with Transaction Control” on page 68.
♦ Generate Transaction. Indicates if this transformation can generate transactions. When a Custom transformation generates transactions, it generates transactions for all output groups. Default is disabled. You can only enable this for active Custom transformations. For more information about working with transaction control, see “Working with Transaction Control” on page 68.

Custom Transformation Properties 65
• 98. Table 3-1. Custom Transformation Properties (continued)

♦ Output is Ordered. Indicates if the order of the output data is consistent between session runs. - Never. The order of the output data is inconsistent between session runs. This is the default for active transformations. - Based On Input Order. The output order is consistent between session runs when the input data order is consistent between session runs. This is the default for passive transformations. - Always. The order of the output data is consistent between session runs even if the order of the input data is inconsistent between session runs.
♦ Requires Single Thread Per Partition. Indicates if the Integration Service processes each partition at the procedure with one thread. When you enable this option, the procedure code can use thread-specific operations. Default is enabled. For more information about writing thread-specific operations, see “Working with Thread-Specific Procedure Code” on page 66.
♦ Output is Deterministic. Indicates whether the transformation generates consistent output data between session runs. You must enable this property to perform recovery on sessions that use this transformation. For more information about session recovery, see “Recovering Workflows” in the Workflow Administration Guide.

Setting the Update Strategy Use an active Custom transformation to set the update strategy for a mapping at the following levels: ♦ Within the procedure. You can write the external procedure code to set the update strategy for output rows. The external procedure can flag rows for insert, update, delete, or reject. For more information about the functions used to set the update strategy, see “Row Strategy Functions (Row-Based Mode)” on page 128. ♦ Within the mapping. Use the Custom transformation in a mapping to flag rows for insert, update, delete, or reject. Select the Update Strategy Transformation property for the Custom transformation. ♦ Within the session. Configure the session to treat the source rows as data driven. If you do not configure the Custom transformation to define the update strategy, or you do not configure the session as data driven, the Integration Service does not use the external procedure code to flag the output rows. Instead, when the Custom transformation is active, the Integration Service flags the output rows as insert. When the Custom transformation is passive, the Integration Service retains the row type. For example, when a row flagged for update enters a passive Custom transformation, the Integration Service maintains the row type and outputs the row as update. Working with Thread-Specific Procedure Code Custom transformation procedures can include thread-specific operations. A thread-specific operation is code that performs an action based on the thread that is processing the procedure. 66 Chapter 3: Custom Transformation
  • 99. You can configure the Custom transformation so the Integration Service uses one thread to process the Custom transformation for each partition using the Requires Single Thread Per Partition property. When you configure a Custom transformation to process each partition with one thread, the Integration Service calls the following functions with the same thread for each partition: ♦ p_<proc_name>_partitionInit() ♦ p_<proc_name>_partitionDeinit() ♦ p_<proc_name>_inputRowNotification() ♦ p_<proc_name>_dataBdryRowNotification() ♦ p_<proc_name>_eofNotification() You can include thread-specific operations in these functions because the Integration Service uses the same thread to process these functions for each partition. For example, you might attach and detach threads to a Java Virtual Machine. Note: When you configure a Custom transformation to process each partition with one thread, the Workflow Manager adds partition points depending on the mapping configuration. For more information, see “Working with Partition Points” in the Workflow Administration Guide. Custom Transformation Properties 67
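To make the thread-specific pattern concrete, the sketch below attaches each partition's processing thread to a Java Virtual Machine in p_<proc_name>_partitionInit() and detaches it in p_<proc_name>_partitionDeinit(). The callback names come from the list above; the INFA parameter and return types, and the assumption that a JavaVM pointer was created earlier (for example, in the module initialization function), are not from this guide, so verify them against “Custom Transformation Functions” on page 89.

/* Minimal sketch, assuming the INFA handle types shown and a JVM created
   elsewhere. This pattern is safe only because Requires Single Thread Per
   Partition guarantees that the same thread runs the partition init, row
   notification, and deinit callbacks for a given partition. */
#include <jni.h>

extern JavaVM *g_jvm;   /* assumed: created in m_<module_name>_moduleInit() */

INFA_STATUS p_myproc_partitionInit(INFA_CT_PARTITION_HANDLE partition)
{
    JNIEnv *env = NULL;
    /* Attach this partition's processing thread to the JVM. */
    if ((*g_jvm)->AttachCurrentThread(g_jvm, (void **)&env, NULL) != JNI_OK)
    {
        return INFA_FAILURE;
    }
    return INFA_SUCCESS;
}

INFA_STATUS p_myproc_partitionDeinit(INFA_CT_PARTITION_HANDLE partition)
{
    /* Detach on the same thread that attached in partitionInit. */
    (*g_jvm)->DetachCurrentThread(g_jvm);
    return INFA_SUCCESS;
}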
  • 100. Working with Transaction Control You can define transaction control for Custom transformations using the following transformation properties: ♦ Transformation Scope. Determines how the Integration Service applies the transformation logic to incoming data. ♦ Generate Transaction. Indicates that the procedure generates transaction rows and outputs them to the output groups. Transformation Scope You can configure how the Integration Service applies the transformation logic to incoming data. You can choose one of the following values: ♦ Row. Applies the transformation logic to one row of data at a time. Choose Row when the results of the procedure depend on a single row of data. For example, you might choose Row when a procedure parses a row containing an XML file. ♦ Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when the results of the procedure depend on all rows in the same transaction, but not on rows in other transactions. When you choose Transaction, you must connect all input groups to the same transaction control point. For example, you might choose Transaction when the external procedure performs aggregate calculations on the data in a single transaction. ♦ All Input. Applies the transformation logic to all incoming data. When you choose All Input, the Integration Service drops transaction boundaries. Choose All Input when the results of the procedure depend on all rows of data in the source. For example, you might choose All Input when the external procedure performs aggregate calculations on all incoming data, or when it sorts all incoming data. For more information about transformation scope, see “Understanding Commit Points” in the Workflow Administration Guide. Generate Transaction You can write the external procedure code to output transactions, such as commit and rollback rows. When the external procedure outputs commit and rollback rows, configure the Custom transformation to generate transactions. Select the Generate Transaction transformation property. You can enable this property for active Custom transformations. For information about the functions you use to generate transactions, see “Data Boundary Output Notification Function” on page 121. When the external procedure outputs a commit or rollback row, it outputs or rolls back the row for all output groups. When you configure the transformation to generate transactions, the Integration Service treats the Custom transformation like a Transaction Control transformation. Most rules that apply to a Transaction Control transformation in a mapping also apply to the Custom 68 Chapter 3: Custom Transformation
• 101. transformation. For example, when you configure a Custom transformation to generate transactions, you cannot concatenate pipelines or pipeline branches containing the transformation. For more information about working with Transaction Control transformations, see “Transaction Control Transformation” on page 555. When you edit or create a session using a Custom transformation configured to generate transactions, configure it for user-defined commit. Working with Transaction Boundaries The Integration Service handles transaction boundaries entering and leaving Custom transformations based on the mapping configuration and the Custom transformation properties. Table 3-2 describes how the Integration Service handles transaction boundaries at Custom transformations: Table 3-2. Transaction Boundary Handling with Custom Transformations

♦ Row, Generate Transactions enabled. Integration Service drops incoming transaction boundaries and does not call the data boundary notification function. It outputs transaction rows according to the procedure logic across all output groups.
♦ Row, Generate Transactions disabled. When the incoming data for all input groups comes from the same transaction control point, the Integration Service preserves incoming transaction boundaries and outputs them across all output groups. However, it does not call the data boundary notification function. When the incoming data for the input groups comes from different transaction control points, the Integration Service drops incoming transaction boundaries. It does not call the data boundary notification function. The Integration Service outputs all rows in one open transaction.
♦ Transaction, Generate Transactions enabled. Integration Service preserves incoming transaction boundaries and calls the data boundary notification function. However, it outputs transaction rows according to the procedure logic across all output groups.
♦ Transaction, Generate Transactions disabled. Integration Service preserves incoming transaction boundaries and calls the data boundary notification function. It outputs the transaction rows across all output groups.
♦ All Input, Generate Transactions enabled. Integration Service drops incoming transaction boundaries and does not call the data boundary notification function. The Integration Service outputs transaction rows according to the procedure logic across all output groups.
♦ All Input, Generate Transactions disabled. Integration Service drops incoming transaction boundaries and does not call the data boundary notification function. It outputs all rows in one open transaction.

Working with Transaction Control 69
• 102. Blocking Input Data By default, the Integration Service concurrently reads sources in a target load order group. However, you can write the external procedure code to block input data on some input groups. Blocking is the suspension of the data flow into an input group of a multiple input group transformation. For more information about blocking source data, see “Integration Service Architecture” in the Administrator Guide. To use a Custom transformation to block input data, you must write the procedure code to block and unblock data. You must also enable blocking on the Properties tab for the Custom transformation. Writing the Procedure Code to Block Data You can write the procedure to block and unblock incoming data. To block incoming data, use the INFA_CTBlockInputFlow() function. To unblock incoming data, use the INFA_CTUnblockInputFlow() function. For more information about the blocking functions, see “Blocking Functions” on page 125. You might want to block input data if the external procedure needs to alternate reading from input groups. Without the blocking functionality, you would need to write the procedure code to buffer incoming data. You can block input data instead of buffering it, which usually increases session performance. For example, you need to create an external procedure with two input groups. The external procedure reads a row from the first input group and then reads a row from the second input group. If you use blocking, you can write the external procedure code to block the flow of data from one input group while it processes the data from the other input group. When you write the external procedure code to block data, you increase performance because the procedure does not need to copy the source data to a buffer. However, you could write the external procedure to allocate a buffer and copy the data from one input group to the buffer until it is ready to process the data. Copying source data to a buffer decreases performance. Configuring Custom Transformations as Blocking Transformations When you create a Custom transformation, the Designer enables the Inputs Must Block transformation property by default. This property affects data flow validation when you save or validate a mapping. When you enable this property, the Custom transformation is a blocking transformation. When you clear this property, the Custom transformation is not a blocking transformation. For more information about blocking transformations, see “Multi-Group Transformations” on page 9. Configure the Custom transformation as a blocking transformation when the external procedure code must be able to block input data. You can configure the Custom transformation as a non-blocking transformation when one of the following conditions is true: ♦ The procedure code does not include the blocking functions. 70 Chapter 3: Custom Transformation
• 103. ♦ The procedure code includes two algorithms, one that uses blocking and the other that copies the source data to a buffer allocated by the procedure instead of blocking data. The code checks whether the Integration Service allows the Custom transformation to block data. The procedure uses the algorithm with the blocking functions when it can block, and uses the other algorithm when it cannot block. You might want to do this to create a Custom transformation that you use in multiple mapping configurations. For more information about verifying whether the Integration Service allows a Custom transformation to block data, see “Validating Mappings with Custom Transformations” on page 71. Note: When the procedure blocks data and you configure the Custom transformation as a non-blocking transformation, the Integration Service fails the session. Validating Mappings with Custom Transformations When you include a Custom transformation in a mapping, both the Designer and Integration Service validate the mapping. The Designer validates the mapping you save or validate, and the Integration Service validates the mapping when you run the session. Validating at Design Time When you save or validate a mapping, the Designer performs data flow validation. When the Designer does this, it verifies that the data can flow from all sources in a target load order group to the targets without blocking transformations blocking all sources. Some mappings with blocking transformations are invalid. For more information about data flow validation, see “Mappings” in the Designer Guide. Validating at Runtime When you run a session, the Integration Service validates the mapping against the procedure code at runtime. When the Integration Service does this, it tracks whether or not it allows the Custom transformations to block data: ♦ Configure the Custom transformation as a blocking transformation. The Integration Service always allows the Custom transformation to block data. ♦ Configure the Custom transformation as a non-blocking transformation. The Integration Service allows the Custom transformation to block data depending on the mapping configuration. If the Integration Service can block data at the Custom transformation without blocking all sources in the target load order group simultaneously, it allows the Custom transformation to block data. You can write the procedure code to check whether the Integration Service allows a Custom transformation to block data. Use the INFA_CTGetInternalProperty<datatype>() functions to access the INFA_CT_TRANS_MAY_BLOCK_DATA property ID. The Integration Service returns TRUE when the Custom transformation can block data, and it returns FALSE when the Custom transformation cannot block data. For more information about the INFA_CTGetInternalProperty<datatype>() functions, see “Property Functions” on page 108. Blocking Input Data 71
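The following fragment is a hedged sketch of the runtime check and the alternating-read pattern described above. The INFA_CTGetInternalPropertyBool() accessor name and the argument lists of the blocking functions are assumptions based on the references on pages 108 and 125, not verified signatures; treat this as a starting point, not a definitive implementation.

    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    /* Called from the procedure initialization (not shown): ask whether this
     * instance may block at all before choosing the blocking algorithm. */
    static INFA_BOOLEAN transMayBlock(INFA_CT_TRANSFORMATION_HANDLE transformation)
    {
        INFA_BOOLEAN bMayBlock = INFA_FALSE;
        /* INFA_CTGetInternalPropertyBool() is one instantiation of the
         * INFA_CTGetInternalProperty<datatype>() family; assumed name. */
        if (INFA_CTGetInternalPropertyBool(transformation,
                                           INFA_CT_TRANS_MAY_BLOCK_DATA,
                                           &bMayBlock) != INFA_SUCCESS)
            return INFA_FALSE;
        return bMayBlock;
    }

    /* In the row notification, alternate between two input groups: suspend the
     * group that just delivered a row and let the other one flow. */
    INFA_ROWSTATUS p_demo_inputRowNotification(INFA_CT_PARTITION_HANDLE partition,
                                               INFA_CT_INPUTGROUP_HANDLE inputGroup)
    {
        const INFA_CT_INPUTGROUP_HANDLE* inputGroups = NULL;
        size_t nGroups = 0;

        inputGroups = INFA_CTGetChildrenHandles(partition, &nGroups, INPUTGROUPTYPE);

        /* ... process the row from inputGroup ... */

        if (nGroups == 2)
        {
            const INFA_CT_INPUTGROUP_HANDLE other =
                (inputGroup == inputGroups[0]) ? inputGroups[1] : inputGroups[0];

            /* Block the group we just read from, unblock the other. */
            if (INFA_CTBlockInputFlow(inputGroup) != INFA_SUCCESS ||
                INFA_CTUnblockInputFlow(other) != INFA_SUCCESS)
                return INFA_FATALERROR;
        }
        return INFA_ROWSUCCESS;
    }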
• 104. Working with Procedure Properties You can define property name and value pairs in the Custom transformation that the procedure can use when the Integration Service runs the procedure, such as during initialization. You can create user-defined properties on the following tabs of the Custom transformation: ♦ Metadata Extensions. You can specify the property name, datatype, precision, and value. Use metadata extensions for passing information to the procedure. For more information about creating metadata extensions, see “Metadata Extensions” in the Repository Guide. ♦ Initialization Properties. You can specify the property name and value. While you can define properties on both tabs in the Custom transformation, the Metadata Extensions tab lets you provide more detail for the property. Use metadata extensions to pass properties to the procedure. For example, suppose you create a Custom transformation external procedure that sorts data after transforming it. You could create a boolean metadata extension named Sort_Ascending. When you use the Custom transformation in a mapping, you can choose True or False for the metadata extension, depending on how you want the procedure to sort the data. When you define a property in the Custom transformation, use the get all property names functions, such as INFA_CTGetAllPropertyNamesM(), to access the names of all properties defined on the Initialization Properties and Metadata Extensions tabs. Use the get external property functions, such as INFA_CTGetExternalPropertyM(), to access the property name and value of a property ID you specify. Note: When you define a metadata extension and an initialization property with the same name, the property functions only return information for the metadata extension. 72 Chapter 3: Custom Transformation
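The fragment below sketches how a procedure might read the Sort_Ascending metadata extension from the example above during procedure initialization. The exact name and parameter order of INFA_CTGetExternalPropertyM() are assumptions based on the reference on page 114, and the flag variable is hypothetical.

    #include <string.h>
    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    INFA_STATUS p_demo_procInit(INFA_CT_PROCEDURE_HANDLE procedure)
    {
        const INFA_CT_TRANSFORMATION_HANDLE* transformation = NULL;
        size_t nTransformations = 0;
        const char* sValue = NULL;
        int bSortAscending = 0;   /* hypothetical flag consumed by the sort logic */

        transformation = INFA_CTGetChildrenHandles(procedure, &nTransformations,
                                                   TRANSFORMATIONTYPE);

        /* Fetch the Sort_Ascending metadata extension; assumed argument order
         * (handle, property name, address of value pointer). */
        if (nTransformations > 0 &&
            INFA_CTGetExternalPropertyM(transformation[0], "Sort_Ascending",
                                        &sValue) == INFA_SUCCESS &&
            sValue != NULL)
        {
            bSortAscending = (strcmp(sValue, "True") == 0);
        }

        /* ... stash bSortAscending where the row notification can see it ... */
        return INFA_SUCCESS;
    }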
• 105. Creating Custom Transformation Procedures You can create Custom transformation procedures that run on 32-bit or 64-bit Integration Service machines. Use the following steps as a guideline when you create a Custom transformation procedure: 1. In the Transformation Developer, create a reusable Custom transformation. Or, in the Mapplet Designer or Mapping Designer, create a non-reusable Custom transformation. 2. Generate the template code for the procedure. When you generate the procedure code, the Designer uses the information from the Custom transformation to create C source code files and makefiles. 3. Modify the C files to add the procedure logic. 4. Use a C/C++ compiler to compile and link the source code files into a DLL or shared library and copy it to the Integration Service machine. 5. Create a mapping with the Custom transformation. 6. Run the session in a workflow. This section includes an example to demonstrate this process. The steps in this section create a Custom transformation that contains two input groups and one output group. The Custom transformation procedure verifies that the Custom transformation uses two input groups and one output group. It also verifies that the number of ports in all groups is equal and that the port datatypes are the same for all groups. The procedure takes rows of data from each input group and outputs all rows to the output group. Step 1. Create the Custom Transformation The first step is to create a Custom transformation. To create a Custom transformation: 1. In the Transformation Developer, click Transformation > Create. 2. In the Create Transformation dialog box, choose Custom transformation, enter a transformation name, and click Create. In the Union example, enter CT_Inf_Union as the transformation name. 3. In the Active or Passive dialog box, create the transformation as a passive or active transformation, and click OK. In the Union example, choose Active. 4. Click Done to close the Create Transformation dialog box. 5. Open the transformation and click the Ports tab. Create groups and ports. You can edit the groups and ports later, if necessary. For more information about creating groups and ports, see “Working with Groups and Ports” on page 59. Creating Custom Transformation Procedures 73
• 106. In the Union example, create the groups and ports shown in Figure 3-6: Figure 3-6. Custom Transformation Ports Tab - Union Example (the figure shows the first input group, the second input group, and the output group) 6. Select the Properties tab and enter a module and function identifier and the runtime location. Edit other transformation properties. For more information about Custom transformation properties, see “Custom Transformation Properties” on page 64. 74 Chapter 3: Custom Transformation
  • 107. In the Union example, enter the properties shown in Figure 3-7: Figure 3-7. Custom Transformation Properties Tab - Union Example 7. Click the Metadata Extensions tab to enter metadata extensions, such as properties the external procedure might need for initialization. For more information about using metadata extensions for procedure properties, see “Working with Procedure Properties” on page 72. In the Union example, do not create metadata extensions. 8. Click the Port Attribute Definitions tab to create port attributes, if necessary. For more information about creating port attributes, see “Working with Port Attributes” on page 62. In the Union example, do not create port attributes. 9. Click OK. 10. Click Repository > Save. After you create the Custom transformation that calls the procedure, the next step is to generate the C files. Step 2. Generate the C Files After you create a Custom transformation, you generate the source code files. The Designer generates file names in lower case. Creating Custom Transformation Procedures 75
• 108. To generate the code for a Custom transformation procedure: 1. In the Transformation Developer, select the transformation and click Transformation > Generate Code. 2. Select the procedure you just created. The Designer lists the procedures as <module_name>.<procedure_name>. In the Union example, select UnionDemo.Union. 3. Specify the directory where you want to generate the files, and click Generate. In the Union example, select <client_installation_directory>/TX. The Designer creates a subdirectory, <module_name>, in the directory you specified. In the Union example, the Designer creates <client_installation_directory>/TX/UnionDemo. It also creates the following files: ♦ m_UnionDemo.c ♦ m_UnionDemo.h ♦ p_Union.c ♦ p_Union.h ♦ makefile.aix (32-bit), makefile.aix64 (64-bit), makefile.hp (32-bit), makefile.hp64 (64-bit), makefile.hpparisc64, makefile.linux (32-bit), and makefile.sol (32-bit). Step 3. Fill Out the Code with the Transformation Logic You must code the procedure C file. Optionally, you can also code the module C file. In the Union example, you fill out the procedure C file only. You do not need to fill out the module C file. To code the procedure C file: 1. Open p_<procedure_name>.c for the procedure. In the Union example, open p_Union.c. 2. Enter the C code for the procedure. 3. Save the modified file. In the Union example, use the following code:

/**************************************************************************
 *
 * Copyright (c) 2005 Informatica Corporation. This file contains
 * material proprietary to Informatica Corporation and may not be copied
 * or distributed in any form without the written permission of Informatica
 * Corporation
 *
 **************************************************************************/

76 Chapter 3: Custom Transformation
• 109.
/**************************************************************************
 * Custom Transformation p_union Procedure File
 *
 * This file contains the functions that will be called by the main
 * server executable.
 *
 * for more information on these files,
 * see $(INFA_HOME)/ExtProc/include/Readme.txt
 **************************************************************************/

/*
 * INFORMATICA 'UNION DEMO' developed using the API for custom
 * transformations.
 * File Name: p_Union.c
 *
 * An example of a custom transformation ('Union') using PowerCenter 8.0
 *
 * The purpose of the 'Union' transformation is to combine pipelines with the
 * same row definition into one pipeline (i.e. union of multiple pipelines).
 * [ Note that it does not correspond to the mathematical definition of union
 * since it does not eliminate duplicate rows.]
 *
 * This example union transformation allows N input pipelines (each
 * corresponding to an input group) to be combined into one pipeline.
 *
 * To use this transformation in a mapping, the following attributes must be
 * true:
 * a. The transformation must have >= 2 input groups and only one output group.
 * b. In the Properties tab set the following properties:
 *    i.   Module Identifier: UnionDemo
 *    ii.  Function Identifier: Union
 *    iii. Inputs Must Block: Unchecked
 *    iv.  Is Active: Checked
 *    v.   Update Strategy Transformation: Unchecked
 *    vi.  Transformation Scope: All
 *    vii. Generate Transaction: Unchecked
 *
 *    This version of the union transformation does not provide code for
 *    changing the update strategy or for generating transactions.
 * c. The input groups and the output group must have the same number of ports
 *    and the same datatypes. This is verified in the initialization of the
 *    module and the session is failed if this is not true.
 * d. The transformation can be used multiple times in a Target

Creating Custom Transformation Procedures 77
• 110.
 * Load Order Group and can also be contained within multiple partitions.
 */

/**************************************************************************
 Includes
 **************************************************************************/
#include <stdlib.h>
#include "p_union.h"

/**************************************************************************
 Forward Declarations
 **************************************************************************/
INFA_STATUS validateProperties(const INFA_CT_PARTITION_HANDLE* partition);

/**************************************************************************
 Functions
 **************************************************************************/

/**************************************************************************
 Function: p_union_procInit

 Description: Initialization for the procedure. Returns INFA_SUCCESS if
 procedure initialization succeeds, else return INFA_FAILURE.

 Input: procedure - the handle for the procedure
 Output: None
 Remarks: This function will get called once for the session at
 initialization time. It will be called after the moduleInit function.
 **************************************************************************/
INFA_STATUS p_union_procInit( INFA_CT_PROCEDURE_HANDLE procedure)
{
    const INFA_CT_TRANSFORMATION_HANDLE* transformation = NULL;
    const INFA_CT_PARTITION_HANDLE* partition = NULL;
    size_t nTransformations = 0, nPartitions = 0, i = 0;

    /* Log a message indicating beginning of the procedure initialization */
    INFA_CTLogMessageM( eESL_LOG,
        "union_demo: Procedure initialization started ..." );

    INFA_CTChangeStringMode( procedure, eASM_MBCS );

78 Chapter 3: Custom Transformation
• 111.
    /* Get the transformation handles */
    transformation = INFA_CTGetChildrenHandles( procedure,
                                                &nTransformations,
                                                TRANSFORMATIONTYPE);

    /* For each transformation verify that the 0th partition has the correct
     * properties. This does not need to be done for all partitions since rest
     * of the partitions have the same information */
    for (i = 0; i < nTransformations; i++)
    {
        /* Get the partition handle */
        partition = INFA_CTGetChildrenHandles(transformation[i],
                                              &nPartitions, PARTITIONTYPE );

        if (validateProperties(partition) != INFA_SUCCESS)
        {
            INFA_CTLogMessageM( eESL_ERROR,
                "union_demo: Failed to validate attributes of "
                "the transformation");
            return INFA_FAILURE;
        }
    }

    INFA_CTLogMessageM( eESL_LOG,
        "union_demo: Procedure initialization completed." );

    return INFA_SUCCESS;
}

/**************************************************************************
 Function: p_union_procDeinit

 Description: Deinitialization for the procedure. Returns INFA_SUCCESS if
 procedure deinitialization succeeds, else return INFA_FAILURE.

 Input: procedure - the handle for the procedure
 Output: None
 Remarks: This function will get called once for the session at
 deinitialization time. It will be called before the moduleDeinit
 function.
 **************************************************************************/
INFA_STATUS p_union_procDeinit( INFA_CT_PROCEDURE_HANDLE procedure,
                                INFA_STATUS sessionStatus )

Creating Custom Transformation Procedures 79
• 112.
{
    /* Do nothing ... */
    return INFA_SUCCESS;
}

/**************************************************************************
 Function: p_union_partitionInit

 Description: Initialization for the partition. Returns INFA_SUCCESS if
 partition initialization succeeds, else return INFA_FAILURE.

 Input: partition - the handle for the partition
 Output: None
 Remarks: This function will get called once for each partition for each
 transformation in the session.
 **************************************************************************/
INFA_STATUS p_union_partitionInit( INFA_CT_PARTITION_HANDLE partition )
{
    /* Do nothing ... */
    return INFA_SUCCESS;
}

/**************************************************************************
 Function: p_union_partitionDeinit

 Description: Deinitialization for the partition. Returns INFA_SUCCESS if
 partition deinitialization succeeds, else return INFA_FAILURE.

 Input: partition - the handle for the partition
 Output: None
 Remarks: This function will get called once for each partition for each
 transformation in the session.
 **************************************************************************/
INFA_STATUS p_union_partitionDeinit( INFA_CT_PARTITION_HANDLE partition )
{
    /* Do nothing ... */
    return INFA_SUCCESS;
}

/**************************************************************************
 Function: p_union_inputRowNotification

80 Chapter 3: Custom Transformation
• 113.
 Description: Notification that a row needs to be processed for an input
 group in a transformation for the given partition. Returns INFA_ROWSUCCESS
 if the input row was processed successfully, INFA_ROWERROR if the input
 row was not processed successfully and INFA_FATALERROR if the input row
 causes the session to fail.

 Input: partition - the handle for the partition for the given row
        group - the handle for the input group for the given row
 Output: None
 Remarks: This function is probably where the meat of your code will go,
 as it is called for every row that gets sent into your transformation.
 **************************************************************************/
INFA_ROWSTATUS p_union_inputRowNotification( INFA_CT_PARTITION_HANDLE partition,
                                             INFA_CT_INPUTGROUP_HANDLE inputGroup )
{
    const INFA_CT_OUTPUTGROUP_HANDLE* outputGroups = NULL;
    const INFA_CT_INPUTPORT_HANDLE* inputGroupPorts = NULL;
    const INFA_CT_OUTPUTPORT_HANDLE* outputGroupPorts = NULL;
    size_t nNumInputPorts = 0, nNumOutputGroups = 0,
           nNumPortsInOutputGroup = 0, i = 0;

    /* Get the output group port handles */
    outputGroups = INFA_CTGetChildrenHandles(partition,
                                             &nNumOutputGroups,
                                             OUTPUTGROUPTYPE);

    outputGroupPorts = INFA_CTGetChildrenHandles(outputGroups[0],
                                                 &nNumPortsInOutputGroup,
                                                 OUTPUTPORTTYPE);

    /* Get the input groups port handles */
    inputGroupPorts = INFA_CTGetChildrenHandles(inputGroup,
                                                &nNumInputPorts,
                                                INPUTPORTTYPE);

    /* For the union transformation, on receiving a row of input, we need to
     * output that row on the output group. */
    for (i = 0; i < nNumInputPorts; i++)
    {
        INFA_CTSetData(outputGroupPorts[i],
                       INFA_CTGetDataVoid(inputGroupPorts[i]));

Creating Custom Transformation Procedures 81
• 114.
        INFA_CTSetIndicator(outputGroupPorts[i],
                            INFA_CTGetIndicator(inputGroupPorts[i]) );

        INFA_CTSetLength(outputGroupPorts[i],
                         INFA_CTGetLength(inputGroupPorts[i]) );
    }

    /* We know there is only one output group for each partition */
    return INFA_CTOutputNotification(outputGroups[0]);
}

/**************************************************************************
 Function: p_union_eofNotification

 Description: Notification that the last row for an input group has already
 been seen. Return INFA_FAILURE if the session should fail as a result of
 seeing this notification, INFA_SUCCESS otherwise.

 Input: partition - the handle for the partition for the notification
        group - the handle for the input group for the notification
 Output: None
 **************************************************************************/
INFA_STATUS p_union_eofNotification( INFA_CT_PARTITION_HANDLE partition,
                                     INFA_CT_INPUTGROUP_HANDLE group)
{
    INFA_CTLogMessageM( eESL_LOG,
        "union_demo: An input group received an EOF notification");

    return INFA_SUCCESS;
}

/**************************************************************************
 Function: p_union_dataBdryNotification

 Description: Notification that a transaction has ended. The data boundary
 type can either be commit or rollback. Return INFA_FAILURE if the session
 should fail as a result of seeing this notification, INFA_SUCCESS
 otherwise.

 Input: partition - the handle for the partition for the notification
        transactionType - commit or rollback
 Output: None
 **************************************************************************/

82 Chapter 3: Custom Transformation
• 115.
INFA_STATUS p_union_dataBdryNotification ( INFA_CT_PARTITION_HANDLE partition,
                                           INFA_CT_DATABDRY_TYPE transactionType)
{
    /* Do nothing */
    return INFA_SUCCESS;
}

/* Helper functions */

/**************************************************************************
 Function: validateProperties

 Description: Validate that the transformation has all properties expected
 by a union transformation, such as at least two input groups, and only one
 output group. Return INFA_FAILURE if the session should fail since the
 transformation was invalid, INFA_SUCCESS otherwise.

 Input: partition - the handle for the partition
 Output: None
 **************************************************************************/
INFA_STATUS validateProperties(const INFA_CT_PARTITION_HANDLE* partition)
{
    const INFA_CT_INPUTGROUP_HANDLE* inputGroups = NULL;
    const INFA_CT_OUTPUTGROUP_HANDLE* outputGroups = NULL;
    size_t nNumInputGroups = 0, nNumOutputGroups = 0;
    const INFA_CT_INPUTPORT_HANDLE** allInputGroupsPorts = NULL;
    const INFA_CT_OUTPUTPORT_HANDLE* outputGroupPorts = NULL;
    size_t nNumPortsInOutputGroup = 0;
    size_t i = 0, nTempNumInputPorts = 0;

    /* Get the input and output group handles */
    inputGroups = INFA_CTGetChildrenHandles(partition[0],
                                            &nNumInputGroups, INPUTGROUPTYPE);

    outputGroups = INFA_CTGetChildrenHandles(partition[0],
                                             &nNumOutputGroups, OUTPUTGROUPTYPE);

    /* 1. Number of input groups must be >= 2 and number of output groups must
     *    be equal to one. */
    if (nNumInputGroups < 2 || nNumOutputGroups != 1)

Creating Custom Transformation Procedures 83
• 116.
    {
        INFA_CTLogMessageM( eESL_ERROR,
            "UnionDemo: There must be at least two input groups "
            "and only one output group");
        return INFA_FAILURE;
    }

    /* 2. Verify that the same number of ports are in each group (including
     *    output group). */
    outputGroupPorts = INFA_CTGetChildrenHandles(outputGroups[0],
                                                 &nNumPortsInOutputGroup,
                                                 OUTPUTPORTTYPE);

    /* Allocate an array for all input groups ports */
    allInputGroupsPorts = malloc(sizeof(INFA_CT_INPUTPORT_HANDLE*) *
                                 nNumInputGroups);

    for (i = 0; i < nNumInputGroups; i++)
    {
        allInputGroupsPorts[i] = INFA_CTGetChildrenHandles(inputGroups[i],
                                                           &nTempNumInputPorts,
                                                           INPUTPORTTYPE);

        if ( nNumPortsInOutputGroup != nTempNumInputPorts)
        {
            INFA_CTLogMessageM( eESL_ERROR,
                "UnionDemo: The number of ports in all input and "
                "the output group must be the same.");
            /* Release the handle array before failing */
            free(allInputGroupsPorts);
            return INFA_FAILURE;
        }
    }

    free(allInputGroupsPorts);

    /* 3. Datatypes of ports in input group 1 must match data types of all other
     *    groups. TODO:*/

    return INFA_SUCCESS;
}

84 Chapter 3: Custom Transformation
• 117. Step 4. Build the Module You can build the module on a Windows or UNIX platform. Table 3-3 lists the library file names for each platform when you build the module: Table 3-3. Module File Names

- Windows: <module_identifier>.dll
- AIX: lib<module_identifier>.a
- HP-UX: lib<module_identifier>.sl
- Linux: lib<module_identifier>.so
- Solaris: lib<module_identifier>.so

Building the Module on Windows On Windows, use Microsoft Visual C++ to build the module. To build the module on Windows: 1. Start Visual C++. 2. Click File > New. 3. In the New dialog box, click the Projects tab and select the Win32 Dynamic-Link Library option. 4. Enter the project location. In the Union example, enter <client_installation_directory>/TX/UnionDemo. 5. Enter the name of the project. You must use the module name specified for the Custom transformation as the project name. In the Union example, enter UnionDemo. 6. Click OK. Visual C++ creates a wizard to help you define the project components. 7. In the wizard, select An empty DLL project and click Finish. Click OK in the New Project Information dialog box. Visual C++ creates the project files in the directory you specified. 8. Click Project > Add To Project > Files. Creating Custom Transformation Procedures 85
• 118. 9. Navigate up a directory level. This directory contains the procedure files you created. Select all .c files and click OK. In the Union example, add the following files: ♦ m_UnionDemo.c ♦ p_Union.c 10. Click Project > Settings. 11. Click the C/C++ tab, and select Preprocessor from the Category field. 12. In the Additional Include Directories field, enter the following path and click OK: ..;<PowerCenter_install_dir>\extproc\include\ct 13. Click Build > Build <module_name>.dll or press F7 to build the project. Visual C++ creates the DLL and places it in the debug or release directory under the project directory. Building the Module on UNIX On UNIX, use any C compiler to build the module. To build the module on UNIX: 1. Copy all C files and makefiles generated by the Designer to the UNIX machine. Note: If you build the shared library on a machine other than the Integration Service machine, you must also copy the files in the following directory to the build machine: <PowerCenter_install_dir>/ExtProc/include/ct In the Union example, copy all files in <client_installation_directory>/TX/UnionDemo. 2. Set the environment variable INFA_HOME to the Integration Service installation directory. Note: If you specify an incorrect directory path for the INFA_HOME environment variable, the Integration Service cannot start. 3. Enter a command from Table 3-4 to make the project. Table 3-4. UNIX Commands to Build the Shared Library

- AIX (32-bit): make -f makefile.aix
- AIX (64-bit): make -f makefile.aix64
- HP-UX (32-bit): make -f makefile.hp
- HP-UX (64-bit): make -f makefile.hp64
- HP-UX PA-RISC: make -f makefile.hpparisc64

86 Chapter 3: Custom Transformation
• 119. Table 3-4. UNIX Commands to Build the Shared Library (continued)

- Linux: make -f makefile.linux
- Solaris: make -f makefile.sol

Step 5. Create a Mapping In the Mapping Designer, create a mapping that uses the Custom transformation. In the Union example, create a mapping similar to the one in Figure 3-8: Figure 3-8. Mapping with a Custom Transformation - Union Example In this mapping, two sources with the same ports and datatypes connect to the two input groups in the Custom transformation. The Custom transformation takes the rows from both sources and outputs them all through its one output group. The output group has the same ports and datatypes as the input groups. Step 6. Run the Session in a Workflow When you run the session, the Integration Service looks for the shared library or DLL in the runtime location you specify in the Custom transformation. To run a session in a workflow: 1. In the Workflow Manager, create a workflow. 2. Create a session for this mapping in the workflow. 3. Copy the shared library or DLL to the runtime location directory. 4. Run the workflow containing the session. When the Integration Service loads a Custom transformation bound to a procedure, it loads the DLL or shared library and calls the procedure you define. Creating Custom Transformation Procedures 87
  • 121. Chapter 4 Custom Transformation Functions This chapter includes the following topics: ♦ Overview, 90 ♦ Function Reference, 92 ♦ Working with Rows, 96 ♦ Generated Functions, 98 ♦ API Functions, 104 ♦ Array-Based API Functions, 130 ♦ Java API Functions, 138 ♦ C++ API Functions, 139 89
  • 122. Overview Custom transformations operate in conjunction with procedures you create outside of the Designer to extend PowerCenter functionality. The Custom transformation functions allow you to develop the transformation logic in a procedure you associate with a Custom transformation. PowerCenter provides two sets of functions called generated and API functions. The Integration Service uses generated functions to interface with the procedure. When you create a Custom transformation and generate the source code files, the Designer includes the generated functions in the files. Use the API functions in the procedure code to develop the transformation logic. When you write the procedure code, you can configure it to receive a block of rows from the Integration Service or a single row at a time. You can increase the procedure performance when it receives and processes a block of rows. For more information about receiving rows from the Integration Service, see “Working with Rows” on page 96. Working with Handles Most functions are associated with a handle, such as INFA_CT_PARTITION_HANDLE. The first parameter for these functions is the handle the function affects. Custom transformation handles have a hierarchical relationship to each other. A parent handle has a 1:n relationship to its child handle. 90 Chapter 4: Custom Transformation Functions
• 123. Figure 4-1 shows the Custom transformation handles: Figure 4-1. Custom Transformation Handles

INFA_CT_MODULE_HANDLE
  contains n: INFA_CT_PROC_HANDLE (each child handle has one parent)
    contains n: INFA_CT_TRANS_HANDLE
      contains n: INFA_CT_PARTITION_HANDLE
        contains n: INFA_CT_INPUTGROUP_HANDLE
          contains n: INFA_CT_INPUTPORT_HANDLE
        contains n: INFA_CT_OUTPUTGROUP_HANDLE
          contains n: INFA_CT_OUTPUTPORT_HANDLE

Table 4-1 describes the Custom transformation handles: Table 4-1. Custom Transformation Handles

- INFA_CT_MODULE_HANDLE: Represents the shared library or DLL. The external procedure can only access the module handle in its own shared library or DLL. It cannot access the module handle in any other shared library or DLL.
- INFA_CT_PROC_HANDLE: Represents a specific procedure within the shared library or DLL. You might use this handle when you need to write a function to affect a procedure referenced by multiple Custom transformations.
- INFA_CT_TRANS_HANDLE: Represents a specific Custom transformation instance in the session.
- INFA_CT_PARTITION_HANDLE: Represents a specific partition in a specific Custom transformation instance.
- INFA_CT_INPUTGROUP_HANDLE: Represents an input group in a partition.
- INFA_CT_INPUTPORT_HANDLE: Represents an input port in an input group in a partition.
- INFA_CT_OUTPUTGROUP_HANDLE: Represents an output group in a partition.
- INFA_CT_OUTPUTPORT_HANDLE: Represents an output port in an output group in a partition.

Overview 91
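As a small illustration of walking this hierarchy, the helper below descends from a partition to its input groups and ports using only calls shown in this guide (INFA_CTGetChildrenHandles() and INFA_CTLogMessageM()); the logging itself is illustrative.

    #include <stdio.h>
    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    void logInputPortCounts(INFA_CT_PARTITION_HANDLE partition)
    {
        const INFA_CT_INPUTGROUP_HANDLE* inputGroups = NULL;
        const INFA_CT_INPUTPORT_HANDLE* inputPorts = NULL;
        size_t nGroups = 0, nPorts = 0, i = 0;
        char msg[128];

        /* Partition -> input groups (a parent handle has a 1:n relationship). */
        inputGroups = INFA_CTGetChildrenHandles(partition, &nGroups, INPUTGROUPTYPE);

        for (i = 0; i < nGroups; i++)
        {
            /* Input group -> input ports. */
            inputPorts = INFA_CTGetChildrenHandles(inputGroups[i], &nPorts,
                                                   INPUTPORTTYPE);
            sprintf(msg, "demo: input group %d has %d ports", (int)i, (int)nPorts);
            INFA_CTLogMessageM(eESL_LOG, msg);
        }
    }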
• 124. Function Reference The Custom transformation functions include generated and API functions. Table 4-2 lists the Custom transformation generated functions: Table 4-2. Custom Transformation Generated Functions

- m_<module_name>_moduleInit(): Module initialization function. For more information, see “Module Initialization Function” on page 98.
- p_<proc_name>_procInit(): Procedure initialization function. For more information, see “Procedure Initialization Function” on page 99.
- p_<proc_name>_partitionInit(): Partition initialization function. For more information, see “Partition Initialization Function” on page 99.
- p_<proc_name>_inputRowNotification(): Input row notification function. For more information, see “Input Row Notification Function” on page 100.
- p_<proc_name>_dataBdryNotification(): Data boundary notification function. For more information, see “Data Boundary Notification Function” on page 101.
- p_<proc_name>_eofNotification(): End of file notification function. For more information, see “End Of File Notification Function” on page 101.
- p_<proc_name>_partitionDeinit(): Partition deinitialization function. For more information, see “Partition Deinitialization Function” on page 102.
- p_<proc_name>_procDeinit(): Procedure deinitialization function. For more information, see “Procedure Deinitialization Function” on page 102.
- m_<module_name>_moduleDeinit(): Module deinitialization function. For more information, see “Module Deinitialization Function” on page 103.

Table 4-3 lists the Custom transformation API functions: Table 4-3. Custom Transformation API Functions

- INFA_CTSetDataAccessMode(): Set data access mode function. For more information, see “Set Data Access Mode Function” on page 104.
- INFA_CTGetAncestorHandle(): Get ancestor handle function. For more information, see “Get Ancestor Handle Function” on page 105.
- INFA_CTGetChildrenHandles(): Get children handles function. For more information, see “Get Children Handles Function” on page 106.
- INFA_CTGetInputPortHandle(): Get input port handle function. For more information, see “Get Port Handle Functions” on page 107.
- INFA_CTGetOutputPortHandle(): Get output port handle function. For more information, see “Get Port Handle Functions” on page 107.

92 Chapter 4: Custom Transformation Functions
• 125. Table 4-3. Custom Transformation API Functions (continued)

- INFA_CTGetInternalProperty<datatype>(): Get internal property function. For more information, see “Get Internal Property Function” on page 108.
- INFA_CTGetAllPropertyNamesM(): Get all property names in MBCS mode function. For more information, see “Get All External Property Names (MBCS or Unicode)” on page 114.
- INFA_CTGetAllPropertyNamesU(): Get all property names in Unicode mode function. For more information, see “Get All External Property Names (MBCS or Unicode)” on page 114.
- INFA_CTGetExternalProperty<datatype>M(): Get external property in MBCS function. For more information, see “Get External Properties (MBCS or Unicode)” on page 114.
- INFA_CTGetExternalProperty<datatype>U(): Get external property in Unicode function. For more information, see “Get External Properties (MBCS or Unicode)” on page 114.
- INFA_CTRebindInputDataType(): Rebind input port datatype function. For more information, see “Rebind Datatype Functions” on page 115.
- INFA_CTRebindOutputDataType(): Rebind output port datatype function. For more information, see “Rebind Datatype Functions” on page 115.
- INFA_CTGetData<datatype>(): Get data functions. For more information, see “Get Data Functions (Row-Based Mode)” on page 118.
- INFA_CTSetData(): Set data functions. For more information, see “Set Data Function (Row-Based Mode)” on page 118.
- INFA_CTGetIndicator(): Get indicator function. For more information, see “Indicator Functions (Row-Based Mode)” on page 119.
- INFA_CTSetIndicator(): Set indicator function. For more information, see “Indicator Functions (Row-Based Mode)” on page 119.
- INFA_CTGetLength(): Get length function. For more information, see “Length Functions” on page 120.
- INFA_CTSetLength(): Set length function. For more information, see “Length Functions” on page 120.
- INFA_CTSetPassThruPort(): Set pass-through port function. For more information, see “Set Pass-Through Port Function” on page 120.
- INFA_CTOutputNotification(): Output notification function. For more information, see “Output Notification Function” on page 121.
- INFA_CTDataBdryOutputNotification(): Data boundary output notification function. For more information, see “Data Boundary Output Notification Function” on page 121.
- INFA_CTGetErrorMsgU(): Get error message in Unicode function. For more information, see “Error Functions” on page 122.
- INFA_CTGetErrorMsgM(): Get error message in MBCS function. For more information, see “Error Functions” on page 122.
- INFA_CTLogMessageU(): Log message in the session log in Unicode function. For more information, see “Session Log Message Functions” on page 123.

Function Reference 93
• 126. Table 4-3. Custom Transformation API Functions (continued)

- INFA_CTLogMessageM(): Log message in the session log in MBCS function. For more information, see “Session Log Message Functions” on page 123.
- INFA_CTIncrementErrorCount(): Increment error count function. For more information, see “Increment Error Count Function” on page 124.
- INFA_CTIsTerminateRequested(): Is terminate requested function. For more information, see “Is Terminated Function” on page 124.
- INFA_CTBlockInputFlow(): Block input groups function. For more information, see “Blocking Functions” on page 125.
- INFA_CTUnblockInputFlow(): Unblock input groups function. For more information, see “Blocking Functions” on page 125.
- INFA_CTSetUserDefinedPtr(): Set user-defined pointer function. For more information, see “Pointer Functions” on page 126.
- INFA_CTGetUserDefinedPtr(): Get user-defined pointer function. For more information, see “Pointer Functions” on page 126.
- INFA_CTChangeStringMode(): Change the string mode function. For more information, see “Change String Mode Function” on page 126.
- INFA_CTSetDataCodePageID(): Set the data code page ID function. For more information, see “Set Data Code Page Function” on page 127.
- INFA_CTGetRowStrategy(): Get row strategy function. For more information, see “Row Strategy Functions (Row-Based Mode)” on page 128.
- INFA_CTSetRowStrategy(): Set the row strategy function. For more information, see “Row Strategy Functions (Row-Based Mode)” on page 128.
- INFA_CTChangeDefaultRowStrategy(): Change the default row strategy of a transformation. For more information, see “Change Default Row Strategy Function” on page 129.

Table 4-4 lists the Custom transformation array-based functions: Table 4-4. Custom Transformation Array-Based API Functions

- INFA_CTAGetInputRowMax(): Get maximum number of input rows function. For more information, see “Maximum Number of Rows Functions” on page 130.
- INFA_CTAGetOutputRowMax(): Get maximum number of output rows function. For more information, see “Maximum Number of Rows Functions” on page 130.
- INFA_CTASetOutputRowMax(): Set maximum number of output rows function. For more information, see “Maximum Number of Rows Functions” on page 130.
- INFA_CTAGetNumRows(): Get number of rows function. For more information, see “Number of Rows Functions” on page 131.
- INFA_CTASetNumRows(): Set number of rows function. For more information, see “Number of Rows Functions” on page 131.

94 Chapter 4: Custom Transformation Functions
• 127. Table 4-4. Custom Transformation Array-Based API Functions (continued)

- INFA_CTAIsRowValid(): Is row valid function. For more information, see “Is Row Valid Function” on page 132.
- INFA_CTAGetData<datatype>(): Get data functions. For more information, see “Get Data Functions (Array-Based Mode)” on page 133.
- INFA_CTAGetIndicator(): Get indicator function. For more information, see “Get Indicator Function (Array-Based Mode)” on page 134.
- INFA_CTASetData(): Set data function. For more information, see “Set Data Function (Array-Based Mode)” on page 134.
- INFA_CTAGetRowStrategy(): Get row strategy function. For more information, see “Row Strategy Functions (Array-Based Mode)” on page 135.
- INFA_CTASetRowStrategy(): Set row strategy function. For more information, see “Row Strategy Functions (Array-Based Mode)” on page 135.
- INFA_CTASetInputErrorRowM(): Set input error row function for MBCS. For more information, see “Set Input Error Row Functions” on page 136.
- INFA_CTASetInputErrorRowU(): Set input error row function for Unicode. For more information, see “Set Input Error Row Functions” on page 136.

Function Reference 95
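The reference above lists INFA_CTIsTerminateRequested() for cooperative cancellation. The fragment below sketches how a long-running loop might poll it; the function's exact signature and return type are assumptions (see “Is Terminated Function” on page 124), so treat this as pseudocode until checked against the header files.

    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    INFA_ROWSTATUS processExpensiveRow(INFA_CT_PARTITION_HANDLE partition)
    {
        int iteration;
        for (iteration = 0; iteration < 1000000; iteration++)
        {
            /* Periodically ask whether the session has been asked to stop. */
            if ((iteration % 10000) == 0 &&
                INFA_CTIsTerminateRequested(partition))   /* assumed signature */
            {
                INFA_CTLogMessageM(eESL_LOG,
                    "demo: terminate requested, abandoning row");
                return INFA_FATALERROR;
            }
            /* ... one unit of expensive work per iteration ... */
        }
        return INFA_ROWSUCCESS;
    }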
  • 128. Working with Rows The Integration Service can pass a single row to a Custom transformation procedure or a block of rows in an array. You can write the procedure code to specify whether the procedure receives one row or a block of rows. You can increase performance when the procedure receives a block of rows: ♦ You can decrease the number of function calls the Integration Service and procedure make. The Integration Service calls the input row notification function fewer times, and the procedure calls the output notification function fewer times. ♦ You can increase the locality of memory access space for the data. ♦ You can write the procedure code to perform an algorithm on a block of data instead of each row of data. By default, the procedure receives a row of data at a time. To receive a block of rows, you must include the INFA_CTSetDataAccessMode() function to change the data access mode to array-based. When the data access mode is array-based, you must use the array-based data handling and row strategy functions to access and output the data. When the data access mode is row-based, you must use the row-based data handling and row strategy functions to access and output the data. All array-based functions use the prefix INFA_CTA. All other functions use the prefix INFA_CT. For more information about the array-based functions, see “Array-Based API Functions” on page 130. Use the following steps to write the procedure code to access a block of rows: 1. Call INFA_CTSetDataAccessMode() during the procedure initialization, to change the data access mode to array-based. 2. When you create a passive Custom transformation, you can also call INFA_CTSetPassThruPort() during procedure initialization to pass through the data for input/output ports. When a block of data reaches the Custom transformation procedure, the Integration Service calls p_<proc_name>_inputRowNotification() for each block of data. Perform the rest of the steps inside this function. 3. Call INFA_CTAGetNumRows() using the input group handle in the input row notification function to find the number of rows in the current block. 4. Call one of the INFA_CTAGetData<datatype>() functions using the input port handle to get the data for a particular row in the block. 5. Call INFA_CTASetData to output rows in a block. 6. Before calling INFA_CTOutputNotification(), call INFA_CTASetNumRows() to notify the Integration Service of the number of rows the procedure is outputting in the block. 7. Call INFA_CTOutputNotification(). 96 Chapter 4: Custom Transformation Functions
  • 129. Rules and Guidelines Use the following rules and guidelines when you write the procedure code to use either row- based or array-based data access mode: ♦ In row-based mode, you can return INFA_ROWERROR in the input row notification function to indicate the function encountered an error for the row of data on input. The Integration Service increments the internal error count. ♦ In array-based mode, do not return INFA_ROWERROR in the input row notification function. The Integration Service treats that as a fatal error. If you need to indicate a row in a block has an error, call the INFA_CTASetInputErrorRowM() or INFA_CTASetInputErrorRowU() function. ♦ In row-based mode, the Integration Service only passes valid rows to the procedure. ♦ In array-based mode, an input block may contain invalid rows, such as dropped, filtered, or error rows. Call INFA_CTAIsRowValid() to determine if a row in a block is valid. ♦ In array-based mode, do not call INFA_CTASetNumRows() for a passive Custom transformation. You can call this function for active Custom transformations. ♦ In array-based mode, call INFA_CTOutputNotification() once. ♦ In array-based mode, you can call INFA_CTSetPassThruPort() only for passive Custom transformations. ♦ In array-based mode for passive Custom transformations, you must output all rows in an output block, including any error row. Working with Rows 97
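The steps and rules above fold into the following sketch of an array-based, active pass-through procedure with one input and one output group. The INFA_CTAGetDataVoid() name, the argument list of INFA_CTASetData() (which in the real API may also take length and indicator parameters), and the boolean return of INFA_CTAIsRowValid() are assumptions based on the array-based function reference; verify them against “Array-Based API Functions” on page 130 before use.

    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    INFA_STATUS p_demo_procInit(INFA_CT_PROCEDURE_HANDLE procedure)
    {
        /* Step 1: request blocks of rows instead of single rows. */
        return INFA_CTSetDataAccessMode(procedure, eDA_ARRAY);
    }

    INFA_ROWSTATUS p_demo_inputRowNotification(INFA_CT_PARTITION_HANDLE partition,
                                               INFA_CT_INPUTGROUP_HANDLE inputGroup)
    {
        const INFA_CT_OUTPUTGROUP_HANDLE* outputGroups = NULL;
        const INFA_CT_INPUTPORT_HANDLE* inputPorts = NULL;
        const INFA_CT_OUTPUTPORT_HANDLE* outputPorts = NULL;
        size_t nOutputGroups = 0, nInputPorts = 0, nOutputPorts = 0;
        size_t i = 0, p = 0, nRows = 0, nRowsOut = 0;

        outputGroups = INFA_CTGetChildrenHandles(partition, &nOutputGroups,
                                                 OUTPUTGROUPTYPE);
        outputPorts = INFA_CTGetChildrenHandles(outputGroups[0], &nOutputPorts,
                                                OUTPUTPORTTYPE);
        inputPorts = INFA_CTGetChildrenHandles(inputGroup, &nInputPorts,
                                               INPUTPORTTYPE);

        /* Step 3: size of the incoming block. */
        nRows = INFA_CTAGetNumRows(inputGroup);

        for (i = 0; i < nRows; i++)
        {
            /* Rule: array-based blocks can contain dropped, filtered, or
             * error rows, so skip invalid ones. */
            if (!INFA_CTAIsRowValid(inputGroup, i))
                continue;

            /* Steps 4 and 5: copy each valid row into the output block
             * (argument order assumed). */
            for (p = 0; p < nInputPorts; p++)
                INFA_CTASetData(outputPorts[p], nRowsOut,
                                INFA_CTAGetDataVoid(inputPorts[p], i));
            nRowsOut++;
        }

        /* Step 6: declare the output block size (allowed here because the
         * transformation is active). */
        INFA_CTASetNumRows(outputGroups[0], nRowsOut);

        /* Step 7: exactly one output notification per input block. */
        return INFA_CTOutputNotification(outputGroups[0]);
    }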
• 130. Generated Functions When you use the Designer to generate the procedure code, the Designer includes a set of functions called generated functions in the m_<module_name>.c and p_<procedure_name>.c files. The Integration Service uses the generated functions to interface with the procedure. When you run a session, the Integration Service calls these generated functions in the following order for each target load order group in the mapping: 1. Initialization functions 2. Notification functions 3. Deinitialization functions Initialization Functions The Integration Service first calls the initialization functions. Use the initialization functions to write processes you want the Integration Service to run before it passes data to the Custom transformation. Writing code in the initialization functions reduces processing overhead because the Integration Service runs these processes only once for a module, procedure, or partition. The Designer generates the following initialization functions: ♦ m_<module_name>_moduleInit(). For more information, see “Module Initialization Function” on page 98. ♦ p_<proc_name>_procInit(). For more information, see “Procedure Initialization Function” on page 99. ♦ p_<proc_name>_partitionInit(). For more information, see “Partition Initialization Function” on page 99. Module Initialization Function The Integration Service calls the m_<module_name>_moduleInit() function during session initialization, before it runs the pre-session tasks. It calls this function once for a module, before all other functions. If you want the Integration Service to run a specific process when it loads the module, you must include it in this function. For example, you might write code to create global structures that procedures within this module access. Use the following syntax:

INFA_STATUS m_<module_name>_moduleInit(INFA_CT_MODULE_HANDLE module);

Arguments:
- module (INFA_CT_MODULE_HANDLE, Input): Module handle.

98 Chapter 4: Custom Transformation Functions
• 131. The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Procedure Initialization Function The Integration Service calls the p_<proc_name>_procInit() function during session initialization, before it runs the pre-session tasks and after it runs the module initialization function. The Integration Service calls this function once for each procedure in the module. Write code in this function when you want the Integration Service to run a process for a particular procedure. You can also enter some API functions in the procedure initialization function, such as navigation and property functions. Use the following syntax:

INFA_STATUS p_<proc_name>_procInit(INFA_CT_PROCEDURE_HANDLE procedure);

Arguments:
- procedure (INFA_CT_PROCEDURE_HANDLE, Input): Procedure handle.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Partition Initialization Function The Integration Service calls the p_<proc_name>_partitionInit() function before it passes data to the Custom transformation. The Integration Service calls this function once for each partition at a Custom transformation instance. If you want the Integration Service to run a specific process before it passes data through a partition of the Custom transformation, you must include it in this function. Use the following syntax:

INFA_STATUS p_<proc_name>_partitionInit(INFA_CT_PARTITION_HANDLE partition);

Arguments:
- partition (INFA_CT_PARTITION_HANDLE, Input): Partition handle.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Note: When the Custom transformation requires one thread for each partition, you can include thread-specific operations in the partition initialization function. For more Generated Functions 99
• 132. information about working with thread-specific procedure code, see “Working with Thread-Specific Procedure Code” on page 66. Notification Functions The Integration Service calls the notification functions when it passes a row of data to the Custom transformation. The Designer generates the following notification functions: ♦ p_<proc_name>_inputRowNotification(). For more information, see “Input Row Notification Function” on page 100. ♦ p_<proc_name>_dataBdryNotification(). For more information, see “Data Boundary Notification Function” on page 101. ♦ p_<proc_name>_eofNotification(). For more information, see “End Of File Notification Function” on page 101. Note: When the Custom transformation requires one thread for each partition, you can include thread-specific operations in the notification functions. For more information about working with thread-specific procedure code, see “Working with Thread-Specific Procedure Code” on page 66. Input Row Notification Function The Integration Service calls the p_<proc_name>_inputRowNotification() function when it passes a row or a block of rows to the Custom transformation. It notes which input group and partition receives data through the input group handle and partition handle. Use the following syntax:

INFA_ROWSTATUS p_<proc_name>_inputRowNotification(INFA_CT_PARTITION_HANDLE partition, INFA_CT_INPUTGROUP_HANDLE group);

Arguments:
- partition (INFA_CT_PARTITION_HANDLE, Input): Partition handle.
- group (INFA_CT_INPUTGROUP_HANDLE, Input): Input group handle.

The datatype of the return value is INFA_ROWSTATUS. Use the following values for the return value:
♦ INFA_ROWSUCCESS. Indicates the function successfully processed the row of data.
♦ INFA_ROWERROR. Indicates the function encountered an error for the row of data. The Integration Service increments the internal error count. Only return this value when the data access mode is row. If the input row notification function returns INFA_ROWERROR in array-based mode, the Integration Service treats it as a fatal error. If you need to indicate a row in a block has 100 Chapter 4: Custom Transformation Functions
• 133. an error, call the INFA_CTASetInputErrorRowM() or INFA_CTASetInputErrorRowU() function.
♦ INFA_FATALERROR. Indicates the function encountered a fatal error for the row of data or the block of data. The Integration Service fails the session.

Data Boundary Notification Function The Integration Service calls the p_<proc_name>_dataBdryNotification() function when it passes a commit or rollback row to a partition. Use the following syntax:

INFA_STATUS p_<proc_name>_dataBdryNotification(INFA_CT_PARTITION_HANDLE partition, INFA_CTDataBdryType dataBoundaryType);

Arguments:
- partition (INFA_CT_PARTITION_HANDLE, Input): Partition handle.
- dataBoundaryType (INFA_CTDataBdryType, Input): The Integration Service uses one of the following values for the dataBoundaryType parameter: eBT_COMMIT or eBT_ROLLBACK.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. End Of File Notification Function The Integration Service calls the p_<proc_name>_eofNotification() function after it passes the last row to a partition in an input group. Use the following syntax:

INFA_STATUS p_<proc_name>_eofNotification(INFA_CT_PARTITION_HANDLE partition, INFA_CT_INPUTGROUP_HANDLE group);

Arguments:
- partition (INFA_CT_PARTITION_HANDLE, Input): Partition handle.
- group (INFA_CT_INPUTGROUP_HANDLE, Input): Input group handle.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Generated Functions 101
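The sketch below shows one common use of the EOF notification: flushing accumulated state as a final summary row. The running total and single-port output group are hypothetical, the set data/length calls mirror the Union example earlier in this guide, and the sketch assumes the Integration Service accepts an output notification from the EOF callback for an active transformation.

    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    static int g_runningTotal = 0;   /* per-partition state simplified to a global
                                        for brevity; real code would use the
                                        pointer functions on the partition handle */

    INFA_STATUS p_demo_eofNotification(INFA_CT_PARTITION_HANDLE partition,
                                       INFA_CT_INPUTGROUP_HANDLE group)
    {
        const INFA_CT_OUTPUTGROUP_HANDLE* outputGroups = NULL;
        const INFA_CT_OUTPUTPORT_HANDLE* outputPorts = NULL;
        size_t nOutputGroups = 0, nOutputPorts = 0;

        outputGroups = INFA_CTGetChildrenHandles(partition, &nOutputGroups,
                                                 OUTPUTGROUPTYPE);
        outputPorts = INFA_CTGetChildrenHandles(outputGroups[0], &nOutputPorts,
                                                OUTPUTPORTTYPE);

        /* Write the total into the first output port and emit the summary row. */
        INFA_CTSetData(outputPorts[0], &g_runningTotal);
        INFA_CTSetLength(outputPorts[0], sizeof(g_runningTotal));

        if (INFA_CTOutputNotification(outputGroups[0]) != INFA_ROWSUCCESS)
            return INFA_FAILURE;

        return INFA_SUCCESS;
    }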
• 134. Deinitialization Functions The Integration Service calls the deinitialization functions after it processes data for the Custom transformation. Use the deinitialization functions to write processes you want the Integration Service to run after it passes all rows of data to the Custom transformation. The Designer generates the following deinitialization functions: ♦ p_<proc_name>_partitionDeinit(). For more information, see “Partition Deinitialization Function” on page 102. ♦ p_<proc_name>_procDeinit(). For more information, see “Procedure Deinitialization Function” on page 102. ♦ m_<module_name>_moduleDeinit(). For more information, see “Module Deinitialization Function” on page 103. Note: When the Custom transformation requires one thread for each partition, you can include thread-specific operations in the initialization and deinitialization functions. For more information about working with thread-specific procedure code, see “Working with Thread-Specific Procedure Code” on page 66. Partition Deinitialization Function The Integration Service calls the p_<proc_name>_partitionDeinit() function after it calls the p_<proc_name>_eofNotification() or p_<proc_name>_abortNotification() function. The Integration Service calls this function once for each partition of the Custom transformation. Use the following syntax:

INFA_STATUS p_<proc_name>_partitionDeinit(INFA_CT_PARTITION_HANDLE partition);

Arguments:
- partition (INFA_CT_PARTITION_HANDLE, Input): Partition handle.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Note: When the Custom transformation requires one thread for each partition, you can include thread-specific operations in the partition deinitialization function. For more information about working with thread-specific procedure code, see “Working with Thread-Specific Procedure Code” on page 66. Procedure Deinitialization Function The Integration Service calls the p_<proc_name>_procDeinit() function after it calls the p_<proc_name>_partitionDeinit() function for all partitions of each Custom transformation instance that uses this procedure in the mapping. 102 Chapter 4: Custom Transformation Functions
• 135. Use the following syntax:

INFA_STATUS p_<proc_name>_procDeinit(INFA_CT_PROCEDURE_HANDLE procedure, INFA_STATUS sessionStatus);

Arguments:
- procedure (INFA_CT_PROCEDURE_HANDLE, Input): Procedure handle.
- sessionStatus (INFA_STATUS, Input): The Integration Service uses one of the following values for the sessionStatus parameter: INFA_SUCCESS (indicates the session succeeded) or INFA_FAILURE (indicates the session failed).

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Module Deinitialization Function The Integration Service calls the m_<module_name>_moduleDeinit() function after it runs the post-session tasks. It calls this function once for a module, after all other functions. Use the following syntax:

INFA_STATUS m_<module_name>_moduleDeinit(INFA_CT_MODULE_HANDLE module, INFA_STATUS sessionStatus);

Arguments:
- module (INFA_CT_MODULE_HANDLE, Input): Module handle.
- sessionStatus (INFA_STATUS, Input): The Integration Service uses one of the following values for the sessionStatus parameter: INFA_SUCCESS (indicates the session succeeded) or INFA_FAILURE (indicates the session failed).

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value. When the function returns INFA_FAILURE, the Integration Service fails the session. Generated Functions 103
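A sketch pairing the module init and deinit functions: a cache shared by all procedures in the module is allocated once, parked on the module handle with the pointer functions (page 126; their signatures are assumed here), and released after the post-session tasks. The cache structure itself is hypothetical.

    #include <stdlib.h>
    #include "p_union.h"   /* generated procedure header; name follows the Union example */

    typedef struct DemoModuleCache
    {
        int* lookupTable;   /* hypothetical shared lookup data */
        size_t nEntries;
    } DemoModuleCache;

    INFA_STATUS m_demo_moduleInit(INFA_CT_MODULE_HANDLE module)
    {
        DemoModuleCache* cache = (DemoModuleCache*)calloc(1, sizeof(DemoModuleCache));
        if (!cache)
            return INFA_FAILURE;   /* failing init fails the session */

        /* Stash the cache on the module handle so every procedure and
         * partition in this module can retrieve it later (assumed signature). */
        INFA_CTSetUserDefinedPtr(module, cache);
        return INFA_SUCCESS;
    }

    INFA_STATUS m_demo_moduleDeinit(INFA_CT_MODULE_HANDLE module,
                                    INFA_STATUS sessionStatus)
    {
        /* Retrieve and free the cache regardless of session outcome. */
        DemoModuleCache* cache = (DemoModuleCache*)INFA_CTGetUserDefinedPtr(module);
        if (cache)
        {
            free(cache->lookupTable);
            free(cache);
        }
        return INFA_SUCCESS;
    }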
  • 136. API Functions PowerCenter provides a set of API functions that you use to develop the transformation logic. When the Designer generates the source code files, it includes the generated functions in the source code. Add API functions to the code to implement the transformation logic. The procedure uses the API functions to interface with the Integration Service. You must code API functions in the procedure C file. Optionally, you can also code the module C file. Informatica provides the following groups of API functions: ♦ Set data access mode. See “Set Data Access Mode Function” on page 104. ♦ Navigation. See “Navigation Functions” on page 105. ♦ Property. See “Property Functions” on page 108. ♦ Rebind datatype. See “Rebind Datatype Functions” on page 115. ♦ Data handling (row-based mode). See “Data Handling Functions (Row-Based Mode)” on page 117. ♦ Set pass-through port. See “Set Pass-Through Port Function” on page 120. ♦ Output notification. See “Output Notification Function” on page 121. ♦ Data boundary output notification. See “Data Boundary Output Notification Function” on page 121. ♦ Error. See “Error Functions” on page 122. ♦ Session log message. See “Session Log Message Functions” on page 123. ♦ Increment error count. See “Increment Error Count Function” on page 124. ♦ Is terminated. See “Is Terminated Function” on page 124. ♦ Blocking. See “Blocking Functions” on page 125. ♦ Pointer. See “Pointer Functions” on page 126. ♦ Change string mode. See “Change String Mode Function” on page 126. ♦ Set data code page. See “Set Data Code Page Function” on page 127. ♦ Row strategy (row-based mode). See “Row Strategy Functions (Row-Based Mode)” on page 128. ♦ Change default row strategy. See “Change Default Row Strategy Function” on page 129. Informatica also provides array-based API Functions. For more information about array-based API functions, see “Array-Based API Functions” on page 130. Set Data Access Mode Function By default, the Integration Service passes data to the Custom transformation procedure one row at a time. However, use the INFA_CTSetDataAccessMode() function to change the data access mode to array-based. When you set the data access mode to array-based, the Integration Service passes multiple rows to the procedure as a block in an array. 104 Chapter 4: Custom Transformation Functions
When you set the data access mode to array-based, you must use the array-based versions of the data handling functions and row strategy functions. If you use a row-based data handling or row strategy function in array-based mode, you get unexpected results. For example, the DLL or shared library might crash.

You can only use this function in the procedure initialization function. If you do not use this function in the procedure code, the data access mode is row-based. You can also include this function and explicitly set the access mode to row-based. For more information about the array-based functions, see “Array-Based API Functions” on page 130.

Use the following syntax:

   INFA_STATUS INFA_CTSetDataAccessMode( INFA_CT_PROCEDURE_HANDLE procedure, INFA_CT_DATA_ACCESS_MODE mode );

   Argument    Datatype                     Input/Output  Description
   procedure   INFA_CT_PROCEDURE_HANDLE     Input         Procedure handle.
   mode        INFA_CT_DATA_ACCESS_MODE     Input         Data access mode. Use the following values for the mode
                                                          parameter:
                                                          - eDA_ROW
                                                          - eDA_ARRAY

Navigation Functions

Use the navigation functions when you want the procedure to navigate through the handle hierarchy. For more information about handles, see “Working with Handles” on page 90.

PowerCenter provides the following navigation functions:
♦ INFA_CTGetAncestorHandle(). For more information, see “Get Ancestor Handle Function” on page 105.
♦ INFA_CTGetChildrenHandles(). For more information, see “Get Children Handles Function” on page 106.
♦ INFA_CTGetInputPortHandle(). For more information, see “Get Port Handle Functions” on page 107.
♦ INFA_CTGetOutputPortHandle(). For more information, see “Get Port Handle Functions” on page 107.

Get Ancestor Handle Function

Use the INFA_CTGetAncestorHandle() function when you want the procedure to access a parent handle of a given handle.
Use the following syntax:

   INFA_CT_HANDLE INFA_CTGetAncestorHandle(INFA_CT_HANDLE handle, INFA_CTHandleType returnHandleType);

   Argument           Datatype            Input/Output  Description
   handle             INFA_CT_HANDLE      Input         Handle name.
   returnHandleType   INFA_CTHandleType   Input         Return handle type. Use the following values for the
                                                        returnHandleType parameter:
                                                        - PROCEDURETYPE
                                                        - TRANSFORMATIONTYPE
                                                        - PARTITIONTYPE
                                                        - INPUTGROUPTYPE
                                                        - OUTPUTGROUPTYPE
                                                        - INPUTPORTTYPE
                                                        - OUTPUTPORTTYPE

The handle parameter specifies the handle whose parent you want the procedure to access. The Integration Service returns INFA_CT_HANDLE if you specify a valid handle in the function. Otherwise, it returns a null value.

To avoid compilation errors, you must code the procedure to set a handle name to the return value. For example, you can enter the following code:

   INFA_CT_MODULE_HANDLE module = INFA_CTGetAncestorHandle(procedureHandle, INFA_CT_HandleType);

Get Children Handles Function

Use the INFA_CTGetChildrenHandles() function when you want the procedure to access the children handles of a given handle.

Use the following syntax:

   INFA_CT_HANDLE* INFA_CTGetChildrenHandles(INFA_CT_HANDLE handle, size_t* pnChildrenHandles, INFA_CTHandleType returnHandleType);

   Argument            Datatype            Input/Output  Description
   handle              INFA_CT_HANDLE      Input         Handle name.
   pnChildrenHandles   size_t*             Output        The Integration Service returns an array of children handles.
                                                         The pnChildrenHandles parameter indicates the number of
                                                         children handles in the array.
   returnHandleType    INFA_CTHandleType   Input         Use the following values for the returnHandleType parameter:
                                                         - PROCEDURETYPE
                                                         - TRANSFORMATIONTYPE
                                                         - PARTITIONTYPE
                                                         - INPUTGROUPTYPE
                                                         - OUTPUTGROUPTYPE
                                                         - INPUTPORTTYPE
                                                         - OUTPUTPORTTYPE

The handle parameter specifies the handle whose children you want the procedure to access. The Integration Service returns INFA_CT_HANDLE* when you specify a valid handle in the function. Otherwise, it returns a null value.

To avoid compilation errors, you must code the procedure to set a handle name to the returned value. For example, you can enter the following code:

   INFA_CT_PARTITION_HANDLE partition = INFA_CTGetChildrenHandles(procedureHandle, pnChildrenHandles, INFA_CT_PARTITION_HANDLE_TYPE);

Get Port Handle Functions

The Integration Service associates the INFA_CT_INPUTPORT_HANDLE with input and input/output ports, and the INFA_CT_OUTPUTPORT_HANDLE with output and input/output ports.

PowerCenter provides the following get port handle functions:
♦ INFA_CTGetInputPortHandle(). Use this function when the procedure knows the output port handle for an input/output port and needs the input port handle. Use the following syntax:

   INFA_CT_INPUTPORT_HANDLE INFA_CTGetInputPortHandle(INFA_CT_OUTPUTPORT_HANDLE outputPortHandle);

   Argument           Datatype                     Input/Output  Description
   outputPortHandle   INFA_CT_OUTPUTPORT_HANDLE    Input         Output port handle.

♦ INFA_CTGetOutputPortHandle(). Use this function when the procedure knows the input port handle for an input/output port and needs the output port handle.
Use the following syntax:

   INFA_CT_OUTPUTPORT_HANDLE INFA_CTGetOutputPortHandle(INFA_CT_INPUTPORT_HANDLE inputPortHandle);

   Argument          Datatype                    Input/Output  Description
   inputPortHandle   INFA_CT_INPUTPORT_HANDLE    Input         Input port handle.

The Integration Service returns NULL when you use the get port handle functions with input or output ports.

Property Functions

Use the property functions when you want the procedure to access the Custom transformation properties. The property functions access properties on the following tabs of the Custom transformation:
♦ Ports
♦ Properties
♦ Initialization Properties
♦ Metadata Extensions
♦ Port Attribute Definitions

Use the following property functions in initialization functions:
♦ INFA_CTGetInternalProperty<datatype>(). For more information, see “Get Internal Property Function” on page 108.
♦ INFA_CTGetAllPropertyNamesM(). For more information, see “Get All External Property Names (MBCS or Unicode)” on page 114.
♦ INFA_CTGetAllPropertyNamesU(). For more information, see “Get All External Property Names (MBCS or Unicode)” on page 114.
♦ INFA_CTGetExternalProperty<datatype>M(). For more information, see “Get External Properties (MBCS or Unicode)” on page 114.
♦ INFA_CTGetExternalProperty<datatype>U(). For more information, see “Get External Properties (MBCS or Unicode)” on page 114.

Get Internal Property Function

PowerCenter provides functions to access the port attributes specified on the Ports tab, and properties specified for attributes on the Properties tab of the Custom transformation.

The Integration Service associates each port and property attribute with a property ID. You must specify the property ID in the procedure to access the values specified for the attributes. For more information about property IDs, see “Port and Property Attribute Property IDs” on page 109. For the handle parameter, specify a handle name from the handle hierarchy. The Integration Service fails the session if the handle name is invalid.
Use the following functions when you want the procedure to access the properties:
♦ INFA_CTGetInternalPropertyStringM(). Accesses a value of type string in MBCS for a given property ID. Use the following syntax:

   INFA_STATUS INFA_CTGetInternalPropertyStringM( INFA_CT_HANDLE handle, size_t propId, const char** psPropValue );

♦ INFA_CTGetInternalPropertyStringU(). Accesses a value of type string in Unicode for a given property ID. Use the following syntax:

   INFA_STATUS INFA_CTGetInternalPropertyStringU( INFA_CT_HANDLE handle, size_t propId, const INFA_UNICHAR** psPropValue );

♦ INFA_CTGetInternalPropertyInt32(). Accesses a value of type integer for a given property ID. Use the following syntax:

   INFA_STATUS INFA_CTGetInternalPropertyInt32( INFA_CT_HANDLE handle, size_t propId, INFA_INT32* pnPropValue );

♦ INFA_CTGetInternalPropertyBool(). Accesses a value of type Boolean for a given property ID. Use the following syntax:

   INFA_STATUS INFA_CTGetInternalPropertyBool( INFA_CT_HANDLE handle, size_t propId, INFA_BOOLEN* pbPropValue );

♦ INFA_CTGetInternalPropertyINFA_PTR(). Accesses a pointer to a value for a given property ID. Use the following syntax:

   INFA_STATUS INFA_CTGetInternalPropertyINFA_PTR( INFA_CT_HANDLE handle, size_t propId, INFA_PTR* pvPropValue );

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Port and Property Attribute Property IDs

The following tables list the property IDs for the port and property attributes in the Custom transformation. Each table lists a Custom transformation handle and the property IDs you can access with the handle in a property function.

Table 4-5 lists INFA_CT_MODULE_HANDLE property IDs:

Table 4-5. INFA_CT_MODULE Property IDs

   Handle Property ID                       Datatype  Description
   INFA_CT_MODULE_NAME                      String    Specifies the module name.
   INFA_CT_SESSION_INFA_VERSION             String    Specifies the Informatica version.
   INFA_CT_SESSION_CODE_PAGE                Integer   Specifies the Integration Service code page.
   INFA_CT_SESSION_DATAMOVEMENT_MODE        Integer   Specifies the data movement mode. The Integration Service returns
                                                      one of the following values:
                                                      - eASM_MBCS
                                                      - eASM_UNICODE
   INFA_CT_SESSION_VALIDATE_CODEPAGE        Boolean   Specifies whether the Integration Service enforces code page
                                                      validation.
   INFA_CT_SESSION_PROD_INSTALL_DIR         String    Specifies the Integration Service installation directory.
   INFA_CT_SESSION_HIGH_PRECISION_MODE      Boolean   Specifies whether the session is configured for high precision.
   INFA_CT_MODULE_RUNTIME_DIR               String    Specifies the runtime directory for the DLL or shared library.
   INFA_CT_SESSION_IS_UPD_STR_ALLOWED       Boolean   Specifies whether the Update Strategy Transformation property is
                                                      selected in the transformation.
   INFA_CT_TRANS_OUTPUT_IS_REPEATABLE       Integer   Specifies whether the Custom transformation produces data in the
                                                      same order in every session run. The Integration Service returns
                                                      one of the following values:
                                                      - eOUTREPEAT_NEVER = 1
                                                      - eOUTREPEAT_ALWAYS = 2
                                                      - eOUTREPEAT_BASED_ON_INPUT_ORDER = 3
   INFA_CT_TRANS_FATAL_ERROR                Boolean   Specifies if the Custom transformation caused a fatal error. The
                                                      Integration Service returns one of the following values:
                                                      - INFA_TRUE
                                                      - INFA_FALSE

Table 4-6 lists INFA_CT_PROC_HANDLE property IDs:

Table 4-6. INFA_CT_PROC_HANDLE Property IDs

   Handle Property ID                       Datatype  Description
   INFA_CT_PROCEDURE_NAME                   String    Specifies the Custom transformation procedure name.

Table 4-7 lists INFA_CT_TRANS_HANDLE property IDs:

Table 4-7. INFA_CT_TRANS_HANDLE Property IDs

   Handle Property ID                       Datatype  Description
   INFA_CT_TRANS_INSTANCE_NAME              String    Specifies the Custom transformation instance name.
   INFA_CT_TRANS_TRACE_LEVEL                Integer   Specifies the tracing level. The Integration Service returns one
                                                      of the following values:
                                                      - eTRACE_TERSE
                                                      - eTRACE_NORMAL
                                                      - eTRACE_VERBOSE_INIT
                                                      - eTRACE_VERBOSE_DATA
   INFA_CT_TRANS_MAY_BLOCK_DATA             Boolean   Specifies if the Integration Service allows the procedure to
                                                      block input data in the current session.
   INFA_CT_TRANS_MUST_BLOCK_DATA            Boolean   Specifies if the Inputs Must Block Custom transformation
                                                      property is selected.
   INFA_CT_TRANS_ISACTIVE                   Boolean   Specifies whether the Custom transformation is an active or
                                                      passive transformation.
   INFA_CT_TRANS_ISPARTITIONABLE            Boolean   Specifies if you can partition sessions that use this Custom
                                                      transformation.
   INFA_CT_TRANS_IS_UPDATE_STRATEGY         Boolean   Specifies if the Custom transformation behaves like an Update
                                                      Strategy transformation.
   INFA_CT_TRANS_DEFAULT_UPDATE_STRATEGY    Integer   Specifies the default update strategy:
                                                      - eDUS_INSERT
                                                      - eDUS_UPDATE
                                                      - eDUS_DELETE
                                                      - eDUS_REJECT
                                                      - eDUS_PASSTHROUGH
   INFA_CT_TRANS_NUM_PARTITIONS             Integer   Specifies the number of partitions in the sessions that use this
                                                      Custom transformation.
   INFA_CT_TRANS_DATACODEPAGE               Integer   Specifies the code page in which the Integration Service passes
                                                      data to the Custom transformation. Use the set data code page
                                                      function if you want the Custom transformation to access data in
                                                      a different code page. For more information, see “Set Data Code
                                                      Page Function” on page 127.
   INFA_CT_TRANS_TRANSFORM_SCOPE            Integer   Specifies the transformation scope in the Custom transformation.
                                                      The Integration Service returns one of the following values:
                                                      - eTS_ROW
                                                      - eTS_TRANSACTION
                                                      - eTS_ALLINPUT
   INFA_CT_TRANS_GENERATE_TRANSACT          Boolean   Specifies if the Generate Transaction property is enabled. The
                                                      Integration Service returns one of the following values:
                                                      - INFA_TRUE
                                                      - INFA_FALSE
   INFA_CT_TRANS_OUTPUT_IS_REPEATABLE       Integer   Specifies whether the Custom transformation produces data in the
                                                      same order in every session run. The Integration Service returns
                                                      one of the following values:
                                                      - eOUTREPEAT_NEVER = 1
                                                      - eOUTREPEAT_ALWAYS = 2
                                                      - eOUTREPEAT_BASED_ON_INPUT_ORDER = 3
   INFA_CT_TRANS_FATAL_ERROR                Boolean   Specifies if the Custom transformation caused a fatal error. The
                                                      Integration Service returns one of the following values:
                                                      - INFA_TRUE
                                                      - INFA_FALSE

Table 4-8 lists INFA_CT_INPUT_GROUP_HANDLE and INFA_CT_OUTPUT_GROUP_HANDLE property IDs:

Table 4-8. INFA_CT_INPUT_GROUP and INFA_CT_OUTPUT_GROUP Handle Property IDs

   Handle Property ID                       Datatype  Description
   INFA_CT_GROUP_NAME                       String    Specifies the group name.
   INFA_CT_GROUP_NUM_PORTS                  Integer   Specifies the number of ports in the group.
   INFA_CT_GROUP_ISCONNECTED                Boolean   Specifies if all ports in a group are connected to another
                                                      transformation.
   INFA_CT_PORT_NAME                        String    Specifies the port name.
   INFA_CT_PORT_CDATATYPE                   Integer   Specifies the port datatype. The Integration Service returns one
                                                      of the following values:
                                                      - eINFA_CTYPE_SHORT
                                                      - eINFA_CTYPE_INT32
                                                      - eINFA_CTYPE_CHAR
                                                      - eINFA_CTYPE_RAW
                                                      - eINFA_CTYPE_UNICHAR
                                                      - eINFA_CTYPE_TIME
                                                      - eINFA_CTYPE_FLOAT
                                                      - eINFA_CTYPE_DOUBLE
                                                      - eINFA_CTYPE_DECIMAL18_FIXED
                                                      - eINFA_CTYPE_DECIMAL28_FIXED
                                                      - eINFA_CTYPE_INFA_CTDATETIME
   INFA_CT_PORT_PRECISION                   Integer   Specifies the port precision.
   INFA_CT_PORT_SCALE                       Integer   Specifies the port scale (if applicable).
   INFA_CT_PORT_IS_MAPPED                   Boolean   Specifies whether the port is linked to other transformations in
                                                      the mapping.
   INFA_CT_PORT_STORAGESIZE                 Integer   Specifies the internal storage size of the data for a port. The
                                                      storage size depends on the datatype of the port.
   INFA_CT_PORT_BOUNDDATATYPE               Integer   Specifies the port datatype. Use instead of
                                                      INFA_CT_PORT_CDATATYPE if you rebind the port and specify a
                                                      datatype other than the default. For more information about
                                                      rebinding a port, see “Rebind Datatype Functions” on page 115.

Table 4-9 lists INFA_CT_INPUTPORT_HANDLE and INFA_CT_OUTPUTPORT_HANDLE property IDs:

Table 4-9. INFA_CT_INPUTPORT and INFA_CT_OUTPUTPORT_HANDLE Property IDs

   Handle Property ID                       Datatype  Description
   INFA_CT_PORT_NAME                        String    Specifies the port name.
   INFA_CT_PORT_CDATATYPE                   Integer   Specifies the port datatype. The Integration Service returns one
                                                      of the following values:
                                                      - eINFA_CTYPE_SHORT
                                                      - eINFA_CTYPE_INT32
                                                      - eINFA_CTYPE_CHAR
                                                      - eINFA_CTYPE_RAW
                                                      - eINFA_CTYPE_UNICHAR
                                                      - eINFA_CTYPE_TIME
                                                      - eINFA_CTYPE_FLOAT
                                                      - eINFA_CTYPE_DOUBLE
                                                      - eINFA_CTYPE_DECIMAL18_FIXED
                                                      - eINFA_CTYPE_DECIMAL28_FIXED
                                                      - eINFA_CTYPE_INFA_CTDATETIME
   INFA_CT_PORT_PRECISION                   Integer   Specifies the port precision.
   INFA_CT_PORT_SCALE                       Integer   Specifies the port scale (if applicable).
   INFA_CT_PORT_IS_MAPPED                   Boolean   Specifies whether the port is linked to other transformations in
                                                      the mapping.
   INFA_CT_PORT_STORAGESIZE                 Integer   Specifies the internal storage size of the data for a port. The
                                                      storage size depends on the datatype of the port.
   INFA_CT_PORT_BOUNDDATATYPE               Integer   Specifies the port datatype. Use instead of
                                                      INFA_CT_PORT_CDATATYPE if you rebind the port and specify a
                                                      datatype other than the default. For more information about
                                                      rebinding a port, see “Rebind Datatype Functions” on page 115.
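As a usage illustration, the following sketch reads two of the property IDs listed above in an initialization function. The handle variable is an assumption, and depending on the generated headers a cast to INFA_CT_HANDLE may or may not be required when you pass a transformation handle to the property functions.

   /* Sketch: read the instance name and tracing level of a Custom
      transformation. "transformation" is assumed to be a valid
      INFA_CT_TRANSFORMATION_HANDLE obtained during initialization. */
   const char* sInstanceName = NULL;
   INFA_INT32  nTraceLevel   = 0;

   if (INFA_CTGetInternalPropertyStringM((INFA_CT_HANDLE)transformation,
           INFA_CT_TRANS_INSTANCE_NAME, &sInstanceName) == INFA_FAILURE)
       return INFA_FAILURE;

   if (INFA_CTGetInternalPropertyInt32((INFA_CT_HANDLE)transformation,
           INFA_CT_TRANS_TRACE_LEVEL, &nTraceLevel) == INFA_FAILURE)
       return INFA_FAILURE;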
Get All External Property Names (MBCS or Unicode)

PowerCenter provides two functions to access the property names defined on the Metadata Extensions tab, Initialization Properties tab, and Port Attribute Definitions tab of the Custom transformation.

Use the following functions when you want the procedure to access the property names:
♦ INFA_CTGetAllPropertyNamesM(). Accesses the property names in MBCS. Use the following syntax:

   INFA_STATUS INFA_CTGetAllPropertyNamesM(INFA_CT_HANDLE handle, const char*const** paPropertyNames, size_t* pnProperties);

   Argument          Datatype              Input/Output  Description
   handle            INFA_CT_HANDLE        Input         Specify the handle name.
   paPropertyNames   const char*const**    Output        Specifies the property names. The Integration Service returns
                                                         an array of property names in MBCS.
   pnProperties      size_t*               Output        Indicates the number of properties in the array.

♦ INFA_CTGetAllPropertyNamesU(). Accesses the property names in Unicode. Use the following syntax:

   INFA_STATUS INFA_CTGetAllPropertyNamesU(INFA_CT_HANDLE handle, const INFA_UNICHAR*const** pasPropertyNames, size_t* pnProperties);

   Argument           Datatype                      Input/Output  Description
   handle             INFA_CT_HANDLE                Input         Specify the handle name.
   pasPropertyNames   const INFA_UNICHAR*const**    Output        Specifies the property names. The Integration Service
                                                                  returns an array of property names in Unicode.
   pnProperties       size_t*                       Output        Indicates the number of properties in the array.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Get External Properties (MBCS or Unicode)

PowerCenter provides functions to access the values of the properties defined on the Metadata Extensions tab, Initialization Properties tab, or Port Attribute Definitions tab of the Custom transformation.

You must specify the property names in the functions if you want the procedure to access the values. Use the INFA_CTGetAllPropertyNamesM() or INFA_CTGetAllPropertyNamesU() functions to access property names. For the handle parameter, specify a handle name from the handle hierarchy. The Integration Service fails the session if the handle name is invalid.

Note: If you define an initialization property with the same name as a metadata extension, the Integration Service returns the metadata extension value.

Use the following functions when you want the procedure to access the values of the properties:
♦ INFA_CTGetExternalProperty<datatype>M(). Accesses the value of the property in MBCS. Use the syntax as shown in Table 4-10:

Table 4-10. Property Functions (MBCS)

   Syntax                                                                   Property Datatype
   INFA_STATUS INFA_CTGetExternalPropertyStringM(INFA_CT_HANDLE handle,     String
   const char* sPropName, const char** psPropValue);
   INFA_STATUS INFA_CTGetExternalPropertyINT32M(INFA_CT_HANDLE handle,      Integer
   const char* sPropName, INFA_INT32* pnPropValue);
   INFA_STATUS INFA_CTGetExternalPropertyBoolM(INFA_CT_HANDLE handle,       Boolean
   const char* sPropName, INFA_BOOLEN* pbPropValue);

♦ INFA_CTGetExternalProperty<datatype>U(). Accesses the value of the property in Unicode. Use the syntax as shown in Table 4-11:

Table 4-11. Property Functions (Unicode)

   Syntax                                                                   Property Datatype
   INFA_STATUS INFA_CTGetExternalPropertyStringU(INFA_CT_HANDLE handle,     String
   INFA_UNICHAR* sPropName, INFA_UNICHAR** psPropValue);
   INFA_STATUS INFA_CTGetExternalPropertyINT32U(INFA_CT_HANDLE handle,      Integer
   INFA_UNICHAR* sPropName, INFA_INT32* pnPropValue);
   INFA_STATUS INFA_CTGetExternalPropertyBoolU(INFA_CT_HANDLE handle,       Boolean
   INFA_UNICHAR* sPropName, INFA_BOOLEN* pbPropValue);

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Rebind Datatype Functions

PowerCenter lets you rebind a port with a datatype other than the default datatype. Use the rebind datatype functions if you want the procedure to access data in a datatype other than the default datatype. You must rebind the port with a compatible datatype. You can only use these functions in the initialization functions.
Consider the following rules when you rebind the datatype for an output or input/output port:
♦ You must use the data handling functions to set the data and the indicator for that port. Use the INFA_CTSetData() and INFA_CTSetIndicator() functions in row-based mode, and use the INFA_CTASetData() function in array-based mode.
♦ Do not call the INFA_CTSetPassThruPort() function for the output port.

Table 4-12 lists compatible datatypes:

Table 4-12. Compatible Datatypes

   Default Datatype   Compatible With
   Char               Unichar
   Unichar            Char
   Date               INFA_DATETIME. Use the following syntax:

                         struct INFA_DATETIME
                         {
                             int nYear;
                             int nMonth;
                             int nDay;
                             int nHour;
                             int nMinute;
                             int nSecond;
                             int nNanoSecond;
                         }

   Dec18              Char, Unichar
   Dec28              Char, Unichar
PowerCenter provides the following rebind datatype functions:
♦ INFA_CTRebindInputDataType(). Rebinds the input port. Use the following syntax:

   INFA_STATUS INFA_CTRebindInputDataType(INFA_CT_INPUTPORT_HANDLE portHandle, INFA_CDATATYPE datatype);

♦ INFA_CTRebindOutputDataType(). Rebinds the output port. Use the following syntax:

   INFA_STATUS INFA_CTRebindOutputDataType(INFA_CT_OUTPUTPORT_HANDLE portHandle, INFA_CDATATYPE datatype);

   Argument     Datatype                     Input/Output  Description
   portHandle   INFA_CT_OUTPUTPORT_HANDLE    Input         Output port handle.
   datatype     INFA_CDATATYPE               Input         The datatype with which you rebind the port. Use the following
                                                           values for the datatype parameter:
                                                           - eINFA_CTYPE_SHORT
                                                           - eINFA_CTYPE_INT32
                                                           - eINFA_CTYPE_CHAR
                                                           - eINFA_CTYPE_RAW
                                                           - eINFA_CTYPE_UNICHAR
                                                           - eINFA_CTYPE_TIME
                                                           - eINFA_CTYPE_FLOAT
                                                           - eINFA_CTYPE_DOUBLE
                                                           - eINFA_CTYPE_DECIMAL18_FIXED
                                                           - eINFA_CTYPE_DECIMAL28_FIXED
                                                           - eINFA_CTYPE_INFA_CTDATETIME

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Data Handling Functions (Row-Based Mode)

When the Integration Service calls the input row notification function, it notifies the procedure that the procedure can access a row or block of data. However, to get data from the input port, modify it, and set data in the output port, you must use the data handling functions in the input row notification function. When the data access mode is row-based, use the row-based data handling functions.

Include the INFA_CTGetData<datatype>() function to get the data from the input port and the INFA_CTSetData() function to set the data in the output port. Include the INFA_CTGetIndicator() or INFA_CTGetLength() function if you want the procedure to verify, before you get the data, whether the port has a null value or an empty string.

PowerCenter provides the following data handling functions:
♦ INFA_CTGetData<datatype>(). For more information, see “Get Data Functions (Row-Based Mode)” on page 118.
♦ INFA_CTSetData(). For more information, see “Set Data Function (Row-Based Mode)” on page 118.
♦ INFA_CTGetIndicator(). For more information, see “Indicator Functions (Row-Based Mode)” on page 119.
♦ INFA_CTSetIndicator(). For more information, see “Indicator Functions (Row-Based Mode)” on page 119.
♦ INFA_CTGetLength(). For more information, see “Length Functions” on page 120.
♦ INFA_CTSetLength(). For more information, see “Length Functions” on page 120.

Get Data Functions (Row-Based Mode)

Use the INFA_CTGetData<datatype>() functions to retrieve data for the port the function specifies. You must modify the function name depending on the datatype of the port you want the procedure to access.

Table 4-13 lists the INFA_CTGetData<datatype>() function syntax and the datatype of the return value:

Table 4-13. Get Data Functions

   Syntax                                                               Return Value Datatype
   void* INFA_CTGetDataVoid(INFA_CT_INPUTPORT_HANDLE dataHandle);       Data void pointer to the return value
   char* INFA_CTGetDataStringM(INFA_CT_INPUTPORT_HANDLE dataHandle);    String (MBCS)
   IUNICHAR* INFA_CTGetDataStringU(INFA_CT_INPUTPORT_HANDLE             String (Unicode)
   dataHandle);
   INFA_INT32 INFA_CTGetDataINT32(INFA_CT_INPUTPORT_HANDLE              Integer
   dataHandle);
   double INFA_CTGetDataDouble(INFA_CT_INPUTPORT_HANDLE dataHandle);    Double
   INFA_CT_RAWDATE INFA_CTGetDataDate(INFA_CT_INPUTPORT_HANDLE          Raw date
   dataHandle);
   INFA_CT_RAWDEC18 INFA_CTGetDataRawDec18(                             Decimal BLOB (precision 18)
   INFA_CT_INPUTPORT_HANDLE dataHandle);
   INFA_CT_RAWDEC28 INFA_CTGetDataRawDec28(                             Decimal BLOB (precision 28)
   INFA_CT_INPUTPORT_HANDLE dataHandle);
   INFA_CT_DATETIME INFA_CTGetDataDateTime(INFA_CT_INPUTPORT_HANDLE     Datetime
   dataHandle);

Set Data Function (Row-Based Mode)

Use the INFA_CTSetData() function when you want the procedure to pass a value to an output port.
Use the following syntax:

   INFA_STATUS INFA_CTSetData(INFA_CT_OUTPUTPORT_HANDLE dataHandle, void* data);

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Note: If you use the INFA_CTSetPassThruPort() function on an input/output port, do not set the data or indicator for that port.

Indicator Functions (Row-Based Mode)

Use the indicator functions when you want the procedure to get the indicator for an input port or to set the indicator for an output port. The indicator for a port indicates whether the data is valid, null, or truncated.

PowerCenter provides the following indicator functions:
♦ INFA_CTGetIndicator(). Gets the indicator for an input port. Use the following syntax:

   INFA_INDICATOR INFA_CTGetIndicator(INFA_CT_INPUTPORT_HANDLE dataHandle);

The return value datatype is INFA_INDICATOR. Use the following values for INFA_INDICATOR:
− INFA_DATA_VALID. Indicates the data is valid.
− INFA_NULL_DATA. Indicates a null value.
− INFA_DATA_TRUNCATED. Indicates the data has been truncated.

♦ INFA_CTSetIndicator(). Sets the indicator for an output port. Use the following syntax:

   INFA_STATUS INFA_CTSetIndicator(INFA_CT_OUTPUTPORT_HANDLE dataHandle, INFA_INDICATOR indicator);

   Argument     Datatype                     Input/Output  Description
   dataHandle   INFA_CT_OUTPUTPORT_HANDLE    Input         Output port handle.
   indicator    INFA_INDICATOR               Input         The indicator value for the output port. Use one of the
                                                           following values:
                                                           - INFA_DATA_VALID. Indicates the data is valid.
                                                           - INFA_NULL_DATA. Indicates a null value.
                                                           - INFA_DATA_TRUNCATED. Indicates the data has been truncated.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Note: If you use the INFA_CTSetPassThruPort() function on an input/output port, do not set the data or indicator for that port.
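Putting the row-based data handling and indicator functions together, an input row notification function might copy one integer port to an output port as in the sketch below. The port handle variables and the doubling logic are assumptions for illustration; in a real procedure you would save the handles during initialization.

   /* Sketch: copy an integer input port to an output port, preserving
      null values. inPort and outPort are assumed to be saved
      INFA_CT_INPUTPORT_HANDLE and INFA_CT_OUTPUTPORT_HANDLE values. */
   if (INFA_CTGetIndicator(inPort) == INFA_NULL_DATA)
   {
       INFA_CTSetIndicator(outPort, INFA_NULL_DATA);
   }
   else
   {
       INFA_INT32 nValue = INFA_CTGetDataINT32(inPort);
       nValue = nValue * 2;                      /* transformation logic */
       INFA_CTSetData(outPort, (void*)&nValue);
       INFA_CTSetIndicator(outPort, INFA_DATA_VALID);
   }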
Length Functions

Use the length functions when you want the procedure to access the length of a string or binary input port, or to set the length of a binary or string output port.

Use the following length functions:
♦ INFA_CTGetLength(). Use this function for string and binary ports only. The Integration Service returns the length as the number of characters including trailing spaces. Use the following syntax:

   INFA_UINT32 INFA_CTGetLength(INFA_CT_INPUTPORT_HANDLE dataHandle);

The return value datatype is INFA_UINT32. Use a value between zero and 2GB for the return value.

♦ INFA_CTSetLength(). When the Custom transformation contains a binary or string output port, you must use this function to set the length of the data, including trailing spaces. Verify that the length you set for string and binary ports is not greater than the precision for that port. If you set the length greater than the port precision, you get unexpected results. For example, the session may fail. Use the following syntax:

   INFA_STATUS INFA_CTSetLength(INFA_CT_OUTPUTPORT_HANDLE dataHandle, INFA_UINT32 length);

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Set Pass-Through Port Function

Use the INFA_CTSetPassThruPort() function when you want the Integration Service to pass data from an input port to an output port without modifying the data. When you use the INFA_CTSetPassThruPort() function, the Integration Service passes the data to the output port when it calls the input row notification function.

Consider the following rules and guidelines when you use the set pass-through port function:
♦ Only use this function in an initialization function.
♦ If the procedure includes this function, do not include the INFA_CTSetData(), INFA_CTSetLength(), INFA_CTSetIndicator(), or INFA_CTASetData() functions to pass data to the output port.
♦ In row-based mode, you can only include this function when the transformation scope is Row. When the transformation scope is Transaction or All Input, this function returns INFA_FAILURE.
♦ In row-based mode, when you use this function to output multiple rows for a given input row, every output row contains the data that is passed through from the input port.
♦ In array-based mode, you can only use this function for passive Custom transformations.

You must verify that the datatype, precision, and scale are the same for the input and output ports. The Integration Service fails the session if the datatype, precision, or scale are not the same for the input and output ports you specify in the INFA_CTSetPassThruPort() function.
Use the following syntax:

   INFA_STATUS INFA_CTSetPassThruPort(INFA_CT_OUTPUTPORT_HANDLE outputport, INFA_CT_INPUTPORT_HANDLE inputport)

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Output Notification Function

When you want the procedure to output a row to the Integration Service, use the INFA_CTOutputNotification() function. Only include this function for active Custom transformations. For passive Custom transformations, the procedure outputs a row to the Integration Service when the input row notification function returns. If the procedure calls this function for a passive Custom transformation, the Integration Service ignores the function.

Note: When the transformation scope is Row, you can only include this function in the input row notification function. If you include it somewhere else, it returns a failure.

Use the following syntax:

   INFA_ROWSTATUS INFA_CTOutputNotification(INFA_CT_OUTPUTGROUP_HANDLE group);

   Argument   Datatype                     Input/Output  Description
   group      INFA_CT_OUTPUTGROUP_HANDLE   Input         Output group handle.

The return value datatype is INFA_ROWSTATUS. Use the following values for the return value:
♦ INFA_ROWSUCCESS. Indicates the function successfully processed the row of data.
♦ INFA_ROWERROR. Indicates the function encountered an error for the row of data. The Integration Service increments the internal error count.
♦ INFA_FATALERROR. Indicates the function encountered a fatal error for the row of data. The Integration Service fails the session.

Note: When the procedure code calls the INFA_CTOutputNotification() function, you must verify that all pointers in an output port handle point to valid data. When a pointer does not point to valid data, the Integration Service might shut down unexpectedly.

Data Boundary Output Notification Function

Include the INFA_CTDataBdryOutputNotification() function when you want the procedure to output a commit or rollback transaction. When you use this function, you must select the Generate Transaction property for this Custom transformation. If you do not select this property, the Integration Service fails the session.
Use the following syntax:

   INFA_STATUS INFA_CTDataBdryOutputNotification(INFA_CT_PARTITION_HANDLE handle, INFA_CTDataBdryType dataBoundaryType);

   Argument           Datatype                    Input/Output  Description
   handle             INFA_CT_PARTITION_HANDLE    Input         Handle name.
   dataBoundaryType   INFA_CTDataBdryType         Input         The transaction type. Use the following values for the
                                                                dataBoundaryType parameter:
                                                                - eBT_COMMIT
                                                                - eBT_ROLLBACK

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Error Functions

Use the error functions to access procedure errors. The Integration Service returns the most recent error.

PowerCenter provides the following error functions:
♦ INFA_CTGetErrorMsgM(). Gets the error message in MBCS. Use the following syntax:

   const char* INFA_CTGetErrorMsgM();

♦ INFA_CTGetErrorMsgU(). Gets the error message in Unicode. Use the following syntax:

   const IUNICHAR* INFA_CTGetErrorMsgU();
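For example, a procedure might capture the most recent error after a failed API call, as in this sketch. The failing call and the variable names are illustrative only.

   /* Sketch: fetch the most recent error message after a failed call. */
   if (INFA_CTSetPassThruPort(outPort, inPort) == INFA_FAILURE)
   {
       const char* sLastError = INFA_CTGetErrorMsgM();
       /* sLastError points to the most recent error text in MBCS. */
       return INFA_FAILURE;
   }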
Session Log Message Functions

Use the session log message functions when you want the procedure to log a message in the session log in either Unicode or MBCS.

PowerCenter provides the following session log message functions:
♦ INFA_CTLogMessageU(). Logs a message in Unicode. Use the following syntax:

   void INFA_CTLogMessageU(INFA_CT_ErrorSeverityLevel errorSeverityLevel, INFA_UNICHAR* msg)

   Argument             Datatype                      Input/Output  Description
   errorSeverityLevel   INFA_CT_ErrorSeverityLevel    Input         Severity level of the error message that you want the
                                                                    Integration Service to write in the session log. Use the
                                                                    following values for the errorSeverityLevel parameter:
                                                                    - eESL_LOG
                                                                    - eESL_DEBUG
                                                                    - eESL_ERROR
   msg                  INFA_UNICHAR*                 Input         Enter the text of the message in Unicode in quotes.

♦ INFA_CTLogMessageM(). Logs a message in MBCS. Use the following syntax:

   void INFA_CTLogMessageM(INFA_CT_ErrorSeverityLevel errorSeverityLevel, char* msg)

   Argument             Datatype                      Input/Output  Description
   errorSeverityLevel   INFA_CT_ErrorSeverityLevel    Input         Severity level of the error message that you want the
                                                                    Integration Service to write in the session log. Use the
                                                                    following values for the errorSeverityLevel parameter:
                                                                    - eESL_LOG
                                                                    - eESL_DEBUG
                                                                    - eESL_ERROR
   msg                  char*                         Input         Enter the text of the message in MBCS in quotes.
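A sketch of typical calls follows; the message text is illustrative. Because the msg parameter is declared as a non-const pointer, a cast may be needed when you pass a string literal.

   /* Sketch: write messages of different severities to the session log. */
   INFA_CTLogMessageM(eESL_LOG,   (char*)"myproc: initialization complete");
   INFA_CTLogMessageM(eESL_DEBUG, (char*)"myproc: entering row processing");
   INFA_CTLogMessageM(eESL_ERROR, (char*)"myproc: invalid property value");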
Increment Error Count Function

Use the INFA_CTIncrementErrorCount() function when you want to increase the error count for the session.

Use the following syntax:

   INFA_STATUS INFA_CTIncrementErrorCount(INFA_CT_PARTITION_HANDLE transformation, size_t nErrors, INFA_STATUS* pStatus);

   Argument         Datatype                    Input/Output  Description
   transformation   INFA_CT_PARTITION_HANDLE    Input         Partition handle.
   nErrors          size_t                      Input         The Integration Service increments the error count by nErrors
                                                              for the given transformation instance.
   pStatus          INFA_STATUS*                Input         The Integration Service uses INFA_FAILURE for the pStatus
                                                              parameter when the error count exceeds the error threshold
                                                              and fails the session.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Is Terminated Function

Use the INFA_CTIsTerminated() function when you want the procedure to check if the PowerCenter Client has requested the Integration Service to stop the session. You might call this function if the procedure includes a time-consuming process.

Use the following syntax:

   INFA_CTTerminateType INFA_CTIsTerminated(INFA_CT_PARTITION_HANDLE handle);

   Argument   Datatype                    Input/Output  Description
   handle     INFA_CT_PARTITION_HANDLE    Input         Partition handle.

The return value datatype is INFA_CTTerminateType. The Integration Service returns one of the following values:
♦ eTT_NOTTERMINATED. Indicates the PowerCenter Client has not requested to stop the session.
♦ eTT_ABORTED. Indicates the Integration Service aborted the session.
♦ eTT_STOPPED. Indicates the Integration Service stopped the session.
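For example, a time-consuming loop might poll for a stop request every thousand iterations, as in the sketch below. The partition handle, the loop bound, and the polling interval are assumptions for illustration.

   /* Sketch: check for a stop request inside a long-running loop.
      "partition" is assumed to be a saved INFA_CT_PARTITION_HANDLE,
      and nIterations an application-defined loop bound. */
   INFA_INT32 i;
   for (i = 0; i < nIterations; i++)
   {
       if ((i % 1000) == 0 &&
           INFA_CTIsTerminated(partition) != eTT_NOTTERMINATED)
       {
           return INFA_FAILURE;   /* the session is stopping or aborting */
       }
       /* ... time-consuming processing for one unit of work ... */
   }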
Blocking Functions

When the Custom transformation contains multiple input groups, you can write code to block the incoming data on an input group. For more information about blocking data, see “Blocking Input Data” on page 70.

Consider the following rules when you use the blocking functions:
♦ You can block at most n-1 input groups.
♦ You cannot block an input group that is already blocked.
♦ You cannot block an input group when it receives data from the same source as another input group.
♦ You cannot unblock an input group that is already unblocked.

PowerCenter provides the following blocking functions:
♦ INFA_CTBlockInputFlow(). Allows the procedure to block an input group. Use the following syntax:

   INFA_STATUS INFA_CTBlockInputFlow(INFA_CT_INPUTGROUP_HANDLE group);

♦ INFA_CTUnblockInputFlow(). Allows the procedure to unblock an input group. Use the following syntax:

   INFA_STATUS INFA_CTUnblockInputFlow(INFA_CT_INPUTGROUP_HANDLE group);

   Argument   Datatype                     Input/Output  Description
   group      INFA_CT_INPUTGROUP_HANDLE    Input         Input group handle.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Verify Blocking

When you use the INFA_CTBlockInputFlow() and INFA_CTUnblockInputFlow() functions in the procedure code, verify that the procedure checks whether the Integration Service allows the Custom transformation to block incoming data. To do this, check the value of the INFA_CT_TRANS_MAY_BLOCK_DATA property ID using the INFA_CTGetInternalPropertyBool() function.

When the value of the INFA_CT_TRANS_MAY_BLOCK_DATA property ID is FALSE, the procedure should either not use the blocking functions, or it should return a fatal error and stop the session. If the procedure code uses the blocking functions when the Integration Service does not allow the Custom transformation to block data, the Integration Service might fail the session.
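The check described above might look like the following sketch. The handle variables are assumed to have been saved during initialization, and whether a cast to INFA_CT_HANDLE is required depends on the generated headers.

   /* Sketch: block an input group only when the Integration Service
      allows this Custom transformation to block incoming data. */
   INFA_BOOLEN bMayBlock = INFA_FALSE;

   if (INFA_CTGetInternalPropertyBool((INFA_CT_HANDLE)transformation,
           INFA_CT_TRANS_MAY_BLOCK_DATA, &bMayBlock) == INFA_FAILURE)
       return INFA_FAILURE;

   if (bMayBlock == INFA_TRUE)
   {
       if (INFA_CTBlockInputFlow(inputGroup) == INFA_FAILURE)
           return INFA_FAILURE;
   }
   else
   {
       /* Do not block: use non-blocking logic or fail with an error. */
   }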
Pointer Functions

Use the pointer functions when you want the Integration Service to create and access pointers to an object or a structure.

PowerCenter provides the following pointer functions:
♦ INFA_CTGetUserDefinedPtr(). Allows the procedure to access an object or structure during run time. Use the following syntax:

   void* INFA_CTGetUserDefinedPtr(INFA_CT_HANDLE handle)

   Argument   Datatype         Input/Output  Description
   handle     INFA_CT_HANDLE   Input         Handle name.

♦ INFA_CTSetUserDefinedPtr(). Allows the procedure to associate an object or a structure with any handle the Integration Service provides. To reduce processing overhead, include this function in the initialization functions. Use the following syntax:

   void INFA_CTSetUserDefinedPtr(INFA_CT_HANDLE handle, void* pPtr)

   Argument   Datatype         Input/Output  Description
   handle     INFA_CT_HANDLE   Input         Handle name.
   pPtr       void*            Input         User pointer.

You must substitute a valid handle for INFA_CT_HANDLE.

Change String Mode Function

When the Integration Service runs in Unicode mode, it passes data to the procedure in UCS-2 by default. When it runs in ASCII mode, it passes data in ASCII by default. Use the INFA_CTChangeStringMode() function if you want to change the default string mode for the procedure.

When you change the default string mode to MBCS, the Integration Service passes data in the Integration Service code page. Use the INFA_CTSetDataCodePageID() function if you want to change the code page. For more information about changing the code page ID, see “Set Data Code Page Function” on page 127.

When a procedure includes the INFA_CTChangeStringMode() function, the Integration Service changes the string mode for all ports in each Custom transformation that uses this particular procedure. Use the change string mode function in the initialization functions.
Use the following syntax:

   INFA_STATUS INFA_CTChangeStringMode(INFA_CT_PROCEDURE_HANDLE procedure, INFA_CTStringMode stringMode);

   Argument     Datatype                    Input/Output  Description
   procedure    INFA_CT_PROCEDURE_HANDLE    Input         Procedure handle name.
   stringMode   INFA_CTStringMode           Input         Specifies the string mode that you want the Integration Service
                                                          to use. Use the following values for the stringMode parameter:
                                                          - eASM_UNICODE. Use this when the Integration Service runs in
                                                            ASCII mode and you want the procedure to access data in
                                                            Unicode.
                                                          - eASM_MBCS. Use this when the Integration Service runs in
                                                            Unicode mode and you want the procedure to access data in
                                                            MBCS.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.

Set Data Code Page Function

Use the INFA_CTSetDataCodePageID() function when you want the Integration Service to pass data to the Custom transformation in a code page other than the Integration Service code page. Use the set data code page function in the procedure initialization function.

Use the following syntax:

   INFA_STATUS INFA_CTSetDataCodePageID(INFA_CT_TRANSFORMATION_HANDLE transformation, int dataCodePageID);

   Argument         Datatype                         Input/Output  Description
   transformation   INFA_CT_TRANSFORMATION_HANDLE    Input         Transformation handle name.
   dataCodePageID   int                              Input         Specifies the code page you want the Integration Service to
                                                                   pass data in. For valid values for the dataCodePageID
                                                                   parameter, see “Code Pages” in the Administrator Guide.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.
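The following initialization sketch ties the pointer and string mode functions together: it stores a user-defined structure on the procedure handle and switches the string mode. The structure type myState_t, the allocation, and the mode choice are assumptions for illustration.

   /* Sketch of procedure initialization. myState_t is a hypothetical
      user structure; <stdlib.h> provides malloc(). */
   myState_t* pState = (myState_t*)malloc(sizeof(myState_t));
   if (!pState)
       return INFA_FAILURE;

   /* Associate the structure with the procedure handle for later use. */
   INFA_CTSetUserDefinedPtr((INFA_CT_HANDLE)procedure, pState);

   /* Ask for MBCS data even when the service runs in Unicode mode. */
   if (INFA_CTChangeStringMode(procedure, eASM_MBCS) == INFA_FAILURE)
       return INFA_FAILURE;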
Row Strategy Functions (Row-Based Mode)

The row strategy functions allow you to access and configure the update strategy for each row.

PowerCenter provides the following row strategy functions:
♦ INFA_CTGetRowStrategy(). Allows the procedure to get the update strategy for a row. Use the following syntax:

   INFA_STATUS INFA_CTGetRowStrategy(INFA_CT_INPUTGROUP_HANDLE group, INFA_CTUpdateStrategy updateStrategy);

   Argument         Datatype                     Input/Output  Description
   group            INFA_CT_INPUTGROUP_HANDLE    Input         Input group handle.
   updateStrategy   INFA_CT_UPDATESTRATEGY       Input         Update strategy for the input port. The Integration Service
                                                               uses the following values:
                                                               - eUS_INSERT = 0
                                                               - eUS_UPDATE = 1
                                                               - eUS_DELETE = 2
                                                               - eUS_REJECT = 3

♦ INFA_CTSetRowStrategy(). Sets the update strategy for each row. This overrides the INFA_CTChangeDefaultRowStrategy() function. Use the following syntax:

   INFA_STATUS INFA_CTSetRowStrategy(INFA_CT_OUTPUTGROUP_HANDLE group, INFA_CT_UPDATESTRATEGY updateStrategy);

   Argument         Datatype                      Input/Output  Description
   group            INFA_CT_OUTPUTGROUP_HANDLE    Input         Output group handle.
   updateStrategy   INFA_CT_UPDATESTRATEGY        Input         Update strategy you want to set for the output port. Use one
                                                                of the following values:
                                                                - eUS_INSERT = 0
                                                                - eUS_UPDATE = 1
                                                                - eUS_DELETE = 2
                                                                - eUS_REJECT = 3

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.
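For instance, a procedure that always updates the target could set the strategy for each output row in the input row notification function, as in this minimal sketch; the output group handle is assumed to have been saved during initialization.

   /* Sketch: flag the current output row for update. outputGroup is
      assumed to be a saved INFA_CT_OUTPUTGROUP_HANDLE. */
   if (INFA_CTSetRowStrategy(outputGroup, eUS_UPDATE) == INFA_FAILURE)
       return INFA_ROWERROR;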
Change Default Row Strategy Function

By default, the row strategy for a Custom transformation is pass-through when the transformation scope is Row. When the transformation scope is Transaction or All Input, the row strategy is the same value as the Treat Source Rows As session property by default.

For example, in a mapping you have an Update Strategy transformation followed by a Custom transformation with Row transformation scope. The Update Strategy transformation flags the rows for update, insert, or delete. When the Integration Service passes a row to the Custom transformation, the Custom transformation retains the flag since its row strategy is pass-through.

However, you can change the row strategy of a Custom transformation with PowerCenter. Use the INFA_CTChangeDefaultRowStrategy() function to change the default row strategy at the transformation level. For example, when you change the default row strategy of a Custom transformation to insert, the Integration Service flags all the rows that pass through this transformation for insert.

Note: The Integration Service returns INFA_FAILURE if the session is not in data-driven mode.

Use the following syntax:

   INFA_STATUS INFA_CTChangeDefaultRowStrategy(INFA_CT_TRANSFORMATION_HANDLE transformation, INFA_CT_DefaultUpdateStrategy defaultUpdateStrategy);

   Argument                Datatype                         Input/Output  Description
   transformation          INFA_CT_TRANSFORMATION_HANDLE    Input         Transformation handle.
   defaultUpdateStrategy   INFA_CT_DefaultUpdateStrategy    Input         Specifies the row strategy you want the Integration Service
                                                                          to use for the Custom transformation:
                                                                          - eDUS_PASSTHROUGH. Flags the row for passthrough.
                                                                          - eDUS_INSERT. Flags rows for insert.
                                                                          - eDUS_UPDATE. Flags rows for update.
                                                                          - eDUS_DELETE. Flags rows for delete.

The return value datatype is INFA_STATUS. Use INFA_SUCCESS and INFA_FAILURE for the return value.
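A sketch of the transformation-level call follows. It assumes a saved transformation handle and an initialization-time context.

   /* Sketch: flag all rows passing through this transformation for
      insert. Returns INFA_FAILURE if the session is not data-driven. */
   if (INFA_CTChangeDefaultRowStrategy(transformation, eDUS_INSERT)
           == INFA_FAILURE)
       return INFA_FAILURE;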
Array-Based API Functions

The array-based functions are API functions you use when you change the data access mode to array-based. For more information about changing the data access mode, see “Set Data Access Mode Function” on page 104.

Informatica provides the following groups of array-based API functions:
♦ Maximum number of rows. See “Maximum Number of Rows Functions” on page 130.
♦ Number of rows. See “Number of Rows Functions” on page 131.
♦ Is row valid. See “Is Row Valid Function” on page 132.
♦ Data handling (array-based mode). See “Data Handling Functions (Array-Based Mode)” on page 132.
♦ Row strategy. See “Row Strategy Functions (Array-Based Mode)” on page 135.
♦ Set input error row. See “Set Input Error Row Functions” on page 136.

Maximum Number of Rows Functions

By default, the Integration Service allows a maximum number of rows in an input block and an output block. However, you can change the maximum number of rows allowed in an output block.

Use the INFA_CTAGetInputNumRowsMax() and INFA_CTAGetOutputNumRowsMax() functions to determine the maximum number of rows in input and output blocks. Use the values these functions return to determine the buffer size if the procedure needs a buffer.

You can set the maximum number of rows in the output block using the INFA_CTASetOutputRowMax() function. You might use this function if you want the procedure to use a larger or smaller buffer.

You can only call these functions in an initialization function.

PowerCenter provides the following functions to determine and set the maximum number of rows in blocks:
♦ INFA_CTAGetInputNumRowsMax(). Use this function to determine the maximum number of rows allowed in an input block. Use the following syntax:

   INFA_INT32 INFA_CTAGetInputNumRowsMax( INFA_CT_INPUTGROUP_HANDLE inputgroup );

   Argument     Datatype                     Input/Output  Description
   inputgroup   INFA_CT_INPUTGROUP_HANDLE    Input         Input group handle.

♦ INFA_CTAGetOutputNumRowsMax(). Use this function to determine the maximum number of rows allowed in an output block.
Use the following syntax:

   INFA_INT32 INFA_CTAGetOutputNumRowsMax( INFA_CT_OUTPUTGROUP_HANDLE outputgroup );

   Argument      Datatype                      Input/Output  Description
   outputgroup   INFA_CT_OUTPUTGROUP_HANDLE    Input         Output group handle.

♦ INFA_CTASetOutputRowMax(). Use this function to set the maximum number of rows allowed in an output block. Use the following syntax:

   INFA_STATUS INFA_CTASetOutputRowMax( INFA_CT_OUTPUTGROUP_HANDLE outputgroup, INFA_INT32 nRowMax );

   Argument      Datatype                      Input/Output  Description
   outputgroup   INFA_CT_OUTPUTGROUP_HANDLE    Input         Output group handle.
   nRowMax       INFA_INT32                    Input         Maximum number of rows you want to allow in an output block.
                                                             You must enter a positive number. The function returns a
                                                             fatal error when you use a non-positive number, including
                                                             zero.

Number of Rows Functions

Use the number of rows functions to determine the number of rows in an input block, or to set the number of rows in an output block for the specified input or output group.

PowerCenter provides the following number of rows functions:
♦ INFA_CTAGetNumRows(). You can determine the number of rows in an input block. Use the following syntax:

   INFA_INT32 INFA_CTAGetNumRows( INFA_CT_INPUTGROUP_HANDLE inputgroup );

   Argument     Datatype                     Input/Output  Description
   inputgroup   INFA_CT_INPUTGROUP_HANDLE    Input         Input group handle.

♦ INFA_CTASetNumRows(). You can set the number of rows in an output block. Call this function before you call the output notification function.
Use the following syntax:

   void INFA_CTASetNumRows( INFA_CT_OUTPUTGROUP_HANDLE outputgroup, INFA_INT32 nRows );

   Argument      Datatype                      Input/Output  Description
   outputgroup   INFA_CT_OUTPUTGROUP_HANDLE    Input         Output group handle.
   nRows         INFA_INT32                    Input         Number of rows you want to define in the output block. You
                                                             must enter a positive number. The Integration Service fails
                                                             the output notification function when you specify a
                                                             non-positive number.

Is Row Valid Function

Some rows in a block may be dropped, filtered, or error rows. Use the INFA_CTAIsRowValid() function to determine if a row in a block is valid. This function returns INFA_TRUE when a row is valid.

Use the following syntax:

   INFA_BOOLEN INFA_CTAIsRowValid( INFA_CT_INPUTGROUP_HANDLE inputgroup, INFA_INT32 iRow);

   Argument     Datatype                     Input/Output  Description
   inputgroup   INFA_CT_INPUTGROUP_HANDLE    Input         Input group handle.
   iRow         INFA_INT32                   Input         Index number of the row in the block. The index is zero-based.
                                                           You must verify the procedure only passes an index number
                                                           that exists in the data block. If you pass an invalid value,
                                                           the Integration Service shuts down unexpectedly.

Data Handling Functions (Array-Based Mode)

When the Integration Service calls the p_<proc_name>_inputRowNotification() function, it notifies the procedure that the procedure can access a row or block of data. However, to get data from the input port, modify it, and set data in the output port in array-based mode, you must use the array-based data handling functions in the input row notification function.

Include the INFA_CTAGetData<datatype>() function to get the data from the input port and the INFA_CTASetData() function to set the data in the output port. Include the INFA_CTAGetIndicator() function if you want the procedure to verify, before you get the data, whether the port has a null value or an empty string.
PowerCenter provides the following data handling functions for the array-based data access mode:
♦ INFA_CTAGetData<datatype>(). For more information, see “Get Data Functions (Array-Based Mode)” on page 133.
♦ INFA_CTAGetIndicator(). For more information, see “Get Indicator Function (Array-Based Mode)” on page 134.
♦ INFA_CTASetData(). For more information, see “Set Data Function (Array-Based Mode)” on page 134.

Get Data Functions (Array-Based Mode)

Use the INFA_CTAGetData<datatype>() functions to retrieve data for the port the function specifies. You must modify the function name depending on the datatype of the port you want the procedure to access. The Integration Service passes the length of the data in the array-based get data functions.

Table 4-14 lists the INFA_CTAGetData<datatype>() function syntax and the datatype of the return value:

Table 4-14. Get Data Functions (Array-Based Mode)

   Syntax                                                               Return Value Datatype
   void* INFA_CTAGetDataVoid( INFA_CT_INPUTPORT_HANDLE inputport,       Data void pointer to the return value
   INFA_INT32 iRow, INFA_UINT32* pLength);
   char* INFA_CTAGetDataStringM( INFA_CT_INPUTPORT_HANDLE inputport,    String (MBCS)
   INFA_INT32 iRow, INFA_UINT32* pLength);
   IUNICHAR* INFA_CTAGetDataStringU( INFA_CT_INPUTPORT_HANDLE           String (Unicode)
   inputport, INFA_INT32 iRow, INFA_UINT32* pLength);
   INFA_INT32 INFA_CTAGetDataINT32( INFA_CT_INPUTPORT_HANDLE            Integer
   inputport, INFA_INT32 iRow);
   double INFA_CTAGetDataDouble( INFA_CT_INPUTPORT_HANDLE inputport,    Double
   INFA_INT32 iRow);
   INFA_CT_RAWDATETIME INFA_CTAGetDataRawDate(                          Raw date
   INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);
   INFA_CT_DATETIME INFA_CTAGetDataDateTime(                            Datetime
   INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);
   INFA_CT_RAWDEC18 INFA_CTAGetDataRawDec18(                            Decimal BLOB (precision 18)
   INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);
   INFA_CT_RAWDEC28 INFA_CTAGetDataRawDec28(                            Decimal BLOB (precision 28)
   INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow);
Get Indicator Function (Array-Based Mode)

Use the get indicator function when you want the procedure to verify if the input port has a null value.

Use the following syntax:

   INFA_INDICATOR INFA_CTAGetIndicator( INFA_CT_INPUTPORT_HANDLE inputport, INFA_INT32 iRow );

   Argument    Datatype                    Input/Output  Description
   inputport   INFA_CT_INPUTPORT_HANDLE    Input         Input port handle.
   iRow        INFA_INT32                  Input         Index number of the row in the block. The index is zero-based.
                                                         You must verify the procedure only passes an index number that
                                                         exists in the data block. If you pass an invalid value, the
                                                         Integration Service shuts down unexpectedly.

The return value datatype is INFA_INDICATOR. Use the following values for INFA_INDICATOR:
♦ INFA_DATA_VALID. Indicates the data is valid.
♦ INFA_NULL_DATA. Indicates a null value.
♦ INFA_DATA_TRUNCATED. Indicates the data has been truncated.

Set Data Function (Array-Based Mode)

Use the set data function when you want the procedure to pass a value to an output port. You can set the data, the length of the data, if applicable, and the indicator for the output port you specify. You do not use separate functions to set the length or indicator for the output port.

Use the following syntax:

   void INFA_CTASetData( INFA_CT_OUTPUTPORT_HANDLE outputport, INFA_INT32 iRow, void* pData, INFA_UINT32 nLength, INFA_INDICATOR indicator);

   Argument     Datatype                     Input/Output  Description
   outputport   INFA_CT_OUTPUTPORT_HANDLE    Input         Output port handle.
   iRow         INFA_INT32                   Input         Index number of the row in the block. The index is zero-based.
                                                           You must verify the procedure only passes an index number
                                                           that exists in the data block. If you pass an invalid value,
                                                           the Integration Service shuts down unexpectedly.
   pData        void*                        Input         Pointer to the data.
   nLength      INFA_UINT32                  Input         Length of the data. Use for string and binary ports only. You
                                                           must verify the function passes the correct length of the
                                                           data. If the function passes a different length, the output
                                                           notification function returns failure for this port. Verify
                                                           the length you set for string and binary ports is not greater
                                                           than the precision for the port. If you set the length
                                                           greater than the port precision, you get unexpected results.
                                                           For example, the session may fail.
   indicator    INFA_INDICATOR               Input         Indicator value for the output port. Use one of the following
                                                           values:
                                                           - INFA_DATA_VALID. Indicates the data is valid.
                                                           - INFA_NULL_DATA. Indicates a null value.
                                                           - INFA_DATA_TRUNCATED. Indicates the data has been truncated.

Row Strategy Functions (Array-Based Mode)

The array-based row strategy functions allow you to access and configure the update strategy for each row in a block.

PowerCenter provides the following row strategy functions:
♦ INFA_CTAGetRowStrategy(). Allows the procedure to get the update strategy for a row in a block. Use the following syntax:

   INFA_CT_UPDATESTRATEGY INFA_CTAGetRowStrategy( INFA_CT_INPUTGROUP_HANDLE inputgroup, INFA_INT32 iRow);

   Argument     Datatype                     Input/Output  Description
   inputgroup   INFA_CT_INPUTGROUP_HANDLE    Input         Input group handle.
   iRow         INFA_INT32                   Input         Index number of the row in the block. The index is zero-based.
                                                           You must verify the procedure only passes an index number
                                                           that exists in the data block. If you pass an invalid value,
                                                           the Integration Service shuts down unexpectedly.
  • 168. INFA_CTASetRowStrategy(). Sets the update strategy for a row in a block. Use the following syntax: void INFA_CTASetRowStrategy( INFA_CT_OUTPUTGROUP_HANDLE outputgroup, INFA_INT32 iRow, INFA_CT_UPDATESTRATEGY updateStrategy ); Input/ Argument Datatype Description Output outputgroup INFA_CT_OUTPUTGROUP_HANDLE Input Output group handle. iRow INFA_INT32 Input Index number of the row in the block. The index is zero-based. You must verify the procedure only passes an index number that exists in the data block. If you pass an invalid value, the Integration Service shuts down unexpectedly. updateStrategy INFA_CT_UPDATESTRATEGY Input Update strategy for the port. Use one of the following values: - eUS_INSERT = 0 - eUS_UPDATE = 1 - eUS_DELETE = 2 - eUS_REJECT = 3 Set Input Error Row Functions When you use array-based access mode, you cannot return INFA_ROWERROR in the input row notification function. Instead, use the set input error row functions to notify the Integration Service that a particular input row has an error. PowerCenter provides the following set input row functions in array-based mode: ♦ INFA_CTASetInputErrorRowM(). You can notify the Integration Service that a row in the input block has an error and to output an MBCS error message to the session log. Use the following syntax: INFA_STATUS INFA_CTASetInputErrorRowM( INFA_CT_INPUTGROUP_HANDLE inputGroup, INFA_INT32 iRow, size_t nErrors, INFA_MBCSCHAR* sErrMsg ); Input/ Argument Datatype Description Output inputGroup INFA_CT_INPUTGROUP_HANDLE Input Input group handle. iRow INFA_INT32 Input Index number of the row in the block. The index is zero-based. You must verify the procedure only passes an index number that exists in the data block. If you pass an invalid value, the Integration Service shuts down unexpectedly. 136 Chapter 4: Custom Transformation Functions
Set Input Error Row Functions

When you use array-based access mode, you cannot return INFA_ROWERROR in the input row notification function. Instead, use the set input error row functions to notify the Integration Service that a particular input row has an error.

PowerCenter provides the following set input error row functions in array-based mode; a usage sketch follows the list:
♦ INFA_CTASetInputErrorRowM(). Notifies the Integration Service that a row in the input block has an error and outputs an MBCS error message to the session log. Use the following syntax:

      INFA_STATUS INFA_CTASetInputErrorRowM( INFA_CT_INPUTGROUP_HANDLE inputGroup, INFA_INT32 iRow, size_t nErrors, INFA_MBCSCHAR* sErrMsg );

   The function takes the following arguments:
   - inputGroup (INFA_CT_INPUTGROUP_HANDLE, input). Input group handle.
   - iRow (INFA_INT32, input). Index number of the row in the block. The index is zero-based. You must verify the procedure only passes an index number that exists in the data block. If you pass an invalid value, the Integration Service shuts down unexpectedly.
   - nErrors (size_t, input). The number of errors this input row has caused.
   - sErrMsg (INFA_MBCSCHAR*, input). MBCS string containing the error message you want the function to output. You must enter a null-terminated string. This parameter is optional. When you include this argument, the Integration Service prints the message in the session log, even when you enable row error logging.
♦ INFA_CTASetInputErrorRowU(). Notifies the Integration Service that a row in the input block has an error and outputs a Unicode error message to the session log. Use the following syntax:

      INFA_STATUS INFA_CTASetInputErrorRowU( INFA_CT_INPUTGROUP_HANDLE inputGroup, INFA_INT32 iRow, size_t nErrors, INFA_UNICHAR* sErrMsg );

   The function takes the following arguments:
   - inputGroup (INFA_CT_INPUTGROUP_HANDLE, input). Input group handle.
   - iRow (INFA_INT32, input). Index number of the row in the block. The index is zero-based. You must verify the procedure only passes an index number that exists in the data block. If you pass an invalid value, the Integration Service shuts down unexpectedly.
   - nErrors (size_t, input). The number of errors this input row has caused.
   - sErrMsg (INFA_UNICHAR*, input). Unicode string containing the error message you want the function to output. You must enter a null-terminated string. This parameter is optional. When you include this argument, the Integration Service prints the message in the session log, even when you enable row error logging.
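A minimal sketch of flagging a bad row with the MBCS variant. The function name and the error text are illustrative, and the sketch assumes INFA_MBCSCHAR is a single-byte character type; check the generated headers for the exact typedefs.

      // Report one error for the given row and write a message to the
      // session log. Returns the INFA_STATUS from the API call.
      INFA_STATUS flagBadRow(INFA_CT_INPUTGROUP_HANDLE inGroup, INFA_INT32 iRow)
      {
          static INFA_MBCSCHAR msg[] = "Row failed validation.";  // null-terminated

          return INFA_CTASetInputErrorRowM(inGroup, iRow, 1, msg);
      }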
Java API Functions

Information forthcoming.
C++ API Functions

Information forthcoming.
Chapter 5

Expression Transformation

This chapter includes the following topics:
♦ Overview, 142
♦ Creating an Expression Transformation, 143
Overview

Transformation type: Passive, Connected

Use the Expression transformation to calculate values in a single row before you write to the target. For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. Use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

Note: To perform calculations involving multiple rows, such as sums or averages, use the Aggregator transformation. Unlike the Expression transformation, the Aggregator lets you group and sort data. For more information, see “Aggregator Transformation” on page 37.

Calculating Values

To use the Expression transformation to calculate values for a single row, you must include the following ports:
♦ Input or input/output ports for each value used in the calculation. For example, when calculating the total price for an order, determined by multiplying the unit price by the quantity ordered, you need two input or input/output ports. One port provides the unit price and the other provides the quantity ordered.
♦ Output port for the expression. You enter the expression as a configuration option for the output port. The datatype of the output port must match the return value of the expression. For more information about entering expressions, see “Working with Expressions” on page 10. Expressions use the transformation language, which includes SQL-like functions, to perform calculations.

Adding Multiple Calculations

You can enter multiple expressions in a single Expression transformation. As long as you enter only one expression for each output port, you can create any number of output ports in the transformation. In this way, you can use one Expression transformation rather than creating separate transformations for each calculation that requires the same set of data. For example, you might want to calculate several types of withholding taxes from each employee paycheck, such as local and federal income tax, Social Security, and Medicare. Since all of these calculations require the employee salary, the withholding category, and/or the corresponding tax rate, you can create one Expression transformation with the salary and withholding category as input/output ports and a separate output port for each necessary calculation.
Creating an Expression Transformation

Use the following procedure to create an Expression transformation.

To create an Expression transformation:
1. In the Mapping Designer, click Transformation > Create. Select the Expression transformation. Enter a name for it (the convention is EXP_TransformationName) and click OK.
2. Create the input ports.
   If you have the input transformation available, you can select Link Columns from the Layout menu and then drag each port used in the calculation into the Expression transformation. With this method, the Designer copies the port into the new transformation and creates a connection between the two ports. Or, you can open the transformation and create each port manually.
   Note: If you want to make this transformation reusable, you must create each port manually within the transformation.
3. Repeat the previous step for each input port you want to add to the expression.
4. Create the output ports (O) you need, making sure to assign a port datatype that matches the expression return value. The naming convention for output ports is OUT_PORTNAME.
5. Click the small button that appears in the Expression section of the dialog box and enter the expression in the Expression Editor. To prevent typographic errors, use the port names and functions listed in the Expression Editor where possible. If you select a port name that is not connected to the transformation, the Designer copies the port into the new transformation and creates a connection between the two ports.
   Port names used as part of an expression in an Expression transformation follow stricter rules than port names in other types of transformations:
   ♦ A port name must begin with a single- or double-byte letter or single- or double-byte underscore (_).
   ♦ It can contain any of the following single- or double-byte characters: a letter, number, underscore (_), $, #, or @.
6. Check the expression syntax by clicking Validate. If necessary, make corrections to the expression and check the syntax again. Then save the expression and exit the Expression Editor.
7. Connect the output ports to the next transformation or target.
8. Select a tracing level on the Properties tab to determine the amount of transaction detail reported in the session log file.
9. Click Repository > Save.
Chapter 6

External Procedure Transformation

This chapter includes the following topics:
♦ Overview, 146
♦ Developing COM Procedures, 149
♦ Developing Informatica External Procedures, 159
♦ Distributing External Procedures, 169
♦ Development Notes, 171
♦ Service Process Variables in Initialization Properties, 180
♦ External Procedure Interfaces, 181
Overview

Transformation type: Passive, Connected/Unconnected

External Procedure transformations operate in conjunction with procedures you create outside of the Designer interface to extend PowerCenter functionality. Although the standard transformations provide you with a wide range of options, there are occasions when you might want to extend the functionality provided with PowerCenter. For example, the range of standard transformations, such as Expression and Filter transformations, may not provide the functionality you need. If you are an experienced programmer, you may want to develop complex functions within a dynamic link library (DLL) or UNIX shared library, instead of creating the necessary Expression transformations in a mapping.

To obtain this kind of extensibility, use the Transformation Exchange (TX) dynamic invocation interface built into PowerCenter. Using TX, you can create an Informatica External Procedure transformation and bind it to an external procedure that you have developed. You can bind External Procedure transformations to two kinds of external procedures:
♦ COM external procedures (available on Windows only)
♦ Informatica external procedures (available on Windows, AIX, HP-UX, Linux, and Solaris)

To use TX, you must be an experienced C, C++, or Visual Basic programmer. Use multi-threaded code in external procedures.

Code Page Compatibility

When the Integration Service runs in ASCII mode, the external procedure can process data in 7-bit ASCII. When the Integration Service runs in Unicode mode, the external procedure can process data that is two-way compatible with the Integration Service code page. For information about accessing the Integration Service code page, see “Code Page Access Functions” on page 185.

Configure the Integration Service to run in Unicode mode if the external procedure DLL or shared library contains multibyte characters. External procedures must use the same code page as the Integration Service to interpret input strings from the Integration Service and to create output strings that contain multibyte characters.

Configure the Integration Service to run in either ASCII or Unicode mode if the external procedure DLL or shared library contains ASCII characters only.
External Procedures and External Procedure Transformations

There are two components to TX: external procedures and External Procedure transformations.

As its name implies, an external procedure exists separately from the Integration Service. It consists of C, C++, or Visual Basic code written by a user to define a transformation. This code is compiled and linked into a DLL or shared library, which is loaded by the Integration Service at runtime. An external procedure is “bound” to an External Procedure transformation.

An External Procedure transformation is created in the Designer. It is an object that resides in the Informatica repository and serves several purposes:
1. It contains the metadata describing the associated external procedure. It is through this metadata that the Integration Service knows the “signature” (number and types of parameters, type of return value, if any) of the external procedure.
2. It allows an external procedure to be referenced in a mapping. By adding an instance of an External Procedure transformation to a mapping, you call the external procedure bound to that transformation.
   Note: You can create a connected or unconnected External Procedure.
3. When you develop Informatica external procedures, the External Procedure transformation provides the information required to generate Informatica external procedure stubs.

External Procedure Transformation Properties

Create reusable External Procedure transformations in the Transformation Developer, and add instances of the transformation to mappings. You cannot create External Procedure transformations in the Mapping Designer or Mapplet Designer.

External Procedure transformations return one or no output rows per input row.

On the Properties tab of the External Procedure transformation, only enter ASCII characters in the Module/Programmatic Identifier and Procedure Name fields. You cannot enter multibyte characters in these fields. On the Ports tab of the External Procedure transformation, only enter ASCII characters for the port names. You cannot enter multibyte characters for External Procedure transformation port names.

Pipeline Partitioning

If you purchase the Partitioning option with PowerCenter, you can increase the number of partitions in a pipeline to improve session performance. Increasing the number of partitions allows the Integration Service to create multiple connections to sources and process partitions of source data concurrently.

When you create a session, the Workflow Manager validates each pipeline in the mapping for partitioning. You can specify multiple partitions in a pipeline if the Integration Service can maintain data consistency when it processes the partitioned data.
Use the Is Partitionable property on the Properties tab to specify whether or not you can create multiple partitions in the pipeline. For more information about partitioning External Procedure transformations, see “Working with Partition Points” in the Workflow Administration Guide.

COM Versus Informatica External Procedures

Table 6-1 describes the differences between COM and Informatica external procedures:

Table 6-1. Differences Between COM and Informatica External Procedures

                      COM                              Informatica
   Technology         Uses COM technology              Uses Informatica proprietary technology
   Operating System   Runs on Windows only             Runs on all platforms supported for the Integration Service: Windows, AIX, HP-UX, Linux, Solaris
   Language           C, C++, VC++, VB, Perl, VJ++     C++ only

The BankSoft Example

The following sections use an example called BankSoft to illustrate how to develop COM and Informatica procedures. The BankSoft example uses a financial function, FV, to illustrate how to develop and call an external procedure. The FV procedure calculates the future value of an investment based on regular payments and a constant interest rate.
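For reference, the computation that every FV implementation in this chapter performs can be written as follows. The notation is ours, but the formula matches the C++ and Visual Basic code shown later in this chapter:

\[
FV = -\left[ PV\,(1+r)^{n} + Pmt\,(1 + r \cdot t)\,\frac{(1+r)^{n}-1}{r} \right]
\]

where r is the interest rate per period, n is the number of periods, Pmt is the payment per period, PV is the present value, and t is the payment type (0 for payments due at the end of each period, 1 for payments due at the beginning).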
Developing COM Procedures

You can develop COM external procedures using Microsoft Visual C++ or Visual Basic. The following sections describe how to create COM external procedures using Visual C++ and how to create COM external procedures using Visual Basic.

Steps for Creating a COM Procedure

To create a COM external procedure, complete the following steps:
1. Using Microsoft Visual C++ or Visual Basic, create a project.
2. Define a class with an IDispatch interface.
3. Add a method to the interface. This method is the external procedure that will be invoked from inside the Integration Service.
4. Compile and link the class into a dynamic link library.
5. Register the class in the local Windows registry.
6. Import the COM procedure in the Transformation Developer.
7. Create a mapping with the COM procedure.
8. Create a session using the mapping.

COM External Procedure Server Type

The Integration Service only supports in-process COM servers (that is, COM servers with Server Type: Dynamic Link Library). This is done to enhance performance. It is more efficient when processing large amounts of data to process the data in the same process, instead of forwarding it to a separate process on the same machine or a remote machine.

Using Visual C++ to Develop COM Procedures

C++ developers can use Visual C++ version 5.0 or later to develop COM procedures. The first task is to create a project.

Step 1. Create an ATL COM AppWizard Project

1. Launch Visual C++ and click File > New.
2. In the dialog box that appears, select the Projects tab.
3. Enter the project name and location.
   In the BankSoft example, you enter COM_VC_Banksoft as the project name, and c:\COM_VC_Banksoft as the directory.
4. Select the ATL COM AppWizard option in the projects list box and click OK.
   A wizard used to create COM projects in Visual C++ appears.
5. Set the Server Type to Dynamic Link Library, select the Support MFC option, and click Finish.
   The final page of the wizard appears.
6. Click OK to return to Visual C++.
7. Add a class to the new project.
8. On the next page of the wizard, click the OK button.
   The Developer Studio creates the basic project files.

Step 2. Add an ATL Object to a Project

1. In the Workspace window, select the Class View tab, right-click the tree item COM_VC_BankSoft.BSoftFin classes, and choose New ATL Object from the local menu that appears.
2. Highlight the Objects item in the left list box and select Simple Object from the list of object types.
3. Click Next.
4. In the Short Name field, enter a short name for the class you want to create.
   In the BankSoft example, use the name BSoftFin, since you are developing a financial function for the fictional company BankSoft. As you type into the Short Name field, the wizard fills in suggested names in the other fields.
5. Enter the programmatic identifier for the class.
   In the BankSoft example, change the ProgID (programmatic identifier) field to COM_VC_BankSoft.BSoftFin.
   A programmatic identifier, or ProgID, is the human-readable name for a class. Internally, classes are identified by numeric CLSIDs. For example:

      {33B17632-1D9F-11D1-8790-0000C044ACF9}

   The standard format of a ProgID is Project.Class[.Version]. In the Designer, you refer to COM classes through ProgIDs.
6. Select the Attributes tab and set the threading model to Free, the interface to Dual, and the aggregation setting to No.
7. Click OK.

Now that you have a basic class definition, you can add a method to it.

Step 3. Add the Required Methods to the Class

1. Return to the Classes View tab of the Workspace Window.
2. Expand the tree view.
   For the BankSoft example, you expand COM_VC_BankSoft.
3. Right-click the newly-added class.
   In the BankSoft example, you right-click the IBSoftFin tree item.
4. Click the Add Method menu item and enter the name of the method.
   In the BankSoft example, you enter FV.
5. In the Parameters field, enter the signature of the method.
   For FV, enter the following:

      [in] double Rate,
      [in] long nPeriods,
      [in] double Payment,
      [in] double PresentValue,
      [in] long PaymentType,
      [out, retval] double* FV

   This signature is expressed in terms of the Microsoft Interface Description Language (MIDL). For a complete description of MIDL, see the MIDL language reference. Note that:
   ♦ [in] indicates that the parameter is an input parameter.
   ♦ [out] indicates that the parameter is an output parameter.
   ♦ [out, retval] indicates that the parameter is the return value of the method.
   Also, note that all [out] parameters are passed by reference. In the BankSoft example, the parameter FV is a double.
6. Click OK.
   The Developer Studio adds to the project a stub for the method you added.

Step 4. Fill Out the Method Stub with an Implementation

1. In the BankSoft example, return to the Class View tab of the Workspace window and expand the COM_VC_BankSoft classes item.
2. Expand the CBSoftFin item.
3. Expand the IBSoftFin item under the above item.
4. Right-click the FV item and choose Go to Definition.
5. Position the cursor in the edit window on the line after the TODO comment and add the following code:

      double v = pow((1 + Rate), nPeriods);
      *FV = -(
         (PresentValue * v) +
         (Payment * (1 + (Rate * PaymentType))) * ((v - 1) / Rate)
      );
Since you refer to the pow function, you have to add the following preprocessor statement after all other include statements at the beginning of the file:

      #include <math.h>

The final step is to build the DLL. When you build it, you register the COM procedure with the Windows registry.

Step 5. Build the Project

1. Pull down the Build menu.
2. Select Rebuild All.
   As Developer Studio builds the project, it generates the following output:

      ------------Configuration: COM_VC_BankSoft - Win32 Debug--------------
      Performing MIDL step
      Microsoft (R) MIDL Compiler Version 3.01.75
      Copyright (c) Microsoft Corp 1991-1997. All rights reserved.
      Processing .\COM_VC_BankSoft.idl
      COM_VC_BankSoft.idl
      Processing C:\msdev\VC\INCLUDE\oaidl.idl
      oaidl.idl
      Processing C:\msdev\VC\INCLUDE\objidl.idl
      objidl.idl
      Processing C:\msdev\VC\INCLUDE\unknwn.idl
      unknwn.idl
      Processing C:\msdev\VC\INCLUDE\wtypes.idl
      wtypes.idl
      Processing C:\msdev\VC\INCLUDE\ocidl.idl
      ocidl.idl
      Processing C:\msdev\VC\INCLUDE\oleidl.idl
      oleidl.idl
      Compiling resources...
      Compiling...
      StdAfx.cpp
      Compiling...
      COM_VC_BankSoft.cpp
      BSoftFin.cpp
      Generating Code...
      Linking...
      Creating library Debug/COM_VC_BankSoft.lib and object Debug/COM_VC_BankSoft.exp
      Registering ActiveX Control...
      RegSvr32: DllRegisterServer in .\Debug\COM_VC_BankSoft.dll succeeded.
      COM_VC_BankSoft.dll - 0 error(s), 0 warning(s)

   Notice that Visual C++ compiles the files in the project, links them into a dynamic link library (DLL) called COM_VC_BankSoft.DLL, and registers the COM (ActiveX) class COM_VC_BankSoft.BSoftFin in the local registry.

Once the component is registered, it is accessible to the Integration Service running on that host.
For more information about how to package COM classes for distribution to other Integration Services, see “Distributing External Procedures” on page 169. For more information about how to use COM external procedures to call functions in a preexisting library of C or C++ functions, see “Wrapper Classes for Pre-Existing C/C++ Libraries or VB Functions” on page 173. For more information about how to use a class factory to initialize COM objects, see “Initializing COM and Informatica Modules” on page 175.

Step 6. Register a COM Procedure with the Repository

1. Open the Transformation Developer.
2. Click Transformation > Import External Procedure.
   The Import External COM Method dialog box appears.
3. Click the Browse button. Locate the COM procedure.
4. Select the COM DLL you created and click OK.
   In the BankSoft example, select COM_VC_Banksoft.DLL.
5. Under the Select Method tree view, expand the class node (in this example, BSoftFin).
6. Expand Methods.
7. Select the method you want (in this example, FV) and press OK.
   The Designer creates an External Procedure transformation.
8. Open the External Procedure transformation, and select the Properties tab.
   The transformation properties display. Enter ASCII characters in the Module/Programmatic Identifier and Procedure Name fields.
9. Click the Ports tab. Enter ASCII characters in the Port Name fields.
   For more information about mapping Visual C++ and Visual Basic datatypes to COM datatypes, see “COM Datatypes” on page 171.
10. Click OK, and then click Repository > Save.

The repository now contains the reusable transformation, so you can add instances of this transformation to mappings.
Step 7. Create a Source and a Target for a Mapping

Use the following SQL statements to create a source table and to populate this table with sample data:

      create table FVInputs(
         Rate float,
         nPeriods int,
         Payment float,
         PresentValue float,
         PaymentType int
      )

      insert into FVInputs values (.005,10,-200.00,-500.00,1)
      insert into FVInputs values (.01,12,-1000.00,0.00,0)
      insert into FVInputs values (.11/12,35,-2000.00,0.00,1)
      insert into FVInputs values (.005,12,-100.00,-1000.00,1)

Use the following SQL statement to create a target table:

      create table FVOutputs(
         FVin_ext_proc float
      )

Use the Source Analyzer and the Target Designer to import FVInputs and FVOutputs into the same folder as the one in which you created the COM_BSFV transformation.

Step 8. Create a Mapping to Test the External Procedure Transformation

Now create a mapping to test the External Procedure transformation:
1. In the Mapping Designer, create a new mapping named Test_BSFV.
2. Drag the source table FVInputs into the mapping.
3. Drag the target table FVOutputs into the mapping.
4. Drag the transformation COM_BSFV into the mapping.
5. Connect the Source Qualifier transformation ports to the External Procedure transformation ports as appropriate.
6. Connect the FV port in the External Procedure transformation to the FVIn_ext_proc target column.
7. Validate and save the mapping.

Step 9. Start the Integration Service

Start the Integration Service. Note that the service must be started on the same host as the one on which the COM component was registered.

Step 10. Run a Workflow to Test the Mapping

When the Integration Service runs the session in a workflow, it performs the following functions:
♦ Uses the COM runtime facilities to load the DLL and create an instance of the class.
♦ Uses the COM IDispatch interface to call the external procedure you defined once for every row that passes through the mapping.

Note: Multiple classes, each with multiple methods, can be defined within a single project. Each of these methods can be invoked as an external procedure.

To run a workflow to test the mapping:
1. In the Workflow Manager, create the session s_Test_BSFV from the Test_BSFV mapping.
2. Create a workflow that contains the session s_Test_BSFV.
3. Run the workflow.
   The Integration Service searches the registry for the entry for the COM_VC_BankSoft.BSoftFin class. This entry has information that allows the Integration Service to determine the location of the DLL that contains that class. The Integration Service loads the DLL, creates an instance of the class, and invokes the FV function for every row in the source table.

When the workflow finishes, the FVOutputs table should contain the following results:

      FVIn_ext_proc
      2581.403374
      12682.503013
      82846.246372
      2301.401830
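If you want to sanity-check these numbers without running a session, the same arithmetic can be reproduced in a small standalone program. The following sketch is not part of the TX interface or the BankSoft project; it simply re-implements the FV formula for the four sample rows:

      #include <cmath>
      #include <cstdio>

      // Same computation as the FV external procedure, for verification only.
      static double fv(double rate, long nPeriods, double payment,
                       double presentValue, long paymentType)
      {
          double v = pow(1 + rate, (double)nPeriods);
          return -((presentValue * v) +
                   (payment * (1 + (rate * paymentType))) * ((v - 1) / rate));
      }

      int main()
      {
          printf("%f\n", fv(.005, 10, -200.00, -500.00, 1));   // 2581.403374
          printf("%f\n", fv(.01, 12, -1000.00, 0.00, 0));      // 12682.503013
          printf("%f\n", fv(.11 / 12, 35, -2000.00, 0.00, 1)); // 82846.246372
          printf("%f\n", fv(.005, 12, -100.00, -1000.00, 1));  // 2301.401830
          return 0;
      }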
Developing COM Procedures with Visual Basic

Microsoft Visual Basic offers a different development environment for creating COM procedures. While the Basic language has different syntax and conventions, the development procedure has the same broad outlines as developing COM procedures in Visual C++.

Step 1. Create a Visual Basic Project with a Single Class

1. Launch Visual Basic and click File > New Project.
2. In the dialog box that appears, select ActiveX DLL as the project type and click OK.
   Visual Basic creates a new project named Project1. If the Project window does not display, type Ctrl+R, or click View > Project Explorer. If the Properties window does not display, press F4, or click View > Properties.
3. In the Project Explorer window for the new project, right-click the project and choose Project1 Properties from the menu that appears.
4. Enter the name of the new project.
   In the Project window, select Project1 and change the name in the Properties window to COM_VB_BankSoft.

Step 2. Change the Names of the Project and Class

1. Inside the Project Explorer, select the “Project – Project1” item, which should be the root item in the tree control. The project properties display in the Properties Window.
2. Select the Alphabetic tab in the Properties Window and change the Name property to COM_VB_BankSoft. This renames the root item in the Project Explorer to COM_VB_BankSoft (COM_VB_BankSoft).
3. Expand the COM_VB_BankSoft (COM_VB_BankSoft) item in the Project Explorer.
4. Expand the Class Modules item.
5. Select the Class1 (Class1) item. The properties of the class display in the Properties Window.
6. Select the Alphabetic tab in the Properties Window and change the Name property to BSoftFin.

By changing the name of the project and class, you specify that the programmatic identifier for the class you create is “COM_VB_BankSoft.BSoftFin.” Use this ProgID to refer to this class inside the Designer.

Step 3. Add a Method to the Class

Place the pointer inside the Code window and enter the following text:

      Public Function FV( _
         Rate As Double, _
         nPeriods As Long, _
         Payment As Double, _
         PresentValue As Double, _
         PaymentType As Long _
      ) As Double

         Dim v As Double
         v = (1 + Rate) ^ nPeriods
         FV = -( _
            (PresentValue * v) + _
            (Payment * (1 + (Rate * PaymentType))) * ((v - 1) / Rate) _
         )
      End Function

This Visual Basic FV function performs the same operation as the C++ FV function in “Using Visual C++ to Develop COM Procedures” on page 149.

Step 4. Build the Project

To build the project:
1. From the File menu, select Make COM_VB_BankSoft.DLL.
   A dialog box prompts you for the file location.
2. Enter the file location and click OK.

Visual Basic compiles the source code and creates the COM_VB_BankSoft.DLL in the location you specified. It also registers the class COM_VB_BankSoft.BSoftFin in the local registry.

Once the component is registered, it is accessible to the Integration Service running on that host.

For more information about how to package Visual Basic COM classes for distribution to other machines hosting the Integration Service, see “Distributing External Procedures” on page 169. For more information about how to use Visual Basic external procedures to call preexisting Visual Basic functions, see “Wrapper Classes for Pre-Existing C/C++ Libraries or VB Functions” on page 173.

To create the procedure, follow steps 6 to 9 of “Using Visual C++ to Develop COM Procedures” on page 149.
Developing Informatica External Procedures

You can create external procedures that run on 32-bit or 64-bit Integration Service machines. Complete the following steps to create an Informatica-style external procedure:
1. In the Transformation Developer, create an External Procedure transformation.
   The External Procedure transformation defines the signature of the procedure. The names of the ports, datatypes, and port type (input or output) must match the signature of the external procedure.
2. Generate the template code for the external procedure.
   When you execute this command, the Designer uses the information from the External Procedure transformation to create several C++ source code files and a makefile. One of these source code files contains a “stub” for the function whose signature you defined in the transformation.
3. Modify the code to add the procedure logic. Fill out the stub with an implementation and use a C++ compiler to compile and link the source code files into a dynamic link library or shared library.
   When the Integration Service encounters an External Procedure transformation bound to an Informatica procedure, it loads the DLL or shared library and calls the external procedure you defined.
4. Build the library and copy it to the Integration Service machine.
5. Create a mapping with the External Procedure transformation.
6. Run the session in a workflow.

We use the BankSoft example to illustrate how to implement this feature.

Step 1. Create the External Procedure Transformation

1. Open the Transformation Developer and create an External Procedure transformation.
2. Open the transformation and enter a name for it.
   In the BankSoft example, enter EP_extINF_BSFV.
3. Create a port for each argument passed to the procedure you plan to define. Be sure that you use the correct datatypes.
   To use the FV procedure as an example, you create the following ports: Rate, nPeriods, Payment, PresentValue, and PaymentType as input ports, and FV as an output port. The last port, FV, captures the return value from the procedure.
4. Select the Properties tab and configure the procedure as an Informatica procedure.
   In the BankSoft example, enter INF_BankSoft as the Module/Programmatic Identifier and use $PMExtProcDir as the Runtime Location.

   Note on Module/Programmatic Identifier:
   ♦ The module name is the base name of the dynamic link library (on Windows) or the shared object (on UNIX) that contains the external procedures.
   The following table describes how the module name determines the name of the DLL or shared object on the various platforms:

      Operating System   Module Identifier   Library File Name
      Windows            INF_BankSoft        INF_BankSoft.DLL
      AIX                INF_BankSoft        libINF_BankSoftshr.a
      HP-UX              INF_BankSoft        libINF_BankSoft.sl
      Linux              INF_BankSoft        libINF_BankSoft.so
      Solaris            INF_BankSoft        libINF_BankSoft.so.1

   Notes on Runtime Location:
   ♦ If you set the Runtime Location to $PMExtProcDir, then the Integration Service looks in the directory specified by the process variable $PMExtProcDir to locate the library.
   ♦ If you leave the Runtime Location property blank, the Integration Service uses the environment variable defined on the server platform to locate the dynamic link library or shared object. The following table describes the environment variables used to locate the DLL or shared object on the various platforms:

      Operating System   Environment Variable
      Windows            PATH
      AIX                LIBPATH
      HP-UX              SHLIB_PATH
      Linux              LD_LIBRARY_PATH
      Solaris            LD_LIBRARY_PATH

   ♦ You can hard code a path as the Runtime Location. This is not recommended since the path is specific to a single machine only.

   Note: You must copy all DLLs or shared libraries to the Runtime Location or to the environment variable defined on the Integration Service machine. The Integration Service fails to load the external procedure when it cannot locate the DLL, shared library, or a referenced file.
5. Click OK.
6. Click Repository > Save.

After you create the External Procedure transformation that calls the procedure, the next step is to generate the C++ files.
Step 2. Generate the C++ Files

After you create an External Procedure transformation, you generate the code. The Designer generates file names in lower case since files created on UNIX-mapped drives are always in lower case. The following rules apply to the generated files:
♦ File names. A prefix ‘tx’ is used for TX module files.
♦ Module class names. The generated code has class declarations for the module that contains the TX procedures. A prefix Tx is used for TX module classes. For example, if an External Procedure transformation has a module name Mymod, then the class name is TxMymod.

To generate the code for an external procedure:
1. Select the transformation and click Transformation > Generate Code.
2. Select the check box next to the name of the procedure you just created.
   In the BankSoft example, select INF_BankSoft.FV.
3. Specify the directory where you want to generate the files, and click Generate.
   The Designer creates a subdirectory, INF_BankSoft, in the directory you specified.

Each External Procedure transformation created in the Designer must specify a module and a procedure name. The Designer generates code in a single directory for all transformations sharing a common module name. Building the code in one directory creates a single shared library.

The Designer generates the following files:
♦ tx<moduleName>.h. Defines the external procedure module class. This class is derived from a base class TINFExternalModule60. No data members are defined for this class in the generated code. However, you can add new data members and methods here.
♦ tx<moduleName>.cpp. Implements the external procedure module class. You can expand the InitDerived() method to include initialization of any new data members you add (a sketch of such an expansion follows Example 2 below). The Integration Service calls the derived class InitDerived() method only when it successfully completes the base class Init() method.
   This file defines the signatures of all External Procedure transformations in the module. Any modification of these signatures leads to inconsistency with the External Procedure transformations defined in the Designer. Therefore, you should not change the signatures.
   This file also includes a C function CreateExternalModuleObject, which creates an object of the external procedure module class using the constructor defined in this file. The Integration Service calls CreateExternalModuleObject instead of directly calling the constructor.
♦ <procedureName>.cpp. The Designer generates one of these files for each external procedure in this module. This file contains the code that implements the procedure logic, such as data cleansing and filtering. For data cleansing, create code to read in
   values from the input ports and generate values for output ports. For filtering, create code to suppress generation of output rows by returning INF_NO_OUTPUT_ROW.
♦ stdafx.h. Stub file used for building on UNIX systems. The various *.cpp files include this file. On Windows systems, Visual Studio generates an stdafx.h file, which should be used instead of the Designer-generated file.
♦ version.cpp. This is a small file that carries the version number of this implementation. In earlier releases, external procedure implementation was handled differently. This file allows the Integration Service to determine the version of the external procedure module.
♦ makefile.aix, makefile.aix64, makefile.hp, makefile.hp64, makefile.hpparisc64, makefile.linux, makefile.sol. Make files for UNIX platforms. Use makefile.aix, makefile.hp, makefile.linux, and makefile.sol for 32-bit platforms. Use makefile.aix64 for 64-bit AIX platforms and makefile.hp64 for 64-bit HP-UX (Itanium) platforms.

Example 1

In the BankSoft example, the Designer generates the following files:
♦ txinf_banksoft.h. Contains declarations for module class TxINF_BankSoft and external procedure FV.
♦ txinf_banksoft.cpp. Contains code for module class TxINF_BankSoft.
♦ fv.cpp. Contains code for procedure FV.
♦ version.cpp. Returns TX version.
♦ stdafx.h. Required for compilation on UNIX. On Windows, stdafx.h is generated by Visual Studio.
♦ readme.txt. Contains general help information.

Example 2

If you create two External Procedure transformations with procedure names ‘Myproc1’ and ‘Myproc2,’ both with the module name Mymod, the Designer generates the following files:
♦ txmymod.h. Contains declarations for module class TxMymod and external procedures Myproc1 and Myproc2.
♦ txmymod.cpp. Contains code for module class TxMymod.
♦ myproc1.cpp. Contains code for procedure Myproc1.
♦ myproc2.cpp. Contains code for procedure Myproc2.
♦ version.cpp.
♦ stdafx.h.
♦ readme.txt.
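As noted above, tx<moduleName>.cpp is where you can expand InitDerived() to initialize any data members you add. The following minimal sketch uses the Mymod module of Example 2; the property name DefaultRate, the m_defaultRate data member, and the exact parameter types are illustrative, so match them to the generated header. The meaning of the parameters is described in “Initializing COM and Informatica Modules” on page 175.

      #include <cstring>
      #include <cstdlib>

      // Illustrative expansion of the generated InitDerived() stub: scan the
      // initialization properties and keep the ones this module understands.
      INF_RESULT TxMymod::InitDerived(
         unsigned long nInitProps, char** Properties, char** Values)
      {
         for (unsigned long i = 0; i < nInitProps; i++)
         {
            if (strcmp(Properties[i], "DefaultRate") == 0)
               m_defaultRate = atof(Values[i]);   // new data member on TxMymod
         }
         return INF_SUCCESS;
      }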
Step 3. Fill Out the Method Stub with Implementation

The final step is coding the procedure.

1. Open the <Procedure_Name>.cpp stub file generated for the procedure.
   In the BankSoft example, you open fv.cpp to code the TxINF_BankSoft::FV procedure.
2. Enter the C++ code for the procedure.
   The following code implements the FV procedure:

      INF_RESULT TxINF_BankSoft::FV()
      {
         // Input port values are mapped to the m_pInParamVector array in
         // the InitParams method. Use m_pInParamVector[i].IsValid() to check
         // if they are valid. Use m_pInParamVector[i].GetLong or GetDouble,
         // etc. to get their value. Generate output data into m_pOutParamVector.

         // TODO: Fill in implementation of the FV method here.

         ostrstream ss;
         char* s;
         INF_Boolean bVal;
         double v;

         TINFParam* Rate = &m_pInParamVector[0];
         TINFParam* nPeriods = &m_pInParamVector[1];
         TINFParam* Payment = &m_pInParamVector[2];
         TINFParam* PresentValue = &m_pInParamVector[3];
         TINFParam* PaymentType = &m_pInParamVector[4];

         TINFParam* FV = &m_pOutParamVector[0];

         bVal = INF_Boolean(
            Rate->IsValid() &&
            nPeriods->IsValid() &&
            Payment->IsValid() &&
            PresentValue->IsValid() &&
            PaymentType->IsValid()
         );

         if (bVal == INF_FALSE)
         {
            FV->SetIndicator(INF_SQL_DATA_NULL);
            return INF_SUCCESS;
         }
         v = pow((1 + Rate->GetDouble()), (double)nPeriods->GetLong());

         FV->SetDouble(
            -(
               (PresentValue->GetDouble() * v) +
               (Payment->GetDouble() *
                  (1 + (Rate->GetDouble() * PaymentType->GetLong()))) *
               ((v - 1) / Rate->GetDouble())
            )
         );

         ss << "The calculated future value is: " << FV->GetDouble() << ends;
         s = ss.str();

         (*m_pfnMessageCallback)(E_MSG_TYPE_LOG, 0, s);
         (*m_pfnMessageCallback)(E_MSG_TYPE_ERR, 0, s);

         delete [] s;

         return INF_SUCCESS;
      }

   The Designer generates the function profile, including the arguments and return value. You need to enter the actual code within the function, as indicated in the comments. Since you referenced the pow function and defined an ostrstream variable, you must also include the preprocessor statements.
   On Windows:

      #include <math.h>
      #include <strstrea.h>

   On UNIX, the include statements would be the following:

      #include <math.h>
      #include <strstream.h>

3. Save the modified file.

Step 4. Building the Module

On Windows, use Visual C++ to compile the DLL.

To build a DLL on Windows:
1. Start Visual C++.
2. Click File > New.
3. In the New dialog box, click the Projects tab and select the MFC AppWizard (DLL) option.
4. Enter its location.
   In the BankSoft example, you enter c:\pmclient\tx\INF_BankSoft, assuming you generated files in c:\pmclient\tx.
5. Enter the name of the project.
   It must be the same as the module name entered for the External Procedure transformation. In the BankSoft example, it is INF_BankSoft.
6. Click OK.
   Visual C++ now steps you through a wizard that defines all the components of the project.
7. In the wizard, click MFC Extension DLL (using shared MFC DLL).
8. Click Finish.
   The wizard generates several files.
9. Click Project > Add To Project > Files.
10. Navigate up a directory level. This directory contains the external procedure files you created. Select all .cpp files.
    In the BankSoft example, add the following files:
    ♦ fv.cpp
    ♦ txinf_banksoft.cpp
    ♦ version.cpp
11. Click Project > Settings.
12. Click the C/C++ tab, and select Preprocessor from the Category field.
13. In the Additional Include Directories field, enter ..; <pmserver install dir>\extproc\include.
14. Click the Link tab, and select General from the Category field.
15. Enter <pmserver install dir>\bin\pmtx.lib in the Object/Library Modules field.
16. Click OK.
17. Click Build > Build INF_BankSoft.dll or press F7 to build the project.

The compiler now creates the DLL and places it in the debug or release directory under the project directory. For information about running a workflow with the debug version, see “Running a Session with the Debug Version of the Module on Windows” on page 168.
To build shared libraries on UNIX:
1. If you cannot access the PowerCenter Client tools directly, copy all the files you need for the shared library to the UNIX machine where you plan to perform the build.
   For example, in the BankSoft procedure, use ftp or another mechanism to copy everything from the INF_BankSoft directory to the UNIX machine.
2. Set the environment variable INFA_HOME to the PowerCenter installation directory.
   Warning: If you specify an incorrect directory path for the INFA_HOME environment variable, the Integration Service cannot start.
3. Enter the command to make the project. The command depends on the version of UNIX, as summarized below:

      UNIX Version     Command
      AIX (32-bit)     make -f makefile.aix
      AIX (64-bit)     make -f makefile.aix64
      HP-UX (32-bit)   make -f makefile.hp
      HP-UX (64-bit)   make -f makefile.hp64
      Linux            make -f makefile.linux
      Solaris          make -f makefile.sol

Step 5. Create a Mapping

In the Mapping Designer, create a mapping that uses this External Procedure transformation.

Step 6. Run the Session in a Workflow

When you run the session in a workflow, the Integration Service looks in the directory you specify as the Runtime Location to find the library (DLL) you built in Step 4. The default value of the Runtime Location property in the session properties is $PMExtProcDir.

To run a session in a workflow:
1. In the Workflow Manager, create a workflow.
2. Create a session for this mapping in the workflow.
   Tip: Alternatively, you can create a re-usable session in the Task Developer and use it in the workflow.
3. Copy the library (DLL) to the Runtime Location directory.
4. Run the workflow containing the session.
Running a Session with the Debug Version of the Module on Windows

Informatica ships PowerCenter on Windows with the release build (pmtx.dll) and the debug build (pmtxdbg.dll) of the External Procedure transformation library. These libraries are installed in the server bin directory.

If you build a release version of the module in Step 4, run the session in a workflow to use the release build (pmtx.dll) of the External Procedure transformation library. You do not need to complete the following task.

If you build a debug version of the module in Step 4, follow the procedure below to use the debug build (pmtxdbg.dll) of the External Procedure transformation library.

To run a session using a debug version of the module:
1. In the Workflow Manager, create a workflow.
2. Create a session for this mapping in the workflow. Or, you can create a re-usable session in the Task Developer and use it in the workflow.
3. Copy the library (DLL) to the Runtime Location directory.
4. To use the debug build of the External Procedure transformation library:
   ♦ Preserve pmtx.dll by renaming it or moving it from the server bin directory.
   ♦ Rename pmtxdbg.dll to pmtx.dll.
5. Run the workflow containing the session.
6. To revert to the release build of the External Procedure transformation library:
   ♦ Rename pmtx.dll back to pmtxdbg.dll.
   ♦ Restore the original pmtx.dll file to its original name and location in the server bin directory.

Note: If you run a workflow containing this session with the debug version of the module on Windows, you must return the original pmtx.dll file to its original name and location before you can run a non-debug session.
Distributing External Procedures

Suppose you develop a set of external procedures and you want to make them available on multiple servers, each of which is running the Integration Service. The methods for doing this depend on the type of the external procedure and the operating system on which you built it. You can also use these procedures to distribute external procedures to external customers.

Distributing COM Procedures

Visual Basic and Visual C++ register COM classes in the local registry when you build the project. Once registered, these classes are accessible to the Integration Service running on the machine where you compiled the DLL. For example, if you build a project on HOST1, all the classes in the project will be registered in the HOST1 registry and will be accessible to the Integration Service running on HOST1. Suppose, however, that you also want the classes to be accessible to the Integration Service running on HOST2. For this to happen, the classes must be registered in the HOST2 registry.

Visual Basic provides a utility for creating a setup program that can install COM classes on a Windows machine and register these classes in the registry on that machine. While no utility is available in Visual C++, you can easily register the class yourself.

Figure 6-1 shows the process for distributing external procedures. It depicts three environments: Development (where the external procedure was developed using C++ or VB), the PowerCenter Client (bring the DLL here to run regsvr32 <xyz>.dll), and the Integration Service (bring the DLL here to execute regsvr32 <xyz>.dll).

To distribute a COM Visual Basic procedure:
1. After you build the DLL, exit Visual Basic and launch the Visual Basic Application Setup wizard.
2. Skip the first panel of the wizard.
3. On the second panel, specify the location of the project and select the Create a Setup Program option.
4. In the third panel, select the method of distribution you plan to use.
5. In the next panel, specify the directory to which you want to write the setup files.
   For simple ActiveX components, you can continue to the final panel of the wizard. Otherwise, you may need to add more information, depending on the type of file and the method of distribution.
6. Click Finish in the final panel.
   Visual Basic then creates the setup program for the DLL. Run this setup program on any Windows machine where the Integration Service is running.

To distribute a COM Visual C++/Visual Basic procedure manually:
1. Copy the DLL to the directory on the new Windows machine anywhere you want it saved.
2. Log in to this Windows machine and open a DOS prompt.
3. Navigate to the directory containing the DLL and execute the following command:

      REGSVR32 project_name.DLL

   project_name is the name of the DLL you created. In the BankSoft example, the project name is COM_VC_BankSoft.DLL or COM_VB_BankSoft.DLL.
   This command line program then registers the DLL and any COM classes contained in it.

Distributing Informatica Modules

You can distribute external procedures between repositories.

To distribute external procedures between repositories:
1. Move the DLL or shared object that contains the external procedure to a directory on a machine that the Integration Service can access.
2. Copy the External Procedure transformation from the original repository to the target repository using the Designer client tool.
   -or-
   Export the External Procedure transformation to an XML file and import it in the target repository.
   For more information, see “Exporting and Importing Objects” in the Repository Guide.
Development Notes

This section includes some additional guidelines and information about developing COM and Informatica external procedures.

COM Datatypes

When using either Visual C++ or Visual Basic to develop COM procedures, you need to use COM datatypes that correspond to the internal datatypes that the Integration Service uses when reading and transforming data. These datatype matches are important when the Integration Service attempts to map datatypes between ports in an External Procedure transformation and arguments (or return values) from the procedure the transformation calls.

Table 6-2 compares Visual C++ and transformation datatypes:

Table 6-2. Visual C++ and Transformation Datatypes

      Visual C++ COM Datatype   Transformation Datatype
      VT_I4                     Integer
      VT_UI4                    Integer
      VT_R8                     Double
      VT_BSTR                   String
      VT_DECIMAL                Decimal
      VT_DATE                   Date/Time

Table 6-3 compares Visual Basic and transformation datatypes:

Table 6-3. Visual Basic and Transformation Datatypes

      Visual Basic COM Datatype   Transformation Datatype
      Long                        Integer
      Double                      Double
      String                      String
      Decimal                     Decimal
      Date                        Date/Time

If you do not correctly match datatypes, the Integration Service may attempt a conversion. For example, if you assign the Integer datatype to a port, but the datatype for the corresponding argument is BSTR, the Integration Service attempts to convert the Integer value to a BSTR.
Row-Level Procedures

All External Procedure transformations call procedures using values from a single row passed through the transformation. You cannot use values from multiple rows in a single procedure call. For example, you could not code the equivalent of the aggregate functions SUM or AVG into a procedure call. In this sense, all external procedures must be stateless.

Return Values from Procedures

When you call a procedure, the Integration Service captures an additional return value beyond whatever return value you code into the procedure. This additional value indicates whether the Integration Service successfully called the procedure. For COM procedures, this return value uses the type HRESULT. Informatica procedures use the type INF_RESULT. If the value returned is S_OK/INF_SUCCESS, the Integration Service successfully called the procedure. You must return the appropriate value to indicate the success or failure of the external procedure.

Informatica procedures return four values; a usage sketch follows the list:
♦ INF_SUCCESS. The external procedure processed the row successfully. The Integration Service passes the row to the next transformation in the mapping.
♦ INF_NO_OUTPUT_ROW. The Integration Service does not write the current row due to external procedure logic. This is not an error. When you use INF_NO_OUTPUT_ROW to filter rows, the External Procedure transformation behaves similarly to the Filter transformation.
   Note: When you use INF_NO_OUTPUT_ROW in the external procedure, make sure you connect the External Procedure transformation to another transformation that receives rows from the External Procedure transformation only.
♦ INF_ROW_ERROR. Equivalent to a transformation error. The Integration Service discards the current row, but may process the next row unless you configure the session to stop on n errors.
♦ INF_FATAL_ERROR. Equivalent to an ABORT() function call. The Integration Service aborts the session and does not process any more rows.

For more information, see “Functions” in the Transformation Language Reference.
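The following is a minimal sketch of how a procedure body might use these return values to filter rows. The procedure name, the port, and the validation rule are hypothetical; the TINFParam accessors and m_pInParamVector are the generated members shown in the BankSoft implementation earlier in this chapter:

      // Hypothetical procedure: drop rows with a NULL amount, raise a row
      // error for negative amounts, and pass everything else through.
      INF_RESULT TxINF_BankSoft::ValidateAmount()
      {
         TINFParam* Amount = &m_pInParamVector[0];

         if (!Amount->IsValid())
            return INF_NO_OUTPUT_ROW;   // filter the row; not an error

         if (Amount->GetDouble() < 0)
            return INF_ROW_ERROR;       // discard the row as a transformation error

         return INF_SUCCESS;            // pass the row downstream
      }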
  • 205. Integration Service to stop. If COM or Informatica procedures cause such stops, review the source code for memory access problems. Memory Management for Procedures Since all the datatypes used in Informatica procedures are fixed length, there are no memory management issues for Informatica external procedures. For COM procedures, you need to allocate memory only if an [out] parameter from a procedure uses the BSTR datatype. In this case, you need to allocate memory on every call to this procedure. During a session, the Integration Service releases the memory after calling the function. Wrapper Classes for Pre-Existing C/C++ Libraries or VB Functions Suppose that BankSoft has a library of C or C++ functions and wants to plug these functions in to the Integration Service. In particular, the library contains BankSoft’s own implementation of the FV function, called PreExistingFV. The general method for doing this is the same for both COM and Informatica external procedures. A similar solution is available in Visual Basic. You need only make calls to preexisting Visual Basic functions or to methods on objects that are accessible to Visual Basic. Generating Error and Tracing Messages The implementation of the Informatica external procedure TxINF_BankSoft::FV in “Step 4. Building the Module” on page 165 contains the following lines of code. ostrstream ss; char* s; ... ss << "The calculated future value is: " << FV->GetDouble() << ends; s = ss.str(); (*m_pfnMessageCallback)(E_MSG_TYPE_LOG, 0, s); (*m_pfnMessageCallback)(E_MSG_TYPE_ERR, 0, s); delete [] s; When the Integration Service creates an object of type Tx<MODNAME>, it passes to its constructor a pointer to a callback function that can be used to write error or debugging messages to the session log. (The code for the Tx<MODNAME> constructor is in the file Tx<MODNAME>.cpp.) This pointer is stored in the Tx<MODNAME> member variable m_pfnMessageCallback. The type of this pointer is defined in a typedef in the file $PMExtProcDir/include/infemmsg.h: typedef void (*PFN_MESSAGE_CALLBACK)( enum E_MSG_TYPE eMsgType, unsigned long Code, char* Message ); Also defined in that file is the enumeration E_MSG_TYPE: enum E_MSG_TYPE { E_MSG_TYPE_LOG = 0, E_MSG_TYPE_WARNING, Development Notes 173
      enum E_MSG_TYPE {
         E_MSG_TYPE_LOG = 0,
         E_MSG_TYPE_WARNING,
         E_MSG_TYPE_ERR
      };

If you specify the eMsgType of the callback function as E_MSG_TYPE_LOG, the callback function writes a log message to the session log. If you specify E_MSG_TYPE_ERR, the callback function writes an error message to the session log. If you specify E_MSG_TYPE_WARNING, the callback function writes a warning message to the session log. Use these messages to provide a simple debugging capability in Informatica external procedures.

To debug COM external procedures, you may use the output facilities available from inside a Visual Basic or C++ class. For example, in Visual Basic use a MsgBox to print out the result of a calculation for each row. Of course, you want to do this only on small samples of data while debugging and make sure to remove the MsgBox before making a production run.

Note: Before attempting to use any output facilities from inside a Visual Basic or C++ class, you must add the following value to the registry:
1. Add the following entry to the Windows registry:

      HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\PowerMart\Parameters\MiscInfo\RunInDebugMode=Yes

   This option starts the Integration Service as a regular application, not a service. You can debug the Integration Service without changing the debug privileges for the Integration Service service while it is running.
2. Start the Integration Service from the command line, using the command PMSERVER.EXE.
   The Integration Service is now running in debug mode.

When you are finished debugging, make sure you remove this entry from the registry or set RunInDebugMode to No. Otherwise, when you attempt to start PowerCenter as a service, it will not start.
1. Stop the Integration Service and change the registry entry you added earlier to the following setting:

      HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\PowerMart\Parameters\MiscInfo\RunInDebugMode=No

2. Restart the Integration Service as a Windows service.
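For Informatica procedures, the callback described above is the whole tracing interface. A small illustrative helper that routes a warning message through it might look like this; the class and method names are ours, not generated code:

      // Illustrative helper on a TX module class: write a warning message
      // to the session log through the stored callback described above.
      void TxMymod::LogWarning(char* text)
      {
         (*m_pfnMessageCallback)(E_MSG_TYPE_WARNING, 0, text);
      }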
You are responsible for checking these indicators on entry to the external procedure and for setting them on exit. On entry, the indicators of all output parameters are explicitly set to INF_SQL_DATA_NULL, so if you do not reset these indicators before returning from the external procedure, you will just get NULLs for all the output parameters. The TINFParam class also supports functions for obtaining the metadata for a particular parameter. For a complete description of all the member functions of the TINFParam class, see the infemdef.h include file in the tx/include directory.

Note that one of the main advantages of Informatica external procedures over COM external procedures is that Informatica external procedures directly support indicator manipulation. That is, you can check an input parameter to see if it is NULL, and you can set an output parameter to NULL. COM provides no indicator support. Consequently, if a row entering a COM-style external procedure has any NULLs in it, the row cannot be processed. Use the default value facility in the Designer to overcome this shortcoming. However, it is not possible to pass NULLs out of a COM function.

Unconnected External Procedure Transformations

When you add an instance of an External Procedure transformation to a mapping, you can choose to connect it as part of the pipeline or leave it unconnected. Connected External Procedure transformations call the COM or Informatica procedure every time a row passes through the transformation.

To get return values from an unconnected External Procedure transformation, call it in an expression using the following syntax:

    :EXT.transformation_name(arguments)

When a row passes through the transformation containing the expression, the Integration Service calls the procedure associated with the External Procedure transformation. The expression captures the return value of the procedure through the External Procedure transformation return port, which should have the Result (R) option checked. For more information about expressions, see "Working with Expressions" on page 10.

Initializing COM and Informatica Modules

Some external procedures must be configured at initialization time. This initialization takes one of two forms, depending on the type of the external procedure:

1. Initialization of Informatica-style external procedures. The Tx<MODNAME> class, which contains the external procedure, also contains the initialization function, Tx<MODNAME>::InitDerived. The signature of this initialization function is well-known to the Integration Service and consists of three parameters:
   ♦ nInitProps. This parameter tells the initialization function how many initialization properties are being passed to it.
   ♦ Properties. This parameter is an array of nInitProp strings representing the names of the initialization properties. Development Notes 175
   ♦ Values. This parameter is an array of nInitProp strings representing the values of the initialization properties.

   The Integration Service first calls the Init() function in the base class. When the Init() function successfully completes, the base class calls the Tx<MODNAME>::InitDerived() function.

   The Integration Service creates the Tx<MODNAME> object and then calls the initialization function. It is the responsibility of the external procedure developer to supply that part of the Tx<MODNAME>::InitDerived() function that interprets the initialization properties and uses them to initialize the external procedure. Once the object is created and initialized, the Integration Service can call the external procedure on the object for each row.

2. Initialization of COM-style external procedures. The object that contains the external procedure (or EP object) does not contain an initialization function. Instead, another object (the CF object) serves as a class factory for the EP object. The CF object has a method that can create an EP object. The signature of the CF object method is determined from its type library.

   The Integration Service creates the CF object, and then calls the method on it to create the EP object, passing this method whatever parameters are required. This requires that the signature of the method consist of a set of input parameters, whose types can be determined from the type library, followed by a single output parameter that is an IUnknown** or an IDispatch** or a VARIANT* pointing to an IUnknown* or IDispatch*. The input parameters hold the values required to initialize the EP object and the output parameter receives the initialized object. The output parameter can have either the [out] or the [out, retval] attributes. That is, the initialized object can be returned either as an output parameter or as the return value of the method. 176 Chapter 6: External Procedure Transformation
The datatypes supported for the input parameters are the following COM VARIANT types:
♦ VT_UI1
♦ VT_BOOL
♦ VT_I2
♦ VT_UI2
♦ VT_I4
♦ VT_UI4
♦ VT_R4
♦ VT_R8
♦ VT_BSTR
♦ VT_CY
♦ VT_DATE

Setting Initialization Properties in the Designer

Enter external procedure initialization properties on the Initialization Properties tab of the Edit Transformations dialog box. The tab displays different fields, depending on whether the external procedure is COM-style or Informatica-style.

COM-style External Procedure transformations contain the following fields on the Initialization Properties tab:
♦ Programmatic Identifier for Class Factory. Enter the programmatic identifier of the class factory.
♦ Constructor. Specify the method of the class factory that creates the EP object. Development Notes 177
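To make the signature rules above concrete, the following is a hypothetical class factory method written in IDL. The method name and the two input parameters are invented for illustration; the actual signature of your constructor method is determined from the type library of your class factory:

    // Hypothetical class factory method: the [in] parameters initialize
    // the EP object; the final [out, retval] parameter receives the
    // initialized object.
    HRESULT CreateEPObject(
        [in] long lParam1,
        [in] BSTR bsParam2,
        [out, retval] IUnknown** ppEPObject);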
Figure 6-2 shows the Initialization Properties tab of a COM-style External Procedure transformation:

Figure 6-2. External Procedure Transformation Initialization Properties

You can enter an unlimited number of initialization properties to pass to the Constructor method for both COM-style and Informatica-style External Procedure transformations. To add a new initialization property, click the Add button. Enter the name of the parameter in the Property column and enter the value of the parameter in the Value column. For example, you can enter the following parameters:

Parameter    Value
Param1       abc
Param2       100
Param3       3.17

Note: You must create a one-to-one relation between the initialization properties you define in the Designer and the input parameters of the class factory constructor method. For example, if the constructor has n parameters with the last parameter being the output parameter that receives the initialized object, you must define n – 1 initialization properties in the Designer, one for each input parameter in the constructor method.

You can also use process variables in initialization properties. For information about process variables support in initialization properties, see "Service Process Variables in Initialization Properties" on page 180. 178 Chapter 6: External Procedure Transformation
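For an Informatica-style transformation, these properties arrive in the Tx<MODNAME>::InitDerived() parameter arrays described earlier. The following minimal sketch shows one way the body of that function might consume them; the member variable m_addFactor is a hypothetical example, the INF_SUCCESS result code is the one used by the generated module code, and the exact declaration is generated for you by the Designer in Tx<MODNAME>.cpp:

    INF_RESULT Tx<MODNAME>::InitDerived(
        unsigned long nInitProps, char** Properties, char** Values)
    {
        // Scan the property names and convert the matching values.
        for (unsigned long i = 0; i < nInitProps; i++)
        {
            if (strcmp(Properties[i], "Param2") == 0)
                m_addFactor = atoi(Values[i]);  // hypothetical member variable
        }
        return INF_SUCCESS;
    }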
  • 211. Other Files Distributed and Used in TX Following are the header files located under the path $PMExtProcDir/include that are needed for compiling external procedures: ♦ infconfg.h ♦ infem60.h ♦ infemdef.h ♦ infemmsg.h ♦ infparam.h ♦ infsigtr.h Following are the library files located under the path <PMInstallDir> that are needed for linking external procedures and running the session: ♦ libpmtx.a (AIX) ♦ libpmtx.sl (HP-UX) ♦ libpmtx.so (Linux) ♦ libpmtx.so (Solaris) ♦ pmtx.dll and pmtx.lib (Windows) Development Notes 179
Service Process Variables in Initialization Properties

PowerCenter supports built-in process variables in the External Procedure transformation initialization properties list. If the property values contain built-in process variables, the Integration Service expands them before passing them to the external procedure library. This can be very useful for writing portable External Procedure transformations.

Figure 6-3 shows an External Procedure transformation with five user-defined properties:

Figure 6-3. External Procedure Transformation Initialization Properties Tab

Table 6-4 contains the initialization properties and values for the External Procedure transformation in Figure 6-3:

Table 6-4. External Procedure Initialization Properties

Property       Value                        Expanded Value Passed to the External Procedure Library
mytempdir      $PMTempDir                   /tmp
memorysize     5000000                      5000000
input_file     $PMSourceFileDir/file.in     /data/input/file.in
output_file    $PMTargetFileDir/file.out    /data/output/file.out
extra_var      $some_other_variable         $some_other_variable

When you run the workflow, the Integration Service expands the property list and passes it to the external procedure initialization function. Assuming that the values of the built-in process variables $PMTempDir is /tmp, $PMSourceFileDir is /data/input, and $PMTargetFileDir is /data/output, the last column in Table 6-4 contains the property and expanded value information. Note that the Integration Service does not expand the last property "$some_other_variable" because it is not a built-in process variable. 180 Chapter 6: External Procedure Transformation
  • 213. External Procedure Interfaces The Integration Service uses the following major functions with External Procedures: ♦ Dispatch ♦ External procedure ♦ Property access ♦ Parameter access ♦ Code page access ♦ Transformation name access ♦ Procedure access ♦ Partition related ♦ Tracing level Dispatch Function The Integration Service calls the dispatch function to pass each input row to the external procedure module. The dispatch function, in turn, calls the external procedure function you specify. External procedures access the ports in the transformation directly using the member variable m_pInParamVector for input ports and m_pOutParamVector for output ports. Signature The dispatch function has a fixed signature which includes one index parameter. virtual INF_RESULT Dispatch(unsigned long ProcedureIndex) = 0 External Procedure Function The external procedure function is the main entry point into the external procedure module, and is an attribute of the External Procedure transformation. The dispatch function calls the external procedure function for every input row. For External Procedure transformations, use the external procedure function for input and output from the external procedure module. The function can access the IN and IN-OUT port values for every input row, and can set the OUT and IN-OUT port values. The external procedure function contains all the input and output processing logic. Signature The external procedure function has no parameters. The input parameter array is already passed through the InitParams() method and stored in the member variable m_pInParamVector. Each entry in the array matches the corresponding IN and IN-OUT ports of the External Procedure transformation, in the same order. The Integration Service fills this vector before calling the dispatch function. External Procedure Interfaces 181
Use the member variable m_pOutParamVector to pass the output row before returning from the Dispatch() function.

For the MyExternal Procedure transformation, the external procedure function is the following, where the input parameters are in the member variable m_pInParamVector and the output values are in the member variable m_pOutParamVector:

    INF_RESULT Tx<ModuleName>::MyFunc()

Property Access Functions

The property access functions provide information about the initialization properties associated with the External Procedure transformation. The initialization property names and values appear on the Initialization Properties tab when you edit the External Procedure transformation.

Informatica provides property access functions in both the base class and the TINFConfigEntriesList class. Use the GetConfigEntryName() and GetConfigEntryValue() functions in the TINFConfigEntriesList class to access the initialization property name and value, respectively.

Signature

Informatica provides the following functions in the base class:

    TINFConfigEntriesList* TINFBaseExternalModule60::accessConfigEntriesList();
    const char* GetConfigEntry(const char* LHS);

Informatica provides the following functions in the TINFConfigEntriesList class:

    const char* TINFConfigEntriesList::GetConfigEntryValue(const char* LHS);
    const char* TINFConfigEntriesList::GetConfigEntryValue(int i);
    const char* TINFConfigEntriesList::GetConfigEntryName(int i);
    const char* TINFConfigEntriesList::GetConfigEntry(const char* LHS)

Note: In the TINFConfigEntriesList class, use the GetConfigEntryName() and GetConfigEntryValue() property access functions to access the initialization property names and values.

You can call these functions from a TX program. The TX program then converts this string value into a number, for example by using atoi or sscanf. In the following example, "addFactor" is an initialization property. accessConfigEntriesList() is a member function of the TX base class and does not need to be defined.

    const char* addFactorStr = accessConfigEntriesList()->
        GetConfigEntryValue("addFactor"); 182 Chapter 6: External Procedure Transformation
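Continuing the example, the returned string can be converted to a number before use, as the note above suggests. This is a minimal sketch; the NULL check assumes GetConfigEntryValue() returns NULL when the property is not defined, which you can verify against the include files:

    int addFactor = 0;
    if (addFactorStr != NULL)
        addFactor = atoi(addFactorStr);   // e.g., "100" becomes 100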
Parameter Access Functions

Parameter access functions are datatype specific. Use the parameter access function GetDataType to return the datatype of a parameter. Then use a parameter access function corresponding to this datatype to return information about the parameter.

A parameter passed to an external procedure belongs to the datatype TINFParam*. The header file infparam.h defines the related access functions. The Designer generates stub code that includes comments indicating the parameter datatype. You can also determine the datatype of a parameter in the corresponding External Procedure transformation in the Designer.

Signature

A parameter passed to an external procedure is a pointer to an object of the TINFParam class. This fixed-signature function is a method of that class and returns the parameter datatype as an enum value. The valid datatypes are:

    INF_DATATYPE_LONG
    INF_DATATYPE_STRING
    INF_DATATYPE_DOUBLE
    INF_DATATYPE_RAW
    INF_DATATYPE_TIME

Table 6-5 lists a brief description of some parameter access functions:

Table 6-5. Descriptions of Parameter Access Functions

INF_DATATYPE GetDataType(void);
    Gets the datatype of a parameter. Use the parameter datatype to determine which datatype-specific function to use when accessing parameter values.
INF_Boolean IsValid(void);
    Verifies that input data is valid. Returns FALSE if the parameter contains truncated data and is a string.
INF_Boolean IsNULL(void);
    Verifies that input data is NULL.
INF_Boolean IsInputMapped(void);
    Verifies that the input port passing data to this parameter is connected to a transformation.
INF_Boolean IsOutputMapped(void);
    Verifies that the output port receiving data from this parameter is connected to a transformation.
INF_Boolean IsInput(void);
    Verifies that the parameter corresponds to an input port.
INF_Boolean IsOutput(void);
    Verifies that the parameter corresponds to an output port.
INF_Boolean GetName(void);
    Gets the name of the parameter. External Procedure Interfaces 183
Table 6-5. Descriptions of Parameter Access Functions (continued)

SQLIndicator GetIndicator(void);
    Gets the value of a parameter indicator. The IsValid and IsNULL functions are special cases of this function. This function can also return INF_SQL_DATA_TRUNCATED.
void SetIndicator(SQLIndicator Indicator);
    Sets an output parameter indicator, such as invalid or truncated.
long GetLong(void);
    Gets the value of a parameter having a Long or Integer datatype. Call this function only if you know the parameter datatype is Integer or Long. This function does not convert data to Long from another datatype.
double GetDouble(void);
    Gets the value of a parameter having a Float or Double datatype. Call this function only if you know the parameter datatype is Float or Double. This function does not convert data to Double from another datatype.
char* GetString(void);
    Gets the value of a parameter as a null-terminated string. Call this function only if you know the parameter datatype is String. This function does not convert data to String from another datatype. The value in the pointer changes when the next row of data is read. If you want to store the value from a row for later use, explicitly copy this string into its own allocated buffer.
char* GetRaw(void);
    Gets the value of a parameter as a non-null terminated byte array. Call this function only if you know the parameter datatype is Raw. This function does not convert data to Raw from another datatype.
unsigned long GetActualDataLen(void);
    Gets the current length of the array returned by GetRaw.
TINFTime GetTime(void);
    Gets the value of a parameter having a Date/Time datatype. Call this function only if you know the parameter datatype is Date/Time. This function does not convert data to Date/Time from another datatype.
void SetLong(long lVal);
    Sets the value of an output parameter having a Long datatype.
void SetDouble(double dblVal);
    Sets the value of an output parameter having a Double datatype.
void SetString(char* sVal);
    Sets the value of an output parameter having a String datatype.
void SetRaw(char* rVal, size_t ActualDataLen);
    Sets a non-null terminated byte array.
void SetTime(TINFTime timeVal);
    Sets the value of an output parameter having a Date/Time datatype.

Only use the SetInt32 or GetInt32 function when you run the external procedure on a 64-bit Integration Service. Do not use any of the following functions:
♦ GetLong
♦ SetLong
♦ GetpLong
♦ GetpDouble
♦ GetpTime

Pass the parameters using two parameter lists. 184 Chapter 6: External Procedure Transformation
Table 6-6 lists the member variables of the external procedure base class.

Table 6-6. Member Variables of the External Procedure Base Class

m_nInParamCount
    Number of input parameters.
m_pInParamVector
    Actual input parameter array.
m_nOutParamCount
    Number of output parameters.
m_pOutParamVector
    Actual output parameter array.

Note: Ports defined as input/output show up in both parameter lists.

Code Page Access Functions

Informatica provides two code page access functions that return the code page of the Integration Service and two that return the code page of the data the external procedure processes. When the Integration Service runs in Unicode mode, the string data passing to the external procedure program can contain multibyte characters. The code page determines how the external procedure interprets a multibyte character string. When the Integration Service runs in Unicode mode, data processed by the external procedure program must be two-way compatible with the Integration Service code page.

Signature

Use the following functions to obtain the Integration Service code page through the external procedure program. Both functions return equivalent information.

    int GetServerCodePageID() const;
    const char* GetServerCodePageName() const;

Use the following functions to obtain the code page of the data the external procedure processes through the external procedure program. Both functions return equivalent information.

    int GetDataCodePageID(); // returns 0 in case of error
    const char* GetDataCodePageName() const; // returns NULL in case of error

Transformation Name Access Functions

Informatica provides two transformation name access functions that return the name of the External Procedure transformation. The GetWidgetName() function returns the name of the transformation, and the GetWidgetInstanceName() function returns the name of the transformation instance in the mapplet or mapping.

Signature

The char* returned by the transformation name access functions is an MBCS string in the code page of the Integration Service. It is not in the data code page. External Procedure Interfaces 185
    const char* GetWidgetInstanceName() const;
    const char* GetWidgetName() const;

Procedure Access Functions

Informatica provides two procedure access functions that provide information about the external procedure associated with the External Procedure transformation. The GetProcedureName() function returns the name of the external procedure specified in the Procedure Name field of the External Procedure transformation. The GetProcedureIndex() function returns the index of the external procedure.

Signature

Use the following function to get the name of the external procedure associated with the External Procedure transformation:

    const char* GetProcedureName() const;

Use the following function to get the index of the external procedure associated with the External Procedure transformation:

    inline unsigned long GetProcedureIndex() const;

Partition Related Functions

Use partition related functions for external procedures in sessions with multiple partitions. When you partition a session that contains External Procedure transformations, the Integration Service creates instances of these transformations for each partition. For example, if you define five partitions for a session, the Integration Service creates five instances of each external procedure at session runtime.

Signature

Use the following function to obtain the number of partitions in a session:

    unsigned long GetNumberOfPartitions();

Use the following function to obtain the index of the partition that called this external procedure:

    unsigned long GetPartitionIndex(); 186 Chapter 6: External Procedure Transformation
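As a sketch of how these two functions might be used together, the following fragment gives each partition's instance of the procedure its own scratch file so that concurrent instances do not collide. The path and file-name scheme are invented for illustration:

    unsigned long nParts = GetNumberOfPartitions();
    unsigned long myPart = GetPartitionIndex();

    // Build a partition-specific file name (path and naming are hypothetical).
    char scratchName[256];
    sprintf(scratchName, "/tmp/extproc_scratch_%lu_of_%lu.dat", myPart, nParts);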
Tracing Level Function

The tracing level function returns the session trace level, for example:

    typedef enum {
        TRACE_UNSET = 0,
        TRACE_TERSE = 1,
        TRACE_NORMAL = 2,
        TRACE_VERBOSE_INIT = 3,
        TRACE_VERBOSE_DATA = 4
    } TracingLevelType;

Signature

Use the following function to return the session trace level:

    TracingLevelType GetSessionTraceLevel(); External Procedure Interfaces 187
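One way to use the trace level, sketched here assuming the message callback described earlier in this chapter and the FV port from the BankSoft example, is to emit row-level diagnostics only at the most verbose setting:

    if (GetSessionTraceLevel() >= TRACE_VERBOSE_DATA)
    {
        // Log per-row detail only when the session runs at Verbose Data tracing.
        char msg[128];
        sprintf(msg, "FV input for this row: %f", FV->GetDouble());
        (*m_pfnMessageCallback)(E_MSG_TYPE_LOG, 0, msg);
    }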
  • 221. Chapter 7 Filter Transformation This chapter includes the following topics: ♦ Overview, 190 ♦ Filter Condition, 192 ♦ Creating a Filter Transformation, 193 ♦ Tips, 195 ♦ Troubleshooting, 196 189
Overview

Transformation type: Active, Connected

You can filter rows in a mapping with the Filter transformation. You pass all the rows from a source transformation through the Filter transformation, and then enter a filter condition for the transformation. All ports in a Filter transformation are input/output, and only rows that meet the condition pass through the Filter transformation.

In some cases, you need to filter data based on one or more conditions before writing it to targets. For example, if you have a human resources target containing information about current employees, you might want to filter out employees who are part-time and hourly.

The mapping in Figure 7-1 passes the rows from a human resources table that contains employee data through a Filter transformation. The filter only allows rows through for employees that make salaries greater than $30,000.

Figure 7-1. Sample Mapping with a Filter Transformation 190 Chapter 7: Filter Transformation
Figure 7-2 shows the filter condition used in the mapping in Figure 7-1 on page 190:

Figure 7-2. Specifying a Filter Condition in a Filter Transformation

With the filter of SALARY > 30000, only rows where employees make salaries greater than $30,000 pass through to the target.

As an active transformation, the Filter transformation may change the number of rows passed through it. A filter condition returns TRUE or FALSE for each row that passes through the transformation, depending on whether a row meets the specified condition. Only rows that return TRUE pass through this transformation. Discarded rows do not appear in the session log or reject files.

To maximize session performance, include the Filter transformation as close to the sources in the mapping as possible. Rather than passing rows you plan to discard through the mapping, you then filter out unwanted data early in the flow of data from sources to targets.

You cannot concatenate ports from more than one transformation into the Filter transformation. The input ports for the filter must come from a single transformation. The Filter transformation does not allow setting output default values. Overview 191
Filter Condition

You use the transformation language to enter the filter condition. The condition is an expression that returns TRUE or FALSE. For example, if you want to filter out rows for employees whose salary is less than $30,000, you enter the following condition:

    SALARY > 30000

You can specify multiple components of the condition, using the AND and OR logical operators. If you want to filter out employees who make less than $30,000 or more than $100,000, you enter the following condition:

    SALARY > 30000 AND SALARY < 100000

You do not need to specify TRUE or FALSE as values in the expression. TRUE and FALSE are implicit return values from any condition you set. If the filter condition evaluates to NULL, the row is assumed to be FALSE.

Enter conditions using the Expression Editor, available from the Properties tab of the Filter transformation. The filter condition is case sensitive. Any expression that returns a single value can be used as a filter.

You can also enter a constant for the filter condition. The numeric equivalent of FALSE is zero (0). Any non-zero value is the equivalent of TRUE. For example, if you have a port called NUMBER_OF_UNITS with a numeric datatype, a filter condition of NUMBER_OF_UNITS returns FALSE if the value of NUMBER_OF_UNITS equals zero. Otherwise, the condition returns TRUE.

After entering the expression, you can validate it by clicking the Validate button in the Expression Editor. When you enter an expression, validate it before continuing to avoid saving an invalid mapping to the repository. If a mapping contains syntax errors in an expression, you cannot run any session that uses the mapping until you correct the error. 192 Chapter 7: Filter Transformation
Creating a Filter Transformation

Creating a Filter transformation requires inserting the new transformation into the mapping, adding the appropriate input/output ports, and writing the condition.

To create a Filter transformation:

1. In the Designer, switch to the Mapping Designer and open a mapping.
2. Click Transformation > Create. Select Filter transformation, and enter the name of the new transformation. The naming convention for the Filter transformation is FIL_TransformationName. Click Create, and then click Done.
3. Select and drag all the ports from a source qualifier or other transformation to add them to the Filter transformation. After you select and drag ports, copies of these ports appear in the Filter transformation. Each column has both an input and an output port.
4. Double-click the title bar of the new transformation.
5. Click the Properties tab. A default condition appears in the list of conditions. The default condition is TRUE (a constant with a numeric value of 1).
6. Click the Value section of the condition, and then click the Open button. The Expression Editor appears. Creating a Filter Transformation 193
  • 226. 7. Enter the filter condition you want to apply. Use values from one of the input ports in the transformation as part of this condition. However, you can also use values from output ports in other transformations. 8. Click Validate to check the syntax of the conditions you entered. You may have to fix syntax errors before continuing. 9. Click OK. 10. Select the Tracing Level, and click OK to return to the Mapping Designer. 11. Click Repository > Save to save the mapping. 194 Chapter 7: Filter Transformation
  • 227. Tips Use the Filter transformation early in the mapping. To maximize session performance, keep the Filter transformation as close as possible to the sources in the mapping. Rather than passing rows that you plan to discard through the mapping, you can filter out unwanted data early in the flow of data from sources to targets. Use the Source Qualifier transformation to filter. The Source Qualifier transformation provides an alternate way to filter rows. Rather than filtering rows from within a mapping, the Source Qualifier transformation filters rows when read from a source. The main difference is that the source qualifier limits the row set extracted from a source, while the Filter transformation limits the row set sent to a target. Since a source qualifier reduces the number of rows used throughout the mapping, it provides better performance. However, the Source Qualifier transformation only lets you filter rows from relational sources, while the Filter transformation filters rows from any type of source. Also, note that since it runs in the database, you must make sure that the filter condition in the Source Qualifier transformation only uses standard SQL. The Filter transformation can define a condition using any statement or transformation function that returns either a TRUE or FALSE value. For more information about setting a filter for a Source Qualifier transformation, see “Source Qualifier Transformation” on page 445. Tips 195
  • 228. Troubleshooting I imported a flat file into another database (Microsoft Access) and used SQL filter queries to determine the number of rows to import into the Designer. But when I import the flat file into the Designer and pass data through a Filter transformation using equivalent SQL statements, I do not import as many rows. Why is there a difference? You might want to check two possible solutions: ♦ Case sensitivity. The filter condition is case sensitive, and queries in some databases do not take this into account. ♦ Appended spaces. If a field contains additional spaces, the filter condition needs to check for additional spaces for the length of the field. Use the RTRIM function to remove additional spaces. How do I filter out rows with null values? To filter out rows containing null values or spaces, use the ISNULL and IS_SPACES functions to test the value of the port. For example, if you want to filter out rows that contain NULLs in the FIRST_NAME port, use the following condition: IIF(ISNULL(FIRST_NAME),FALSE,TRUE) This condition states that if the FIRST_NAME port is NULL, the return value is FALSE and the row should be discarded. Otherwise, the row passes through to the next transformation. For more information about the ISNULL and IS_SPACES functions, see “Functions” in the Transformation Language Reference. 196 Chapter 7: Filter Transformation
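To screen out both NULLs and space-filled values in a single condition, the ISNULL and IS_SPACES functions can be combined. This is a sketch using the FIRST_NAME port from the example above:

    IIF(ISNULL(FIRST_NAME) OR IS_SPACES(FIRST_NAME), FALSE, TRUE)

Rows where FIRST_NAME is NULL or contains only spaces return FALSE and are discarded; all other rows pass through to the next transformation.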
  • 229. Chapter 8 HTTP Transformation This chapter includes the following topics: ♦ Overview, 198 ♦ Creating an HTTP Transformation, 200 ♦ Configuring the Properties Tab, 202 ♦ Configuring the HTTP Tab, 204 ♦ Examples, 209 197
Overview

Transformation type: Passive, Connected

The HTTP transformation enables you to connect to an HTTP server to use its services and applications. When you run a session with an HTTP transformation, the Integration Service connects to the HTTP server and issues a request to retrieve data from or update data on the HTTP server, depending on how you configure the transformation:

♦ Read data from an HTTP server. When the Integration Service reads data from an HTTP server, it retrieves the data from the HTTP server and passes the data to the target or a downstream transformation in the mapping. For example, you can connect to an HTTP server to read current inventory data, perform calculations on the data during the PowerCenter session, and pass the data to the target.
♦ Update data on the HTTP server. When the Integration Service writes to an HTTP server, it posts data to the HTTP server and passes HTTP server responses to the target or a downstream transformation in the mapping. For example, you can post data providing scheduling information from upstream transformations to the HTTP server during a session.

Figure 8-1 shows how the Integration Service processes an HTTP transformation:

Figure 8-1. HTTP Transformation Processing (diagram: the source feeds the HTTP transformation, the Integration Service exchanges an HTTP request and response with the HTTP server, and the response flows to the target)

The Integration Service passes data from upstream transformations or the source to the HTTP transformation, reads a URL configured in the HTTP transformation or application connection, and sends an HTTP request to the HTTP server to either read or update data.

Requests contain header information and may contain body information. The header contains information such as authentication parameters, commands to activate programs or web services residing on the HTTP server, and other information that applies to the entire HTTP request. The body contains the data the Integration Service sends to the HTTP server. 198 Chapter 8: HTTP Transformation
  • 231. When the Integration Service sends a request to read data, the HTTP server sends back an HTTP response with the requested data. The Integration Service sends the requested data to downstream transformations or the target. When the Integration Service sends a request to update data, the HTTP server writes the data it receives and sends back an HTTP response that the update succeeded. The HTTP transformation considers response codes 200 and 202 as a success. It considers all other response codes as failures. The session log displays an error when an HTTP server passes a response code that is considered a failure to the HTTP transformation. The Integration Service then sends the HTTP response to downstream transformations or the target. You can configure the HTTP transformation for the headers of HTTP responses. HTTP response body data passes through the HTTPOUT output port. Authentication The HTTP transformation uses the following forms of authentication: ♦ Basic. Based on a non-encrypted user name and password. ♦ Digest. Based on an encrypted user name and password. ♦ NTLM. Based on encrypted user name, password, and domain. Connecting to the HTTP Server When you configure an HTTP transformation, you can configure the URL for the connection. You can also create an HTTP connection object in the Workflow Manager. Configure an HTTP application connection in the following circumstances: ♦ The HTTP server requires authentication. ♦ You want to configure the connection timeout. ♦ You want to override the base URL in the HTTP transformation. For information about configuring the HTTP connection object, see the Workflow Administration Guide. Overview 199
  • 232. Creating an HTTP Transformation You create HTTP transformations in the Transformation Developer or in the Mapping Designer. An HTTP transformation has the following tabs: ♦ Transformation. Configure the name and description for the transformation. ♦ Ports. View input and output ports for the transformation. You cannot add or edit ports on the Ports tab. The Designer creates ports on the Ports tab when you add ports to the header group on the HTTP tab. For more information, see “Configuring Groups and Ports” on page 205. ♦ Properties. Configure properties for the HTTP transformation on the Properties tab. For more information, see “Configuring the Properties Tab” on page 202. ♦ Initialization Properties. You can define properties that the external procedure uses at run time, such as during initialization. For more information about creating initialization properties, see “Working with Procedure Properties” on page 72. ♦ Metadata Extensions. You can specify the property name, datatype, precision, and value. Use metadata extensions for passing information to the procedure. For more information about creating metadata extensions, see “Metadata Extensions” in the Repository Guide. ♦ Port Attribute Definitions. You can view port attributes for HTTP transformation ports. You cannot edit port attribute definitions. ♦ HTTP. Configure the method, ports, and URL on the HTTP tab. For more information, see “Configuring the HTTP Tab” on page 204. 200 Chapter 8: HTTP Transformation
  • 233. Figure 8-2 shows an HTTP transformation: Figure 8-2. HTTP Transformation To create an HTTP transformation: 1. In the Transformation Developer or Mapping Designer, click Transformation > Create. 2. Select HTTP transformation. 3. Enter a name for the transformation. 4. Click Create. The HTTP transformation displays in the workspace. 5. Click Done. 6. Configure the tabs in the transformation. Creating an HTTP Transformation 201
Configuring the Properties Tab

The HTTP transformation is built using the Custom transformation. Some Custom transformation properties do not apply to the HTTP transformation or are not configurable.

Figure 8-3 shows the Properties tab of an HTTP transformation:

Figure 8-3. HTTP Transformation Properties Tab

Table 8-1 describes the HTTP transformation properties that you can configure:

Table 8-1. HTTP Transformation Properties

Runtime Location
    Location that contains the DLL or shared library. Default is $PMExtProcDir. Enter a path relative to the Integration Service machine that runs the session using the HTTP transformation. If you make this property blank, the Integration Service uses the environment variable defined on the Integration Service machine to locate the DLL or shared library. You must copy all DLLs or shared libraries to the runtime location or to the environment variable defined on the Integration Service machine. The Integration Service fails to load the procedure when it cannot locate the DLL, shared library, or a referenced file.
Tracing Level
    Amount of detail displayed in the session log for this transformation. Default is Normal. 202 Chapter 8: HTTP Transformation
Table 8-1. HTTP Transformation Properties (continued)

Is Partitionable
    Indicates if you can create multiple partitions in a pipeline that uses this transformation:
    - No. The transformation cannot be partitioned. The transformation and other transformations in the same pipeline are limited to one partition.
    - Locally. The transformation can be partitioned, but the Integration Service must run all partitions in the pipeline on the same node. Choose Locally when different partitions of the Custom transformation must share objects in memory.
    - Across Grid. The transformation can be partitioned, and the Integration Service can distribute each partition to different nodes.
    Default is No. For more information about using partitioning, see the Workflow Administration Guide.
Requires Single Thread Per Partition
    Indicates if the Integration Service processes each partition of the procedure with one thread. When you enable this option, the procedure code can use thread-specific operations. Default is enabled. For more information about writing thread-specific operations, see "Working with Thread-Specific Procedure Code" on page 66. Configuring the Properties Tab 203
Configuring the HTTP Tab

On the HTTP tab, you can configure the transformation to read data from the HTTP server or write data to the HTTP server. Configure the following information on the HTTP tab:

♦ Select the method. Select GET, POST, or SIMPLE POST method based on whether you want to read data from or write data to an HTTP server. For more information, see "Selecting a Method" on page 204.
♦ Configure groups and ports. Manage HTTP request/response body and header details by configuring input and output ports. You can also configure port names with special characters. For more information, see "Configuring Groups and Ports" on page 205.
♦ Configure a base URL. Configure the base URL for the HTTP server you want to connect to. For more information, see "Configuring a URL" on page 207.

Figure 8-4 shows the HTTP tab of an HTTP transformation:

Figure 8-4. HTTP Transformation HTTP Tab

Selecting a Method

The groups and ports you define in a transformation depend on the method you select. To read data from an HTTP server, select the GET method. To write data to an HTTP server, select the POST or SIMPLE POST method. 204 Chapter 8: HTTP Transformation
Table 8-2 explains the different methods:

Table 8-2. HTTP Transformation Methods

GET
    Reads data from an HTTP server.
POST
    Writes data from multiple input ports to the HTTP server.
SIMPLE POST
    A simplified version of the POST method. Writes data from one input port as a single block of data to the HTTP server.

To define the metadata for the HTTP request, you must configure input and output ports based on the method you select:

♦ GET method. Use the input group to add input ports that the Designer uses to construct the final URL for the HTTP server.
♦ POST or SIMPLE POST method. Use the input group for the data that defines the body of the HTTP request.

For all methods, use the header group for the HTTP request header information.

Configuring Groups and Ports

The ports you add to an HTTP transformation depend on the method you choose and the group. An HTTP transformation uses the following groups:

♦ Output. Contains body data for the HTTP response. Passes responses from the HTTP server to downstream transformations or the target. By default, contains one output port, HTTPOUT. You cannot add ports to the output group. You can modify the precision for the HTTPOUT output port.
♦ Input. Contains body data for the HTTP request. Also contains metadata the Designer uses to construct the final URL to connect to the HTTP server. To write data to an HTTP server, the input group passes body information to the HTTP server. By default, contains one input port.
♦ Header. Contains header data for the request and response. Passes header information to the HTTP server when the Integration Service sends an HTTP request. Ports you add to the header group pass data for HTTP headers. When you add ports to the header group, the Designer adds ports to the input and output groups on the Ports tab. By default, contains no ports.

Note: The data that passes through an HTTP transformation must be of the String datatype. String data includes any markup language common in HTTP communication, such as HTML and XML. Configuring the HTTP Tab 205
Table 8-3 describes the groups and ports for the GET method:

Table 8-3. GET Method Groups and Ports

REQUEST, Input group
    The Designer uses the names and values of the input ports to construct the final URL.
REQUEST, Header group
    You can configure input and input/output ports for HTTP requests. The Designer adds ports to the input and output groups based on the ports you add to the header group:
    - Input group. Creates input ports based on input and input/output ports from the header group.
    - Output group. Creates output ports based on input/output ports from the header group.
RESPONSE, Header group
    You can configure output and input/output ports for HTTP responses. The Designer adds ports to the input and output groups based on the ports you add to the header group:
    - Input group. Creates input ports based on input/output ports from the header group.
    - Output group. Creates output ports based on output and input/output ports from the header group.
RESPONSE, Output group
    All body data for an HTTP response passes through the HTTPOUT output port.

Table 8-4 describes the ports for the POST method:

Table 8-4. POST Method Groups and Ports

REQUEST, Input group
    You can add multiple ports to the input group. Body data for an HTTP request can pass through one or more input ports based on what you add to the header group.
REQUEST, Header group
    You can configure input and input/output ports for HTTP requests. The Designer adds ports to the input and output groups based on the ports you add to the header group:
    - Input group. Creates input ports based on input and input/output ports from the header group.
    - Output group. Creates output ports based on input/output ports from the header group.
RESPONSE, Header group
    You can configure output and input/output ports for HTTP responses. The Designer adds ports to the input and output groups based on the ports you add to the header group:
    - Input group. Creates input ports based on input/output ports from the header group.
    - Output group. Creates output ports based on output and input/output ports from the header group.
RESPONSE, Output group
    All body data for an HTTP response passes through the HTTPOUT output port. 206 Chapter 8: HTTP Transformation
Table 8-5 describes the ports for the SIMPLE POST method:

Table 8-5. SIMPLE POST Method Groups and Ports

REQUEST, Input group
    You can add one input port. Body data for an HTTP request can pass through one input port.
REQUEST, Header group
    You can configure input and input/output ports for HTTP requests. The Designer adds ports to the input and output groups based on the ports you add to the header group:
    - Input group. Creates input ports based on input and input/output ports from the header group.
    - Output group. Creates output ports based on input/output ports from the header group.
RESPONSE, Header group
    You can configure output and input/output ports for HTTP responses. The Designer adds ports to the input and output groups based on the ports you add to the header group:
    - Input group. Creates input ports based on input/output ports from the header group.
    - Output group. Creates output ports based on output and input/output ports from the header group.
RESPONSE, Output group
    All body data for an HTTP response passes through the HTTPOUT output port.

Adding an HTTP Name

The Designer does not allow special characters, such as a dash (-), in port names. If you need to use special characters in a port name, you can configure an HTTP name to override the name of a port. For example, if you want an input port named Content-type, you can name the port ContentType and enter Content-Type as the HTTP name.

Configuring a URL

After you select a method and configure input and output ports, you must configure a URL. Enter a base URL, and the Designer constructs the final URL. If you select the GET method, the final URL contains the base URL and parameters based on the port names in the input group. If you select the POST or SIMPLE POST methods, the final URL is the same as the base URL.

You can also specify a URL when you configure an HTTP application connection. The base URL specified in the HTTP application connection overrides the base URL specified in the HTTP transformation.

Note: An HTTP server can redirect an HTTP request to another HTTP server. When this occurs, the HTTP server sends a URL back to the Integration Service, which then establishes a connection to the other HTTP server. The Integration Service can establish a maximum of five additional connections.

Final URL Construction for GET Method

The Designer constructs the final URL for the GET method based on the base URL and port names in the input group. It appends HTTP arguments to the base URL to construct the final URL in the form of an HTTP query string. A query string consists of a question mark (?), followed by name/value pairs. Configuring the HTTP Tab 207
The Designer appends the question mark and the name/value pairs that correspond to the names and values of the input ports you add to the input group.

When you select the GET method and add input ports to the input group, the Designer appends the following group and port information to the base URL to construct the final URL:

    ?<input group input port 1 name> = $<input group input port 1 value>

For each input port following the first input group input port, the Designer appends the following group and port information:

    & <input group input port n name> = $<input group input port n value>

where n represents the input port.

For example, if you enter www.company.com for the base URL and add the input ports ID, EmpName, and Department to the input group, the Designer constructs the following final URL:

    www.company.com?ID=$ID&EmpName=$EmpName&Department=$Department

You can edit the final URL to modify or add operators, variables, or other arguments. For more information about HTTP requests and query strings, see http://www.w3c.org. 208 Chapter 8: HTTP Transformation
Examples

This section contains examples for each type of method:
♦ GET
♦ POST
♦ SIMPLE POST

GET Example

The source file used with this example contains the following data:

    78576
    78577
    78578

Figure 8-5 shows the HTTP tab of the HTTP transformation for the GET example:

Figure 8-5. HTTP Tab for a GET Example

The Designer appends a question mark (?), the input group input port name, an equals sign (=), a dollar sign ($), and the input group input port name again to the base URL to construct the final URL:

    http://www.informatica.com?CR=$CR Examples 209
The Integration Service sends the source file values to the CR input port of the HTTP transformation and sends the following HTTP requests to the HTTP server:

    http://www.informatica.com?CR=78576
    http://www.informatica.com?CR=78577
    http://www.informatica.com?CR=78578

The HTTP server sends an HTTP response back to the Integration Service, which sends the data through the output port of the HTTP transformation to the target.

POST Example

The source file used with this example contains the following data:

    33,44,1
    44,55,2
    100,66,0

Figure 8-6 shows that each field in the source file has a corresponding input port:

Figure 8-6. HTTP Tab for a POST Example

The Integration Service sends the values of the three fields for each row through the input ports of the HTTP transformation and sends the HTTP request to the HTTP server specified in the final URL. 210 Chapter 8: HTTP Transformation
SIMPLE POST Example

The following text shows the XML file used with this example:

    <?xml version="1.0" encoding="UTF-8"?>
    <n4:Envelope xmlns:cli="http://localhost:8080/axis/Clienttest1.jws"
        xmlns:n4="http://schemas.xmlsoap.org/soap/envelope/"
        xmlns:tns="http://schemas.xmlsoap.org/soap/encoding/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance/"
        xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <n4:Header>
    </n4:Header>
    <n4:Body n4:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"><cli:smplsource>
    <Metadatainfo xsi:type="xsd:string">smplsourceRequest.Metadatainfo106</Metadatainfo></cli:smplsource>
    </n4:Body>
    </n4:Envelope>,capeconnect:Clienttest1services:Clienttest1#smplsource

Figure 8-7 shows the HTTP tab of the HTTP transformation for the SIMPLE POST example:

Figure 8-7. HTTP Tab for a SIMPLE POST Example

The Integration Service sends the body of the source file through the input port and sends the HTTP request to the HTTP server specified in the final URL. Examples 211
  • 245. Chapter 9 Java Transformation This chapter includes the following topics: ♦ Overview, 214 ♦ Using the Java Code Tab, 217 ♦ Configuring Ports, 219 ♦ Configuring Java Transformation Properties, 221 ♦ Developing Java Code, 225 ♦ Configuring Java Transformation Settings, 229 ♦ Compiling a Java Transformation, 231 ♦ Fixing Compilation Errors, 232 213
Overview

Transformation type: Active/Passive, Connected

You can extend PowerCenter functionality with the Java transformation. The Java transformation provides a simple native programming interface to define transformation functionality with the Java programming language. You can use the Java transformation to quickly define simple or moderately complex transformation functionality without advanced knowledge of the Java programming language or an external Java development environment. For example, you can define transformation logic to loop through input rows and generate multiple output rows based on a specific condition. You can also use expressions, user-defined functions, unconnected transformations, and mapping variables in the Java code.

You create Java transformations by writing Java code snippets that define transformation logic. You can use Java transformation API methods and standard Java language constructs. For example, you can use static code and variables, instance variables, and Java methods. You can use third-party Java APIs, built-in Java packages, or custom Java packages. You can also define and use Java expressions to call expressions from within a Java transformation. For more information about the Java transformation API methods, see "Java Transformation API Reference" on page 237. For more information about using Java expressions, see "Java Expressions" on page 263.

The PowerCenter Client uses the Java Development Kit (JDK) to compile the Java code and generate byte code for the transformation. The Integration Service uses the Java Runtime Environment (JRE) to execute generated byte code at run time. When you run a session with a Java transformation, the Integration Service uses the JRE to execute the byte code, process input rows, and generate output rows.

You can define transformation behavior for a Java transformation based on the following events:
♦ The transformation receives an input row
♦ The transformation has processed all input rows
♦ The transformation receives a transaction notification such as commit or rollback

Steps to Define a Java Transformation

Complete the following steps to write and compile Java code and fix compilation errors in a Java transformation:

1. Create the transformation in the Transformation Developer or Mapping Designer.
2. Configure input and output ports and groups for the transformation. Use port names as variables in Java code snippets. For more information, see "Configuring Ports" on page 219. 214 Chapter 9: Java Transformation
3. Configure the transformation properties. For more information, see "Configuring Java Transformation Properties" on page 221.
4. Use the code entry tabs in the transformation to write and compile the Java code for the transformation. For more information, see "Developing Java Code" on page 225 and "Compiling a Java Transformation" on page 231.
5. Locate and fix compilation errors in the Java code for the transformation. For more information, see "Fixing Compilation Errors" on page 232.

Active and Passive Java Transformations

You can create active and passive Java transformations. You select the type of Java transformation when you create the transformation. After you set the transformation type, you cannot change it. Active and passive Java transformations run the Java code in the On Input Row tab for the transformation one time for each row of input data.

Use an active transformation when you want to generate more than one output row for each input row in the transformation. You must use the Java transformation generateRow API method to generate an output row. For example, a Java transformation contains two input ports that represent a start date and an end date. You can generate an output row for each date between the start date and end date, as shown in the sketch at the end of this overview.

Use a passive transformation when you need one output row for each input row in the transformation. Passive transformations generate an output row after processing each input row.

Datatype Mapping

The Java transformation maps PowerCenter datatypes to Java primitives, based on the Java transformation port type. The Java transformation maps input port datatypes to Java primitives when it reads input rows, and it maps Java primitives to output port datatypes when it writes output rows.

For example, if an input port in a Java transformation has an Integer datatype, the Java transformation maps it to an integer primitive. The transformation treats the value of the input port as Integer in the transformation, and maps the Integer primitive to an integer datatype when the transformation generates the output row.

Table 9-1 shows the mapping between PowerCenter datatypes and Java primitives by a Java transformation:

Table 9-1. Mapping from PowerCenter Datatypes to Java Datatypes

PowerCenter Datatype    Java Datatype
CHAR                    String
BINARY                  byte[]
LONG (INT32)            int Overview 215
Table 9-1. Mapping from PowerCenter Datatypes to Java Datatypes (continued)

PowerCenter Datatype    Java Datatype
DOUBLE                  double
DECIMAL                 double* or BigDecimal*
Date/Time               long (number of milliseconds since January 1, 1970 00:00:00.000 GMT)

* For more information about configuring the Java datatype for PowerCenter Decimal datatypes, see "Enabling High Precision" on page 230.

String and byte[] are object datatypes in Java. The int, double, and long datatypes are primitive datatypes. 216 Chapter 9: Java Transformation
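The following sketch shows the active-transformation example described above as it might appear on the On Input Row tab. The port names startDate, endDate, and outDate are assumptions for illustration: startDate and endDate are Date/Time input ports (mapped to long milliseconds per Table 9-1) and outDate is a Date/Time output port. generateRow() is the Java transformation API method that emits an output row:

    // Emit one output row per day from startDate through endDate.
    final long MS_PER_DAY = 24L * 60L * 60L * 1000L;
    for (long d = startDate; d <= endDate; d += MS_PER_DAY) {
        outDate = d;        // set the output port for this row
        generateRow();      // emit the row
    }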
Using the Java Code Tab

Use the Java Code tab to define, compile, and fix compilation errors in Java code. You can use code snippets to import Java packages, define static code or a static block, instance variables, and user-defined methods, define and call Java expressions, and define transformation logic. Create code snippets in the code entry tabs.

After you develop code snippets, you can compile the Java code and view the results of the compilation in the Output window or view the full Java code.

Figure 9-1 shows the components of the Java Code tab:

Figure 9-1. Java Code Tab Components

The Java Code tab contains the following components:

♦ Navigator. Add input or output ports or APIs to a code snippet. The Navigator lists the input and output ports for the transformation, the available Java transformation APIs, and a description of the port or API function. For input and output ports, the description includes the port name, type, datatype, precision, and scale. For API functions, the description includes the syntax and use of the API function. The Navigator disables any port or API function that is unavailable for the code entry tab. For example, you cannot add ports or call API functions from the Import Packages code entry tab. Using the Java Code Tab 217
• 250. For more information about using the Navigator when you develop Java code, see “Developing Java Code” on page 225.
♦ Code window. Develop Java code for the transformation. The code window uses basic Java syntax highlighting. For more information, see “Developing Java Code” on page 225.
♦ Code entry tabs. Define transformation behavior. Each code entry tab has an associated Code window. To enter Java code for a code entry tab, click the tab and write Java code in the Code window. For more information about the code entry tabs, see “Developing Java Code” on page 225.
♦ Define Expression link. Launches the Define Expression dialog box that you use to create Java expressions. For more information about creating and using Java expressions, see “Java Expressions” on page 263.
♦ Settings link. Launches the Settings dialog box that you use to set the classpath for third-party and custom Java packages and to enable high precision for Decimal datatypes. For more information, see “Configuring Java Transformation Settings” on page 229.
♦ Compile link. Compiles the Java code for the transformation. Output from the Java compiler, including error and informational messages, appears in the Output window. For more information about compiling Java transformations, see “Compiling a Java Transformation” on page 231.
♦ Full Code link. Opens the Full Code window to display the complete class code for the Java transformation. The complete code for the transformation includes the Java code from the code entry tabs added to the Java transformation class template. For more information about using the Full Code window, see “Fixing Compilation Errors” on page 232.
♦ Output window. Displays the compilation results for the Java transformation class. You can right-click an error message in the Output window to locate the error in the snippet code or the full code for the Java transformation class in the Full Code window. You can also double-click an error in the Output window to locate the source of the error. For more information about using the Output window to troubleshoot compilation errors, see “Fixing Compilation Errors” on page 232. 218 Chapter 9: Java Transformation
• 251. Configuring Ports
A Java transformation can have input ports, output ports, and input/output ports. You create and edit groups and ports on the Ports tab. You can specify default values for ports. After you add ports to a transformation, use the port names as variables in Java code snippets, as shown in the sketch after this section.
Figure 9-2 shows the Ports tab for a Java transformation with one input group and one output group: Figure 9-2. Java Transformation Ports Tab
Creating Groups and Ports
When you create a Java transformation, it includes one input group and one output group, and it always has exactly one of each. You can change the existing group names by typing in the group header. If you delete a group, you can add a new group by clicking the Create Input Group or Create Output Group icon. The transformation is not valid if it has multiple input or output groups.
To create a port, click the Add button. When you create a port, the Designer adds it below the currently selected row or group. For guidelines about creating and editing groups and ports, see “Working with Groups and Ports” on page 59. Configuring Ports 219
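For example, the following On Input Row snippet shows how port names become Java variables. It is a minimal sketch, assuming a hypothetical Double input port in_price and a hypothetical Double output port out_price in an active Java transformation:
// in_price and out_price are hypothetical ports. Each port name is
// available as a Java variable of the mapped datatype (double).
out_price = in_price * 1.10; // apply a 10% markup
generateRow();               // generate the output row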
  • 252. Setting Default Port Values You can define default values for ports in a Java transformation. The Java transformation initializes port variables with the default port value, depending on the datatype of the port. For more information about port datatypes, see “Datatype Mapping” on page 215. Input and Output Ports The Java transformation initializes the value of unconnected input ports or output ports that are not assigned a value in the Java code snippets. The Java transformation initializes the ports depending on the port datatype: ♦ Simple datatypes. If you define a default value for the port, the transformation initializes the value of the port variable to the default value. Otherwise, it initializes the value of the port variable to 0. ♦ Complex datatypes. If you provide a default value for the port, the transformation creates a new String or byte[] object, and initializes the object to the default value. Otherwise, the transformation initializes the port variable to NULL. Input ports with a NULL value generate a NullPointerException if you access the value of the port variable in the Java code. Input/Output Ports The Java transformation treats input/output ports as pass-through ports. If you do not set a value for the port in the Java code for the transformation, the output value is the same as the input value. The Java transformation initializes the value of an input/output port in the same way as an input port. If you set the value of a port variable for an input/output port in the Java code, the Java transformation uses this value when it generates an output row. If you do not set the value of an input/output port, the Java transformation sets the value of the port variable to 0 for simple datatypes and NULL for complex datatypes when it generates an output row. 220 Chapter 9: Java Transformation
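Because an input port that contains NULL maps to a null String or byte[] object, test complex ports with the isNull API method before you use their values. The following On Input Row sketch assumes hypothetical String ports in_name (input) and out_name (output):
// Guard against a NullPointerException when a String port is NULL.
// in_name and out_name are hypothetical ports; "UNKNOWN" is a fallback.
if (isNull("in_name"))
    out_name = "UNKNOWN";
else
    out_name = in_name;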
• 253. Configuring Java Transformation Properties
The Java transformation includes properties for both the transformation code and the transformation. If you create a Java transformation in the Transformation Developer, you can override the transformation properties when you use it in a mapping.
Figure 9-3 shows the Java transformation Properties tab: Figure 9-3. Java Transformation Properties
Table 9-2 describes the Java transformation properties, listing each property, whether it is required or optional, and its description:
Table 9-2. Java Transformation Properties
Language (Required). Language used for the transformation code. You cannot change this value.
Class Name (Required). Name of the Java class for the transformation. You cannot change this value. Configuring Java Transformation Properties 221
• 254. Table 9-2. Java Transformation Properties (continued)
Tracing Level (Required). Amount of detail displayed in the session log for this transformation. Use the following tracing levels: Terse, Normal, Verbose Initialization, or Verbose Data. Default is Normal. For more information about tracing levels, see “Session and Workflow Logs” in the Workflow Administration Guide.
Is Partitionable (Required). Multiple partitions in a pipeline can use this transformation. Use the following options:
- No. The transformation cannot be partitioned. The transformation and other transformations in the same pipeline are limited to one partition. You might choose No if the transformation processes all the input data together, such as data cleansing.
- Locally. The transformation can be partitioned, but the Integration Service must run all partitions in the pipeline on the same node. Choose Locally when different partitions of the transformation must share objects in memory.
- Across Grid. The transformation can be partitioned, and the Integration Service can distribute each partition to different nodes.
Default is No. For more information about using partitioning with Java and Custom transformations, see “Working with Partition Points” in the Workflow Administration Guide.
Inputs Must Block (Optional). The procedure associated with the transformation must be able to block incoming data. Default is enabled.
Is Active (Required). The transformation can generate more than one output row for each input row. You cannot change this property after you create the Java transformation. If you need to change this property, create a new Java transformation.
Update Strategy Transformation (Optional). The transformation defines the update strategy for output rows. You can enable this property for active Java transformations. Default is disabled. For more information about setting the update strategy in Java transformations, see “Setting the Update Strategy” on page 224.
Transformation Scope (Required). The method in which the Integration Service applies the transformation logic to incoming data. Use the following options: Row, Transaction, or All Input. This property is always Row for passive transformations. Default is All Input for active transformations. For more information about working with transaction control, see “Working with Transaction Control” on page 223. 222 Chapter 9: Java Transformation
• 255. Table 9-2. Java Transformation Properties (continued)
Generate Transaction (Optional). The transformation generates transaction rows. You can enable this property for active Java transformations. Default is disabled. For more information about working with transaction control, see “Working with Transaction Control” on page 223.
Output Is Ordered (Required). The order of the output data is consistent between session runs.
- Never. The order of the output data is inconsistent between session runs.
- Based On Input Order. The output order is consistent between session runs when the input data order is consistent between session runs.
- Always. The order of the output data is consistent between session runs even if the order of the input data is inconsistent between session runs.
Default is Never for active transformations. Default is Based On Input Order for passive transformations.
Requires Single Thread Per Partition (Optional). A single thread processes the data for each partition. You cannot change this value.
Output Is Deterministic (Optional). The transformation generates consistent output data between session runs. You must enable this property to perform recovery on sessions that use this transformation. For more information about session recovery, see “Recovering Workflows” in the Workflow Administration Guide.
Working with Transaction Control
You can define transaction control for a Java transformation using the following properties:
♦ Transformation Scope. Determines how the Integration Service applies the transformation logic to incoming data.
♦ Generate Transaction. Indicates that the Java code for the transformation generates transaction rows and outputs them to the output group.
Transformation Scope
You can configure how the Integration Service applies the transformation logic to incoming data. You can choose one of the following values:
♦ Row. Applies the transformation logic to one row of data at a time. Choose Row when the results of the transformation depend on a single row of data. You must choose Row for passive transformations.
♦ Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when the results of the transformation depend on all rows in the same transaction, but not on rows in other transactions. For example, you might choose Transaction when the Java code performs aggregate calculations on the data in a single transaction.
♦ All Input. Applies the transformation logic to all incoming data. When you choose All Input, the Integration Service drops transaction boundaries. Choose All Input when the Configuring Java Transformation Properties 223
• 256. results of the transformation depend on all rows of data in the source. For example, you might choose All Input when the Java code for the transformation sorts all incoming data. For more information about transformation scope, see “Understanding Commit Points” in the Workflow Administration Guide.
Generate Transaction
You can define Java code in an active Java transformation to generate transaction rows, such as commit and rollback rows. If the transformation generates commit and rollback rows, configure the Java transformation to generate transactions with the Generate Transaction transformation property. For more information about Java transformation API methods to generate transaction rows, see “commit” on page 239 and “rollBack” on page 247.
When you configure the transformation to generate transaction rows, the Integration Service treats the Java transformation like a Transaction Control transformation. Most rules that apply to a Transaction Control transformation in a mapping also apply to the Java transformation. For example, when you configure a Java transformation to generate transaction rows, you cannot concatenate pipelines or pipeline branches containing the transformation. For more information about working with Transaction Control transformations, see “Transaction Control Transformation” on page 555.
When you edit or create a session using a Java transformation configured to generate transaction rows, configure it for user-defined commit.
Setting the Update Strategy
Use an active Java transformation to set the update strategy for a mapping. You can set the update strategy at the following levels:
♦ Within the Java code. You can write the Java code to set the update strategy for output rows. The Java code can flag rows for insert, update, delete, or reject. For more information about setting the update strategy, see “setOutRowType” on page 249.
♦ Within the mapping. Use the Java transformation in a mapping to flag rows for insert, update, delete, or reject. Select the Update Strategy Transformation property for the Java transformation.
♦ Within the session. Configure the session to treat the source rows as data driven.
If you do not configure the Java transformation to define the update strategy, or you do not configure the session as data driven, the Integration Service does not use the Java code to flag the output rows. Instead, the Integration Service flags the output rows as insert. 224 Chapter 9: Java Transformation
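For example, the following On Input Row sketch flags each output row based on input data. It is a minimal illustration, assuming a hypothetical Integer input port STATUS_CODE, an active Java transformation with the Update Strategy Transformation property enabled, and a session configured as data driven. INSERT and DELETE are the row type values described in “setOutRowType” on page 249:
// Flag rows for delete when a hypothetical status code marks the record
// as inactive; otherwise flag them for insert.
if (STATUS_CODE == 9)
    setOutRowType(DELETE);
else
    setOutRowType(INSERT);
generateRow();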
• 257. Developing Java Code
Use the code entry tabs to enter Java code snippets that define Java transformation functionality. You can write Java code using the code entry tabs to import Java packages, write helper code, define Java expressions, and write Java code that defines transformation behavior for specific transformation events. You can develop snippets in the code entry tabs in any order.
You can enter Java code in the following code entry tabs:
♦ Import Packages. Import third-party Java packages, built-in Java packages, or custom Java packages. For more information, see “Importing Java Packages” on page 226.
♦ Helper Code. Define variables and methods available to all tabs except Import Packages. For more information, see “Defining Helper Code” on page 226.
♦ On Input Row. Define transformation behavior when it receives an input row. For more information, see “On Input Row Tab” on page 227.
♦ On End of Data. Define transformation behavior when it has processed all input data. For more information, see “On End of Data Tab” on page 228.
♦ On Receiving Transaction. Define transformation behavior when it receives a transaction notification. Use with active Java transformations. For more information, see “On Receiving Transaction Tab” on page 228.
♦ Java Expressions. Define Java expressions to call PowerCenter expressions. You can use Java expressions in the Helper Code, On Input Row, On End of Data, and On Receiving Transaction code entry tabs. For more information about Java expressions, see “Java Expressions” on page 263.
You can access input data and set output data on the On Input Row tab. For active transformations, you can also set output data on the On End of Data and On Receiving Transaction tabs.
Creating Java Code Snippets
Use the Code window in the Java Code tab to create Java code snippets to define transformation behavior.
To create a Java code snippet:
1. Click the appropriate code entry tab.
2. To access input or output column variables in the snippet, double-click the name of the port in the Navigator.
3. To call a Java transformation API in the snippet, double-click the name of the API in the Navigator. If necessary, configure the appropriate API input values.
4. Write appropriate Java code, depending on the code snippet.
The Full Code window displays the full class code for the Java transformation. Developing Java Code 225
• 258. Importing Java Packages
Use the Import Packages tab to import third-party Java packages, built-in Java packages, or custom Java packages for active or passive Java transformations. After you import Java packages, use the imported packages in any code entry tab. You cannot declare or use static variables, instance variables, or user methods in this tab.
For example, to import the Java I/O package, enter the following code in the Import Packages tab:
import java.io.*;
When you import non-standard Java packages, you must add the package or class to the classpath. For more information about setting the classpath, see “Configuring Java Transformation Settings” on page 229.
When you export or import metadata that contains a Java transformation in the PowerCenter Client, the JAR files or classes that contain the third-party or custom packages required by the Java transformation are not included. If you import metadata that contains a Java transformation, you must also copy the JAR files or classes that contain the required third-party or custom packages to the PowerCenter Client machine.
Defining Helper Code
Use the Helper Code tab in active or passive Java transformations to declare user-defined variables and methods for the Java transformation class. Use variables and methods declared in the Helper Code tab in any code entry tab except the Import Packages tab.
You can declare the following user-defined variables and user-defined methods:
♦ Static code and static variables. You can declare static variables and static code within a static block. All instances of a reusable Java transformation in a mapping and all partitions in a session share static code and variables. Static code executes before any other code in a Java transformation. For example, the following code declares a static variable to store the error threshold for all instances of a Java transformation in a mapping:
static int errorThreshold;
You can then use this variable to store the error threshold for the transformation and access it from all instances of the Java transformation in a mapping and from any partition in a session.
Note: You must synchronize static variables in a multiple partition session or in a reusable Java transformation.
♦ Instance variables. You can declare partition-level instance variables. Multiple instances of a reusable Java transformation in a mapping or multiple partitions in a session do not share instance variables. Declare instance variables with a prefix to avoid conflicts and initialize non-primitive instance variables. 226 Chapter 9: Java Transformation
• 259. For example, the following code uses a boolean variable to decide whether to generate an output row:
// boolean to decide whether to generate an output row
// based on validity of input
private boolean generateRow;
♦ User-defined methods. Create user-defined static or instance methods to extend the functionality of the Java transformation. Java methods declared in the Helper Code tab can use or modify output variables or locally declared instance variables. You cannot access input variables from Java methods in the Helper Code tab. For example, use the following code in the Helper Code tab to declare a function that adds two integers:
private int myTXAdd (int num1,int num2)
{
return num1+num2;
}
On Input Row Tab
Use the On Input Row tab to define the behavior of the Java transformation when it receives an input row. The Java code in this tab executes one time for each input row. You can access input row data in the On Input Row tab only.
You can access and use the following input and output port data, variables, and methods from the On Input Row tab:
♦ Input port and output port variables. You can access input and output port data as a variable by using the name of the port as the name of the variable. For example, if in_int is an Integer input port, you can access the data for this port by referring to the variable in_int, which has the Java primitive datatype int. You do not need to declare input and output ports as variables. Do not assign a value to an input port variable. If you assign a value to an input variable in the On Input Row tab, you cannot get the input data for the corresponding port in the current row.
♦ Instance variables and user-defined methods. Use any instance or static variable or user-defined method you declared in the Helper Code tab. For example, an active Java transformation has two input ports, BASE_SALARY and BONUSES, with an integer datatype, and a single output port, TOTAL_COMP, with an integer datatype. You create a user-defined method in the Helper Code tab, myTXAdd, that adds two integers and returns the result. Use the following Java code in the On Input Row tab to assign the total of the input port values to the output port and generate an output row:
TOTAL_COMP = myTXAdd (BASE_SALARY,BONUSES);
generateRow();
When the Java transformation receives an input row, it adds the values of the BASE_SALARY and BONUSES input ports, assigns the value to the TOTAL_COMP output port, and generates an output row. Developing Java Code 227
• 260. ♦ Java transformation API methods. You can call API methods provided by the Java transformation. For more information about Java transformation API methods, see “Java Transformation API Reference” on page 237.
On End of Data Tab
Use the On End of Data tab in active or passive Java transformations to define the behavior of the Java transformation when it has processed all input data. If you want to generate output rows in the On End of Data tab, you must set the transformation scope for the transformation to Transaction or All Input. You cannot access or set the value of input port variables in this tab.
You can access and use the following variables and methods from the On End of Data tab:
♦ Output port variables. Use the names of output ports as variables to access or set output data for active Java transformations.
♦ Instance variables and user-defined methods. Use any instance variables or user-defined methods you declared in the Helper Code tab.
♦ Java transformation API methods. You can call API methods provided by the Java transformation. Use the commit and rollBack API methods to generate a transaction. For more information about API methods, see “Java Transformation API Reference” on page 237.
For example, use the following Java code to write information to the session log when the end of data is reached:
logInfo("Number of null rows for partition is: " + partCountNullRows);
On Receiving Transaction Tab
Use the On Receiving Transaction tab in active Java transformations to define the behavior of an active Java transformation when it receives a transaction notification. The code snippet for the On Receiving Transaction tab is only executed if the Transformation Scope property for the transformation is set to Transaction. You cannot access or set the value of input port variables in this tab.
You can access and use the following output data, variables, and methods from the On Receiving Transaction tab:
♦ Output port variables. Use the names of output ports as variables to access or set output data.
♦ Instance variables and user-defined methods. Use any instance variables or user-defined methods you declared in the Helper Code tab.
♦ Java transformation API methods. You can call API methods provided by the Java transformation. Use the commit and rollBack API methods to generate a transaction. For more information about API methods, see “Java Transformation API Reference” on page 237.
For example, use the following Java code to generate a transaction after the transformation receives a transaction:
commit(); 228 Chapter 9: Java Transformation
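You can also pair a Helper Code instance variable with the On Receiving Transaction tab to commit clean transactions and roll back transactions that contain bad rows. The following is a minimal sketch, assuming an active Java transformation configured to generate transactions and a hypothetical boolean instance variable transactionHasErrors that the On Input Row code sets when it finds an invalid row:
// On Receiving Transaction tab: commit or roll back based on a flag
// maintained by the On Input Row code. transactionHasErrors is a
// hypothetical instance variable declared in the Helper Code tab.
if (transactionHasErrors)
    rollBack();
else
    commit();
transactionHasErrors = false; // reset the flag for the next transaction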
• 261. Configuring Java Transformation Settings
You can configure Java transformation settings to set the classpath for third-party and custom Java packages and to enable high precision for Decimal datatypes.
Figure 9-4 shows the Settings dialog box for a Java transformation where you can set the classpath and enable high precision: Figure 9-4. Java Transformation Settings Dialog Box
Configuring the Classpath
When you import non-standard Java packages in the Import Packages tab, you must set the classpath to the location of the JAR files or class files for the Java package. You can set the CLASSPATH environment variable on the PowerCenter Client machine or configure the Java transformation settings to set the classpath. The PowerCenter Client adds the Java packages or class files you add in the Settings dialog box to the system classpath when you compile the Java code for the transformation.
For example, you import the Java package converter in the Import Packages tab and define the package in converter.jar. You must add converter.jar to the classpath before you compile the Java code for the Java transformation.
You do not need to set the classpath for built-in Java packages. For example, java.io is a built-in Java package. If you import java.io, you do not need to set the classpath for java.io.
Note: You can also add Java packages to the system classpath for a session, using the Java Classpath session property. For more information, see “Session Properties Reference” in the Workflow Administration Guide.
To set the classpath for a Java transformation:
1. On the Java Code tab, click the Settings link. The Settings dialog box appears.
2. Click Browse under Add Classpath to select the JAR file or class file for the imported package. Click OK. Configuring Java Transformation Settings 229
  • 262. 3. Click Add. The JAR or class file appears in the list of JAR and class files for the transformation. 4. To remove a JAR file or class file, select the JAR or class file and click Remove. Enabling High Precision By default, the Java transformation maps ports of type Decimal to double datatypes (with a precision of 15). If you want to process a Decimal datatype with a precision greater than 15, enable high precision to process decimal ports with the Java class BigDecimal. When you enable high precision, you can process Decimal ports with precision less than 28 as BigDecimal. The Java transformation maps Decimal ports with a precision greater than 28 to double datatypes. For example, a Java transformation has an input port of type Decimal that receives a value of 40012030304957666903. If you enable high precision, the value of the port is treated as it appears. If you do not enable high precision, the value of the port is 4.00120303049577 x 10^19. 230 Chapter 9: Java Transformation
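When you enable high precision, Decimal ports map to the Java class BigDecimal rather than to a primitive, so you must use BigDecimal methods instead of arithmetic operators. The following On Input Row sketch assumes high precision is enabled in an active Java transformation with hypothetical Decimal ports in_amount and out_amount:
// With high precision enabled, Decimal ports are BigDecimal objects.
// in_amount and out_amount are hypothetical Decimal ports.
if (!isNull("in_amount"))
    out_amount = in_amount.multiply(new java.math.BigDecimal("1.05"));
else
    setNull("out_amount");
generateRow();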
  • 263. Compiling a Java Transformation The PowerCenter Client uses the Java compiler to compile the Java code and generate the byte code for the transformation. The Java compiler compiles the Java code and displays the results of the compilation in the Output window. The Java compiler installs with the PowerCenter Client in the java/bin directory. To compile the full code for the Java transformation, click Compile in the Java Code tab. When you create a Java transformation, it contains a Java class that defines the base functionality for a Java transformation. The full code for the Java class contains the template class code for the transformation, plus the Java code you define in the code entry tabs. When you compile a Java transformation, the PowerCenter Client adds the code from the code entry tabs to the template class for the transformation to generate the full class code for the transformation. The PowerCenter Client then calls the Java compiler to compile the full class code. The Java compiler compiles the transformation and generates the byte code for the transformation. The results of the compilation display in the Output window. Use the results of the compilation to identify and locate Java code errors. Note: The Java transformation is also compiled when you click OK in the transformation. Compiling a Java Transformation 231
  • 264. Fixing Compilation Errors You can identify Java code errors and locate the source of Java code errors for a Java transformation in the Output window. Java transformation errors may occur as a result of an error in a code entry tab or may occur as a result of an error in the full code for the Java transformation class. To troubleshoot a Java transformation: ♦ Locate the source of the error. You can locate the source of the error in the Java snippet code or in the full class code for the transformation. ♦ Identify the type of error. Use the results of the compilation in the output window and the location of the error to identify the type of error. After you identify the source and type of error, fix the Java code in the code entry tab and compile the transformation again. Locating the Source of Compilation Errors When you compile a Java transformation, the Output window displays the results of the compilation. Use the results of the compilation to identify compilation errors. When you use the Output window to locate the source of an error, the PowerCenter Client highlights the source of the error in a code entry tab or in the Full Code window. You can locate errors in the Full Code window, but you cannot edit Java code in the Full Code window. To fix errors that you locate in the Full Code window, you need to modify the code in the appropriate code entry tab. You might need to use the Full Code window to view errors caused by adding user code to the full class code for the transformation. Use the results of the compilation in the Output window to identify errors in the following locations: ♦ Code entry tabs ♦ Full Code window Locating Errors in the Code Entry Tabs To locate the source of an error in the code entry tabs, right-click on the error in the Output window and choose View error in snippet or double-click on the error in the Output window. The PowerCenter Client highlights the source of the error in the appropriate code entry tab. 232 Chapter 9: Java Transformation
  • 265. Figure 9-5 shows a highlighted error in a code entry tab: Figure 9-5. Highlighted Error in Code Entry Tab Locating Errors in the Full Code Window To locate the source of errors in the Full Code window, right-click on the error in the Output window and choose View error in full code or double-click the error in the Output window. The PowerCenter Client highlights the source of the error in the Full Code window. Fixing Compilation Errors 233
  • 266. Figure 9-6 shows a highlighted error in the Full Code window: Figure 9-6. Highlighted Error in Full Code Window Identifying Compilation Errors Compilation errors may appear as a result of errors in the user code. Errors in the user code may also generate an error in the non-user code for the class. Compilation errors occur in the following code for the Java transformation: ♦ User code ♦ Non-user code User Code Errors Errors may occur in the user code in the code entry tabs. User code errors may include standard Java syntax and language errors. User code errors may also occur when the PowerCenter Client adds the user code from the code entry tabs to the full class code. For example, a Java transformation has an input port with a name of int1 and an integer datatype. The full code for the class declares the input port variable with the following code: int int1; However, if you use the same variable name in the On Input Row tab, the Java compiler issues an error for a redeclaration of a variable. You must rename the variable in the On Input Row code entry tab to fix the error. 234 Chapter 9: Java Transformation
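For example, if you declare the following local variable in the On Input Row tab, the Java compiler reports a redeclaration error because int1 is already declared as the port variable in the full class code. This is a minimal sketch of the fix, which renames the local variable:
// Causes a redeclaration error: int1 is already the input port variable.
// int int1;
// Renaming the local variable resolves the error:
int localInt1 = 0;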
  • 267. Non-user Code Errors User code in the code entry tabs may cause errors in non-user code. For example, a Java transformation has an input port and an output port, int1 and out1, with integer datatypes. You write the following code in the On Input Row code entry tab to calculate interest for input port int1 and assign it to the output port out1: int interest; interest = CallInterest(int1); // calculate interest out1 = int1 + interest; } When you compile the transformation, the PowerCenter Client adds the code from the On Input Row code entry tab to the full class code for the transformation. When the Java compiler compiles the Java code, the unmatched brace causes a method in the full class code to terminate prematurely, and the Java compiler issues an error. Fixing Compilation Errors 235
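A corrected version of the snippet simply omits the unmatched brace, so the On Input Row method in the full class code terminates where the class template expects. This sketch assumes the same ports int1 and out1 and the user-defined method CallInterest from the example above:
int interest;
interest = CallInterest(int1); // calculate interest
out1 = int1 + interest;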
  • 268. 236 Chapter 9: Java Transformation
  • 269. Chapter 10 Java Transformation API Reference This chapter includes the following topic: ♦ Java Transformation API Methods, 238 237
• 270. Java Transformation API Methods
You can call Java transformation API methods in the Java Code tab of a Java transformation to define transformation behavior. The Java transformation provides the following API methods:
♦ commit. Generates a transaction. For more information, see “commit” on page 239.
♦ failSession. Throws an exception with an error message and fails the session. For more information, see “failSession” on page 240.
♦ generateRow. Generates an output row for active Java transformations. For more information, see “generateRow” on page 241.
♦ getInRowType. Returns the input type of the current row in the transformation. For more information, see “getInRowType” on page 242.
♦ incrementErrorCount. Increases the error count for the session. For more information, see “incrementErrorCount” on page 243.
♦ isNull. Checks the value of an input column for a null value. For more information, see “isNull” on page 244.
♦ logError. Writes an error message to the session log. For more information, see “logError” on page 246.
♦ logInfo. Writes an informational message to the session log. For more information, see “logInfo” on page 245.
♦ rollBack. Generates a rollback transaction. For more information, see “rollBack” on page 247.
♦ setNull. Sets the value of an output column in an active or passive Java transformation to NULL. For more information, see “setNull” on page 248.
♦ setOutRowType. Sets the update strategy for output rows. For more information, see “setOutRowType” on page 249.
You can add any API method to a code entry tab by double-clicking the name of the API method in the Navigator, dragging the method from the Navigator into the Java code snippet, or manually typing the API method in the Java code snippet.
You can also use the defineJExpression and invokeJExpression API methods to create and invoke Java expressions. For more information about using the API methods with Java expressions, see “Java Expressions” on page 263. 238 Chapter 10: Java Transformation API Reference
  • 271. commit Generates a transaction. Use commit in any tab except the Import Packages or Java Expressions code entry tabs. You can only use commit in active transformations configured to generate transactions. If you use commit in an active transformation not configured to generate transactions, the Integration Service throws an error and fails the session. Syntax Use the following syntax: commit(); Example Use the following Java code to generate a transaction for every 100 rows processed by a Java transformation and then set the rowsProcessed counter to 0: if (rowsProcessed==100) { commit(); rowsProcessed=0; } commit 239
• 272. failSession
Throws an exception with an error message and fails the session. Use failSession to terminate the session. Do not use failSession in a try/catch block in a code entry tab. Use failSession in any tab except the Import Packages or Java Expressions code entry tabs.
Syntax
Use the following syntax:
failSession(String errorMessage);
Argument    Datatype    Input/Output    Description
errorMessage    String    Input    Error message string.
Example
Use the following Java code to test the input port input1 for a null value and fail the session if input1 is NULL:
if(isNull("input1")) {
failSession("Cannot process a null value for port input1.");
}
240 Chapter 10: Java Transformation API Reference
  • 273. generateRow Generates an output row for active Java transformations. When you call generateRow, the Java transformation generates an output row using the current value of the output port variables. If you want to generate multiple rows corresponding to an input row, you can call generateRow more than once for each input row. If you do not use generateRow in an active Java transformation, the transformation does not generate output rows. Use generateRow in any code entry tab except the Import Packages or Java Expressions code entry tabs. You can use generateRow with active transformations only. If you use generateRow in a passive transformation, the session generates an error. Syntax Use the following syntax: generateRow(); Example Use the following Java code to generate one output row, modify the values of the output ports, and generate another output row: // Generate multiple rows. if(!isNull("input1") && !isNull("input2")) { output1 = input1 + input2; output2 = input1 - input2; } generateRow(); // Generate another row with modified values. output1 = output1 * 2; output2 = output2 * 2; generateRow(); generateRow 241
• 274. getInRowType
Returns the input type of the current row in the transformation. The method returns a value of insert, update, delete, or reject.
You can only use getInRowType in the On Input Row code entry tab. You can only use the getInRowType method in active transformations configured to set the update strategy. If you use this method in an active transformation not configured to set the update strategy, the session generates an error.
Syntax
Use the following syntax:
rowType getInRowType();
Argument    Datatype    Input/Output    Description
rowType    String    Output    Update strategy type. Value can be INSERT, UPDATE, DELETE, or REJECT.
Example
Use the following Java code to propagate the input row type of the current row, and to set the output row type to DELETE if the value of the input port input1 is greater than 100:
// Set the value of the output port.
output1 = input1;
// Get and set the row type.
String rowType = getInRowType();
setOutRowType(rowType);
// Set row type to DELETE if the output port value is > 100.
if(input1 > 100)
setOutRowType(DELETE);
242 Chapter 10: Java Transformation API Reference
• 275. incrementErrorCount
Increases the error count for the session. If the error count reaches the error threshold for the session, the session fails. Use incrementErrorCount in any tab except the Import Packages or Java Expressions code entry tabs.
Syntax
Use the following syntax:
incrementErrorCount(int nErrors);
Argument    Datatype    Input/Output    Description
nErrors    Integer    Input    Number of errors to increment the error count for the session.
Example
Use the following Java code to increment the error count if an input port for a transformation has a null value:
// Check if input employee id and name is null.
if (isNull ("EMP_ID_INP") || isNull ("EMP_NAME_INP"))
{
incrementErrorCount(1);
// if input employee id and/or name is null, don't generate an output row for this input row
generateRow = false;
}
incrementErrorCount 243
• 276. isNull
Checks the value of an input column for a null value. Use isNull to check if data of an input column is NULL before using the column as a value. You can use the isNull method in the On Input Row code entry tab only.
Syntax
Use the following syntax:
Boolean isNull(String strColName);
Argument    Datatype    Input/Output    Description
strColName    String    Input    Name of an input column.
Example
Use the following Java code to check the value of the SALARY input column before adding it to the instance variable totalSalaries:
// if value of SALARY is not null
if (!isNull("SALARY")) {
// add to totalSalaries
TOTAL_SALARIES += SALARY;
}
or
// if value of SALARY is not null
String strColName = "SALARY";
if (!isNull(strColName)) {
// add to totalSalaries
TOTAL_SALARIES += SALARY;
}
244 Chapter 10: Java Transformation API Reference
• 277. logInfo
Writes an informational message to the session log. Use logInfo in any tab except the Import Packages or Java Expressions tabs.
Syntax
Use the following syntax:
logInfo(String logMessage);
Argument    Datatype    Input/Output    Description
logMessage    String    Input    Information message string.
Example
Use the following Java code to write a message to the session log after the Java transformation processes a message threshold of 1,000 rows:
if (numRowsProcessed == messageThreshold) {
logInfo("Processed " + messageThreshold + " rows.");
}
The following message appears in the session log:
[JTX_1012] [INFO] Processed 1000 rows. logInfo 245
• 278. logError
Writes an error message to the session log. Use logError in any tab except the Import Packages or Java Expressions code entry tabs.
Syntax
Use the following syntax:
logError(String errorMessage);
Argument    Datatype    Input/Output    Description
errorMessage    String    Input    Error message string.
Example
Use the following Java code to log an error if the input port is null:
// check BASE_SALARY
if (isNull("BASE_SALARY")) {
logError("Cannot process a null salary field.");
}
The following message appears in the session log:
[JTX_1013] [ERROR] Cannot process a null salary field. 246 Chapter 10: Java Transformation API Reference
• 279. rollBack
Generates a rollback transaction. Use rollBack in any tab except the Import Packages or Java Expressions code entry tabs.
You can only use rollBack in active transformations configured to generate transactions. If you use rollBack in an active transformation not configured to generate transactions, the Integration Service generates an error and fails the session.
Syntax
Use the following syntax:
rollBack();
Example
Use the following code to generate a rollback transaction and fail the session if an input row has an illegal condition, or to generate a transaction if the number of rows processed is 100:
// If row is not legal, roll back and fail session.
if (!isRowLegal()) {
rollBack();
failSession("Cannot process illegal row.");
} else if (rowsProcessed==100) {
commit();
rowsProcessed=0;
}
rollBack 247
• 280. setNull
Sets the value of an output column in an active or passive Java transformation to NULL. Once you set an output column to NULL, you cannot modify the value until you generate an output row. Use setNull in any tab except the Import Packages or Java Expressions code entry tabs.
Syntax
Use the following syntax:
setNull(String strColName);
Argument    Datatype    Input/Output    Description
strColName    String    Input    Name of an output column.
Example
Use the following Java code to check the value of an input column and set the corresponding value of an output column to null:
// check value of Q3RESULTS input column
if(isNull("Q3RESULTS")) {
// set the value of output column to null
setNull("RESULTS");
}
or
// check value of Q3RESULTS input column
String strColName = "Q3RESULTS";
if(isNull(strColName)) {
// set the value of output column to null
setNull(strColName);
}
248 Chapter 10: Java Transformation API Reference
• 281. setOutRowType
Sets the update strategy for output rows. The setOutRowType method can flag rows for insert, update, or delete.
You can only use setOutRowType in the On Input Row code entry tab. You can only use setOutRowType in active transformations configured to set the update strategy. If you use setOutRowType in an active transformation not configured to set the update strategy, the session generates an error and fails.
Syntax
Use the following syntax:
setOutRowType(String rowType);
Argument    Datatype    Input/Output    Description
rowType    String    Input    Update strategy type. Value can be INSERT, UPDATE, or DELETE.
Example
Use the following Java code to propagate the input row type of the current row, and to set the output row type to DELETE if the value of the input port input1 is greater than 100:
// Set the value of the output port.
output1 = input1;
// Get and set the row type.
String rowType = getInRowType();
setOutRowType(rowType);
// Set row type to DELETE if the output port value is > 100.
if(input1 > 100)
setOutRowType(DELETE);
setOutRowType 249
  • 282. 250 Chapter 10: Java Transformation API Reference
  • 283. Chapter 11 Java Transformation Example This chapter includes the following topics: ♦ Overview, 252 ♦ Step 1. Import the Mapping, 253 ♦ Step 2. Create Transformation and Configure Ports, 254 ♦ Step 3. Enter Java Code, 256 ♦ Step 4. Compile the Java Code, 261 ♦ Step 5. Create a Session and Workflow, 262 251
  • 284. Overview You can use the Java code in this example to create and compile an active Java transformation. You import a sample mapping and create and compile the Java transformation. You can then create and run a session and workflow that contains the mapping. The Java transformation processes employee data for a fictional company. It reads input rows from a flat file source and writes output rows to a flat file target. The source file contains employee data, including the employee identification number, name, job title, and the manager identification number. The transformation finds the manager name for a given employee based on the manager identification number and generates output rows that contain employee data. The output data includes the employee identification number, name, job title, and the name of the employee’s manager. If the employee has no manager in the source data, the transformation assumes the employee is at the top of the hierarchy in the company organizational chart. Note: The transformation logic assumes the employee job titles are arranged in descending order in the source file. Complete the following steps to import the sample mapping, create and compile a Java transformation, and create a session and workflow that contains the mapping: 1. Import the sample mapping. For more information, see “Step 1. Import the Mapping” on page 253. 2. Create the Java transformation and configure the Java transformation ports. For more information, see “Step 2. Create Transformation and Configure Ports” on page 254. 3. Enter the Java code for the transformation in the appropriate code entry tabs. For more information, see “Step 3. Enter Java Code” on page 256. 4. Compile the Java code. For more information, see “Step 4. Compile the Java Code” on page 261. 5. Create and run a session and workflow. For more information, see “Step 5. Create a Session and Workflow” on page 262. For a sample source and target file for the session, see “Sample Data” on page 262. The PowerCenter Client installation contains a mapping, m_jtx_hier_useCase.xml, and flat file source, hier_input, that you can use with this example. For more information about creating transformations, mappings, sessions, and workflows, see Getting Started. 252 Chapter 11: Java Transformation Example
• 285. Step 1. Import the Mapping
Import the metadata for the sample mapping in the Designer. The sample mapping contains the following components:
♦ Source definition and Source Qualifier transformation. Flat file source definition, hier_input, that defines the source data for the transformation.
♦ Target definition. Flat file target definition, hier_data, that receives the output data from the transformation.
You can import the metadata for the mapping from the following location:
<PowerCenter Client installation directory>\client\bin\m_jtx_hier_useCase.xml
Figure 11-1 shows the sample mapping: Figure 11-1. Java Transformation Example - Sample Mapping Step 1. Import the Mapping 253
• 286. Step 2. Create Transformation and Configure Ports
You create the Java transformation and configure the ports in the Mapping Designer. You can use the input and output port names as variables in the Java code. In a Java transformation, you create input and output ports in an input or output group. A Java transformation may contain only one input group and one output group. For more information about configuring ports in a Java transformation, see “Configuring Ports” on page 219.
In the Mapping Designer, create an active Java transformation and configure the ports. In this example, the transformation is named jtx_hier_useCase.
Note: To use the Java code in this example, you must use the exact names for the input and output ports.
Table 11-1 shows the input and output ports for the transformation:
Table 11-1. Input and Output Ports
Port Name             Port Type    Datatype    Precision    Scale
EMP_ID_INP            Input        Integer     10           0
EMP_NAME_INP          Input        String      100          0
EMP_AGE               Input        Integer     10           0
EMP_DESC_INP          Input        String      100          0
EMP_PARENT_EMPID      Input        Integer     10           0
EMP_ID_OUT            Output       Integer     10           0
EMP_NAME_OUT          Output       String      100          0
EMP_DESC_OUT          Output       String      100          0
EMP_PARENT_EMPNAME    Output       String      100          0
254 Chapter 11: Java Transformation Example
  • 287. Figure 11-2 shows the Ports tab in the Transformation Developer after you create the ports: Figure 11-2. Java Transformation Example - Ports Tab Step 2. Create Transformation and Configure Ports 255
  • 288. Step 3. Enter Java Code Enter Java code for the transformation in the following code entry tabs: ♦ Import Packages. Imports the java.util.Map and java.util.HashMap packages. For more information, see “Import Packages Tab” on page 256. ♦ Helper Code. Contains a Map object, lock object, and boolean variables used to track the state of data in the Java transformation. For more information, see “Helper Code Tab” on page 257. ♦ On Input Row. Contains the Java code that processes each input row in the transformation. For more information, see “On Input Row Tab” on page 258. For more information about using the code entry tabs to develop Java code, see “Developing Java Code” on page 225. Import Packages Tab Import third-party Java packages, built-in Java packages, or custom Java packages in the Import Packages tab. The example transformation uses the Map and HashMap packages. Enter the following code in the Import Packages tab: import java.util.Map; import java.util.HashMap; The Designer adds the import statements to the Java code for the transformation. 256 Chapter 11: Java Transformation Example
  • 289. Figure 11-3 shows the Import Packages code entry tab: Figure 11-3. Java Transformation Example - Import Packages Tab Helper Code Tab Declare user-defined variables and methods for the Java transformation on the Helper Code tab. The Helper Code tab defines the following variables that are used by the Java code in the On Input Row tab: ♦ empMap. Map object that stores the identification number and employee name from the source. ♦ lock. Lock object used to synchronize the access to empMap across partitions. ♦ generateRow. Boolean variable used to determine if an output row should be generated for the current input row. ♦ isRoot. Boolean variable used to determine if an employee is at the top of the company organizational chart (root). Enter the following code in the Helper Code tab: // Static Map object to store the ID and name relationship of an // employee. If a session uses multiple partitions, empMap is shared // across all partitions. private static Map empMap = new HashMap(); // Static lock object to synchronize the access to empMap across // partitions. private static Object lock = new Object(); Step 3. Enter Java Code 257
• 290. // Boolean to track whether to generate an output row based on validity
// of the input data.
private boolean generateRow;
// Boolean to track whether the employee is root.
private boolean isRoot;
Figure 11-4 shows the Helper Code tab: Figure 11-4. Java Transformation Example - Helper Code Tab
On Input Row Tab
The Java transformation executes the Java code in the On Input Row tab when the transformation receives an input row. In this example, the transformation may or may not generate an output row, based on the values of the input row.
Enter the following code in the On Input Row tab:
// Initially set generateRow to true for each input row.
generateRow = true;
// Initially set isRoot to false for each input row.
isRoot = false;
// Check if input employee id and name is null.
if (isNull ("EMP_ID_INP") || isNull ("EMP_NAME_INP")) {
incrementErrorCount(1);
// If input employee id and/or name is null, don't generate an output 258 Chapter 11: Java Transformation
• 291. // row for this input row.
generateRow = false;
} else {
// Set the output port values.
EMP_ID_OUT = EMP_ID_INP;
EMP_NAME_OUT = EMP_NAME_INP;
}
if (isNull ("EMP_DESC_INP")) {
setNull("EMP_DESC_OUT");
} else {
EMP_DESC_OUT = EMP_DESC_INP;
}
boolean isParentEmpIdNull = isNull("EMP_PARENT_EMPID");
if(isParentEmpIdNull) {
// This employee is the root for the hierarchy.
isRoot = true;
logInfo("This is the root for this hierarchy.");
setNull("EMP_PARENT_EMPNAME");
}
synchronized(lock) {
// If the employee is not the root for this hierarchy, get the
// corresponding parent id.
if(!isParentEmpIdNull)
EMP_PARENT_EMPNAME = (String) (empMap.get(new Integer (EMP_PARENT_EMPID)));
// Add employee to the map for future reference.
empMap.put (new Integer(EMP_ID_INP), EMP_NAME_INP);
}
// Generate row if generateRow is true.
if(generateRow)
generateRow();
Step 3. Enter Java Code 259
  • 292. Figure 11-5 shows the On Input Row tab: Figure 11-5. Java Transformation Example - On Input Row Tab 260 Chapter 11: Java Transformation Example
  • 293. Step 4. Compile the Java Code Click Compile in the Transformation Developer to compile the Java code for the transformation. The Output window displays the status of the compilation. If the Java code does not compile successfully, correct the errors in the code entry tabs and recompile the Java code. After you successfully compile the transformation, save the transformation to the repository. For more information about compiling Java code, see “Compiling a Java Transformation” on page 231. For more information about troubleshooting compilation errors, see “Fixing Compilation Errors” on page 232. Figure 11-6 shows the results of a successful compilation: Figure 11-6. Java Transformation Example - Successful Compilation Step 4. Compile the Java Code 261
• 294. Step 5. Create a Session and Workflow
Create a session and workflow for the mapping in the Workflow Manager, using the m_jtx_hier_useCase mapping. When you configure the session, you can use the sample source file from the following location:
<PowerCenter Client installation directory>\client\bin\hier_data
Sample Data
The following data is an excerpt from the sample source file:
1,James Davis,50,CEO,
4,Elaine Masters,40,Vice President - Sales,1
5,Naresh Thiagarajan,40,Vice President - HR,1
6,Jeanne Williams,40,Vice President - Software,1
9,Geetha Manjunath,34,Senior HR Manager,5
10,Dan Thomas,32,Senior Software Manager,6
14,Shankar Rahul,34,Senior Software Manager,6
20,Juan Cardenas,32,Technical Lead,10
21,Pramodh Rahman,36,Lead Engineer,14
22,Sandra Patterson,24,Software Engineer,10
23,Tom Kelly,32,Lead Engineer,10
35,Betty Johnson,27,Lead Engineer,14
50,Dave Chu,26,Software Engineer,23
70,Srihari Giran,23,Software Engineer,35
71,Frank Smalls,24,Software Engineer,35
The following data is an excerpt from a sample target file:
1,James Davis,CEO,
4,Elaine Masters,Vice President - Sales,James Davis
5,Naresh Thiagarajan,Vice President - HR,James Davis
6,Jeanne Williams,Vice President - Software,James Davis
9,Geetha Manjunath,Senior HR Manager,Naresh Thiagarajan
10,Dan Thomas,Senior Software Manager,Jeanne Williams
14,Shankar Rahul,Senior Software Manager,Jeanne Williams
20,Juan Cardenas,Technical Lead,Dan Thomas
21,Pramodh Rahman,Lead Engineer,Shankar Rahul
22,Sandra Patterson,Software Engineer,Dan Thomas
23,Tom Kelly,Lead Engineer,Dan Thomas
35,Betty Johnson,Lead Engineer,Shankar Rahul
50,Dave Chu,Software Engineer,Tom Kelly
70,Srihari Giran,Software Engineer,Betty Johnson
71,Frank Smalls,Software Engineer,Betty Johnson
262 Chapter 11: Java Transformation Example
  • 295. Chapter 12 Java Expressions This chapter includes the following topics: ♦ Overview, 264 ♦ Using the Define Expression Dialog Box, 266 ♦ Working with the Simple Interface, 271 ♦ Working with the Advanced Interface, 273 ♦ JExpression API Reference, 279 263
• 296. Overview
You can invoke PowerCenter expressions in a Java transformation with the Java programming language. Use expressions to extend the functionality of a Java transformation. For example, you can invoke an expression in a Java transformation to look up the values of input or output ports or look up the values of Java transformation variables.
To invoke an expression in a Java transformation, you either generate the Java code that invokes the expression or use Java transformation API methods to write that code yourself. You then invoke the expression and use the result of the expression in the appropriate code entry tab. Use the following methods to create and invoke expressions in a Java transformation:
♦ Use the Define Expression dialog box. Create an expression and generate the code for an expression. For more information, see “Using the Define Expression Dialog Box” on page 266.
♦ Use the simple interface. Use a single method to invoke an expression and get the result of the expression. For more information, see “Working with the Simple Interface” on page 271.
♦ Use the advanced interface. Use the advanced interface to define the expression, invoke the expression, and use the result of the expression. For more information, see “Working with the Advanced Interface” on page 273.
You can invoke expressions in a Java transformation without advanced knowledge of the Java programming language: the simple interface requires only a single method to invoke an expression. If you are familiar with object-oriented programming and want more control over invoking the expression, you can use the advanced interface.
Expression Function Types
You can create expressions for a Java transformation using the Expression Editor, by writing the expression in the Define Expression dialog box, or by using the simple or advanced interface. You can enter expressions that use input or output port variables or variables in the Java code as input parameters. If you use the Define Expression dialog box, you can use the Expression Editor to validate the expression before you use it in a Java transformation.
You can invoke the following types of expression functions in a Java transformation:
♦ Transformation language functions. SQL-like functions designed to handle common expressions.
♦ User-defined functions. Functions you create in PowerCenter based on transformation language functions.
♦ Custom functions. Functions you create with the Custom Function API.
♦ Unconnected transformations. You can use unconnected transformations in expressions. For example, you can use an unconnected Lookup transformation in an expression. 264 Chapter 12: Java Expressions
You can also use system variables, user-defined mapping and workflow variables, and predefined workflow variables such as $Session.status in expressions.

For more information about the transformation language and custom functions, see the Transformation Language Reference. For more information about user-defined functions, see "Working with User-Defined Functions" in the Designer Guide.
Using the Define Expression Dialog Box

When you define a Java expression, you configure the function, create the expression, and generate the code that invokes the expression. You can define the function and create the expression in the Define Expression dialog box.

To create an expression function and use the expression in a Java transformation, complete the following tasks:

1. Configure the function. Configure the function that invokes the expression, including the function name, description, and parameters. You use the function parameters when you create the expression. For more information, see "Step 1. Configure the Function" on page 266.
2. Create the expression. Create the expression syntax and validate the expression. For more information, see "Step 2. Create and Validate the Expression" on page 267.
3. Generate Java code. Use the Define Expression dialog box to generate the Java code that invokes the expression. The Designer places the code in the Java Expressions code entry tab in the Transformation Developer. For more information, see "Step 3. Generate Java Code for the Expression" on page 267.

After you generate the Java code, call the generated function in the appropriate code entry tab to invoke an expression or get a JExpression object, depending on whether you use the simple or advanced interface.

Note: To validate an expression when you create the expression, you must use the Define Expression dialog box.

Step 1. Configure the Function

You configure the function name, description, and input parameters for the Java function that invokes the expression.

Use the following rules and guidelines when you configure the function:

♦ Use a unique function name that does not conflict with an existing Java function in the transformation or with reserved Java keywords.
♦ You must configure the parameter name, Java datatype, precision, and scale. The input parameters are the values you pass when you call the function in the Java code for the transformation.
♦ To pass a Date datatype to an expression, use a String datatype for the input parameter. If an expression returns a Date datatype, you can use the return value as a String datatype in the simple interface and as a String or long datatype in the advanced interface.

For more information about the mapping between PowerCenter datatypes and Java datatypes, see "Datatype Mapping" on page 215.
Figure 12-1 shows the Define Expression dialog box where you configure the function and the expression for a Java transformation:

Figure 12-1. Define Expression Dialog Box

Step 2. Create and Validate the Expression

When you create the expression, use the parameters you configured for the function. You can also use transformation language functions, custom functions, or other user-defined functions in the expression. You can create and validate the expression in the Define Expression dialog box or in the Expression Editor.

When you enter expression syntax, follow the transformation language rules and guidelines. For more information about expression syntax, see "The Transformation Language" in the Transformation Language Reference. For more information about creating expressions, see "Working with Workflows" in the Workflow Administration Guide.

Step 3. Generate Java Code for the Expression

After you configure the function and function parameters and define and validate the expression, you can generate the Java code that invokes the expression. The Designer places the generated Java code in the Java Expressions code entry tab. Use the generated Java code to call the functions that invoke the expression in the code entry tabs in the Transformation Developer. You can generate the simple or advanced Java code.

After you generate the Java code that invokes an expression, you cannot edit the expression and revalidate it. To modify an expression after you generate the code, you must recreate the expression.
Figure 12-2 shows the Java Expressions code entry tab and the generated Java code for an expression in the advanced interface:

Figure 12-2. Java Expressions Code Entry Tab

Steps to Create an Expression and Generate Java Code

Complete the following procedure to create an expression and generate the Java code to invoke the expression.

To generate Java code that calls an expression:

1. In the Transformation Developer, open a Java transformation or create a new Java transformation.
2. Click the Java Code tab.
3. Click the Define Expression link.
   The Define Expression dialog box appears.
4. Enter a function name.
5. Optionally, enter a description for the expression. You can enter up to 2,000 characters.
6. Create the parameters for the function.
   When you create the parameters, configure the parameter name, datatype, precision, and scale.
7. Click Launch Editor to create an expression with the parameters you created in step 6.
8. Click Validate to validate the expression.
9. Optionally, you can enter the expression in the Expression field and click Validate to validate the expression.
10. If you want to generate Java code using the advanced interface, select Generate advanced code.
11. Click Generate.
    The Designer generates the function to invoke the expression in the Java Expressions code entry tab.

Java Expression Templates

You can generate the Java code for an expression using either the simple or the advanced interface. The Designer generates the Java code according to a template for the expression.

Simple Java Code

The following example shows the template for a Java expression generated for simple Java code:

Object function_name (Java datatype x1[, Java datatype x2 ...] )
throws SDKException
{
    return (Object)invokeJExpression( String expression,
        new Object [] { x1[, x2, ... ]} );
}

Advanced Java Code

The following example shows the template for a Java expression generated using the advanced interface:

JExpression function_name () throws SDKException
{
    JExprParamMetadata params[] = new JExprParamMetadata[number of parameters];
    params[0] = new JExprParamMetadata (
        EDataType.STRING,   // data type
        20,                 // precision
        0                   // scale
    );
    ...
    params[number of parameters - 1] = new JExprParamMetadata (
        EDataType.STRING,   // data type
        20,                 // precision
        0                   // scale
    );
    ...
    return defineJExpression(String expression, params);
}
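After the Designer generates a simple-interface function, you call it from a code entry tab such as On Input Row and cast the Object result to the datatype you need. The following one-line sketch assumes a generated function named lookupName and ports named FIRST_NAME, LAST_NAME, and FULL_NAME; all three names are illustrative, not part of the template:

// Call the generated simple function from the On Input Row tab and
// cast the Object return value to String.
FULL_NAME = (String)lookupName(FIRST_NAME, LAST_NAME);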
Working with the Simple Interface

Use the invokeJExpression Java API method to invoke an expression in the simple interface.

invokeJExpression

Invokes an expression and returns the value for the expression. Input parameters for invokeJExpression are a string value that represents the expression and an array of objects that contain the expression input parameters.

Use the following rules and guidelines when you use invokeJExpression:

♦ Return datatype. The return type of invokeJExpression is an Object. You must cast the return value of the function to the appropriate datatype. You can return values with Integer, Double, String, and byte[] datatypes.
♦ Row type. The row type for return values from invokeJExpression is INSERT. If you want to use a different row type for the return value, use the advanced interface. For more information, see "invoke" on page 279.
♦ Null values. If you pass a null value as a parameter or the return value for invokeJExpression is NULL, the value is treated as a null indicator. For example, if the return value of an expression is NULL and the return datatype is String, a string is returned with a value of null.
♦ Date datatype. You must convert input parameters with a Date datatype to String. To use the string in an expression as a Date datatype, use the to_date() function to convert the string to a Date datatype. Also, you must cast the return type of any expression that returns a Date datatype as a String.

Use the following syntax:

(datatype)invokeJExpression(
    String expression,
    Object[] paramMetadataArray);

Argument            Datatype   Input/Output   Description
expression          String     Input          String that represents the expression.
paramMetadataArray  Object[]   Input          Array of objects that contain the input parameters for the expression.

The following example concatenates the two strings "John" and "Smith" and returns "John Smith":

(String)invokeJExpression("concat(x1,x2)", new Object [] { "John ", "Smith" });
Note: The parameters passed to the expression must be numbered consecutively and start with the letter x. For example, to pass three parameters to an expression, name the parameters x1, x2, and x3.

Simple Interface Example

You can define and call expressions that use the invokeJExpression API in the Helper Code or On Input Row code entry tabs. The following example shows how to perform a lookup on the NAME and ADDRESS input ports in a Java transformation and assign the return value to the COMPANY_NAME output port.

Use the following code in the On Input Row code entry tab:

COMPANY_NAME = (String)invokeJExpression(":lkp.my_lookup(X1,X2)",
    new Object [] {NAME, ADDRESS} );
generateRow();
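Because the simple interface passes and returns Date values as strings, an expression that manipulates dates converts the string inside the expression, and the Java code casts the result back to String. The following sketch illustrates the Date datatype guidelines above; the port names DATE_ENTERED and NEXT_DAY are illustrative, and both are assumed to be String ports:

// DATE_ENTERED holds a date as a string, such as "01/22/1998". The
// expression converts it to a Date with to_date(), adds one day with
// add_to_date(), and the Date result comes back cast as a String.
NEXT_DAY = (String)invokeJExpression(
    "add_to_date(to_date(x1, 'MM/DD/YYYY'), 'DD', 1)",
    new Object [] { DATE_ENTERED } );
generateRow();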
Working with the Advanced Interface

You can use the object-oriented API methods in the advanced interface to define, invoke, and get the result of an expression.

The advanced interface contains the following classes and Java transformation APIs:

♦ EDataType class. Enumerates the datatypes for an expression. For more information, see "EDataType Class" on page 274.
♦ JExprParamMetadata class. Contains the metadata for each parameter in an expression. Parameter metadata includes datatype, precision, and scale. For more information, see "JExprParamMetadata Class" on page 274.
♦ defineJExpression API. Defines the expression, including the PowerCenter expression string and parameters. For more information, see "defineJExpression" on page 275.
♦ JExpression class. Contains the methods to create and invoke the expression, get the metadata and the expression result, and check the return datatype. For more information, see "JExpression API Reference" on page 279.

Steps to Invoke an Expression with the Advanced Interface

Complete the following process to define, invoke, and get the result of an expression:

1. In the Helper Code or On Input Row code entry tab, create an instance of JExprParamMetadata for each parameter for the expression and set the value of the metadata. Optionally, you can instantiate the JExprParamMetadata objects in defineJExpression.
2. Use defineJExpression to get the JExpression object for the expression.
3. In the appropriate code entry tab, invoke the expression with invoke.
4. Check whether the return value is null with isResultNull.
5. You can get the datatype of the return value or the metadata of the return value with getResultDataType and getResultMetadata.
6. Get the result of the expression using the appropriate API. You can use getInt, getDouble, getStringBuffer, and getBytes.

Rules and Guidelines for Working with the Advanced Interface

Use the following rules and guidelines when you work with expressions in the advanced interface:

♦ Null values. If you pass a null value as a parameter or if the result of an expression is null, the value is treated as a null indicator. For example, if the result of an expression is null and the return datatype is String, a string is returned with a value of null. You can check the result of an expression using isResultNull. For more information, see "isResultNull" on page 280.
♦ Date datatype. You must convert input parameters with a Date datatype to a String before you can use them in an expression. To use the string in an expression as a Date datatype, use the to_date() function to convert the string to a Date datatype. You can get the result of an expression that returns a Date datatype as a String or long datatype. For more information, see "getStringBuffer" on page 282 and "getLong" on page 281.

EDataType Class

Enumerates the Java datatypes used in expressions. You can use the EDataType class to get the return datatype of an expression or assign the datatype for a parameter in a JExprParamMetadata object. You do not need to instantiate the EDataType class.

Table 12-1 lists the enumerated values for Java datatypes in expressions:

Table 12-1. Enumerated Java Datatypes

Datatype       Enumerated Value
INT            1
DOUBLE         2
STRING         3
BYTE_ARRAY     4
DATE_AS_LONG   5

The following example shows how to use the EDataType class to assign a datatype of String to a JExprParamMetadata object:

JExprParamMetadata params[] = new JExprParamMetadata[2];
params[0] = new JExprParamMetadata (
    EDataType.STRING,   // data type
    20,                 // precision
    0                   // scale
);
...

JExprParamMetadata Class

Instantiates an object that represents the parameters for an expression and sets the metadata for the parameters. You use an array of JExprParamMetadata objects as input to defineJExpression to set the metadata for the input parameters. You can create an instance of the JExprParamMetadata object in the Java Expressions code entry tab or in defineJExpression.
Use the following syntax:

JExprParamMetadata paramMetadataArray[] = new JExprParamMetadata[numberOfParameters];
paramMetadataArray[0] = new JExprParamMetadata(datatype, precision, scale);
...
paramMetadataArray[numberOfParameters - 1] = new JExprParamMetadata(datatype, precision, scale);

Argument   Datatype    Input/Output   Description
datatype   EDataType   Input          Datatype of the parameter.
precision  Integer     Input          Precision of the parameter.
scale      Integer     Input          Scale of the parameter.

For example, use the following Java code to instantiate an array of two JExprParamMetadata objects with String datatypes, precision of 20, and scale of 0:

JExprParamMetadata params[] = new JExprParamMetadata[2];
params[0] = new JExprParamMetadata(EDataType.STRING, 20, 0);
params[1] = new JExprParamMetadata(EDataType.STRING, 20, 0);
return defineJExpression(":LKP.LKP_addresslookup(X1,X2)", params);

defineJExpression

Defines the expression, including the expression string and input parameters. Arguments for defineJExpression include an array of JExprParamMetadata objects that contains the input parameters and a string value that defines the expression syntax.

To use defineJExpression, you must instantiate an array of JExprParamMetadata objects that represent the input parameters for the expression. You set the metadata values for the parameters and pass the array as an argument to defineJExpression.
Use the following syntax:

defineJExpression(
    String expression,
    Object[] paramMetadataArray
);

Argument            Datatype   Input/Output   Description
expression          String     Input          String that represents the expression.
paramMetadataArray  Object[]   Input          Array of JExprParamMetadata objects that contain the input parameters for the expression.

For example, use the following Java code to create an expression to perform a lookup on two strings:

JExprParamMetadata params[] = new JExprParamMetadata[2];
params[0] = new JExprParamMetadata(EDataType.STRING, 20, 0);
params[1] = new JExprParamMetadata(EDataType.STRING, 20, 0);
defineJExpression(":lkp.mylookup(x1,x2)", params);

Note: The parameters passed to the expression must be numbered consecutively and start with the letter x. For example, to pass three parameters to an expression, name the parameters x1, x2, and x3.

JExpression Class

The JExpression class contains the methods to create and invoke an expression, return the value of an expression, and check the return datatype.

Table 12-2 lists the JExpression API methods:

Table 12-2. JExpression API Methods

Method Name        Description
invoke             Invokes an expression.
getResultDataType  Returns the datatype of the expression result.
getResultMetadata  Returns the metadata of the expression result.
isResultNull       Checks whether the value of an expression result is null.
getInt             Returns the value of an expression result as an Integer datatype.
getDouble          Returns the value of an expression result as a Double datatype.
getLong            Returns the value of an expression result as a Long datatype.
getStringBuffer    Returns the value of an expression result as a String datatype.
getBytes           Returns the value of an expression result as a byte[] datatype.
For more information about the JExpression class, including syntax, usage, and examples, see "JExpression API Reference" on page 279.

Advanced Interface Example

The following example shows how to use the advanced interface to create and invoke a lookup expression in a Java transformation. The Java code shows how to create a function that calls an expression and how to invoke the expression to get the return value. This example passes the values for two input ports with a String datatype, NAME and COMPANY, to the function addressLookup. The addressLookup function uses a lookup expression to look up the value for the ADDRESS output port.

Note: This example assumes you have an unconnected Lookup transformation in the mapping called LKP_addresslookup.

Use the following Java code in the Helper Code tab of the Transformation Developer:

JExpression addressLookup() throws SDKException
{
    JExprParamMetadata params[] = new JExprParamMetadata[2];
    params[0] = new JExprParamMetadata (
        EDataType.STRING,   // data type
        50,                 // precision
        0                   // scale
    );
    params[1] = new JExprParamMetadata (
        EDataType.STRING,   // data type
        50,                 // precision
        0                   // scale
    );
    return defineJExpression(":LKP.LKP_addresslookup(X1,X2)", params);
}
JExpression lookup = null;
boolean isJExprObjCreated = false;
Use the following Java code in the On Input Row tab to invoke the expression and return the value of the ADDRESS port:

...
if(!isJExprObjCreated)
{
    lookup = addressLookup();
    isJExprObjCreated = true;
}
lookup.invoke(new Object [] {NAME,COMPANY}, ERowType.INSERT);
EDataType addressDataType = lookup.getResultDataType();
if(addressDataType == EDataType.STRING)
{
    ADDRESS = (lookup.getStringBuffer()).toString();
}
else
{
    logError("Expression result datatype is incorrect.");
}
...
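If the lookup can return a null result, check the result before you read it. The following fragment is a sketch that extends the example above with isResultNull, assuming the same lookup object and ports:

// Read the result only when it is non-null and a String.
if(!lookup.isResultNull() && lookup.getResultDataType() == EDataType.STRING)
{
    ADDRESS = (lookup.getStringBuffer()).toString();
}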
JExpression API Reference

The JExpression class contains the following API methods:

♦ invoke
♦ getResultDataType
♦ getResultMetadata
♦ isResultNull
♦ getInt
♦ getDouble
♦ getLong
♦ getStringBuffer
♦ getBytes

invoke

Invokes an expression. Arguments for invoke include an object array that contains the input parameters and the row type. You must instantiate a JExpression object before you use invoke. You can use ERowType.INSERT, ERowType.DELETE, and ERowType.UPDATE for the row type.

Use the following syntax:

objectName.invoke(
    new Object[] { param1[, ... paramN ]},
    rowType
);

Argument    Datatype     Input/Output   Description
objectName  JExpression  Input          JExpression object name.
parameters  n/a          Input          Object array that contains the input values for the expression.
rowType     ERowType     Input          Row type for the return value of the expression.

For example, you create a function in the Java Expressions code entry tab named address_lookup() that returns a JExpression object that represents the expression. Use the following code to invoke the expression that uses input ports NAME and COMPANY:

JExpression myObject = address_lookup();
myObject.invoke(new Object[] { NAME,COMPANY }, ERowType.INSERT);
getResultDataType

Returns the datatype of an expression result. getResultDataType returns a value of EDataType. For more information about the EDataType enumerated class, see "EDataType Class" on page 274.

Use the following syntax:

objectName.getResultDataType();

For example, use the following code to invoke an expression and assign the datatype of the result to the variable dataType:

myObject.invoke(new Object[] { NAME,COMPANY }, ERowType.INSERT);
EDataType dataType = myObject.getResultDataType();

getResultMetadata

Returns the metadata for an expression result. For example, you can use getResultMetadata to get the precision, scale, and datatype of an expression result. You can assign the metadata of the return value from an expression to a JExprParamMetadata object. Use the getScale, getPrecision, and getDataType object methods to retrieve the result metadata.

Use the following syntax:

objectName.getResultMetadata();

For example, use the following Java code to assign the scale, precision, and datatype of the return value of myObject to variables:

JExprParamMetadata myMetadata = myObject.getResultMetadata();
int scale = myMetadata.getScale();
int prec = myMetadata.getPrecision();
int datatype = myMetadata.getDataType();

Note: The getDataType object method returns the integer value of the datatype, as enumerated in EDataType. For more information about the EDataType class, see "EDataType Class" on page 274.

isResultNull

Checks whether the value of an expression result is null.

Use the following syntax:

objectName.isResultNull();
For example, use the following Java code to invoke an expression and assign the return value of the expression to the variable address if the return value is not null:

JExpression myObject = address_lookup();
myObject.invoke(new Object[] { NAME,COMPANY }, ERowType.INSERT);
if(!myObject.isResultNull())
{
    String address = (myObject.getStringBuffer()).toString();
}

getInt

Returns the value of an expression result as an Integer datatype.

Use the following syntax:

objectName.getInt();

For example, use the following Java code to get the result of an expression that returns an employee ID number as an integer, where findEmpID is a JExpression object:

int empID = findEmpID.getInt();

getDouble

Returns the value of an expression result as a Double datatype.

Use the following syntax:

objectName.getDouble();

For example, use the following Java code to get the result of an expression that returns a salary value as a double, where JExprSalary is a JExpression object:

double salary = JExprSalary.getDouble();

getLong

Returns the value of an expression result as a Long datatype. You can use getLong to get the result of an expression that uses a Date datatype.

Use the following syntax:

objectName.getLong();

For example, use the following Java code to get the result of an expression that returns a Date value as a Long datatype, where JExprCurrentDate is a JExpression object:

long currDate = JExprCurrentDate.getLong();
getStringBuffer

Returns the value of an expression result as a String datatype.

Use the following syntax:

objectName.getStringBuffer();

For example, use the following Java code to get the result of an expression that returns two concatenated strings, where JExprConcat is a JExpression object:

String result = (JExprConcat.getStringBuffer()).toString();

getBytes

Returns the value of an expression result as a byte[] datatype. For example, you can use getBytes to get the result of an expression that encrypts data with the AES_ENCRYPT function.

Use the following syntax:

objectName.getBytes();

For example, use the following Java code to get the result of an expression that encrypts binary data using the AES_ENCRYPT function, where JExprEncryptData is a JExpression object:

byte[] newBytes = JExprEncryptData.getBytes();
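The following end-to-end sketch ties the advanced interface together: it defines an expression that encrypts a string port with the AES_ENCRYPT function, invokes it, and reads the binary result with getBytes. The function name encryptSSN, the port SSN, and the key literal are illustrative, not part of the API:

// Helper Code tab: define the expression for a single String parameter.
// In production code, define the expression once and cache the JExpression
// object, as in the isJExprObjCreated pattern shown earlier.
JExpression encryptSSN() throws SDKException
{
    JExprParamMetadata params[] = new JExprParamMetadata[1];
    params[0] = new JExprParamMetadata(EDataType.STRING, 64, 0);
    return defineJExpression("AES_ENCRYPT(x1, 'mykey')", params);
}

// On Input Row tab: invoke the expression and read the binary result.
JExpression expr = encryptSSN();
expr.invoke(new Object [] { SSN }, ERowType.INSERT);
if(!expr.isResultNull())
{
    byte[] encrypted = expr.getBytes();
}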
Chapter 13

Joiner Transformation

This chapter includes the following topics:

♦ Overview, 284
♦ Joiner Transformation Properties, 286
♦ Defining a Join Condition, 288
♦ Defining the Join Type, 289
♦ Using Sorted Input, 292
♦ Joining Data from a Single Source, 296
♦ Blocking the Source Pipelines, 299
♦ Working with Transactions, 300
♦ Creating a Joiner Transformation, 303
♦ Tips, 306
Overview

Transformation type:
Active
Connected

Use the Joiner transformation to join source data from two related heterogeneous sources residing in different locations or file systems. You can also join data from the same source. The Joiner transformation joins sources with at least one matching column. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.

The two input pipelines include a master pipeline and a detail pipeline or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.

Figure 13-1 shows the master and detail pipelines in a mapping with a Joiner transformation:

Figure 13-1. Mapping with Master and Detail Pipelines

To join more than two sources in a mapping, join the output from the Joiner transformation with another source pipeline. Add Joiner transformations to the mapping until you have joined all the source pipelines.

The Joiner transformation accepts input from most transformations. However, consider the following limitations on the pipelines you connect to the Joiner transformation:

♦ You cannot use a Joiner transformation when either input pipeline contains an Update Strategy transformation.
♦ You cannot use a Joiner transformation if you connect a Sequence Generator transformation directly before the Joiner transformation.

Working with the Joiner Transformation

When you work with the Joiner transformation, you must configure the transformation properties, join type, and join condition. You can configure the Joiner transformation for sorted input to improve Integration Service performance.
You can also configure the transformation scope to control how the Integration Service applies transformation logic.

To work with the Joiner transformation, complete the following tasks:

♦ Configure the Joiner transformation properties. Properties for the Joiner transformation identify the location of the cache directory, how the Integration Service processes the transformation, and how the Integration Service handles caching. For more information, see "Joiner Transformation Properties" on page 286.
♦ Configure the join condition. The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the type of join selected, the Integration Service either adds the row to the result set or discards the row. For more information, see "Defining a Join Condition" on page 288.
♦ Configure the join type. A join is a relational operator that combines data from multiple tables in different databases or flat files into a single result set. You can configure the Joiner transformation to use a Normal, Master Outer, Detail Outer, or Full Outer join type. For more information, see "Defining the Join Type" on page 289.
♦ Configure the session for sorted or unsorted input. You can improve session performance by configuring the Joiner transformation to use sorted input. To configure a mapping to use sorted data, you establish and maintain a sort order in the mapping so that the Integration Service can use the sorted data when it processes the Joiner transformation. For more information about configuring the Joiner transformation for sorted input, see "Using Sorted Input" on page 292.
♦ Configure the transaction scope. When the Integration Service processes a Joiner transformation, it can apply transformation logic to all data in a transaction, all incoming data, or one row of data at a time. For more information about configuring how the Integration Service applies transformation logic, see "Working with Transactions" on page 300.

If you have the partitioning option in PowerCenter, you can increase the number of partitions in a pipeline to improve session performance. For information about partitioning with the Joiner transformation, see "Working with Partition Points" in the Workflow Administration Guide.
Joiner Transformation Properties

Properties for the Joiner transformation identify the location of the cache directory, how the Integration Service processes the transformation, and how the Integration Service handles caching. The properties also determine how the Integration Service joins tables and files.

Figure 13-2 shows the Joiner transformation properties:

Figure 13-2. Joiner Transformation Properties Tab

When you create a mapping, you specify the properties for each Joiner transformation. When you create a session, you can override some properties, such as the index and data cache size for each transformation.

Table 13-1 describes the Joiner transformation properties:

Table 13-1. Joiner Transformation Properties

Case-Sensitive String Comparison
    If selected, the Integration Service uses case-sensitive string comparisons when performing joins on string columns.

Cache Directory
    Specifies the directory used to cache master or detail rows and the index to these rows. By default, the cache files are created in a directory specified by the process variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive.

Join Type
    Specifies the type of join: Normal, Master Outer, Detail Outer, or Full Outer.

Null Ordering in Master
    Not applicable for this transformation type.
Table 13-1. Joiner Transformation Properties (continued)

Null Ordering in Detail
    Not applicable for this transformation type.

Tracing Level
    Amount of detail displayed in the session log for this transformation. The options are Terse, Normal, Verbose Data, and Verbose Initialization.

Joiner Data Cache Size
    Data cache size for the transformation. Default cache size is 2,000,000 bytes. If the total configured cache size is 2 GB or more, you must run the session on a 64-bit Integration Service. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.

Joiner Index Cache Size
    Index cache size for the transformation. Default cache size is 1,000,000 bytes. If the total configured cache size is 2 GB or more, you must run the session on a 64-bit Integration Service. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.

Sorted Input
    Specifies that data is sorted. Choose Sorted Input to join sorted data. Using sorted input can improve performance. For more information, see "Using Sorted Input" on page 292.

Transformation Scope
    Specifies how the Integration Service applies the transformation logic to incoming data. You can choose Transaction, All Input, or Row. For more information, see "Working with Transactions" on page 300.
Defining a Join Condition

The join condition contains ports from both input sources that must match for the Integration Service to join two rows. Depending on the type of join selected, the Integration Service either adds the row to the result set or discards the row. The Joiner transformation produces result sets based on the join type, condition, and input data sources.

Before you define a join condition, verify that the master and detail sources are configured for optimal performance. During a session, the Integration Service compares each row of the master source against the detail source. To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the master source. To improve performance for a sorted Joiner transformation, use the source with fewer duplicate key values as the master.

By default, when you add ports to a Joiner transformation, the ports from the first source pipeline display as detail sources. Adding the ports from the second source pipeline sets them as master sources. To change these settings, click the M column on the Ports tab for the ports you want to set as the master source. This sets ports from this source as master ports and ports from the other source as detail ports.

You define one or more conditions based on equality between the specified master and detail sources. For example, if two sources with tables called EMPLOYEE_AGE and EMPLOYEE_POSITION both contain employee ID numbers, the following condition matches rows with employees listed in both sources:

EMP_ID1 = EMP_ID2

Use one or more ports from the input sources of a Joiner transformation in the join condition. Additional ports increase the time necessary to join two sources. The order of the ports in the condition can impact the performance of the Joiner transformation. If you use multiple ports in the join condition, the Integration Service compares the ports in the order you specify.

The Designer validates datatypes in a condition. Both ports in a condition must have the same datatype. If you need to use two ports in the condition with non-matching datatypes, convert the datatypes so they match.

If you join Char and Varchar datatypes, the Integration Service counts any spaces that pad Char values as part of the string:

Char(40) = "abcd"
Varchar(40) = "abcd"

The Char value is "abcd" padded with 36 blank spaces, and the Integration Service does not join the two fields because the Char field contains trailing spaces.

Note: The Joiner transformation does not match null values. For example, if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not consider them a match and does not join the two rows. To join rows with null values, replace null input with default values, and then join on the default values. For more information about default values, see "Using Default Values for Ports" on page 18.
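For example, if EMP_ID1 and EMP_ID2 can contain nulls, you might replace the nulls upstream with a transformation language expression such as IIF(ISNULL(EMP_ID1), 0, EMP_ID1), or assign both ports a default value of 0 on the Ports tab. Rows with null employee IDs then join on the substitute value, so choose a value that cannot collide with a real key. The port names and the value 0 here are illustrative.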
Defining the Join Type

In SQL, a join is a relational operator that combines data from multiple tables into a single result set. The Joiner transformation is similar to an SQL join except that data can originate from different types of sources.

You define the join type on the Properties tab in the transformation. The Joiner transformation supports the following types of joins:

♦ Normal
♦ Master Outer
♦ Detail Outer
♦ Full Outer

Note: A normal or master outer join performs faster than a full outer or detail outer join.

If a result set includes fields that do not contain data in either of the sources, the Joiner transformation populates the empty fields with null values. If you know that a field will return a NULL and you do not want to insert NULLs in the target, you can set a default value on the Ports tab for the corresponding port.

Normal Join

With a normal join, the Integration Service discards all rows of data from the master and detail source that do not match, based on the condition. For example, you might have two sources of data for auto parts called PARTS_SIZE and PARTS_COLOR with the following data:

PARTS_SIZE (master source)

PART_ID1   DESCRIPTION   SIZE
1          Seat Cover    Large
2          Ash Tray      Small
3          Floor Mat     Medium

PARTS_COLOR (detail source)

PART_ID2   DESCRIPTION   COLOR
1          Seat Cover    Blue
3          Floor Mat     Black
4          Fuzzy Dice    Yellow

To join the two tables by matching the PART_IDs in both sources, you set the condition as follows:

PART_ID1 = PART_ID2
When you join these tables with a normal join, the result set includes the following data:

PART_ID   DESCRIPTION   SIZE     COLOR
1         Seat Cover    Large    Blue
3         Floor Mat     Medium   Black

The following example shows the equivalent SQL statement:

SELECT * FROM PARTS_SIZE, PARTS_COLOR WHERE PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2

Master Outer Join

A master outer join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.

When you join the sample tables with a master outer join and the same condition, the result set includes the following data:

PART_ID   DESCRIPTION   SIZE     COLOR
1         Seat Cover    Large    Blue
3         Floor Mat     Medium   Black
4         Fuzzy Dice    NULL     Yellow

Because no size is specified for the Fuzzy Dice, the Integration Service populates the field with a NULL.

The following example shows the equivalent SQL statement:

SELECT * FROM PARTS_SIZE RIGHT OUTER JOIN PARTS_COLOR ON (PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2)

Detail Outer Join

A detail outer join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.

When you join the sample tables with a detail outer join and the same condition, the result set includes the following data:

PART_ID   DESCRIPTION   SIZE     COLOR
1         Seat Cover    Large    Blue
2         Ash Tray      Small    NULL
3         Floor Mat     Medium   Black

Because no color is specified for the Ash Tray, the Integration Service populates the field with a NULL.
The following example shows the equivalent SQL statement:

SELECT * FROM PARTS_SIZE LEFT OUTER JOIN PARTS_COLOR ON (PARTS_COLOR.PART_ID2 = PARTS_SIZE.PART_ID1)

Full Outer Join

A full outer join keeps all rows of data from both the master and detail sources.

When you join the sample tables with a full outer join and the same condition, the result set includes the following data:

PART_ID   DESCRIPTION   SIZE     COLOR
1         Seat Cover    Large    Blue
2         Ash Tray      Small    NULL
3         Floor Mat     Medium   Black
4         Fuzzy Dice    NULL     Yellow

Because no color is specified for the Ash Tray and no size is specified for the Fuzzy Dice, the Integration Service populates the fields with NULL.

The following example shows the equivalent SQL statement:

SELECT * FROM PARTS_SIZE FULL OUTER JOIN PARTS_COLOR ON (PARTS_SIZE.PART_ID1 = PARTS_COLOR.PART_ID2)
Using Sorted Input

You can improve session performance by configuring the Joiner transformation to use sorted input. When you configure the Joiner transformation to use sorted data, the Integration Service improves performance by minimizing disk input and output. You see the greatest performance improvement when you work with large data sets.

To configure a mapping to use sorted data, you establish and maintain a sort order in the mapping so the Integration Service can use the sorted data when it processes the Joiner transformation. Complete the following tasks to configure the mapping:

♦ Configure the sort order. Configure the sort order of the data you want to join. You can join sorted flat files, or you can sort relational data using a Source Qualifier transformation. You can also use a Sorter transformation.
♦ Add transformations. Use transformations that maintain the order of the sorted data.
♦ Configure the Joiner transformation. Configure the Joiner transformation to use sorted data and configure the join condition to use the sort origin ports. The sort origin represents the source of the sorted data.

When you configure the sort order in a session, you can select a sort order associated with the Integration Service code page. When you run the Integration Service in Unicode mode, it uses the selected session sort order to sort character data. When you run the Integration Service in ASCII mode, it sorts all character data using a binary sort order. To ensure that data is sorted as the Integration Service requires, the database sort order must be the same as the user-defined session sort order.

When you join sorted data from partitioned pipelines, you must configure the partitions to maintain the order of sorted data. For more information about joining data from partitioned pipelines, see "Working with Partition Points" in the Workflow Administration Guide.

Configuring the Sort Order

You must configure the sort order to ensure that the Integration Service passes sorted data to the Joiner transformation.

Configure the sort order using one of the following methods:

♦ Use sorted flat files. When the flat files contain sorted data, verify that the order of the sort columns matches in each source file.
♦ Use sorted relational data. Use sorted ports in the Source Qualifier transformation to sort columns from the source database. Configure the order of the sorted ports the same in each Source Qualifier transformation. For more information about using sorted ports, see "Using Sorted Ports" on page 472.
♦ Use Sorter transformations. Use a Sorter transformation to sort relational or flat file data. Place a Sorter transformation in the master and detail pipelines. Configure each Sorter transformation to use the same order of the sort key ports and the sort order direction.
For more information about using the Sorter transformation, see "Creating a Sorter Transformation" on page 443.

If you pass unsorted or incorrectly sorted data to a Joiner transformation configured to use sorted data, the session fails and the Integration Service logs the error in the session log file.

Adding Transformations to the Mapping

When you add transformations between the sort origin and the Joiner transformation, use the following guidelines to maintain sorted data:

♦ Do not place any of the following transformations between the sort origin and the Joiner transformation:
  − Custom
  − Unsorted Aggregator
  − Normalizer
  − Rank
  − Union transformation
  − XML Parser transformation
  − XML Generator transformation
  − Mapplet, if it contains one of the above transformations
♦ You can place a sorted Aggregator transformation between the sort origin and the Joiner transformation if you use the following guidelines:
  − Configure the Aggregator transformation for sorted input using the guidelines in "Using Sorted Input" on page 45.
  − Use the same ports for the group by columns in the Aggregator transformation as the ports at the sort origin.
  − The group by ports must be in the same order as the ports at the sort origin.
♦ When you join the result set of a Joiner transformation with another pipeline, verify that the data output from the first Joiner transformation is sorted.

Tip: You can place the Joiner transformation directly after the sort origin to maintain sorted data.

Configuring the Joiner Transformation

To configure the Joiner transformation, complete the following tasks:

♦ Enable Sorted Input on the Properties tab.
♦ Define the join condition to receive sorted data in the same order as the sort origin.
Defining the Join Condition

Configure the join condition to maintain the sort order established at the sort origin: the sorted flat file, the Source Qualifier transformation, or the Sorter transformation. If you use a sorted Aggregator transformation between the sort origin and the Joiner transformation, treat the sorted Aggregator transformation as the sort origin when you define the join condition.

Use the following guidelines when you define join conditions:

♦ The ports you use in the join condition must match the ports at the sort origin.
♦ When you configure multiple join conditions, the ports in the first join condition must match the first ports at the sort origin.
♦ When you configure multiple conditions, the order of the conditions must match the order of the ports at the sort origin, and you must not skip any ports.
♦ The number of sorted ports in the sort origin can be greater than or equal to the number of ports in the join condition.

Example of a Join Condition

For example, you configure Sorter transformations in the master and detail pipelines with the following sorted ports:

1. ITEM_NO
2. ITEM_NAME
3. PRICE

When you configure the join condition, use the following guidelines to maintain sort order:

♦ You must use ITEM_NO in the first join condition.
♦ If you add a second join condition, you must use ITEM_NAME.
♦ If you want to use PRICE in a join condition, you must also use ITEM_NAME in the second join condition.

If you skip ITEM_NAME and join on ITEM_NO and PRICE, you lose the sort order and the Integration Service fails the session.
Figure 13-3 shows a mapping configured to sort and join on the ports ITEM_NO, ITEM_NAME, and PRICE:

Figure 13-3. Mapping Configured to Join Data from Two Pipelines

The master and detail Sorter transformations sort on the same ports in the same order. When you use the Joiner transformation to join the master and detail pipelines, you can configure any one of the following join conditions:

ITEM_NO = ITEM_NO1

or

ITEM_NO = ITEM_NO1
ITEM_NAME = ITEM_NAME1

or

ITEM_NO = ITEM_NO1
ITEM_NAME = ITEM_NAME1
PRICE = PRICE1
Joining Data from a Single Source

You may want to join data from the same source if you want to perform a calculation on part of the data and join the transformed data with the original data. When you join the data using this method, you can maintain the original data and transform parts of that data within one mapping.

You can join data from the same source in the following ways:

♦ Join two branches of the same pipeline.
♦ Join two instances of the same source.

Joining Two Branches of the Same Pipeline

When you join data from the same source, you can create two branches of the pipeline. When you branch a pipeline, you must add a transformation between the source qualifier and the Joiner transformation in at least one branch of the pipeline. You must join sorted data and configure the Joiner transformation for sorted input.

For example, you have a source with the following ports:

♦ Employee
♦ Department
♦ Total Sales

In the target, you want to view the employees who generated sales that were greater than the average sales for their departments. To do this, you create a mapping with the following transformations:

♦ Sorter transformation. Sorts the data.
♦ Sorted Aggregator transformation. Averages the sales data and groups by department. When you perform this aggregation, you lose the data for individual employees. To maintain employee data, you must pass a branch of the pipeline to the Aggregator transformation and pass a branch with the same data to the Joiner transformation to maintain the original data. When you join both branches of the pipeline, you join the aggregated data with the original data.
♦ Sorted Joiner transformation. Uses a sorted Joiner transformation to join the sorted aggregated data with the original data.
♦ Filter transformation. Compares the average sales data against the sales data for each employee and filters out employees whose sales do not exceed the department average, as illustrated below.
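As a concrete illustration, the Filter transformation condition in this mapping might be TOTAL_SALES > AVG_SALES, assuming the Aggregator output port that carries the department average is named AVG_SALES; both port names are illustrative and depend on how you name the ports in the mapping.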
Figure 13-4 shows joining two branches of the same pipeline:

Figure 13-4. Mapping that Joins Two Branches of a Pipeline

Note: You can also join data from output groups of the same transformation, such as the Custom transformation or XML Source Qualifier transformation. Place a Sorter transformation between each output group and the Joiner transformation and configure the Joiner transformation to receive sorted input.

Joining two branches might impact performance if the Joiner transformation receives data from one branch much later than the other branch. The Joiner transformation caches all the data from the first branch, and writes the cache to disk if the cache fills. The Joiner transformation must then read the data from disk when it receives the data from the second branch. This can slow processing.

Joining Two Instances of the Same Source

You can also join same source data by creating a second instance of the source. After you create the second source instance, you can join the pipelines from the two source instances. If you want to join unsorted data, you must create two instances of the same source and join the pipelines.

Figure 13-5 shows two instances of the same source joined using a Joiner transformation:

Figure 13-5. Mapping that Joins Two Instances of the Same Source

Note: When you join data using this method, the Integration Service reads the source data for each source instance, so performance can be slower than joining two branches of a pipeline.
Guidelines

Use the following guidelines when deciding whether to join branches of a pipeline or join two instances of a source:

♦ Join two branches of a pipeline when you have a large source or if you can read the source data only once. For example, you can only read source data from a message queue once.
♦ Join two branches of a pipeline when you use sorted data. If the source data is unsorted and you use a Sorter transformation to sort the data, branch the pipeline after you sort the data.
♦ Join two instances of a source when you need to add a blocking transformation to the pipeline between the source and the Joiner transformation.
♦ Join two instances of a source if one pipeline may process slower than the other pipeline.
♦ Join two instances of a source if you need to join unsorted data.
Blocking the Source Pipelines

When you run a session with a Joiner transformation, the Integration Service blocks and unblocks the source data, based on the mapping configuration and whether you configure the Joiner transformation for sorted input. For more information about blocking source data, see "Integration Service Architecture" in the Administrator Guide.

Unsorted Joiner Transformation

When the Integration Service processes an unsorted Joiner transformation, it reads all master rows before it reads the detail rows. To ensure it reads all master rows before the detail rows, the Integration Service blocks the detail source while it caches rows from the master source. Once the Integration Service reads and caches all master rows, it unblocks the detail source and reads the detail rows.

Some mappings with unsorted Joiner transformations violate data flow validation. For more information about mappings containing blocking transformations that violate data flow validation, see "Mappings" in the Designer Guide.

Sorted Joiner Transformation

When the Integration Service processes a sorted Joiner transformation, it blocks data based on the mapping configuration. Blocking logic is possible if master and detail input to the Joiner transformation originate from different sources.

The Integration Service uses blocking logic to process the Joiner transformation if it can do so without blocking all sources in a target load order group simultaneously. Otherwise, it does not use blocking logic. Instead, it stores more rows in the cache.

When the Integration Service can use blocking logic to process the Joiner transformation, it stores fewer rows in the cache, increasing performance.

Caching Master Rows

When the Integration Service processes a Joiner transformation, it reads rows from both sources concurrently and builds the index and data cache based on the master rows. The Integration Service then performs the join based on the detail source data and the cache data. The number of rows the Integration Service stores in the cache depends on the partitioning scheme, the source data, and whether you configure the Joiner transformation for sorted input.

To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the master source. To improve performance for a sorted Joiner transformation, use the source with fewer duplicate key values as the master.

For more information about Joiner transformation caches, see "Session Caches" in the Workflow Administration Guide.
Working with Transactions

When the Integration Service processes a Joiner transformation, it can apply transformation logic to all data in a transaction, all incoming data, or one row of data at a time. The Integration Service can drop or preserve transaction boundaries depending on the mapping configuration and the transformation scope. You configure how the Integration Service applies transformation logic and handles transaction boundaries using the transformation scope property.

You configure transformation scope values based on the mapping configuration and whether you want to preserve or drop transaction boundaries.

You can preserve transaction boundaries when you join the following sources:

♦ You join two branches of the same source pipeline. Use the Transaction transformation scope to preserve transaction boundaries. For information about preserving transaction boundaries for a single source, see "Preserving Transaction Boundaries for a Single Pipeline" on page 301.
♦ You join two sources, and you want to preserve transaction boundaries for the detail source. Use the Row transformation scope to preserve transaction boundaries in the detail pipeline. For more information about preserving transaction boundaries for the detail source, see "Preserving Transaction Boundaries in the Detail Pipeline" on page 301.

You can drop transaction boundaries when you join the following sources:

♦ You join two sources or two branches and you want to drop transaction boundaries. Use the All Input transformation scope to apply the transformation logic to all incoming data and drop transaction boundaries for both pipelines. For more information about dropping transaction boundaries for two pipelines, see "Dropping Transaction Boundaries for Two Pipelines" on page 302.

Table 13-2 summarizes how to preserve transaction boundaries using transformation scopes with the Joiner transformation:

Table 13-2. Integration Service Behavior with Transformation Scopes for the Joiner Transformation

Transformation Scope   Input Type         Integration Service Behavior
Row                    Unsorted           Preserves transaction boundaries in the detail pipeline.
Row                    Sorted             Session fails.
Transaction*           Sorted             Preserves transaction boundaries when master and detail originate from the same transaction generator. Session fails when master and detail do not originate from the same transaction generator.
Transaction*           Unsorted           Session fails.
All Input*             Sorted, Unsorted   Drops transaction boundaries.

*Sessions fail if you use real-time data with the All Input or Transaction transformation scopes.
For more information about transformation scope and transaction boundaries, see "Understanding Commit Points" in the Workflow Administration Guide.

Preserving Transaction Boundaries for a Single Pipeline

When you join data from the same source, use the Transaction transformation scope to preserve incoming transaction boundaries for a single pipeline. Use the Transaction transformation scope when the Joiner transformation joins data from the same source, either two branches of the same pipeline or two output groups of one transaction generator. Use this transformation scope with sorted data and any join type.

When you use the Transaction transformation scope, verify that master and detail pipelines originate from the same transaction control point and that you use sorted input. For example, in Figure 13-6 the Sorter transformation is the transaction control point. You cannot place another transaction control point between the Sorter transformation and the Joiner transformation.

Figure 13-6 shows a mapping configured to join two branches of a pipeline and preserve transaction boundaries:

Figure 13-6. Preserving Transaction Boundaries when You Join Two Pipeline Branches

Preserving Transaction Boundaries in the Detail Pipeline

When you want to preserve the transaction boundaries in the detail pipeline, choose the Row transformation scope. The Row transformation scope allows the Integration Service to process data one row at a time. The Integration Service caches the master data and matches the detail data with the cached master data.

When the source data originates from a real-time source, such as IBM MQSeries, the Integration Service matches the cached master data with each message as it is read from the detail source.

Use the Row transformation scope with Normal and Master Outer join types that use unsorted data.
Dropping Transaction Boundaries for Two Pipelines

When you want to join data from two sources or two branches and you do not need to preserve transaction boundaries, use the All Input transformation scope. When you use All Input, the Integration Service drops incoming transaction boundaries for both pipelines and outputs all rows from the transformation as an open transaction. At the Joiner transformation, the data from the master pipeline can be cached or joined concurrently, depending on how you configure the sort order. Use this transformation scope with sorted and unsorted data and any join type.

For more information about configuring the sort order, see "Joiner Transformation Properties" on page 286.
Creating a Joiner Transformation

To use a Joiner transformation, add a Joiner transformation to the mapping, set up the input sources, and configure the transformation with a condition, join type, and sort type.

To create a Joiner transformation:

1. In the Mapping Designer, click Transformation > Create. Select the Joiner transformation. Enter a name, and click OK. The naming convention for Joiner transformations is JNR_TransformationName. Enter a description for the transformation. The Designer creates the Joiner transformation.
2. Drag all the input/output ports from the first source into the Joiner transformation. The Designer creates input/output ports for the source fields in the Joiner transformation as detail fields by default. You can edit this property later.
3. Select and drag all the input/output ports from the second source into the Joiner transformation. The Designer configures the second set of source fields as master fields by default.
4. Double-click the title bar of the Joiner transformation to open the transformation.
5. Click the Ports tab.
6. Click any box in the M column to switch the master/detail relationship for the sources.
Tip: To improve performance for an unsorted Joiner transformation, use the source with fewer rows as the master source. To improve performance for a sorted Joiner transformation, use the source with fewer duplicate key values as the master.

7. Add default values for specific ports. Some ports are likely to contain null values, since the fields in one of the sources may be empty. You can specify a default value if the target database does not handle NULLs.
8. Click the Condition tab and set the join condition.
9. Click the Add button to add a condition. You can add multiple conditions. The master and detail ports must have matching datatypes. The Joiner transformation only supports equivalent (=) joins. For more information about defining the join condition, see “Defining a Join Condition” on page 288.
10. Click the Properties tab and configure properties for the transformation. Note: You can edit the join condition from the Condition tab. The keyword AND separates multiple conditions. For more information about defining the properties, see “Joiner Transformation Properties” on page 286.
11. Click OK.
12. Click the Metadata Extensions tab to configure metadata extensions. For information about working with metadata extensions, see “Metadata Extensions” in the Repository Guide.
13. Click Repository > Save to save changes to the mapping.
Tips

The following tips can help improve session performance.

Perform joins in a database when possible.
Performing a join in a database is faster than performing a join in the session. In some cases, this is not possible, such as joining tables from two different databases or flat file systems. To perform a join in a database, use one of the following options (see the SQL sketch at the end of these tips):
♦ Create a pre-session stored procedure to join the tables in a database.
♦ Use the Source Qualifier transformation to perform the join. For more information, see “Joining Source Data” on page 454.

Join sorted data when possible.
You can improve session performance by configuring the Joiner transformation to use sorted input. When you configure the Joiner transformation to use sorted data, the Integration Service improves performance by minimizing disk input and output. You see the greatest performance improvement when you work with large data sets. For more information, see “Using Sorted Input” on page 292.

For an unsorted Joiner transformation, designate the source with fewer rows as the master source.
For optimal performance and disk storage, designate the source with fewer rows as the master source. During a session, the Joiner transformation compares each row of the master source against the detail source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.

For a sorted Joiner transformation, designate the source with fewer duplicate key values as the master source.
For optimal performance and disk storage, designate the source with fewer duplicate key values as the master source. When the Integration Service processes a sorted Joiner transformation, it caches rows for one hundred keys at a time. If the master source contains many rows with the same key value, the Integration Service must cache more rows, and performance can be slowed.
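As a minimal sketch of the Source Qualifier approach, the following user-defined join shows the kind of SQL you might supply when both tables reside in the same database. The table and column names are hypothetical examples, not objects defined in this guide:

SELECT ORDERS.ORDER_ID,
       ORDERS.ORDER_DATE,
       CUSTOMERS.CUSTOMER_NAME
FROM ORDERS,
     CUSTOMERS
WHERE ORDERS.CUSTOMER_ID = CUSTOMERS.CUSTOMER_ID

Because the database performs the join, the mapping reads a single joined pipeline and the session builds no Joiner cache.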
Chapter 14
Lookup Transformation

This chapter includes the following topics:
♦ Overview, 308
♦ Connected and Unconnected Lookups, 309
♦ Relational and Flat File Lookups, 311
♦ Lookup Components, 313
♦ Lookup Properties, 316
♦ Lookup Query, 324
♦ Lookup Condition, 328
♦ Lookup Caches, 330
♦ Configuring Unconnected Lookup Transformations, 331
♦ Creating a Lookup Transformation, 335
♦ Tips, 336
Overview

Transformation type: Passive, Connected/Unconnected

Use a Lookup transformation in a mapping to look up data in a flat file or a relational table, view, or synonym. You can import a lookup definition from any flat file or relational database to which both the PowerCenter Client and Integration Service can connect. You can use multiple Lookup transformations in a mapping.

The Integration Service queries the lookup source based on the lookup ports in the transformation. It compares Lookup transformation port values to lookup source column values based on the lookup condition. Pass the result of the lookup to other transformations and a target.

Use the Lookup transformation to perform many tasks, including:
♦ Get a related value. For example, the source includes employee ID, but you want to include the employee name in the target table to make the summary data easier to read. (A conceptual query sketch appears at the end of this overview.)
♦ Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
♦ Update slowly changing dimension tables. Use a Lookup transformation to determine whether rows already exist in the target.

You can configure the Lookup transformation to complete the following types of lookups:
♦ Connected or unconnected. Connected and unconnected transformations receive input and send output in different ways.
♦ Relational or flat file lookup. When you create a Lookup transformation, you can choose to perform a lookup on a flat file or a relational table. When you create a Lookup transformation using a relational table as the lookup source, you can connect to the lookup source using ODBC and import the table definition as the structure for the Lookup transformation. When you create a Lookup transformation using a flat file as a lookup source, the Designer invokes the Flat File Wizard. For more information about using the Flat File Wizard, see “Working with Flat Files” in the Designer Guide.
♦ Cached or uncached. Sometimes you can improve session performance by caching the lookup table. If you cache the lookup, you can choose to use a dynamic or static cache. By default, the lookup cache remains static and does not change during the session. With a dynamic cache, the Integration Service inserts or updates rows in the cache during the session. When you cache the target table as the lookup, you can look up values in the target and insert them if they do not exist, or update them if they do.
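To make the related-value task concrete: for an uncached relational lookup, the Integration Service issues a SELECT against the lookup source for each input row, comparing the lookup ports named in the condition to the input values. The following sketch shows the kind of query that might be issued for a single input row; the EMPLOYEES table and its columns are hypothetical, not objects defined in this guide:

SELECT EMPLOYEES.EMPLOYEE_NAME
FROM EMPLOYEES
WHERE EMPLOYEES.EMPLOYEE_ID = 2100

A cached lookup instead issues one query for the whole lookup source and answers each subsequent row from the cache.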
Connected and Unconnected Lookups

You can configure a connected Lookup transformation to receive input directly from the mapping pipeline, or you can configure an unconnected Lookup transformation to receive input from the result of an expression in another transformation.

Table 14-1 lists the differences between connected and unconnected lookups:

Table 14-1. Differences Between Connected and Unconnected Lookups
♦ Input. A connected lookup receives input values directly from the pipeline. An unconnected lookup receives input values from the result of a :LKP expression in another transformation.
♦ Cache type. A connected lookup can use a dynamic or static cache. An unconnected lookup uses a static cache.
♦ Cache contents. A connected lookup cache includes all lookup columns used in the mapping (that is, lookup source columns included in the lookup condition and lookup source columns linked as output ports to other transformations). An unconnected lookup cache includes all lookup/output ports in the lookup condition and the lookup/return port.
♦ Return values. A connected lookup can return multiple columns from the same row or insert into the dynamic lookup cache. An unconnected lookup designates one return port (R) and returns one column from each row.
♦ No match for the condition. A connected lookup returns the default value for all output ports; if you configure dynamic caching, the Integration Service inserts rows into the cache or leaves it unchanged. An unconnected lookup returns NULL.
♦ Match for the condition. A connected lookup returns the result of the lookup condition for all lookup/output ports; if you configure dynamic caching, the Integration Service either updates the row in the cache or leaves the row unchanged. An unconnected lookup returns the result of the lookup condition into the return port.
♦ Output. A connected lookup passes multiple output values to another transformation; you link lookup/output ports to another transformation. An unconnected lookup passes one output value to another transformation; the lookup/output/return port passes the value to the transformation calling the :LKP expression.
♦ Default values. A connected lookup supports user-defined default values. An unconnected lookup does not support user-defined default values.

Connected Lookup Transformation

The following steps describe how the Integration Service processes a connected Lookup transformation:
1. A connected Lookup transformation receives input values directly from another transformation in the pipeline.
2. For each input row, the Integration Service queries the lookup source or cache based on the lookup ports and the condition in the transformation.
3. If the transformation is uncached or uses a static cache, the Integration Service returns values from the lookup query.
If the transformation uses a dynamic cache, the Integration Service inserts the row into the cache when it does not find the row in the cache. When the Integration Service finds the row in the cache, it updates the row in the cache or leaves it unchanged. It flags the row as insert, update, or no change.
4. The Integration Service passes return values from the query to the next transformation. If the transformation uses a dynamic cache, you can pass rows to a Filter or Router transformation to filter new rows to the target.

Note: This chapter discusses connected Lookup transformations unless otherwise specified.

Unconnected Lookup Transformation

An unconnected Lookup transformation receives input values from the result of a :LKP expression in another transformation. You can call the Lookup transformation more than once in a mapping. A common use for unconnected Lookup transformations is to update slowly changing dimension tables. For more information about slowly changing dimension tables, visit the Informatica Knowledge Base at http://my.informatica.com.

The following steps describe the way the Integration Service processes an unconnected Lookup transformation:
1. An unconnected Lookup transformation receives input values from the result of a :LKP expression in another transformation, such as an Update Strategy transformation.
2. The Integration Service queries the lookup source or cache based on the lookup ports and condition in the transformation.
3. The Integration Service returns one value into the return port of the Lookup transformation.
4. The Lookup transformation passes the return value into the :LKP expression.

For more information about unconnected Lookup transformations, see “Configuring Unconnected Lookup Transformations” on page 331.
Relational and Flat File Lookups

When you create a Lookup transformation, you can choose to use a relational table or a flat file for the lookup source.

Relational Lookups

When you create a Lookup transformation using a relational table as a lookup source, you can connect to the lookup source using ODBC and import the table definition as the structure for the Lookup transformation. You can override the default SQL statement to add a WHERE clause or to query multiple tables.

Flat File Lookups

When you use a flat file for a lookup source, use any flat file definition in the repository, or import one. When you import a flat file lookup source, the Designer invokes the Flat File Wizard.

The following options apply to flat file lookups only:
♦ Use indirect files as lookup sources by specifying a file list as the lookup file name.
♦ Use sorted input for the lookup.
♦ Sort null data high or low. With relational lookups, null ordering is based on the database support.
♦ Use case-sensitive string comparison. With relational lookups, the case-sensitive comparison is based on the database support.

Using Sorted Input

When you configure a flat file Lookup transformation for sorted input, the condition columns must be grouped. If the condition columns are not grouped, the Integration Service cannot cache the lookup and fails the session. For best caching performance, sort the condition columns. For example, a Lookup transformation has the following condition:

OrderID = OrderID1
CustID = CustID1

In the following flat file lookup source, the keys are grouped, but not sorted. The Integration Service can cache the data, but performance may not be optimal.

OrderID  CustID  ItemNo.  ItemDesc
1001     CA502   F895S    Flashlight
1001     CA501   C530S    Compass
(Key data is grouped, but not sorted. CustID is out of order within OrderID.)
OrderID  CustID  ItemNo.  ItemDesc
1001     CA501   T552T    Tent
1005     OK503   S104E    Safety Knife
1003     CA500   F304T    First Aid Kit
1003     TN601   R938M    Regulator System
(Key data is grouped, but not sorted. OrderID is out of order.)

The keys are not grouped in the following flat file lookup source. The Integration Service cannot cache the data and fails the session.

OrderID  CustID  ItemNo.  ItemDesc
1001     CA501   T552T    Tent
1001     CA501   C530S    Compass
1005     OK503   S104E    Safety Knife
1003     TN601   R938M    Regulator System
1003     CA500   F304T    First Aid Kit
1001     CA502   F895S    Flashlight
(Key data for CustID is not grouped.)

If you choose sorted input for indirect files, the range of data must not overlap in the files.
Lookup Components

Define the following components when you configure a Lookup transformation in a mapping:
♦ Lookup source
♦ Ports
♦ Properties
♦ Condition
♦ Metadata extensions

Lookup Source

Use a flat file or a relational table for a lookup source. When you create a Lookup transformation, you can import the lookup source from the following locations:
♦ Any relational source or target definition in the repository
♦ Any flat file source or target definition in the repository
♦ Any table or file that both the Integration Service and PowerCenter Client machine can connect to

The lookup table can be a single table, or you can join multiple tables in the same database using a lookup SQL override. The Integration Service queries the lookup table or an in-memory cache of the table for all incoming rows into the Lookup transformation.

The Integration Service can connect to a lookup table using a native database driver or an ODBC driver. However, native database drivers provide better session performance.

Indexes and a Lookup Table

If you have privileges to modify the database containing a lookup table, you can improve lookup initialization time by adding an index to the lookup table. This is important for very large lookup tables. Since the Integration Service needs to query, sort, and compare values in these columns, the index needs to include every column used in a lookup condition.

You can improve performance by indexing the following types of lookup (see the sketch after this list):
♦ Cached lookups. You can improve performance by indexing the columns in the lookup ORDER BY. The session log contains the ORDER BY clause.
♦ Uncached lookups. Because the Integration Service issues a SELECT statement for each row passing into the Lookup transformation, you can improve performance by indexing the columns in the lookup condition.
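As a minimal sketch, assuming a hypothetical ITEMS_DIM lookup table with a condition on ITEM_ID and PRICE, the index might be created as follows; the exact DDL and index options depend on your database:

-- Hypothetical index covering both lookup condition columns.
-- Matching the column order of the generated ORDER BY tends to help cached lookups.
CREATE INDEX IDX_ITEMS_DIM_LKP
    ON ITEMS_DIM (ITEM_ID, PRICE);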
Lookup Ports

The Ports tab contains options similar to other transformations, such as port name, datatype, and scale. In addition to input and output ports, the Lookup transformation includes a lookup port type that represents columns of data in the lookup source. An unconnected Lookup transformation also includes a return port type that represents the return value.

Table 14-2 describes the port types in a Lookup transformation:

Table 14-2. Lookup Transformation Port Types
♦ I (Input port). Connected and unconnected lookups; minimum of 1. Create an input port for each lookup port you want to use in the lookup condition. You must have at least one input or input/output port in each Lookup transformation.
♦ O (Output port). Connected and unconnected lookups; minimum of 1. Create an output port for each lookup port you want to link to another transformation. You can designate both input and lookup ports as output ports. For connected lookups, you must have at least one output port. For unconnected lookups, use a lookup/output port as a return port (R) to designate a return value.
♦ L (Lookup port). Connected and unconnected lookups; minimum of 1. The Designer designates each column in the lookup source as a lookup (L) and output (O) port.
♦ R (Return port). Unconnected lookups only; exactly 1. Designates the column of data you want to return based on the lookup condition. You can designate one lookup/output port as the return port.

The Lookup transformation also enables an associated ports property that you configure when you use a dynamic cache.

Use the following guidelines to configure lookup ports:
♦ If you delete lookup ports from a flat file session, the session fails.
♦ You can delete lookup ports from a relational lookup if you are certain the mapping does not use the lookup port. This reduces the amount of memory the Integration Service uses to run the session.
♦ To ensure datatypes match when you add an input port, copy the existing lookup ports.

Lookup Properties

On the Properties tab, you can configure properties, such as an SQL override for relational lookups, the lookup source name, and the tracing level for the transformation. You can also configure caching properties on the Properties tab. For more information about lookup properties, see “Lookup Properties” on page 316.
Lookup Condition

On the Condition tab, you can enter the condition or conditions you want the Integration Service to use to determine whether input data qualifies values in the lookup source or cache. For more information about the lookup condition, see “Lookup Condition” on page 328.

Metadata Extensions

You can extend the metadata stored in the repository by associating information with repository objects, such as Lookup transformations. For example, when you create a Lookup transformation, you may want to store your name and the creation date with the Lookup transformation. You associate information with repository metadata using metadata extensions. For more information, see “Metadata Extensions” in the Repository Guide.
Lookup Properties

Properties for the Lookup transformation identify the database source, how the Integration Service processes the transformation, and how it handles caching and multiple matches. When you create a mapping, you specify the properties for each Lookup transformation. When you create a session, you can override some properties, such as the index and data cache size, for each transformation in the session properties.

Table 14-3 describes the Lookup transformation properties:

Table 14-3. Lookup Transformation Properties
♦ Lookup SQL Override (Relational). Overrides the default SQL statement to query the lookup table. Specifies the SQL statement you want the Integration Service to use for querying lookup values. Use only with the lookup cache enabled. For more information, see “Lookup Query” on page 324.
♦ Lookup Table Name (Relational). Specifies the name of the table from which the transformation looks up and caches values. You can import a table, view, or synonym from another database by selecting the Import button on the dialog box that appears when you first create a Lookup transformation. If you enter a lookup SQL override, you do not need to add an entry for this option.
♦ Lookup Caching Enabled (Flat File, Relational). Indicates whether the Integration Service caches lookup values during the session. When you enable lookup caching, the Integration Service queries the lookup source once, caches the values, and looks up values in the cache during the session. This can improve session performance. When you disable caching, each time a row passes into the transformation, the Integration Service issues a SELECT statement to the lookup source for lookup values. Note: The Integration Service always caches flat file lookups.
♦ Lookup Policy on Multiple Match (Flat File, Relational). Determines what happens when the Lookup transformation finds multiple rows that match the lookup condition. You can select the first or last row returned from the cache or lookup source, report an error, or allow the Lookup transformation to use any value. When you configure the Lookup transformation to return any matching value, the transformation returns the first value that matches the lookup condition and creates an index based on the key ports rather than all Lookup transformation ports. If you do not enable the Output Old Value On Update option, the Lookup Policy On Multiple Match option is set to Report Error for dynamic lookups. For more information about lookup caches, see “Lookup Caches” on page 337.
♦ Lookup Condition (Flat File, Relational). Displays the lookup condition you set on the Condition tab.
♦ Connection Information (Relational). Specifies the database containing the lookup table. You can select the database connection or use the $Source or $Target variable. If you use one of these variables, the lookup table must reside in the source or target database you specify when you configure the session. If you select the database connection, you can also specify what type of database connection it is: type Application: before the connection name if it is an Application connection, or type Relational: before the connection name if it is a relational connection. If you do not specify the type of database connection, the Integration Service fails the session if it cannot determine the type of database connection. For more information about using $Source and $Target, see “Configuring Relational Lookups in a Session” on page 322.
♦ Source Type (Flat File, Relational). Indicates whether the Lookup transformation reads values from a relational database or a flat file.
♦ Tracing Level (Flat File, Relational). Sets the amount of detail included in the session log when you run a session containing this transformation.
♦ Lookup Cache Directory Name (Flat File, Relational). Specifies the directory used to build the lookup cache files when you configure the Lookup transformation to cache the lookup source. Also used to save the persistent lookup cache files when you select the Lookup Persistent option. By default, the Integration Service uses the $PMCacheDir directory configured for the Integration Service.
♦ Lookup Cache Persistent (Flat File, Relational). Indicates whether the Integration Service uses a persistent lookup cache, which consists of at least two cache files. If a Lookup transformation is configured for a persistent lookup cache and persistent lookup cache files do not exist, the Integration Service creates the files during the session. Use only with the lookup cache enabled.
♦ Lookup Data Cache Size (Flat File, Relational). Indicates the maximum size the Integration Service allocates to the data cache in memory. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache. If the Integration Service cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Integration Service cannot store all the data cache data in memory, it pages to disk. The Lookup Data Cache Size is 2,000,000 bytes by default. The minimum size is 1,024 bytes. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service. Use only with the lookup cache enabled.
♦ Lookup Index Cache Size (Flat File, Relational). Indicates the maximum size the Integration Service allocates to the index cache in memory. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache. If the Integration Service cannot allocate the configured amount of memory when initializing the session, it fails the session. When the Integration Service cannot store all the index cache data in memory, it pages to disk. The Lookup Index Cache Size is 1,000,000 bytes by default. The minimum size is 1,024 bytes. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service. Use only with the lookup cache enabled.
♦ Dynamic Lookup Cache (Flat File, Relational). Indicates whether to use a dynamic lookup cache. When enabled, the Integration Service inserts or updates rows in the lookup cache as it passes rows to the target table. Use only with the lookup cache enabled.
♦ Output Old Value On Update (Flat File, Relational). Use only with dynamic caching enabled. When you enable this property, the Integration Service outputs old values out of the lookup/output ports. When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a new row in the cache, it outputs null values. When you disable this property, the Integration Service outputs the same values out of the lookup/output and input/output ports. This property is enabled by default.
♦ Cache File Name Prefix (Flat File, Relational). Use only with a persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files. The Integration Service uses the file name prefix as the file name for the persistent cache files it saves to disk. Enter only the prefix; do not enter .idx or .dat. You can enter a parameter or variable for the file name prefix. Use any parameter or variable type that you can define in the parameter file. For information about using parameter files, see “Parameter Files” in the Workflow Administration Guide. If the named persistent cache files exist, the Integration Service builds the memory cache from the files. If the named persistent cache files do not exist, the Integration Service rebuilds the persistent cache files.
♦ Recache From Lookup Source (Flat File, Relational). Use only with the lookup cache enabled. When selected, the Integration Service rebuilds the lookup cache from the lookup source when it first calls the Lookup transformation instance. If you use a persistent lookup cache, it rebuilds the persistent cache files before using the cache. If you do not use a persistent lookup cache, it rebuilds the lookup cache in memory before using the cache.
♦ Insert Else Update (Flat File, Relational). Use only with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of insert. When you select this property and the row type entering the Lookup transformation is insert, the Integration Service inserts the row into the cache if it is new, and updates the row if it exists. If you do not select this property, the Integration Service only inserts new rows into the cache when the row type entering the Lookup transformation is insert. For more information about defining the row type, see “Using Update Strategy Transformations with a Dynamic Cache” on page 354.
♦ Update Else Insert (Flat File, Relational). Use only with dynamic caching enabled. Applies to rows entering the Lookup transformation with the row type of update. When you select this property and the row type entering the Lookup transformation is update, the Integration Service updates the row in the cache if it exists, and inserts the row if it is new. If you do not select this property, the Integration Service only updates existing rows in the cache when the row type entering the Lookup transformation is update. For more information about defining the row type, see “Using Update Strategy Transformations with a Dynamic Cache” on page 354.
♦ Datetime Format (Flat File). If you do not define a datetime format for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here. You can enter any datetime format. Default is MM/DD/YYYY HH24:MI:SS.
♦ Thousand Separator (Flat File). If you do not define a thousand separator for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here. You can choose no separator, a comma, or a period. Default is no separator.
♦ Decimal Separator (Flat File). If you do not define a decimal separator for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here. You can choose a comma or a period as the decimal separator. Default is period.
♦ Case-Sensitive String Comparison (Flat File). If selected, the Integration Service uses case-sensitive string comparisons when performing lookups on string columns. Note: For relational lookups, the case-sensitive comparison is based on the database support.
♦ Null Ordering (Flat File). Determines how the Integration Service orders null values. You can choose to sort null values high or low. By default, the Integration Service sorts null values high. This overrides the Integration Service configuration to treat nulls in comparison operators as high, low, or null. Note: For relational lookups, null ordering is based on the database support.
♦ Sorted Input (Flat File). Indicates whether or not the lookup file data is sorted. This increases lookup performance for file lookups. If you enable sorted input, and the condition columns are not grouped, the Integration Service fails the session. If the condition columns are grouped, but not sorted, the Integration Service processes the lookup as if you did not configure sorted input. For more information about sorted input, see “Flat File Lookups” on page 311.

Configuring Lookup Properties in a Session

When you configure a session, you can configure lookup properties that are unique to sessions:
♦ Flat file lookups. Configure location information, such as the file directory, file name, and the file type.
♦ Relational lookups. You can define $Source and $Target variables in the session properties. You can also override connection information to use the session parameter $DBConnection. (A parameter file sketch follows this list.)
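As a minimal sketch of session-level lookup settings supplied through a parameter file, the following entries assume a hypothetical folder, workflow, session, connection name, and file name; the header format and parameter names follow the conventions described in “Parameter Files” in the Workflow Administration Guide:

[MyFolder.WF:wf_LoadItems.ST:s_m_LoadItems]
$DBConnection_Lkp=ORA_DW_Lookup
$LookupFileName=items_lookup.txt

Defining the values in a parameter file lets you redirect the lookup connection or lookup file between runs without editing the session.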
Configuring Flat File Lookups in a Session

Figure 14-1 shows the session properties for a flat file lookup:

Figure 14-1. Session Properties for Flat File Lookups
Table 14-4 describes the session properties you configure for flat file lookups:

Table 14-4. Session Properties for Flat File Lookups
♦ Lookup Source File Directory. Enter the directory name. By default, the Integration Service looks in the process variable directory, $PMLookupFileDir, for lookup files. You can enter the full path and file name. If you specify both the directory and file name in the Lookup Source Filename field, clear this field. The Integration Service concatenates this field with the Lookup Source Filename field when it runs the session. You can also use the $InputFileName session parameter to specify the file name. For more information about session parameters, see “Working with Sessions” in the Workflow Administration Guide.
♦ Lookup Source Filename. Name of the lookup file. If you use an indirect file, specify the name of the indirect file you want the Integration Service to read. You can also use the lookup file parameter, $LookupFileName, to change the name of the lookup file a session uses. If you specify both the directory and file name in the Lookup Source File Directory field, clear this field. The Integration Service concatenates this field with the Lookup Source File Directory field when it runs the session. For example, if you have “C:\lookup_data” in the Lookup Source File Directory field, then enter “filename.txt” in the Lookup Source Filename field. When the Integration Service begins the session, it looks for “C:\lookup_data\filename.txt”. For more information, see “Working with Sessions” in the Workflow Administration Guide.
♦ Lookup Source Filetype. Indicates whether the lookup source file contains the source data or a list of files with the same file properties. Choose Direct if the lookup source file contains the source data. Choose Indirect if the lookup source file contains a list of files. When you select Indirect, the Integration Service creates one cache for all files. If you use sorted input with indirect files, verify that the range of data in the files does not overlap. If the range of data overlaps, the Integration Service processes the lookup as if you did not configure sorted input.

Configuring Relational Lookups in a Session

When you configure a session, you specify the connection for the lookup database in the Connection node on the Mapping tab (Transformation view). You have the following options to specify a connection:
♦ Choose any relational connection.
♦ Use the connection variable, $DBConnection.
♦ Specify a database connection for $Source or $Target information.

If you use $Source or $Target for the lookup connection, configure the $Source Connection Value and $Target Connection Value in the session properties. This ensures that the Integration Service uses the correct database connection for the variable when it runs the session.
If you use $Source or $Target and you do not specify a Connection Value in the session properties, the Integration Service determines the database connection to use when it runs the session. It uses a source or target database connection for the source or target in the pipeline that contains the Lookup transformation. If it cannot determine which database connection to use, it fails the session.

The following list describes how the Integration Service determines the value of $Source or $Target when you do not specify $Source Connection Value or $Target Connection Value in the session properties:
♦ When you use $Source and the pipeline contains one source, the Integration Service uses the database connection you specify for the source.
♦ When you use $Source and the pipeline contains multiple sources joined by a Joiner transformation, the Integration Service uses different database connections, depending on the location of the Lookup transformation in the pipeline:
− When the Lookup transformation is after the Joiner transformation, the Integration Service uses the database connection for the detail table.
− When the Lookup transformation is before the Joiner transformation, the Integration Service uses the database connection for the source connected to the Lookup transformation.
♦ When you use $Target and the pipeline contains one target, the Integration Service uses the database connection you specify for the target.
♦ When you use $Target and the pipeline contains multiple relational targets, the session fails.
♦ When you use $Source or $Target in an unconnected Lookup transformation, the session fails.
Lookup Query

The Integration Service queries the lookup based on the ports and properties you configure in the Lookup transformation. The Integration Service runs a default SQL statement when the first row enters the Lookup transformation. If you use a relational lookup, you can customize the default query with the Lookup SQL Override property.

Default Lookup Query

The default lookup query contains the following statements:
♦ SELECT. The SELECT statement includes all the lookup ports in the mapping. You can view the SELECT statement by generating SQL using the Lookup SQL Override property. Do not add or delete any columns from the default SQL statement.
♦ ORDER BY. The ORDER BY clause orders the columns in the same order they appear in the Lookup transformation. The Integration Service generates the ORDER BY clause. You cannot view this clause when you generate the default SQL using the Lookup SQL Override property.

Overriding the Lookup Query

The lookup SQL override is similar to entering a custom query in a Source Qualifier transformation. You can override the lookup query for a relational lookup. You can enter the entire override, or you can generate and edit the default SQL statement. When the Designer generates the default SQL statement for the lookup SQL override, it includes the lookup/output ports in the lookup condition and the lookup/return port.

Override the lookup query in the following circumstances:
♦ Override the ORDER BY clause. Create the ORDER BY clause with fewer columns to increase performance. When you override the ORDER BY clause, you must suppress the generated ORDER BY clause with a comment notation. For more information, see “Overriding the ORDER BY Clause” on page 325. Note: If you use pushdown optimization, you cannot override the ORDER BY clause or suppress the generated ORDER BY clause with a comment notation.
♦ The lookup table name or a column name contains a reserved word. If the table name or any column name in the lookup query contains a reserved word, you must ensure that all reserved words are enclosed in quotes. For more information, see “Reserved Words” on page 326.
♦ Use parameters and variables. Use parameters and variables when you enter a lookup SQL override. Use any parameter or variable type that you can define in the parameter file. You can enter a parameter or variable within the SQL statement, or you can use a parameter or variable as the SQL query. For example, you can use a session parameter, $ParamMyLkpOverride, as the lookup SQL query, and set $ParamMyLkpOverride to the SQL statement in a parameter file.
The Designer cannot expand parameters and variables in the query override and does not validate the override when you use a parameter or variable. The Integration Service expands the parameters and variables when you run the session. For more information about using mapping parameters and variables in expressions, see “Mapping Parameters and Variables” in the Designer Guide. For more information about parameter files, see “Parameter Files” in the Workflow Administration Guide.
♦ A lookup column name contains a slash (/) character. When generating the default lookup query, the Designer and Integration Service replace any slash character (/) in the lookup column name with an underscore character. To query lookup column names containing the slash character, override the default lookup query, replace the underscore characters with the slash character, and enclose the column name in double quotes.
♦ Add a WHERE clause. Use a lookup SQL override to add a WHERE clause to the default SQL statement. You might want to use the WHERE clause to reduce the number of rows included in the cache. When you add a WHERE clause to a Lookup transformation using a dynamic cache, use a Filter transformation before the Lookup transformation. This ensures the Integration Service only inserts rows into the dynamic cache and target table that match the WHERE clause. For more information, see “Using the WHERE Clause with a Dynamic Cache” on page 358. Note: The session fails if you include large object ports in a WHERE clause.
♦ Other. Use a lookup SQL override if you want to query lookup data from multiple lookups or if you want to modify the data queried from the lookup table before the Integration Service caches the lookup rows. For example, use TO_CHAR to convert dates to strings.

Overriding the ORDER BY Clause

By default, the Integration Service generates an ORDER BY clause for a cached lookup. The ORDER BY clause contains all lookup ports. To increase performance, you can suppress the default ORDER BY clause and enter an override ORDER BY with fewer columns.

Note: If you use pushdown optimization, you cannot override the ORDER BY clause or suppress the generated ORDER BY clause with a comment notation.

The Integration Service always generates an ORDER BY clause, even if you enter one in the override. Place two dashes ‘--’ after the ORDER BY override to suppress the generated ORDER BY clause. For example, a Lookup transformation uses the following lookup condition:

ITEM_ID = IN_ITEM_ID
PRICE <= IN_PRICE

The Lookup transformation includes three lookup ports used in the mapping: ITEM_ID, ITEM_NAME, and PRICE. When you enter the ORDER BY clause, enter the columns in the same order as the ports in the lookup condition. You must also enclose all database reserved words in quotes. Enter the following lookup query in the lookup SQL override:

SELECT ITEMS_DIM.ITEM_NAME, ITEMS_DIM.PRICE, ITEMS_DIM.ITEM_ID FROM ITEMS_DIM ORDER BY ITEMS_DIM.ITEM_ID, ITEMS_DIM.PRICE --
To override the default ORDER BY clause for a relational lookup, complete the following steps:
1. Generate the lookup query in the Lookup transformation.
2. Enter an ORDER BY clause that contains the condition ports in the same order they appear in the lookup condition.
3. Place two dashes ‘--’ as a comment notation after the ORDER BY clause to suppress the ORDER BY clause that the Integration Service generates.

If you override the lookup query with an ORDER BY clause without adding the comment notation, the lookup fails.

Note: Sybase has a 16-column ORDER BY limitation. If the Lookup transformation has more than 16 lookup/output ports (including the ports in the lookup condition), you might want to override the ORDER BY clause or use multiple Lookup transformations to query the lookup table.

Reserved Words

If any lookup name or column name contains a database reserved word, such as MONTH or YEAR, the session fails with database errors when the Integration Service executes SQL against the database. You can create and maintain a reserved words file, reswords.txt, in the Integration Service installation directory. When the Integration Service initializes a session, it searches for reswords.txt. If the file exists, the Integration Service places quotes around matching reserved words when it executes SQL against source, target, and lookup databases. For more information about reswords.txt, see “Working with Targets” in the Workflow Administration Guide.

You may need to configure some databases, such as Microsoft SQL Server and Sybase, to use SQL-92 standards regarding quoted identifiers. Use connection environment SQL to issue the command. For example, with Microsoft SQL Server, use the following command:

SET QUOTED_IDENTIFIER ON

Guidelines for Overriding the Lookup Query

Use the following guidelines when you override the lookup SQL query:
♦ You can only override the lookup SQL query for relational lookups.
♦ Configure the Lookup transformation for caching. If you do not enable caching, the Integration Service does not recognize the override.
♦ Generate the default query, and then configure the override. This helps ensure that all the lookup/output ports are included in the query. If you add or subtract ports from the SELECT statement, the session fails.
♦ Use a Filter transformation before a Lookup transformation that uses a dynamic cache when you add a WHERE clause to the lookup SQL override. This ensures the Integration Service only inserts rows in the dynamic cache and target table that match the WHERE clause. For more information, see “Using the WHERE Clause with a Dynamic Cache” on page 358.
♦ If you want to share the cache, use the same lookup SQL override for each Lookup transformation.
♦ If you override the ORDER BY clause, the session fails if the ORDER BY clause does not contain the condition ports in the same order they appear in the lookup condition or if you do not suppress the generated ORDER BY clause with the comment notation.
♦ If you use pushdown optimization, you cannot override the ORDER BY clause or suppress the generated ORDER BY clause with comment notation.
♦ If the table name or any column name in the lookup query contains a reserved word, you must enclose all reserved words in quotes.

Steps to Overriding the Lookup Query

Use the following steps to override the default lookup SQL query.

To override the default lookup query:
1. On the Properties tab, open the SQL Editor from within the Lookup SQL Override field.
2. Click Generate SQL to generate the default SELECT statement. Enter the lookup SQL override.
3. Connect to a database, and then click Validate to test the lookup SQL override.
4. Click OK to return to the Properties tab.
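Pulling these guidelines together, a minimal sketch of a complete override for the ITEMS_DIM example might add a WHERE clause to reduce the number of cached rows and suppress the generated ORDER BY clause. The DISCONTINUED_FLAG column is a hypothetical addition for illustration, not part of the earlier example:

SELECT ITEMS_DIM.ITEM_NAME, ITEMS_DIM.PRICE, ITEMS_DIM.ITEM_ID
FROM ITEMS_DIM
WHERE ITEMS_DIM.DISCONTINUED_FLAG = 'N'
ORDER BY ITEMS_DIM.ITEM_ID, ITEMS_DIM.PRICE --

The trailing two dashes comment out the ORDER BY clause the Integration Service appends, and the column list matches the generated SELECT statement so the session does not fail.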
Lookup Condition

The Integration Service uses the lookup condition to test incoming values. It is similar to the WHERE clause in an SQL query. When you configure a lookup condition for the transformation, you compare transformation input values with values in the lookup source or cache, represented by lookup ports. When you run a workflow, the Integration Service queries the lookup source or cache for all incoming values based on the condition. You must enter a lookup condition in all Lookup transformations.

Some guidelines for the lookup condition apply to all Lookup transformations, and some guidelines vary depending on how you configure the transformation. Use the following guidelines when you enter a condition for a Lookup transformation:
♦ The datatypes in a condition must match.
♦ Use one input port for each lookup port used in the condition. You can use the same input port in more than one condition in a transformation.
♦ When you enter multiple conditions, the Integration Service evaluates each condition as an AND, not an OR. The Integration Service returns only rows that match all the conditions you specify.
♦ The Integration Service matches null values. For example, if an input lookup condition column is NULL, the Integration Service evaluates the NULL as equal to a NULL in the lookup.
♦ If you configure a flat file lookup for sorted input, the Integration Service fails the session if the condition columns are not grouped. If the columns are grouped, but not sorted, the Integration Service processes the lookup as if you did not configure sorted input. For more information about sorted input, see “Flat File Lookups” on page 311.

The lookup condition guidelines and the way the Integration Service processes matches can vary, depending on whether you configure the transformation for a dynamic cache or an uncached or static cache. For more information about lookup caches, see “Lookup Caches” on page 337.

Uncached or Static Cache

Use the following guidelines when you configure a Lookup transformation without a cache or with a static cache:
♦ Use the following operators when you create the lookup condition: =, >, <, >=, <=, !=
Tip: If you include more than one lookup condition, place the conditions with an equal sign first to optimize lookup performance. For example, create the following lookup condition:
ITEM_ID = IN_ITEM_ID
PRICE <= IN_PRICE
♦ The input value must meet all conditions for the lookup to return a value.
The condition can match equivalent values or supply a threshold condition. For example, you might look for customers who do not live in California, or employees whose salary is greater than $30,000. Depending on the nature of the source and condition, the lookup might return multiple values.

Dynamic Cache

If you configure a Lookup transformation to use a dynamic cache, you can only use the equality operator (=) in the lookup condition.

Handling Multiple Matches

Lookups find a value based on the conditions you set in the Lookup transformation. If the lookup condition is not based on a unique key, or if the lookup source is denormalized, the Integration Service might find multiple matches in the lookup source or cache. You can configure a Lookup transformation to handle multiple matches in the following ways:
♦ Return the first matching value, or return the last matching value. The first and last values are the first value and last value found in the lookup cache that match the lookup condition. When you cache the lookup source, the Integration Service generates an ORDER BY clause for each column in the lookup cache to determine the first and last row in the cache. The Integration Service sorts each lookup source column in ascending order: numeric columns in ascending numeric order (such as 0 to 10), date/time columns from January to December and from the first of the month to the end of the month, and string columns based on the sort order configured for the session.
♦ Return any matching value. You can configure the Lookup transformation to return any value that matches the lookup condition. When you configure the Lookup transformation to return any matching value, the transformation returns the first value that matches the lookup condition. It creates an index based on the key ports rather than all Lookup transformation ports. When you use any matching value, performance can improve because the process of indexing rows is simplified.
♦ Return an error. When the Lookup transformation uses a static cache or no cache, the Integration Service marks the row as an error, writes the row to the session log by default, and increases the error count by one. When the Lookup transformation uses a dynamic cache, the Integration Service fails the session when it encounters multiple matches either while caching the lookup table or looking up values in the cache that contain duplicate keys. Also, if you configure the Lookup transformation to output old values on updates, the Lookup transformation returns an error when it encounters multiple matches.
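For the first-match and last-match policies, the generated ORDER BY covers every cached lookup port in the order the ports appear in the transformation. Assuming the ITEMS_DIM example from the previous section, the cache-build query would resemble the following sketch; the generated ORDER BY clause itself appears in the session log:

SELECT ITEMS_DIM.ITEM_ID, ITEMS_DIM.ITEM_NAME, ITEMS_DIM.PRICE
FROM ITEMS_DIM
ORDER BY ITEMS_DIM.ITEM_ID, ITEMS_DIM.ITEM_NAME, ITEMS_DIM.PRICE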
Lookup Caches

You can configure a Lookup transformation to cache the lookup file or table. The Integration Service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Integration Service stores condition values in the index cache and output values in the data cache. The Integration Service queries the cache for each row that enters the transformation.

The Integration Service also creates cache files by default in the $PMCacheDir directory. If the data does not fit in the memory cache, the Integration Service stores the overflow values in the cache files. When the session completes, the Integration Service releases cache memory and deletes the cache files unless you configure the Lookup transformation to use a persistent cache.

When configuring a lookup cache, you can specify any of the following options:
♦ Persistent cache
♦ Recache from lookup source
♦ Static cache
♦ Dynamic cache
♦ Shared cache

Note: You can use a dynamic cache for relational or flat file lookups. For more information about working with lookup caches, see “Lookup Caches” on page 337.
Configuring Unconnected Lookup Transformations

An unconnected Lookup transformation is separate from the pipeline in the mapping. You write an expression using the :LKP reference qualifier to call the lookup within another transformation. Some common uses for unconnected lookups include:
♦ Testing the results of a lookup in an expression
♦ Filtering rows based on the lookup results
♦ Marking rows for update based on the result of a lookup, such as updating slowly changing dimension tables
♦ Calling the same lookup multiple times in one mapping

Complete the following steps when you configure an unconnected Lookup transformation:
1. Add input ports.
2. Add the lookup condition.
3. Designate a return value.
4. Call the lookup from another transformation.

Step 1. Add Input Ports

Create an input port for each argument in the :LKP expression. For each lookup condition you plan to create, you need to add an input port to the Lookup transformation. You can create a different port for each condition, or use the same input port in more than one condition.

For example, a retail store increased prices across all departments during the last month. The accounting department only wants to load rows into the target for items with increased prices. To accomplish this, complete the following tasks:
♦ Create a lookup condition that compares the ITEM_ID in the source with the ITEM_ID in the target.
♦ Compare the PRICE for each item in the source with the price in the target table.
− If the item exists in the target table and the item price in the source is less than or equal to the price in the target table, you want to delete the row.
− If the price in the source is greater than the item price in the target table, you want to update the row.
Create an input port (IN_ITEM_ID) with datatype Decimal (37,0) to match the ITEM_ID lookup port and an input port (IN_PRICE) with datatype Decimal (10,2) to match the PRICE lookup port.

Step 2. Add the Lookup Condition

After you configure the ports, define a lookup condition to compare transformation input values with values in the lookup source or cache. To increase performance, add conditions with an equal sign first. In this case, add the following lookup condition:

ITEM_ID = IN_ITEM_ID
PRICE <= IN_PRICE

If the item exists in the mapping source and lookup source and the mapping source price is less than or equal to the lookup price, the condition is true and the lookup returns the values designated by the return port. If the lookup condition is false, the lookup returns NULL. Therefore, when you write the update strategy expression, use ISNULL nested in an IIF function to test for null values.

Step 3. Designate a Return Value

With unconnected lookups, you can pass multiple input values into the transformation, but only one column of data out of the transformation. Designate one lookup/output port as a return port. The Integration Service can return one value from the lookup query. Use the return port to specify the return value.

If you call the unconnected lookup from an update strategy or filter expression, you are generally checking for null values. In this case, the return port can be anything. If, however, you call the lookup from an expression performing a calculation, the return value needs to be the value you want to include in the calculation.
To continue the update strategy example, you can define the ITEM_ID port as the return port. The update strategy expression checks for null values returned. If the lookup condition is true, the Integration Service returns the ITEM_ID. If the condition is false, the Integration Service returns NULL.

Figure 14-2 shows a return port in a Lookup transformation:

Figure 14-2. Return Port in a Lookup Transformation

Step 4. Call the Lookup Through an Expression

You supply input values for an unconnected Lookup transformation from a :LKP expression in another transformation. The arguments are local input ports that match the Lookup transformation input ports used in the lookup condition. Use the following syntax for a :LKP expression:

:LKP.lookup_transformation_name(argument, argument, ...)

To continue the example about the retail store, when you write the update strategy expression, the order of ports in the expression must match the order in the lookup condition. In this case, the ITEM_ID condition is the first lookup condition and, therefore, the first argument in the update strategy expression:

IIF(ISNULL(:LKP.lkpITEMS_DIM(ITEM_ID, PRICE)), DD_UPDATE, DD_REJECT)

Use the following guidelines to write an expression that calls an unconnected Lookup transformation:
♦ The order in which you list each argument must match the order of the lookup conditions in the Lookup transformation.
♦ The datatypes for the ports in the expression must match the datatypes for the input ports in the Lookup transformation. The Designer does not validate the expression if the datatypes do not match.
♦ If one port in the lookup condition is not a lookup/output port, the Designer does not validate the expression.
♦ The arguments (ports) in the expression must be in the same order as the input ports in the lookup condition.
♦ If you use incorrect :LKP syntax, the Designer marks the mapping invalid.
♦ If you call a connected Lookup transformation in a :LKP expression, the Designer marks the mapping invalid.

Tip: Avoid syntax errors when you enter expressions by using the point-and-click method to select functions and ports.
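To illustrate another common use, filtering rows based on the lookup result, here is a hedged sketch that reuses the lkpITEMS_DIM lookup from the example above: a Filter transformation could pass only the rows for which the lookup finds a match by testing the return value for null:

NOT ISNULL(:LKP.lkpITEMS_DIM(ITEM_ID, PRICE))

Which rows this keeps depends on how you define the lookup condition; the expression simply drops rows for which the condition is false and the lookup therefore returns NULL.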
Creating a Lookup Transformation

The following steps summarize the process of creating a Lookup transformation.

To create a Lookup transformation:
1. In the Mapping Designer, click Transformation > Create. Select the Lookup transformation. Enter a name for the transformation. The naming convention for Lookup transformations is LKP_TransformationName. Click OK.
2. In the Select Lookup Table dialog box, you can choose the following options:
♦ Choose an existing table or file definition.
♦ Choose to import a definition from a relational table or file.
♦ Skip to create a manual definition.
3. Define input ports for each lookup condition you want to define.
4. For an unconnected Lookup transformation, create a return port for the value you want to return from the lookup.
5. Define output ports for the values you want to pass to another transformation.
6. For Lookup transformations that use a dynamic lookup cache, associate an input port or sequence ID with each lookup port.
7. Add the lookup conditions. If you include more than one condition, place the conditions that use equal signs first to optimize lookup performance. For information about lookup conditions, see “Lookup Condition” on page 328.
8. On the Properties tab, set the properties for the Lookup transformation, and click OK. For a list of properties, see “Lookup Properties” on page 316.
9. For unconnected Lookup transformations, write an expression in another transformation using :LKP to call the unconnected Lookup transformation.
Tips

Use the following tips when you configure the Lookup transformation:

Add an index to the columns used in a lookup condition.
If you have privileges to modify the database containing a lookup table, you can improve performance for both cached and uncached lookups. This is important for very large lookup tables. Since the Integration Service needs to query, sort, and compare values in these columns, the index needs to include every column used in a lookup condition. (See the example sketch after these tips.)

Place conditions with an equality operator (=) first.
If a Lookup transformation specifies several conditions, you can improve lookup performance by placing all the conditions that use the equality operator first in the list of conditions that appear under the Condition tab.

Cache small lookup tables.
Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.

Join tables in the database.
If the lookup table is on the same database as the source table in the mapping and caching is not feasible, join the tables in the source database rather than using a Lookup transformation.

Use a persistent lookup cache for static lookups.
If the lookup source does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Integration Service then saves and reuses cache files from session to session, eliminating the time required to read the lookup source.

Call unconnected Lookup transformations with the :LKP reference qualifier.
When you write an expression using the :LKP reference qualifier, you call unconnected Lookup transformations only. If you try to call a connected Lookup transformation, the Designer displays an error and marks the mapping invalid.
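A minimal sketch of such an index for the retail example in the previous section (the index name is hypothetical, and the exact syntax varies by database):

CREATE INDEX IDX_ITEMS_DIM_LKP
ON ITEMS_DIM (ITEM_ID, PRICE);

Because the index covers every column in the lookup condition, the database can use it to locate and order rows for the lookup query.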
Chapter 15: Lookup Caches

This chapter includes the following topics:
♦ Overview, 338
♦ Building Connected Lookup Caches, 340
♦ Using a Persistent Lookup Cache, 342
♦ Working with an Uncached Lookup or Static Cache, 344
♦ Working with a Dynamic Lookup Cache, 345
♦ Sharing the Lookup Cache, 363
♦ Lookup Cache Tips, 369
Overview

You can configure a Lookup transformation to cache the lookup table. The Integration Service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Integration Service stores condition values in the index cache and output values in the data cache. The Integration Service queries the cache for each row that enters the transformation.

The Integration Service also creates cache files by default in the $PMCacheDir. If the data does not fit in the memory cache, the Integration Service stores the overflow values in the cache files. When the session completes, the Integration Service releases cache memory and deletes the cache files unless you configure the Lookup transformation to use a persistent cache.

If you use a flat file lookup, the Integration Service always caches the lookup source. If you configure a flat file lookup for sorted input, the Integration Service cannot cache the lookup if the condition columns are not grouped. If the columns are grouped, but not sorted, the Integration Service processes the lookup as if you did not configure sorted input. For more information, see “Flat File Lookups” on page 311.

When you configure a lookup cache, you can configure the following cache settings:
♦ Building caches. You can configure the session to build caches sequentially or concurrently. When you build sequential caches, the Integration Service creates caches as the source rows enter the Lookup transformation. When you configure the session to build concurrent caches, the Integration Service does not wait for the first row to enter the Lookup transformation before it creates caches. Instead, it builds multiple caches concurrently. For more information, see “Building Connected Lookup Caches” on page 340.
♦ Persistent cache. You can save the lookup cache files and reuse them the next time the Integration Service processes a Lookup transformation configured to use the cache. For more information, see “Using a Persistent Lookup Cache” on page 342.
♦ Recache from source. If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache. For more information, see “Building Connected Lookup Caches” on page 340.
♦ Static cache. You can configure a static, or read-only, cache for any lookup source. By default, the Integration Service creates a static cache. It caches the lookup file or table and looks up values in the cache for each row that comes into the transformation. When the lookup condition is true, the Integration Service returns a value from the lookup cache. The Integration Service does not update the cache while it processes the Lookup transformation. For more information, see “Working with an Uncached Lookup or Static Cache” on page 344.
♦ Dynamic cache. To cache a target table or flat file source and insert new rows or update existing rows in the cache, use a Lookup transformation with a dynamic cache. The Integration Service dynamically inserts or updates data in the lookup cache and passes data to the target. For more information, see “Working with a Dynamic Lookup Cache” on page 345.
♦ Shared cache. You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping. You can share a named cache between transformations in the same or different mappings. For more information, see “Sharing the Lookup Cache” on page 363.

When you do not configure the Lookup transformation for caching, the Integration Service queries the lookup table for each input row. The result of the lookup query and processing is the same, whether or not you cache the lookup table. However, using a lookup cache can increase session performance. Optimize performance by caching the lookup table when the source table is large.

For more information about caching properties, see “Lookup Properties” on page 316. For information about configuring the cache size, see “Session Caches” in the Workflow Administration Guide.

Note: The Integration Service uses the same transformation logic to process a Lookup transformation whether you configure it to use a static cache or no cache. However, when you configure the transformation to use no cache, the Integration Service queries the lookup table instead of the lookup cache.

Cache Comparison

Table 15-1 compares an uncached lookup, a static cache, and a dynamic cache:

Table 15-1. Lookup Caching Comparison

Uncached:
♦ You cannot insert or update the cache.
♦ You cannot use a flat file lookup.
♦ When the condition is true, the Integration Service returns a value from the lookup table or cache. When the condition is not true, the Integration Service returns the default value for connected transformations and NULL for unconnected transformations. For more information, see “Working with an Uncached Lookup or Static Cache” on page 344.

Static Cache:
♦ You cannot insert or update the cache.
♦ Use a relational or a flat file lookup.
♦ When the condition is true, the Integration Service returns a value from the lookup table or cache. When the condition is not true, the Integration Service returns the default value for connected transformations and NULL for unconnected transformations. For more information, see “Working with an Uncached Lookup or Static Cache” on page 344.

Dynamic Cache:
♦ You can insert or update rows in the cache as you pass rows to the target.
♦ Use a relational or a flat file lookup.
♦ When the condition is true, the Integration Service either updates rows in the cache or leaves the cache unchanged, depending on the row type. This indicates that the row is in the cache and target table. You can pass updated rows to a target table. When the condition is not true, the Integration Service either inserts rows into the cache or leaves the cache unchanged, depending on the row type. This indicates that the row is not in the cache or target. You can pass inserted rows to a target table. For more information, see “Updating the Dynamic Lookup Cache” on page 356.
Building Connected Lookup Caches

The Integration Service can build lookup caches for connected Lookup transformations in the following ways:
♦ Sequential caches. The Integration Service builds lookup caches sequentially. The Integration Service builds the cache in memory when it processes the first row of the data in a cached Lookup transformation. For more information, see “Sequential Caches” on page 340.
♦ Concurrent caches. The Integration Service builds lookup caches concurrently. It does not need to wait for data to reach the Lookup transformation. For more information, see “Concurrent Caches” on page 341.

Note: The Integration Service builds caches for unconnected Lookup transformations sequentially regardless of how you configure cache building. If you configure the session to build concurrent caches for an unconnected Lookup transformation, the Integration Service ignores this setting and builds unconnected Lookup transformation caches sequentially.

Sequential Caches

By default, the Integration Service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. The Integration Service creates each lookup cache in the pipeline sequentially. The Integration Service waits for any upstream active transformation to complete processing before it starts processing the rows in the Lookup transformation. The Integration Service does not build caches for a downstream Lookup transformation until an upstream Lookup transformation completes building a cache.

For example, the following mapping contains an unsorted Aggregator transformation followed by two Lookup transformations.

Figure 15-1 shows a mapping that contains multiple Lookup transformations:
Figure 15-1. Building Lookup Caches Sequentially
(1) The Aggregator transformation processes rows. (2) The first Lookup transformation builds its cache after it reads the first input row. (3) The second Lookup transformation builds its cache after it reads the first input row.

The Integration Service processes all the rows for the unsorted Aggregator transformation and begins processing the first Lookup transformation after the unsorted Aggregator transformation completes.
When it processes the first input row, the Integration Service begins building the first lookup cache. After the Integration Service finishes building the first lookup cache, it can begin processing the lookup data. The Integration Service begins building the next lookup cache when the first row of data reaches the Lookup transformation.

You might want to process lookup caches sequentially if the Lookup transformation may not process row data. The Lookup transformation may not process row data if the transformation logic is configured to route data to different pipelines based on a condition. Configuring sequential caching may allow you to avoid building lookup caches unnecessarily. For example, a Router transformation might route data to one pipeline if a condition resolves to true, and it might route data to another pipeline if the condition resolves to false. In this case, a Lookup transformation might not receive data at all.

Concurrent Caches

You can configure the Integration Service to create lookup caches concurrently. You may be able to improve session performance using concurrent caches. Performance may especially improve when the pipeline contains an active transformation upstream of the Lookup transformation. You may want to configure the session to create concurrent caches if you are certain that you will need to build caches for each of the Lookup transformations in the session.

When you configure the Lookup transformation to create concurrent caches, it does not wait for upstream transformations to complete before it creates lookup caches, and it does not need to finish building a lookup cache before it can begin building other lookup caches.

For example, you configure the session shown in Figure 15-1 for concurrent cache creation.

Figure 15-2 shows Lookup transformation caches built concurrently:
Figure 15-2. Building Lookup Caches Concurrently
(All lookup caches are built concurrently.)

When you run the session, the Integration Service builds the lookup caches concurrently. It does not wait for upstream transformations to complete, and it does not wait for other Lookup transformations to complete cache building.

Note: You cannot process caches for unconnected Lookup transformations concurrently.

To configure the session to create concurrent caches, configure a value for the session configuration attribute, Additional Concurrent Pipelines for Lookup Cache Creation.
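For illustration (the value 2 is hypothetical; size it to the number of Lookup transformations and available memory, and the attribute location may vary by version), you might set the attribute among the session configuration settings as follows:

Additional Concurrent Pipelines for Lookup Cache Creation = 2

This lets the Integration Service start additional pipelines that build lookup caches before rows reach the Lookup transformations.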
Using a Persistent Lookup Cache

You can configure a Lookup transformation to use a non-persistent or persistent cache. The Integration Service saves or deletes lookup cache files after a successful session based on the Lookup Cache Persistent property.

If the lookup table does not change between sessions, you can configure the Lookup transformation to use a persistent lookup cache. The Integration Service saves and reuses cache files from session to session, eliminating the time required to read the lookup table.

Using a Non-Persistent Cache

By default, the Integration Service uses a non-persistent cache when you enable caching in a Lookup transformation. The Integration Service deletes the cache files at the end of a session. The next time you run the session, the Integration Service builds the memory cache from the database.

Using a Persistent Cache

If you want to save and reuse the cache files, you can configure the transformation to use a persistent cache. Use a persistent cache when you know the lookup table does not change between session runs.

The first time the Integration Service runs a session using a persistent lookup cache, it saves the cache files to disk instead of deleting them. The next time the Integration Service runs the session, it builds the memory cache from the cache files. If the lookup table changes occasionally, you can override session properties to recache the lookup from the database.

When you use a persistent lookup cache, you can specify a name for the cache files. When you specify a named cache, you can share the lookup cache across sessions. For more information about the Cache File Name Prefix property, see “Lookup Properties” on page 316. For more information about sharing lookup caches, see “Sharing the Lookup Cache” on page 363.

Rebuilding the Lookup Cache

You can instruct the Integration Service to rebuild the lookup cache if you think the lookup source changed since the last time the Integration Service built the persistent cache. When you rebuild a cache, the Integration Service creates new cache files, overwriting existing persistent cache files. The Integration Service writes a message to the session log when it rebuilds the cache.

You can rebuild the cache when the mapping contains one Lookup transformation or when the mapping contains Lookup transformations in multiple target load order groups that share a cache. You do not need to rebuild the cache when a dynamic lookup shares the cache with a static lookup in the same mapping.

If the Integration Service cannot reuse the cache, it either recaches the lookup from the database, or it fails the session, depending on the mapping and session properties.
Table 15-2 summarizes how the Integration Service handles persistent caching for named and unnamed caches:

Table 15-2. Integration Service Handling of Persistent Caches

For each mapping or session change between sessions, the table lists the result for a named cache and an unnamed cache:
♦ Integration Service cannot locate cache files. Named cache: rebuilds cache. Unnamed cache: rebuilds cache.
♦ Enable or disable the Enable High Precision option in session properties. Named cache: fails session. Unnamed cache: rebuilds cache.
♦ Edit the transformation in the Mapping Designer, Mapplet Designer, or Reusable Transformation Developer.* Named cache: fails session. Unnamed cache: rebuilds cache.
♦ Edit the mapping (excluding the Lookup transformation). Named cache: reuses cache. Unnamed cache: rebuilds cache.
♦ Change the database connection or the file location used to access the lookup table. Named cache: fails session. Unnamed cache: rebuilds cache.
♦ Change the Integration Service data movement mode. Named cache: fails session. Unnamed cache: rebuilds cache.
♦ Change the sort order in Unicode mode. Named cache: fails session. Unnamed cache: rebuilds cache.
♦ Change the Integration Service code page to a compatible code page. Named cache: reuses cache. Unnamed cache: reuses cache.
♦ Change the Integration Service code page to an incompatible code page. Named cache: fails session. Unnamed cache: rebuilds cache.

*Editing properties such as the transformation description or port description does not affect persistent cache handling.
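As an illustrative configuration sketch for a Lookup transformation that reuses a named persistent cache across sessions (the prefix CUST_LKP is hypothetical; the property names are the ones described above and in “Lookup Properties” on page 316):

Lookup Caching Enabled     = enabled
Lookup Cache Persistent    = enabled
Cache File Name Prefix     = CUST_LKP
Recache From Lookup Source = enabled only for runs after the lookup source changes

With these settings, the first run saves the cache files under the prefix, and later runs rebuild the memory cache from those files instead of reading the lookup table.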
Working with an Uncached Lookup or Static Cache

By default, the Integration Service creates a static lookup cache when you configure a Lookup transformation for caching. The Integration Service builds the cache when it processes the first lookup request. It queries the cache based on the lookup condition for each row that passes into the transformation. The Integration Service does not update the cache while it processes the transformation.

The Integration Service processes an uncached lookup the same way it processes a cached lookup except that it queries the lookup source instead of building and querying the cache.

When the lookup condition is true, the Integration Service returns the values from the lookup source or cache. For connected Lookup transformations, the Integration Service returns the values represented by the lookup/output ports. For unconnected Lookup transformations, the Integration Service returns the value represented by the return port.

When the condition is not true, the Integration Service returns either NULL or default values. For connected Lookup transformations, the Integration Service returns the default value of the output port when the condition is not met. For unconnected Lookup transformations, the Integration Service returns NULL when the condition is not met.

When you create multiple partitions in a pipeline that use a static cache, the Integration Service creates one memory cache for each partition and one disk cache for each transformation. For more information, see “Session Caches” in the Workflow Administration Guide.
Working with a Dynamic Lookup Cache

You can use a dynamic cache with a relational lookup or a flat file lookup. For relational lookups, you might configure the transformation to use a dynamic cache when the target table is also the lookup table. For flat file lookups, the dynamic cache represents the data to update in the target table.

The Integration Service builds the cache when it processes the first lookup request. It queries the cache based on the lookup condition for each row that passes into the transformation. When you use a dynamic cache, the Integration Service updates the lookup cache as it passes rows to the target.

When the Integration Service reads a row from the source, it updates the lookup cache by performing one of the following actions:
♦ Inserts the row into the cache. The row is not in the cache and you specified to insert rows into the cache. You can configure the transformation to insert rows into the cache based on input ports or generated sequence IDs. The Integration Service flags the row as insert.
♦ Updates the row in the cache. The row exists in the cache and you specified to update rows in the cache. The Integration Service flags the row as update. The Integration Service updates the row in the cache based on the input ports.
♦ Makes no change to the cache. The row exists in the cache and you specified to insert new rows only. Or, the row is not in the cache and you specified to update existing rows only. Or, the row is in the cache, but based on the lookup condition, nothing changes. The Integration Service flags the row as unchanged.

The Integration Service either inserts or updates the cache or makes no change to the cache, based on the results of the lookup query, the row type, and the Lookup transformation properties you define. For more information, see “Updating the Dynamic Lookup Cache” on page 356.

The following list describes some situations when you use a dynamic lookup cache:
♦ Updating a master customer table with new and updated customer information. You want to load new and updated customer information into a master customer table. Use a Lookup transformation that performs a lookup on the target table to determine whether a customer exists. Use a dynamic lookup cache that inserts and updates rows in the cache as it passes rows to the target.
♦ Loading data into a slowly changing dimension table and a fact table. Create two pipelines and use a Lookup transformation that performs a lookup on the dimension table. Use a dynamic lookup cache to load data to the dimension table. Use a static lookup cache to load data to the fact table, making sure you specify the name of the dynamic cache from the first pipeline. For more information, see “Example Using a Dynamic Lookup Cache” on page 360.
♦ Reading a flat file that is an export from a relational table. You want to read data from a Teradata table, but the ODBC connection is slow. You can export the Teradata table contents to a flat file and use the file as a lookup source. You can pass the lookup cache changes back to the Teradata table if you configure the Teradata table as a relational target in the mapping.
Use a Router or Filter transformation with the dynamic Lookup transformation to route inserted or updated rows to the cached target table. You can route unchanged rows to another target table or flat file, or you can drop them.

When you create multiple partitions in a pipeline that use a dynamic lookup cache, the Integration Service creates one memory cache and one disk cache for each transformation. However, if you add a partition point at the Lookup transformation, the Integration Service creates one memory cache for each partition. For more information, see “Session Caches” in the Workflow Administration Guide.

Figure 15-3 shows a mapping with a Lookup transformation that uses a dynamic lookup cache:
Figure 15-3. Mapping with a Dynamic Lookup Cache

A Lookup transformation using a dynamic cache has the following properties:
♦ NewLookupRow. The Designer adds this port to a Lookup transformation configured to use a dynamic cache. The port indicates with a numeric value whether the Integration Service inserts or updates the row in the cache, or makes no change to the cache. To keep the lookup cache and the target table synchronized, you pass rows to the target when the NewLookupRow value is equal to 1 or 2. For more information, see “Using the NewLookupRow Port” on page 347.
♦ Associated Port. Associate lookup ports with either an input/output port or a sequence ID. The Integration Service uses the data in the associated ports to insert or update rows in the lookup cache. If you associate a sequence ID, the Integration Service generates a primary key for inserted rows in the lookup cache. For more information, see “Using the Associated Input Port” on page 348.
♦ Ignore Null Inputs for Updates. The Designer activates this port property for lookup/output ports when you configure the Lookup transformation to use a dynamic cache. Select this property when you do not want the Integration Service to update the column in the cache when the data in this column contains a null value. For more information, see “Using the Ignore Null Property” on page 353.
♦ Ignore in Comparison. The Designer activates this port property for lookup/output ports not used in the lookup condition when you configure the Lookup transformation to use a dynamic cache. The Integration Service compares the values in all lookup ports with the values in their associated input ports by default. Select this property if you want the Integration Service to ignore the port when it compares values before updating a row. For more information, see “Using the Ignore in Comparison Property” on page 354.

Figure 15-4 shows the output port properties unique to a dynamic Lookup transformation:
Figure 15-4. Dynamic Lookup Transformation Ports Tab
(The figure highlights the NewLookupRow port, the Associated Port column with an associated Sequence-ID, and the Ignore Null and Ignore in Comparison properties.)

Using the NewLookupRow Port

When you define a Lookup transformation to use a dynamic cache, the Designer adds the NewLookupRow port to the transformation. The Integration Service assigns a value to the port, depending on the action it performs to the lookup cache.
Table 15-3 lists the possible NewLookupRow values:

Table 15-3. NewLookupRow Values

NewLookupRow Value   Description
0                    Integration Service does not update or insert the row in the cache.
1                    Integration Service inserts the row into the cache.
2                    Integration Service updates the row in the cache.

When the Integration Service reads a row, it changes the lookup cache depending on the results of the lookup query and the Lookup transformation properties you define. It assigns the value 0, 1, or 2 to the NewLookupRow port to indicate whether it inserts or updates the row in the cache, or makes no change. For information about how the Integration Service determines whether to update the cache, see “Updating the Dynamic Lookup Cache” on page 356.

The NewLookupRow value indicates how the Integration Service changes the lookup cache. It does not change the row type. Therefore, use a Filter or Router transformation and an Update Strategy transformation to help keep the target table and lookup cache synchronized.

Configure the Filter transformation to pass new and updated rows to the Update Strategy transformation before passing them to the cached target. Use the Update Strategy transformation to change the row type of each row to insert or update, depending on the NewLookupRow value. You can drop the rows that do not change the cache, or you can pass them to another target. For more information, see “Using Update Strategy Transformations with a Dynamic Cache” on page 354.

Define the filter condition in the Filter transformation based on the value of NewLookupRow. For example, use the following condition to pass both inserted and updated rows to the cached target:
NewLookupRow != 0

For more information about the Filter transformation, see “Filter Transformation” on page 189.

Using the Associated Input Port

When you use a dynamic lookup cache, you must associate each lookup/output port with an input/output port or a sequence ID. The Integration Service uses the data in the associated port to insert or update rows in the lookup cache. The Designer associates the input/output ports with the lookup/output ports used in the lookup condition.

For more information about the values of a Lookup transformation when you use a dynamic lookup cache, see “Working with Lookup Transformation Values” on page 349.
Sometimes you need to create a generated key for a column in a target table. For lookup ports with an Integer or Small Integer datatype, you can associate a generated key instead of an input port. To do this, select Sequence-ID in the Associated Port column.

When you select Sequence-ID in the Associated Port column, the Integration Service generates a key when it inserts a row into the lookup cache. The Integration Service uses the following process to generate sequence IDs:
1. When the Integration Service creates the dynamic lookup cache, it tracks the range of values in the cache associated with any port using a sequence ID.
2. When the Integration Service inserts a new row of data into the cache, it generates a key for a port by incrementing the greatest existing sequence ID value by one.
3. When the Integration Service reaches the maximum number for a generated sequence ID, it starts over at one. It then increments each sequence ID by one until it reaches the smallest existing value minus one. If the Integration Service runs out of unique sequence ID numbers, the session fails.

Note: The maximum value for a sequence ID is 2147483647.

The Integration Service only generates a sequence ID for rows it inserts into the cache.

Working with Lookup Transformation Values

When you associate an input/output port or a sequence ID with a lookup/output port, the following values match by default:
♦ Input value. Value the Integration Service passes into the transformation.
♦ Lookup value. Value that the Integration Service inserts into the cache.
♦ Input/output port output value. Value that the Integration Service passes out of the input/output port.

The lookup/output port output value depends on whether you choose to output old or new values when the Integration Service updates a row:
♦ Output old values on update. The Integration Service outputs the value that existed in the cache before it updated the row.
♦ Output new values on update. The Integration Service outputs the updated value that it writes in the cache. The lookup/output port value matches the input/output port value.

Note: You configure whether to output old or new values using the Output Old Value On Update transformation property. For more information about this property, see “Lookup Properties” on page 316.
For example, you have a Lookup transformation that uses a dynamic lookup cache. You define the following lookup condition:
IN_CUST_ID = CUST_ID

By default, the row type of all rows entering the Lookup transformation is insert. To perform both inserts and updates in the cache and target table, you select the Insert Else Update property in the Lookup transformation.

The following sections describe the values of the rows in the cache, the input rows, lookup rows, and output rows as you run the session.

Initial Cache Values

When you run the session, the Integration Service builds the lookup cache from the target table with the following data:

PK_PRIMARYKEY  CUST_ID  CUST_NAME     ADDRESS
100001         80001    Marion James  100 Main St.
100002         80002    Laura Jones   510 Broadway Ave.
100003         80003    Shelley Lau   220 Burnside Ave.

Input Values

The source contains rows that exist and rows that do not exist in the target table. The following rows pass into the Lookup transformation from the Source Qualifier transformation:

SQ_CUST_ID  SQ_CUST_NAME   SQ_ADDRESS
80001       Marion Atkins  100 Main St.
80002       Laura Gomez    510 Broadway Ave.
99001       Jon Freeman    555 6th Ave.

Note: The input values always match the values the Integration Service outputs from the input/output ports.
Lookup Values

The Integration Service looks up values in the cache based on the lookup condition. It updates rows in the cache for existing customer IDs 80001 and 80002. It inserts a row into the cache for customer ID 99001. The Integration Service generates a new key (PK_PRIMARYKEY) for the new row.

PK_PRIMARYKEY  CUST_ID  CUST_NAME      ADDRESS
100001         80001    Marion Atkins  100 Main St.
100002         80002    Laura Gomez    510 Broadway Ave.
100004         99001    Jon Freeman    555 6th Ave.

Output Values

The Integration Service flags the rows in the Lookup transformation based on the inserts and updates it performs on the dynamic cache. These rows pass through an Expression transformation to a Router transformation that filters and passes on the inserted and updated rows to an Update Strategy transformation. The Update Strategy transformation flags the rows based on the value of the NewLookupRow port.

The output values of the lookup/output and input/output ports depend on whether you choose to output old or new values when the Integration Service updates a row. However, the output values of the NewLookupRow port and any lookup/output port that uses the Sequence-ID are the same for new and updated rows.

When you choose to output new values, the lookup/output ports output the following values:

NewLookupRow  PK_PRIMARYKEY  CUST_ID  CUST_NAME      ADDRESS
2             100001         80001    Marion Atkins  100 Main St.
2             100002         80002    Laura Gomez    510 Broadway Ave.
1             100004         99001    Jon Freeman    555 6th Ave.

When you choose to output old values, the lookup/output ports output the following values:

NewLookupRow  PK_PRIMARYKEY  CUST_ID  CUST_NAME     ADDRESS
2             100001         80001    Marion James  100 Main St.
2             100002         80002    Laura Jones   510 Broadway Ave.
1             100004         99001    Jon Freeman   555 6th Ave.

Note that when the Integration Service updates existing rows in the lookup cache and when it passes rows to the lookup/output ports, it always uses the existing primary key (PK_PRIMARYKEY) values for rows that exist in the cache and target table. The Integration Service uses the sequence ID to generate a new primary key for the customer that it does not find in the cache. The Integration Service inserts the new primary key value into the lookup cache and outputs it to the lookup/output port.
The Integration Service outputs values from the input/output ports that match the input values. For those values, see “Input Values” on page 350.

Note: If the input value is NULL and you select the Ignore Null property for the associated input port, the input value does not equal the lookup value or the value out of the input/output port. When you select the Ignore Null property, the lookup cache and the target table might become unsynchronized if you pass null values to the target. You must verify that you do not pass null values to the target. For more information, see “Using the Ignore Null Property” on page 353.
Using the Ignore Null Property

When you update a dynamic lookup cache and target table, the source data might contain some null values. The Integration Service can handle the null values in the following ways:
♦ Insert null values. The Integration Service uses null values from the source and updates the lookup cache and target table using all values from the source.
♦ Ignore null values. The Integration Service ignores the null values in the source and updates the lookup cache and target table using only the not null values from the source.

If you know the source data contains null values, and you do not want the Integration Service to update the lookup cache or target with null values, select the Ignore Null property for the corresponding lookup/output port.

For example, you want to update the master customer table. The source contains new customers and current customers whose last names have changed. The source contains the customer IDs and names of customers whose names have changed, but it contains null values for the address columns. You want to insert new customers and update the current customer names while retaining the current address information in a master customer table.

For example, the master customer table contains the following data:

PRIMARYKEY  CUST_ID  CUST_NAME     ADDRESS            CITY      STATE  ZIP
100001      80001    Marion James  100 Main St.       Mt. View  CA     94040
100002      80002    Laura Jones   510 Broadway Ave.  Raleigh   NC     27601
100003      80003    Shelley Lau   220 Burnside Ave.  Portland  OR     97210

The source contains the following data:

CUST_ID  CUST_NAME      ADDRESS       CITY      STATE  ZIP
80001    Marion Atkins  NULL          NULL      NULL   NULL
80002    Laura Gomez    NULL          NULL      NULL   NULL
99001    Jon Freeman    555 6th Ave.  San Jose  CA     95051

Select Insert Else Update in the Lookup transformation in the mapping. Select the Ignore Null option for all lookup/output ports in the Lookup transformation. When you run a session, the Integration Service ignores null values in the source data and updates the lookup cache and the target table with not null values:

PRIMARYKEY  CUST_ID  CUST_NAME      ADDRESS            CITY      STATE  ZIP
100001      80001    Marion Atkins  100 Main St.       Mt. View  CA     94040
100002      80002    Laura Gomez    510 Broadway Ave.  Raleigh   NC     27601
100003      80003    Shelley Lau    220 Burnside Ave.  Portland  OR     97210
100004      99001    Jon Freeman    555 6th Ave.       San Jose  CA     95051

Note: When you choose to ignore NULLs, you must verify that you output the same values to the target that the Integration Service writes to the lookup cache. When you choose to ignore NULLs, the lookup cache and the target table might become unsynchronized if you pass null input values to the target.
Configure the mapping based on the value you want the Integration Service to output from the lookup/output ports when it updates a row in the cache:
♦ New values. Connect only lookup/output ports from the Lookup transformation to the target.
♦ Old values. Add an Expression transformation after the Lookup transformation and before the Filter or Router transformation. Add output ports in the Expression transformation for each port in the target table and create expressions to ensure you do not output null input values to the target.

Using the Ignore in Comparison Property

When you run a session that uses a dynamic lookup cache, the Integration Service compares the values in all lookup ports with the values in their associated input ports by default. It compares the values to determine whether or not to update the row in the lookup cache. When a value in an input port differs from the value in the lookup port, the Integration Service updates the row in the cache.

If you do not want to compare all ports, you can choose the ports you want the Integration Service to ignore when it compares ports. The Designer only enables this property for lookup/output ports when the port is not used in the lookup condition. You can improve performance by ignoring some ports during comparison.

You might want to do this when the source data includes a column that indicates whether or not the row contains data you need to update. Select the Ignore in Comparison property for all lookup ports except the port that indicates whether or not to update the row in the cache and target table.

Note: You must configure the Lookup transformation to compare at least one port. The Integration Service fails the session when you ignore all ports.

Using Update Strategy Transformations with a Dynamic Cache

When you use a dynamic lookup cache, use Update Strategy transformations to define the row type for the following rows:
♦ Rows entering the Lookup transformation. By default, the row type of all rows entering a Lookup transformation is insert. However, use an Update Strategy transformation before a Lookup transformation to define all rows as update, or some as update and some as insert.
♦ Rows leaving the Lookup transformation. The NewLookupRow value indicates how the Integration Service changed the lookup cache, but it does not change the row type. Use a Filter or Router transformation after the Lookup transformation to direct rows leaving the Lookup transformation based on the NewLookupRow value. Use Update Strategy transformations after the Filter or Router transformation to flag rows for insert or update before the target definition in the mapping, as in the sketch after this list.
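A minimal sketch of this pattern, assuming a Router transformation with two hypothetical user-defined groups (InsertGroup and UpdateGroup) feeding two Update Strategy transformations:

Router group filter conditions:
InsertGroup:  NewLookupRow = 1
UpdateGroup:  NewLookupRow = 2

Update strategy expressions:
UPD_Insert_New:       DD_INSERT
UPD_Update_Existing:  DD_UPDATE

Rows that fall into the Router default group (NewLookupRow = 0) are dropped if you do not connect the group to a target. With a Filter transformation instead of a Router, you could pass rows with NewLookupRow != 0 to a single Update Strategy transformation that uses IIF(NewLookupRow = 1, DD_INSERT, DD_UPDATE).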
Note: If you want to drop the unchanged rows, do not connect rows from the Filter or Router transformation with the NewLookupRow equal to 0 to the target definition.

When you define the row type as insert for rows entering a Lookup transformation, use the Insert Else Update property in the Lookup transformation. When you define the row type as update for rows entering a Lookup transformation, use the Update Else Insert property in the Lookup transformation. If you define some rows entering a Lookup transformation as update and some as insert, use either the Update Else Insert or Insert Else Update property, or use both properties. For more information, see “Updating the Dynamic Lookup Cache” on page 356.

Figure 15-5 shows a mapping with multiple Update Strategy transformations and a Lookup transformation using a dynamic cache:
Figure 15-5. Using Update Strategy Transformations with a Lookup Transformation
(In the figure, an Update Strategy transformation marks rows as update before the Lookup transformation, one Update Strategy transformation inserts new rows into the target, another updates existing rows in the target, and output rows not connected to a target get dropped.)

In this case, the Update Strategy transformation before the Lookup transformation flags all rows as update. Select the Update Else Insert property in the Lookup transformation. The Router transformation sends the inserted rows to the Insert_New Update Strategy transformation and sends the updated rows to the Update_Existing Update Strategy transformation. The two Update Strategy transformations to the right of the Lookup transformation flag the rows for insert or update for the target.

Configuring Sessions with a Dynamic Lookup Cache

When you configure a session using Update Strategy transformations and a dynamic lookup cache, you must define certain session properties. On the General Options settings on the Properties tab in the session properties, define the Treat Source Rows As option as Data Driven.
You must also define the following update strategy target table options:
♦ Select Insert.
♦ Select Update as Update.
♦ Do not select Delete.

These update strategy target table options ensure that the Integration Service updates rows marked for update and inserts rows marked for insert.

If you do not choose Data Driven, the Integration Service flags all rows for the row type you specify in the Treat Source Rows As option and does not use the Update Strategy transformations in the mapping to flag the rows. The Integration Service does not insert and update the correct rows. If you do not choose Update as Update, the Integration Service does not correctly update the rows flagged for update in the target table. As a result, the lookup cache and target table might become unsynchronized.

For more information, see “Setting the Update Strategy for a Session” on page 580. For more information about configuring target session properties, see “Working with Targets” in the Workflow Administration Guide.

Updating the Dynamic Lookup Cache

When you use a dynamic lookup cache, define the row type of the rows entering the Lookup transformation as either insert or update. You can define some rows as insert and some as update, or all insert, or all update. By default, the row type of all rows entering a Lookup transformation is insert. You can add an Update Strategy transformation before the Lookup transformation to define the row type as update. For more information, see “Using Update Strategy Transformations with a Dynamic Cache” on page 354.

The Integration Service either inserts or updates rows in the cache, or does not change the cache. The row type of the rows entering the Lookup transformation and the lookup query result affect how the Integration Service updates the cache. However, you must also configure the following Lookup properties to determine how the Integration Service updates the lookup cache:
♦ Insert Else Update. Applies to rows entering the Lookup transformation with the row type of insert.
♦ Update Else Insert. Applies to rows entering the Lookup transformation with the row type of update.

Note: You can select either the Insert Else Update or Update Else Insert property, or you can select both properties or neither property. The Insert Else Update property only affects rows entering the Lookup transformation with the row type of insert. The Update Else Insert property only affects rows entering the Lookup transformation with the row type of update.

Insert Else Update

You can select the Insert Else Update property in the Lookup transformation. This property only applies to rows entering the Lookup transformation with the row type of insert. When a row of any other row type, such as update, enters the Lookup transformation, the Insert Else Update property has no effect on how the Integration Service handles the row.
When you select Insert Else Update and the row type entering the Lookup transformation is insert, the Integration Service inserts the row into the cache if it is new. If the row exists in the index cache but the data cache is different than the current row, the Integration Service updates the row in the data cache.

If you do not select Insert Else Update and the row type entering the Lookup transformation is insert, the Integration Service inserts the row into the cache if it is new, and makes no change to the cache if the row exists.

Table 15-4 describes how the Integration Service changes the lookup cache when the row type of the rows entering the Lookup transformation is insert:

Table 15-4. Dynamic Lookup Cache Behavior for Insert Row Type

Insert Else Update Option  Row Found in Cache  Data Cache is Different  Lookup Cache Result  NewLookupRow Value
Cleared (insert only)      Yes                 n/a                      No change            0
Cleared (insert only)      No                  n/a                      Insert               1
Selected                   Yes                 Yes                      Update               2*
Selected                   Yes                 No                       No change            0
Selected                   No                  n/a                      Insert               1

*If you select Ignore Null for all lookup ports not in the lookup condition and if all those ports contain null values, the Integration Service does not change the cache and the NewLookupRow value equals 0. For more information, see “Using the Ignore Null Property” on page 353.

Update Else Insert

You can select the Update Else Insert property in the Lookup transformation. This property only applies to rows entering the Lookup transformation with the row type of update. When a row of any other row type, such as insert, enters the Lookup transformation, this property has no effect on how the Integration Service handles the row.

When you select this property and the row type entering the Lookup transformation is update, the Integration Service updates the row in the cache if the row exists in the index cache and the cache data is different than the existing row. The Integration Service inserts the row in the cache if it is new.

If you do not select this property and the row type entering the Lookup transformation is update, the Integration Service updates the row in the cache if it exists, and makes no change to the cache if the row is new.
Table 15-5 describes how the Integration Service changes the lookup cache when the row type of the rows entering the Lookup transformation is update:

Table 15-5. Dynamic Lookup Cache Behavior for Update Row Type

Update Else Insert Option  Row Found in Cache  Data Cache is Different  Lookup Cache Result  NewLookupRow Value
Cleared (update only)      Yes                 Yes                      Update               2*
Cleared (update only)      Yes                 No                       No change            0
Cleared (update only)      No                  n/a                      No change            0
Selected                   Yes                 Yes                      Update               2*
Selected                   Yes                 No                       No change            0
Selected                   No                  n/a                      Insert               1

*If you select Ignore Null for all lookup ports not in the lookup condition and if all those ports contain null values, the Integration Service does not change the cache and the NewLookupRow value equals 0. For more information, see “Using the Ignore Null Property” on page 353.

Using the WHERE Clause with a Dynamic Cache

When you add a WHERE clause in a lookup SQL override, the Integration Service uses the WHERE clause to build the cache from the database and to perform a lookup on the database table for an uncached lookup. However, it does not use the WHERE clause to insert rows into a dynamic cache when it runs a session.

When you add a WHERE clause in a Lookup transformation using a dynamic cache, connect a Filter transformation before the Lookup transformation to filter rows you do not want to insert into the cache or target table. If you do not use a Filter transformation, you might get inconsistent data.

For example, you configure a Lookup transformation to perform a dynamic lookup on the employee table, EMP, matching rows by EMP_ID. You define the following lookup SQL override:
SELECT EMP_ID, EMP_STATUS FROM EMP WHERE EMP_STATUS = 4 ORDER BY EMP_ID, EMP_STATUS

When you first run the session, the Integration Service builds the lookup cache from the target table based on the lookup SQL override. Therefore, all rows in the cache match the condition in the WHERE clause, EMP_STATUS = 4.

Suppose the Integration Service reads a source row for an employee that exists in the EMP table, but the value of EMP_STATUS for that employee is 2. Because the row does not match the WHERE clause, it is not in the cache, so the Integration Service inserts the row into the cache and passes the row to the target table. When this happens, not all rows in the cache match the condition in the WHERE clause. When the Integration Service tries to insert this row in the target table, you might get inconsistent data if the row already exists there.
To verify that you only insert rows into the cache that match the WHERE clause, add a Filter transformation before the Lookup transformation and define the filter condition as the condition in the WHERE clause in the lookup SQL override. For the example above, enter the following filter condition:
EMP_STATUS = 4

For more information about the lookup SQL override, see “Overriding the Lookup Query” on page 324.

Synchronizing the Dynamic Lookup Cache

When you use a dynamic lookup cache, the Integration Service writes to the lookup cache before it writes to the target table. The lookup cache and target table can become unsynchronized if the Integration Service does not write the data to the target. For example, the target database or Informatica writer might reject the data.

Use the following guidelines to keep the lookup cache synchronized with the lookup table:
♦ Use a Router transformation to pass rows to the cached target when the NewLookupRow value equals one or two. Use the Router transformation to drop rows when the NewLookupRow value equals zero, or you can output those rows to a different target.
♦ Use Update Strategy transformations after the Lookup transformation to flag rows for insert or update into the target.
♦ Set the error threshold to one when you run a session. When you set the error threshold to one, the session fails when it encounters the first error. The Integration Service does not write the new cache files to disk. Instead, it restores the original cache files, if they exist. You must also restore the pre-session target table to the target database. For more information about setting the error threshold, see “Working with Sessions” in the Workflow Administration Guide.
♦ Verify that you output the same values to the target that the Integration Service writes to the lookup cache. When you choose to output new values on update, only connect lookup/output ports to the target table instead of input/output ports. When you choose to output old values on update, add an Expression transformation after the Lookup transformation and before the Router transformation. Add output ports in the Expression transformation for each port in the target table and create expressions to ensure you do not output null input values to the target, as in the sketch after this list.
♦ Set the Treat Source Rows As property to Data Driven in the session properties.
♦ Select Insert and Update as Update when you define the update strategy target table options in the session properties. This ensures that the Integration Service updates rows marked for update and inserts rows marked for insert. Select these options in the Transformations View on the Mapping tab in the session properties. For more information, see “Working with Targets” in the Workflow Administration Guide.
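A minimal sketch of such an expression, assuming the customer example earlier in this chapter with hypothetical port names (IN_ADDRESS is the input/output port carrying the source value and LKP_ADDRESS is the lookup/output port carrying the old cache value): an output port in the Expression transformation might read:

OUT_ADDRESS = IIF(ISNULL(IN_ADDRESS), LKP_ADDRESS, IN_ADDRESS)

This outputs the new source value when it is not null and falls back to the old cache value otherwise, which mirrors what the Ignore Null property writes to the dynamic cache.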
Null Values in Lookup Condition Columns

Sometimes when you run a session, the source data may contain null values in columns used in the lookup condition. The Integration Service handles rows with null values in lookup condition columns differently, depending on whether the row exists in the cache:
♦ If the row does not exist in the lookup cache, the Integration Service inserts the row in the cache and passes it to the target table.
♦ If the row does exist in the lookup cache, the Integration Service does not update the row in the cache or target table.

Note: If the source data contains null values in the lookup condition columns, set the error threshold to one. This ensures that the lookup cache and table remain synchronized if the Integration Service inserts a row in the cache, but the database rejects the row due to a Not Null constraint.

Example Using a Dynamic Lookup Cache

Use a dynamic lookup cache when you need to insert and update rows in the target. When you use a dynamic lookup cache, you can insert and update the cache with the same data you pass to the target to insert and update.

For example, use a dynamic lookup cache to update a table that contains customer data. The source data contains rows that you need to insert into the target and rows you need to update in the target.

Figure 15-6 shows a mapping that uses a dynamic cache:
Figure 15-6. Slowly Changing Dimension Mapping with Dynamic Lookup Cache

The Lookup transformation uses a dynamic lookup cache. When the session starts, the Integration Service builds the lookup cache from the target table. When the Integration Service reads a row that is not in the lookup cache, it inserts the row in the cache and then passes the row out of the Lookup transformation. The Router transformation directs the row to the UPD_Insert_New Update Strategy transformation. The Update Strategy transformation marks the row as insert before passing it to the target.

The target table changes as the session runs, and the Integration Service inserts new rows and updates existing rows in the lookup cache. The Integration Service keeps the lookup cache and target table synchronized.
To generate keys for the target, use Sequence-ID in the associated port. The sequence ID generates primary keys for new rows the Integration Service inserts into the target table.

Without the dynamic lookup cache, you need to use two Lookup transformations in the mapping: the first Lookup transformation to insert rows in the target, and the second Lookup transformation to recache the target table and update rows in the target table. You increase session performance when you use a dynamic lookup cache because you only need to build the cache from the database once. You can continue to use the lookup cache even though the data in the target table changes.

Rules and Guidelines for Dynamic Caches

Use the following guidelines when you use a dynamic lookup cache:
♦ You can create a dynamic lookup cache from a relational or flat file source.
♦ The Lookup transformation must be a connected transformation.
♦ Use a persistent or a non-persistent cache.
♦ If the dynamic cache is not persistent, the Integration Service always rebuilds the cache from the database, even if you do not enable Recache from Lookup Source.
♦ You cannot share the cache between a dynamic Lookup transformation and a static Lookup transformation in the same target load order group.
♦ You can only create an equality lookup condition. You cannot look up a range of data.
♦ Associate each lookup port (that is not in the lookup condition) with an input port or a sequence ID.
♦ Use a Router transformation to pass rows to the cached target when the NewLookupRow value equals one or two. Use the Router transformation to drop rows when the NewLookupRow value equals zero, or you can output those rows to a different target.
♦ Verify that you output the same values to the target that the Integration Service writes to the lookup cache. When you choose to output new values on update, only connect lookup/output ports to the target table instead of input/output ports. When you choose to output old values on update, add an Expression transformation after the Lookup transformation and before the Router transformation. Add output ports in the Expression transformation for each port in the target table and create expressions to ensure you do not output null input values to the target.
♦ When you use a lookup SQL override, make sure you map the correct columns to the appropriate targets for lookup.
♦ When you add a WHERE clause to the lookup SQL override, use a Filter transformation before the Lookup transformation. This ensures the Integration Service only inserts rows in the dynamic cache and target table that match the WHERE clause. For more information, see “Using the WHERE Clause with a Dynamic Cache” on page 358.
♦ When you configure a reusable Lookup transformation to use a dynamic cache, you cannot edit the condition or disable the Dynamic Lookup Cache property in a mapping.
  • 394. Use Update Strategy transformations after the Lookup transformation to flag the rows for insert or update for the target. ♦ Use an Update Strategy transformation before the Lookup transformation to define some or all rows as update if you want to use the Update Else Insert property in the Lookup transformation. ♦ Set the row type to Data Driven in the session properties. ♦ Select Insert and Update as Update for the target table options in the session properties. 362 Chapter 15: Lookup Caches
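To make the NewLookupRow behavior concrete, here is a small Python sketch. It is illustrative only, not PowerCenter code: it models the dynamic cache as a plain dictionary keyed on the lookup condition column, with the return value standing in for NewLookupRow (1 for insert, 2 for update, 0 for no change).

# Illustrative sketch, not PowerCenter code: a dynamic lookup cache as a dict.
# Returns the NewLookupRow value: 1 = insert, 2 = update, 0 = no change.
def new_lookup_row(cache, key, associated_values):
    if key not in cache:
        cache[key] = associated_values
        return 1
    if cache[key] != associated_values:
        cache[key] = associated_values
        return 2
    return 0

cache = {}  # in a session, this is built from the target table at startup
for cust_id, row in [("C001", ("Ann", "SF")), ("C002", ("Bob", "NY")),
                     ("C001", ("Ann", "LA")), ("C001", ("Ann", "LA"))]:
    flag = new_lookup_row(cache, cust_id, row)
    print(cust_id, flag)  # prints 1, 1, 2, 0

A Router transformation plays the role of the final test on the flag: rows flagged 1 or 2 continue on to the Update Strategy transformations and the target, and rows flagged 0 are dropped or sent to a different target.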
Sharing the Lookup Cache

You can configure multiple Lookup transformations in a mapping to share a single lookup cache. The Integration Service builds the cache when it processes the first Lookup transformation. It uses the same cache to perform lookups for subsequent Lookup transformations that share the cache. You can share unnamed and named caches:
♦ Unnamed cache. When Lookup transformations in a mapping have compatible caching structures, the Integration Service shares the cache by default. You can share only static unnamed caches.
♦ Named cache. Use a persistent named cache when you want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.
When the Integration Service shares a lookup cache, it writes a message in the session log.

Sharing an Unnamed Lookup Cache

By default, the Integration Service shares the cache for Lookup transformations in a mapping that have compatible caching structures. For example, if you have two instances of the same reusable Lookup transformation in one mapping and you use the same output ports for both instances, the Lookup transformations share the lookup cache by default.

When two Lookup transformations share an unnamed cache, the Integration Service saves the cache for a Lookup transformation and uses it for subsequent Lookup transformations that have the same lookup cache structure. If the transformation properties or the cache structure do not allow sharing, the Integration Service creates a new cache.

Guidelines for Sharing an Unnamed Lookup Cache

Use the following guidelines when you configure Lookup transformations to share an unnamed cache (a sketch of the structure-compatibility rule follows this list):
♦ You can share static unnamed caches.
♦ Shared transformations must use the same ports in the lookup condition. The conditions can use different operators, but the ports must be the same.
♦ You must configure some of the transformation properties to enable unnamed cache sharing. For more information, see Table 15-7 on page 364.
♦ The structure of the cache for the shared transformations must be compatible:
− If you use hash auto-keys partitioning, the lookup/output ports for each transformation must match.
− If you do not use hash auto-keys partitioning, the lookup/output ports for the first shared transformation must match or be a superset of the lookup/output ports for subsequent transformations.
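As a hedged illustration of the last guideline, the following Python fragment (not PowerCenter code; the port names are hypothetical) expresses the structure check described above: with hash auto-keys partitioning the port lists must match exactly, otherwise the first transformation's ports must be a superset of the later transformation's ports.

# Illustrative only: the unnamed-cache structure compatibility rule.
def can_share_unnamed(first_ports, later_ports, hash_auto_keys):
    if hash_auto_keys:
        return list(first_ports) == list(later_ports)
    return set(later_ports).issubset(first_ports)

# Hypothetical port lists:
print(can_share_unnamed(["ITEM_ID", "ITEM_NAME", "PRICE"],
                        ["ITEM_ID", "PRICE"], hash_auto_keys=False))  # True
print(can_share_unnamed(["ITEM_ID", "ITEM_NAME"],
                        ["ITEM_ID", "PRICE"], hash_auto_keys=True))   # False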
If Lookup transformations with hash auto-keys partitioning are in different target load order groups, you must configure the same number of partitions for each group. If you do not use hash auto-keys partitioning, you can configure a different number of partitions for each target load order group.

Table 15-6 shows when you can share an unnamed static and dynamic cache:

Table 15-6. Location for Sharing Unnamed Cache
Static with Static      Anywhere in the mapping.
Dynamic with Dynamic    Cannot share.
Dynamic with Static     Cannot share.

Table 15-7 describes the guidelines to follow when you configure Lookup transformations to share an unnamed cache:

Table 15-7. Properties for Sharing Unnamed Cache
Lookup SQL Override: If you use the Lookup SQL Override property, you must use the same override in all shared transformations.
Lookup Table Name: Must match.
Lookup Caching Enabled: Must be enabled.
Lookup Policy on Multiple Match: n/a
Lookup Condition: Shared transformations must use the same ports in the lookup condition. The conditions can use different operators, but the ports must be the same.
Connection Information: The connection must be the same. When you configure the sessions, the database connection must match.
Source Type: Must match.
Tracing Level: n/a
Lookup Cache Directory Name: Does not need to match.
Lookup Cache Persistent: Optional. You can share persistent and non-persistent caches.
Lookup Data Cache Size: The Integration Service allocates memory for the first shared transformation in each pipeline stage. It does not allocate additional memory for subsequent shared transformations in the same pipeline stage. For information about pipeline stages, see “Pipeline Partitioning” in the Workflow Administration Guide.
Lookup Index Cache Size: Same as Lookup Data Cache Size.
Dynamic Lookup Cache: You cannot share an unnamed dynamic cache.
Output Old Value On Update: Does not need to match.
Cache File Name Prefix: Do not use. You cannot share a named cache with an unnamed cache.
Recache From Lookup Source: If you configure a Lookup transformation to recache from source, subsequent Lookup transformations in the target load order group can share the existing cache whether or not you configure them to recache from source. If you configure subsequent Lookup transformations to recache from source, the Integration Service shares the cache instead of rebuilding it. If you do not configure the first Lookup transformation in a target load order group to recache from source, but you do configure a subsequent Lookup transformation to recache from source, the transformations cannot share the cache. The Integration Service builds the cache when it processes each Lookup transformation.
Lookup/Output Ports: The lookup/output ports for the second Lookup transformation must match or be a subset of the ports in the transformation that the Integration Service uses to build the cache. The order of the ports does not need to match.
Insert Else Update: n/a
Update Else Insert: n/a
Datetime Format: n/a
Thousand Separator: n/a
Decimal Separator: n/a
Case-Sensitive String Comparison: Must match.
Null Ordering: Must match.
Sorted Input: n/a

Sharing a Named Lookup Cache

You can also share the cache between multiple Lookup transformations by using a persistent lookup cache and naming the cache files. You can share one cache between Lookup transformations in the same mapping or across mappings.

The Integration Service uses the following process to share a named lookup cache:
1. When the Integration Service processes the first Lookup transformation, it searches the cache directory for cache files with the same file name prefix. For more information about the Cache File Name Prefix property, see “Lookup Properties” on page 316.
2. If the Integration Service finds the cache files and you do not specify to recache from source, the Integration Service uses the saved cache files.
3. If the Integration Service does not find the cache files or if you specify to recache from source, the Integration Service builds the lookup cache using the database table.
4. The Integration Service saves the cache files to disk after it processes each target load order group.
5. The Integration Service uses the following rules to process the second Lookup transformation with the same cache file name prefix:
♦ The Integration Service uses the memory cache if the transformations are in the same target load order group.
♦ The Integration Service rebuilds the memory cache from the persisted files if the transformations are in different target load order groups.
♦ The Integration Service rebuilds the cache from the database if you configure the transformation to recache from source and the first transformation is in a different target load order group.
♦ The Integration Service fails the session if you configure subsequent Lookup transformations to recache from source, but not the first one in the same target load order group.
♦ If the cache structures do not match, the Integration Service fails the session.

If you run two sessions simultaneously that share a lookup cache, the Integration Service uses the following rules to share the cache files:
♦ The Integration Service processes multiple sessions simultaneously when the Lookup transformations only need to read the cache files.
♦ The Integration Service fails the session if one session updates a cache file while another session attempts to read or update the cache file. For example, Lookup transformations update the cache file if they are configured to use a dynamic cache or to recache from source.

Guidelines for Sharing a Named Lookup Cache

Use the following guidelines when you configure Lookup transformations to share a named cache (a sketch of the cache-file reuse decision from steps 1 through 3 above follows this list):
♦ You can share any combination of dynamic and static caches, but you must follow the guidelines for location. For more information, see Table 15-8 on page 367.
♦ You must configure some of the transformation properties to enable named cache sharing. For more information, see Table 15-9 on page 367.
♦ A dynamic lookup cannot share the cache if the named cache has duplicate rows.
♦ A named cache created by a dynamic Lookup transformation with a lookup policy of error on multiple match can be shared by a static or dynamic Lookup transformation with any lookup policy.
♦ A named cache created by a dynamic Lookup transformation with a lookup policy of use first or use last can be shared by a Lookup transformation with the same lookup policy.
♦ Shared transformations must use the same output ports in the mapping. The criteria and result columns for the cache must match the cache files.
The Integration Service might use the memory cache, or it might build the memory cache from the file, depending on the type and location of the Lookup transformations.
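As an illustration of steps 1 through 3 above, the following Python sketch shows the decision between reusing saved cache files and rebuilding from the lookup source. It is not PowerCenter code; the directory and prefix are hypothetical, though the .idx and .dat extensions are the ones this chapter names for cache files.

import os

# Illustrative only: decide whether saved named-cache files can be reused.
def named_cache_action(cache_dir, prefix, recache_from_source):
    index_file = os.path.join(cache_dir, prefix + ".idx")
    data_file = os.path.join(cache_dir, prefix + ".dat")
    have_files = os.path.exists(index_file) and os.path.exists(data_file)
    if have_files and not recache_from_source:
        return "use saved cache files"            # step 2
    return "build cache from the lookup source"   # step 3

# Hypothetical directory and prefix:
print(named_cache_action("/tmp/cache", "CUST_LKP", recache_from_source=False))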
Table 15-8 shows when you can share a static and dynamic named cache:

Table 15-8. Location for Sharing Named Cache
Static with Static:
- Same target load order group: Integration Service uses memory cache.
- Separate target load order groups: Integration Service uses memory cache.
- Separate mappings: Integration Service builds memory cache from file.
Dynamic with Dynamic:
- Separate target load order groups: Integration Service uses memory cache.
- Separate mappings: Integration Service builds memory cache from file.
Dynamic with Static:
- Separate target load order groups: Integration Service builds memory cache from file.
- Separate mappings: Integration Service builds memory cache from file.

For more information about target load order groups, see “Mappings” in the Designer Guide.

Table 15-9 describes the guidelines to follow when you configure Lookup transformations to share a named cache:

Table 15-9. Properties for Sharing Named Cache
Lookup SQL Override: If you use the Lookup SQL Override property, you must use the same override in all shared transformations.
Lookup Table Name: Must match.
Lookup Caching Enabled: Must be enabled.
Lookup Policy on Multiple Match: A named cache created by a dynamic Lookup transformation with a lookup policy of error on multiple match can be shared by a static or dynamic Lookup transformation with any lookup policy. A named cache created by a dynamic Lookup transformation with a lookup policy of use first or use last can be shared by a Lookup transformation with the same lookup policy.
Lookup Condition: Shared transformations must use the same ports in the lookup condition. The conditions can use different operators, but the ports must be the same.
Connection Information: The connection must be the same. When you configure the sessions, the database connection must match.
Source Type: Must match.
Tracing Level: n/a
Lookup Cache Directory Name: Must match.
Lookup Cache Persistent: Must be enabled.
Lookup Data Cache Size: When transformations within the same mapping share a cache, the Integration Service allocates memory for the first shared transformation in each pipeline stage. It does not allocate additional memory for subsequent shared transformations in the same pipeline stage. For information about pipeline stages, see “Pipeline Partitioning” in the Workflow Administration Guide.
Lookup Index Cache Size: Same as Lookup Data Cache Size.
Dynamic Lookup Cache: For more information about sharing static and dynamic caches, see Table 15-8 on page 367.
Output Old Value on Update: Does not need to match.
Cache File Name Prefix: Must match. Enter the prefix only; do not enter .idx or .dat. You cannot share a named cache with an unnamed cache.
Recache from Source: If you configure a Lookup transformation to recache from source, subsequent Lookup transformations in the target load order group can share the existing cache whether or not you configure them to recache from source. If you configure subsequent Lookup transformations to recache from source, the Integration Service shares the cache instead of rebuilding it. If you do not configure the first Lookup transformation in a target load order group to recache from source, but you do configure a subsequent Lookup transformation to recache from source, the session fails.
Lookup/Output Ports: Lookup/output ports must be identical, but they do not need to be in the same order.
Insert Else Update: n/a
Update Else Insert: n/a
Thousand Separator: n/a
Decimal Separator: n/a
Case-Sensitive String Comparison: n/a
Null Ordering: n/a
Sorted Input: Must match.

Note: You cannot share a lookup cache created on a different operating system. For example, only an Integration Service on UNIX can read a lookup cache created on an Integration Service on UNIX, and only an Integration Service on Windows can read a lookup cache created on an Integration Service on Windows.
Lookup Cache Tips

Use the following tips when you configure the Lookup transformation to cache the lookup table:

Cache small lookup tables. Improve session performance by caching small lookup tables. The result of the lookup query and processing is the same, whether or not you cache the lookup table.

Use a persistent lookup cache for static lookup tables. If the lookup table does not change between sessions, configure the Lookup transformation to use a persistent lookup cache. The Integration Service then saves and reuses cache files from session to session, eliminating the time required to read the lookup table.
Chapter 16
Normalizer Transformation

This chapter includes the following topics:
♦ Overview, 372
♦ Normalizer Transformation Components, 374
♦ Normalizer Transformation Generated Keys, 379
♦ VSAM Normalizer Transformation, 380
♦ Pipeline Normalizer Transformation, 387
♦ Using a Normalizer Transformation in a Mapping, 394
♦ Troubleshooting, 399
Overview

Transformation type: Active, Connected

The Normalizer transformation receives a row that contains multiple-occurring columns and returns a row for each instance of the multiple-occurring data. The transformation processes multiple-occurring columns or multiple-occurring groups of columns in each source row.

The Normalizer transformation parses multiple-occurring columns from COBOL sources, relational tables, or other sources. It can process multiple record types from a COBOL source that contains a REDEFINES clause.

For example, a relational table contains quarterly sales totals by store. You need to create a row for each sales occurrence. You can configure a Normalizer transformation to return a separate row for each quarter. The following source rows contain four quarters of sales by store:

Store1 100 300 500 700
Store2 250 450 650 850

The Normalizer returns a row for each store and sales combination. It also returns an index that identifies the quarter number:

Store1 100 1
Store1 300 2
Store1 500 3
Store1 700 4
Store2 250 1
Store2 450 2
Store2 650 3
Store2 850 4

The Normalizer transformation generates a key for each source row. The Integration Service increments the generated key sequence number each time it processes a source row. When the source row contains a multiple-occurring column or a multiple-occurring group of columns, the Normalizer transformation returns a row for each occurrence. Each row contains the same generated key value. When the Normalizer returns multiple rows from a source row, it returns duplicate data for single-occurring source columns. For example, Store1 and Store2 repeat for each instance of sales.

You can create a VSAM Normalizer transformation or a pipeline Normalizer transformation (a sketch of the row expansion appears after this list):
♦ VSAM Normalizer transformation. A non-reusable transformation that is a Source Qualifier transformation for a COBOL source. The Mapping Designer creates VSAM Normalizer columns from a COBOL source in a mapping. The column attributes are read-only. The VSAM Normalizer receives a multiple-occurring source column through one input port. For more information, see “VSAM Normalizer Transformation” on page 380.
♦ Pipeline Normalizer transformation. A transformation that processes multiple-occurring data from relational tables or flat files. You create the columns manually and edit them in the Transformation Developer or Mapping Designer. The pipeline Normalizer transformation represents multiple-occurring columns with one input port for each source column occurrence. For more information, see “Pipeline Normalizer Transformation” on page 387.
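To make the row expansion concrete, here is a small Python sketch. It is illustrative only, not PowerCenter code; it reproduces the quarterly sales example above, emitting a generated key (GK) per source row and a generated column ID (GCID) per occurrence.

# Illustrative only: expand a multiple-occurring column into one row each,
# with a generated key (GK) per source row and a column index (GCID).
def normalize(source_rows):
    out = []
    for gk, (store, *sales_by_quarter) in enumerate(source_rows, start=1):
        for gcid, sales in enumerate(sales_by_quarter, start=1):
            out.append((store, sales, gk, gcid))
    return out

rows = [("Store1", 100, 300, 500, 700),
        ("Store2", 250, 450, 650, 850)]
for row in normalize(rows):
    print(*row)   # Store1 100 1 1 ... Store2 850 2 4

The GK value is what links the four output rows back to the same source row; the GCID is the quarter index shown in the example above.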
Normalizer Transformation Components

A Normalizer transformation contains the following tabs:
♦ Transformation. Enter the name and description of the transformation. The naming convention for a Normalizer transformation is NRM_TransformationName. You can also make the pipeline Normalizer transformation reusable.
♦ Ports. View the transformation ports and attributes. For more information, see “Ports Tab” on page 374.
♦ Properties. Configure the tracing level to determine the amount of transaction detail reported in the session log file. Choose to reset or restart the generated key sequence value in the next session. For more information, see “Properties Tab” on page 376.
♦ Normalizer. Define the structure of the source data. The Normalizer tab defines source data as columns and groups of columns. For more information, see “Normalizer Tab” on page 377.
♦ Metadata Extensions. Configure the extension name, datatype, precision, and value. You can also create reusable metadata extensions. For more information, see “Metadata Extensions” in the Repository Guide.

Figure 16-1 shows the ports on the Normalizer transformation:
Figure 16-1. Normalizer Transformation Ports

Ports Tab

When you define a Normalizer transformation, you configure the columns on the Normalizer tab, and the Designer creates the ports. You can view the Normalizer ports and attributes on the Ports tab.
Pipeline and VSAM Normalizer transformations represent multiple-occurring source columns differently. A VSAM Normalizer transformation has one input port for a multiple-occurring column. A pipeline Normalizer transformation has multiple input ports for a multiple-occurring column.

The Normalizer transformation has one output port for each single-occurring input port. When a source column is multiple-occurring, the pipeline and VSAM Normalizer transformations have one output port for the column. The transformation returns a row for each source column occurrence.

The Normalizer transformation has a generated column ID (GCID) port for each multiple-occurring column. The generated column ID is an index for the instance of the multiple-occurring data. For example, if a column occurs four times in a source record, the Normalizer returns a value of 1, 2, 3, or 4 in the generated column ID based on which instance of the multiple-occurring data occurs in the row. The naming convention for the Normalizer generated column ID is GCID_<occurring_field_name>.

The Normalizer transformation has at least one generated key port. The Integration Service increments the generated key sequence number each time it processes a source row.

Figure 16-2 shows the Normalizer transformation Ports tab:
Figure 16-2. Normalizer Ports Tab (Sales_By_Quarter is multiple-occurring in the source. The Normalizer transformation has one output port for Sales_By_Quarter and returns four rows for each source row. The figure also shows the generated key start value.)

You can change the ports on a pipeline Normalizer transformation by editing the columns on the Normalizer tab. To change a VSAM Normalizer transformation, you need to change the COBOL source and recreate the transformation.

You can change the generated key start values on the Ports tab. You can enter different values for each generated key. When you change a start value, the generated key value resets to the start value the next time you run a session. For more information about generated keys, see “Normalizer Transformation Generated Keys” on page 379.
For more information about the VSAM Normalizer Ports tab, see “VSAM Normalizer Ports Tab” on page 382. For more information about the pipeline Normalizer Ports tab, see “Pipeline Normalizer Ports Tab” on page 388.

Properties Tab

Configure the Normalizer transformation general properties on the Properties tab.

Figure 16-3 shows the Normalizer transformation Properties tab:
Figure 16-3. Normalizer Transformation Properties Tab

Table 16-1 describes the Normalizer transformation properties:

Table 16-1. Normalizer Transformation Properties
Reset (Required): At the end of a session, resets the value sequence for each generated key value to the value it was before the session. For more information, see “Normalizer Transformation Generated Keys” on page 379.
Restart (Required): Starts the generated key sequence at 1. Each time you run a session, the key sequence value starts at 1 and overrides the sequence value on the Ports tab. For more information, see “Normalizer Transformation Generated Keys” on page 379.
Tracing Level (Required): Sets the amount of detail included in the session log when you run a session containing this transformation. For more information, see “Configuring Tracing Level in Transformations” on page 30.
Normalizer Tab

The Normalizer tab defines the structure of the source data. The Normalizer tab defines source data as columns and groups of columns. A group of columns might define a record in a COBOL source, or it might define a group of multiple-occurring fields in the source.

The column level number identifies groups of columns in the data. Level numbers define a data hierarchy. Columns in a group have the same level number and display sequentially below a group-level column. A group-level column has a lower level number, and it contains no data.

In Figure 16-4 on page 377, Quarterly_Data is a group-level column at Level 1. The Quarterly_Data group occurs four times in each row. Sales_by_Quarter and Returns_by_Quarter belong to the group. They are Level 2 columns.

Figure 16-4 shows the Normalizer tab of a pipeline Normalizer transformation:
Figure 16-4. Normalizer Tab (The Quarterly_Data columns occur four times. Each column has an Occurs attribute.)

The Occurs attribute identifies columns or groups of columns that occur more than once in a source row. When you create a pipeline Normalizer transformation, you can edit the columns. When you create a VSAM Normalizer transformation, the Normalizer tab is read-only.
Table 16-2 describes the Normalizer tab attributes that are common to the VSAM and pipeline Normalizer transformations:

Table 16-2. Normalizer Tab Columns
Column Name: Name of the source column.
Level: Groups columns. Columns in the same group occur beneath a column with a lower level number. When each column is the same level, the transformation contains no column groups.
Occurs: The number of instances of a column or group of columns in the source row.
Datatype: The transformation column datatype can be String, Nstring, or Number.
Prec: Precision. Length of the column.
Scale: Number of decimal positions for a numeric column.

The Normalizer tab for a VSAM Normalizer transformation contains the same attributes as the pipeline Normalizer transformation, but it includes attributes unique to a COBOL source definition. For more information, see “VSAM Normalizer Tab” on page 383 and “Pipeline Normalizer Tab” on page 390.
Normalizer Transformation Generated Keys

The Normalizer transformation has at least one generated key column in the output row. The Integration Service increments the generated key sequence number each time it processes a source row. The Integration Service determines the initial key value from the generated key value on the Ports tab of the Normalizer transformation. When you create a Normalizer transformation, the generated key value is 1 by default. The naming convention for the Normalizer generated key is GK_<redefined_field_name>.

For information about mapping the Normalizer transformation generated keys to targets, see “Generating Key Values” on page 396.

Storing Generated Key Values

You can view the current generated key values on the Normalizer transformation Ports tab. At the end of each session, the Integration Service updates the generated key value in the Normalizer transformation to the last value generated for the session plus one. If you have multiple instances of the Normalizer transformation in the repository, the Integration Service updates the generated key value in all versions when it runs a session.

Note: Change the generated key sequence number only when you need to change the sequence. The Integration Service might pass duplicate keys to the target when you reset a generated key that exists in the target.

Changing the Generated Key Values

You can change the generated key value in the following ways (a sketch of these options follows this list):
♦ Modify the generated key sequence value. You can modify the generated key sequence value on the Ports tab of the Normalizer transformation. The Integration Service assigns the sequence value to the first generated key it creates for that column.
♦ Reset the generated key sequence. Reset the generated key sequence on the Normalizer transformation Properties tab. When you reset the generated key sequence, the Integration Service resets the generated key start value back to the value it was before the session. Reset the generated key sequence when you want to create the same generated key values each time you run the session.
♦ Restart the generated key sequence. Restart the generated key sequence on the Normalizer transformation Properties tab. When you restart the generated key sequence, the Integration Service starts the generated key sequence at 1 the next time it runs a session. When you restart the generated key sequence, the generated key start value does not change in the Normalizer transformation until you run a session. When you run the session, the Integration Service overrides the sequence number value on the Ports tab.

When you reset or restart the generated key sequence, the reset or restart affects the generated key sequence values the next time you run a session. You do not change the current generated key sequence values in the Normalizer transformation. When you reset or restart the generated key sequence, the option remains enabled for every session until you disable it.
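The interaction of the default, Reset, and Restart behaviors can be summarized in a few lines of Python. This is an illustrative model only, not PowerCenter code, and it assumes a single generated key column.

# Illustrative model of the generated key sequence across sessions.
class GeneratedKeySequence:
    def __init__(self, start_value=1):
        self.start_value = start_value        # the value shown on the Ports tab

    def run_session(self, row_count, reset=False, restart=False):
        first = 1 if restart else self.start_value
        keys = list(range(first, first + row_count))
        if not reset:
            self.start_value = keys[-1] + 1   # default: last value plus one
        # with reset=True the start value is restored to its pre-session value
        return keys

seq = GeneratedKeySequence()
print(seq.run_session(3))                # [1, 2, 3]; start value becomes 4
print(seq.run_session(3, reset=True))    # [4, 5, 6]; start value stays 4
print(seq.run_session(3, restart=True))  # [1, 2, 3]; start value becomes 4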
VSAM Normalizer Transformation

The VSAM Normalizer transformation is the source qualifier for a COBOL source definition. A COBOL source is a flat file that can contain multiple-occurring data and multiple types of records in the same file.

VSAM (Virtual Storage Access Method) is a file access method for IBM mainframe operating systems. VSAM files organize records in indexed or sequential flat files. However, you can use the VSAM Normalizer transformation for any flat file source that you define with a COBOL source definition.

A COBOL source definition can have an OCCURS statement that defines a multiple-occurring column. The COBOL source definition can also contain a REDEFINES statement to define more than one type of record in the file.

The following COBOL copybook defines a sales record:

01 SALES_RECORD.
   03 HDR_DATA.
      05 HDR_REC_TYPE PIC X.
      05 HDR_STORE PIC X(02).
   03 STORE_DATA.
      05 STORE_NAME PIC X(30).
      05 STORE_ADDR1 PIC X(30).
      05 STORE_CITY PIC X(30).
   03 DETAIL_DATA REDEFINES STORE_DATA.
      05 DETAIL_ITEM PIC 9(9).
      05 DETAIL_DESC PIC X(30).
      05 DETAIL_PRICE PIC 9(4)V99.
      05 DETAIL_QTY PIC 9(5).
      05 SUPPLIER_INFO OCCURS 4 TIMES.
         10 SUPPLIER_CODE PIC XX.
         10 SUPPLIER_NAME PIC X(8).

The sales file can contain two types of sales records. Store_Data defines a store and Detail_Data defines merchandise sold in the store. The REDEFINES clause indicates that Detail_Data fields might occur in a record instead of Store_Data fields.

The first three characters of each sales record are the header. The header includes a record type and a store ID. The value of Hdr_Rec_Type defines whether the rest of the record contains store information or merchandise information. For example, when Hdr_Rec_Type is “S,” the record contains store data. When Hdr_Rec_Type is “D,” the record contains detail data. When the record contains detail data, it includes the Supplier_Info fields. The OCCURS clause defines four suppliers in each Detail_Data record.

For more information about COBOL source definitions, see the Designer Guide.
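To see what the copybook implies about the byte layout, here is a hedged Python sketch. It is not PowerCenter code and not a general COBOL reader: the offsets are a plain reading of the PIC clauses above (DISPLAY usage, no COMP fields), and the sample record is invented for illustration.

# Illustrative only: slice one fixed-width record per the copybook above.
def parse_sales_record(rec):
    result = {"rec_type": rec[0], "store": rec[1:3]}
    if result["rec_type"] == "S":                 # STORE_DATA layout
        result.update(name=rec[3:33].rstrip(),
                      addr1=rec[33:63].rstrip(),
                      city=rec[63:93].rstrip())
        return result
    # DETAIL_DATA REDEFINES the same bytes used by STORE_DATA
    result.update(item=rec[3:12],
                  desc=rec[12:42].rstrip(),
                  price=int(rec[42:48]) / 100,    # PIC 9(4)V99: implied decimals
                  qty=int(rec[48:53]))
    base = 53
    result["suppliers"] = [                       # SUPPLIER_INFO OCCURS 4 TIMES
        (rec[base + i * 10: base + i * 10 + 2],
         rec[base + i * 10 + 2: base + i * 10 + 10].rstrip())
        for i in range(4)]
    return result

# Invented detail record matching the copybook widths:
suppliers = "".join(code + name.ljust(8) for code, name in
                    [("A1", "Supp1"), ("B2", "Supp2"),
                     ("C3", "Supp3"), ("D4", "Supp4")])
detail = "D01" + "123456789" + "USB Cable".ljust(30) + "001495" + "00002" + suppliers
print(parse_sales_record(detail))

The VSAM Normalizer does this slicing for you from the source definition; the sketch only shows why a REDEFINES record cannot be interpreted until the header record type is known.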
Figure 16-5 shows the Sales_File COBOL source definition that you might create from the COBOL copybook:
Figure 16-5. COBOL Source Definition Example (Group-level columns identify groups of columns in a COBOL source definition. Group-level columns do not contain data.)

The Sales_Rec, Hdr_Data, Store_Data, Detail_Data, and Supplier_Info columns are group-level columns that identify groups of lower-level data. Group-level columns have a length of zero because they contain no data. None of these columns are output ports in the source definition.

The Supplier_Info group contains the Supplier_Code and Supplier_Name columns. The Supplier_Info group occurs four times in each Detail_Data record.

When you create a VSAM Normalizer transformation from the COBOL source definition, the Mapping Designer creates the input/output ports in the Normalizer transformation based on the COBOL source definition. The Normalizer transformation contains at least one generated key output port. When the COBOL source has multiple-occurring columns, the Normalizer transformation has a generated column ID output port. For more information about the generated column ID, see “Ports Tab” on page 374.

Figure 16-6 shows the Normalizer transformation ports the Mapping Designer creates from the source definition:
Figure 16-6. Sales File VSAM Normalizer Transformation (The Normalizer transformation has a generated key port and a generated column ID port.)
In Figure 16-5 on page 381, the Supplier_Info group of columns occurs four times in each COBOL source row. The COBOL source row might contain the following data:

Item1 ItemDesc 100 25 A Supplier1 B Supplier2 C Supplier3 D Supplier4

The Normalizer transformation returns a row for each occurrence of the Supplier_Code and Supplier_Name columns. Each output row contains the same item, description, price, and quantity values. The Normalizer returns the following detail data rows from the COBOL source row:

Item1 ItemDesc 100 25 A Supplier1 1 1
Item1 ItemDesc 100 25 B Supplier2 1 2
Item1 ItemDesc 100 25 C Supplier3 1 3
Item1 ItemDesc 100 25 D Supplier4 1 4

Each output row contains a generated key and a column ID. The Integration Service updates the generated key value when it processes a new source row. In the detail data rows, the generated key value is 1.

The column ID defines the Supplier_Info column occurrence number. The Integration Service updates the column ID for each occurrence of Supplier_Info. The column ID values are 1, 2, 3, and 4 in the detail data rows.

VSAM Normalizer Ports Tab

The VSAM Normalizer Ports tab shows the transformation input and output ports. It has one input/output port for each COBOL source column, and one input/output port for a multiple-occurring column. The transformation does not have input or output ports for group-level columns.
Figure 16-7 shows the VSAM Normalizer Ports tab:
Figure 16-7. VSAM Normalizer Ports Tab (Supplier_Code and Supplier_Name occur four times in the COBOL source, but the Ports tab shows one Supplier_Code port and one Supplier_Name port. The figure also shows the generated key start values.)

VSAM Normalizer Tab

When you create a VSAM Normalizer transformation, the Mapping Designer creates the columns from a COBOL source. The Normalizer tab displays the same information as the COBOL source definition. You cannot edit the columns on a VSAM Normalizer tab.
Figure 16-8 shows a Normalizer tab for a VSAM Normalizer transformation:
Figure 16-8. Normalizer Tab for a VSAM Normalizer Transformation

Table 16-3 describes the VSAM Normalizer tab:

Table 16-3. Normalizer Tab for a VSAM Normalizer Transformation
POffs: Physical offset. Location of the field in the file. The first byte in the file is zero.
Plen: Physical length. Number of bytes in the field.
Column Name: Name of the source field.
Level: Provides the column group hierarchy. The higher the level number, the lower the data is in the hierarchy. Columns in the same group occur beneath a column with a lower level number. When each column is the same level, the transformation contains no column groups.
Occurs: The number of instances of a column or group of columns in the source row.
Datatype: The transformation datatype can be String, Nstring, or Number.
Prec: Precision. Length of the column.
Scale: Number of decimal positions for a numeric column.
Picture: How the data is stored or displayed in the source. Picture 99V99 defines a numeric field with two implied decimals. Picture X(10) indicates ten characters.
Usage: COBOL data storage format, such as COMP, BINARY, and COMP-3. When the Usage is DISPLAY, the Picture clause defines how the source data is formatted when you view it.
Key Type: Type of key constraint to apply to this field. When you configure a field as a primary key, the Integration Service generates unique numeric IDs for this field when running a session with a COBOL source.
Signed (S): Indicates whether numeric values are signed.
Trailing Sign (T): Indicates that the sign (+ or -) exists in the last digit of the field. If not enabled, the sign appears as the first character in the field.
Included Sign (I): Indicates whether the sign is included in any value appearing in the field.
Real Decimal Point (R): Indicates whether the decimal point is a period (.) or the decimal point is represented by the V character in a numeric field.
Redefines: Indicates that the column REDEFINES another column.
Business Name: Descriptive name that you give to a column.

Steps to Create a VSAM Normalizer Transformation

When you create a VSAM Normalizer transformation, you drag a COBOL source into a mapping and the Mapping Designer creates the transformation columns from the source. The Normalizer transformation is the source qualifier for the COBOL source in the mapping.

When you add a COBOL source to a mapping, the Mapping Designer creates and configures a Normalizer transformation. The Mapping Designer identifies nested records and multiple-occurring fields in the COBOL source. It creates the columns and ports in the Normalizer transformation from the source columns.

To create a VSAM Normalizer transformation:
1. In the Mapping Designer, create a new mapping or open an existing mapping.
2. Drag a COBOL source definition into the mapping. The Designer adds a Normalizer transformation and connects it to the COBOL source definition. If you have not enabled the option to create a source qualifier by default, the Create Normalizer Transformation dialog box appears.
For more information about the option to create a source qualifier by default, see “Using the Designer” in the Designer Guide.
3. If the Create Normalizer Transformation dialog box appears, you can choose from the following options:
♦ VSAM Source. Create a transformation from the COBOL source definition in the mapping.
♦ Pipeline. Create a transformation, but do not define columns from a COBOL source. Define the columns manually on the Normalizer tab. You might choose this option when you want to process multiple-occurring data from another transformation in the mapping.
To create the VSAM Normalizer transformation, select the VSAM Source option. The dialog box displays the name of the COBOL source definition in the mapping. Select the COBOL source definition and click OK.
4. Open the Normalizer transformation.
5. Select the Ports tab to view the ports in the Normalizer transformation. The Designer creates the ports from the COBOL source definition by default.
6. Click the Normalizer tab to review the source column organization. The Normalizer tab contains the same information as the Columns tab of the COBOL source. However, you cannot modify column attributes in the Normalizer transformation. To change column attributes, change the COBOL copybook, import the COBOL source, and recreate the Normalizer transformation.
7. Select the Properties tab to set the tracing level. You can also configure the transformation to reset the generated key sequence numbers at the start of the next session. For more information, see “Changing the Generated Key Values” on page 379.
Pipeline Normalizer Transformation

When you create a Normalizer transformation in the Transformation Developer, you create a pipeline Normalizer transformation by default. When you create a pipeline Normalizer transformation, you define the columns based on the data the transformation receives from another type of transformation, such as a Source Qualifier transformation. The Designer creates the input and output Normalizer transformation ports from the columns you define.

Figure 16-9 shows the Normalizer transformation columns for a transformation that receives four sales columns in each relational source row:
Figure 16-9. Pipeline Normalizer Columns (Each source row has a StoreName column and four instances of Sales_By_Quarter.)

The source rows might contain the following data:

Dellmark 100 450 650 780
Tonys 666 333 444 555

Figure 16-10 shows the ports that the Designer creates from the columns in the Normalizer transformation:
Figure 16-10. Pipeline Normalizer Ports (A pipeline Normalizer transformation has an input port for each instance of a multiple-occurring column. The transformation returns one instance of the multiple-occurring column in each output row.)
The Normalizer transformation returns one row for each instance of the multiple-occurring column:

Dellmark 100 1 1
Dellmark 450 1 2
Dellmark 650 1 3
Dellmark 780 1 4
Tonys 666 2 1
Tonys 333 2 2
Tonys 444 2 3
Tonys 555 2 4

The Integration Service increments the generated key sequence number each time it processes a source row. The generated key links each quarter's sales to the same store. In this example, the generated key for the Dellmark row is 1. The generated key for the Tonys store is 2.

The transformation returns a generated column ID (GCID) for each instance of a multiple-occurring field. The GCID_Sales_by_Quarter value is always 1, 2, 3, or 4 in this example. For more information about the generated key, see “Normalizer Transformation Generated Keys” on page 379.

Pipeline Normalizer Ports Tab

The pipeline Normalizer Ports tab displays the input and output ports for the transformation. It has one input/output port for each single-occurring column you define in the transformation, and one port for each occurrence of a multiple-occurring column. The transformation does not have input or output ports for group-level columns.
Figure 16-11 shows the pipeline Normalizer transformation Ports tab:
Figure 16-11. Pipeline Normalizer Ports Tab (The Designer creates an input port for each occurrence of a multiple-occurring column. You can change the generated key sequence number.)

To change the ports in a pipeline Normalizer transformation, modify the columns on the Normalizer tab. When you add a column occurrence, the Designer adds an input port. The Designer creates ports for the lowest-level columns. It does not create ports for group-level columns.
Pipeline Normalizer Tab

When you create a pipeline Normalizer transformation, you define the columns on the Normalizer tab. The Designer creates input and output ports based on the columns you enter on the Normalizer tab.

Figure 16-12 shows the Normalizer tab for a pipeline Normalizer transformation:
Figure 16-12. Normalizer Tab (Click Level to organize columns into groups.)

Table 16-4 describes the pipeline Normalizer tab attributes:

Table 16-4. Pipeline Normalizer Tab
Column Name: Name of the column.
Level: Identifies groups of columns. Columns in the same group have the same level number. Default is zero. When each column is the same level, the transformation contains no column groups.
Occurs: The number of instances of a column or group of columns in the source row.
Datatype: The column datatype can be String, Nstring, or Number.
Prec: Precision. Length of the column.
Scale: Number of decimal digits in a numeric value.

Normalizer Tab Column Groups

When a source row contains groups of repeating columns, you can define column groups on the Normalizer tab. The Normalizer transformation returns a row for each column group occurrence instead of for each column occurrence.
The level number on the Normalizer tab identifies a hierarchy of columns. Group-level columns identify groups of columns. The group-level column has a lower level number than the columns in the group. Columns in the same group have the same level number and display sequentially below the group-level column on the Normalizer tab.

Figure 16-13 shows a group of multiple-occurring columns in the Normalizer tab:
Figure 16-13. Grouping Repeated Columns on the Normalizer Tab (The NEWRECORD column contains no data. It is a Level 1 group column, and the group occurs four times in each source row. Store_Number and Store_Name are Level 2 columns that belong to the NEWRECORD group.)

For more information about creating columns and groups, see “Steps to Create a Pipeline Normalizer Transformation” on page 391.

Steps to Create a Pipeline Normalizer Transformation

When you create a pipeline Normalizer transformation, you define the columns on the Normalizer tab. You can create a Normalizer transformation in the Transformation Developer or in the Mapping Designer.

To create a Normalizer transformation:
1. In the Transformation Developer or the Mapping Designer, click Transformation > Create. Select Normalizer transformation and enter a name for it. The naming convention for Normalizer transformations is NRM_TransformationName.
2. Click Create and click Done.
3. Open the Normalizer transformation and click the Normalizer tab.
4. Click Add to add a new column.
The Designer creates a new column with default attributes. You can change the name, datatype, precision, and scale.
5. To create a multiple-occurring column, enter the number of occurrences in the Occurs column.
6. To create a group of multiple-occurring columns, enter at least one of the columns on the Normalizer tab. Select the column and click Level. Click Level to change column levels; all columns are the same level by default. The level defines columns that are grouped together. The Designer adds a NEWRECORD group-level column above the selected column. NEWRECORD becomes Level 1, and the selected column becomes Level 2. You can rename the NEWRECORD column.
7. You can change the column level for other columns to add them to the same group. Select a column and click Level to change it to the same level as the column above it. Columns in the same group must appear sequentially on the Normalizer tab.
Figure 16-14 shows the NEWRECORD column that groups the Store_Number and Store_Name columns:
Figure 16-14. Group-Level Column on the Normalizer Tab (The NEWRECORD column is a Level 1 group column. Store_Number and Store_Name are Level 2 columns.)

8. Change the occurrence at the group level to make the group of columns multiple-occurring.
9. Click Apply to save the columns and create input and output ports. The Designer creates the Normalizer transformation input and output ports. In addition, the Designer creates the generated key columns and a column ID for each multiple-occurring column or group of columns.
10. Select the Properties tab to change the tracing level or reset the generated key sequence numbers after the next session. For more information, see “Changing the Generated Key Values” on page 379.
Using a Normalizer Transformation in a Mapping

When a Normalizer transformation receives more than one type of data from a COBOL source, you need to connect the Normalizer output ports to different targets based on the type of data in each row. The following example describes how to map the Sales_File COBOL source definition through a Normalizer transformation to multiple targets.

The Sales_File source record contains either store information or information about items that a store sells. The sales file contains both types of records. The following example includes two sales file records:

Store Record: H01Software Suppliers Incorporated 1111 Battery Street San Francisco
Item Record: D01123456789USB Line - 10 Feet 001495000020 01Supp1 02Supp2 03Supp3 04Supp4

The COBOL source definition and the Normalizer transformation have columns that represent fields in both types of records. You need to filter the store rows from the item rows and pass them to different targets.

Figure 16-15 shows the Sales_File COBOL source:
Figure 16-15. Sales File COBOL Source (Hdr_Rec_Type defines the type of data in the source record. The source record might contain the Store_Data information or the Detail_Data information with four occurrences of Supplier_Info.)

The Hdr_Rec_Type defines whether the record contains store or merchandise data. When the Hdr_Rec_Type value is “S,” the record contains Store_Data. When the Hdr_Rec_Type is “D,” the record contains Detail_Data. Detail_Data always includes four occurrences of the Supplier_Info fields.

To filter data, connect the Normalizer output rows to a Router transformation to route the store, item, and supplier data to different targets. You can filter rows in the Router transformation based on the value of Hdr_Rec_Type.
Figure 16-16 shows the mapping that routes Sales_File records to different targets:
Figure 16-16. Multiple Record Types Routed to Different Targets (The Router transformation filters store, detail, and supplier columns. The numbered callouts in the figure correspond to the steps below.)

The mapping filters multiple record types from the COBOL source to relational targets. The multiple-occurring source columns are mapped to a separate relational table. Each row is indexed by occurrence in the source row.

The mapping contains the following transformations:
♦ Normalizer transformation. The Normalizer transformation returns multiple rows when the source contains multiple-occurring Detail_Data. It also processes different record types from the same source.
♦ Router transformation. The Router transformation routes data to targets based on the value of Hdr_Rec_Type.
♦ Aggregator transformation. The Aggregator transformation removes duplicate Detail_Data rows that occur with each Supplier_Info occurrence.

The mapping has the following functionality:
1. The Normalizer transformation passes the header record type and header store number columns to the Sales_Header target. Each Sales_Header record has a generated key that links the Sales_Header row to a Store_Data or Detail_Data target row. The Normalizer returns Hdr_Data and Store_Data once per row.
2. The Normalizer transformation passes all columns to the Router transformation. It passes Detail_Data four times per row, once for each occurrence of the Supplier_Info columns. The Detail_Data columns contain duplicate data, except for the Supplier_Info columns.
3. The Router transformation passes the store name, address, city, and generated key to Store_Data when the Hdr_Rec_Type is “S.” The generated key links Store_Data rows to Sales_Header rows. The Router transformation contains one user-defined group for the store data and one user-defined group for the merchandise items.
4. The Router transformation passes the item, item description, price, quantity, and Detail_Data generated keys to an Aggregator transformation when the Hdr_Rec_Type is “D.”
5. The Router transformation passes the supplier code, name, and column ID to the Suppliers target when the Hdr_Rec_Type is “D.” It passes the generated key that links the Suppliers row to the Detail_Data row.
6. The Aggregator transformation removes the duplicate Detail_Data columns. The Aggregator passes one instance of the item, description, price, quantity, and generated key to Detail_Data. The Detail_Data generated key links the Detail_Data rows to the Suppliers rows. Detail_Data also has a key that links the Detail_Data rows to the Sales_Header rows.

Figure 16-17 shows the user-defined groups and the filter conditions in the Router transformation:
Figure 16-17. Router Transformation User-Defined Groups (The Router transformation passes store data or item data based on the record type.)

Generating Key Values

The Normalizer transformation creates a generated key when the COBOL source contains a group of multiple-occurring columns. You can pass a group of multiple-occurring columns to a different target than the other columns in the row. You can create a primary key-foreign key relationship between the targets with the generated key. For more information about generated keys, see “Normalizer Transformation Generated Keys” on page 379.
For example, Figure 16-18 shows a COBOL source definition that contains a multiple-occurring group of columns:
Figure 16-18. COBOL Source with a Multiple-Occurring Group of Columns (The Detail_Suppliers group of columns occurs four times in the Detail_Record.)

The Normalizer transformation generates a GK_Detail_Sales key for each source row. The GK_Detail_Sales key represents one Detail_Record source row.

Figure 16-19 shows the primary key-foreign key relationships between the targets:
Figure 16-19. Generated Keys in Target Tables (Multiple-occurring Detail_Supplier rows have a foreign key linking them to the same Detail_Sales row. The Detail_Sales target has a one-to-many relationship to the Detail_Suppliers target.)
Figure 16-20 shows the GK_Detail_Sales generated key connected to primary and foreign keys in the target:
Figure 16-20. Generated Keys Mapped to Target Keys (Pass GK_Detail_Sales to the primary key of Detail_Sales and the foreign key of Detail_Suppliers.)

Map the Normalizer output columns to the following objects:
♦ Detail_Sales_Target. Pass the Detail_Item, Detail_Desc, Detail_Price, and Detail_Qty columns to a Detail_Sales target. Pass the GK_Detail_Sales key to the Detail_Sales primary key.
♦ Aggregator Transformation. Pass each Detail_Sales row through an Aggregator transformation to remove duplicate rows. The Normalizer returns duplicate Detail_Sales columns for each occurrence of Detail_Suppliers.
♦ Detail_Suppliers. Pass each instance of the Detail_Suppliers columns to the Detail_Suppliers target. Pass the GK_Detail_Sales key to the Detail_Suppliers foreign key. Each instance of the Detail_Suppliers columns has a foreign key that relates the Detail_Suppliers row to the Detail_Sales row.

For more information about connecting Normalizer transformation ports to relational targets, see “Using a Normalizer Transformation in a Mapping” on page 394.
Troubleshooting

I cannot edit the ports in my Normalizer transformation when using a relational source.
When you create ports manually, add them on the Normalizer tab in the transformation, not the Ports tab.

Importing a COBOL file failed with numerous errors. What should I do?
Verify that the COBOL program follows the COBOL standard, including spaces, tabs, and end-of-line characters. The COBOL file headings should be similar to the following text:

identification division.
program-id. mead.
environment division.
select file-one assign to "fname".
data division.
file section.
fd FILE-ONE.

The Designer does not read hidden characters in the COBOL program. Use a text-only editor to make changes to the COBOL file. Do not use Word or WordPad. Remove extra spaces.

A session that reads binary data completed, but the information in the target table is incorrect.
Edit the session in the Workflow Manager and verify that the source file format is set correctly. The file format might be EBCDIC or ASCII. The number of bytes to skip between records must be set to 0.

I have a COBOL field description that uses a non-IBM COMP type. How should I import the source?
In the source definition, clear the IBM COMP option.

In my mapping, I use one Expression transformation and one Lookup transformation to modify two output ports from the Normalizer transformation. The mapping concatenates them into a single transformation. All the ports are under the same level. When I check the data loaded in the target, it is incorrect. Why is that?
You can concatenate ports only from level one. Remove the concatenation.
Chapter 17
Rank Transformation

This chapter includes the following topics:
♦ Overview, 402
♦ Ports in a Rank Transformation, 404
♦ Defining Groups, 405
♦ Creating a Rank Transformation, 406
Overview

Transformation type: Active, Connected

You can select only the top or bottom rank of data with a Rank transformation. Use a Rank transformation to return the largest or smallest numeric value in a port or group. You can also use a Rank transformation to return the strings at the top or the bottom of a session sort order. During the session, the Integration Service caches input data until it can perform the rank calculations.

The Rank transformation differs from the transformation functions MAX and MIN in that it lets you select a group of top or bottom values, not just one value. For example, use Rank to select the top 10 salespersons in a given territory. Or, to generate a financial report, you might use a Rank transformation to identify the three departments with the lowest expenses in salaries and overhead. While the SQL language provides many functions designed to handle groups of data, identifying top or bottom strata within a set of rows is not possible using standard SQL functions.

You connect all ports representing the same row set to the transformation. Only the rows that fall within that rank, based on some measure you set when you configure the transformation, pass through the Rank transformation. You can also write expressions to transform data or perform calculations.

Figure 17-1 shows a mapping that passes employee data from a human resources table through a Rank transformation, which passes only the rows for the top 10 highest paid employees to the next transformation:
Figure 17-1. Sample Mapping with a Rank Transformation

As an active transformation, the Rank transformation might change the number of rows passed through it. You might pass 100 rows to the Rank transformation, but select to rank only the top 10 rows, which pass from the Rank transformation to another transformation.

You can connect ports from only one transformation to the Rank transformation. You can also create local variables and write non-aggregate expressions.
• 435. Ranking String Values
When the Integration Service runs in the ASCII data movement mode, it sorts session data using a binary sort order. When the Integration Service runs in Unicode data movement mode, it uses the sort order configured for the session. You select the session sort order in the session properties. The session properties list all available sort orders based on the code page used by the Integration Service.
For example, you have a Rank transformation configured to return the top three values of a string port. When you configure the workflow, you select the Integration Service on which you want the workflow to run. The session properties display all sort orders associated with the code page of the selected Integration Service, such as French, German, and Binary. If you configure the session to use a binary sort order, the Integration Service calculates the binary value of each string, and returns the three rows with the highest binary values for the string.
Rank Caches
During a session, the Integration Service compares an input row with rows in the data cache. If the input row out-ranks a cached row, the Integration Service replaces the cached row with the input row. If you configure the Rank transformation to rank across multiple groups, the Integration Service ranks incrementally for each group it finds.
The Integration Service stores group information in an index cache and row data in a data cache. If you create multiple partitions in a pipeline, the Integration Service creates separate caches for each partition. For more information about caching, see “Session Caches” in the Workflow Administration Guide.
Rank Transformation Properties
When you create a Rank transformation, you can configure the following properties:
♦ Enter a cache directory.
♦ Select the top or bottom rank.
♦ Select the input/output port that contains values used to determine the rank. You can select only one port to define a rank.
♦ Select the number of rows falling within a rank.
♦ Define groups for ranks, such as the 10 least expensive products for each manufacturer.
Overview 403
• 436. Ports in a Rank Transformation
The Rank transformation includes input or input/output ports connected to another transformation in the mapping. It also includes variable ports and a rank port. Use the rank port to specify the column you want to rank.
Table 17-1 lists the ports in a Rank transformation:
Table 17-1. Rank Transformation Ports
I (Minimum of one): Input port. Create an input port to receive data from another transformation.
O (Minimum of one): Output port. Create an output port for each port you want to link to another transformation. You can designate input ports as output ports.
V (Not required): Variable port. Use to store values or calculations for use in an expression. Variable ports cannot be input or output ports. They pass data within the transformation only.
R (One only): Rank port. Use to designate the column for which you want to rank values. You can designate only one Rank port in a Rank transformation. The Rank port is an input/output port. You must link the Rank port to another transformation.
Rank Index
The Designer creates a RANKINDEX port for each Rank transformation. The Integration Service uses the Rank Index port to store the ranking position for each row in a group. For example, if you create a Rank transformation that ranks the top five salespersons for each quarter, the rank index numbers the salespeople from 1 to 5:
RANKINDEX  SALES_PERSON  SALES
1          Sam           10,000
2          Mary          9,000
3          Alice         8,000
4          Ron           7,000
5          Alex          6,000
The RANKINDEX is an output port only. You can pass the rank index to another transformation in the mapping or directly to a target.
404 Chapter 17: Rank Transformation
• 437. Defining Groups
Like the Aggregator transformation, the Rank transformation lets you group information. For example, if you want to select the 10 most expensive items by manufacturer, you would first define a group for each manufacturer. When you configure the Rank transformation, you can set one of its input/output ports as a group by port. For each unique value in the group port (for example, MANUFACTURER_ID or MANUFACTURER_NAME), the transformation creates a group of rows falling within the rank definition (top or bottom, and a particular number in each rank).
Therefore, the Rank transformation changes the number of rows in two different ways. By filtering all but the rows falling within a top or bottom rank, you reduce the number of rows that pass through the transformation. By defining groups, you create one set of ranked rows for each group.
For example, you might create a Rank transformation to identify the 50 highest paid employees in the company. In this case, you would identify the SALARY column as the input/output port used to measure the ranks, and configure the transformation to filter out all rows except the top 50. After the Rank transformation identifies all rows that belong to a top or bottom rank, it then assigns rank index values. In the case of the top 50 employees, measured by salary, the highest paid employee receives a rank index of 1. The next highest-paid employee receives a rank index of 2, and so on. When measuring a bottom rank, such as the 10 lowest priced products in the inventory, the Rank transformation assigns a rank index from lowest to highest. Therefore, the least expensive item would receive a rank index of 1.
If two rank values match, they receive the same value in the rank index and the transformation skips the next value. For example, if you want to see the top five retail stores in the country and two stores have the same sales, the return data might look similar to the following:
RANKINDEX  SALES   STORE
1          100000  Orange
1          100000  Brea
3          90000   Los Angeles
4          80000   Ventura
Defining Groups 405
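This tie behavior matches the semantics of the SQL RANK() window function (as opposed to DENSE_RANK(), which does not skip values). As a sketch, on a database that supports window functions, the store example above corresponds to the following query, where STORES is a hypothetical table used only for illustration:

SELECT STORE, SALES,
       RANK() OVER (ORDER BY SALES DESC) AS RANKINDEX
FROM STORES

Two stores tied at the top both receive rank 1, and the next store receives rank 3.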
• 438. Creating a Rank Transformation
You can add a Rank transformation anywhere in the mapping after the source qualifier.
To create a Rank transformation:
1. In the Mapping Designer, click Transformation > Create. Select the Rank transformation. Enter a name for the Rank. The naming convention for Rank transformations is RNK_TransformationName. Enter a description for the transformation. This description appears in the Repository Manager.
2. Click Create, and then click Done. The Designer creates the Rank transformation.
3. Link columns from an input transformation to the Rank transformation.
4. Click the Ports tab, and then select the Rank (R) option for the port used to measure ranks. If you want to create groups for ranked rows, select Group By for the port that defines the group.
406 Chapter 17: Rank Transformation
• 439. 5. Click the Properties tab and select whether you want the top or bottom rank.
6. For the Number of Ranks option, enter the number of rows you want to select for the rank.
7. Change the other Rank transformation properties, if necessary.
Table 17-2 describes the Rank transformation properties:
Table 17-2. Rank Transformation Properties
Cache Directory: Local directory where the Integration Service creates the index and data cache files. By default, the Integration Service uses the directory entered in the Workflow Manager for the process variable $PMCacheDir. If you enter a new directory, make sure the directory exists and contains enough disk space for the cache files.
Top/Bottom: Specifies whether you want the top or bottom ranking for a column.
Number of Ranks: Number of rows you want to rank.
Case-Sensitive String Comparison: When running in Unicode mode, the Integration Service ranks strings based on the sort order selected for the session. If the session sort order is case-sensitive, select this option to enable case-sensitive string comparisons, and clear this option to have the Integration Service ignore case for strings. If the sort order is not case-sensitive, the Integration Service ignores this setting. By default, this option is selected.
Tracing Level: Determines the amount of information the Integration Service writes to the session log about data passing through this transformation in a session.
Creating a Rank Transformation 407
• 440. Table 17-2. Rank Transformation Properties (continued)
Rank Data Cache Size: Data cache size for the transformation. Default is 2,000,000 bytes. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or more, you must run the session on a 64-bit Integration Service. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.
Rank Index Cache Size: Index cache size for the transformation. Default is 1,000,000 bytes. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or more, you must run the session on a 64-bit Integration Service. You can configure a numeric value, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache.
Transformation Scope: Specifies how the Integration Service applies the transformation logic to incoming data:
- Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.
- All Input. Applies the transformation logic to all incoming data. When you choose All Input, the Integration Service drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source.
For more information about transformation scope, see “Understanding Commit Points” in the Workflow Administration Guide.
8. Click OK.
9. Click Repository > Save.
408 Chapter 17: Rank Transformation
  • 441. Chapter 18 Router Transformation This chapter includes the following topics: ♦ Overview, 410 ♦ Working with Groups, 412 ♦ Working with Ports, 416 ♦ Connecting Router Transformations in a Mapping, 418 ♦ Creating a Router Transformation, 420 409
• 442. Overview
Transformation type: Active, Connected
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. A Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. However, a Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task. The Router transformation is more efficient. For example, to test data based on three conditions, you need only one Router transformation instead of three Filter transformations. Likewise, when you use a Router transformation in a mapping, the Integration Service processes the incoming data only once. When you use multiple Filter transformations in a mapping, the Integration Service processes the incoming data for each transformation.
Figure 18-1 shows two mappings that perform the same task. Mapping A uses three Filter transformations while Mapping B produces the same result with one Router transformation:
Figure 18-1. Comparing Router and Filter Transformations (Mapping A and Mapping B)
A Router transformation consists of input and output groups, input and output ports, group filter conditions, and properties that you configure in the Designer.
410 Chapter 18: Router Transformation
• 443. Figure 18-2 shows a sample Router transformation and its components:
Figure 18-2. Sample Router Transformation (showing input ports, the input group, user-defined output groups, output ports, and the default output group)
Overview 411
• 444. Working with Groups
A Router transformation has the following types of groups:
♦ Input
♦ Output
Input Group
The Designer copies property information from the input ports of the input group to create a set of output ports for each output group.
Output Groups
There are two types of output groups:
♦ User-defined groups
♦ Default group
You cannot modify or delete output ports or their properties.
User-Defined Groups
You create a user-defined group to test a condition based on incoming data. A user-defined group consists of output ports and a group filter condition. You can create and edit user-defined groups on the Groups tab with the Designer. Create one user-defined group for each condition that you want to specify.
The Integration Service uses the condition to evaluate each row of incoming data. It tests the conditions of each user-defined group before processing the default group. The Integration Service determines the order of evaluation for each condition based on the order of the connected output groups. The Integration Service processes user-defined groups that are connected to a transformation or a target in a mapping. The Integration Service only processes user-defined groups that are not connected in a mapping if the default group is connected to a transformation or a target. If a row meets more than one group filter condition, the Integration Service passes this row multiple times.
The Default Group
The Designer creates the default group after you create one new user-defined group. The Designer does not allow you to edit or delete the default group. This group does not have a group filter condition associated with it. If all of the conditions evaluate to FALSE, the Integration Service passes the row to the default group. If you want the Integration Service to
412 Chapter 18: Router Transformation
• 445. drop all rows in the default group, do not connect it to a transformation or a target in a mapping. The Designer deletes the default group when you delete the last user-defined group from the list.
Using Group Filter Conditions
You can test data based on one or more group filter conditions. You create group filter conditions on the Groups tab using the Expression Editor. You can enter any expression that returns a single value. You can also specify a constant for the condition. A group filter condition returns TRUE or FALSE for each row that passes through the transformation, depending on whether a row satisfies the specified condition. Zero (0) is the equivalent of FALSE, and any non-zero value is the equivalent of TRUE. The Integration Service passes the rows of data that evaluate to TRUE to each transformation or target that is associated with each user-defined group.
For example, you have customers from nine countries, and you want to perform different calculations on the data from only three countries. You might want to use a Router transformation in a mapping to filter this data to three different Expression transformations. There is no group filter condition associated with the default group. However, you can create an Expression transformation to perform a calculation based on the data from the other six countries.
Figure 18-3 shows a mapping with a Router transformation that filters data based on multiple conditions:
Figure 18-3. Using a Router Transformation in a Mapping
Working with Groups 413
• 446. Since you want to perform multiple calculations based on the data from three different countries, create three user-defined groups and specify three group filter conditions on the Groups tab.
Figure 18-4 shows specifying group filter conditions in a Router transformation to filter customer data:
Figure 18-4. Specifying Group Filter Conditions
In the session, the Integration Service passes the rows of data that evaluate to TRUE to each transformation or target that is associated with each user-defined group, such as Japan, France, and USA. The Integration Service passes the row to the default group if all of the conditions evaluate to FALSE. If this happens, the Integration Service passes the data of the other six countries to the transformation or target that is associated with the default group. If you want the Integration Service to drop all rows in the default group, do not connect it to a transformation or a target in a mapping.
Adding Groups
Adding a group is similar to adding a port in other transformations. The Designer copies property information from the input ports to the output ports. For more information, see “Working with Groups” on page 412.
To add a group to a Router transformation:
1. Click the Groups tab.
2. Click the Add button.
3. Enter a name for the new group in the Group Name section.
4. Click the Group Filter Condition field and open the Expression Editor.
414 Chapter 18: Router Transformation
• 447. 5. Enter the group filter condition.
6. Click Validate to check the syntax of the condition.
7. Click OK.
Working with Groups 415
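For example, the three user-defined groups for the Japan, France, and USA data described above might use group filter conditions similar to the following, where COUNTRY is a hypothetical input port name; each condition is an expression that returns TRUE (non-zero) or FALSE (zero) for every row:

COUNTRY = 'Japan'
COUNTRY = 'France'
COUNTRY = 'USA'

A row with COUNTRY equal to 'Japan' passes to the Japan group; a row from any of the other six countries fails all three conditions and passes to the default group.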
• 448. Working with Ports
A Router transformation has input ports and output ports. Input ports are in the input group, and output ports are in the output groups. You can create input ports by copying them from another transformation or by manually creating them on the Ports tab.
Figure 18-5 shows the Ports tab of a Router transformation:
Figure 18-5. Router Transformation Ports Tab
The Designer creates output ports by copying the following properties from the input ports:
♦ Port name
♦ Datatype
♦ Precision
♦ Scale
♦ Default value
When you make changes to the input ports, the Designer updates the output ports to reflect these changes. You cannot edit or delete output ports. The output ports display in the Normal view of the Router transformation. The Designer creates output port names based on the input port names. For each input port, the Designer creates a corresponding output port in each output group.
416 Chapter 18: Router Transformation
• 449. Figure 18-6 shows the output port names of a Router transformation in Normal view that correspond to the input port names:
Figure 18-6. Input Port Name and Corresponding Output Port Names
Working with Ports 417
• 450. Connecting Router Transformations in a Mapping
When you connect transformations to a Router transformation in a mapping, consider the following rules:
♦ You can connect one group to one transformation or target.
♦ You can connect one output port in a group to multiple transformations or targets.
♦ You can connect multiple output ports in one group to multiple transformations or targets.
♦ You cannot connect more than one group to one target or a single input group transformation.
418 Chapter 18: Router Transformation
• 451. You can connect more than one group to a multiple input group transformation, except for Joiner transformations, when you connect each output group to a different input group.
Connecting Router Transformations in a Mapping 419
• 452. Creating a Router Transformation
To add a Router transformation to a mapping, complete the following steps.
To create a Router transformation:
1. In the Mapping Designer, open a mapping.
2. Click Transformation > Create. Select Router transformation, and enter the name of the new transformation. The naming convention for the Router transformation is RTR_TransformationName. Click Create, and then click Done.
3. Select and drag all the ports from a transformation to add them to the Router transformation, or you can manually create input ports on the Ports tab.
4. Double-click the title bar of the Router transformation to edit transformation properties.
5. Click the Transformation tab and configure transformation properties.
6. Click the Properties tab and configure tracing levels. For more information about configuring tracing levels, see “Configuring Tracing Level in Transformations” on page 30.
7. Click the Groups tab, and then click the Add button to create a user-defined group. The Designer creates the default group when you create the first user-defined group.
8. Click the Group Filter Condition field to open the Expression Editor.
9. Enter a group filter condition.
10. Click Validate to check the syntax of the conditions you entered.
11. Click OK.
12. Connect group output ports to transformations or targets.
13. Click Repository > Save.
420 Chapter 18: Router Transformation
  • 453. Chapter 19 Sequence Generator Transformation This chapter includes the following topics: ♦ Overview, 422 ♦ Common Uses, 423 ♦ Sequence Generator Ports, 424 ♦ Transformation Properties, 427 ♦ Creating a Sequence Generator Transformation, 432 421
• 454. Overview
Transformation type: Passive, Connected
The Sequence Generator transformation generates numeric values. Use the Sequence Generator to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.
The Sequence Generator transformation is a connected transformation. It contains two output ports that you can connect to one or more transformations. The Integration Service generates a block of sequence numbers each time a block of rows enters a connected transformation. If you connect CURRVAL, the Integration Service processes one row in each block. When NEXTVAL is connected to the input port of another transformation, the Integration Service generates a sequence of numbers. When CURRVAL is connected to the input port of another transformation, the Integration Service generates the NEXTVAL value plus the Increment By value.
You can make a Sequence Generator reusable, and use it in multiple mappings. You might reuse a Sequence Generator when you perform multiple loads to a single target. For example, if you have a large input file that you separate into three sessions running in parallel, use a Sequence Generator to generate primary key values. If you use different Sequence Generators, the Integration Service might generate duplicate key values. Instead, use the reusable Sequence Generator for all three sessions to provide a unique value for each target row.
422 Chapter 19: Sequence Generator Transformation
• 455. Common Uses
You can complete the following tasks with a Sequence Generator transformation:
♦ Create keys.
♦ Replace missing values.
♦ Cycle through a sequential range of numbers.
Creating Keys
You can create approximately two billion primary or foreign key values with the Sequence Generator transformation by connecting the NEXTVAL port to the transformation or target and using the widest range of values (1 to 2147483647) with the smallest interval (1). When you create primary or foreign keys, use the Cycle option only if you prevent the Integration Service from creating duplicate primary keys. You might do this by selecting the Truncate Target Table option in the session properties (if appropriate) or by creating composite keys.
To create a composite key, you can configure the Integration Service to cycle through a smaller set of values. For example, if you have three stores generating order numbers, you might have a Sequence Generator cycling through values from 1 to 3, incrementing by 1. When you pass the following set of foreign keys, the generated values then create unique composite keys (a sample target table definition appears at the end of this section):
COMPOSITE_KEY  ORDER_NO
1              12345
2              12345
3              12345
1              12346
2              12346
3              12346
Replacing Missing Values
Use the Sequence Generator transformation to replace missing keys by using NEXTVAL with the IIF and ISNULL functions. For example, to replace null values in the ORDER_NO column, you create a Sequence Generator transformation with the appropriate properties and drag the NEXTVAL port to an Expression transformation. In the Expression transformation, drag the ORDER_NO port into the transformation (along with any other necessary ports). Then create a new output port, ALL_ORDERS. In ALL_ORDERS, you can then enter the following expression to replace null orders:
IIF( ISNULL( ORDER_NO ), NEXTVAL, ORDER_NO )
Common Uses 423
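Returning to the composite key example under "Creating Keys" above, the target table might be defined with a two-column primary key, so that the cycled value and the order number are only unique in combination. This is a sketch only; the table name T_ORDERS and the column datatypes are illustrative assumptions, not part of this guide's sample schema:

CREATE TABLE T_ORDERS (
    COMPOSITE_KEY INTEGER NOT NULL,
    ORDER_NO      INTEGER NOT NULL,
    PRIMARY KEY (COMPOSITE_KEY, ORDER_NO)
)

With the Sequence Generator cycling through 1 to 3, each (COMPOSITE_KEY, ORDER_NO) pair in the example data appears exactly once, so the composite key constraint is satisfied even though neither column is unique on its own.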
• 456. Sequence Generator Ports
The Sequence Generator transformation provides two output ports: NEXTVAL and CURRVAL. You cannot edit or delete these ports. Likewise, you cannot add ports to the transformation.
NEXTVAL
Connect NEXTVAL to multiple transformations to generate unique values for each row in each transformation. Use the NEXTVAL port to generate sequence numbers by connecting it to a transformation or target. You connect the NEXTVAL port to a downstream transformation to generate the sequence based on the Current Value and Increment By properties. For more information about Sequence Generator properties, see Table 19-1 on page 427.
For example, you might connect NEXTVAL to two target tables in a mapping to generate unique primary key values. The Integration Service creates a column of unique primary key values for each target table. The column of unique primary key values is sent to one target table as a block of sequence numbers. The second target receives a block of sequence numbers from the Sequence Generator transformation only after the first target table receives the block of sequence numbers.
Figure 19-1 shows connecting NEXTVAL to two target tables in a mapping:
Figure 19-1. Connecting NEXTVAL to Two Target Tables in a Mapping
For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the Integration Service generates the following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN target tables:
T_ORDERS_PRIMARY TABLE:  T_ORDERS_FOREIGN TABLE:
PRIMARY KEY              PRIMARY KEY
1                        6
2                        7
3                        8
424 Chapter 19: Sequence Generator Transformation
• 457. T_ORDERS_PRIMARY TABLE:  T_ORDERS_FOREIGN TABLE:
PRIMARY KEY              PRIMARY KEY
4                        9
5                        10
If you want the same values to go to more than one target that receives data from a single transformation, you can connect a Sequence Generator transformation to that preceding transformation. The Integration Service processes the values into a block of sequence numbers. This allows the Integration Service to pass unique values to the transformation, and then route rows from the transformation to targets.
Figure 19-2 shows a mapping with the Sequence Generator that passes unique values to the Expression transformation. The Expression transformation then populates both targets with identical primary key values.
Figure 19-2. Mapping with a Sequence Generator and an Expression Transformation
For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the Integration Service generates the following primary key values for the T_ORDERS_PRIMARY and T_ORDERS_FOREIGN target tables:
T_ORDERS_PRIMARY TABLE:  T_ORDERS_FOREIGN TABLE:
PRIMARY KEY              PRIMARY KEY
1                        1
2                        2
3                        3
4                        4
5                        5
Note: When you run a partitioned session on a grid, the Sequence Generator transformation may skip values depending on the number of rows in each partition.
Sequence Generator Ports 425
• 458. CURRVAL
CURRVAL is NEXTVAL plus the Increment By value. You typically only connect the CURRVAL port when the NEXTVAL port is already connected to a downstream transformation. When a row enters the transformation connected to the CURRVAL port, the Integration Service passes the last-created NEXTVAL value plus one. For information about the Increment By value, see “Increment By” on page 428.
Figure 19-3 shows connecting CURRVAL and NEXTVAL ports to a target:
Figure 19-3. Connecting CURRVAL and NEXTVAL Ports to a Target
For example, you configure the Sequence Generator transformation as follows: Current Value = 1, Increment By = 1. When you run the workflow, the Integration Service generates the following values for NEXTVAL and CURRVAL:
NEXTVAL  CURRVAL
1        2
2        3
3        4
4        5
5        6
If you connect the CURRVAL port without connecting the NEXTVAL port, the Integration Service passes a constant value for each row. When you connect the CURRVAL port in a Sequence Generator transformation, the Integration Service processes one row in each block. You can optimize performance by connecting only the NEXTVAL port in a mapping.
Note: When you run a partitioned session on a grid, the Sequence Generator transformation may skip values depending on the number of rows in each partition.
426 Chapter 19: Sequence Generator Transformation
• 459. Transformation Properties
The Sequence Generator transformation is unique among all transformations because you cannot add, edit, or delete its default ports (NEXTVAL and CURRVAL).
Table 19-1 lists the Sequence Generator transformation properties you can configure:
Table 19-1. Sequence Generator Transformation Properties
Start Value (Required): Start value of the generated sequence that you want the Integration Service to use if you use the Cycle option. If you select Cycle, the Integration Service cycles back to this value when it reaches the end value. Default is 0.
Increment By (Required): Difference between two consecutive values from the NEXTVAL port. Default is 1.
End Value (Optional): Maximum value the Integration Service generates. If the Integration Service reaches this value during the session and the sequence is not configured to cycle, the session fails.
Current Value (Optional): Current value of the sequence. Enter the value you want the Integration Service to use as the first value in the sequence. If you want to cycle through a series of values, the value must be greater than or equal to the start value and less than the end value. If the Number of Cached Values is set to 0, the Integration Service updates the current value to reflect the last-generated value for the session plus one, and then uses the updated current value as the basis for the next time you run this session. However, if you use the Reset option, the Integration Service resets this value to its original value after each session. Note: If you edit this setting, you reset the sequence to the new setting. If you reset Current Value to 10, and the increment is 1, the next time you use the session, the Integration Service generates a first value of 10.
Cycle (Optional): If selected, the Integration Service cycles through the sequence range. Otherwise, the Integration Service stops the sequence at the configured end value. If disabled, the Integration Service fails the session with overflow errors if it reaches the end value and still has rows to process.
Number of Cached Values (Optional): Number of sequential values the Integration Service caches at a time. Use this option when multiple sessions use the same reusable Sequence Generator at the same time to ensure each session receives unique values. The Integration Service updates the repository as it caches each value. When set to 0, the Integration Service does not cache values. Default value for a standard Sequence Generator is 0. Default value for a reusable Sequence Generator is 1,000.
Transformation Properties 427
• 460. Table 19-1. Sequence Generator Transformation Properties (continued)
Reset (Optional): If selected, the Integration Service generates values based on the original current value for each session. Otherwise, the Integration Service updates the current value to reflect the last-generated value for the session plus one, and then uses the updated current value as the basis for the next session run. This option is disabled for reusable Sequence Generator transformations.
Tracing Level (Optional): Level of detail about the transformation that the Integration Service writes into the session log.
Start Value and Cycle
Use Cycle to generate a repeating sequence, such as numbers 1 through 12 to correspond to the months in a year.
To cycle the Integration Service through a sequence:
1. Enter the lowest value in the sequence that you want the Integration Service to use for the Start Value.
2. Enter the highest value to be used for End Value.
3. Select Cycle.
When the Integration Service reaches the configured end value for the sequence, it wraps around and starts the cycle again, beginning with the configured Start Value.
Increment By
The Integration Service generates a sequence (NEXTVAL) based on the Current Value and Increment By properties in the Sequence Generator transformation. The Current Value property is the value at which the Integration Service starts creating the sequence for each session. Increment By is the integer the Integration Service adds to the existing value to create the new value in the sequence. By default, the Current Value is set to 1, and Increment By is set to 1.
For example, you might create a Sequence Generator transformation with a current value of 1,000 and an increment of 10. If you pass three rows through the mapping, the Integration Service generates the following set of values:
1000
1010
1020
428 Chapter 19: Sequence Generator Transformation
• 461. End Value
End Value is the maximum value you want the Integration Service to generate. If the Integration Service reaches the end value and the Sequence Generator is not configured to cycle through the sequence, the session fails with the following error message:
TT_11009 Sequence Generator Transformation: Overflow error.
You can set the end value to any integer between 1 and 2,147,483,647.
Current Value
The Integration Service uses the current value as the basis for generated values for each session. To indicate which value you want the Integration Service to use the first time it uses the Sequence Generator transformation, you must enter that value as the current value. If you want to use the Sequence Generator transformation to cycle through a series of values, the current value must be greater than or equal to Start Value and less than the end value.
At the end of each session, the Integration Service updates the current value to the last value generated for the session plus one if the Sequence Generator Number of Cached Values is 0. For example, if the Integration Service ends a session with a generated value of 101, it updates the Sequence Generator current value to 102 in the repository. The next time the Sequence Generator is used, the Integration Service uses 102 as the basis for the next generated value. If the Sequence Generator Increment By is 1, when the Integration Service starts another session using the Sequence Generator, the first generated value is 102.
If you have multiple versions of a Sequence Generator transformation, the Integration Service updates the current value across all versions when it runs a session. The Integration Service updates the current value across versions regardless of whether you have checked out the Sequence Generator transformation or the parent mapping. The updated current value overrides an edited current value for a Sequence Generator transformation if the two values are different.
For example, User 1 creates a Sequence Generator transformation and checks it in, saving a current value of 10 to Sequence Generator version 1. Then User 1 checks out the Sequence Generator transformation and enters a new current value of 100 to Sequence Generator version 2. User 1 keeps the Sequence Generator transformation checked out. Meanwhile, User 2 runs a session that uses the Sequence Generator transformation version 1. The Integration Service uses the checked-in value of 10 as the current value when User 2 runs the session. When the session completes, the current value is 150. The Integration Service updates the current value to 150 for version 1 and version 2 of the Sequence Generator transformation even though User 1 has the Sequence Generator transformation checked out.
If you open the mapping after you run the session, the current value displays the last value generated for the session plus one. Since the Integration Service uses the current value to determine the first value for each session, you should edit the current value only when you want to reset the sequence. If you have multiple versions of the Sequence Generator transformation, and you want to reset the sequence, you must check in the mapping or Sequence Generator (reusable) transformation after you modify the current value.
Transformation Properties 429
• 462. Note: If you configure the Sequence Generator to Reset, the Integration Service uses the current value as the basis for the first generated value for each session.
Number of Cached Values
Number of Cached Values determines the number of values the Integration Service caches at one time. When Number of Cached Values is greater than zero, the Integration Service caches the configured number of values and updates the current value each time it caches values. When multiple sessions use the same reusable Sequence Generator transformation at the same time, there might be multiple instances of the Sequence Generator transformation. To avoid generating the same values for each session, reserve a range of sequence values for each session by configuring Number of Cached Values.
Tip: To increase performance when running a session on a grid, increase the number of cached values for the Sequence Generator transformation. This reduces the communication required between the master and worker DTM processes and the repository.
Non-Reusable Sequence Generators
For non-reusable Sequence Generator transformations, Number of Cached Values is set to zero by default, and the Integration Service does not cache values during the session. When the Integration Service does not cache values, it accesses the repository for the current value at the start of a session. The Integration Service then generates values for the sequence. At the end of the session, the Integration Service updates the current value in the repository.
When you set Number of Cached Values greater than zero, the Integration Service caches values during the session. At the start of the session, the Integration Service accesses the repository for the current value, caches the configured number of values, and updates the current value accordingly. If the Integration Service exhausts the cache, it accesses the repository for the next set of values and updates the current value. At the end of the session, the Integration Service discards any remaining values in the cache.
For non-reusable Sequence Generator transformations, setting Number of Cached Values greater than zero can increase the number of times the Integration Service accesses the repository during the session. It also causes sections of skipped values since unused cached values are discarded at the end of each session.
For example, you configure a Sequence Generator transformation as follows: Number of Cached Values = 50, Current Value = 1, Increment By = 1. When the Integration Service starts the session, it caches 50 values for the session and updates the current value to 50 in the repository. The Integration Service uses values 1 to 39 for the session and discards the unused values, 40 to 49. When the Integration Service runs the session again, it checks the repository for the current value, which is 50. It then caches the next 50 values and updates the current value to 100. During the session, it uses values 50 to 98. The values generated for the two sessions are 1 to 39 and 50 to 98.
430 Chapter 19: Sequence Generator Transformation
• 463. Reusable Sequence Generators
When you have a reusable Sequence Generator transformation in several sessions and the sessions run at the same time, use Number of Cached Values to ensure each session receives unique values in the sequence. By default, Number of Cached Values is set to 1000 for reusable Sequence Generators.
When multiple sessions use the same Sequence Generator transformation at the same time, you risk generating the same values for each session. To avoid this, have the Integration Service cache a set number of values for each session by configuring Number of Cached Values.
For example, you configure a reusable Sequence Generator transformation as follows: Number of Cached Values = 50, Current Value = 1, Increment By = 1. Two sessions use the Sequence Generator, and they are scheduled to run at approximately the same time. When the Integration Service starts the first session, it caches 50 values for the session and updates the current value to 50 in the repository. The Integration Service begins using values 1 to 50 in the session. When the Integration Service starts the second session, it checks the repository for the current value, which is 50. It then caches the next 50 values and updates the current value to 100. It then uses values 51 to 100 in the second session. When either session uses all its cached values, the Integration Service caches a new set of values and updates the current value to ensure these values remain unique to the Sequence Generator.
For reusable Sequence Generator transformations, you can reduce Number of Cached Values to minimize discarded values; however, it must be greater than one. When you reduce the Number of Cached Values, you might increase the number of times the Integration Service accesses the repository to cache values during the session.
Reset
If you select Reset for a non-reusable Sequence Generator transformation, the Integration Service generates values based on the original current value each time it starts the session. Otherwise, the Integration Service updates the current value to reflect the last-generated value plus one, and then uses the updated value the next time it uses the Sequence Generator transformation.
For example, you might configure a Sequence Generator transformation to create values from 1 to 1,000 with an increment of 1, and a current value of 1, and choose Reset. During the first session run, the Integration Service generates numbers 1 through 234. The next time (and each subsequent time) the session runs, the Integration Service again generates numbers beginning with the current value of 1.
If you do not select Reset, the Integration Service updates the current value to 235 at the end of the first session run. The next time it uses the Sequence Generator transformation, the first value generated is 235.
Note: Reset is disabled for reusable Sequence Generator transformations.
Transformation Properties 431
• 464. Creating a Sequence Generator Transformation
To use a Sequence Generator transformation in a mapping, add it to the mapping, configure the transformation properties, and then connect NEXTVAL or CURRVAL to one or more transformations.
To create a Sequence Generator transformation:
1. In the Mapping Designer, click Transformation > Create. Select the Sequence Generator transformation. The naming convention for Sequence Generator transformations is SEQ_TransformationName.
2. Enter a name for the Sequence Generator, and click Create. Click Done. The Designer creates the Sequence Generator transformation.
3. Double-click the title bar of the transformation to open the Edit Transformations dialog box.
4. Enter a description for the transformation. This description appears in the Repository Manager, making it easier for you or others to understand what the transformation does.
5. Select the Properties tab. Enter settings. For a list of transformation properties, see Table 19-1 on page 427.
432 Chapter 19: Sequence Generator Transformation
• 465. Note: You cannot override the Sequence Generator transformation properties at the session level. This protects the integrity of the sequence values generated.
6. Click OK.
7. To generate new sequences during a session, connect the NEXTVAL port to at least one transformation in the mapping. Use the NEXTVAL or CURRVAL ports in an expression in other transformations.
8. Click Repository > Save.
Creating a Sequence Generator Transformation 433
  • 466. 434 Chapter 19: Sequence Generator Transformation
  • 467. Chapter 20 Sorter Transformation This chapter includes the following topics: ♦ Overview, 436 ♦ Sorting Data, 437 ♦ Sorter Transformation Properties, 439 ♦ Creating a Sorter Transformation, 443 435
• 468. Overview
Transformation type: Active, Connected
You can sort data with the Sorter transformation. You can sort data in ascending or descending order according to a specified sort key. You can also configure the Sorter transformation for case-sensitive sorting, and specify whether the output rows should be distinct. The Sorter transformation is an active transformation. It must be connected to the data flow.
You can sort data from relational or flat file sources. You can also use the Sorter transformation to sort data passing through an Aggregator transformation configured to use sorted input.
When you create a Sorter transformation in a mapping, you specify one or more ports as a sort key and configure each sort key port to sort in ascending or descending order. You also configure sort criteria the Integration Service applies to all sort key ports and the system resources it allocates to perform the sort operation.
Figure 20-1 shows a simple mapping that uses a Sorter transformation. The mapping passes rows from a sales table containing order information through a Sorter transformation before loading to the target.
Figure 20-1. Sample Mapping with a Sorter Transformation
436 Chapter 20: Sorter Transformation
• 469. Sorting Data
The Sorter transformation contains only input/output ports. All data passing through the Sorter transformation is sorted according to a sort key. The sort key is one or more ports that you want to use as the sort criteria.
You can specify more than one port as part of the sort key. When you specify multiple ports for the sort key, the Integration Service sorts each port sequentially. The order the ports appear in the Ports tab determines the succession of sort operations. The Sorter transformation treats the data passing through each successive sort key port as a secondary sort of the previous port.
At session run time, the Integration Service sorts data according to the sort order specified in the session properties. The sort order determines the sorting criteria for special characters and symbols.
Figure 20-2 shows the Ports tab configuration for the Sorter transformation sorting the data in ascending order by order ID and item ID:
Figure 20-2. Sample Sorter Transformation Ports Configuration
At session run time, the Integration Service passes the following rows into the Sorter transformation:
ORDER_ID  ITEM_ID  QUANTITY  DISCOUNT
45        123456   3         3.04
45        456789   2         12.02
43        000246   6         34.55
41        000468   5         .56
Sorting Data 437
• 470. After sorting the data, the Integration Service passes the following rows out of the Sorter transformation:
ORDER_ID  ITEM_ID  QUANTITY  DISCOUNT
41        000468   5         .56
43        000246   6         34.55
45        123456   3         3.04
45        456789   2         12.02
438 Chapter 20: Sorter Transformation
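In SQL terms, the multi-port sort key behaves like an ORDER BY list: the first port is the primary sort and each later port breaks ties within the previous one. A sketch of the equivalent query for the rows above, assuming they come from a hypothetical relational table named ORDERS:

SELECT ORDER_ID, ITEM_ID, QUANTITY, DISCOUNT
FROM ORDERS
ORDER BY ORDER_ID, ITEM_ID

Unlike pushing the sort to the database, however, the Sorter transformation can sort rows from any source type, including flat files.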
• 471. Sorter Transformation Properties
The Sorter transformation has several properties that specify additional sort criteria. The Integration Service applies these criteria to all sort key ports. The Sorter transformation properties also determine the system resources the Integration Service allocates when it sorts data.
Figure 20-3 shows the Sorter transformation Properties tab:
Figure 20-3. Sorter Transformation Properties
Sorter Cache Size
The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. The Integration Service passes all incoming data into the Sorter transformation before it performs the sort operation. You can configure a numeric value for the Sorter cache, or you can configure the Integration Service to determine the cache size at runtime. If you configure the Integration Service to determine the cache size, you can also configure a maximum amount of memory for the Integration Service to allocate to the cache. If the total configured session cache size is 2 GB (2,147,483,648 bytes) or greater, you must run the session on a 64-bit Integration Service.
Before starting the sort operation, the Integration Service allocates the amount of memory configured for the Sorter cache size. If the Integration Service runs a partitioned session, it allocates the specified amount of Sorter cache memory for each partition. If it cannot allocate enough memory, the Integration Service fails the session. For best performance, configure Sorter cache size with a value less than or equal to the amount of
Sorter Transformation Properties 439
• 472. available physical RAM on the Integration Service machine. Allocate at least 8 MB (8,388,608 bytes) of physical memory to sort data using the Sorter transformation. Sorter cache size is set to 8,388,608 bytes by default.
If the amount of incoming data is greater than the amount of Sorter cache size, the Integration Service temporarily stores data in the Sorter transformation work directory. The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory. If the amount of incoming data is significantly greater than the Sorter cache size, the Integration Service may require much more than twice the amount of disk space available to the work directory.
Use the following formula to determine the size of incoming data:
number_of_input_rows * [(Σ column_size) + 16]
Table 20-1 gives the individual column size values by datatype for Sorter data calculations:
Table 20-1. Column Sizes for Sorter Data Calculations
Binary: precision + 8, rounded to the nearest multiple of 8
Date/Time: 24
Decimal, high precision off (all precision): 16
Decimal, high precision on (precision <= 18): 24
Decimal, high precision on (precision > 18, <= 28): 32
Decimal, high precision on (precision > 28): 16
Decimal, high precision on (negative scale): 16
Double: 16
Real: 16
Integer: 16
Small integer: 16
NString, NText, String, Text: Unicode mode: 2*(precision + 5); ASCII mode: precision + 9
The column sizes include the bytes required for a null indicator. To increase performance for the sort operation, the Integration Service aligns all data for the Sorter transformation memory on an 8-byte boundary. Each Sorter column includes rounding to the nearest multiple of eight.
The Integration Service also writes the row size and amount of memory the Sorter transformation uses to the session log when you configure the Sorter transformation tracing level to Normal. For more information about Sorter transformation tracing levels, see “Tracing Level” on page 441.
440 Chapter 20: Sorter Transformation
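As a worked example of the formula, consider the four-column row used in "Sorting Data" above. Assume, for illustration only, that ORDER_ID and QUANTITY are integer columns, DISCOUNT is a decimal column with high precision off, ITEM_ID is a string with precision 10, and the session runs in Unicode mode:

ORDER_ID  (integer)                         = 16 bytes
ITEM_ID   (string, Unicode): 2*(10 + 5) = 30, rounded up to 32 bytes
QUANTITY  (integer)                         = 16 bytes
DISCOUNT  (decimal, high precision off)     = 16 bytes
Row size: 16 + 32 + 16 + 16 + 16 (per-row overhead) = 96 bytes

For 100,000 input rows, the incoming data is 100,000 * 96 = 9,600,000 bytes, which exceeds the default cache of 8,388,608 bytes. The Integration Service would therefore spill to the work directory and need at least twice the incoming data size, about 19 MB, of disk space there.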
• 473. Case Sensitive
The Case Sensitive property determines whether the Integration Service considers case when sorting data. When you enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.
Work Directory
You must specify a work directory the Integration Service uses to create temporary files while it sorts data. After the Integration Service sorts the data, it deletes the temporary files. You can specify any directory on the Integration Service machine to use as a work directory. By default, the Integration Service uses the value specified for the $PMTempDir process variable. When you partition a session with a Sorter transformation, you can specify a different work directory for each partition in the pipeline. To increase session performance, specify work directories on physically separate disks on the Integration Service system.
Distinct Output Rows
You can configure the Sorter transformation to treat output rows as distinct. If you configure the Sorter transformation for distinct output rows, the Mapping Designer configures all ports as part of the sort key. When the Integration Service runs the session, it discards duplicate rows compared during the sort operation.
Tracing Level
Configure the Sorter transformation tracing level to control the number and type of Sorter error and status messages the Integration Service writes to the session log. At Normal tracing level, the Integration Service writes the size of the row passed to the Sorter transformation and the amount of memory the Sorter transformation allocates for the sort operation. The Integration Service also writes the time and date when it passes the first and last input rows to the Sorter transformation.
If you configure the Sorter transformation tracing level to Verbose Data, the Integration Service writes the time the Sorter transformation finishes passing all data to the next transformation in the pipeline. The Integration Service also writes the time to the session log when the Sorter transformation releases memory resources and removes temporary files from the work directory. For more information about configuring tracing levels for transformations, see “Configuring Tracing Level in Transformations” on page 30.
Sorter Transformation Properties 441
• 474. Null Treated Low
You can configure the way the Sorter transformation treats null values. Enable this property if you want the Integration Service to treat null values as lower than any other value when it performs the sort operation. Disable this option if you want the Integration Service to treat null values as higher than any other value.
Transformation Scope
The transformation scope specifies how the Integration Service applies the transformation logic to incoming data:
♦ Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.
♦ All Input. Applies the transformation logic to all incoming data. When you choose All Input, the Integration Service drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source.
For more information about transformation scope, see “Understanding Commit Points” in the Workflow Administration Guide.
442 Chapter 20: Sorter Transformation
• 475. Creating a Sorter Transformation
To add a Sorter transformation to a mapping, complete the following steps.
To create a Sorter transformation:
1. In the Mapping Designer, click Transformation > Create. Select the Sorter transformation. The naming convention for Sorter transformations is SRT_TransformationName. Enter a description for the transformation. This description appears in the Repository Manager, making it easier to understand what the transformation does.
2. Enter a name for the Sorter and click Create. The Designer creates the Sorter transformation.
3. Click Done.
4. Drag the ports you want to sort into the Sorter transformation. The Designer creates the input/output ports for each port you include.
5. Double-click the title bar of the transformation to open the Edit Transformations dialog box.
6. Select the Ports tab.
7. Select the ports you want to use as the sort key.
8. For each port selected as part of the sort key, specify whether you want the Integration Service to sort data in ascending or descending order.
9. Select the Properties tab. Modify the Sorter transformation properties. For information about Sorter transformation properties, see “Sorter Transformation Properties” on page 439.
10. Select the Metadata Extensions tab. Create or edit metadata extensions for the Sorter transformation. For more information about metadata extensions, see “Metadata Extensions” in the Repository Guide.
11. Click OK.
12. Click Repository > Save to save changes to the mapping.
Creating a Sorter Transformation 443
  • 476. 444 Chapter 20: Sorter Transformation
  • 477. Chapter 21 Source Qualifier Transformation This chapter includes the following topics: ♦ Overview, 446 ♦ Source Qualifier Transformation Properties, 449 ♦ Default Query, 451 ♦ Joining Source Data, 454 ♦ Adding an SQL Query, 458 ♦ Entering a User-Defined Join, 460 ♦ Outer Join Support, 462 ♦ Entering a Source Filter, 470 ♦ Using Sorted Ports, 472 ♦ Select Distinct, 474 ♦ Adding Pre- and Post-Session SQL Commands, 475 ♦ Creating a Source Qualifier Transformation, 476 ♦ Troubleshooting, 478 445
• 478. Overview
Transformation type: Active, Connected
When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the rows that the Integration Service reads when it runs a session.
Use the Source Qualifier transformation to complete the following tasks:
♦ Join data originating from the same source database. You can join two or more tables with primary key-foreign key relationships by linking the sources to one Source Qualifier transformation.
♦ Filter rows when the Integration Service reads source data. If you include a filter condition, the Integration Service adds a WHERE clause to the default query.
♦ Specify an outer join rather than the default inner join. If you include a user-defined join, the Integration Service replaces the join information specified by the metadata in the SQL query.
♦ Specify sorted ports. If you specify a number for sorted ports, the Integration Service adds an ORDER BY clause to the default SQL query.
♦ Select only distinct values from the source. If you choose Select Distinct, the Integration Service adds a SELECT DISTINCT statement to the default SQL query.
♦ Create a custom query to issue a special SELECT statement for the Integration Service to read source data. For example, you might use a custom query to perform aggregate calculations.
Transformation Datatypes
The Source Qualifier transformation displays the transformation datatypes. The transformation datatypes determine how the source database binds data when the Integration Service reads it. Do not alter the datatypes in the Source Qualifier transformation. If the datatypes in the source definition and Source Qualifier transformation do not match, the Designer marks the mapping invalid when you save it.
Target Load Order
You specify a target load order based on the Source Qualifier transformations in a mapping. If you have multiple Source Qualifier transformations connected to multiple targets, you can designate the order in which the Integration Service loads data into the targets.
446 Chapter 21: Source Qualifier Transformation
If one Source Qualifier transformation provides data for multiple targets, you can enable constraint-based loading in a session to have the Integration Service load data based on target table primary and foreign key relationships. For more information, see “Mappings” in the Designer Guide.

Parameters and Variables

You can use parameters and variables in the SQL query, user-defined join, source filter, and pre- and post-session SQL commands of a Source Qualifier transformation. Use any parameter or variable type that you can define in the parameter file. You can enter a parameter or variable within the SQL statement, or you can use a parameter or variable as the SQL query. For example, you can use a session parameter, $ParamMyQuery, as the SQL query, and set $ParamMyQuery to the SQL statement in a parameter file.

The Integration Service first generates an SQL query and replaces each mapping parameter or variable with its start value. Then it runs the query on the source database.

When you use a string mapping parameter or variable in the Source Qualifier transformation, use a string identifier appropriate to the source system. Most databases use a single quotation mark as a string identifier. For example, to use the string parameter $$IPAddress in a source filter for a Microsoft SQL Server database table, enclose the parameter in single quotes as follows: '$$IPAddress'. For more information, see the database documentation.

When you use a datetime mapping parameter or variable, or when you use the system variable $$$SessStartTime, you might need to change the date format to the format used in the source. The Integration Service passes datetime parameters and variables to source systems as strings in the SQL query. The Integration Service converts a datetime parameter or variable to a string based on the source database.

Table 21-1 describes the datetime formats the Integration Service uses for each source system:

Table 21-1. Conversion for Datetime Mapping Parameters and Variables

Source                 Date Format
DB2                    YYYY-MM-DD-HH24:MI:SS
Informix               YYYY-MM-DD HH24:MI:SS
Microsoft SQL Server   MM/DD/YYYY HH24:MI:SS
ODBC                   YYYY-MM-DD HH24:MI:SS
Oracle                 MM/DD/YYYY HH24:MI:SS
Sybase                 MM/DD/YYYY HH24:MI:SS
Teradata               YYYY-MM-DD HH24:MI:SS

Some databases require you to identify datetime values with additional punctuation, such as single quotation marks or database-specific functions. For example, to convert the $$$SessStartTime value for an Oracle source, use the following Oracle function in the SQL override:
to_date('$$$SessStartTime', 'mm/dd/yyyy hh24:mi:ss')

For Informix, use the following Informix function in the SQL override to convert the $$$SessStartTime value:

DATETIME ($$$SessStartTime) YEAR TO SECOND

For more information about SQL overrides, see “Overriding the Default Query” on page 452. For information about database-specific functions, see the database documentation.

Tip: To ensure that the format of a datetime parameter or variable matches the format used by the source, validate the SQL query.

For information about mapping parameters and variables, see “Mapping Parameters and Variables” in the Designer Guide.
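To see the string-identifier and date-format rules working together, consider a hedged sketch of a source filter for an Oracle source that compares a datetime column to the session start time. The ORDERS table and DATE_ENTERED column are hypothetical; the date format matches the Oracle row in Table 21-1:

    -- Hypothetical source filter for an Oracle source. $$$SessStartTime
    -- is passed as a string, so it is enclosed in single quotes and
    -- converted with to_date before the comparison.
    ORDERS.DATE_ENTERED >= to_date('$$$SessStartTime', 'mm/dd/yyyy hh24:mi:ss')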
Source Qualifier Transformation Properties

Configure the Source Qualifier transformation properties on the Properties tab of the Edit Transformations dialog box.

Table 21-2 describes the Source Qualifier transformation properties:

Table 21-2. Source Qualifier Transformation Properties

SQL Query
Defines a custom query that replaces the default query the Integration Service uses to read data from sources represented in this Source Qualifier transformation. For more information, see “Adding an SQL Query” on page 458. A custom query overrides entries for a custom join or a source filter.

User-Defined Join
Specifies the condition used to join data from multiple sources represented in the same Source Qualifier transformation. For more information, see “Entering a User-Defined Join” on page 460.

Source Filter
Specifies the filter condition the Integration Service applies when querying rows. For more information, see “Entering a Source Filter” on page 470.

Number of Sorted Ports
Indicates the number of columns used when sorting rows queried from relational sources. If you select this option, the Integration Service adds an ORDER BY to the default query when it reads source rows. The ORDER BY includes the number of ports specified, starting from the top of the transformation. When selected, the database sort order must match the session sort order.

Tracing Level
Sets the amount of detail included in the session log when you run a session containing this transformation. For more information, see “Configuring Tracing Level in Transformations” on page 30.

Select Distinct
Specifies if you want to select only unique rows. The Integration Service includes a SELECT DISTINCT statement if you choose this option.
Table 21-2. Source Qualifier Transformation Properties (continued)

Pre-SQL
Pre-session SQL commands to run against the source database before the Integration Service reads the source. For more information, see “Adding Pre- and Post-Session SQL Commands” on page 475.

Post-SQL
Post-session SQL commands to run against the source database after the Integration Service writes to the target. For more information, see “Adding Pre- and Post-Session SQL Commands” on page 475.

Output is Deterministic
Indicates that the relational source or transformation output does not change between session runs when the input data is consistent between runs. When you configure this property, the Integration Service does not stage source data for recovery if transformations in the pipeline always produce repeatable data.

Output is Repeatable
Indicates that the relational source or transformation output is in the same order between session runs when the order of the input data is consistent. When output is deterministic and output is repeatable, the Integration Service does not stage source data for recovery.
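To make a few of these properties concrete, the following sketches show their effect on generated and session SQL. All table and column names are hypothetical.

With Select Distinct enabled and Number of Sorted Ports set to 2, the default query for a transformation whose top two ports are CUSTOMER_ID and COMPANY might look like this:

    -- Sketch: SELECT DISTINCT is added, and the ORDER BY is built
    -- from the two ports at the top of the transformation.
    SELECT DISTINCT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME
    FROM CUSTOMERS
    ORDER BY CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY

Pre-SQL and Post-SQL might bracket the read with bookkeeping commands against a hypothetical source-side control table:

    -- Sketch of a Pre-SQL command, run before the source is read:
    UPDATE ETL_CONTROL SET EXTRACT_START = CURRENT_TIMESTAMP WHERE JOB_NAME = 'ORDERS_LOAD'
    -- Sketch of a Post-SQL command, run after the target is written:
    UPDATE ETL_CONTROL SET EXTRACT_END = CURRENT_TIMESTAMP WHERE JOB_NAME = 'ORDERS_LOAD'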
Default Query

For relational sources, the Integration Service generates a query for each Source Qualifier transformation when it runs a session. The default query is a SELECT statement for each source column used in the mapping. In other words, the Integration Service reads only the columns that are connected to another transformation.

Figure 21-1 shows a single source definition connected to a Source Qualifier transformation:

Figure 21-1. Source Definition Connected to a Source Qualifier Transformation

Although there are many columns in the source definition, only three columns are connected to another transformation. In this case, the Integration Service generates a default query that selects only those three columns:

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME
FROM CUSTOMERS

If any table name or column name contains a database reserved word, you can create and maintain a file, reswords.txt, containing reserved words. When the Integration Service initializes a session, it searches for reswords.txt in the Integration Service installation directory. If the file exists, the Integration Service places quotes around matching reserved words when it executes SQL against the database. If you override the SQL, you must enclose any reserved word in quotes. For more information about the reserved words file, see “Working with Targets” in the Workflow Administration Guide.

When generating the default query, the Designer delimits table and field names containing the following characters with double quotes:

/ + - = ~ ` ! % ^ & * ( ) [ ] { } ' ; ? , < > | <space>
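As an illustration of quoting reserved words in an override, suppose a hypothetical source table named ORDER contains a column named LEVEL, both reserved words in many databases and both listed in reswords.txt. An SQL override would need to enclose them in quotes; a sketch only, since reserved words vary by database:

    -- Hypothetical override with reserved words enclosed in quotes:
    SELECT "ORDER".ORDER_ID, "ORDER"."LEVEL"
    FROM "ORDER"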
Viewing the Default Query

You can view the default query in the Source Qualifier transformation.

To view the default query:

1. From the Properties tab, select SQL Query.
   The SQL Editor appears. The SQL Editor displays the default query the Integration Service uses to select source data.
2. Click Generate SQL.
3. Click Cancel to exit.

Note: If you do not cancel the SQL query, the Integration Service overrides the default query with the custom SQL query.

You do not need to connect to the source database to view the default query. You connect to the source database only when you enter an SQL query that overrides the default query.

Tip: You must connect the columns in the Source Qualifier transformation to another transformation or target before you can generate the default query.

Overriding the Default Query

You can alter or override the default query in the Source Qualifier transformation by changing the default settings of the transformation properties. Do not change the list of selected ports or the order in which they appear in the query. This list must match the connected transformation output ports.

When you edit transformation properties, the Source Qualifier transformation includes these settings in the default query. However, if you enter an SQL query, the Integration Service uses
only the defined SQL statement. The SQL Query overrides the User-Defined Join, Source Filter, Number of Sorted Ports, and Select Distinct settings in the Source Qualifier transformation.

Note: When you override the default SQL query, you must enclose all database reserved words in quotes.
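To make the override behavior concrete, the following is a hedged sketch of the kind of custom query mentioned in the Overview, one that performs an aggregate calculation. The tables are hypothetical, and the selected columns must still match the connected output ports in number and order (here, two ports):

    -- Hypothetical SQL override that aggregates order totals per customer.
    -- It replaces the default query and any join, filter, sorted-ports,
    -- or distinct settings configured on the transformation.
    SELECT ORDERS.CUSTOMER_ID, SUM(ORDER_ITEMS.PRICE * ORDER_ITEMS.QUANTITY)
    FROM ORDERS, ORDER_ITEMS
    WHERE ORDERS.ORDER_ID = ORDER_ITEMS.ORDER_ID
    GROUP BY ORDERS.CUSTOMER_ID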
Joining Source Data

Use one Source Qualifier transformation to join data from multiple relational tables. These tables must be accessible from the same instance or database server. When a mapping uses related relational sources, you can join both sources in one Source Qualifier transformation. During the session, the source database performs the join before passing data to the Integration Service. This can increase performance when source tables are indexed.

Tip: Use the Joiner transformation for heterogeneous sources and to join flat files.

Default Join

When you join related tables in one Source Qualifier transformation, the Integration Service joins the tables based on the related keys in each table. This default join is an inner equijoin, using the following syntax in the WHERE clause:

Source1.column_name = Source2.column_name

The columns in the default join must have:
♦ A primary key-foreign key relationship
♦ Matching datatypes

For example, you might want to see all the orders for the month, including order number, order amount, and customer name. The ORDERS table includes the order number and amount of each order, but not the customer name. To include the customer name, you need to join the ORDERS and CUSTOMERS tables. Both tables include a customer ID, so you can join the tables in one Source Qualifier transformation.
Figure 21-2 shows joining two tables with one Source Qualifier transformation:

Figure 21-2. Joining Two Tables with One Source Qualifier Transformation

When you include multiple tables, the Integration Service generates a SELECT statement for all columns used in the mapping. In this case, the SELECT statement looks similar to the following statement:

SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME,
CUSTOMERS.LAST_NAME, CUSTOMERS.ADDRESS1, CUSTOMERS.ADDRESS2,
CUSTOMERS.CITY, CUSTOMERS.STATE, CUSTOMERS.POSTAL_CODE,
CUSTOMERS.PHONE, CUSTOMERS.EMAIL, ORDERS.ORDER_ID,
ORDERS.DATE_ENTERED, ORDERS.DATE_PROMISED, ORDERS.DATE_SHIPPED,
ORDERS.EMPLOYEE_ID, ORDERS.CUSTOMER_ID, ORDERS.SALES_TAX_RATE,
ORDERS.STORE_ID
FROM CUSTOMERS, ORDERS
WHERE CUSTOMERS.CUSTOMER_ID=ORDERS.CUSTOMER_ID

The WHERE clause is an equijoin that includes the CUSTOMER_ID columns from the ORDERS and CUSTOMERS tables.

Custom Joins

If you need to override the default join, you can enter the contents of the WHERE clause that specifies the join in the custom query. If the query performs an outer join, the Integration Service may insert the join syntax in the WHERE clause or the FROM clause, depending on the database syntax.

You might need to override the default join under the following circumstances:
♦ Columns do not have a primary key-foreign key relationship.
♦ The datatypes of columns used for the join do not match.
♦ You want to specify a different type of join, such as an outer join.

For more information about custom joins and queries, see “Entering a User-Defined Join” on page 460.

Heterogeneous Joins

To perform a heterogeneous join, use the Joiner transformation. Use the Joiner transformation when you need to join the following types of sources:
♦ Join data from different source databases
♦ Join data from different flat file systems
♦ Join relational sources and flat files

For more information, see “Joiner Transformation” on page 283.

Creating Key Relationships

You can join tables in the Source Qualifier transformation if the tables have primary key-foreign key relationships. However, you can create primary key-foreign key relationships in the Source Analyzer by linking matching columns in different tables. These columns do not have to be keys, but they should be included in the index for each table.

Tip: If the source table has more than 1,000 rows, you can increase performance by indexing the primary key-foreign keys. If the source table has fewer than 1,000 rows, you might decrease performance if you index the primary key-foreign keys.

For example, the corporate office for a retail chain wants to extract payments received based on orders. The ORDERS and PAYMENTS tables do not share primary and foreign keys. Both tables, however, include a DATE_SHIPPED column. You can create a primary key-foreign key relationship in the metadata in the Source Analyzer.

Note that the two tables are not linked. Therefore, the Designer does not recognize the relationship on the DATE_SHIPPED columns.

You create a relationship between the ORDERS and PAYMENTS tables by linking the DATE_SHIPPED columns. The Designer adds primary and foreign keys to the DATE_SHIPPED columns in the ORDERS and PAYMENTS table definitions.
Figure 21-3 shows a relationship between two tables:

Figure 21-3. Creating a Relationship Between Two Tables

If you do not connect the columns, the Designer does not recognize the relationships.

The primary key-foreign key relationships exist in the metadata only. You do not need to generate SQL or alter the source tables.

Once the key relationships exist, use a Source Qualifier transformation to join the two tables. The default join is based on DATE_SHIPPED.
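Once the DATE_SHIPPED columns are linked, the default query joins on them. A hedged sketch of the generated SQL, assuming only three hypothetical columns are connected downstream:

    -- Sketch of the default query after the metadata-only keys exist.
    -- Only columns connected to another transformation appear in the list.
    SELECT ORDERS.ORDER_ID, PAYMENTS.PAYMENT_ID, PAYMENTS.AMOUNT
    FROM ORDERS, PAYMENTS
    WHERE ORDERS.DATE_SHIPPED = PAYMENTS.DATE_SHIPPED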
Adding an SQL Query

The Source Qualifier transformation provides the SQL Query option to override the default query. You can enter an SQL statement supported by the source database. Before entering the query, connect all the input and output ports you want to use in the mapping.

When you edit the SQL Query, you can generate and edit the default query. When the Designer generates the default query, it incorporates all other configured options, such as a