The most essential operation when combining data from multiple sources is the Join operation. Used to combine rows from different tables, the Join operation relies on matching values in specified fields or columns. Real-world data originating in separate systems often contains field values that do not match exactly even though they refer to the same object. These differences can arise from system differences, different naming conventions, or weak compliance with standards. Data engineers building data models that draw on disparate sources must spend significant effort wrangling the data to account for such differences. This can entail complicated manipulations to achieve joins based on “near” matches rather than exact ones.
The new Advanced Join transformation in Element Unify enables users to quickly combine data from various sources based on matching multiple relevant data fields and using matching approaches including “fuzzy” and “contains” matching.
Advanced Join speeds the data wrangling effort, giving the modeler a convenient and flexible method for performing useful and tighter joins. Other benefits include the reduction in record duplication and the ability for customers to deploy terminology standards for common assets across their business.
The Advanced Join transformation supports three joining methods: exact, fuzzy, and contains.
Figure 1: Advanced Join Dialog Box
Using Advanced Join, the engineer can easily refine a join in successive steps, making decisions based on the amount of data matched at a given similarity level. Furthermore, the Advanced Join transformation allows the engineer to create joins based on multiple columns and to use different joining methods for different columns as needed, e.g. ExactJoin (Column A, Column B) and FuzzyJoin (Column C, Column D) and ContainsJoin (Column D, Column E).
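To make the idea of mixing joining methods across columns concrete, the following is a minimal Python sketch of the general technique. It is not Element Unify's implementation or API: the function names, the sample column names and rows, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def exact_match(left, right):
    # Exact join: values must be identical.
    return left == right


def contains_match(left, right):
    # Contains join: the right value must contain the left value (case-insensitive).
    return left.lower() in right.lower()


def fuzzy_match(threshold):
    # Fuzzy join: similarity ratio must reach the threshold (0.0 to 1.0).
    # SequenceMatcher is an assumed stand-in for whatever measure a real tool uses.
    def matcher(left, right):
        score = SequenceMatcher(None, left.lower(), right.lower()).ratio()
        return score >= threshold
    return matcher


def advanced_join(left_rows, right_rows, rules):
    # rules: list of (left_column, right_column, matcher); all rules must hold.
    joined = []
    for left in left_rows:
        for right in right_rows:
            if all(match(left[lc], right[rc]) for lc, rc, match in rules):
                joined.append({**left, **right})
    return joined


def build_rules(threshold):
    # Hypothetical rule set: exact on site, fuzzy on tag, contains on line.
    return [
        ("site", "plant", exact_match),
        ("tag", "asset_name", fuzzy_match(threshold)),
        ("line", "description", contains_match),
    ]


# Hypothetical sample data from two systems describing the same assets.
sensors = [
    {"site": "Plant A", "tag": "Pump-101", "line": "L1"},
    {"site": "Plant A", "tag": "Valve-7", "line": "L2"},
]
assets = [
    {"plant": "Plant A", "asset_name": "PUMP 101", "description": "Feed pump on L1"},
    {"plant": "Plant A", "asset_name": "Valve-70", "description": "Bypass valve on L3"},
    {"plant": "Plant B", "asset_name": "Pump-101", "description": "Spare pump on L1"},
]

# Successive refinement: see how many rows match at each similarity level.
for threshold in (0.9, 0.8):
    matches = advanced_join(sensors, assets, build_rules(threshold))
    print(threshold, len(matches))
```

Looping over thresholds mirrors the successive refinement described above: the engineer can compare how many rows match at each similarity level before settling on one. The nested loop compares every pair of rows, which is fine for a small illustration but would need indexing or blocking on larger tables.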
Questions? Please contact us.