Legacy to Talend: Conversion Plan
Updated: Feb 10, 2020
Authored by: TG Marketing
Introduction
At TESCHGlobal, we have been Gold Partners to Talend for over 4 years and have taken part in Talend implementations of all types and sizes. We have developed a strategic, repeatable approach to helping clients convert to Talend from Informatica, DataStage, SSIS, and Ab Initio.
A few patterns emerge during a data management migration, and we use these patterns to help clients optimize and refactor around their architectural priorities, such as high availability, resource management, maintainability, and scalability.
We can help with strategies to optimize resources both on-premises and in the cloud. We will implement monitoring and management capabilities so that capacity can be provisioned to match load.
In some cases migrations can be fairly straightforward with a one-to-one port approach, but based on discovered pain points and opportunities we push our clients to continuously improve their environments.
TALEND AT ANY SCALE
The following patterns can be used when converting from proprietary legacy data integration tools to Talend and open source. The goal in Talend is to create modularized, horizontally scalable jobs that can be run in multiple ways. There are misconceptions about whether Talend can handle large loads as well as the legacy monolithic providers. The patterns below can match and even beat legacy performance and manageability.
Use YARN to manage and run jobs in Hadoop or Spark at Big Data scale
Use the Talend scheduler and virtual servers to create a bank of servers with no data tax
Use Talend data service REST APIs fronted by a load balancer
Use a competing consumer pattern with queues. This allows the same job to be deployed across a bank of resources. Each deployed job grabs the next available context or partition of work from the queue and processes it. This pattern allows both load balancing and throttling of work.
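The competing consumer pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Talend code: the in-process queue stands in for a real message broker, the threads stand in for copies of the same job deployed across servers, and the names run_partition, worker, and process_all are hypothetical.

```python
import queue
import threading


def run_partition(partition: str) -> str:
    """Hypothetical stand-in for one job run against a single partition of work."""
    return f"processed {partition}"


def worker(name: str, work: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Every worker runs this same code and competes for the next item on the queue."""
    while True:
        try:
            partition = work.get_nowait()  # grab the next available partition
        except queue.Empty:
            return  # nothing left to do, so this consumer stops
        outcome = run_partition(partition)
        with lock:
            results.append((name, partition, outcome))


def process_all(partitions: list, worker_count: int) -> list:
    """Load the queue, then start worker_count identical consumers against it."""
    work = queue.Queue()
    for partition in partitions:
        work.put(partition)

    results = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(f"worker-{i}", work, results, lock))
        for i in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    done = process_all([f"partition-{n}" for n in range(10)], worker_count=3)
    print(len(done), "partitions processed")
```

In this sketch, load balancing falls out of the queue itself, since a faster consumer simply pulls more partitions, and throttling is controlled by worker_count, which caps how many partitions are in flight at once.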
METADATA DRIVEN VS VISUAL DESIGN
We can help you to leverage the correct mix of visual design vs the operationalization of metadata.
Define the correct mix of data driven vs drag and drop design
Maximize code reuse
Implement best practices for creating maintainable jobs
Use metadata driven processes for patterned activities such as data ingestion
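As a rough illustration of a metadata driven process for data ingestion, the Python sketch below drives one generic routine from a list of table descriptions instead of building one visually designed job per table. The IngestSpec fields, the table names, and the function names are all hypothetical, and the sketch only builds the extract statements; it does not connect to any system or use any Talend API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestSpec:
    """One row of ingestion metadata (hypothetical fields)."""
    source_table: str
    target_table: str
    key_column: str
    load_type: str  # "full" or "incremental"


def build_extract_sql(spec: IngestSpec, last_value=None):
    """Return (sql, params) for one table, based purely on its metadata."""
    if spec.load_type == "full":
        return f"SELECT * FROM {spec.source_table}", ()
    if spec.load_type == "incremental":
        if last_value is None:
            raise ValueError(f"incremental load of {spec.source_table} needs a last value")
        # The watermark is passed as a bind parameter rather than inlined.
        return f"SELECT * FROM {spec.source_table} WHERE {spec.key_column} > ?", (last_value,)
    raise ValueError(f"unknown load type: {spec.load_type}")


def plan_ingestion(specs, watermarks):
    """Apply the same generic logic to every table described in the metadata."""
    return [
        (spec.target_table, *build_extract_sql(spec, watermarks.get(spec.source_table)))
        for spec in specs
    ]


if __name__ == "__main__":
    specs = [
        IngestSpec("crm.customers", "stage.customers", "id", "full"),
        IngestSpec("erp.orders", "stage.orders", "updated_at", "incremental"),
    ]
    for target, sql, params in plan_ingestion(specs, {"erp.orders": "2020-01-01"}):
        print(target, sql, params)
```

Adding a new table to the ingestion then means adding a row of metadata rather than designing a new job, which is where the code reuse comes from.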
What we offer is a Professional Services Full Lifecycle Delivery. It is our goal to take clients from assessment to management and operations.
Our Approach

Migration Life Cycle
Billing options (Retainer of work hours vs. set of estimated)
Delivery vs. Knowledge Transfer
DISCOVER & ALIGN
Discover Business and Technical Drivers and Priorities
Identify stakeholders
Define Scope
Discover high level objectives
Discover and align with the Enterprise Architecture's current and future state
ASSESS & REFINE
- Identify the backlog and align with business & technical priorities
- Estimate resources and timeline for backlog
- Gather requirements and reverse engineer existing legacy jobs
- Develop detailed integration roadmap
- Define integrations approach with guidance from architecture
- Identify patterns and best approaches for the port from legacy systems to Talend
Security
Parallelization
Identify the preferred run time based on best practices, cost, and enterprise standards
Job Composition and Modularization
Runtime: Talend Jobs, ELT, MPP, MapReduce, or Spark
- Solution Architecture
Design Patterns
Best Practices
DELIVER
- Identify integration requirements
Connections
Contexts
Schemas
Business & routing rules
- Design integrations and unit test
- Architectural Oversight
- Project oversight
Planning
Execution
Review
OPERATE & MANAGE
- Continuous Integration
Build
Test
Publish
Deploy
Operationalize
- Infrastructure
Installation
Configuration
Tuning
Administration
Operations
Communicate technical and operational requirements
Monitor performance and error logs
Maintain integrations
CONTAIN & RETIRE LEGACY JOBS
Set the deprecation date and communicate plan to service consumers
Develop a plan for migrating consuming applications off legacy systems
Place the integrations in containment
Retire the integration
OVERARCHING CONCERNS
The integration lifecycle enables the growth and maturation of integrations from initial release through retirement
Best practice data management principles, frameworks and methodologies are used throughout the lifecycle
The lifecycle is applicable to multiple integration patterns and supports the addition of technologies & practices
Have a targeted and governed approach to where data is staged, processed and stored based on prioritized architectural attributes.
If you are looking to explore the use of Talend, please reach out to sales@teschglobal.com