Initial Load from Tandem (HP-UX) to AWS Kafka
I was working with a customer where we needed to move data from a Tandem (HP-UX Guardian) system up to an AWS EC2 platform, with the data eventually ending up in an AWS MSK cluster (Kafka). The concept is pretty straightforward; however, I will say that putting it into practice proved to be a challenge. The biggest part of this challenge was deciding which initial load process should be used. Oracle GoldenGate is the best replication tool on the market, but the one thing it lags in is its approach to the initial load.
The initial load process for Oracle GoldenGate (both Oracle and heterogeneous) comes with many different options. The documentation for initial loads has changed over the years, as have the approaches. With the latest release of Oracle GoldenGate (21c), this hasn’t gotten any better. But if you want a reference point, you can review the steps in Chapter 11 of Using Oracle GoldenGate with Oracle Database (here).
To perform this initial load from a Tandem system to Kafka, we used two different binary sets. In this case, we were using the following:
- Oracle GoldenGate 12c (12.2.0.1)(Classic)
- Oracle GoldenGate 21c (21.5.0.0)(Microservices)
As you may have noticed, the implementation we were working with is a Classic to Microservices architecture. Oracle GoldenGate (Microservices) is the latest release of Oracle GoldenGate and the direction that Oracle is pushing for data integration strategies.
Below you see the basic concept of this architecture. The Classic side of the architecture is straightforward when it comes to Oracle GoldenGate. When the architecture transitions to the EC2 side, Oracle GoldenGate (Microservices) has a few more moving parts that need to be managed. For the purposes of the initial load, the only service that is needed is the Receiver Service (port 16003); more on this shortly.
In the diagram above, we are only going to be discussing the items in red. These items represent the initial load process and how it was accomplished between Oracle GoldenGate (Classic) running on a Tandem (HP-UX Non-Stop) and Oracle GoldenGate (Microservices) loading the data into Kafka. The “direct load” initial load process was followed for this configuration (this approach is not covered in the 21c docs…just an FYI).
Tandem (HP-UX Non-Stop):
The first thing that needs to be done is to configure the initial load Extract on the Tandem side. The following parameter file was used:
EXTRACT eil
RMTHOST <host/IP address>, MGRPORT 16003
RMTFILE <rmtfile>
TABLE *.*.*;
This initial load extract looks pretty standard. We are telling Oracle GoldenGate (Classic) to read all the data from the tables matched by the TABLE line, essentially doing a “SELECT *” to pull all the data, then move that data across the network to the remote server and begin writing it to the remote file. Oracle GoldenGate (Microservices) begins writing the remote file, but then immediately throws a “broken pipe” error. This behavior caused a lot of confusion. After opening an SR and talking with Oracle resources, it was noted that the Tandem default settings for the TCP/IP buffer needed to be changed.
Note: By default, Tandem sets its TCP/IP buffer to 64K.
Apparently, using the default settings on the Tandem doesn’t work here. To work around the “broken pipe” issue, we had to set TCPFlushBytes and TCPBufSize to less than 28K. This resulted in our extract parameter file being changed to:
EXTRACT eil
TCPFlushBytes 27000
TCPBufSize 27000
RMTHOST <host/IP address>, MGRPORT 16003
RMTFILE <rmtfile>
TABLE *.*.*;
By adding the TCP parameters, the extract in Oracle GoldenGate (Classic) is able to successfully connect to the Oracle GoldenGate (Microservices) Receiver Service on port 16003 and write data to the remote file. Shrinking TCPFlushBytes and TCPBufSize below 28K avoids a known limitation with Oracle GoldenGate and the usage of RMTTASK and RMTFILE, since they both make the same calls.
To build the extract within Oracle GoldenGate (Classic), the following command was used:
GGSCI> add extract eil, SOURCEISTABLE
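Once the initial load extract is registered, it can be started and checked from GGSCI as well. A minimal sketch using standard GGSCI commands; a SOURCEISTABLE extract runs once and stops on its own when the table scan finishes, and the exact status and report output will vary:
GGSCI> start extract eil
GGSCI> info extract eil, tasks
GGSCI> view report eil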
EC2 Instance/GoldenGate Microservices
With the RMTFILE being written successfully to the AWS EC2 platform, an initial load replicat can be built to apply the bulk data to Kafka. The parameter file for the replicat is:
REPLICAT ril
TARGETDB LIBFILE libggjava.so SET property=<location for Kafka properties file>
SOURCEDEFS <location for source def file>
MAP *.*.*, TARGET *.*.*;
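The SOURCEDEFS file referenced above describes the source table layouts to the replicat. It is typically generated on the source side with GoldenGate’s DEFGEN utility and then copied over to the EC2 host. A minimal sketch of a DEFGEN parameter file is below; treat it as illustrative only, since the login requirements and table naming on the NonStop side will differ:
-- DEFGEN parameter file (illustrative)
DEFSFILE <location for source def file>, PURGE
TABLE *.*.*;
Running DEFGEN against this parameter file produces the definitions file, which is then transferred to the location referenced by SOURCEDEFS in the replicat parameter file.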
Now, the exact settings for configuring the connection to Kafka are contained in two properties files. These files establish the connection and control what format the data is delivered in. The first of these is the handler properties file, the one referenced by the SET property= clause in the replicat parameter file. It sets up the Kafka Handler and any specific items that the handler needs. The example that we used is similar to the following:
gg.handlerlist=kafkahandler
#The handler properties
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.kafkaProducerConfigFile=kafka.properties
gg.handler.kafkahandler.topicMappingTemplate=${toLowerCase[${tableName}]}
gg.handler.kafkahandler.keyMappingTemplate=${toLowerCase[${tableName}]}
gg.handler.kafkahandler.mode=op
gg.handler.kafkahandler.format=json
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE
gg.log=log4j
gg.log.level=INFO
gg.report.time=30sec
gg.classpath=/app/orabd/opt/DependencyDownloader/dependencies/kafka_2.7.1/*
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
username="<user>" \
password="<password>";
security.protocol=SASL_SSL
Next, we defined the second properties file, the kafka.properties file referenced by kafkaProducerConfigFile above, which defines the connection to the Kafka brokers along with the serializers for the data and a few performance tuning items. Our file looked similar to the following:
# address/port of the Kafka broker
bootstrap.servers=<kafka broker servers and ports>
#JSON Converter Settings
key.converter.schemas.enable=false
value.converter.schemas.enable=false
value.serializer = org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer = org.apache.kafka.common.serialization.ByteArraySerializer
#Adjust for performance
buffer.memory=33554432
batch.size=2048
linger.ms=500
After setting up the properties files that Oracle GoldenGate for Big Data will use to connect to Kafka, we needed to add the replicat to the architecture. This can be done either from the AdminClient or from the HTML5 web page served by the Administration Service. For command-line compatibility, the following steps are done through the AdminClient:
AdminClient> add replicat ril, exttrail <RMTFILE>
AdminClient> start replicat ril
At this stage, after starting the replicat (RIL), we are reading the RMTFILE and performing a “direct load” initial load from Tandem to Kafka.
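To confirm the replicat is actually processing the RMTFILE, its status and statistics can be checked from the AdminClient as well. A minimal sketch; the connection string, deployment name, and credentials below are placeholders for whatever your Administration Service is configured with:
AdminClient> connect https://<ec2-host>:<admin service port> deployment <deployment name> as <ogg admin user>
AdminClient> info replicat ril
AdminClient> stats replicat ril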
With the “direct load” working, we were pushing approximately 5.4 million records in ~45 minutes. This was a single-table load. Depending on the number of tables that need to be loaded, you will need to scale this approach with either more RMTFILEs or multiple replicats, as sketched below.
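As an illustration of that scaling, the single wildcarded TABLE statement could be split across two (or more) initial load extracts, each writing its own RMTFILE; the table groupings here are placeholders:
EXTRACT eil1
TCPFlushBytes 27000
TCPBufSize 27000
RMTHOST <host/IP address>, MGRPORT 16003
RMTFILE <rmtfile1>
TABLE <first group of tables>;

EXTRACT eil2
TCPFlushBytes 27000
TCPBufSize 27000
RMTHOST <host/IP address>, MGRPORT 16003
RMTFILE <rmtfile2>
TABLE <second group of tables>;
Each RMTFILE then gets its own replicat (for example, RIL1 and RIL2) added and started through the AdminClient, allowing the bulk loads to run in parallel.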
Enjoy and happy replicating!