The DataOps – Page 8 – StorageNewsBox

Transformer for Spark: Multitable Consumer

storagenewsbox — February 9, 2023 add comment

Newcomers to StreamSets might not know that we have four engines under the Control Hub hood: Data Collector, Transformer for Spark, Transformer for Snowflake, and Mainframe Collector. This fact... Read more »

Delta Lake Architecture: A Bridge Between Data Lakes & Data Warehouses

storagenewsbox — February 7, 2023 add comment

Data warehouses and data lakes are the most common central data repositories employed by most data-driven organizations today, each with its own strengths and tradeoffs. For one, while data... Read more »

Using the StreamSets Python SDK To Create Reusable StreamSets Pipelines (S3 to Redshift Example)

storagenewsbox — February 2, 2023 add comment

Many StreamSets Data Collector customers are now migrating their Hadoop ingestion pipelines to cloud platforms like AWS and they want to take full advantage of the AWS native services... Read more »

Extend StreamSets Integration With Source Systems Using Groovy

storagenewsbox — February 1, 2023 add comment

StreamSets Data Collector (SDC) supports 69 sources, including relational and no-SQL databases, on-prem and cloud file systems and a handful of messaging applications (documentation). Yet, occasionally, customers ask if... Read more »

3 Ways To Keep Up With Constant Change

storagenewsbox — January 26, 2023 add comment

The business climate today feels a bit like a battleground, and everyone’s feeling the pressure. A recession looms, competition is fierce, ongoing supply chain issues wreak havoc, and customer... Read more »

4 Ways Data Federation Tools Will Let You Down

storagenewsbox — January 25, 2023 add comment

Data federation tools are often touted for their ability to unify and query data in a variety of sources and formats using virtualization. The technology, the theory goes, provides... Read more »

Send Kafka Messages To Amazon S3

storagenewsbox — January 20, 2023 add comment

In this post, we will take a look at best practices for integrating StreamSets Data Collector Engine (SDC), a fast data ingestion engine, with Kafka. Then, we’ll dive deep into... Read more »

Take Control of the Data “Wild West” — And Empower Your LOB To Boot

storagenewsbox — January 19, 2023 add comment

Raise your hand if you’ve ever had a line of business user go rogue and create a dataset without telling you. Don’t be shy… You’re in good company. In... Read more »

Cloud Data Migration – Knowing When, Why, and How To Move Your Data

storagenewsbox — January 19, 2023 add comment

Cloud data migration is on the rise, with cloud adoption expected to nearly double in the next five years. It’s no surprise that cloud data migration is increasing, as... Read more »

Reverse ETL to Marketo: A Real-Life Example

storagenewsbox — January 19, 2023 add comment

Standing for Extract, Transform, and Load, the acronym ETL describes the process of extracting data from a target, transforming it, and sending it on to load into a destination.... Read more »