Starburst Galaxy enhancements accelerate data querying

Starburst launches foundational Data Products capabilities for its managed service, Starburst Galaxy.

New data discovery, schema discovery, and data privileges capabilities simplify and streamline the traditional Extract, Transform, Load (ETL) process for curating Data Products, accelerating data querying, access, and analytics across the organization. The foundational Data Products capabilities are now available in public preview.

Data volumes and use are exploding across industries. Connected Internet of Things (IoT) devices are expected to generate almost 80 zettabytes (ZB) of data by 2025, and organizations are rapidly inheriting data through M&A and newly developed applications. Regardless of intention, nearly every modern enterprise is or will be driven to the cloud as a result.

While data lakes and data warehouses have been effective in solving many data management challenges, the cloud is still becoming everyone’s reality and data use is continuing to explode. In turn, these data lakes and warehouses can quickly become data swamps – murky or cluttered with disorganized data that presents significant challenges around accessibility and the ability to leverage the data for actionable insights.

Delivered as a managed service through Starburst Galaxy, the new discoverability features address these challenges, reduce time to discovery from hours to seconds, and lay the foundation for self-service Data Product curation, regardless of a user's technical expertise. New capabilities include:

  • Data discovery enables data users to easily search and understand what data they have, where it lives, and where it came from. Metadata is automatically populated with query history and context, providing key insights into how data is being used.
  • Schema discovery takes this a step further by enabling the discovery of not just existing datasets across sources and clouds, but also net new datasets, no matter what form they’re in. This takes the “Transform” out of “ELT,” supporting a more simplified process where data engineers loading data don’t need to consider the schemas beforehand.
  • Granular Access Control enables data administrators to clearly see and understand who has access to what data, and how it’s being used, in the context of the data itself. This means data administrators can monitor and change permissions through policy as code to ensure security and risk reduction within a continuous integration / continuous delivery / continuous deployment (CI/CD) pipeline.

“Data volumes are growing exponentially and simultaneously becoming more distributed, which makes finding, managing, and curating datasets a highly time-consuming, resource-intensive process,” said Justin Borgman, Chairman & CEO, Starburst. “New discoverability capabilities in Starburst Galaxy empower organizations to find and understand data before querying, laying the foundation for Data Product curation by streamlining data discovery and accelerating ELT processes. Enabling organizations to more efficiently discover the right datasets, Starburst Galaxy is helping reduce costs while getting more value out of their data.”

This Data Products update to Starburst Galaxy comes shortly after Starburst announced enhanced Data Products functionality for its flagship product, Starburst Enterprise. It also comes on the heels of a big year for Starburst’s software-as-a-service (SaaS) product, in which the company enhanced its distributed data analytics capabilities and, less than a year after the product's introduction, closed its first-ever seven-figure customer deal.

Enabling data producers and consumers to create, publish, discover, and manage curated Data Products is core to the emerging Data Mesh paradigm, which focuses on decentralization and self-service. Starburst remains steadfast in its commitment to realizing the Data Mesh vision.
