From data warehouses to data lakes, there is a growing range of options when it comes to cloud platforms, deployment models, and features. At the same time, challenges remain, including data integration, governance, performance, management, and monitoring.
Nearly 80% of DBTA subscribers currently have digital transformation initiatives underway, and the vast majority of these projects focus on two main areas: cloud solutions and analytics.
DBTA held a webinar featuring Anand Rao, Director of Product Marketing, Qlik; Louis Carr, Senior Director, Product Marketing, Actian; and Thomas Hazel, Founder, Chief Technology Officer, and Chief Scientist, ChaosSearch, who discussed key solutions and best practices for success with today’s modern analytics.
Rao explained that there are three ongoing trends in the data architecture modernization and automation space: cloud application development, data warehouse modernization, and next-generation cloud data lakes.
Rao said cloud application development through CDC (change data capture) streaming ensures consistency and ease of use across all major platforms (sources and targets), has no impact on production systems, is easy to manage and automate, and has a low TCO.
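The core idea behind CDC streaming is that inserts, updates, and deletes on a source system are captured as an ordered stream of change events and replayed against a target, keeping the two in sync without batch reloads. A minimal sketch of that replay logic, using an illustrative event format (not Qlik's actual API):

```python
# Apply a stream of change-data-capture (CDC) events to a target table.
# The event schema ("op", "key", "row") is a hypothetical illustration.
target = {}  # primary key -> row, standing in for the target table

def apply_change(event):
    """Replay one CDC event (insert/update/delete) against the target."""
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        target[key] = event["row"]  # upsert keeps target consistent
    elif op == "delete":
        target.pop(key, None)       # tolerate an already-deleted key

events = [
    {"op": "insert", "key": 1, "row": {"id": 1, "name": "alpha"}},
    {"op": "insert", "key": 2, "row": {"id": 2, "name": "beta"}},
    {"op": "update", "key": 1, "row": {"id": 1, "name": "gamma"}},
    {"op": "delete", "key": 2},
]
for e in events:
    apply_change(e)

print(target)  # {1: {'id': 1, 'name': 'gamma'}}
```

Because events are applied in source order, the target converges on the source state; production tools add features such as transactional batching and schema-change handling on top of this pattern.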
Modernizing the data warehouse reduces risk and saves time and money, without the need for scripting or coding. The data warehouse can now be created in hours and changed in minutes. It can also accommodate new requirements and new platforms in the future.
Creating a managed data lake allows users to quickly and easily build high-scale data pipelines, eliminating risky, expensive, and complex custom coding. The ‘last mile’ can be closed by delivering analytics-ready data in real time.
Carr said modern cloud analytics requires convergence. A modern architecture needs a data hub, an analytics hub, a data lake, and a data warehouse.
The data hub can connect to and ingest diverse and disparate data sources, offers batch and stream modes, and is used to prepare data and ensure data quality.
The analytics hub provides self-service, self-setup data access for non-IT users. It removes spreadsheet silos and allows for advanced (canned and custom) analytics.
The data lake can perform advanced analytics using Spark, Kafka, and other open standards. It is a cost-effective way to analyze semi-structured and unstructured data.
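A distinguishing trait of data lake analytics is working over semi-structured records whose fields vary, rather than rigid relational rows. A small plain-Python sketch of the kind of aggregation typically handed to Spark at scale (the event data here is invented for illustration):

```python
import json

# Semi-structured JSON event records as they might land in a data lake;
# note that fields vary per record (illustrative data only).
raw = [
    '{"user": "a", "action": "click", "ms": 120}',
    '{"user": "b", "action": "view"}',
    '{"user": "a", "action": "click", "ms": 80}',
]

# Parse each line and count actions per user, tolerating missing fields.
counts = {}
for line in raw:
    rec = json.loads(line)
    key = (rec["user"], rec["action"])
    counts[key] = counts.get(key, 0) + 1

print(counts)  # {('a', 'click'): 2, ('b', 'view'): 1}
```

Frameworks like Spark express the same parse-then-aggregate pattern declaratively and distribute it across cluster nodes, which is what makes lake-scale analysis of such data cost-effective.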
The data warehouse component delivers sub-second queries for operational workloads, handles 10 terabytes of persistent data, can read structured relational data, and provides flexible deployment, according to Carr.
Hazel noted that a data lake is best suited to big data scientists. According to Hazel, ChaosSearch provides a data lake platform for large-scale analytics. He said it enables search, SQL, and machine learning workloads on cloud data at scale and at lower cost, with no data movement and faster time to insights.
An on-demand archived replay of this webinar is available here.