Azure Databricks + MagicOrange
MagicOrange is powered by Azure Databricks, a unified cloud analytics and AI platform
Author: Bhushan Tambatkar
MagicOrange is the market’s most advanced IT Financial Management platform. Our proprietary technology enables CIOs and divisional leaders to discover and align their costs with the business and to increase visibility into complex cost combinations.
MagicOrange is a cloud-first, multi-tenant SaaS offering on Azure. Since its inception, MagicOrange has used Azure’s native SaaS offerings to build and scale the MagicOrange Prism Platform.
When we first began implementing a data platform for MagicOrange back in 2015, we weren’t anticipating huge data volumes. Existing Azure services such as Azure SQL DB, App Services, Storage Accounts, Power BI, and Analysis Services were more than sufficient to run and grow our platform, since each of those services offered scalability. But after a few years, as more customers onboarded to the platform and data volumes increased significantly, we began to look for scalable, durable, and cost-effective solutions in Azure to build out our MagicOrange Data and Analytics Platform.
We went through an exercise of evaluating which tools we needed to build our Data and Analytics Platform. We quickly realized that some tools were good at ETL, others specialized in warehousing, and a few were a better fit for analytics. If we went down this path, stitching together various tools and technologies, we would spend more money across the assorted services and take on additional management overhead.
After evaluating various tools and platforms, such as Snowflake, we chose Azure Databricks because it is more cost-effective, has less management overhead, and meets all of our requirements for a next-generation cloud Data and Analytics Platform.
- Lakehouse Architecture and single platform for Data Engineering, Data Science, Data Ingestion, Machine Learning, Data Warehouse/Lakehouse, Data Analytics.
- Data Security and Integration with Azure AD.
- Integration with Power BI using Azure Databricks Connector.
- Scalable architecture with clusters and Databricks Runtime, giving us the power of Apache Spark while removing the complexity of managing Spark configuration.
- Interactive development experience with Databricks Workspace and Notebooks, with the added benefit of support for multiple languages like Python, R, SQL, Scala, and Java (.jars).
- Orchestration with Jobs/Workflows and, more recently, Delta Live Tables.
- Most important: cost-effective. Databricks enabled us to build and run a cloud Data and Analytics Platform at scale and keep our cost well under budget. For instance, after migrating our ETL workload from a cloud-native ETL tool to Azure Databricks, we saw savings of up to 400% per month on ETL jobs alone. We were able to start small and scale based on our needs, as we only pay for what we use.
- Storage and compute are separate, which saves on storage cost, as data is in Delta Lake format and is stored in the form of Parquet files in Azure Data Lake Storage containers.
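As a rough illustration of the "start small and scale" point above, the sketch below builds a request body for the Databricks Clusters API with autoscaling and auto-termination enabled. The helper name, the cluster name, the runtime version, and the VM size are all assumptions chosen for illustration, not MagicOrange's actual configuration.

```python
import json
from typing import Any, Dict


def build_cluster_spec(name: str, min_workers: int, max_workers: int,
                       idle_minutes: int = 20) -> Dict[str, Any]:
    """Build a cluster definition that scales between min and max workers.

    Hypothetical helper: the runtime version and node type below are
    illustrative assumptions and should be replaced with real choices.
    """
    if not 1 <= min_workers <= max_workers:
        raise ValueError("expected 1 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": "13.3.x-scala2.12",  # assumed Databricks Runtime
        "node_type_id": "Standard_DS3_v2",    # assumed Azure VM size
        # The cluster grows and shrinks between these bounds with load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Idle clusters shut down, so compute is only paid for while in use.
        "autotermination_minutes": idle_minutes,
    }


if __name__ == "__main__":
    print(json.dumps(build_cluster_spec("etl-dev", 1, 4), indent=2))
```

Keeping the lower bound small and letting the upper bound absorb peak load is one way to keep compute spend proportional to actual usage, while the data itself stays in storage regardless of whether any cluster is running.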
Here is the MagicOrange architecture and the key areas we’ve found helpful as we scale the business.
- Azure Databricks seamlessly integrates a wide variety of data sources, which has helped us build and scale our solutions quickly.
- The Data Engineering Workspace UI is developer-friendly, with native features like notebooks, a pipeline environment with Jobs, Workflows, and Delta Live Tables, scheduling/orchestration, and failure notifications. This eliminates the need to maintain different tools for the same tasks and has enabled the Data Engineering team to keep their focus on solving ETL tasks.
- Before the Databricks Lakehouse, complex ETL pipelines were developed using cloud-native ETL tools. Migrating to the Databricks Lakehouse was relatively easy using PySpark and Spark SQL, and the support for multiple languages enabled our Data Engineering teams to deliver complex ETL requirements quickly.
- Since migrating to the Databricks Lakehouse, using scalable clusters and notebooks, ETL tasks complete faster and cost less.
- MagicOrange is a multi-tenant SaaS offering. Data security and customer data isolation are top priorities, and because Azure Databricks is compliant with several industry and regulatory standards, including ISO 27001, SOC 2, and HIPAA, it helps MagicOrange build secure solutions.
- Azure Databricks has strong integration with Azure AD, which eliminates many security concerns and lets us leverage RBAC (Role-Based Access Control) to control access to the Databricks Workspace and other resources.
- Implementing Unity Catalog helped us make the overall data landscape more secure. Databricks helped us remove prior limitations and gave us the ability to enforce our dev and production data isolation policy.
- Unity Catalog features such as external storage locations and support for SQL GRANT statements helped us implement better access control per customer catalog.
- There are out-of-the-box security features like network isolation and data encryption, along with a range of other security measures, that helped us protect our data and meet our security requirements.
- Databricks SQL Warehouses/Endpoints can be easily integrated with Power BI using the Azure Databricks Connector and support DirectQuery mode against Delta Lake data, which enabled us to build customer-facing Power BI reports and dashboards.
- Serverless SQL Warehouses with Photon are immensely powerful and help us visualize large datasets (100 million+ rows) in Power BI.
- Databricks SQL Dashboards have helped our Data Analysts and Customer Success team quickly analyze very large data sets by writing simple SQL queries and building dashboards inside Databricks.
- Delta Sharing is an open standard that we use to securely share data with external and internal consumers from its original source.
- Delta Sharing has helped us democratize data and share it externally and securely with MagicOrange customers. As part of customer onboarding, each customer gets a dedicated share and recipient link.
- Delta Sharing connectors are supported in popular BI tools, which has eliminated the need to build something in-house to share data securely.
- MagicOrange is a data-driven company, always trying to create innovative solutions that help our customers draw insights from their complex data. As part of the MagicOrange product roadmap, there are plans to build ML/AI-based data products that would enable customers to more easily draw insights from complex data. Using the Databricks Lakehouse Platform will help MagicOrange build and scale our ML/AI practice.
- We plan to leverage which might bring more value to MagicOrange Customers.
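To make the per-customer access control and Delta Sharing points above more concrete, here is a minimal sketch of what onboarding a customer could look like as a list of Databricks SQL statements. The function, the naming scheme (`<customer>_catalog`, `<customer>_share`, `<customer>_recipient`), and the example group and table names are hypothetical and are not MagicOrange's actual onboarding process.

```python
import re
from typing import List

# Allow only simple lowercase identifiers so names are safe to interpolate.
_IDENT = re.compile(r"^[a-z][a-z0-9_]*$")


def onboarding_statements(customer: str, tables: List[str],
                          reader_group: str) -> List[str]:
    """Return SQL that isolates one customer in its own catalog and share.

    Hypothetical helper. `tables` are "schema.table" names inside the
    customer's catalog; `reader_group` is an existing account group.
    """
    if not _IDENT.match(customer):
        raise ValueError(f"unsafe customer name: {customer!r}")
    if "`" in reader_group:
        raise ValueError("group name must not contain backticks")
    catalog = f"{customer}_catalog"
    share = f"{customer}_share"
    recipient = f"{customer}_recipient"
    statements = [
        # One Unity Catalog catalog per customer keeps tenant data separate.
        f"CREATE CATALOG IF NOT EXISTS {catalog}",
        # Read-only access for this customer's group, granted at catalog level.
        f"GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG {catalog} "
        f"TO `{reader_group}`",
        # A dedicated Delta Sharing share for this customer.
        f"CREATE SHARE IF NOT EXISTS {share}",
    ]
    statements += [
        f"ALTER SHARE {share} ADD TABLE {catalog}.{table}" for table in tables
    ]
    statements += [
        # A dedicated recipient, then read access to the share for it.
        f"CREATE RECIPIENT IF NOT EXISTS {recipient}",
        f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}",
    ]
    return statements


if __name__ == "__main__":
    for stmt in onboarding_statements("acme", ["finance.costs"], "acme_readers"):
        print(stmt + ";")
```

In a Databricks notebook, each statement could be executed with `spark.sql(stmt)` by a user holding the required metastore privileges. Generating the statements from a single function keeps every tenant's catalog, grants, share, and recipient consistent.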
Conclusion
- In this blog, I have shared some insights into how implementing the Lakehouse Architecture has helped MagicOrange build a scalable Data and Analytics Platform. Working as a Cloud Architect and Data Architect, I find Azure Databricks very cost-effective, because it has allowed us to start on a small budget and scale with many features. In my view, Databricks can help fulfill most organizations’ data, analytics, and AI requirements on a single unified platform, which we couldn’t achieve with other cloud data warehouses.
- Over the past few years, I have seen Databricks evolve, adding new features and ideas that make it unique in this space. There’s also a continuous effort from the Databricks team to keep improving. There’s a great support system from Databricks Solution Architects, who bring expertise and best practices, speeding up the implementation of Databricks on your chosen cloud provider.
- There’s a ton of documentation available, and to implement any feature mentioned in this blog, I recommend checking out these docs if you are keen to understand and implement Databricks at your organization.
Thanks for reading, and stay tuned for more.
References: Lakehouse, Databricks Security, Delta Lake, Auto Loader, COPY INTO, Delta Live Tables pipelines, Workflows, Databricks Data Science, Databricks ML, Databricks AutoML, Unity Catalog, Databricks SQL, Delta Sharing