Databricks, the Data and AI company, introduced Lakehouse Apps, a new way for developers to build native, secure applications for Databricks. Lakehouse Apps will enable over 10,000 Databricks customers to easily access a wide range of powerful applications that run entirely inside their Lakehouse instance, using their data with the full security and governance capabilities of Databricks.
The company also introduced new data-sharing providers and AI model-sharing capabilities to the Databricks Marketplace — the only marketplace for data, AI, and applications — set to be widely available at the Data + AI Summit.
Lakehouse Apps Simplifies Access to Data and AI
Data and AI applications are among the fastest-growing software categories, and the growth in generative AI and large language models (LLMs) has accelerated that trend. For customers, Lakehouse Apps will be the most secure way to run applications that unlock the full value of data in their Lakehouse, leverage Databricks-native services, and extend Databricks with new capabilities. Lakehouse Apps will give users safe and easy access to a wide range of innovative new applications and reduce the time and effort to adopt, integrate, and manage data and AI applications.
Lakehouse Apps Offer Security Without Compromise for Developers
To get the next generation of innovative applications into the hands of users, software vendors must clear significant hurdles to securely access customer data, integrate with customers’ security and governance solutions, and run efficiently close to customer data. To secure enterprise adoption, many developers have taken one of two approaches: either restrict the capabilities of their application and rebuild vital parts of it in SQL or in proprietary code from data platform vendors, or build versions of their products that customers have to install and operate themselves, which are fragile and hard to scale.
Lakehouse Apps helps developers overcome this dilemma with a native, secure, no-compromise solution. By running directly on a customer’s Databricks instance, these apps can easily and securely integrate with the customer’s data, use and extend Databricks services and enable users to interact with a single sign-on experience — all without data ever leaving the customer’s instance. Lakehouse Apps inherit the same security, privacy, and compliance controls as Databricks. Developers can use any technology and language of their choice to build apps and aren’t limited to a proprietary framework.
Developers also benefit from easier distribution by listing their Lakehouse Apps in the Databricks Marketplace, enabling customers to quickly discover and deploy their software.
Early development partners for Lakehouse Apps include Retool, Posit, Kumo.ai, and Lamini:
- Retool enables customers to quickly build and deploy internal apps, powered by their data. Developers can assemble UIs with drag-and-drop building blocks like tables and forms and write queries to interact with data using SQL and JavaScript.
- Posit is an open-source data science company that empowers data professionals with cutting-edge tools for code-first data science.
- Kumo.ai is an AI-powered platform tackling predictive problems in business. Its platform works directly on relational data by using graph neural networks, a class of AI system for processing data that can be represented as a series of graphs.
- Lamini is an LLM platform for every developer to build customised, private models: easier, faster, and better performing than any general-purpose LLM.
New AI Model-Sharing Capabilities and Data Providers
Databricks will also offer AI model sharing in the Databricks Marketplace, enabling data consumers and providers to discover and monetise AI models and integrate AI into all their data solutions. With AI model sharing, Databricks customers will have access to best-in-class models, which can be quickly and securely applied on top of their data. Databricks itself will curate and publish open-source models across common use cases, such as instruction-following and text summarisation, and optimise tuning or deploying of these models on Databricks.
Databricks Marketplace also welcomes new data providers, including financial services leaders such as S&P Global, Experian, London Stock Exchange Group, Nasdaq, CoreLogic and YipitData; healthcare innovators like Datavant and IQVIA; geospatial leaders like Divirod, AccuWeather and SafeGraph; data collaboration companies like LiveRamp; and business information services companies like LexisNexis and ZoomInfo.
“With Lakehouse Apps, software providers can offer their rich, secure apps within the lakehouse, which is exciting both for Databricks customers and for software vendors, greatly reducing the friction for applications to reach new customers,” said Matei Zaharia, Co-Founder and CTO at Databricks. “In addition, the expansion of Databricks Marketplace to cover AI models as well as apps satisfies a critical need in today’s business world, as collaboration between enterprises is evolving beyond the mere exchange of datasets to secure computations and AI modelling on joint data.”
“At Edmunds, we provide car shoppers with a wealth of insights, which is why we rely so much on data ourselves. The Databricks Marketplace simplifies the process of discovering and evaluating external data with pre-built notebooks without locking us into a single vendor or prolonging procurement cycles. We can access the data within our Databricks workspaces with just a few clicks. We also look forward to leveraging the notebooks, dashboards, and AI models through the Marketplace to enhance our analytics and AI initiatives,” said Greg Rokita, AVP of Technology at Edmunds, a Databricks Marketplace customer.
Availability
Databricks Marketplace will be generally available on 28 June 2023, coming out of public preview. Lakehouse Apps and AI model sharing in Databricks Marketplace are expected in preview in the coming year.
New Partners to Accelerate Data Sharing
Databricks also announced new Delta Sharing partnerships with Cloudflare, Dell, Oracle, and Twilio, expanding its data-sharing ecosystem. Delta Sharing provides an open solution to securely share live data from your lakehouse to any computing platform.
Organisations must continuously develop innovative business solutions and products leveraging data and AI to thrive in today’s data-driven economy. That requires an open, secure exchange of data and AI assets with customers, partners, and suppliers. However, the absence of an open standard for sharing has limited the development of an open data exchange ecosystem. Enterprises have had to replicate data across multiple platforms, clouds and regions to facilitate collaboration. Traditional solutions also took a data-only approach, which meant organisations were limited in monetising anything beyond a dataset and experienced friction in attempting to create new revenue opportunities with non-compatible platforms.
Delta Sharing helps organisations share and consume live data sets across platforms, clouds, and regions without dependencies on specific data-sharing services, including Databricks. Unlike other sharing solutions that require the provider and consumer to use the same platform and vendor, Delta Sharing enables enterprises to share and consume data from any platform or vendor that supports the open Delta Sharing protocol. With an open approach powered by Delta Sharing, organisations can put their data to work more quickly and discover insights faster.
“Without an open standard for secure data exchange across organisations, companies find it highly time-consuming to collaborate, requiring export, replication and maintenance of data across many software platforms,” said Matei Zaharia, Co-Founder and CTO at Databricks. “Delta Sharing provides the first open protocol for sharing data across diverse computing platforms, clouds and regions. Today’s announcements show just how much demand there is for this in the industry, with multiple major technology vendors joining the ecosystem. We are excited about how this will push open interchange forward and help all of our customers collaborate more easily.”
Databricks is expanding the Delta Sharing ecosystem with new partners, including Cloudflare, Dell, Oracle and Twilio, to seamlessly share data between their platforms, Databricks, Apache Spark™, pandas, Power BI, Excel and any other system that supports the open protocol. Benefits include:
- Partners can share live access to data, AI models and notebooks directly with consumers without costly or complicated replication. Delta Sharing gives providers an easy way to manage access permissions to any consumer regardless of their cloud, region, or platform.
- With a wide array of open Delta Sharing clients, a consumer can access shared data from multiple compute platforms without being constrained to particular vendor solutions.
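As an illustration of the consumer side described above, here is a minimal sketch using the open-source `delta-sharing` Python client to read a shared table into pandas. The profile file name and the share, schema and table names are hypothetical placeholders, not values from this announcement; a real provider would supply its own profile file and share names.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the table address the delta-sharing client expects:
    <profile-file>#<share>.<schema>.<table>
    """
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Read a shared table into a pandas DataFrame.

    Requires `pip install delta-sharing` and a valid profile file
    issued by the data provider.
    """
    # Imported lazily so table_url() can be used without the package installed.
    import delta_sharing

    return delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table)
    )


# Hypothetical names, for illustration only.
url = table_url("config.share", "sales_share", "retail", "orders")
print(url)
```

Calling `load_shared_table("config.share", "sales_share", "retail", "orders")` with a real profile file would return the shared data as a DataFrame, without the consumer needing to be on the provider's platform.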
“We are in the midst of an AI revolution rooted in data,” said Matthew Prince, co-founder and CEO, Cloudflare. “Cloudflare R2 provides an amazing value proposition for companies that suffer from vendor lock-in, and instead ensures developers retain the power to choose where to move and use their data. The combination of Cloudflare’s massive global network and zero egress storage, along with Databricks’ powerful sharing and processing capabilities, will give our joint customers the fastest, most secure, and most affordable data sharing capabilities across the globe.”
“Dell and Databricks are helping organisations adopt multi-cloud by design with the newly announced ability to access and combine data across on-premises and cloud environments, and securely sharing that data through Delta Sharing,” said Greg Findlen, SVP Product Management, Data Management, Dell Technologies.