Data Storage Management

Databricks Unveils Lakehouse Apps, Enhances Marketplace with AI Models and Data Sharing Boost

Databricks, the Data and AI company, introduced Lakehouse Apps, a new way for developers to build native, secure applications for Databricks. Lakehouse Apps will enable over 10,000 Databricks customers to easily access a wide range of powerful applications that run entirely inside their Lakehouse instance, using their data with the full security and governance capabilities of Databricks.

The company also introduced new data-sharing providers and AI model-sharing capabilities to the Databricks Marketplace (the only marketplace for data, AI, and applications), set to be widely available at the Data + AI Summit.

Lakehouse Apps Simplifies Access to Data and AI
Data and AI applications are among the fastest-growing software categories, and the growth in generative AI and large language models (LLMs) has accelerated that trend. For customers, Lakehouse Apps will be the most secure way to run applications that unlock the full value of data in their Lakehouse, leverage Databricks-native services, and extend Databricks with new capabilities. Lakehouse Apps will give users safe and easy access to a wide range of innovative new applications and reduce the time and effort to adopt, integrate, and manage data and AI applications.

Lakehouse Apps Offer Security Without Compromise for Developers
To get the next generation of innovative applications into the hands of users, software vendors must clear significant hurdles: securely accessing customer data, integrating with customers’ security and governance solutions, and running efficiently close to customer data. To secure enterprise adoption, many developers have taken one of two approaches. Some restrict the capabilities of their application and rebuild vital parts of it in SQL or in proprietary code from data platform vendors. Others build versions of their products that customers have to install and operate themselves, which are fragile and hard to scale.

Lakehouse Apps help developers overcome this dilemma with a native, secure, no-compromise solution. By running directly on a customer’s Databricks instance, these apps can easily and securely integrate with the customer’s data, use and extend Databricks services, and let users sign in through a single sign-on experience, all without data ever leaving the customer’s instance. Lakehouse Apps inherit the same security, privacy, and compliance controls as Databricks. Developers can use any technology and language of their choice to build apps and aren’t limited to a proprietary framework.

Developers also benefit from easier distribution by listing their Lakehouse Apps in the Databricks Marketplace, enabling customers to quickly discover and deploy their software.

Early development partners for Lakehouse Apps include Retool, Posit, Kumo.ai, and Lamini:

  • Retool enables customers to quickly build and deploy internal apps, powered by their data. Developers can assemble UIs with drag-and-drop building blocks like tables and forms and write queries to interact with data using SQL and JavaScript.

  • Posit is an open-source data science company that empowers data professionals with cutting-edge tools for code-first data science.

  • Kumo.ai is an AI-powered platform tackling predictive problems in business. Its platform works directly on relational data by using graph neural networks, a class of AI system for processing data that can be represented as a series of graphs.

  • Lamini is an LLM platform for every developer to build customised, private models: easier, faster, and better performing than any general-purpose LLM.

New AI Model-Sharing Capabilities and Data Providers
Databricks will also offer AI model sharing in the Databricks Marketplace, enabling data consumers and providers to discover and monetise AI models and integrate AI into all their data solutions. With AI model sharing, Databricks customers will have access to best-in-class models, which can be quickly and securely applied on top of their data. Databricks itself will curate and publish open-source models across common use cases, such as instruction-following and text summarisation, and optimise tuning or deploying of these models on Databricks.

Databricks Marketplace also welcomes new data providers, including financial services leaders such as S&P Global, Experian, London Stock Exchange Group, Nasdaq, CoreLogic and YipitData; healthcare innovators like Datavant and IQVIA; geospatial leaders like Divirod, AccuWeather and SafeGraph; data collaboration companies like LiveRamp; and business information services companies like LexisNexis and ZoomInfo.

“With Lakehouse Apps, software providers can offer their rich, secure apps within the lakehouse, which is exciting both for Databricks customers and for software vendors, greatly reducing the friction for applications to reach new customers,” said Matei Zaharia, Co-Founder and CTO at Databricks. “In addition, the expansion of Databricks Marketplace to cover AI models as well as apps satisfies a critical need in today’s business world, as collaboration between enterprises is evolving beyond the mere exchange of datasets to secure computations and AI modelling on joint data.”

“At Edmunds, we provide car shoppers with a wealth of insights, which is why we rely so much on data ourselves. The Databricks Marketplace simplifies the process of discovering and evaluating external data with pre-built notebooks without locking us into a single vendor or prolonging procurement cycles. We can access the data within our Databricks workspaces with just a few clicks. We also look forward to leveraging the notebooks, dashboards, and AI models through the Marketplace to enhance our analytics and AI initiatives,” said Greg Rokita, AVP of Technology at Edmunds, a Databricks Marketplace customer.

Availability
Databricks Marketplace will be generally available on 28 June 2023, coming out of public preview. Lakehouse Apps and AI model sharing in Databricks Marketplace are expected in preview in the coming year.

New Partners to Accelerate Data Sharing
Databricks also announced new Delta Sharing partnerships with Cloudflare, Dell, Oracle, and Twilio, expanding its data-sharing ecosystem. Delta Sharing provides an open solution to securely share live data from your lakehouse to any computing platform.

Organisations must continuously develop innovative business solutions and products leveraging data and AI to thrive in today’s data-driven economy. That requires an open, secure exchange of data and AI assets with customers, partners, and suppliers. However, the absence of an open, standard sharing solution has limited the development of an open data exchange ecosystem. Enterprises have had to replicate data across multiple platforms, clouds and regions to facilitate collaboration. Traditional solutions also took a data-only approach, so organisations could not monetise anything beyond a dataset and experienced friction when trying to create new revenue opportunities across incompatible platforms.

Delta Sharing helps organisations share and consume live data sets across platforms, clouds, and regions without dependencies on specific data-sharing services, including Databricks. Unlike other sharing solutions that require the provider and the consumer to use the same platform and vendor, Delta Sharing enables enterprises to share and consume data from any platform or vendor that supports the open Delta Sharing protocol. With an open approach powered by Delta Sharing, organisations can put their data to work more quickly and discover insights faster.

“Without an open standard for secure data exchange across organisations, companies find it highly time-consuming to collaborate, requiring export, replication and maintenance of data across many software platforms,” said Matei Zaharia, Co-Founder and CTO at Databricks. “Delta Sharing provides the first open protocol for sharing data across diverse computing platforms, clouds and regions. Today’s announcements show just how much demand there is for this in the industry, with multiple major technology vendors joining the ecosystem. We are excited about how this will push open interchange forward and help all of our customers collaborate more easily.”

Databricks is expanding the Delta Sharing ecosystem with new partners, including Cloudflare, Dell, Oracle and Twilio, to seamlessly share data between their platforms, Databricks, Apache Spark™, pandas, Power BI, Excel and any other system that supports the open protocol. Benefits include:

  • Partners can share live access to data, AI models and notebooks directly with consumers without costly or complicated replication. Delta Sharing gives providers an easy way to manage access permissions to any consumer regardless of their cloud, region, or platform.

  • With a wide array of open Delta Sharing clients, a consumer can access shared data from multiple compute platforms without being constrained to particular vendor solutions.
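As an illustration of the consumer side, the following is a minimal sketch of how a recipient might read a shared table into pandas with the open-source `delta-sharing` Python client. The profile file name and the share, schema and table names are hypothetical placeholders, not values from the announcement; a real profile file would be issued by the data provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedTable:
    """Coordinates of one table inside a share (all values here are hypothetical)."""
    profile_path: str  # path to the provider-issued credentials (profile) file
    share: str
    schema: str
    table: str

    def url(self) -> str:
        # The open-source client addresses a table as
        # <profile-file>#<share>.<schema>.<table>
        return f"{self.profile_path}#{self.share}.{self.schema}.{self.table}"


def load_shared_table(ref: SharedTable):
    """Read a shared table into a pandas DataFrame.

    Assumes `pip install delta-sharing` and a valid profile file on disk.
    """
    import delta_sharing  # imported lazily so the URL helper works without it

    return delta_sharing.load_as_pandas(ref.url())


# Placeholder names, for illustration only.
ref = SharedTable("config.share", "demo_share", "default", "trips")
print(ref.url())
```

Calling `load_shared_table(ref)` would fetch the data only if the profile file exists and the provider has granted access; the sketch above stops at building the table address so that it runs without credentials.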

“We are in the midst of an AI revolution rooted in data,” said Matthew Prince, co-founder and CEO, Cloudflare. “Cloudflare R2 provides an amazing value proposition for companies that suffer from vendor lock-in, and instead ensures developers retain the power to choose where to move and use their data. The combination of Cloudflare’s massive global network and zero egress storage, along with Databricks’ powerful sharing and processing capabilities, will give our joint customers the fastest, most secure, and most affordable data sharing capabilities across the globe.”

“Dell and Databricks are helping organisations adopt multi-cloud by design with the newly announced ability to access and combine data across on-premises and cloud environments, and securely sharing that data through Delta Sharing,” said Greg Findlen, SVP Product Management, Data Management, Dell Technologies.

DSA Editorial

The region’s leading specialist IT news publication focused on Data Lifecycle, Storage Infrastructure and Data-Driven Transformation. DSA has nearly 17,000 e-news subscribers, over 6500 unique visitors per day, over 20,000 social media followers and a reputation for deep domain knowledge.
