In this blog, we are going to explore the development workflow for building intelligent data applications using machine learning on SAP HANA and the SAP Cloud Application Programming Model (CAP). Along the way, we will share insights from our own experiences, examine the practical viability of this approach, and touch upon challenges and potential improvements.
Development Approach
A typical development workflow for building intelligent data applications consists of a few essential steps. Generally, the work is split between two roles: a data scientist and a software engineer (or, in this context, a CAP developer).
- The data scientist takes the lead in analyzing the available datasets and in designing and training machine learning models that provide valuable insights.
- Once a model is ready, the data scientist hands over the design-time machine learning artifacts to the developer, who integrates and consumes the model in a CAP project.
This handshake marks the shift from data exploration and experimentation to building functional, user-facing applications. While conceptually straightforward, this transition can introduce several practical challenges, particularly in aligning the output of the data scientist’s work with the development requirements of a CAP project.
Data Scientist – Developer Handshake
What is the SAP Predictive Analysis Library?
The SAP Predictive Analysis Library (PAL) is a collection of SQLScript functions designed to enable predictive analytics directly within SAP HANA. By running machine learning algorithms natively in the HANA database, PAL eliminates the need to move large datasets to external environments, which improves both speed and efficiency.
PAL includes a wide range of algorithms across different data-mining categories such as clustering, classification, regression, time series analysis, and more. Its tight integration with HANA makes it particularly interesting for customers looking to incorporate machine learning into their existing SAP ecosystem.

Our own workflow involved setting up a development environment in SAP Business Application Studio (BAS) with Python extensions. Once the Python environment was set up, we could import the HANA ML Python Client API package to use PAL capabilities in our Python code. In our experience, this works quite well: there is comprehensive documentation available for both PAL and the HANA ML package, and we also found several SAP Learning Journeys online to help you get started.
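To give an impression of what this looks like in practice, here is a minimal sketch of a hana-ml session. The host, credentials, table name, and column names are placeholders for your own instance, and the K-means call simply stands in for whichever PAL algorithm your scenario needs.

```python
# Sketch: run a PAL algorithm from Python via the hana-ml client
# (pip install hana-ml). Host, credentials, table and column names
# are placeholders for your own HANA Cloud instance.

def train_kmeans(address, port, user, password):
    from hana_ml import dataframe
    from hana_ml.algorithms.pal.clustering import KMeans

    # Connect to HANA Cloud; the training data never leaves the database.
    conn = dataframe.ConnectionContext(
        address=address, port=port, user=user, password=password
    )
    df = conn.table("SALES_DATA")  # hypothetical table name

    # Fit PAL K-means in-database on the selected feature columns.
    km = KMeans(n_clusters=3)
    km.fit(df, features=["REVENUE", "UNITS"])  # hypothetical columns
    return km
```

Because the fit runs inside HANA, only the algorithm parameters and results travel between Python and the database.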
Enabling PAL in SAP HANA Cloud
It is important to note that PAL needs to be explicitly enabled on your HANA Cloud instance before you can use any of its functionality. Initially, we found this somewhat confusing, since PAL did not seem to be available in the trial version of HANA Cloud when we were setting up our own instance. However, it was available in the Free tier.
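A quick way to verify whether PAL is actually enabled on an instance is to query the AFL function catalog. The helper below is our own convenience wrapper (not part of hana-ml) that just builds the check query:

```python
def pal_check_query():
    # PAL functions are registered in the AFL catalog under area AFLPAL;
    # a count of zero means PAL is not enabled on this instance.
    return (
        'SELECT COUNT(*) FROM "SYS"."AFL_FUNCTIONS" '
        "WHERE AREA_NAME = 'AFLPAL'"
    )
```

You can run the resulting statement in the SQL console, or from Python via `conn.sql(pal_check_query()).collect()`.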
We were also not entirely sure which resources were required to run PAL reliably. Fortunately, the standard resources of the Free tier version of HANA Cloud turned out to be sufficient for about 10 different users running PAL functionality simultaneously. Do make sure, however, that every user is granted the appropriate roles to actually use this functionality.
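For reference, the role commonly required for executing PAL procedures is the AFL execute role. The small helper below is our own (hypothetical) convenience function that emits the corresponding GRANT statement for a given user; run the statement as a user administrator (e.g. DBADMIN) for every participant:

```python
def pal_grant_statement(user):
    # AFL__SYS_AFL_AFLPAL_EXECUTE is the standard execute role for
    # PAL procedures; without it, PAL calls fail with privilege errors.
    return f"GRANT AFL__SYS_AFL_AFLPAL_EXECUTE TO {user};"
```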
After some trial and error, we were successfully able to create an instance that was usable for an internal CodeJam that we organized at our office. This hands-on session gave us plenty of insights into how PAL can be used for machine learning scenarios.
Machine Learning CodeJam at INNOV8iON
Integrating the model in SAP CAP
After building and training a machine learning model using PAL, the data scientist is ready to hand over the appropriate design-time artifacts to the CAP developer. The HANA ML package provides a HANAGeneratorForCAP module that can automatically generate these artifacts for you, and they can then be imported into a CAP project. Unfortunately, in our case the auto-generated artifacts did not work out of the box.
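For context, invoking the generator looks roughly like the sketch below. The parameter names follow the hana-ml documentation at the time of writing; please verify them against your installed hana-ml version.

```python
def generate_cap_artifacts(fitted_model, project_name, output_dir="."):
    # HANAGeneratorForCAP turns a fitted hana-ml/PAL model into
    # design-time artifacts (db/ and srv/ content) for a CAP project.
    from hana_ml.artifacts.generators.hana import HANAGeneratorForCAP

    gen = HANAGeneratorForCAP(
        project_name=project_name,
        output_dir=output_dir,
    )
    gen.generate_artifacts(fitted_model)
    return gen
```

The generated folder structure is intended to be copied into (or generated directly inside) the CAP project, after which the usual `cds deploy` workflow applies.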
We encountered some problems with the artifacts during the deployment step:
- Insufficient privilege errors caused by the content of the .hdbgrants file.
- Missing keys in the auto-generated entities, requiring manual adjustments.
- Multiple issues with case-sensitive column names not being mapped and/or handled correctly.
The documentation for this step was limited and appeared outdated. We could only find two relevant SAP blogs from two years ago. As such, we suspect that this HANAGeneratorForCAP module may not be fully optimized for use in the latest version of SAP CAP. Due to these hurdles, we decided to focus solely on the data science aspects during our CodeJam, leaving the CAP development workflow for future exploration.
Conclusion
Overall, we see significant potential when it comes to building custom machine learning models for SAP customers. There are many different use cases where AI can help streamline business processes. However, we believe that this specific development workflow may not be particularly intuitive or accessible for customers who are just beginning their journey into the world of AI, especially when transitioning from data science to actual application development using these models. That step in particular requires more intuitive tooling and up-to-date documentation. We are excited to see how these AI solutions evolve and will continue to keep a close eye on future developments in this space!