
Author: dsa_admin

Interview with Aspaara Algorithmic Solutions AG

Could you briefly tell us what Aspaara is?

Employees are the most influential production factor and, at the same time, the largest cost factor in most companies. It is therefore important to make the best use of the skills available in-house. Our artificial-intelligence-based optimization engine, the Aspaara® MatchingCore®, identifies hidden internal potential within the company and satisfies customized scheduling criteria while adapting to each client’s specific situation. It ensures that the right people are in the right place at the right time, and that the final result is optimal for our client’s specific needs. Through sophisticated analyses, accurate predictions and optimizations, MatchingCore® is aimed specifically at recurrent, long-term time allocation within teams, and it complements and supports internal operations.

What is Aspaara’s background story?

Founders Alexander Grimm, a physicist with a PhD in Business Administration, and Kevin Zemmer, who holds a PhD in Applied Mathematics, met at ETH Zürich. The history of Aspaara Algorithmic Solutions AG began more than five years ago with the Sola-Match project. We took part in the Sola-Stafette as runners and noticed that the challenge there was to put together a team of 14 different runners. We created a platform that matched runners with teams that still had capacity for additional participants. We then realized that this kind of calculation of runners’ skills, connecting them to the right team, could be developed further for professional teams within a company. This is how the business idea was established. From there we moved on to tutoring schools, and aircraft ground handling was added relatively quickly. We build allocation solutions for companies in the professional service sector, such as PwC, but also for logistics and railway companies.

Why is it important that Aspaara exists?

We have been able to save our clients up to 6% of wrongly allocated labor costs and to reduce travel time by up to a quarter. We make sure that employees’ preferences are respected and that the best teams work together. Moreover, we ensure that everyone involved is satisfied, which is why we have been able to achieve sustainable success with our customers. We accompany our customers over a long period of time.

Who can profit from your services?

Staffing is complex because of a variety of constraints. Our precise and predictive decisions enable our clients to automate planning processes and reduce failure rates in staffing while increasing reliability and efficiency. Our customer base is very varied: we work with professional service companies, where we focus in particular on assurance and auditing; we have clients in the transportation sector, such as aircraft ground handlers; and we also serve logistics companies. Our clients use the MatchingCore® for the allocation and planning of long-term jobs. Some companies use it once a year to optimize their staffing, as well as when there is turnover or a new client; others, such as ground handlers, have more recurrent needs and use it more frequently. Clients use the MatchingCore® continuously and independently as Software-as-a-Service. Our customers typically have at least 350 permanent employees (internal and external).

Your client-base is very varied; can you give some examples of your projects?

At PwC Switzerland we conceived, implemented and deployed the Aspaara MatchingCore® for internal staff-matching operations across all 14 of their offices. While respecting a wide range of optimization criteria, our MatchingCore® minimizes travel costs. We also seek to increase employee satisfaction by taking career paths into account, and we find the best matches to increase team continuity.

At Zurich Airport, Aspaara® Groundcloud® uncovered how to save over 5% of wrongly allocated wage costs while increasing process reliability for customer airlines across all air- and land-side operations of a ground handling company.

What are your biggest challenges?

What makes us special is that we offer services that are individually developed for each client’s specific needs. MatchingCore® adapts to each client’s very individual planning challenges with the help of artificial intelligence and machine learning, which takes some weeks.

Currently, we are working hard on bringing the learning cycles for this customization down to a few days. That way we will be able to offer a fully customized plan for our clients within a few days, which is a technical challenge. But we love challenges!

How do you see the future of Aspaara and what is your long-term goal?

In the short term we would like to help our existing customers, as well as those who would like to become our customers, with planning optimization to help them out of the current Covid-19 situation and to strengthen them beyond it.

Our long-term goal is to become the best and most innovative provider of customized resource planning software in Europe.

Use Case Talks Series

Three times a year we organize the Use Case Talks series on behalf of the Swiss Alliance for Data-Intensive Services. At these events, experts discuss Artificial Intelligence; we are joined by roughly one third industrial, one third academic and one third individual members. The next Use Case Talk will take place on the 2nd of November – save the date and contact us if you are interested in participating! Find out more here.

A Hybrid Edge-Cloud Platform for Self-Adaptive Machine Learning Based IoT Applications, Pre-conference workshop, 25.06.2020

On the 25th of June, the day before SDS2020, Nabil Abdennadher, professor at the University of Applied Sciences Western Switzerland, Marc-Elian Bégin, CEO and Co-Founder of SixSq, and Francisco Mendonca from HESGE organized a workshop entitled “A Hybrid Edge-Cloud Platform for Self-Adaptive Machine Learning Based IoT Applications”. The workshop was aimed at PhD students, engineers, scientists, industry practitioners and researchers interested in edge-cloud IoT applications and platforms.

The workshop was organized in two parts. The first part was aimed at answering questions about which problems edge computing solves and how to take advantage of it, how edge computing and cloud computing work together, and what the current technologies for designing hybrid (edge and cloud) platforms for secure IoT applications are. These technologies were illustrated through hands-on demonstrations.

The second part presented a generic open-source platform for intelligent IoT applications based on a shareable backbone infrastructure composed of three layers: IoT objects, edge devices and cloud infrastructure.

Our attendees learned to understand the added value of the edge compared to a centralized cloud-based solution. They browsed the most common technologies used to deploy hybrid edge-cloud platforms and discovered a “Swiss made” open-source technology for doing so.

We want to thank everyone who participated in the workshop for the interesting questions and discussions and hope that you will benefit from what you learned.

Machine Learning Push-Down to SAP HANA with Python, Pre-conference workshop 25.06.2020

On the 25th of June 2020, the day prior to the SDS2020 conference, Andreas Forster, Thomas Bitterle and Michael Probst from SAP (Schweiz) AG organized a pre-conference workshop entitled “Machine Learning Push-Down to SAP HANA with Python”.

Data Scientists who work in business environments often require access to SAP data. Often this data is held in a high-performance, in-memory appliance that already contains Machine Learning algorithms. This tutorial, suited for Data Scientists, Python users and SAP HANA users, showed participants how to leverage the in-memory appliance to train Machine Learning models from their preferred Python environment. It also showed how to trigger predictive algorithms in SAP HANA without having to extract the data.
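The push-down idea can be sketched roughly as follows with the hana_ml Python package (a minimal sketch; host, credentials and table name are invented placeholders, not workshop material):

```python
# Minimal sketch of the push-down idea with the hana_ml package.
# Host, credentials and table name are placeholders, not workshop material.
from hana_ml import dataframe

conn = dataframe.ConnectionContext(
    address="hana.example.com", port=39015,
    user="ML_USER", password="***")

# hana_ml DataFrames are lazy: they represent SQL on the server,
# so no data is transferred to the Python client yet.
df = conn.table("TRANSACTIONS", schema="SALES")

# Descriptive statistics are computed inside SAP HANA;
# only the small result set is collected into pandas.
print(df.describe().collect())
```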

The participants learned how to get hands on data located in SAP HANA systems and other stores. They also experienced running notebooks to trigger Machine Learning in SAP HANA and performed data exploration tasks on the source system without transferring data. Finally, they learned how to take the developed model and bring it into an enterprise-ready environment.

We want to say a huge “thank you” to all of our participants. We are glad that you all completed the workshop with excellence and hope you can leverage the new skills in the future.

Implementing Data Ethics in Business Processes, Pre-Conference Workshop, 25.06.2020

In 2019, the Swiss Alliance for Data-Intensive Services launched the first version of its “Data Ethics Code”; an update is currently in production and will be published in the fall of 2020. The Code will include an Implementation Guide that outlines several possibilities for how data ethics can be integrated into companies and business processes. Both the Codex recommendations and the Implementation Guide were topics of this workshop.

In total, seven persons representing a broad spectrum of institutions (Swiss Re, PostFinance, SMEs, public administrations and academia) participated in the workshop led by Christoph Heitz (ZHAW) together with Markus Christen and Michele Loi (both from the University of Zurich). After a general introduction by Christoph and in-depth presentations on the Codex by Michele and on the Implementation Guide by Markus, the participants discussed in three small groups, representing different types of organizations (large and small companies, public institutions), which data ethics problems typically emerge in these contexts and which ethics structures would be adequate to resolve them.

The discussion revealed that large companies usually do have organizational measures in place to handle data ethics issues, but these may sometimes lack grounding in the companies’ day-to-day processes. A key challenge for smaller companies is to recognize that ethical issues are part of their products and services. The difficulty of operationalizing key values of the Codex, such as transparency, was discussed as well.

We thank the participants for an interesting workshop.

An Experimental Exploratory Data Analysis for a Classification Task, Pre-Conference Workshop 25.06.2020

From Visualization to Statistical Analysis

From Feature Engineering to Feature Selection

From the Best Model Selection to Interpretability

The 7th Swiss Conference on Data Science was held online on the 26th of June 2020, and on the 25th of June Claudio G. Giancaterino organized a pre-conference workshop on Exploratory Data Analysis from a different point of view.

Usually the goal of Exploratory Data Analysis (EDA) is to understand patterns, detect mistakes, check assumptions and examine relationships between the variables of a data set with the help of graphical charts and summary statistics.

The goal of this workshop, instead, was to expand the classical EDA journey into a wider pipeline through an experimental, iterative approach that, step by step, tried to understand the impact of each action taken on the behavior of the models. The result was an Exploratory Data & Models Analysis.

The whole online workshop was conducted in a webinar format in which the 18 attendees had the opportunity to interact with the speaker through a Q&A chat box during the presentation. The seminar was a hands-on workshop, giving attendees, at almost every step of the journey, the opportunity either to run Google Colaboratory notebooks on a sample of the data set and look at the results, or to challenge themselves with exercise notebooks by filling in pieces of missing code.

Participants showed interest in the workshop, posting positive feedback at the end of the seminar, and during the webinar they asked questions about all topics, with the minutes ticking by quickly.

Topics covered & discussed:

During the workshop a data set from a data science competition was used; the goal of the classification task was to develop a model that predicts whether or not a mortgage can be funded, based on certain factors in a customer’s application data.

The journey started with a quick look at the data set with the help of a visualization tool, AutoViz. Participants were thrilled with the tool.
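For readers who want to reproduce this first step, AutoViz can be driven in a couple of lines (a minimal sketch; the file name and target column are invented placeholders, not the competition data):

```python
# Minimal AutoViz sketch; file name and target column are placeholders.
from autoviz.AutoViz_Class import AutoViz_Class

AV = AutoViz_Class()
# Scans the file, samples it if large, and renders a battery of charts.
report = AV.AutoViz("mortgage_applications.csv", depVar="FUNDED")
```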

Then the data set was split along two paths: categorical variables, which went through an encoding step (transforming each category string into a numerical representation), and numerical variables, which were used to look at the performance of several baseline models.

The Q&A chat box showed questions about issues linked to the use of one-hot encoding (it expands the feature space).
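To illustrate the point raised in the chat, a small example (toy data, not the competition data set) of how one-hot encoding grows the feature space with the number of category values:

```python
import pandas as pd

df = pd.DataFrame({"city": ["Zurich", "Bern", "Geneva", "Zurich"],
                   "income": [80, 65, 72, 90]})

# One-hot encoding: one new binary column per category value.
encoded = pd.get_dummies(df, columns=["city"])
print(encoded.shape)  # (4, 4): the single 'city' column became three columns
```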

At this point we handled missing values, replacing them with a data imputation strategy instead of removing the affected rows. Dropping rows risks losing relevant information, which is why it is preferable to work with a complete data set; the participants agreed with this.
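A minimal sketch of such an imputation strategy, here with scikit-learn's SimpleImputer (column names are illustrative, not taken from the competition data):

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"loan_amount": [200, np.nan, 150, 310],
                   "property_value": [450, 380, np.nan, 600]})

# Replace missing values with the column median instead of dropping rows,
# so no observations (and no potentially relevant information) are lost.
imputer = SimpleImputer(strategy="median")
df[:] = imputer.fit_transform(df)
print(df)
```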

We then applied Exploratory Data Analysis to the data set, using bivariate analysis as feature selection to identify the relevant features. One of the questions was about the difference between PCA (which was designed for dimensionality reduction but is often used to create new features) and the approach followed here (which was used to select the features most predictive of the target variable).
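The contrast can be sketched in code: feature selection keeps a subset of the original columns, while PCA builds new ones (a hedged sketch on synthetic data, not the workshop data):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Feature-vs-target selection: keeps a subset of the original features.
selected = SelectKBest(mutual_info_classif, k=5).fit_transform(X, y)

# PCA: builds new, uncorrelated components as linear combinations of features.
components = PCA(n_components=5).fit_transform(X)

print(selected.shape, components.shape)  # both (500, 5), but different meaning
```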

Before moving to the last step, the handling of imbalanced classification, we managed outliers (extreme values that fall far away from the other observations). Logarithmic transformations were applied to correct the skewness of some variables, discretization was applied to mixture distributions, and a new numerical feature was created. As explained, feature engineering can sometimes be frustrating because it generates correlated features that need to be deleted in the preprocessing step, and business knowledge can play a significant role in the application of this methodology.
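As a rough illustration of these transformations (column names are invented for the example; the engineered product feature mirrors the one discussed further below):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"interest_rate": [2.1, 3.5, 1.9, 4.2],
                   "loan_to_value": [0.8, 0.6, 0.9, 0.7],
                   "property_value": [450, 380, 2200, 600]})

# Logarithmic transformation to reduce the right skew of a monetary variable.
df["log_property_value"] = np.log1p(df["property_value"])

# New engineered feature: product of interest rate and loan-to-value.
df["rate_x_ltv"] = df["interest_rate"] * df["loan_to_value"]

# Engineered features can be strongly correlated with their parents;
# such redundant columns may need to be dropped during preprocessing.
print(df.corr().round(2))
```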

In the last step we discussed some strategies for dealing with imbalanced classes in the classification task and applied some of the techniques listed below (a short code sketch follows the list).

·      Oversampling: randomly sample (with replacement) the minority class to reach the same size of the majority class.

·      Undersampling: randomly subset the majority class to reach the same size of the minority class.

·      SMOTE (Synthetic Minority Over-sampling Technique): an over-sampling method that creates synthetic samples from the minority class instead of creating copies from it.
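A minimal sketch of the three techniques with the imbalanced-learn package (synthetic data; the workshop's exact settings are not reproduced):

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import RandomOverSampler, SMOTE
from imblearn.under_sampling import RandomUnderSampler

# Synthetic, imbalanced binary problem (90% / 10%).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
print("original:", Counter(y))

for name, sampler in [("oversampling", RandomOverSampler(random_state=0)),
                      ("undersampling", RandomUnderSampler(random_state=0)),
                      ("SMOTE", SMOTE(random_state=0))]:
    X_res, y_res = sampler.fit_resample(X, y)
    print(name, Counter(y_res))
```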

In all the steps except the first one, we applied a modeling process to evaluate the impact of each action on the performance of the models. The attendees were immediately interested in which models we used: Logistic Regression, AdaBoost, Gradient Boosting Machine, Bagging, Random Forest and a Neural Network. With the oversampling method, the best models were the Gradient Boosting Machine and AdaBoost.
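The evaluation loop can be sketched roughly like this (a hedged sketch with scikit-learn default settings and synthetic data, not the workshop's exact configuration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, weights=[0.8, 0.2], random_state=0)

models = {"logistic": LogisticRegression(max_iter=1000),
          "adaboost": AdaBoostClassifier(),
          "gbm": GradientBoostingClassifier(),
          "bagging": BaggingClassifier(),
          "random_forest": RandomForestClassifier(),
          "neural_net": MLPClassifier(max_iter=500)}

# Re-run this evaluation after every preprocessing step to see its impact.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name:14s} AUC = {scores.mean():.3f}")
```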

From feature importance analysis using permutation, the feature best able to explain the target values was Property Value for almost all models; when using Shap values with the Gradient Boosting Machine, however, the most important feature was Payment Frequency, and specifically Monthly Payment. The participants’ curiosity focused on this feature because it was also ranked first in importance in the Logistic Regression; attention also turned to the feature created as the product of the interest rate and the loan-to-value ratio, which likewise showed importance.
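Both importance views can be obtained along these lines (a sketch with scikit-learn and the shap package; the data and model settings are placeholders, not the workshop notebook):

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier().fit(X_train, y_train)

# Permutation importance: drop in validation score when a feature is shuffled.
perm = permutation_importance(model, X_val, y_val, n_repeats=10, random_state=0)
print(perm.importances_mean)

# Shap values: per-prediction attribution from a tree explainer.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_val)
print(abs(shap_values).mean(axis=0))  # mean absolute impact per feature
```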

Look at the repository

NLPeasy – Harnessing the Power of Unstructured Data, Pre-conference Workshop 25.06.2020

At this year’s virtual pre-conference workshop, Philipp, Jacqueline, and Jürgen from D ONE got the chance to teach 20 participants about Natural Language Processing (NLP) with the Python package NLPeasy. Philipp wrote this package to make it easier for data scientists to get started with NLP. It is a wrapper around other NLP packages such as VaderSentiment and SpaCy, and it builds a bridge to Elasticsearch and Kibana. Elasticsearch is a document database, while Kibana is a dashboarding tool that can read data from Elasticsearch efficiently. The setup of Elasticsearch and Kibana is made simple by running them in Docker containers. This way, the package can be used without lengthy installations and is a great starting point for any data exploration journey that works with text data.

In this pre-conference workshop, D ONE taught participants the main methods used in NLP. Because of the virtual setup, it was hard to involve participants in discussions, which is why they chose to deliver the content in two modes. First, Philipp, who is trained in both mathematics and linguistics, explained the theoretical aspects in a classroom setting. Then the participants split into smaller groups and entered breakout rooms, where Jacqueline, Jürgen and Philipp helped them try out how NLP methods work in practice. For example, participants looked into how different words are represented by vectors, and they created and visualised syntax trees of different sentences.

Participants creating and visualising syntax trees of sentences during a breakout session.
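For readers who want to try similar exercises, a minimal sketch with SpaCy and VaderSentiment (the example sentence is taken from the dataset described below; note that the small English model only approximates word vectors):

```python
import spacy
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Small English model; run `python -m spacy download en_core_web_sm` once.
nlp = spacy.load("en_core_web_sm")
doc = nlp("What are you doing on a typical Friday night?")

# Dependency (syntax) tree: each token points to its syntactic head.
for token in doc:
    print(f"{token.text:10s} --{token.dep_}--> {token.head.text}")

# Vector for a single token (the small model only approximates word vectors).
print(doc[7].vector[:5])

# Rule-based sentiment scores with VaderSentiment.
analyzer = SentimentIntensityAnalyzer()
print(analyzer.polarity_scores("I love hiking and good food!"))
```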

D ONE did not want participants to get lost during the workshop, so they tried to limit technical problems as much as possible. Their approach is highly recommendable: D ONE hosted VMs on a Binder Hub, so that participants did not have to follow lengthy installation protocols before the workshop. With one link, people could connect to the hub, where their own machine was instantiated, and off they went with NLP! For those interested, the NLPeasy tutorial can also be found on D ONE’s GitHub account.

After covering NLP basics, participants explored how NLPeasy can be set up in a first tutorial. To this end, D ONE used a freely available, anonymised dataset from a dating website. It included text answers to profile sections of the app, such as “Describe yourself” and “What are you doing on a typical Friday night?”, as well as more structured information like city, languages, and age. The participants did some basic feature engineering and then defined the pipeline steps they wanted to include, such as the text columns to use for sentiment analysis. Then they ran the enrichment on the dataset. Finally, they loaded the enriched dataset into Elasticsearch, and a basic Kibana dashboard was created automatically, all using NLPeasy’s out-of-the-box functions.

An automatically generated Kibana-dashboard from the dating app dataset.

Overall, the NLPeasy workshop was a great experience for both participants and hosts. With the workshop being held remotely, people from further away could participate, for example one participant who joined from King Abdulaziz University in Saudi Arabia. It was fun to have such a diverse audience. Participants walked out of the workshop equipped with a new toolset, no longer afraid of textual data analysis but ready to dive right into it. D ONE will host this workshop again in the future, be it on site or remotely. For interested readers, D ONE is also happy to tailor an introductory NLP course to your company’s needs, so feel free to reach out any time!

A.I. Use-Case Talk Series, 17.06.2020

Because of the exceptional situation, industry, academic and individual members of the Swiss Alliance for Data-Intensive Services joined the A.I. Use-Case Talk on the 17th of June 2020 in an online format for the very first time.

The Use-Case Talk Series allows participants to enjoy in-depth technical discussions and exchange information about interesting technical challenges amongst experts.

This time three industry experts and numerous participants took part in the Use-Case Talk to share stories and insights about frameworks, best practices and tools in data science.

The first Use-Case was presented by Christian Kindler, Full Stack Data Scientist at Valdon Mesh GmbH. Our participants learned how to Trade Options with A.I. Methods. Christian explained the Auto-Regressive Feed Forward Neural Network and demonstrated how A.I. can be used for trading.

Our second speaker, Achim Kohli, Co-Founder and CEO of legal-i, presented a live demo and participants gained insight into the company’s tool, which allows insurance lawyers to become 10x faster with the help of A.I.

Our last speaker of the night was Mark Schuster, Channel Manager at UiPath Switzerland GmbH. Mark explained how to apply A.I. to RPA workflows in minutes and provided interesting insights into Digital Claims and Voice Enabled Travel Systems.

Following the presentations, our speakers answered several questions from the participants. In an interesting Q&A session, ideas, challenges and information were exchanged among the industry and academic experts.

This first online version of the Use-Case Talk Series was a success. However, we hope that we are able to host the Use-Case Talk Series in Technopark Zurich again in the near future so that participants have the opportunity to network and connect with other industry specialists at our sponsored Use-Case Talk Apéro.

The Use-Case Talks are part of a series that takes place three times a year. If you are interested in sharing your A.I. stories and discussing them with other industry members, you are warmly welcome to join us for our next Use-Case Talk taking place on the 21st of October 2020. If you are interested in presenting a Use-Case, please contact us by e-mail (info@aspaara.com).

The Use-Case Talk Series is organized by Aspaara Algorithmic Solutions AG on behalf of Swiss Alliance for Data-Intensive Services.

We look forward to seeing you soon!

Blockchain Technology in Supply Chain Management Expert Group Meeting, 18.06.2020

On Thursday, 18th June 2020, we held the 9th meeting of the Expert Group “Blockchain Technology in Supply Chain Management”. The virtual meeting was dedicated to the concept and application of ‘Self-Sovereign Identities’ in the digital domain.

In the first part of the meeting, Martin Fabini, CTO of ti&m, provided us with a short overview of current concepts, developments and applied use cases with regard to SSIs. He explained how we have already moved from ‘centralized identities’ to ‘federated identities’ in our daily use of digital applications. However, he pointed out that we will need to develop a more user-centric approach, a fully ‘self-sovereign identity’ (SSI), to gain full control over our digital identities. The SSI building blocks and their technological solutions are currently a hot topic discussed in various consortia and foundations. A crucial part of the whole SSI concept is a ‘decentralized identifier’ (DID) based on a decentralized trusted infrastructure provided by distributed ledger technologies (‘blockchain’).

In the second part of the meeting we discussed, in virtual breakout rooms, the possible business cases as well as the legal, technological, and business issues with SSI. In the discussion, we saw that there are many potential use cases and that some group members have already worked on some concrete pilots. However, the development of globally accepted standards is needed in order to overcome current legal and technological issues and to fully develop interoperable SSI concepts. Only a common SSI solution stack with compatible protocols will unlock the enormous business potential of SSIs.

Unfortunately, this time the meeting concluded without our usual apéro, for obvious reasons. For our next meeting, planned for September, we hope to revive this tradition and meet in person for an after-meeting networking apéro to exchange further ideas and contacts.

Spatial Data Analytics Expert Group Meeting, 04.06.2020

The 7th meeting of the spatial data analytics expert group was held online on June 4th 2020. The topic of the meeting was geovisualization. This was the first meeting kindly hosted by a meeting chair: Simon Würsten from SBB took the lead in defining the schedule and choosing the two talks.

Kevin Lang (SBB) presented the talk “Visualisation of accessibility at SBB”. It became clear that he had been confronted with some well-known mapping problems, and he presented nice solutions developed by his team. The talk generated questions and discussions.

Ralf Mauerhofer (Koboldgames) talked about “Murgame.ch – Natural hazard prevention by means of an online game” and gave some behind-the-scenes insights into the development of a game.

As usual, the participants had the opportunity to interact and discuss, and part of the meeting was conducted in breakout rooms for group discussions. After the meeting, the participants gave arguments for and against online meetings:

In theory it is possible to reach a larger audience with online meetings, because of the convenience of not having to spend any time traveling. In practice, however, the number of participants (19) was similar to previous in-person meetings. Sadly, some participants could not connect due to internet security policies in their private networks. A nice feature of the online format was the chat box, where participants could ask questions even during the presentations; it was also used to share links and suggestions. After the meeting there was an informal discussion in a smaller group. We are sure that the group would have been larger with the prospect of an apéro and a beer…

As a result we can announce: the 8th meeting of the spatial data analytics expert group will be a physical meeting. The topic is “Applied sampling strategies” and Madlene Nussbaum will be our meeting chair. The meeting will take place on Thursday, August 27th 2020.

CAS Data Product Design / Smart Service Engineering Course 04-05.06.2020

Value creation through data is a very current topic of utmost importance. In practical applications, however, it is often not sufficiently clear, or even unknown, how value can be created for businesses and their customers. The CAS Data Product Design / Smart Service Engineering course offers practical solutions and concrete options for how companies can easily generate service value from data. In four modules, participants acquire methodological knowledge of data-driven value creation and value capture (including data-driven business models) as well as of questions of data ethics, data protection and data security.

In the course, new methods are taught and applied directly to a continuous case study in numerous iterations. The participants form small groups and, at the beginning of the course, choose a problem for which they develop a data-driven service during the course. During the two-day practical workshop at the Mobiliar Forum Thun at the beginning of June, we developed service ecosystems in intensive iterations to sharpen data-driven value propositions, including testing with potential users. This year’s special feature: the entire workshop took place online, which worked very well. Many thanks to Ina Goller for guiding us through the workshop and pushing our cases forward; we learned a lot. Many thanks also to Fabio Rovelli, managing director of the Mobiliar Forum Thun, for making this workshop possible.

Starting this year, the participants will publish their cases in short papers, summarized in an eBook. More information on this will be available over the summer, e.g. on the website of the Swiss Alliance for Data-Intensive Services.

And an interesting new option: CAS Data Product Design is part of the new MAS Industry 4.0 under the name CAS Smart Service Engineering, thus opening new doors and perspectives for graduates. Please note that it has already been part of the existing MAS Data Science.