Open Analytics has an in-depth understanding of, and practical experience with, every step of the data analysis process and organizes its services along four pillars.
Statistical Consulting
Open Analytics employs a team of PhD-level statisticians and machine learning specialists who are trained to translate customer questions into an appropriate analysis methodology. This can involve the design of experiments or sample surveys, the analysis of data, and the development of novel methods for data whose specific characteristics call for a carefully chosen approach to answer the right questions. In terms of specialization, the team covers a broad range of techniques, including
- traditional statistical modeling and simulation techniques
- time series analysis and forecasting
- machine learning and artificial intelligence
- data mining and text mining
- digital signal processing
- image processing and analysis
The application domains in our consultancy projects have been very diverse and include
- analysis of high-dimensional biological data (omics)
- modeling of ecological and environmental data
- health monitoring and IoT data
- Six Sigma, statistical quality and process control
- analysis of marketing and financial data
A sample of projects is described in our case studies. Drawing on our extensive experience, we also offer training in data analysis and in specific statistical or machine learning methodologies.
Scientific Programming
Once appropriate methodologies for analyzing the data have been identified, software implementations are needed to perform the actual analysis. Implementations are often readily available, but when a methodology needs tuning, applying the relevant modifications may require expert programming skills. Open Analytics has a skilled team of software engineers who combine strong methodological skills with solid algorithmic know-how to bring scientific programming problems to a successful conclusion. In some cases, research papers describe methodologies for which no implementation is readily available; Open Analytics has the depth of skills and experience to implement such methods straight from the research paper. Another common scenario in which Open Analytics is called in is code that does not run efficiently enough for the customer's purposes. In that case, Open Analytics will review and optimize the code, or parallelize parts of it, to speed up execution.
When implementation in a high-level language is an option, Open Analytics will develop solutions using R, Python or Julia. On certain projects, however, code is developed in low-level languages (C, C++, Fortran) or in other languages appropriate for the use case (Scala, Java).
Examples of scientific programming tasks are:
- implement very fast methods for non-negative matrix factorization
- extend an implementation of the SAEM algorithm to take into account censored data
- port simulation code for optimal design of PK experiments to C++
- fine-tune and speed up the implementation of an MRMR-based feature selection method
- run advanced dimension reduction methods in-database
A sample of projects is described in our case studies, and Open Analytics provides training in common data science languages including R, Python and Julia.
Application Development and Integration
Software for data science applications has specific requirements and may need specific technology to bring the results of analysis algorithms conveniently to end users or to integrate them with other components in a given software landscape. Open Analytics typically works on
- architecture for data analysis solutions (on-premises, hosted, cloud-based)
- design of data stores and big data infrastructure for scientific applications
- development of data science APIs, e.g. for automated analyses
- development of custom applications for specific data analysis problems (desktop applications such as Phaedra and web applications)
- automation of statistical analyses or predictive modeling
- development of data science tooling (e.g. Architect, ShinyProxy, R Service Bus, see our products)
A sample of projects is described in our case studies, and Open Analytics provides training in common frameworks for data science products (e.g. Shiny development) and in data science tooling (e.g. ShinyProxy or the R Service Bus).
Data Analysis Hardware and Hosting
Customers trust Open Analytics to run outsourced components of their data analysis infrastructure or to host data science platforms in our data centers. In line with industry best practices, we offer
- hosting of web applications for data analysis
- secure and scalable hosting of Shiny applications
- hosting of data science APIs
- hosting of data science collaboration and notebook servers
- design of custom hardware configurations and managed servers for data analysis
Our hosting platforms have been validated for use in regulatory contexts (e.g. the pharmaceutical industry or banking). A sample of projects that highlight our competence and experience is described in the case studies.