A data science service for smart business insight
Data science is increasingly important across domains such as computer vision, e-commerce, healthcare, smart energy, business, social media, insurance, and research & development.
Data science projects require multiple competences: programming, statistics and machine learning, as well as project management and communication. Several questions need to be asked when starting a project:
- Do you need small, large or big data, and do they exist or do they need to be collected?
- Are the data annotated?
- Are the data expected to grow quickly in the near future?
- Are real-time analytics needed?
- To what degree should the analytics be automated?
A project can often be captured with the following flow-chart:
A few examples of services provided:
- Computer vision, e.g. automated image tagging
- Sales volume forecasting using machine learning
- Deployment of dashboards
- Web scraping
- Natural language processing (NLP)
- Sentiment analysis
Automated image tagging
A system can be trained to perform binary classification (a single yes/no decision per image) or multi-label classification (several tags per image). In practice, automated tagging can serve different use cases.
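To make the difference between binary and multi-label tagging concrete, here is a minimal sketch of the decision step only. It assumes a trained model has already produced one raw score (logit) per tag. The tag names, the scores and the helper names `tag_image` and `is_positive` are hypothetical and chosen for illustration; they do not come from a specific project or library.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw model score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def is_positive(logit: float, threshold: float = 0.5) -> bool:
    """Binary classification: a single yes/no decision for one tag."""
    return sigmoid(logit) >= threshold


def tag_image(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Multi-label classification: every tag is decided independently,
    so an image can receive zero, one or several tags."""
    return [tag for tag, logit in logits.items() if is_positive(logit, threshold)]


# Hypothetical scores for one image, as a trained model might output them.
scores = {"beach": 2.1, "people": 0.3, "dog": -1.7, "sunset": -0.2}

print(tag_image(scores))                 # tags kept at the default threshold
print(tag_image(scores, threshold=0.4))  # a lower threshold keeps more tags
```

The threshold is a design choice: lowering it yields more tags per image at the cost of more false positives, so it is usually tuned on annotated validation data.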
Data science toolbox
The following tools have been used in previous projects. Every project has its own specific needs, and new tools appear rapidly, so please feel free to suggest your own.
| Activity | Tool |
| --- | --- |
| Data & image processing | Python (pandas, NumPy, SciPy, PIL, OpenCV, Albumentations), R, MATLAB |
| Machine learning | Scikit-learn (logistic regression, random forest, gradient boosting), ARIMA |
| Deep learning | Keras with TensorFlow backend. Image classification and regression: CNNs (from basic architectures to more complex ones such as EfficientNet), Swin Transformer. Object detection: YOLO, R-CNN. Time-series forecasting: LSTM |
| Big data, databases and ETL | Spark, Dask, H2O, PostgreSQL |
| Visualization | Plotly, Matplotlib, Bokeh, ggplot, Seaborn |
| Dashboards | Python Dash, R Shiny, Klipfolio, Tableau, Grafana |
| Cloud | Amazon Web Services (AWS), SageMaker (AWS), Google Cloud Platform, Hidora |
| OS & development platforms | Linux, Docker, GitHub |
| Communication | Jupyter Notebook, RStudio |
Images were taken from Unsplash and credits go to the following persons: