Workflow Element Store

  1. Flat files
  2. Web Scraping
  3. Data Collaboration and Partnerships
  4. APIs and Data Feeds
  5. Databases - NoSQL
  6. Feedback Data
  7. Mobile Applications or IoT Applications
  8. Public Datasets
  9. Surveys and Questionnaires
  10. Experiments (DoE)
  11. Databases - SQL
  1. AWS Kinesis
  2. Azure Data Factory (ADF)
  3. MySQL
  4. GCP Data Fusion
  5. AWS S3
  6. Azure Stream Analytics
  7. Azure Synapse
  8. GCP Dataflow
  9. GCP BigQuery
  10. ETL/ELT pipeline
  11. GCS
  12. MongoDB
  13. Apache Kafka
  14. MS SQL Server
  15. AWS Redshift
  16. RDBMS
  17. Azure Blob Storage
  18. AWS Glue
  19. AWS RDS
  20. Oracle DB
  21. PostgreSQL
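The "ETL/ELT pipeline" item above can be illustrated with a minimal sketch. It assumes a flat-file source and uses Python's built-in `sqlite3` as a stand-in for any of the SQL databases listed; the `orders` table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract; a real pipeline would read from a landing zone.
RAW_CSV = """order_id,amount,currency
1,10.50,usd
2,,usd
3,7.25,eur
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with no amount, cast types, upper-case currency codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip records with a missing amount
        cleaned.append((int(row["order_id"]), float(row["amount"]), row["currency"].upper()))
    return cleaned

def load(conn, records):
    """Load: write the cleaned records into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

# In-memory SQLite stands in for MySQL, PostgreSQL, or another RDBMS.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, amount, currency FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

In an ELT variant the raw rows would be loaded first and the transform step would run inside the warehouse instead.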
  1. Augmentation
  2. Feature Selection
  3. Polynomial Features
  4. Annotation
  5. Binning / Discretization
  6. Data Transformations
  7. Data Partitioning - Train, Validation, & Test
  8. Auto-Preprocessing libraries
  9. Feature Extraction from Images
  10. Handling Noisy Data
  11. AutoEDA libraries
  12. Handling Categorical Data
  13. Dealing with Outliers
  14. Data Scaling and Normalization
  15. Textual Feature Extraction
  16. Time-Based Features
  17. Interaction Features
  18. Handling Imbalanced Classes
  19. Domain-Specific Feature Engineering
  20. Handling Time-Series Data
  21. Handling Missing Data
  22. Dimensionality Reduction
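Three of the preprocessing elements above (handling missing data, data scaling, and train/validation/test partitioning) can be sketched in plain Python. This is a minimal illustration, not a replacement for a preprocessing library; the 70/15/15 split, the seed, and the sample `ages` column are assumptions.

```python
import random
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def partition(rows, train=0.7, validation=0.15, seed=42):
    """Shuffle and split rows into train, validation, and test subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    # round() avoids losing a row to floating-point truncation
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )

ages = [20, None, 40, 60]  # hypothetical feature column with one missing value
scaled = min_max_scale(impute_mean(ages))
train_rows, val_rows, test_rows = partition(list(range(20)))
print(scaled, len(train_rows), len(val_rows), len(test_rows))
```

In practice the imputation value and the scaling range would be computed on the training subset only and then reused for validation, test, and inference data, so that no information leaks from held-out rows.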
  1. Learning Rate Scheduling
  2. Association Rules
  3. Data Augmentation
  4. Forecasting Techniques
  5. Ensemble Techniques
  6. Reinforcement Learning
  7. Cross-Validation
  8. Network Analytics / Geospatial Analytics
  9. External Validation
  10. Clustering
  11. Batch Size Selection
  12. Model Interpretability
  13. Regularization
  14. Natural Language Processing
  15. Weight Initialization
  16. Binary Classification Techniques
  17. Model Comparison
  18. Hyperparameter Tuning
  19. Regular Monitoring and Logging
  20. Transfer Learning
  21. Early Stopping
  22. Performance Visualization
  23. Regularization Techniques
  24. Evaluation Metrics
  25. Word Embeddings
  26. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  27. Recommendation Engine
  28. Batch Normalization
  29. Regression Analysis
  30. Black-box - Neural Network Models
  31. AutoML
  32. Multiclass Classification Techniques
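Cross-validation, listed among the modelling elements above, amounts to rotating which slice of the data is held out for validation. The sketch below generates the index splits for plain k-fold cross-validation without shuffling or stratification; the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

Each sample lands in exactly one validation fold, so averaging a metric over the folds uses every row for validation once. For time-series forecasting a time-ordered split would usually be preferred over this scheme.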
  1. Code Repository
  2. Model Registry
  3. Databases
  4. Data Preprocessing Pipeline Models
  5. Data Warehouse
  1. Streamlit
  2. Containerization
  3. Prediction Logging
  4. Feedback Collection
  5. Data Drift Monitoring
  6. Concept Drift Detection
  7. Model Versioning
  8. Flask
  9. Performance Metrics
  10. Serverless Computing
  11. Model Serialization
  12. Edge Deployment
  13. Model Drift
  14. Alerting and Notification
  15. Bias and Fairness Assessment
  16. Cloud Deployment
  17. Model Health Monitoring
  18. FastAPI
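Data drift monitoring, one of the elements above, is often done by comparing the distribution of a feature at inference time against its distribution at training time. The sketch below uses the Population Stability Index (PSI) as one possible drift score; the choice of PSI, the equal-width binning, the bin count, and the `1e-6` floor are assumptions made for illustration.

```python
import math

def population_stability_index(reference, current, n_bins=5):
    """PSI between two numeric samples, using equal-width bins over the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant reference

    def bin_fractions(sample):
        counts = [0] * n_bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(sample), 1e-6) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))

training_feature = list(range(100))               # hypothetical training-time values
shifted_feature = [x + 50 for x in range(100)]    # hypothetical inference-time values

psi_same = population_stability_index(training_feature, training_feature)
psi_shifted = population_stability_index(training_feature, shifted_feature)
print(psi_same, psi_shifted)
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but a suitable alerting threshold depends on the feature and the application.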
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
Data Collection

Inference API

API Stream

Web crawler (Python, Selenium)
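The parsing step of a web crawler can be sketched with the standard library alone. This example is an assumption-laden illustration: it does not drive a browser the way Selenium does, it only extracts links from HTML that has already been fetched, and the `PAGE` markup and URLs are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkExtractor(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

# Hypothetical page content; a crawler would obtain this from an HTTP response
# or from a browser session.
PAGE = (
    '<html><body>'
    '<a href="/data/1.csv">one</a> '
    '<a href="https://example.org/2.csv">two</a>'
    '</body></html>'
)

extractor = LinkExtractor("https://example.com/index.html")
extractor.feed(PAGE)
links = extractor.links
print(links)
```

The collected links would then be queued for download and passed on to the data ingestion stage.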

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features
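The distinction between base and derived features can be illustrated with time-based features, which are computed from a raw timestamp rather than collected directly. The record layout (`timestamp`, `sales`) and the chosen derived fields are hypothetical.

```python
from datetime import datetime

def add_time_features(record):
    """Return a copy of the record with features derived from its ISO timestamp."""
    ts = datetime.fromisoformat(record["timestamp"])
    derived = {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),      # Monday == 0
        "is_weekend": ts.weekday() >= 5,  # Saturday or Sunday
        "month": ts.month,
    }
    return {**record, **derived}

base = {"timestamp": "2024-03-16T14:30:00", "sales": 120.0}  # base features
enriched = add_time_features(base)                           # base + derived features
print(enriched)
```

The same function would need to run in the inference pipeline so that the model sees identical features at training and prediction time.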

Data Training & Modelling

Inference Pipeline
Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference

Inference pickle
Inference Joblib
Streamlit
Inference API
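The hand-off between the two pipelines through a pickle artifact can be sketched as follows. The toy "model" is a plain dictionary holding a fitted mean, and the file name `model.pkl` is hypothetical; a real model object would be saved and loaded with the same dump/load pattern, and joblib offers an analogous `joblib.dump` / `joblib.load` pair.

```python
import pickle
import tempfile
from pathlib import Path

def fit_mean_forecaster(targets):
    """'Train' a toy model: the fitted state is just the mean of the targets."""
    return {"model_type": "mean_forecaster", "mean": sum(targets) / len(targets)}

def predict(model_state, n_steps):
    """Forecast n_steps values from the fitted state."""
    return [model_state["mean"]] * n_steps

with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.pkl"  # hypothetical artifact path

    # Training pipeline: fit and serialize.
    model_state = fit_mean_forecaster([10.0, 12.0, 14.0])
    with artifact.open("wb") as fh:
        pickle.dump(model_state, fh)

    # Inference pipeline: deserialize and predict.
    # Only unpickle files from a trusted source; loading can execute code.
    with artifact.open("rb") as fh:
        restored = pickle.load(fh)

forecast = predict(restored, 3)
print(forecast)
```

A Streamlit page or an inference API endpoint would typically load the artifact once at start-up and call the prediction function per request.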