
Thomas K John

AI/ML Cloud Engineer


Professional Summary

Machine Learning Engineer with 7+ years of experience delivering AI/ML solutions across on-premise and cloud environments, spanning e-commerce, manufacturing, and enterprise IT. Skilled in building scalable data pipelines and machine learning models using Python, PySpark, and SQL on platforms such as Databricks, Amazon Bedrock, and Dataiku DSS.

Proven ability to drive business impact through:

  • Designing and deploying ML workloads across a broad range of AWS services, including compute, storage, networking, and ML tooling, with infrastructure-as-code automation in Terraform.
  • Demonstrating real-world AWS ML expertise through a live, cloud-hosted portfolio.

Open to global remote roles focused on cloud solution design, migration, and architecture, with a strong foundation in ML integration.

Work Experience

  • Designed a feature-rich RAG chatbot leveraging Databricks Mosaic AI, with multilingual support and image input handling to enhance user experience and precision in knowledge retrieval.
Key Skills: Python, Databricks Mosaic AI, Generative AI, Large Language Models (LLMs)

Client: Solute GmbH - a Germany-based technology company known for developing digital commerce solutions, including the popular price comparison platform billiger.de.

  • Project 1. Traffic Partner Blacklisting

    • Spearheaded a data-driven initiative to enhance online shop performance by improving traffic quality and profitability.
    • Designed and implemented an analytics pipeline using Python and PySpark for large-scale traffic and profit analysis.
    • Defined performance-based blacklisting criteria, significantly improving return on ad spend (ROAS).
    • Ensured quality and transparency through peer reviews, Confluence documentation, and Jira-based workflow.
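The blacklisting logic described above can be illustrated with a minimal sketch. The threshold values, field names, and the minimum-clicks guard are assumptions for illustration; the production pipeline was built in PySpark on much larger data.

```python
def roas(revenue: float, ad_spend: float) -> float:
    """Return on ad spend; traffic with zero spend is treated as infinitely profitable."""
    if ad_spend == 0:
        return float("inf")
    return revenue / ad_spend

def blacklist(partners, min_roas=1.0, min_clicks=500):
    """Flag traffic partners whose ROAS falls below `min_roas`.

    Partners with fewer than `min_clicks` clicks are skipped, since their
    ROAS estimate is too noisy to justify blacklisting. Thresholds here
    are illustrative, not the criteria actually used.
    """
    flagged = []
    for p in partners:
        if p["clicks"] >= min_clicks and roas(p["revenue"], p["ad_spend"]) < min_roas:
            flagged.append(p["partner_id"])
    return flagged
```

In the real pipeline the same rule would be expressed as a PySpark aggregation over click and revenue logs, grouped by partner.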
  • Project 2. Reporting Tool Evaluation

    • Led a 3-month strategic initiative to evaluate the optimal reporting and BI tool for future cloud-based integration and analytics modernization.
    • Conducted an in-depth analysis of Looker, Power BI, and the existing on-premise cube-based reporting tool.
    • Evaluated tools based on criteria such as web-based development, version control, collaboration features, self-service capabilities for business users, and performance.
    • Collaborated with cross-functional teams to gather requirements and understand current and future reporting needs.
    • Developed and executed test cases to compare performance, usability, and integration capabilities of each tool.
    • Presented findings and recommendations to senior management, highlighting the pros and cons of each solution and aligning them with strategic goals.
Key Skills:
  • Programming Languages: Python and PySpark (Apache Spark for Python)
  • Data Analysis: SQL, Hive and Spark SQL
  • Data Science Platform: Databricks
  • BI Reporting Tools: Google Looker, Microsoft Power BI, icCube
  • Optimization: Mixed Integer Linear Programming

Client: Morgan Stanley (a US-based global investment bank and financial services company)

  • Led financial data analysis using Python, Hadoop, Hive, and SQL to generate accurate forecasts.
  • Utilized Apache Spark for large-scale data processing, delivering timely and actionable insights.
  • Designed and maintained data pipelines using Dataiku DSS for efficient data preparation and transformation.
  • Migrated legacy data science workflows from SAS to Python, improving scalability and flexibility.
  • Trained team members on Apache Spark and Dataiku DSS, fostering knowledge sharing and consistency.
  • Applied agile methodologies using Jira to manage project timelines and cross-functional collaboration.
  • Presented key analytical findings and financial forecasts via Tableau dashboards to business stakeholders.
Key Skills:
  • Programming Languages: Python and PySpark (Apache Spark for Python)
  • Data Analysis: SQL, Hive and Spark SQL
  • Data Science Platform: Dataiku DSS
  • Machine Learning and Deep Learning: Supervised and Unsupervised Learning, Financial Forecasting, Model Training & Evaluation, Feature Engineering, Hyperparameter Tuning
  • Reporting Tool: Tableau
  • Project 1. Semiconductor Wafer Defect Signature Analysis

    • Developed a computer vision model to analyze semiconductor wafer images, accurately detecting defect patterns such as circles, lines, and blobs to assist in quality control.
  • Project 2. AI-based approach for MRO Optimization and Similar Parts Detection

    • Designed an AI-based solution for MRO (Maintenance, Repair, and Operations) optimization, identifying and consolidating duplicate spare parts from an inventory of over one million items, significantly improving inventory efficiency and part approval workflows.
  • Project 3. Monitoring failures in a Hadoop cluster through log analysis

    • Built a resource usage monitoring system for YARN-based applications and developed statistical analysis models using Apache Spark ML.
Key Skills:
  • Programming Languages: R, Python and PySpark (Apache Spark for Python)
  • Data Analysis: SQL, Hive and Spark SQL
  • Data Science Platforms: Databricks, Dataiku DSS
  • Machine Learning and Deep Learning: Supervised and Unsupervised Learning, Model Training & Evaluation, Feature Engineering, Hyperparameter Tuning, Deep Learning Architectures (CNNs, RNNs, Transformers)
  • Reporting Tool: Tableau
  • ML/AI Frameworks & Libraries: OpenCV, TensorFlow, Keras, Scikit-learn
  • Developed a network controller to enhance traffic monitoring and analysis efficiency.
  • Implemented a rule-based classification algorithm to automatically prioritize ACL rules for traffic redirection.
  • Improved monitoring service accuracy and responsiveness, receiving excellent feedback from customers.
Key Skills: Java/J2EE, Linux, Database Replication, SDLC, SQL, Telecommunications NMS

Client: ADVA Optical Networking, Bangalore, India

  • Developed an RPM-based ENC installer supporting fresh installations, high-availability (HA) setups, upgrades/downgrades, and patch management.
  • Migrated build infrastructure from ANT to Maven.
  • Automated CI/CD pipelines using Jenkins, streamlining build and deployment processes.
  • Implemented key features including feature-based licensing and auto-restoration of database replication.
Key Skills: Java, Maven, Jenkins, RPM, Linux, Database Replication, SDLC
  • Worked on an application framework for developing hybrid mobile applications across platforms including Android, iOS, and Windows Phone.
  • Contributed to cross-platform development efforts, ensuring code reusability and consistent UI/UX across devices.
Key Skills: Java, SDLC, SQL

Contributed to the development and support of a Network Management System (NMS) for configuring and managing telecom devices. Worked on backend modules using Java, Hibernate, and JMS, and gained experience in asynchronous messaging, SNMP-based communication, and enterprise system integration.

Key Skills: Java, Hibernate, EJB, JMS, SNMP, SQL, VMware vSphere, SDLC, Telecommunications NMS

Client: Alcatel-Lucent (France-based telecom company)

Developed and maintained large-scale Java-based OSS applications for telecom service provisioning. Gained hands-on experience with system design, API integrations (SOAP/XML), and test-driven development in an enterprise environment.

Key Skills: Core Java, SQL, JUnit, SDLC, Telecommunications NMS, Linux

Cloud & AI Engineering Projects

Cloud Resume Challenge - AWS Serverless Portfolio Project with Terraform

Built a fully serverless personal website on AWS, demonstrating real-world use of cloud-native architecture with AWS services (S3, CloudFront, Route 53, ACM, Lambda, DynamoDB, and API Gateway) and automated deployment using Terraform.

  • Static Website Hosting with Amazon S3
  • Global Content Delivery via CloudFront
  • Custom Domain and SSL with Route 53 and ACM
  • Visitor Counter API powered by API Gateway, AWS Lambda, and Amazon DynamoDB
  • Fully Automated Infrastructure using Terraform
  • CI/CD Integration with AWS CodePipeline
GitHub: Backend with Terraform
GitHub: Frontend (HTML/CSS/JS)
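The visitor-counter API above can be sketched as a small Lambda handler that atomically increments a DynamoDB item. The table name, key schema, and handler wiring are assumptions for illustration, not the project's actual code; the counter logic takes the table as a parameter so it can be exercised without AWS.

```python
import json

def bump_visitor_count(table, visitor_id="site"):
    """Atomically increment and return the visitor count.

    `table` is a boto3 DynamoDB Table resource (or any object exposing the
    same update_item keyword interface). ADD creates the attribute on first
    use, so no seed item is required.
    """
    resp = table.update_item(
        Key={"id": visitor_id},
        UpdateExpression="ADD visits :one",
        ExpressionAttributeValues={":one": 1},
        ReturnValues="UPDATED_NEW",
    )
    return int(resp["Attributes"]["visits"])

def lambda_handler(event, context, table=None):
    """API Gateway entry point returning the updated count as JSON."""
    if table is None:  # resolved lazily so the module imports without AWS
        import boto3
        table = boto3.resource("dynamodb").Table("visitor-counter")  # assumed name
    count = bump_visitor_count(table)
    return {
        "statusCode": 200,
        "headers": {"Access-Control-Allow-Origin": "*"},
        "body": json.dumps({"count": count}),
    }
```

The atomic ADD expression avoids a read-modify-write race when multiple visitors hit the API concurrently.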

AWS AI Code Generator with Bedrock, Lambda, API Gateway, S3, and Terraform

This project demonstrates a serverless AI-powered code generation service using Amazon Bedrock (Titan model) integrated with AWS Lambda and exposed via API Gateway.

  • Amazon API Gateway: Provides a REST API endpoint for external requests.
  • AWS Lambda: Executes the Python handler that processes incoming requests.
  • Amazon Bedrock: Powers the generative AI functionality using the amazon.titan-text-premier-v1 foundation model.
  • Amazon S3: Stores generated code files with proper extensions (.py, .java, .js, .sql, etc.)
  • Terraform: Automates provisioning of S3, Lambda, IAM roles, and API Gateway resources.
GitHub: Source Code
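The request/response plumbing for the Titan model can be sketched as a few pure helpers. The extension map, prompt parameters, and S3 key layout here are assumptions for illustration; only the general Titan text request and response shapes follow the Bedrock API.

```python
import json

# File extensions the service assigns to generated code; the exact mapping
# used by the project is assumed, not taken from its source.
EXTENSIONS = {"python": ".py", "java": ".java", "javascript": ".js", "sql": ".sql"}

def build_titan_body(prompt: str, max_tokens: int = 1024) -> str:
    """Request body for bedrock-runtime invoke_model with a Titan text model."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "maxTokenCount": max_tokens,
            "temperature": 0.2,  # low temperature keeps generated code deterministic-ish
        },
    })

def parse_titan_response(raw: bytes) -> str:
    """Extract generated text from a Titan invoke_model response body."""
    payload = json.loads(raw)
    return payload["results"][0]["outputText"]

def object_key(language: str, request_id: str) -> str:
    """S3 key for a generated file, falling back to .txt for unknown languages."""
    return f"generated/{request_id}{EXTENSIONS.get(language.lower(), '.txt')}"
```

In the Lambda, `build_titan_body` would feed `bedrock_runtime.invoke_model(...)`, and the parsed output would be written to S3 under `object_key(...)`.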

Education

Bachelor of Technology (B.Tech), Electronics & Communication Engineering
Rajiv Gandhi Institute of Technology (RIT), Kottayam
Affiliated to Mahatma Gandhi University, Kerala
2004 - 2008

Certifications


AWS Certified Machine Learning Specialty

Issuer: Amazon Web Services (AWS)

Issued: Jan, 2025

Expires: Jan, 2028

Credential ID: 1d787f0ad76048e2a278e1bb98a574ea


AWS Certified Solutions Architect Associate

Issuer: Amazon Web Services (AWS)

Issued: May, 2025

Expires: May, 2028

Credential ID: d19029d4a4144b9ca9251abcf37e7eaf


AWS Certified SysOps Administrator Associate

Issuer: Amazon Web Services (AWS)

Issued: Jun, 2025

Expires: Jun, 2028

Credential ID: dd3ff3a042344b05aa6cbbf9b3bb3690


AWS Certified Cloud Practitioner

Issuer: Amazon Web Services (AWS)

Issued: Jun, 2019

Expired: Jun, 2022

Credential ID: FXLNNWZ1BNF1QCG2


HashiCorp Certified: Terraform Associate (003)

Issuer: HashiCorp

Issued: Jun, 2025

Expires: Jun, 2027

Credential ID: ef81560b8b3b4d68a91a1896e81c201e

This site uses a lightweight visitor counter with no tracking or cookies.