
AI/ML Engineering Protocol

Standard operating procedures for AI/ML Engineering at Moustachir Com

1. INTRODUCTION

This document defines the standard operating procedures for AI/ML Engineering at Moustachir Com. It applies to all AI/ML Engineers, Data Scientists, and ML Ops specialists, whether internal team members or external partners.

For general engineering standards (Git Flow, Testing Strategy, Code Quality), refer to the General Engineering Protocol.


2. ONBOARDING CHECKLIST

Goal: Have a first model training pipeline running within 48 hours.

2.1 Access Provisioning

  • GitHub: Accept invite to the Organization.
  • Cloud AI Services: Access to AWS SageMaker, Google Cloud Vertex AI (formerly AI Platform), or Azure ML (as applicable).
  • GPU Resources: Access to GPU instances for training.
  • Notion: Access to the Development Board and AI/ML project documentation.
  • Vector Database: Access to Pinecone, Weaviate, or equivalent (if applicable).

2.2 Environment Setup

  • Python: Use the version specified in .python-version or pyproject.toml (typically 3.9+).
  • Package Manager: Use Poetry, or pip with requirements.txt, depending on the project.
  • Virtual Environment: Always use a virtual environment (venv, conda, or poetry).
  • IDE: VS Code or PyCharm with extensions:
    • Python
    • Jupyter
    • Pylint / Black
    • GitLens
  • Jupyter: Install Jupyter Lab or Jupyter Notebook for experimentation.
  • ML Frameworks: Install project-specific frameworks:
    • TensorFlow / PyTorch / JAX
    • Transformers (Hugging Face)
    • LangChain / LlamaIndex (for LLM applications)

2.3 Repository Setup

  1. Clone the repository: git clone <repo-url>
  2. Create a virtual environment: python -m venv venv (pip projects). Poetry projects can skip to step 4, because poetry install creates the environment by default.
  3. Activate the environment: source venv/bin/activate (Linux/Mac) or venv\Scripts\activate (Windows).
  4. Install dependencies: pip install -r requirements.txt or poetry install.
  5. Copy .env.example to .env and populate API keys (OpenAI, Anthropic, etc.).
  6. Download sample datasets or models as specified in the project README.
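
Step 5 tends to fail silently when a key is left blank, so it can help to check .env against .env.example before the first run. The following is a minimal sketch, assuming both files contain simple KEY=value lines; the function names are illustrative and not part of any existing project tooling.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse dotenv-style text into a {name: value} mapping."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, value = line.split("=", 1)
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_env_keys(example_text: str, env_text: str) -> list:
    """Return keys declared in the example file but absent or empty in .env."""
    example = parse_env(example_text)
    actual = parse_env(env_text)
    return sorted(key for key in example if not actual.get(key))


def check_env_files(example_path: str = ".env.example", env_path: str = ".env") -> list:
    """Compare the two files on disk; a missing .env counts as empty."""
    example_text = Path(example_path).read_text(encoding="utf-8")
    env_file = Path(env_path)
    env_text = env_file.read_text(encoding="utf-8") if env_file.exists() else ""
    return missing_env_keys(example_text, env_text)
```

Calling check_env_files() from the repository root returns the names that still need a value; an empty list means every key in .env.example has been populated.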

2.4 Additional Setup for Agentic AI Developers

If you are working on AI Agent systems (autonomous agents, multi-agent systems, tool-using LLMs), install the following additional tools:

  • Agentic Frameworks:
    • LangGraph (for stateful agent workflows)
    • CrewAI (for multi-agent orchestration)
    • AutoGen (Microsoft's multi-agent framework)
    • LangChain Agents (tool-calling and reasoning loops)
  • Tool Integration:
    • Function Calling: Familiarize yourself with the OpenAI and Anthropic function calling (tool use) APIs
    • MCP (Model Context Protocol): For standardized tool integration
  • Observability:
    • LangSmith or LangFuse: For tracing agent execution and debugging
    • Weights & Biases: For tracking agent performance metrics
  • Testing:
    • pytest: For unit testing agent components
    • Agent evaluation frameworks: For testing agent reliability and safety

Key Concepts to Understand:

  • ReAct Pattern: Reasoning + Acting loops
  • Tool Use: How agents call external functions/APIs
  • Memory Management: Short-term vs. long-term memory in agents
  • Multi-Agent Communication: Message passing, coordination protocols
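
The ReAct pattern and tool use listed above can be illustrated without any framework. The sketch below is a deliberately simplified loop: scripted_model is a hypothetical stand-in for an LLM call, the add tool is invented for the example, and the history list plays the role of short-term memory. It does not reflect the API of LangGraph, CrewAI, AutoGen, or LangChain.

```python
from typing import Callable, Dict, List, Tuple


def add_numbers(expr: str) -> str:
    """Hypothetical tool: sum comma-separated integers."""
    return str(sum(int(part) for part in expr.split(",")))


TOOLS: Dict[str, Callable[[str], str]] = {"add": add_numbers}


def scripted_model(history: List[str]) -> dict:
    """Stand-in for an LLM: picks a tool first, then answers from the observation."""
    observations = [h for h in history if h.startswith("Observation: ")]
    if not observations:
        return {"thought": "I should add the numbers.", "action": "add", "input": "2,3"}
    answer = observations[-1][len("Observation: "):]
    return {"thought": "I have the sum.", "final": answer}


def run_agent(question: str, model, tools, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate reasoning and acting until the model returns a final answer."""
    history = [f"Question: {question}"]  # short-term memory for this run
    for _ in range(max_steps):
        step = model(history)
        history.append(f"Thought: {step['thought']}")
        if "final" in step:
            return step["final"], history
        tool = tools.get(step["action"])
        # Unknown tools become an observation so the model can recover.
        observation = tool(step["input"]) if tool else f"unknown tool {step['action']!r}"
        history.append(f"Action: {step['action']}[{step['input']}]")
        history.append(f"Observation: {observation}")
    raise RuntimeError("agent did not finish within max_steps")
```

The max_steps cap is a design choice worth keeping in real agents: it bounds cost and prevents a reasoning loop from running indefinitely.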

3. DAILY WORKFLOW

3.1 Experimentation

  • Jupyter Notebooks: Use for exploration and prototyping.
  • Experiment Tracking: Use MLflow, Weights & Biases, or TensorBoard to track experiments.
  • Version Control for Models: Use DVC (Data Version Control) or a similar tool to version datasets and model artifacts alongside the code.

3.2 Model Development

  • Data Preprocessing: Document all preprocessing steps. Make them reproducible.
  • Feature Engineering: Keep feature engineering code modular and testable.
  • Model Training: Log hyperparameters, metrics, and artifacts.
  • Evaluation: Use consistent evaluation metrics. Compare against baselines.
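
As an illustration of the last two points, the sketch below compares a model against a majority-class baseline and builds a run record holding hyperparameters and metrics. It uses only the standard library and is not the MLflow or Weights & Biases API; the function names and record layout are assumptions made for the example.

```python
import hashlib
import json
from collections import Counter


def accuracy(y_true, y_pred) -> float:
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n_predictions: int) -> list:
    """Always predict the most frequent training label."""
    label = Counter(y_train).most_common(1)[0][0]
    return [label] * n_predictions


def run_record(params: dict, metrics: dict) -> dict:
    """Bundle hyperparameters and metrics; the hash identifies the config."""
    payload = json.dumps(params, sort_keys=True)
    params_hash = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    return {"params": params, "params_hash": params_hash, "metrics": metrics}


# Toy labels and predictions, invented for the example.
y_train = [0, 0, 1]
y_test = [0, 1, 1, 0]
model_predictions = [0, 1, 1, 1]

record = run_record(
    params={"learning_rate": 0.01, "epochs": 5},
    metrics={
        "accuracy": accuracy(y_test, model_predictions),
        "baseline_accuracy": accuracy(y_test, majority_baseline(y_train, len(y_test))),
    },
)
print(json.dumps(record, indent=2))
```

Recording the baseline next to the model metric makes it obvious when a model is not actually beating a trivial predictor.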

3.3 Production Deployment

  • Model Serving: Package models for serving (FastAPI, TorchServe, TensorFlow Serving).
  • Monitoring: Implement model performance monitoring and drift detection.
  • Scalability: Design for horizontal scaling where needed.
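
Drift detection can take many forms, and this protocol does not prescribe one. As one possible approach, the sketch below computes the Population Stability Index (PSI) between a reference sample of a feature and a production sample. The bin count and the eps floor are assumptions chosen for the example.

```python
import math


def psi(expected, actual, bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` relative to `expected`.

    Bins are equal-width over the range of `expected`; values outside
    that range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]   # training-time feature values
shifted = [v + 0.5 for v in reference]      # simulated production drift
print(round(psi(reference, reference), 4), psi(reference, shifted) > 0.25)
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but thresholds should be validated per feature before they trigger alerts.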

3.4 LLM-Specific (if applicable)

  • Prompt Engineering: Document prompts and their versions.
  • RAG (Retrieval-Augmented Generation): Implement proper chunking, embedding, and retrieval strategies.
  • Cost Management: Monitor API usage and costs for third-party LLM APIs.
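
For the RAG point above, chunking with overlap is a typical starting strategy. The sketch below splits on whitespace and counts words rather than model tokens, which is a simplifying assumption; the default sizes are illustrative, not recommended values.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into word-based chunks that share `overlap` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_text(sample, chunk_size=4, overlap=1):
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of embedding some words twice.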

4. DATA & ETHICS

  • Data Privacy: Ensure compliance with applicable data privacy regulations, such as GDPR.
  • Bias Detection: Test models for bias and fairness.
  • Explainability: Provide model explanations where required.
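
Bias testing can start with a simple group-level metric. The sketch below computes the demographic parity difference, that is, the gap in positive-prediction rates between groups. It is one fairness metric among several and the protocol does not mandate it; the group labels and predictions are invented for the example.

```python
from collections import defaultdict


def selection_rates(predictions, groups) -> dict:
    """Share of positive (== 1) predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups) -> float:
    """Largest gap in selection rate between any two groups (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


preds = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(selection_rates(preds, groups), demographic_parity_difference(preds, groups))
```

What counts as an acceptable gap depends on the use case and should be agreed with the project owner rather than hard-coded.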
