Day 0: Troubleshooting and FAQs

  • Kaggle notebook
  • API key page in AI Studio
  • Sign up for a Discord account and join us on the Kaggle Discord server.
    We have the following channels dedicated to this event:
    • #5dgai-announcements: find official course announcements and livestream recordings.
    • #5dgai-introductions: introduce yourself and meet other participants from around the world.
    • #5dgai-question-forum: Discord forum-style channel for asking questions and discussions about the assignments.
    • #5dgai-general-chat: a general channel to discuss course materials and network with other participants.

Day 1: Foundational Large Language Models & Text Generation and Prompt Engineering

  • 📝 My Kaggle Notebooks
  • 🎒Assignment
    1. Complete the Intro Unit – “Foundational Large Language Models & Text Generation”:
      • Listen to the summary podcast episode for this unit.
      • To complement the podcast, read the “Foundational Large Language Models & Text Generation” whitepaper.
    2. Complete Unit 1 – “Prompt Engineering”:
      • Listen to the summary podcast episode for this unit.
      • To complement the podcast, read the “Prompt Engineering” whitepaper.
      • Complete these codelabs on Kaggle:
      • Make sure you phone-verify your Kaggle account before starting; it’s necessary for the codelabs.
      • Want to have an interactive conversation? Try adding the whitepapers to NotebookLM.
  • 💡What You’ll Learn
    • Today you’ll explore the evolution of LLMs, from transformers to techniques like fine-tuning and inference acceleration. You’ll also get trained in the art of prompt engineering for optimal LLM interaction.
    • The codelab will walk you through getting started with the Gemini API, covering several prompting techniques and how different parameters impact the output.

Summary of the key points & callouts

Day 1 - Prompting [TP]

  • The examples were built leveraging Gemini 2.0 and the latest google-genai SDK
  • Gemini Models covered are
    • gemini-2.0-flash
    • gemini-2.0-flash-thinking-exp

Thinking model: Trained to generate the "thinking process" the model goes through as part of its response. This provides us with high-quality responses without needing specialised prompting like CoT or ReAct.

  • Two types of content generation covered are
    • client.models.generate_content
      • Single-turn text-in/text-out structure
    • client.models.generate_content_stream
      • Instead of waiting for the entire response, the model sends back chunks of the generated content as they become available, exposed as an iterable.
    • client.chats.create
      • Multi-turn chat structure including access to chat history.
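The three interaction shapes can be sketched in plain Python. This is a toy illustration, not the google-genai SDK: the real calls are client.models.generate_content(...), client.models.generate_content_stream(...), and client.chats.create(...), and the stand-in "model" here just echoes the prompt.

```python
def generate_content(prompt):
    """Single-turn, text-in/text-out: one blocking call, one full response."""
    return f"Echo: {prompt}"

def generate_content_stream(prompt, chunk_size=4):
    """Streaming: yield the same response in chunks as they become available."""
    full = generate_content(prompt)
    for i in range(0, len(full), chunk_size):
        yield full[i:i + chunk_size]

class Chat:
    """Multi-turn chat: keeps the conversation history across turns."""
    def __init__(self):
        self.history = []
    def send_message(self, msg):
        reply = generate_content(msg)
        self.history += [("user", msg), ("model", reply)]
        return reply

# Consuming the stream reassembles exactly the single-turn response.
assert "".join(generate_content_stream("hi")) == generate_content("hi")
```

The point of the streaming variant is only *when* text arrives, not *what* arrives: iterating the generator yields partial output early instead of blocking for the whole reply.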
  • Configs covered are
    • temperature
    • top_p
    • max_output_tokens

Specifying max_output_tokens does not influence the generation of the output tokens, so the output will not become more stylistically or textually succinct; the model simply stops generating tokens once the specified length is reached.
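A minimal sketch of that behaviour, with a list of words standing in for generated tokens (the "model" here is a deterministic stub, not a Gemini call):

```python
def generate_tokens(prompt, max_output_tokens=None):
    # Stand-in "model": deterministically emits the words of a fixed reply.
    reply = "here is a long and not especially succinct model reply".split()
    if max_output_tokens is not None:
        # The cap truncates generation; it does not reshape the answer.
        reply = reply[:max_output_tokens]
    return reply

full = generate_tokens("q")
capped = generate_tokens("q", max_output_tokens=3)
assert capped == full[:3]   # a plain prefix, not a rewritten shorter answer
```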

  • Prompting techniques covered are
  • Techniques to constrain the LLM output to follow the supplied schema
    • enum mode
    • json mode
      • Can be achieved using typing.TypedDict or a dataclass
  • Code prompting scenarios covered are
    • Generating code
    • Code execution
      • Can automatically run generated code using ToolCodeExecution
    • Explaining code
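For JSON mode, a typing.TypedDict can describe the shape the model must produce; with the google-genai SDK such a class is passed as response_schema (with response_mime_type="application/json"). Since we can't call the API here, the sketch below only shows the schema and how a constrained response parses; the Recipe fields are hypothetical.

```python
import json
import typing

class Recipe(typing.TypedDict):
    """Hypothetical schema; the SDK can use a class like this to constrain output."""
    name: str
    minutes: int
    ingredients: list[str]

# A schema-constrained response is plain JSON we can parse and validate:
raw = '{"name": "dal", "minutes": 30, "ingredients": ["lentils", "cumin"]}'
parsed = json.loads(raw)
assert set(parsed) == set(typing.get_type_hints(Recipe))
```

Enum mode is the degenerate case of the same idea: the schema is just a fixed set of allowed string values.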

Day 1 - Evaluation and structured output [TP]

  • Gemini Models covered are
    • gemini-2.0-flash
  • Concepts covered include
    • Summarising a (pdf) document
    • Evaluating question answering quality
      • Gauge the quality of the LLM generated summary for the user question on a document.
      • Criteria used are
        • Instruction following
        • Groundedness
        • Completeness
        • Conciseness
        • Fluency
      • Two popular model-based metrics covered are:
          1. Pointwise evaluation: Evaluate a single I/O pair against some criteria (e.g., 5 levels of grading)
          2. Pairwise evaluation: Compare two outputs against each other and pick the better one (e.g., compare against a baseline model response).
    • Caching & Memoization

Day 2: Embeddings & Vector Stores/Databases

Summary of the key points & callouts

Day 2 - Document Q&A with RAG [TP]

  • Gemini Models covered include
    • text-embedding-004
  • Two embedding task types (official documentation) covered are
    • retrieval_document - Specifies the given text is a document from the corpus being searched.
    • retrieval_query - Specifies the given text is a query in a search/retrieval setting.
  • ChromaDB methods (official documentation) covered are
    • db.add(...)
    • db.count()
    • db.peek(...)
    • db.query(...)
  • RAG Concepts covered
    • 1. Indexing: creating embeddings for the document corpus
      2. Retrieval: finding documents relevant to the query
      3. Augmented generation: answering the question using the retrieved documents
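The three RAG steps can be sketched with an in-memory stand-in: ChromaDB is replaced by a plain list, and the Gemini embedding and generation calls by toy functions (the term-count "embedding" and the three documents are invented for illustration).

```python
import math

def embed(text, vocab=("cat", "dog", "car")):
    # Fake "embedding": term counts over a tiny vocabulary.
    words = text.lower().split()
    return [words.count(term) for term in vocab]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# 1. Indexing: embed and store the corpus (ChromaDB's db.add in the codelab).
docs = ["the cat sat", "the dog barked", "the car broke down"]
db = [(d, embed(d)) for d in docs]

# 2. Retrieval: most similar document to the query (db.query in the codelab).
query = "my dog is loud"
best_doc, _ = max(db, key=lambda item: cosine(embed(query), item[1]))

# 3. Augmented generation: stuff the retrieved passage into the prompt.
prompt = f"Answer using this passage: {best_doc}\nQuestion: {query}"
assert "dog" in best_doc
```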

Day 2 - Embeddings and similarity scores [TP]

  • Gemini Models covered include
    • text-embedding-004
  • Embedding task types (official documentation) covered are
    • semantic_similarity
  • Concepts covered include
    • Calculating similarity scores between texts using their embeddings
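A sketch of the pairwise similarity-matrix idea, with hard-coded 2-D vectors standing in for text-embedding-004 outputs (which are 768-dimensional):

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

# Toy stand-ins for embeddings of three texts: "a" and "b" are near-parallel
# (semantically similar), "c" is orthogonal (unrelated).
vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
sims = {(x, y): cosine(vecs[x], vecs[y]) for x in vecs for y in vecs}

assert sims[("a", "a")] == 1.0             # every text matches itself exactly
assert sims[("a", "b")] > sims[("a", "c")] # similar texts score higher
```

Note that text-embedding-004 vectors are normalized, so a plain dot product gives the same ranking as cosine similarity.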

Day 2 - Classifying embeddings with Keras [TP]

  • Gemini Models covered include
    • text-embedding-004
  • Embedding task types (official documentation) covered are
    • classification
  • Concepts covered include
    • Building a simple classification model with 1 hidden layer (and ~0.6M trainable params) using keras.Sequential
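A back-of-the-envelope check of the "~0.6M trainable params" figure, assuming 768-dimensional text-embedding-004 inputs, one hidden Dense layer of 768 units, and a hypothetical 4-class output (the hidden width and class count are assumptions, not taken from the notebook):

```python
embedding_dim, hidden_units, num_classes = 768, 768, 4

# A Dense layer has (inputs + 1 bias) * units parameters.
hidden_params = (embedding_dim + 1) * hidden_units   # 590,592
output_params = (hidden_units + 1) * num_classes     #   3,076
total = hidden_params + output_params

assert 500_000 < total < 700_000   # ~0.6M, consistent with the note above
```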

Day 3: AI Agents

  • 📝 My Kaggle Notebooks
  • 🎒Assignment
  • 💡What You’ll Learn
    • Learn to build sophisticated AI agents by understanding their core components and the iterative development process.
    • The code labs cover how to connect LLMs to existing systems and to the real world. Learn about function calling by giving SQL tools to a chatbot, and learn how to build a LangGraph agent that takes orders in a café.

Summary of the key points & callouts

Day 3 - Function calling with the Gemini API [TP]

  • Gemini models covered include
    • gemini-2.0-flash
    • gemini-2.0-flash-exp
  • Concepts covered include
    • Function calling (leveraging openAPI schema) with tools
      • Three important parts that can be observed in chat history are
          1) text, 2) function_call, 3) function_response
    • Compositional function calling

Need to revisit "Compositional function calling" concept as the code was not working completely as expected
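The text / function_call / function_response loop can be sketched without the SDK. This is a toy simulation: the "model" turns are scripted, and list_tables is a hypothetical SQL-ish tool, not an API from the codelab.

```python
def list_tables():
    """Hypothetical tool the chatbot can call."""
    return ["orders", "products"]

TOOLS = {"list_tables": list_tables}

history = [{"role": "user", "text": "What tables are there?"}]

# 1) The model replies with a function_call part instead of text.
history.append({"role": "model",
                "function_call": {"name": "list_tables", "args": {}}})

# 2) The client dispatches the call and sends back a function_response.
call = history[-1]["function_call"]
result = TOOLS[call["name"]](**call["args"])
history.append({"role": "user",
                "function_response": {"name": call["name"], "response": result}})

# 3) The model turns the tool output into a final text part.
history.append({"role": "model",
                "text": f"There are {len(result)} tables: {', '.join(result)}."})

parts = [k for turn in history for k in turn if k != "role"]
assert parts == ["text", "function_call", "function_response", "text"]
```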

Day 3 - Building an agent with LangGraph [TP]

  • Gemini models covered include
    • gemini-2.0-flash
  • Concepts covered include
    • LangGraph
      • graph structure
      • state schema
      • node (action or a step)
      • edge (transition between states)
        • conditional edge
      • tools
        • stateless tools
        • stateful tools

The model should not directly have access to the app's internal state, or it risks being manipulated arbitrarily. Rather, we provide tools that update the state.
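A toy state machine (deliberately not the LangGraph API) illustrating nodes, a conditional edge, and a stateful tool. The café scenario and function names are invented; the point is that only the tool mutates state, while the "model" only returns an intent.

```python
def add_to_order(state, item):
    """Stateful tool: the only code allowed to update the app state."""
    state["order"].append(item)
    return state

def chatbot_node(state):
    # The "model" decides to call a tool; it returns an intent,
    # never a direct state mutation.
    return {"tool": "add_to_order", "args": ["latte"]}

def tool_node(state, intent):
    # Node that dispatches intents to the registered stateful tools.
    return {"add_to_order": add_to_order}[intent["tool"]](state, *intent["args"])

def route(state):
    # Conditional edge: keep looping until the order is non-empty.
    return "end" if state["order"] else "chatbot"

state = {"order": []}
while route(state) != "end":
    state = tool_node(state, chatbot_node(state))

assert state["order"] == ["latte"]
```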

Day 4: Domain-Specific Models

  • 📝 My Kaggle Notebooks
  • 🎒Assignment
  • 💡What You’ll Learn
    • In today’s reading, you’ll delve into the creation and application of specialized LLMs like SecLM and MedLM/Med-PaLM, with insights from the researchers who built them.
    • In the code labs you will learn how to add real-world data to a model beyond its knowledge cut-off by grounding with Google Search. You will also learn how to fine-tune a custom Gemini model using your own labeled data to solve custom tasks.

Summary of the key points & callouts

Day 4 - Google Search grounding [TP]

  • Gemini models covered include
    • gemini-2.0-flash
  • Concepts covered include
    • Search Grounding using GoogleSearch
      • grounding_metadata - Includes links to search suggestions, supporting documents, and information on how they were used
        • grounding_chunks - Contains the source URIs
        • grounding_supports - For each supported chunk of the output text (start_index, end_index), contains the indices of the grounding chunks used and the corresponding confidence scores
    • Search with Tools
      • GoogleSearch - Google Search Grounding Tool
      • ToolCodeExecution - Code generation & execution tool
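A sketch of how the grounding metadata ties output text back to its sources. The field names follow the description above, but the real response objects are richer, and the text, URIs, indices, and confidence values here are all invented:

```python
response_text = "Paris hosted the games. Attendance was high."

grounding_chunks = [{"uri": "https://example.com/a"},   # hypothetical sources
                    {"uri": "https://example.com/b"}]
grounding_supports = [
    {"start_index": 0,  "end_index": 23, "chunk_indices": [0], "confidence": 0.97},
    {"start_index": 24, "end_index": 44, "chunk_indices": [1], "confidence": 0.83},
]

# Attach the source URI(s) to each supported span of the output text.
cited = [(response_text[s["start_index"]:s["end_index"]],
          [grounding_chunks[i]["uri"] for i in s["chunk_indices"]])
         for s in grounding_supports]

assert cited[0][0] == "Paris hosted the games."
```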

Day 4 - Fine tuning a custom model [TP]

  • Gemini models covered include
    • gemini-1.5-flash-001
    • gemini-1.5-flash-001-tuning - This is deprecated now
  • Concepts covered include
    • Fine-tuning a custom model
      • The tuned model requires no prompting or system instructions and outputs succinct text from the classes we provide in the training data.
      • This saves on the total tokens required per request.
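A rough illustration of the token savings, using whitespace "tokens" as a stand-in for the real tokenizer (the prompts and class names below are made up, not taken from the codelab):

```python
# Untuned model: needs instructions plus few-shot examples in every request.
few_shot_prompt = (
    "Classify the newsgroup of the post. Answer with one of: "
    "sci.space, rec.autos.\n"
    "Post: engines and brakes -> rec.autos\n"
    "Post: orbital mechanics question"
)

# Tuned model: the bare input suffices; instructions were baked in by tuning.
tuned_prompt = "Post: orbital mechanics question"

assert len(tuned_prompt.split()) < len(few_shot_prompt.split())
```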

Day 5: MLOps for Generative AI

  • 📝 My Kaggle Notebooks
  • 🎒Assignment
  • 💡What You’ll Learn
    • Discover how to adapt MLOps practices for Generative AI and leverage Vertex AI’s tools for foundation models and generative AI applications such as AgentOps for agentic applications.

Miscellaneous


To do or clarify