Day 0: Troubleshooting and FAQs

  • Kaggle notebook
  • API key page in AI Studio
  • Sign up for a Discord account and join us on the Kaggle Discord server.
    We have the following channels dedicated to this event:
    • #5dgai-announcements: find official course announcements and livestream recordings.
    • #5dgai-introductions: introduce yourself and meet other participants from around the world.
    • #5dgai-question-forum: Discord forum-style channel for asking questions and discussions about the assignments.
    • #5dgai-general-chat: a general channel to discuss course materials and network with other participants.

Day 1: Foundational Large Language Models & Text Generation and Prompt Engineering

  • 📝 My Kaggle Notebooks
  • 🎒Assignment
    1. Complete the Intro Unit – “Foundational Large Language Models & Text Generation”:
      • Listen to the summary podcast episode for this unit.
      • To complement the podcast, read the “Foundational Large Language Models & Text Generation” whitepaper.
    2. Complete Unit 1 – “Prompt Engineering”:
      • Listen to the summary podcast episode for this unit.
      • To complement the podcast, read the “Prompt Engineering” whitepaper.
      • Complete these codelabs on Kaggle:
      • Make sure you phone-verify your Kaggle account before starting; it’s necessary for the codelabs.
      • Want to have an interactive conversation? Try adding the whitepapers to NotebookLM.
  • 💡What You’ll Learn
    • Today you’ll explore the evolution of LLMs, from transformers to techniques like fine-tuning and inference acceleration. You’ll also get trained in the art of prompt engineering for optimal LLM interaction.
    • The codelab will walk you through getting started with the Gemini API, cover several prompting techniques, and show how different parameters affect the output.

Summary of the key points & callouts

Day 1 - Prompting [TP]

  • The examples were built leveraging Gemini 2.0 and the latest google-genai SDK
  • Gemini Models covered are
    • gemini-2.0-flash
    • gemini-2.0-flash-thinking-exp

Thinking model: trained to generate the "thinking process" the model goes through as part of its response. This yields high-quality responses without needing specialised prompting such as CoT or ReAct.

  • Two types of content generation covered are
    • client.models.generate_content
      • Single-turn text-in/text-out structure
    • client.models.generate_content_stream
      • Instead of waiting for the entire response, the model returns chunks of the generated content as they become available, exposed as an iterable.
    • client.chats.create
      • Multi-turn chat structure
  • Configs covered are
    • temperature
    • top_p
    • max_output_tokens

Specifying max_output_tokens does not influence how the output tokens are generated, so the output will not become more stylistically or textually succinct; the model simply stops generating tokens once the specified limit is reached.
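The three generation entry points and the config parameters above can be sketched as follows with the google-genai SDK. The model name, prompts, and parameter values are illustrative, and an API key in the environment is needed to actually run the calls, so the API demo is kept behind a guard:

```python
from typing import Any


def make_config(temperature: float, top_p: float, max_output_tokens: int) -> dict[str, Any]:
    """Generation parameters; note max_output_tokens truncates, it does not summarise."""
    return {
        "temperature": temperature,
        "top_p": top_p,
        "max_output_tokens": max_output_tokens,
    }


def demo() -> None:
    # Requires `pip install google-genai` and a GOOGLE_API_KEY in the environment.
    from google import genai

    client = genai.Client()
    config = make_config(temperature=0.2, top_p=0.95, max_output_tokens=128)

    # Single-turn: one prompt in, one response out.
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents="Explain transformers in one sentence.",
        config=config,
    )
    print(response.text)

    # Streaming: iterate over chunks as they become available.
    for chunk in client.models.generate_content_stream(
        model="gemini-2.0-flash",
        contents="Write a short poem about embeddings.",
    ):
        print(chunk.text, end="")

    # Multi-turn chat: the chat object keeps the conversation history.
    chat = client.chats.create(model="gemini-2.0-flash")
    print(chat.send_message("Hello!").text)
    print(chat.send_message("What did I just say?").text)


if __name__ == "__main__":
    demo()
```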

  • Prompting techniques covered are
  • Techniques to enforce the LLM output to follow the supplied schema
    • enum mode
    • json mode
      • Can be achieved using typing.TypedDict or a dataclass
  • Code prompting scenarios covered are
    • Generating code
    • Code execution
      • Can automatically run generated code using ToolCodeExecution
    • Explaining code
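A minimal sketch of enum mode and JSON mode, assuming the response_schema/response_mime_type config fields of the google-genai SDK. The Grade and Review schemas are invented for illustration, and the API calls are guarded since they need a key:

```python
import enum
from typing import TypedDict


class Grade(enum.Enum):
    # Enum mode: constrain the model to exactly one of these labels.
    POSITIVE = "positive"
    NEUTRAL = "neutral"
    NEGATIVE = "negative"


class Review(TypedDict):
    # JSON mode: the model's reply must parse as this structure.
    grade: str
    summary: str


def demo() -> None:
    # Requires `pip install google-genai` and an API key.
    from google import genai

    client = genai.Client()

    # Enum mode: the schema is the Enum, the mime type is text/x.enum.
    res = client.models.generate_content(
        model="gemini-2.0-flash",
        contents="Grade this review: 'The pizza was cold.'",
        config={"response_mime_type": "text/x.enum", "response_schema": Grade},
    )
    print(res.text)

    # JSON mode: the schema is the TypedDict, the mime type is application/json.
    res = client.models.generate_content(
        model="gemini-2.0-flash",
        contents="Grade and summarise: 'Great crust, slow service.'",
        config={"response_mime_type": "application/json", "response_schema": Review},
    )
    print(res.text)  # JSON matching the Review shape


if __name__ == "__main__":
    demo()
```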

Day 1 - Evaluation and structured output [TP]

  • Gemini Models covered are
    • gemini-2.0-flash
  • Concepts covered include
    • Summarising a (PDF) document
    • Evaluating question answering quality
      • Gauge the quality of the LLM generated summary for the user question on a document.
      • Criteria used are
        • Instruction following
        • Groundedness
        • Completeness
        • Conciseness
        • Fluency
      • Two popular model-based metrics covered are:
          1. Pointwise evaluation: evaluate a single I/O pair against some criteria (e.g., 5 levels of grading)
          2. Pairwise evaluation: compare two outputs against each other and pick the better one (e.g., compare against a baseline model response).
    • Caching & Memoization
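Pointwise evaluation can be sketched as a judge-model loop: grade one summary against the criteria above and parse a score out of the verdict. The rubric text and the "Rating: N" convention are assumptions for illustration, not the codelab's exact prompt:

```python
import re

# Hypothetical pointwise rubric covering the criteria listed above.
JUDGE_PROMPT = """\
You are grading a summary of a document against the user's question.
Rate it on a 1-5 scale for instruction following, groundedness,
completeness, conciseness, and fluency, then finish with a line
of the form "Rating: N".

Question: {question}
Summary: {summary}
"""


def parse_rating(judge_response: str) -> int:
    """Pull the final 1-5 score out of the judge model's free-text verdict."""
    match = re.search(r"Rating:\s*([1-5])", judge_response)
    if match is None:
        raise ValueError("no rating found in judge response")
    return int(match.group(1))


def grade(question: str, summary: str) -> int:
    # Requires `pip install google-genai` and an API key;
    # gemini-2.0-flash acts as the judge model here.
    from google import genai

    client = genai.Client()
    verdict = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=JUDGE_PROMPT.format(question=question, summary=summary),
    )
    return parse_rating(verdict.text)
```

Pairwise evaluation follows the same pattern, except the prompt presents two candidate answers and asks the judge to name the better one.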

Day 2: Embeddings & Vector Stores

Summary of the key points & callouts

Day 2 - Document Q&A with RAG [TP]

  • Gemini Models covered include
    • text-embedding-004
  • Two embedding task types (official documentation) covered are
    • retrieval_document
    • retrieval_query
  • ChromaDB methods (official documentation) covered are
    • db.add(...)
    • db.count()
    • db.peek(...)
    • db.query(...)
  • RAG Concepts covered
    1. Indexing: creating embeddings
    2. Retrieval: finding relevant documents
    3. Augmented generation: answering the question
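The three RAG steps above can be sketched end to end. This is a minimal outline, not the codelab's exact code: it uses ChromaDB's default embedding function rather than wiring in text-embedding-004, and the documents and query are made up. The API-dependent part is kept behind a function since it needs installed packages and a key:

```python
def build_prompt(question: str, passages: list[str]) -> str:
    """Step 3 input: augment the question with the retrieved passages."""
    context = "\n".join(f"PASSAGE: {p}" for p in passages)
    return f"Answer using only the passages below.\n{context}\nQUESTION: {question}"


def demo() -> None:
    # Requires `pip install chromadb google-genai` and an API key.
    import chromadb
    from google import genai

    client = genai.Client()
    db = chromadb.Client().create_collection(name="docs")

    # Step 1: indexing. Chroma embeds the documents on add (here with its
    # default embedding function; the codelab plugs in text-embedding-004
    # with the retrieval_document task type instead).
    db.add(
        documents=["Gemini supports streaming output.",
                   "ChromaDB stores embeddings."],
        ids=["d1", "d2"],
    )
    print(db.count())  # 2

    # Step 2: retrieval. Embed the query (retrieval_query task type in the
    # codelab) and find the most relevant document.
    result = db.query(query_texts=["How do I stream responses?"], n_results=1)
    passages = result["documents"][0]

    # Step 3: augmented generation.
    answer = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=build_prompt("How do I stream responses?", passages),
    )
    print(answer.text)
```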

Day 2 - Embeddings and similarity scores [TP]

  • Gemini Models covered include
    • text-embedding-004
  • Embedding task types (official documentation) covered are
    • semantic_similarity
  • Concepts covered include
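With the semantic_similarity task type, embeddings are compared via a similarity score, typically the cosine of the angle between vectors. A runnable sketch using stand-in vectors (real vectors would come from text-embedding-004 and be 768-dimensional):

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Stand-in 4-d vectors; semantically close texts get close vectors.
cat = np.array([0.9, 0.1, 0.0, 0.1])
kitten = np.array([0.85, 0.15, 0.05, 0.1])
car = np.array([0.0, 0.9, 0.4, 0.0])

print(cosine_similarity(cat, kitten))  # close to 1: similar meanings
print(cosine_similarity(cat, car))     # much lower: unrelated meanings
```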

Day 2 - Classifying embeddings with Keras [TP]

  • Gemini Models covered include
    • text-embedding-004
  • Embedding task types (official documentation) covered are
    • classification
  • Concepts covered include
    • Building a simple classification model with 1 hidden layer (and 593,668 trainable params) using keras.Sequential
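The 593,668 figure is consistent with 768-d embedding inputs, a 768-unit hidden layer, and (an assumption here) 4 output classes: 768·768 + 768 weights and biases in the hidden layer plus 768·4 + 4 in the output head. A sketch, with the Keras-dependent part guarded since it needs the library installed:

```python
def param_count(input_dim: int, hidden: int, classes: int) -> int:
    """Trainable parameters of a Dense(hidden) -> Dense(classes) classifier."""
    return (input_dim * hidden + hidden) + (hidden * classes + classes)


def build_model(input_dim: int = 768, hidden: int = 768, classes: int = 4):
    # Requires `pip install keras` (or tensorflow). Layer sizes assume
    # 768-d text-embedding-004 vectors and a 4-class dataset.
    import keras

    return keras.Sequential([
        keras.layers.Input(shape=(input_dim,)),
        keras.layers.Dense(hidden, activation="relu"),
        keras.layers.Dense(classes, activation="softmax"),
    ])


print(param_count(768, 768, 4))  # 593668
```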

Day 3: AI Agents

Summary of the key points & callouts

Day 3 - Function calling with the Gemini API [TP]

Day 3 - Building an agent with LangGraph [TP]


Day 4: Domain-Specific Models

  • 📝 My Kaggle Notebooks
  • 🎒Assignment
  • 💡What You’ll Learn
    • In today’s reading, you’ll delve into the creation and application of specialized LLMs like SecLM and MedLM/Med-PaLM, with insights from the researchers who built them.
    • In the codelabs you will learn how to add real-world data to a model beyond its knowledge cut-off by grounding with Google Search. You will also learn how to fine-tune a custom Gemini model using your own labeled data to solve custom tasks.

Day 5: MLOps for Generative AI


Miscellaneous


To do or clarify