In this codelab, you'll start with the initial setup of Genkit, then try out features like Code Execution and Function Calling in a local environment. The steps are straightforward, and along the way you'll see how efficiently you can develop with Genkit.
Please access the following site to get the Gemini API key.
As of November 2024, the free plan of the Gemini API is sufficient for this hands-on session, so there's no need for a paid plan.
Run the following curl command in your terminal, replacing YOUR_API_KEY
with your actual key, and confirm that a response is successfully returned.
curl \
-H 'Content-Type: application/json' \
-d '{"contents":[{"parts":[{"text":"Explain Firebase in under 100 words."}]}]}' \
-X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key=YOUR_API_KEY'
If you're using Windows, run the following command in PowerShell instead and confirm that a response is returned.
curl `
-H "Content-Type: application/json" `
-d '{"contents":[{"parts":[{"text":"Explain Firebase in under 100 words."}]}]}' `
-X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key=YOUR_API_KEY'
In this section, we'll run Genkit locally with its minimal configuration. Run the following commands to initialize the project.
mkdir hello-genkit && cd hello-genkit
npm init -y
npm i -D genkit-cli
npm i genkit @genkit-ai/googleai @genkit-ai/express
Set the Gemini API key you obtained earlier as an environment variable.
export GOOGLE_GENAI_API_KEY=<your API key>
If you're using Windows, set the Gemini API key as an environment variable by running the following command in PowerShell.
$env:GOOGLE_GENAI_API_KEY=<your API key>
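The googleAI() plugin you'll configure in the next step picks up this environment variable automatically. If you'd rather wire the key in explicitly, the plugin also accepts an apiKey option; a minimal sketch, not required for this codelab:

import { genkit } from 'genkit'
import { googleAI } from '@genkit-ai/googleai'

// Sketch: pass the key explicitly instead of relying on GOOGLE_GENAI_API_KEY
const ai = genkit({
  plugins: [googleAI({ apiKey: process.env.GOOGLE_GENAI_API_KEY })],
})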
Create a src/index.ts file and paste in the following code.
import { genkit, z } from 'genkit'
import { googleAI, gemini15Flash } from '@genkit-ai/googleai'
import { startFlowServer } from '@genkit-ai/express'
import { logger } from 'genkit/logging'

logger.setLogLevel('debug')

const ai = genkit({
  plugins: [googleAI()],
  model: gemini15Flash,
})

const mainFlow = ai.defineFlow({
  name: 'mainFlow',
  inputSchema: z.string(),
}, async (input) => {
  const { text } = await ai.generate(input)
  return text
})

startFlowServer({ flows: [mainFlow] })
Genkit will start with the following command, and Developer Tools will automatically launch.
npx genkit start -o -- npx tsx --watch src/index.ts
In the Flows menu, select mainFlow
defined in the code above. Enter a string and select the Run
button to send a prompt to Gemini.
Explain Firebase in under 100 words.
Press the View trace
button to see detailed Input and Output from the Gemini API.
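The Developer Tools are the quickest way to experiment, but startFlowServer also exposes the flow as an HTTP endpoint. As a rough sketch (assuming the flow server's default port 3400 and its { data: ... } request envelope), you could call it from another script like this:

// Sketch: call the running flow server over HTTP (port 3400 is the assumed default)
const res = await fetch('http://localhost:3400/mainFlow', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ data: 'Explain Firebase in under 100 words.' }),
})
const { result } = await res.json()
console.log(result)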
With Gemini's Code Execution, you can generate and execute Python code. Only one line needs to be changed.
- model: gemini15Flash,
+ model: gemini15Flash.withConfig({ codeExecution: true }),
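After the change, the genkit() initialization looks like this:

const ai = genkit({
  plugins: [googleAI()],
  // Enable Gemini's Code Execution so the model can write and run Python
  model: gemini15Flash.withConfig({ codeExecution: true }),
})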
Open Developer Tools, input prompts that require programming into mainFlow, and try the following requests:
Simulate the ratio of heads to tails after flipping a coin 100,000 times.
Calculate the 100th Fibonacci number.
Execute the following code in Python: print('Hello World')
Here is the result.
In the View trace menu, you can see the Python code that was executed.
Think up other prompts that require Code Execution and give them a try.
Function Calling allows generative AI to call predefined functions as needed to fulfill user requests, for example to fetch external data the model can't access on its own. In this codelab, you'll implement a tool that extracts the content of a URL, using Cheerio as the HTML parser, and call it via Function Calling.
npm i cheerio
Remove Code Execution for now.
- model: gemini15Flash.withConfig({ codeExecution: true }),
+ model: gemini15Flash,
Import cheerio.
import { genkit, z } from 'genkit'
import { googleAI, gemini15Flash } from '@genkit-ai/googleai'
+ import * as cheerio from 'cheerio'
Add the following function under the definition of the ai
variable in src/index.ts
. The first argument specifies the tool's configuration values, and the second argument specifies the process to execute.
const webLoader = ai.defineTool(
  {
    name: "webLoader",
    description:
      "When a URL is received, it accesses the URL and retrieves the content inside.",
    inputSchema: z.object({ url: z.string() }),
    outputSchema: z.string(),
  },
  async ({ url }) => {
    const res = await fetch(url)
    const html = await res.text()
    const $ = cheerio.load(html)
    $("script, style, noscript").remove()
    // A Cheerio selection is always truthy, so check .length to see if an <article> element exists
    if ($("article").length) {
      return $("article").text()
    }
    return $("body").text()
  },
)
Specify tools
in the generate
method parameter and include webLoader
in the array. Since tools
can be specified as an array, you can set multiple tools, and generative AI will select the necessary tool for execution based on the description
in defineTool
. Just as with prompt engineering, tuning the description is essential.
- const { text } = await ai.generate(input)
+ const { text } = await ai.generate({ prompt: input, tools: [webLoader] })
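Putting the change in context, the flow now reads as follows (only the generate call differs from the earlier version):

const mainFlow = ai.defineFlow({
  name: 'mainFlow',
  inputSchema: z.string(),
}, async (input) => {
  // Pass the tool so Gemini can decide to call webLoader when the prompt contains a URL
  const { text } = await ai.generate({ prompt: input, tools: [webLoader] })
  return text
})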
The final source code can be found at the following GitHub URL.
https://github.com/tanabee/genkit-codelab/blob/main/steps/function-calling/src/index.ts
Now that the code is complete, open Developer Tools. You'll see that webLoader has been added to the Tools menu. Select webLoader, enter the following URL, and execute it.
URL: https://medium.com/firebase-developers/implementing-function-calling-using-genkit-0c03f6cb9179
The content of the URL was extracted. In Genkit Developer Tools, you can test tools individually to verify their functionality before incorporating them into a Flow, making development more efficient.
Next, select mainFlow from the Flows menu. Enter the following prompt and execute it.
Prompt: First, fetch the content inside URL. Next, summarize the content in less than 200 words. https://medium.com/firebase-developers/implementing-function-calling-using-genkit-0c03f6cb9179
You can see that the content has been summarized based on the extracted content.
Look at the View trace. You'll see that two requests were made to the Gemini API, with webLoader called in between, confirming that the tool was indeed invoked.
Try defining your own tools and implementing Function Calling.
This concludes the hands-on session. It's impressive that so much can be achieved with so little code. Here are some next steps for those who want to dive deeper.