In this codelab, you'll start with the initial setup of Genkit, then try out features like Code Execution and Function Calling in a local environment. The steps are straightforward, and along the way you'll see how efficiently you can develop with Genkit.
Please access the following site to get the Gemini API key.
As of November 2024, the free plan of the Gemini API is sufficient for this hands-on session, so there's no need for a paid plan.
Run the following curl command in your terminal, replacing YOUR_API_KEY
with your actual key, and confirm that a response is successfully returned.
curl \
-H 'Content-Type: application/json' \
-d '{"contents":[{"parts":[{"text":"Explain Firebase in under 100 words."}]}]}' \
-X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key=YOUR_API_KEY'
If you're using Windows, run the following command in PowerShell instead and confirm that a response is returned.
curl `
-H "Content-Type: application/json" `
-d '{"contents":[{"parts":[{"text":"Explain Firebase in under 100 words."}]}]}' `
-X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash-latest:generateContent?key=YOUR_API_KEY'
In this section, we'll run Genkit locally with its minimal configuration. Run the following commands to initialize the project.
mkdir hello-genkit && cd hello-genkit
npm init -y
npm i -D genkit-cli
npm i genkit @genkit-ai/googleai @genkit-ai/express
Set the Gemini API key you obtained earlier as an environment variable.
export GOOGLE_GENAI_API_KEY=<your API key>
If you're using Windows, set the Gemini API key as an environment variable by running the following command in PowerShell.
$env:GOOGLE_GENAI_API_KEY=<your API key>
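The googleAI() plugin you'll configure in the next step picks up this environment variable automatically. If you'd rather wire the key in explicitly, the plugin also accepts an apiKey option; a minimal sketch, not required for this codelab:

import { genkit } from 'genkit'
import { googleAI } from '@genkit-ai/googleai'

// Sketch: pass the key explicitly instead of relying on GOOGLE_GENAI_API_KEY
const ai = genkit({
  plugins: [googleAI({ apiKey: process.env.GOOGLE_GENAI_API_KEY })],
})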
Create a src/index.ts file and paste in the following code.
import { genkit, z } from 'genkit'
import { googleAI, gemini15Flash } from '@genkit-ai/googleai'
import { startFlowServer } from '@genkit-ai/express'
import { logger } from 'genkit/logging'

logger.setLogLevel('debug')

const ai = genkit({
  plugins: [googleAI()],
  model: gemini15Flash,
})

const mainFlow = ai.defineFlow({
  name: 'mainFlow',
  inputSchema: z.string(),
}, async (input) => {
  const { text } = await ai.generate(input)
  return text
})

startFlowServer({ flows: [mainFlow] })
Genkit will start with the following command, and Developer Tools will automatically launch.
npx genkit start -o -- npx tsx --watch src/index.ts
In the Flows menu, select mainFlow
defined in the code above. Enter a string and select the Run
button to send a prompt to Gemini.
Explain Firebase in under 100 words.
Press the View trace
button to see detailed Input and Output from the Gemini API.
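The Developer Tools are the quickest way to experiment, but startFlowServer also exposes the flow as an HTTP endpoint. As a rough sketch (assuming the flow server's default port 3400 and its { data: ... } request envelope), you could call it from another script like this:

// Sketch: call the running flow server over HTTP (port 3400 is the assumed default)
const res = await fetch('http://localhost:3400/mainFlow', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ data: 'Explain Firebase in under 100 words.' }),
})
const { result } = await res.json()
console.log(result)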
With Gemini's Code Execution, you can generate and execute Python code. Only one line needs to be changed.
- model: gemini15Flash,
+ model: gemini15Flash.withConfig({ codeExecution: true }),
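After the change, the genkit() initialization looks like this:

const ai = genkit({
  plugins: [googleAI()],
  // Enable Gemini's Code Execution so the model can write and run Python
  model: gemini15Flash.withConfig({ codeExecution: true }),
})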
Open Developer Tools, input prompts that require programming into mainFlow, and try the following requests:
Simulate the ratio of heads to tails after flipping a coin 100,000 times.
Calculate the 100th Fibonacci number.
Execute the following code in Python: print('Hello World')
Here is the result.
In the View trace menu, you can see the Python code that was executed.
Think up other prompts that require Code Execution and give them a try.
Function Calling allows generative AI to call predefined functions as needed to fulfill user requests, for example to fetch external data the model can't access on its own. In this codelab, you'll implement a tool that extracts the content of a URL, using Cheerio as the HTML parser, and call it via Function Calling.
npm i cheerio
Remove Code Execution for now.
- model: gemini15Flash.withConfig({ codeExecution: true }),
+ model: gemini15Flash,
Import cheerio.
import { genkit, z } from 'genkit'
import { googleAI, gemini15Flash } from '@genkit-ai/googleai'
+ import * as cheerio from 'cheerio'
Add the following function under the definition of the ai
variable in src/index.ts
. The first argument specifies the tool's configuration values, and the second argument specifies the process to execute.
const webLoader = ai.defineTool(
  {
    name: "webLoader",
    description:
      "When a URL is received, it accesses the URL and retrieves the content inside.",
    inputSchema: z.object({ url: z.string() }),
    outputSchema: z.string(),
  },
  async ({ url }) => {
    const res = await fetch(url)
    const html = await res.text()
    const $ = cheerio.load(html)
    $("script, style, noscript").remove()
    // A Cheerio selection is always truthy, so check .length to see if an <article> element exists
    if ($("article").length) {
      return $("article").text()
    }
    return $("body").text()
  },
)
Specify tools
in the generate
method parameter and include webLoader
in the array. Since tools
can be specified as an array, you can set multiple tools, and generative AI will select the necessary tool for execution based on the description
in defineTool
. Just as with prompt engineering, tuning the description is essential.
- const { text } = await ai.generate(input)
+ const { text } = await ai.generate({ prompt: input, tools: [webLoader] })
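Putting the change in context, the flow now reads as follows (only the generate call differs from the earlier version):

const mainFlow = ai.defineFlow({
  name: 'mainFlow',
  inputSchema: z.string(),
}, async (input) => {
  // Pass the tool so Gemini can decide to call webLoader when the prompt contains a URL
  const { text } = await ai.generate({ prompt: input, tools: [webLoader] })
  return text
})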
The final source code can be found at the following GitHub URL.
https://github.com/tanabee/genkit-codelab/blob/main/steps/function-calling/src/index.ts
Now that the code is complete, open Developer Tools. You'll see that webLoader has been added to the Tools menu. Select webLoader, enter the following URL, and execute it.
URL: https://medium.com/firebase-developers/implementing-function-calling-using-genkit-0c03f6cb9179
The content of the URL was extracted. In Genkit Developer Tools, you can test tools individually to verify their functionality before incorporating them into a Flow, making development more efficient.
Next, select mainFlow from the Flows menu. Enter the following prompt and execute it.
Prompt: First, fetch the content inside URL. Next, summarize the content in less than 200 words. https://medium.com/firebase-developers/implementing-function-calling-using-genkit-0c03f6cb9179
You can see that the content has been summarized based on the extracted content.
Look at the View trace. You'll see that two requests were made to the Gemini API, with webLoader called in between, confirming that the tool was indeed invoked.
Try defining your own tools and implementing Function Calling.
This concludes the hands-on session. It's impressive that so much can be achieved with so little code. Here are some next steps for those who want to dive deeper.