
The goal of quackingllama is to facilitate efficient interactions with LLMs; its current target use-case is text classification (e.g. categorise or tag contents, or extract information from text). Key features include:

  • facilitate consistently formatted responses (through Ollama’s structured outputs)
  • facilitate local caching (by storing results in a local DuckDB database)
  • facilitate initiating text classification tasks (through examples and convenience functions)
  • facilitate keeping a record of the details of how each response was obtained

Installation

You can install the development version of quackingllama from GitHub with:

# install.packages("pak")
pak::pak("giocomai/quackingllama")

Default options and outputs

In order to facilitate consistent results, quackingllama sets the temperature of the model to 0 by default: this means that the model will always return the same response when given the same prompt. When caching is enabled, responses can then consistently be retrieved from the local cache without querying the LLM again.

All functions consistently return results in a data frame (a tibble).

Key functionalities will be demonstrated through a series of examples.

As the package is developed further, some of the less intuitive tasks (e.g. defining a schema) will be facilitated through dedicated convenience functions.

Basic examples

Text generation

Let’s generate a short piece of text. Results are returned in a data frame, with the response in the first column and all relevant metadata about the query stored along with it.

pol_df <- ql_prompt(prompt = "Describe an imaginary political leader in less than 100 words.") |>
  ql_generate()

str(pol_df)
#> tibble [1 × 22] (S3: tbl_df/tbl/data.frame)
#>  $ response            : chr "**Elias Voss** – *\"The Pragmatic Sage\"*\n\nA towering figure with silver-streaked hair and a perpetually thou"| __truncated__
#>  $ prompt              : chr "Describe an imaginary political leader in less than 100 words."
#>  $ thinking            : chr NA
#>  $ created_at          : chr "2026-03-14T19:38:49.757712483Z"
#>  $ done                : logi TRUE
#>  $ done_reason         : chr "stop"
#>  $ total_duration      : num 1.52e+10
#>  $ load_duration       : num 2.11e+09
#>  $ prompt_eval_count   : num 27
#>  $ prompt_eval_duration: num 7.56e+08
#>  $ eval_count          : num 142
#>  $ eval_duration       : num 1.21e+10
#>  $ timeout             : num 300
#>  $ keep_alive          : chr "5m"
#>  $ think               : logi FALSE
#>  $ model               : chr "ministral-3:3b"
#>  $ system              : chr "You are a helpful assistant."
#>  $ format              : chr ""
#>  $ seed                : int 0
#>  $ temperature         : num 0
#>  $ hash                : chr "984f9fdaf2a9f3519375fc8faab345ee"
cat(">", stringr::str_split(string = pol_df$response,
                            pattern = "\n",
                            simplify = TRUE))

Elias Voss“The Pragmatic Sage” A towering figure with silver-streaked hair and a perpetually thoughtful gaze, Elias Voss rose from a humble scholar’s cottage to unite a fractured nation through quiet diplomacy. His speeches were woven from ancient wisdom and modern pragmatism, blending idealism with ruthless efficiency. He dismantled corrupt systems with a smile, rebuilt economies with a whisper, and won hearts by listening more than he spoke. His greatest legacy? A nation that dared to dream—while ensuring no one starved. “Leadership isn’t power,” he’d say, “it’s the art of making people believe they could do it themselves.”

If we are interested in variations of this text, we can easily create them:

# TODO accept multiple prompts by default

pol3_df <- purrr::map(
  .x = c("progressive", "conservative", "centrist"),
  .f = \(x) {
    ql_prompt(prompt = glue::glue("Describe an imaginary {x} politician in less than 100 words.")) |>
      ql_generate()
  }
) |>
  purrr::list_rbind()

pol3_df$response
#> [1] "**Dr. Eleanor Voss** is a progressive visionary blending idealism with pragmatism. A former community organizer, she champions universal healthcare, green energy, and worker cooperatives while advocating for restorative justice. Her policies prioritize equity—like free college and rent control—while addressing systemic change through grassroots movements. A fierce advocate for climate justice, she pushes for carbon taxes with direct democracy funds. Humble yet bold, she rejects dogma, calling for \"radical collaboration\" over ideological battles. Her speeches blend hope with hard truths, making her both inspiring and uncompromising—a leader who refuses to let progress stall."                                                                                                              
#> [2] "**Senator Marcus Holloway** is a staunch conservative firebrand, a man of unshakable faith and unyielding tradition. With a voice like gravel and a demeanor of stern resolve, he champions \"law and order\" while demonizing \"woke\" progress. His policies prioritize \"family values\" and \"American exceptionalism,\" often framing them as divine mandates. Holloway opposes \"socialist\" policies, mocks \"elite\" media, and preaches \"tough love\" for the \"forgotten.\" His rallies are packed with patriotic fervor, where he scorns \"globalists\" and \"cultural Marxists.\" Though often divisive, his unwavering conviction keeps him a beloved (and feared) figure in the GOP."                                                                                                                       
#> [3] "**Name:** Dr. Eleanor Voss\n**Party:** *Centrist Unity Coalition (CUC)*\n**Motto:** *\"Progress without extremes—balancing ambition with wisdom.\"*\n\nA pragmatic reformer with a PhD in public policy, Voss blends progressive ideals with fiscal responsibility. She champions universal healthcare but insists on \"smart\" subsidies, opposing both single-payer chaos and private-sector exploitation. On climate, she pushes green tech but argues for \"transition, not revolution.\" Economically, she supports worker protections but opposes overregulation, favoring \"market-friendly\" policies. A master of compromise, she’s beloved by moderates but mocked by zealots—yet her centrist pragmatism keeps her in power. *\"I don’t want to fix everything—just make sure we don’t break anything first.\"*"

These are, as is customary for LLMs by default, free-form texts. Depending on the task at hand, we may want the text in a more structured format. To do so, we must provide the LLM with a schema describing how we want it to return data.

Schemas can be very simple, e.g., if we want our response to feature only a “name” and a “description” field, both character strings, we’d use the following schema:

# TODO convenience function to facilitate creation of common schemas

schema <- list(
  type = "object",
  properties = list(
    `name` = list(type = "string"),
    `description` = list(type = "string")
  ),
  required = c("name", "description")
)
pol_schema_df <- ql_prompt(
  prompt = "Describe an imaginary political leader.",
  format = schema
) |>
  ql_generate()

pol_schema_df$response |>
  yyjsonr::read_json_str()
#> $name
#> [1] "Dr. Elias Voss"
#> 
#> $description
#> [1] "A visionary yet enigmatic political leader born in 1968 in the small, mountainous town of Schwarzwald, Germany. Raised in a family of miners and scholars, Elias developed an early fascination for both the struggles of the working class and the intellectual pursuits of his ancestors."

or slightly more complex, for example making clear that we expect one field to be numeric, and another to pick one of a set of options:

schema <- list(
  type = "object",
  properties = list(
    `name` = list(type = "string"),
    `age` = list(type = "number"),
    `gender` = list(
      type = "string",
      enum = c("female", "male", "non-binary")
    ),
    `motto` = list(type = "string"),
    `description` = list(type = "string")
  ),
  required = c(
    "name",
    "age",
    "gender",
    "motto",
    "description"
  )
)

And the returned response is formatted as expected:

pol_schema_df <- ql_prompt(
  prompt = "Describe an imaginary political leader.",
  format = schema
) |>
  ql_generate()

pol_schema_df$response |>
  yyjsonr::read_json_str()
#> $name
#> [1] "Dr. Elias Voss"
#> 
#> $age
#> [1] 68
#> 
#> $gender
#> [1] "male"
#> 
#> $motto
#> [1] "Unity Through Vision"
#> 
#> $description
#> [1] "Dr. Elias Voss is a charismatic and visionary political leader born in the heart of a once-thriving industrial city in Europe. Raised in modest circumstances, he developed a deep appreciation for both the struggles of the working class and the complexities of global politics. His journey to leadership began in the 1980s when he became a prominent figure in labor unions, advocating for workers' rights and economic justice. His speeches were known for their eloquence and his ability to connect with ordinary people, making him a beloved figure among the masses."

Having the response in a structured format allows for easily storing results in a data frame and processing them further.

pol3_schema_df <- purrr::map(
  .x = c("progressive", "conservative", "centrist"),
  .f = \(x) {
    ql_prompt(
      prompt = glue::glue("Describe an imaginary {x} politician."),
      format = schema
    ) |>
      ql_generate()
  }
) |>
  purrr::list_rbind()

pol3_schema_responses_df <- purrr::map(
  .x = pol3_schema_df$response,
  .f = \(x) {
    yyjsonr::read_json_str(x) |>
      tibble::as_tibble()
  }
) |>
  purrr::list_rbind()

pol3_schema_responses_df
#> # A tibble: 3 × 5
#>   name                    age gender motto                           description
#>   <chr>                 <int> <chr>  <chr>                           <chr>      
#> 1 Dr. Elena Vasquez        58 female Progress through dialogue, equ… Dr. Elena …
#> 2 Alexandre Dubois         65 male   Order, Tradition, and Responsi… Alexandre …
#> 3 Étienne Laurent-Duval    52 male   Balance, Pragmatism, and Commo… Étienne La…

This has obvious advantages for many data processing tasks, and, as will be seen, can effectively be used to enhance the consistency of text classification tasks. But first, let’s discuss caching and some of the options that determine output.

Caching and options

So far, local caching has not been enabled: this means that even when the exact same response is expected, it will still be requested from the LLM, which can be exceedingly time-consuming, especially for repetitive tasks or for data processing pipelines that may recurrently encounter the same data.

Caching is the obvious answer, but when can we expect exactly the same response from the LLM, considering that LLMs do not necessarily return the same response even when given the same prompt?

Two parameters are particularly relevant for understanding this, temperature and seed.

What is “temperature”? Ollama’s documentation concisely clarifies the effect of this parameter by suggesting that “Increasing the temperature will make the model answer more creatively.” LLMs often have the default temperature set to 0.7 or 0.8. In brief, when temperature is set to its maximum value of 1, LLMs provide more varied responses. When temperature is set to 0, LLMs are at their most consistent: they always provide the same response to the same prompt.

What does this mean in practice? For example, if I set the temperature to 0 and ask the same LLM to generate a haiku, I will always get the very same haiku, no matter how many times I run the command.

ql_prompt(prompt = "A reasonably funny haiku", temperature = 0) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "Moonlight glows—\na cat steals my sandwich,\nthen naps on my toes."
ql_prompt(prompt = "A reasonably funny haiku", temperature = 0) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "Moonlight glows—\na cat steals my sandwich,\nthen naps on my toes."
ql_prompt(prompt = "A reasonably funny haiku", temperature = 0) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "Moonlight glows—\na cat steals my sandwich,\nthen naps on my toes."

If I set the temperature to 1, I get a different haiku every time (ok, not very different, really, but still different).

ql_prompt(prompt = "A reasonably funny haiku", temperature = 1) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "Soft rain, umbrella—\nfriendly stranger leans near to share\ntea with stranger too"
ql_prompt(prompt = "A reasonably funny haiku", temperature = 1) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "Moon’s glow—\nPizza slices turn my lawn\ninto a sloppy joe"
ql_prompt(prompt = "A reasonably funny haiku", temperature = 1) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "**Shoes on feet so tired,**\n**Dinner with the neighbor's cat—**\n**Bread crumbs left? *Yes.***"

Replicability of results is possible even when the temperature is set to a value higher than 0: we just need to set the same seed, and we’ll consistently get the same result.

ql_prompt(prompt = "A reasonably funny haiku", temperature = 1, seed = 2025) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "**Paws, tail, then—**\n**Barking’s just a habit...**\n**Dog’s the boss.** 🐶"
ql_prompt(prompt = "A reasonably funny haiku", temperature = 1, seed = 2025) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "**Paws, tail, then—**\n**Barking’s just a habit...**\n**Dog’s the boss.** 🐶"
ql_prompt(prompt = "A reasonably funny haiku", temperature = 1, seed = 2025) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "**Paws, tail, then—**\n**Barking’s just a habit...**\n**Dog’s the boss.** 🐶"

Two additional components determine whether the response is exactly the same in different instances: system and format. The system parameter is passed along with the prompt to the LLM, and by default is set to the generic “You are a helpful assistant.” This is a reasonable generic option, but there may be good reasons to be more specific depending on the task at hand.

For example, if we set as the system message “You are a 19th century romantic writer.”, the style of the response will change (somewhat) accordingly.

ql_prompt(
  prompt = "A reasonably funny haiku",
  temperature = 0,
  system = "You are a 19th century romantic writer."
) |>
  ql_generate() |>
  dplyr::pull(response)
#> [1] "*\"Whispers drift on breezes soft—*\n*Moonlit garden hums a tune,*\n*Love’s ghost dances, light as dust.\"*\n\n*(A touch of whimsy, dear reader—let the stars weave their magic!)* 🌙✨"

As discussed above, format is relevant only when a structured output is requested from the LLM by providing a schema. For example, if we provided a different schema, the output would also be different.

schema <- list(
  type = "object",
  properties = list(
    `haiku` = list(type = "string"),
    `why_funny` = list(type = "string")
  ),
  required = c(
    "haiku",
    "why_funny"
  )
)

haiku_str_df <- ql_prompt(
  prompt = "Write a funny haiku, and explain why it is supposed to be funny.",
  format = schema
) |> ql_generate()

haiku_str_df |>
  dplyr::pull(response) |>
  yyjsonr::read_json_str()
#> $haiku
#> [1] "Pizza crust burns—oh no!\nCrumbles fly like tiny tornadoes,\nDinner’s a crime scene."
#> 
#> $why_funny
#> [1] "This haiku plays on the absurdity of a pizza disaster! The first line sets up a dramatic, almost comedic panic—burning pizza crust—while the second line escalates the chaos with the chaotic, chaotic crumbles. The third line ties it all together with a cheeky twist: 'Dinner’s a crime scene.' It’s funny because it’s a relatable, exaggerated moment of kitchen mishap, framed in a way that feels both silly and slightly surreal."

In brief, when should we expect to receive exactly the same response from the LLM, making it possible to retrieve it from the cache if already processed? The following conditions must all apply:

  • same model
  • same system parameter
  • same format, i.e., same schema (if given)
  • same prompt
  • and
    • either the same seed and any value for temperature OR
    • any seed and temperature set to zero

If the above conditions are met, and caching is enabled, the response will be retrieved from the local cache, rather than from the LLM.

It’s easy to enable caching for the current session with ql_enable_db(). By default, the database is stored in the current working directory, but this can be changed with ql_set_db_options().

Now even prompts that would take the LLM many seconds to process can be returned efficiently from the cache.
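For example, a minimal sketch (ql_enable_db() is described above; timings are illustrative and depend on your hardware and model):

```r
library("quackingllama")

# enable caching for the current session: results are stored in a local
# DuckDB database and retrieved from there on subsequent identical queries
ql_enable_db()

# first run: the prompt is sent to the LLM and the response is cached
system.time(
  ql_prompt(prompt = "A reasonably funny haiku") |>
    ql_generate()
)

# second run: same model, system, format, prompt, seed and temperature,
# so the response comes from the local cache almost instantly
system.time(
  ql_prompt(prompt = "A reasonably funny haiku") |>
    ql_generate()
)
```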

Text classification

First, let’s create some texts that we will then try to classify:

schema <- list(
  type = "object",
  properties = list(
    `party name` = list(type = "string"),
    `political leaning` = list(
      type = "string",
      enum = c("progressive", "conservative")
    ),
    `political statement` = list(type = "string")
  ),
  required = c(
    "party name",
    "political leaning",
    "political statement"
  )
)

parties_df <- purrr::map2(
  .x = rep(c("progressive", "conservative"), 5),
  .y = 1:10,
  .f = \(x, y) {
    ql_prompt(
      prompt = glue::glue("Describe an imaginary {x} political party, inventing their party name and a characteristic political statement."),
      format = schema,
      temperature = 1,
      seed = y
    ) |>
      ql_generate()
  }
) |>
  purrr::list_rbind()

parties_responses_df <- purrr::map(
  .x = parties_df$response,
  .f = \(x) {
    yyjsonr::read_json_str(x) |>
      tibble::as_tibble()
  }
) |>
  purrr::list_rbind()

parties_responses_df
#> # A tibble: 10 × 3
#>    `party name`                        `political leaning` `political statement`
#>    <chr>                               <chr>               <chr>                
#>  1 EcoHarmony Collective               progressive         "The future of our p…
#>  2 Protectora et Conservatrix America… conservative        "We stand for preser…
#>  3 Vanguard Horizon (VH)               progressive         "Rejection of the st…
#>  4 The Union of American Sovereignty … conservative        "**A New Era of Amer…
#>  5 Équilibre Démocrate Progressiste (… progressive         "The future of socie…
#>  6 United Front for National Sovereig… conservative        "In an era defined b…
#>  7 **Unio Progressiva Ecológica (UPE)  progressive         "**A Justicia Sosten…
#>  8 Conservatio Libertas (CL)           conservative        "Respect for traditi…
#>  9 **EcoHorizons**                     progressive         "A democratic social…
#> 10 Proprietas America – Guardians of … conservative        "PALIRE stands for '…

Then we ask a different model to categorise the results (in this example, text generation with ministral-3:3b, text categorisation with gemma3:4b). Explanations are trimmed in the following table for clarity.

category_schema <- list(
  type = "object",
  properties = list(
    `political leaning` = list(
      type = "string",
      enum = c("progressive", "conservative")
    ),
    `explanation` = list(type = "string")
  ),
  required = c(
    "political leaning",
    "explanation"
  )
)

categories_df <- purrr::map(
  .x = parties_responses_df[["political statement"]],
  .f = \(current_statement) {
    ql_prompt(
      prompt = current_statement,
      system = "You identify the political leaning of political parties based on their statements.",
      format = category_schema,
      temperature = 0,
      model = "gemma3:4b"
    ) |>
      ql_generate()
  }
) |>
  purrr::list_rbind()

categories_responses_df <- purrr::map(
  .x = categories_df$response,
  .f = \(x) {
    yyjsonr::read_json_str(x) |>
      tibble::as_tibble()
  }
) |>
  purrr::list_rbind()



responses_combo_df <- dplyr::bind_cols(
  parties_responses_df |>
    dplyr::rename(`given political leaning` = `political leaning`) |>
    dplyr::select(`political statement`, `given political leaning`),
  categories_responses_df |>
    dplyr::rename(`identified political leaning` = `political leaning`)
)

responses_combo_df |>
  dplyr::mutate(explanation = stringr::str_trunc(explanation, width = 200) |> 
                  stringr::str_remove_all(pattern =  "\n")) |>
  knitr::kable()
| political statement | given political leaning | identified political leaning | explanation |
|---|---|---|---|
| The future of our planet and society must be built on sustainability, equity, and solidarity. Our vision is a world where resources are managed fairly, ecosystems are protected, and every human being has access to a thriving life, free from systemic exploitation and environmental degradation. Progress is achieved through grassroots innovation, international cooperation, and policies that prioritize human well-being over profit. | progressive | progressive | The statement strongly advocates for sustainability, equity, and solidarity, which are core tenets of progressive political thought. It explicitly critiques prioritizing profit over human well-bein… |
| We stand for preserving American liberties, safeguarding traditional values, and defending strong national sovereignty. PCA advocates for fiscal responsibility, limited government, and a robust defense policy to protect the homeland. We believe in empowering individual freedom within a framework that upholds religious, familial, and cultural traditions while addressing modern challenges like immigration reform and economic opportunity for all. PCA’s vision is built on a commitment to patriotism and a principled opposition to excessive global influence, aiming to reassert American leadership through wise governance and mutual respect for democratic ideals. | conservative | conservative | This statement clearly aligns with conservative principles. Let’s break down why:* Preserving American Liberties, Safeguarding Traditional Values: This is a cornerstone of conservative ideolo… |
| Rejection of the status quo in favor of a holistic, equitable, and future-oriented society; VH believes in climate justice, economic solidarity, and democratic self-determination. The party emphasizes participatory governance, cultural preservation, and ecological stewardship, advocating for a world where marginalized voices lead societal transformation. Their manifesto: ‘Build bridges, not barriers—where everyone can thrive.’ | progressive | progressive | VH’s core values – rejection of the status quo, holistic and equitable society, future-oriented thinking – firmly place them within the progressive political spectrum. Their specific commitments to… |
| A New Era of American Strength—Protecting Tradition, Defending Liberty, and Strengthening Our Nation | conservative | conservative | The phrase “A New Era of American Strength” immediately suggests a desire for national revival and power, often associated with conservative movements. The emphasis on “Protecting Tradition,” “Defe… |
| The future of society is built upon three essential pillars: justice equitable for all citizens, protection of the Earth for current and future generations, and social, cultural, and economic cohesion through collective empowerment and democratic participation. The EDP advocates for a society that ensures the well-being of people—across all socio-economic and cultural divides—while safeguarding biodiversity, sustainability, and human rights. It champions policies that address systemic inequalities, combat climate change, and foster collaboration between governments, citizens, and international organizations to create a fair, equitable, and resilient global order. Our vision includes democratic governance, fair wages, strong worker rights, education and healthcare for all, and a sustainable economy that prioritizes people over profits. | progressive | progressive | The text presents a clearly progressive vision for society, advocating for key tenets of the left-leaning political spectrum. Here’s a breakdown of why it leans progressive:* **Justice Equitable … |
| In an era defined by chaos, we stand firm for the principles of order, family, and the unchanging bedrock of American ideals. We believe that progress must be measured against its capacity to uplift the working class, preserve the sanctity of life, and defend the values that bind us together. No nation prospers when its borders are crumbling, its traditions forgotten, or its sovereignty undermined by foreign influence. Our agenda is rooted in restoring faith in the institutions that have kept this nation strong: strong families, strong laws, and strong American leadership. We pledge to protect the sanctity of the unborn, uphold the sanctity of marriage, safeguard free speech, ensure secure borders, and stand against the tide of globalization that has weakened the middle class. Our commitment is not to division but to harmony: unity through shared heritage and the shared belief that each generation must carry forward the values that define who we are. Together, we will build a future where our greatest strengths—our work ethic, our religious faith, and our unshaken spirit—guide us toward a tomorrow built on the pillars of dignity, pride, and sovereignty. | conservative | conservative | This statement strongly aligns with conservative political ideology. Here’s a breakdown of why:* Emphasis on Order & Tradition: Phrases like |
| A Justicia Sostenible para el Futuro | progressive | progressive | The phrase ‘Justicia Sostenible para el Futuro’ (Sustainable Justice for the Future) inherently leans towards progressive values. It combines the concepts of justice and sustainability, both of whi… |
| Respect for tradition, the protection of property, and the preservation of order are the pillars of society, but we must never forget our responsibility to future generations. We stand for strong national identity, limited government interference in personal lives, free-market capitalism with a moral compass, and a commitment to defending liberty while maintaining the strength of the American way of life. Together, we will uphold the values that have made our nation great. | conservative | conservative | This statement aligns strongly with conservative values and ideology. Let’s break down why:* Respect for Tradition, Protection of Property, Preservation of Order: These are core tenets of con… |
| A democratic socialism of the future—building equitable societies while preserving biodiversity and fostering grassroots innovation. Together, we envision a world where public goods, climate justice, and cultural autonomy shape our collective destiny. | progressive | progressive | The statement advocates for ‘democratic socialism,’ a political and economic system rooted in progressive values. Here’s a breakdown of why it aligns with that leaning:* Democratic Socialism: |
| PALIRE stands for ‘Proprietas America – Guardians of Liberty & Heritage’ and is a modern conservative force devoted to preserving the American tradition, defending traditional values, and ensuring prosperity through strength, faith, and free enterprise. Its policies center on sovereign autonomy for states, fiscal responsibility, cultural continuity, and a strong military. | conservative | conservative | PALIRE (Proprietas America – Guardians of Liberty & Heritage) identifies as a modern conservative movement. This is supported by their stated focus on traditional values, a strong military, fiscal … |

In this stereotyped case, the LLM categorises most statements as expected and provides a broadly meaningful explanation for each choice (if you try with shorter sentences, e.g., just a political motto, the correct response rate decreases substantially). Fundamentally:

  • responses are returned in a predictable and user-defined format, consistently responding with user-defined categories
  • responses are cached locally:
    • re-running a categorisation task is efficient
    • the categorisation of a large set of texts can be interrupted at will, and already processed contents will not be categorised again.

Querying with different models can have a substantial impact on the quality of results.

Pass images to the model

You can pass images and have multimodal models, such as “llama3.2-vision” or (the considerably smaller) “llava-phi3”, consider them in their response. Just pass the path of the relevant image to ql_prompt(). For example, if we ask to describe the logo of this package, we get the following response:

library("quackingllama")

img_path <- fs::path(
  system.file(package = "quackingllama"),
  "help",
  "figures",
  "logo.png"
)

resp_df <- ql_prompt(
  prompt = "what is this?",
  images = img_path,
  model = "qwen3.5:2b"
) |>
  ql_generate()


cat(">", resp_df$response)

resp_df <- ql_prompt(
  prompt = "what is this?",
  images = img_path,
  model = "llava-phi3"
) |>
  ql_generate()


cat(">", resp_df$response)
#> > The image features a close-up of an alpaca's face, which is set against a black background. The alpaca has a yellow and gray mask on its face, with the eyes closed in what appears to be sleep or rest. The mask covers the alpaca's mouth and nose, giving it a unique appearance. The image is framed by a pink border, adding a pop of color to the overall composition. The alpaca seems calm and at ease despite the unusual accessory.

Thinking models

In May 2025, Ollama started supporting “thinking” models (more details in the post announcing the feature). Note that not all reasoning models available via Ollama actually support thinking mode; as of July 2025, only three models were effectively supported (deepseek-r1, qwen3, and magistral). An up-to-date list should be available on Ollama’s website.

When thinking mode is enabled, the LLM goes through an iterative “thinking” process before providing its answer. The “thinking” process is expressed in plain English and can be seen along with the response. When thinking is enabled, response time grows considerably, so you may want to extend the timeout options.

See the following example:

strawberry_t_df <- ql_prompt(
  prompt = "How many r are there in strawberry? Provide a concise answer.",
  model = "deepseek-r1:1.5b",
  think = TRUE) |>
  ql_generate(keep_alive = "10m", timeout = 1000)

Here’s the thinking:


cat(">", strawberry_t_df$thinking |> stringr::str_replace_all(pattern = stringr::fixed("\n"), replacement = "\n > ")) 

Okay, so I need to figure out how many ’r’s are in the word “strawberry.” Hmm, let me start by writing down the word: S T R A W B E R R Y.

Wait, that doesn’t seem right. Let me check again. The spelling is S-T-R-A-W-B-E-R-R-Y. So I think I missed an ‘r’ somewhere. Let me go through each letter one by one to make sure I don’t miss any.

  1. S – nope.
  2. T – nope.
  3. R – yes, that’s the first ‘r’.
  4. A – nope.
  5. W – nope.
  6. B – nope.
  7. E – nope.
  8. R – second ‘r’.
  9. R – third ‘r’.
  10. Y – nope.

Wait a minute, I think I might have missed an ‘R’ at the end of “berry.” Let me count again: S T R A W B E R R Y. So after B comes E, then R (that’s one), then another R (two), and finally Y. So that makes three ’r’s in total.

But wait, I’m not sure if the word is spelled correctly. Maybe it’s “strawberry” without an extra ‘R’ at the end? Let me check a dictionary or something to confirm. Oh, no, it does have two more ’R’s after E and before Y. So that makes three ’r’s in total.

I think I was overcomplicating it by thinking about the word “strawberry” as if it were spelled without those extra letters. But actually, it is spelled with two more ’R’s. So the answer should be three ’r’s.

And here is the response:

cat(">", strawberry_t_df$response)

There are three ’r’s in the word “strawberry.”

About context windows and time-outs

ollama is great at enabling easy local deployment of LLMs, but it comes with some embedded defaults that may have unintended consequences.

About the context window

If you look at the page of one of the models available from Ollama’s website, you may well notice that some of them come with very large context windows. For example, gemma3 boasts a “128K context window”, big enough to include book-length inputs. You may expect that, by default, this context window is fully available to you. You would, however, be mistaken: no matter the model’s capabilities, Ollama truncates the input at 2048 tokens. As a user, you would notice this only if you looked at the ollama serve logs, or because you got unsatisfying results, as truncation happens in a way that is mostly invisible to the client. This is a known issue with Ollama, and until it is approached more sensibly by Ollama, users should take care of this limitation themselves (quackingllama will likely include a dedicated warning in future versions). The easiest workaround is to create a new model with a larger context window: it’s a matter of a few seconds, following the instructions reported in the relevant issue on Ollama.

Basically, from the command line you do something like this:

$ ollama run gemma3
>>> /set parameter num_ctx 65536
Set parameter 'num_ctx' to '65536'
>>> /save gemma3-64k
Created new model 'gemma3-64k'
>>> /bye

And in a matter of seconds you will get a gemma3 model with a 64k context window, which you’ll be able to use by choosing gemma3-64k as model.
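If you prefer a non-interactive setup, the same result can be achieved with a Modelfile (standard Ollama functionality; model name and context size as in the example above):

```
$ cat Modelfile
FROM gemma3
PARAMETER num_ctx 65536
$ ollama create gemma3-64k -f Modelfile
```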

About timeout and keep_alive

Congratulations, now you can enjoy bigger context windows. This is all nice, but it also makes it more likely that you will stumble into time-out issues, as processing lengthy prompts can take many minutes.

There are two parameters that determine how long quackingllama will wait for a response from ollama before throwing an error.

  • one is ollama’s keep_alive argument, which tells Ollama how long the model should remain loaded in memory after it is called. By default, this is “5m”, i.e. five minutes.
  • one is httr2’s timeout argument, which expresses how long the client should wait for a response before throwing an error. It defaults to 300; as it is expressed in seconds, this corresponds to 5 minutes.

The combined effect of these two arguments may not be exactly what you expect (a 5-minute keep_alive may actually let the model run for 10 minutes, if your timeout argument is big enough). Either way, be mindful: if you do expect lengthy response times, set both values adequately high.

On the other hand, if you know you have short prompts and expect quick responses, the defaults are more efficient, and will just move on sooner if the model is stuck for whatever reason.
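Putting this together (both arguments appear in the thinking-model example above; the specific values here are illustrative):

```r
# for a lengthy prompt, raise both limits in the ql_generate() call:
# keep the model loaded for up to 30 minutes, and let the client wait
# up to 1800 seconds (30 minutes) before giving up
ql_prompt(prompt = long_prompt) |>
  ql_generate(keep_alive = "30m", timeout = 1800)
```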

In the logo you may or may not recognise a quacking llama or, maybe, just a llama wearing a duck mask. The reference is obviously to two of the main tools used by this package: ollama and DuckDB. Image generated on my machine with stablediffusion.