4 posts tagged with "v0.3.1"

· 2 min read
Yung-Hsiang Hu

Starting with v0.3.1, Kuwa's RAG applications (DocQA, WebQA, DatabaseQA, and SearchQA) support customizing advanced parameters through the Bot's Modelfile, so a single Executor can be virtualized into multiple RAG applications. The parameters are described below, followed by a usage example.

Parameter Description

The values shown below are the defaults for the v0.3.1 RAG applications.

Shared Parameters for All RAGs

PARAMETER retriever_embedding_model "thenlper/gte-base-zh" # Embedding model name
PARAMETER retriever_mmr_fetch_k 12 # Number of candidate chunks fetched for MMR
PARAMETER retriever_mmr_k 6 # Number of chunks MMR selects from the candidates
PARAMETER retriever_chunk_size 512 # Length of each chunk in characters (not restricted for DatabaseQA)
PARAMETER retriever_chunk_overlap 128 # Overlap length between chunks in characters (not restricted for DatabaseQA)
PARAMETER generator_model None # Model used to generate the answer; None means auto-selection
PARAMETER generator_limit 3072 # Length limit of the entire prompt in characters
PARAMETER display_hide_ref False # Set to True to hide references
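
To make the retriever parameters more concrete, here is a minimal Python sketch of what character-based chunking with overlap and MMR (maximal marginal relevance) selection generally look like. It is not Kuwa's actual implementation: the function names, the fixed-window splitting strategy, and the lambda_mult weighting are assumptions made for illustration. Only the default values 512, 128, 12, and 6 come from the parameters above.

```python
import math


def split_into_chunks(text, chunk_size=512, chunk_overlap=128):
    """Split text into fixed-size character windows (hypothetical helper).

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbouring chunks share chunk_overlap characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec, chunk_vecs, fetch_k=12, k=6, lambda_mult=0.5):
    """Return the indices of up to k chunks chosen by MMR (hypothetical helper).

    lambda_mult is an assumed trade-off weight: 1.0 favours relevance only,
    0.0 favours diversity only.
    """
    # Step 1: keep the fetch_k chunks most similar to the query.
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine(query_vec, chunk_vecs[i]),
                    reverse=True)
    candidates = ranked[:fetch_k]
    selected = []

    def score(i):
        relevance = cosine(query_vec, chunk_vecs[i])
        redundancy = max((cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                         default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    # Step 2: greedily pick k chunks, penalising similarity to earlier picks.
    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With the defaults, a 1,000-character document yields three chunks, and MMR keeps 6 of the 12 best-matching chunks while skipping near-duplicates.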

DocQA, WebQA, SearchQA Specific Parameters

PARAMETER crawler_user_agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36" # Crawler UA string

SearchQA Specific Parameters

PARAMETER search_advanced_params "" # Advanced search parameters (SearchQA only)
PARAMETER search_num_url 3 # Number of search results to retrieve [1~10] (SearchQA only)

DatabaseQA Specific Parameters

PARAMETER retriever_database None # Path to vector database on local Executor

Usage Example

To create a DatabaseQA knowledge base that answers with a specific model, create a Bot,
select DocQA as the base model, and fill in the following Modelfile.

PARAMETER generator_model "model_access_code" # Model used to generate the answer; None means auto-selection
PARAMETER generator_limit 3072 # Length limit of the entire prompt in characters
PARAMETER retriever_database "/path/to/local/database/on/executor" # Path to vector database on local Executor
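
If you maintain several Bots on top of the same Executor, it may be convenient to generate these Modelfile lines from a script. The sketch below is a hypothetical helper, not part of Kuwa. It assumes that string values are double-quoted and that numbers, booleans, and None are written bare, as in the examples above.

```python
def render_modelfile(params):
    """Render a dict as Modelfile PARAMETER lines (hypothetical helper).

    Strings are wrapped in double quotes; numbers, booleans, and None are
    written as-is, mirroring the examples in this post.
    """
    lines = []
    for name, value in params.items():
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        lines.append(f"PARAMETER {name} {rendered}")
    return "\n".join(lines)


# Two Bots sharing one Executor but pointing at different databases.
# The paths and the access code are placeholders, not real values.
for database in ("/path/to/database/a", "/path/to/database/b"):
    print(render_modelfile({
        "generator_model": "model_access_code",
        "generator_limit": 3072,
        "retriever_database": database,
    }))
    print()
```

Paste each printed block into the Modelfile field of the corresponding Bot.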

· One min read
Yung-Hsiang Hu

Kuwa v0.3.1 has preliminary support for commonly used visual language models (VLMs). In addition to text inputs, such models can also take images as input and respond to user instructions based on the content of the images. This tutorial will guide you through the initial setup and usage of VLMs.

· 5 min read
Yung-Hsiang Hu

Kuwa v0.3.1 adds the Kuwa Speech Recognizer, which is based on the Whisper speech recognition model. It generates transcripts from uploaded audio files and supports timestamps and speaker labels.

Known Issues and Limitations

Hardware requirements

By default, the Whisper medium model is used and speaker diarization is disabled. GPU VRAM consumption is shown in the following table.

Model Name                                               Number of parameters   VRAM requirement   Relative recognition speed
tiny                                                     39 M                   ~1 GB              ~32x
base                                                     74 M                   ~1 GB              ~16x
small                                                    244 M                  ~2 GB              ~6x
medium                                                   769 M                  ~5 GB              ~2x
large                                                    1550 M                 ~10 GB             1x
pyannote/speaker-diarization-3.1 (Speaker Diarization)   -                      ~3 GB              -
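
As a rough worked example of reading the table, the Python sketch below picks the largest Whisper model that fits a given amount of VRAM. The function name is hypothetical, and the sketch assumes that the diarization model's ~3 GB is simply added on top of the Whisper model's requirement, which the table does not state explicitly. Treat the result as an estimate only.

```python
# Approximate VRAM needs in GB, taken from the table above (smallest to largest).
WHISPER_VRAM_GB = {"tiny": 1, "base": 1, "small": 2, "medium": 5, "large": 10}
DIARIZATION_VRAM_GB = 3  # pyannote/speaker-diarization-3.1


def largest_model_that_fits(available_gb, diarization=False):
    """Return the largest Whisper model whose VRAM need fits the budget, or None."""
    # Assumption: diarization VRAM is added on top of the Whisper model's VRAM.
    budget = available_gb - (DIARIZATION_VRAM_GB if diarization else 0)
    fitting = [name for name, need in WHISPER_VRAM_GB.items() if need <= budget]
    return fitting[-1] if fitting else None


print(largest_model_that_fits(8))                    # medium
print(largest_model_that_fits(4, diarization=True))  # base
```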

Known limitations

  1. The input language cannot currently be detected automatically and must be specified manually.
  2. The speaker diarization module is currently multi-threaded, which causes the model to be reloaded each time and lengthens the response time.
  3. Overlapping speech is easily misrecognized when multiple speakers talk at the same time.