- Semantic Search: Semantic search enables users to query Knowledge Bases using natural language. When searching semantically, you reference the content column in your SQL statement. MindsDB interprets the input as a semantic query and uses vector-based similarity to find relevant results.
- Metadata Filtering: Metadata filtering allows users to query Knowledge Bases based on the available metadata fields. These fields can be used in the WHERE clause of a SQL statement.
- Relevance Filtering: Every semantic search result is assigned a relevance score, which indicates how closely a given entry matches your query. You can filter results by this score to ensure only the most relevant entries are returned.
- Hybrid Search: Hybrid search combines the flexibility of semantic search and exact keyword matching. Learn more here.
Learn more about features of knowledge bases available via SQL API.
find() Function
Knowledge bases provide an abstraction that enables users to see the stored data.
Note that the examples here search the sample knowledge base that was created and populated in the previous Example sections.
Data Stored in Knowledge Base
The following columns are stored in the knowledge base.
id
It stores values from the column defined in the id_column parameter when creating the knowledge base. These are the source data IDs.
chunk_id
Knowledge bases chunk the inserted data in order to fit the defined chunk size. If the chunking is performed, the following chunk ID format is used: <id>:<chunk_number>of<total_chunks>:<start_char_number>to<end_char_number>.
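The chunk ID format above can be split back into its parts on the client side. The following is a minimal sketch, not part of MindsDB: the `ChunkId` type, the `parse_chunk_id` helper, and the sample value `A1:2of5:100to199` are all hypothetical, and the sketch makes no assumption about whether chunk numbers start at 0 or 1 or whether the end character is inclusive.

```python
import re
from typing import NamedTuple


class ChunkId(NamedTuple):
    """Parts of a chunk ID (hypothetical helper type, not a MindsDB class)."""
    source_id: str
    chunk_number: int
    total_chunks: int
    start_char: int
    end_char: int


# Format: <id>:<chunk_number>of<total_chunks>:<start_char_number>to<end_char_number>
# The numeric parts are anchored to the end of the string, so a source ID
# that itself contains ':' is still captured whole.
_CHUNK_ID_RE = re.compile(
    r"^(?P<id>.+):(?P<num>\d+)of(?P<total>\d+):(?P<start>\d+)to(?P<end>\d+)$"
)


def parse_chunk_id(chunk_id: str) -> ChunkId:
    """Split a chunk ID string into its components."""
    match = _CHUNK_ID_RE.match(chunk_id)
    if match is None:
        raise ValueError(f"not a chunk ID in the expected format: {chunk_id!r}")
    return ChunkId(
        source_id=match["id"],
        chunk_number=int(match["num"]),
        total_chunks=int(match["total"]),
        start_char=int(match["start"]),
        end_char=int(match["end"]),
    )


# Hypothetical sample value, made up for illustration.
parsed = parse_chunk_id("A1:2of5:100to199")
print(parsed.source_id, parsed.chunk_number, parsed.total_chunks)
```

Grouping rows by `source_id` in this way makes it possible to collect all chunks that came from the same source record.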
chunk_content
It stores values from the column(s) defined in the content_columns parameter when creating the knowledge base.
metadata
It stores the general metadata and the metadata defined in the metadata_columns parameter when creating the knowledge base.
distance
It stores the calculated distance between the chunk’s content and the search phrase.
relevance
It stores the calculated relevance of the chunk as compared to the search phrase. Its values are between 0 and 1.
Note that the calculation method of relevance differs as follows:
- When the reranking model is provided, the default relevance is equal to or greater than 0, unless defined otherwise in the WHERE clause.
- When the reranking model is not provided and the relevance is not defined in the query, then no relevance filtering is applied and the output includes all rows matched based on the similarity and metadata search.
- When the reranking model is not provided but the relevance is defined in the query, then the relevance is calculated based on the distance column (1/(1 + distance)) and the relevance value defined in the query is compared with this calculated relevance to filter the output.
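The distance-based case can be illustrated with a short sketch. It assumes each row is a plain dictionary with a `distance` key and that the query compares with `>=`; the function names are hypothetical and this is not MindsDB's internal implementation, only the 1/(1 + distance) formula applied to sample values.

```python
from typing import Any


def relevance_from_distance(distance: float) -> float:
    """Map a non-negative distance to a relevance in (0, 1] via 1 / (1 + distance)."""
    return 1.0 / (1.0 + distance)


def filter_by_relevance(rows: list[dict[str, Any]], threshold: float) -> list[dict[str, Any]]:
    """Keep rows whose distance-derived relevance is at least `threshold`.

    Assumption: the comparison is `>=`; a query could use another operator.
    """
    kept = []
    for row in rows:
        relevance = relevance_from_distance(row["distance"])
        if relevance >= threshold:
            kept.append({**row, "relevance": relevance})
    return kept


# Made-up rows for illustration only.
rows = [
    {"id": "a", "distance": 0.0},  # relevance 1.0
    {"id": "b", "distance": 1.0},  # relevance 0.5
    {"id": "c", "distance": 3.0},  # relevance 0.25
]
print([row["id"] for row in filter_by_relevance(rows, threshold=0.5)])
```

A distance of 0 maps to a relevance of 1, and the relevance approaches 0 as the distance grows, which keeps the values within the 0 to 1 range described above.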
Semantic Search
Users can query a knowledge base using semantic search by providing the search phrase (called content) to be searched for.
When querying a knowledge base, the default values include the following:
- relevance: If not provided, its default value is equal to or greater than 0, ensuring there is no filtering of rows based on their relevance.
- LIMIT: If not provided, its default value is 10, returning a maximum of 10 rows.
Note that when both relevance and LIMIT are specified, for example with LIMIT 20, the query first extracts 20 rows (as defined in the LIMIT clause) that match the defined content. Next, this set of rows is filtered to match the defined relevance.
Users can filter by relevance in order to get only the most relevant results.
By providing the relevance filter, the output is limited to only data with a relevance score that satisfies the provided value. The available values of relevance are between 0 and 1, and its default value covers all available relevance values, ensuring no filtering based on the relevance score.
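The interaction between LIMIT and relevance described above means a query can return fewer rows than the limit. The sketch below imitates that order of operations on an in-memory list. It assumes the rows arrive ordered by best content match and already carry a `relevance` value. The `search` function and the sample rows are hypothetical, not a MindsDB API.

```python
from typing import Any


def search(
    rows: list[dict[str, Any]],
    limit: int = 10,             # default LIMIT is 10
    min_relevance: float = 0.0,  # default relevance >= 0 filters nothing
) -> list[dict[str, Any]]:
    """Apply LIMIT first, then the relevance filter, in that order."""
    # Step 1: take at most `limit` rows that match the content.
    candidates = rows[:limit]
    # Step 2: filter that candidate set by relevance.
    return [row for row in candidates if row["relevance"] >= min_relevance]


# Made-up rows, ordered by content match (best first).
matches = [
    {"id": "a", "relevance": 0.9},
    {"id": "b", "relevance": 0.6},
    {"id": "c", "relevance": 0.3},
    {"id": "d", "relevance": 0.8},
]

# Only the first three rows are considered, and "c" is then dropped by the
# relevance filter. "d" is never considered because it falls outside the limit.
result = search(matches, limit=3, min_relevance=0.5)
print([row["id"] for row in result])
```

If more high-relevance rows are needed, raising the limit widens the candidate set before the relevance filter is applied.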
Users can limit the number of rows returned.
Metadata Filtering
Besides semantic search features, knowledge bases enable users to filter the result set by the defined metadata.
Note that when filtering by metadata alone, the relevance column values are not calculated.
Users can do both: filter by metadata and search by content.