EVALUATE KNOWLEDGE_BASE Syntax
With the EVALUATE KNOWLEDGE_BASE command, users can evaluate the relevancy and accuracy of the documents and data returned by the knowledge base.
Below is the complete syntax that includes both required and optional parameters.
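A representative sketch is shown below; my_kb, my_datasource, and the table names are placeholders, and the parameter values are illustrative.

```sql
-- A sketch of the full command, assuming placeholder names
-- (my_kb, my_datasource, my_test_table, my_result_table).
-- test_table is required; the remaining parameters are optional.
-- The llm parameter applies only when version = 'llm_relevancy' (see below).
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'doc_id',
    generate_data = true,
    evaluate = true,
    save_to = my_datasource.my_result_table;
```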
test_table
This is a required parameter that stores the name of a table from one of the data sources connected to MindsDB. For example, test_table = my_datasource.my_test_table defines a table named my_test_table from a data source named my_datasource.
This test table stores test data, commonly in the form of questions and answers. Its content depends on the version parameter defined below.
Users can provide their own test data or have the test data generated by the EVALUATE KNOWLEDGE_BASE command, which happens when the generate_data parameter (defined below) is set.
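For instance, a minimal evaluation against user-provided test data might look like this (my_kb and the table names are placeholders):

```sql
-- Sketch: evaluate a knowledge base against an existing test table
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table;
```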
version
This is an optional parameter that defines the version of the evaluator. If not defined, its default value is doc_id.
Available values are as follows:

- version = 'doc_id': The evaluator checks whether the document ID returned by the knowledge base matches the expected document ID as defined in the test table.
- version = 'llm_relevancy': The evaluator uses a language model to rank and evaluate responses from the knowledge base.
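Switching evaluators is a one-parameter change, as in this sketch with placeholder names:

```sql
-- Sketch: use the LLM-based evaluator instead of the default doc_id check
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'llm_relevancy';
```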
generate_data
This is an optional parameter used to configure test data generation; the generated data is saved into the table defined in the test_table parameter. If not defined, its default value is false, meaning that no test data is generated.
Available values are as follows:
- A dictionary containing the following values:

  - from_sql defines the SQL query that fetches the test data. For example, 'from_sql': 'SELECT id, content FROM my_datasource.my_table'. If not defined, it fetches test data from the knowledge base on which the EVALUATE command is executed: SELECT chunk_content, id FROM my_kb.
  - count defines the size of the test dataset. For example, 'count': 100. Its default value is 20.

  When providing the from_sql parameter, its query requires specific column names as follows:

  - With version = 'doc_id', the from_sql parameter should contain a query that returns the id and content columns, like this: 'from_sql': 'SELECT id_column_name AS id, content_column_names AS content FROM my_datasource.my_table'.
  - With version = 'llm_relevancy', the from_sql parameter should contain a query that returns the content column, like this: 'from_sql': 'SELECT content_column_names AS content FROM my_datasource.my_table'.

- A value of true, as in generate_data = true, which implies that default values for from_sql and count will be used.
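Putting the above together, here is a sketch of data generation with a custom source query (placeholder names throughout):

```sql
-- Sketch: generate 100 test rows from a custom query, then evaluate
-- with the default doc_id evaluator
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'doc_id',
    generate_data = {
        'from_sql': 'SELECT id_column_name AS id, content_column_names AS content FROM my_datasource.my_table',
        'count': 100
    };
```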
evaluate
This is an optional parameter that defines whether to evaluate the knowledge base. If not defined, its default value is true.
Users can set it to false, as in evaluate = false, in order to generate test data into the test table without running the evaluator.
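A sketch of this generate-only workflow, with placeholder names:

```sql
-- Sketch: populate the test table without running the evaluator
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    generate_data = true,
    evaluate = false;
```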
llm
This is an optional parameter that defines a language model to be used for evaluations, if version is set to llm_relevancy.
If not defined, its default value is the reranking_model defined with the knowledge base.
Users can define it within the EVALUATE KNOWLEDGE_BASE command in the same manner as the reranking_model is defined for the knowledge base.
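A sketch of overriding the evaluation model is below; the provider, model_name, and api_key keys are assumed here to mirror the reranking_model configuration of the knowledge base, and all values are placeholders.

```sql
-- Sketch: override the model used by the llm_relevancy evaluator.
-- The provider/model_name/api_key keys are assumed to mirror the
-- reranking_model configuration; all values below are placeholders.
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'llm_relevancy',
    llm = {
        'provider': 'openai',
        'model_name': 'gpt-4o',
        'api_key': 'sk-...'
    };
```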
save_to
This is an optional parameter that stores the name of a table, from one of the data sources connected to MindsDB, into which the evaluation results are saved. For example, save_to = my_datasource.my_result_table defines a table named my_result_table from the data source named my_datasource. If not defined, the results are not saved into a table.
By default, evaluation results are returned after executing the EVALUATE KNOWLEDGE_BASE statement.
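A sketch of saving and later querying the results (placeholder names):

```sql
-- Sketch: persist evaluation results, then inspect them later
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    save_to = my_datasource.my_result_table;

-- The saved results can be queried like any other table
SELECT * FROM my_datasource.my_result_table;
```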
Evaluation Results
When using version = 'doc_id', the following columns are included in the evaluation results:

- total stores the total number of questions.
- total_found stores the number of questions to which the knowledge base provided correct answers.
- retrieved_in_top_10 stores the number of questions to which the knowledge base provided correct answers within the top 10 results.
- cumulative_recall stores data that can be used to create a chart.
- avg_query_time stores the execution time of a search query of the knowledge base.
- name stores the knowledge base name.
- created_at stores the timestamp when the evaluation was created.

When using version = 'llm_relevancy', the following columns are included in the evaluation results:

- avg_relevancy stores the average relevancy.
- avg_relevance_score_by_k stores the average relevancy at k.
- avg_first_relevant_position stores the average first relevant position.
- mean_mrr stores the Mean Reciprocal Rank (MRR).
- hit_at_k stores the Hit@k value.
- bin_precision_at_k stores the Binary Precision@k value.
- avg_entropy stores the average relevance score entropy.
- avg_ndcg stores the average nDCG.
- avg_query_time stores the execution time of a search query of the knowledge base.
- name stores the knowledge base name.
- created_at stores the timestamp when the evaluation was created.