Chat with set
The "chat with set" function lets you analyze a set of patents in natural language using a Large Language Model (LLM). You can ask any question about your dataset, and the LLM will analyze the patents and provide you with an answer.
This section explains how the feature works and gives you some best practices for using "chat with set".
The LLM behind "chat with set" lets you ask questions about your patent dataset. It reads the patents and provides you with an answer so that you don't have to read them all yourself. The workflow is always the same: you ask your question, the LLM works out whether it needs to apply any filters, retrieves the relevant data, and answers your question.
At the moment, LLMs have a limited context window (think of this as the model's memory). This means that the LLM cannot query your entire dataset at once if it is too large. As a rule of thumb, the LLM will do the following:
- Datasets with fewer than 1000 patent families: The LLM will extract the families' titles, abstracts, claims, organizations, and patent offices to answer your questions.
- Datasets with more than 1000 patent families: The LLM will sample up to 1000 families and extract the families' titles, abstracts, claims, organizations, and patent offices to answer your questions.
The above-mentioned number of families is a rule of thumb. The exact number of families that can be analyzed by the LLM depends on how many words are used in the titles, abstracts, and claims of the inventions. If your domain has especially long abstracts or claims, the total number of families analyzed will be lower, and vice versa.
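The rule of thumb above can be illustrated with a minimal Python sketch. It is not the actual implementation behind "chat with set": the function name `select_families`, the `word_budget` parameter, and the dictionary keys are hypothetical, and counting whitespace-separated words is only a rough stand-in for how an LLM measures its context window.

```python
import random

MAX_FAMILIES = 1000  # rule-of-thumb cap taken from the text above


def select_families(families, word_budget, max_families=MAX_FAMILIES, seed=0):
    """Return the families that fit into a hypothetical word budget.

    `families` is a list of dicts with "title", "abstract" and "claims" keys
    (an assumed data shape, chosen only for this illustration).
    """
    rng = random.Random(seed)

    # Sample only when the dataset is larger than the cap.
    if len(families) > max_families:
        candidates = rng.sample(families, max_families)
    else:
        candidates = list(families)

    selected, used = [], 0
    for family in candidates:
        # Longer titles, abstracts and claims cost more of the budget.
        cost = sum(len(family[key].split()) for key in ("title", "abstract", "claims"))
        if used + cost > word_budget:
            break  # the budget is exhausted, so fewer families are analyzed
        selected.append(family)
        used += cost
    return selected
```

With the same budget, a dataset whose claims are twice as long yields roughly half as many selected families, which is why the number of families analyzed varies by domain.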
Please note that if you ask questions that can be interpreted in different ways, the LLM can give different answers to the same question over multiple iterations. Also note that when you ask follow-up questions, the LLM will need to query the database again to respond. This means that every question, including a follow-up, is a fresh interaction.
Whenever the LLM refers to specific patents in its responses to your questions, it will hyperlink the patents so that you can easily analyze them further.
The LLM will always show you which patents it took into consideration when answering your question. To see them, click "Click here to see the patents that were considered", as in the example below.
You can still manually filter your dataset with the filter options on the right-hand side before using the "chat with set" feature. Chat with set will only analyze the inventions that are in your filtered dataset. For example, if you first filter on US patents only and on a specific organization, chat with set will only answer questions based on those inventions.
Chat with set can also control various filters. These are the filters it has control over, with an example of how to use each one:
- Publication year: You can ask chat with set to "look for the main innovation trends in the past 3 years". It will then apply the publication year filter itself before answering your question.
- Patent offices: You can ask chat with set to "identify the main topics of invention in Chinese patents in this dataset". It will then apply the Chinese patent office filter itself before answering your question.
- Organization: You can ask chat with set to "analyze all inventions by "Samsung" in this dataset and explain their innovation efforts in this dataset". It will then apply the organization filter for "Samsung" and related organizations itself before answering your question.
- Publication number: You can ask chat with set to "please compare patent US-10462849-B1, & US-10383371-B2, and explain the main differences to me". It will then apply the publication number filter itself before answering your question.
- Status: You can ask chat with set to "Analyze all pending patents in this dataset and tell me about their main topics". It will then apply the status filter itself before answering your question.
For additional filters, such as similarity, keywords, etc., you still have to filter manually before using chat with set to analyze the results.
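To make the effect of these filters concrete, here is a small Python sketch of how the five chat-controlled filters could narrow a dataset before a question is answered. It is an illustration under assumptions, not the product's internal logic: the `Family` and `ChatFilters` classes, their field names, and the matching rules (for example, a case-insensitive substring match for organizations) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Family:
    # Assumed record shape, used only for this illustration.
    publication_number: str
    publication_year: int
    patent_office: str
    organization: str
    status: str


@dataclass
class ChatFilters:
    # Hypothetical container for the five filters listed above.
    # A value of None means "this filter is not applied".
    min_publication_year: Optional[int] = None
    patent_office: Optional[str] = None
    organization: Optional[str] = None
    publication_numbers: Optional[Set[str]] = None
    status: Optional[str] = None


def apply_filters(families, filters):
    """Keep only the families that pass every filter that is set."""
    kept = []
    for fam in families:
        if filters.min_publication_year is not None and fam.publication_year < filters.min_publication_year:
            continue
        if filters.patent_office is not None and fam.patent_office != filters.patent_office:
            continue
        # Substring match so that related organization names are also kept.
        if filters.organization is not None and filters.organization.lower() not in fam.organization.lower():
            continue
        if filters.publication_numbers is not None and fam.publication_number not in filters.publication_numbers:
            continue
        if filters.status is not None and fam.status != filters.status:
            continue
        kept.append(fam)
    return kept
```

In this sketch, a question such as "analyze all pending patents from the past 3 years" would correspond to something like `ChatFilters(status="pending", min_publication_year=...)`, applied to whatever remains after your manual filters.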
You could ask:
- What are the latest trends in this domain?
- What are the main trends in this domain over the past 5 years?
- What are the topics being discussed in this dataset and how many inventions belong to each topic?
- What are the patent numbers that relate to (your topic)? List the publication numbers in a comma-separated list.
- What are the top 3 startups in this domain? Provide a summary of their most important inventions.
- Compare the portfolios of Organization X and Organization Y and explain the main differences to me.
- What are the differences between patent X and patent Y?
- How did this technological domain develop? Compare 5 years ago with the current year and explain the differences to me.