How to filter search results?
Filters let you refine any dataset in GetFocus so it contains exactly the patents you need. This article covers every filter type available, when to use each one, and how to combine them effectively.

Important to know

- Filters are applied on top of your existing dataset; they do not change the underlying search query.
- Filters stack: applying multiple filters narrows the results further each time.

LLM filters

Landscaping patents is challenging. Finding the right combination of keywords to map out an entire technological domain is complex and time-consuming. You might miss important patents because you forgot to include a keyword, or add noise because some of your keywords also match unintended domains. To help with this problem, we introduced a feature called LLM filters.

LLM filters use a large language model to read each patent and classify it as relevant or irrelevant based on your natural-language instructions. They are the most powerful filter type, but also the slowest.

When to use LLM filters

LLM filters are particularly useful in the following scenarios:

- Traditional keyword or vector search filters aren't giving you accurate enough results.
- Your filtering criteria are complex and hard to express as keywords (e.g. a specific process with multiple technical conditions).
- You want to filter in plain language rather than constructing Boolean expressions.

Example instruction: "Find patents related to pyrolysis, a thermochemical decomposition process that operates under an inert atmosphere, typically reaching temperatures between 400°C and 650°C."

How do LLM filters work internally?
LLM filters use your instructions to decide which patents should pass the filter. Here's a step-by-step breakdown:

1. User instruction: The process begins when you provide instructions describing the kinds of patents you want to keep.
2. LLM processing: The large language model (LLM) reviews each patent in the dataset. It reads the patent's title, abstract, claims, and description and classifies the patent as either "relevant" or "irrelevant" based on your instructions. The duration of this step depends on the size of your dataset and can range from 5 to 60 minutes.
3. Classification and storage: After the LLM has finished its evaluation, the publication numbers of the patents deemed "relevant" or "irrelevant" are collected and stored in our database.
4. Filter application: Once the LLM filter run is complete, you can apply it to the dataset. Because the classifications are already stored, applying the filter is quick.

What are the limitations of LLM filters?
While LLM filters can be very useful, it's important to be aware of their limitations:

- Dataset size restriction: Due to the cost and time associated with running LLM filters, only the first 10,000 results in a dataset are processed.
- Processing speed: The speed of LLM filters is tied to the speed of the LLM itself. Although we parallelize the work to boost efficiency, the average processing speed is approximately 12 patents per second.
- Family level: We process only the representative member of each patent family. This is the member that best matches your filters and is displayed by default on the family card. Other family members are not checked.

⚠️ Important: Datasets with an applied LLM filter become static. New patents published after the filter ran will not appear automatically. Click Re-run to process newly added patents (already processed patents are skipped).

What are best practices when using LLM filtering?
To make the most of LLM filtering, follow these best practices:

💡 Tip: Start with fast filters (keyword, patent offices, status, year) to reduce your dataset size before running an LLM filter.

💡 Tip: Be specific in your instructions. Vague instructions lead to inconsistent classification. Instead of "Find patents about batteries", write "Find patents describing solid-state lithium-ion batteries with ceramic electrolyte separators."

See also: LLM filter prompts (https://docs.getfocus.eu/prompting-like-a-pro)

Filters

Keyword filtering

Keyword search filters your dataset to patent families that contain, or don't contain, specific words.

How to use

1. Open the filters panel and expand Keyword filtering.
2. Enter terms in the Include field and/or the Exclude field.
3. Choose which patent sections to search against (see below).
4. Click Filter to apply.

Include vs exclude logic

- Include uses AND by default: every keyword you add must be present in the patent family. Example: adding "bike" and "tire" returns only families that contain both words.
- Exclude uses OR by default: any family containing an excluded keyword is removed. Example: excluding "saddle" and "gears" removes any family that contains either word.

The defaults differ because you typically use Include to zoom in on one particular concept in the dataset, while you use Exclude to remove many unrelated aspects at once.

Which sections to search

You can match keywords against Invention title, Abstract, Claims, and Description.

⚠️ Important: Consider unchecking Description to improve result quality. The description section often contains broad background text that isn't specific to the invention's topic, which can cause false positives.

Creating custom Boolean keyword filters

Normally, when you enter multiple keywords (e.g., bike tyre) in an
“Include” field, they’re combined with an AND operator by default, meaning the system looks for patents containing both words. To override that default, you can type bool before your keywords and manually include other Boolean operators.

💡 Important: To override the default logic, prefix your keywords with bool and write your own Boolean expression.

Example

bool bike OR bicycle (in the “Include” field)

This searches for items that contain either “bike” or “bicycle” (or both). It is useful for covering various ways of referring to the same concept, because it bypasses the default requirement that both words must be present.

Important Boolean operators for custom Boolean filters

| Operator | What it does | Example |
| --- | --- | --- |
| bool (prefix) | Enables custom Boolean mode | bool bike OR bicycle |
| * | Matches zero or more characters | combus* → combustion, combust… |
| ? | Matches exactly one character | te?t → test, text, tent… |
| ( ) | Groups words that must appear together as a phrase | (bike tire) → "bike tire" in sequence |
| " " | Exact phrase match in fixed order | "high catalytic activity" |
| ~ | Proximity search: words within N words of each other | "machine learning"~5 |

⚠️ Important: When using bool mode, parentheses ( ) no longer enforce word sequence; they group terms with AND logic instead. For strict phrase matching in Boolean mode, use " " quotes.

Custom Boolean filtering summary

- Use bool to override the default AND and manually apply other operators like OR.
- Use * and ?
to allow flexible character matching, but be cautious with wildcards to avoid performance issues.
- Use ( ) to group words that must appear right next to each other.
- Use " " for exact phrases in a fixed order.
- Use ~ to allow two words to appear near each other, within a specified distance.

By combining these tools, you gain fine-grained control over your searches, enabling you to include or exclude results more precisely.

⚠️ Important: Avoid overusing wildcards (e.g. a* b* c*). The system has to evaluate every possible character combination, which can slow down or overload the search.

Publication numbers

The Publication numbers filter, under Advanced filters, helps you find specific patents in the search results by their publication number.

The standard patent publication format is [2-letter country code] [patent number] [kind code]. Example: US 20180226168 A1

To filter for multiple patents at once, separate publication numbers with a new line, semicolon, pipe (|), or tab.

Organization and Ultimate owner filters

Use these filters to scope your dataset by who owns the patents.

- Organization: filters by the direct assignee listed on the patent.
- Ultimate owner: groups patents by the parent company at the top of the corporate hierarchy. Use this to capture patents held by subsidiaries under a single umbrella.

How to use (both filters)

1. Open the filter panel and expand Organization or Ultimate owner.
2. Either scroll the list and tick/untick entries, or type a name in the search field to find a specific company.
3. Click Filter to apply.

💡 Tip: Use Ultimate owner when researching a large corporation to ensure you capture patents filed by its subsidiaries and acquired companies.

Patent offices

Limits results to patent families that have at least one member filed at the selected office(s).

How to use

1. Expand Patent offices in the filters panel.
2. Select or search for the offices you want.
3. Click
Filter.

Status

If you only want to see patent families that have members with a certain status, you can filter by status. There are 5 statuses: 'active', 'in force', 'pending', 'inactive', and 'unknown'.

- In force: families with at least 1 member that is currently granted and still in force.
- Pending: families where all patent family members are currently still pending.
- Active: families that are either in force or pending.
- Inactive: families where all patent family members have expired for any reason.
- Unknown: families for which we do not have reliable information from the patent office about the current state of the family members, and thus of the family as a whole.

How to use

1. Expand Status in the filters panel.
2. Select the statuses you want.
3. Click Filter.

Publication year, priority year, application year, expiration year filters

The publication year and expiration year filters work the same way: you select a date range, and only patent families within that range are shown.

- Publication year: preset options are last year or the last 3, 5, 10, or 20 years. Custom: enter a range like 2010–2017.
- Expiration year: preset options are next year or the next 3, 5, 10, or 20 years. Custom: enter a range like 2025–2030.

Click Filter to apply after entering your selection.

In the Advanced filters section you can find filters for priority year and application year. Both filters work the same way: you select a date range, and only patent families within that range are shown.

💡 All date filters can be combined, giving you greater flexibility and precision when exploring your dataset.

Similarity

The similarity (%) filter lets you set a similarity threshold for your dataset, i.e. the minimum similarity score a result needs in order to stay in your search results. It can be adjusted by setting a specific number in the input
field or by simply dragging the bar.

Similarity thresholds are an important part of vector search. Because vector search retrieves documents that are similar to your query, at some point down the list you will start getting less desirable results. For example, if you search for 'dog food', your top results are likely to all be dog food patent families. Further down the similarity list, however, there may be 'cat food' patent families, as cat food is at least somewhat related to dog food. The height of the bars on the chart indicates the number of patents at each similarity level.

How to use

1. Expand Similarity in the filters panel.
2. Drag the slider or type a value (e.g. 83.7). The bar chart shows how many patents fall at each similarity level; use it to identify where results start becoming less relevant.
3. Click Filter to apply.

💡 Tip: Scroll through your results before setting a threshold. Find the point where results start feeling off-topic, note the similarity score at that point, and use it as your cutoff.

Advanced filter

Vector search in filters

Just as you can use vector search to create initial search results, you can also use it to filter them. The Vector search filter under Advanced filters lets you filter search results using phrases, sentences, or even entire paragraphs.

Unlike keyword search, the vector search filter works on meaning rather than exact words. Use it when the concept you're looking for can be expressed in many different ways.

How to use

1. Open the filters panel and expand Advanced filters → Vector search.
2. Enter a phrase, sentence, or paragraph in the Positive field (what you want) and/or the Negative field (what you don't want).
3. Click Filter to apply.

‼️ Please note that the maximum input for the vector search filter is 384 words. Any input after 384 words will not be vectorized and will thus not affect the filtering outcome.

Positive vs negative fields

- Positive uses AND logic: multiple entries must each be semantically present.
- Negative uses OR logic: patent families similar to any negative entry are excluded.

💡 Tip: Write sentences and paragraphs, not keywords. The more context you give, the more accurate the semantic match. Example: instead of "synthetic leather", write "Artificial leather, also known as synthetic leather or faux leather, is a material designed to simulate the look and feel of genuine leather."

Country of origin

These filters scope your dataset by the country where the patent-holding organisation is headquartered, regardless of where the patents are filed.

- Country of origin (ultimate owner): uses the HQ country of the top-level parent company.
- Country of origin (organization): uses the HQ country of the direct assignee.

How to use (both filters)

1. Expand the relevant Country of origin filter.
2. Select/deselect countries from the list, or type a country name to search.
3. Click Filter to apply.
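To make the difference between the two Country of origin filters concrete, here is a minimal Python sketch. It is only an illustration of the idea, not how GetFocus implements it: the `Family` record, its field names, the `by_country_of_origin` helper, and the example data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    """Hypothetical patent family record (field names are illustrative only)."""
    label: str
    organization_hq: str    # HQ country of the direct assignee
    ultimate_owner_hq: str  # HQ country of the top-level parent company


def by_country_of_origin(families, countries, level="ultimate_owner"):
    """Keep families whose HQ country at the chosen level is in `countries`."""
    if level not in ("organization", "ultimate_owner"):
        raise ValueError("level must be 'organization' or 'ultimate_owner'")
    attribute = f"{level}_hq"
    return [f for f in families if getattr(f, attribute) in countries]


# Made-up data: a German subsidiary of a US parent, a US company, a Japanese company.
families = [
    Family("family-1", organization_hq="DE", ultimate_owner_hq="US"),
    Family("family-2", organization_hq="US", ultimate_owner_hq="US"),
    Family("family-3", organization_hq="JP", ultimate_owner_hq="JP"),
]

us_by_owner = by_country_of_origin(families, {"US"}, level="ultimate_owner")
us_by_org = by_country_of_origin(families, {"US"}, level="organization")
```

In this made-up data, filtering on the ultimate owner keeps the German subsidiary's family because its parent is headquartered in the US, while filtering on the organization drops it.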

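The default Include (AND) and Exclude (OR) behaviour described under Keyword filtering can be sketched in a few lines of Python. This is a simplified model for intuition only: the `keyword_filter` and `tokenize` helpers are hypothetical, whole-word matching is an assumption, and the real filter also handles sections, wildcards, and Boolean mode.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of words (assumed matching rule)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_filter(texts, include=(), exclude=()):
    """Keep texts containing ALL include words and NONE of the exclude words."""
    kept = []
    for text in texts:
        words = tokenize(text)
        has_all_included = all(k.lower() in words for k in include)
        has_any_excluded = any(k.lower() in words for k in exclude)
        if has_all_included and not has_any_excluded:
            kept.append(text)
    return kept


# Made-up example texts standing in for patent families.
docs = [
    "bike tire with puncture protection",
    "bike tire and saddle assembly",
    "tire pressure sensor",
]

result = keyword_filter(docs, include=["bike", "tire"], exclude=["saddle", "gears"])
```

Here only the first text survives: the second contains an excluded word, and the third lacks "bike".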

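To see why reducing the dataset before running an LLM filter pays off, the figures quoted in the LLM filter limitations (a cap of 10,000 results and an average of about 12 patents per second) can be turned into a rough estimate. This is back-of-the-envelope arithmetic on the average rate only; the function name is hypothetical, and actual run times vary (the article quotes 5 to 60 minutes).

```python
MAX_PROCESSED = 10_000        # only the first 10,000 results are processed
AVG_PATENTS_PER_SECOND = 12   # approximate average speed quoted in the article


def estimated_minutes(dataset_size):
    """Rough run time in minutes at the average rate (an estimate, not a guarantee)."""
    processed = min(dataset_size, MAX_PROCESSED)
    return processed / AVG_PATENTS_PER_SECOND / 60


small = estimated_minutes(3_600)    # a dataset pre-filtered down to 3,600 results
capped = estimated_minutes(50_000)  # larger datasets are capped at 10,000 results
```

At the average rate, 3,600 results take about 5 minutes, and any dataset at or above the cap takes just under 14 minutes.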

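When preparing a list for the Publication numbers filter, it can help to check how a pasted list will be split. The sketch below splits on the separators the article lists (new line, semicolon, pipe, tab). The `split_publication_numbers` helper is hypothetical, and apart from the article's own example (US 20180226168 A1) the numbers are made-up placeholders.

```python
import re


def split_publication_numbers(raw):
    """Split on new line, semicolon, pipe, or tab; drop empty entries."""
    parts = re.split(r"[\n;|\t]", raw)
    return [p.strip() for p in parts if p.strip()]


# Placeholder input mixing all four separators.
pasted = "US 20180226168 A1; XX 0000001 A1|XX 0000002 B1\nXX 0000003 A1\tXX 0000004 A2"
numbers = split_publication_numbers(pasted)
```

Stripping whitespace around each entry also tolerates a space after a semicolon or Windows-style line endings.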